[PATCH 2/2] (g)unzip: Optimize inflate_codes()

Joakim Tjernlund joakim.tjernlund at transmode.se
Sat Feb 13 09:22:09 UTC 2010


Denys Vlasenko <vda.linux at googlemail.com> wrote on 2010/02/13 04:47:19:
>
> On Friday 12 February 2010 00:46, Joakim Tjernlund wrote:
> > Denys Vlasenko <vda.linux at googlemail.com> wrote on 2010/02/12 00:28:44:
> > >
> > > On Friday 12 February 2010 00:14, Joakim Tjernlund wrote:
> > > >
> > > > Denys Vlasenko <vda.linux at googlemail.com> wrote on 2010/02/11 23:52:30:
> > > >
> > > > > From: Denys Vlasenko <vda.linux at googlemail.com>
> > > > > To: Joakim Tjernlund <joakim.tjernlund at transmode.se>
> > > > > Cc: busybox at busybox.net, Rob Landley <rob at landley.net>
> > > > > Date: 2010/02/11 23:52
> > > > > Subject: Re: [PATCH 2/2] (g)unzip: Optimize inflate_codes()
> > > > >
> > > > > On Thursday 11 February 2010 08:42, Joakim Tjernlund wrote:
> > > > > > >
> > > > > > > Great. But looks like you forgot to send the patch without this debug.
> > > > > > > The only patch I found has this:
> > > > > > >
> > > > > > > +                                       /* Align out addr */
> > > > > > > +                                       if (e < 3)
> > > > > > > +                                               fprintf(stderr, "error
> > > > > len:%d\n", e);
> > > > > >
> > > > > > Hehe, here we go then. Looking at the gzip code I think it is crap though.
> > > > > > The upstream gzip code is old and unoptimized. One should just scrap
> > > > > > it and redo it with zlib instead.
> > > > > >
> > > >
> > > > >
> > > > > It segfaults on 277 Mb gz file (a source tree of old openoffice version):
> > > >
> > > > what arch?
> > >
> > > x86-32
> > >
> > > > >
> > > > > # time ./busybox_old gunzip <OOo_2.0.2_src.tar.gz >/dev/null
> > > > >
> > > > > real    0m13.616s
> > > > > user    0m13.493s
> > > > > sys     0m0.116s
> > > > > # time ./busybox gunzip <OOo_2.0.2_src.tar.gz >/dev/null
> > > > > /bin/bash: line 1: 15635 Segmentation fault      ./busybox gunzip < OOo_2.0.
> > > > > 2_src.tar.gz > /dev/null
> > > > >
> > > > > real    0m2.198s
> > > > > user    0m2.186s
> > > > > sys     0m0.011s
> > > >
> > > > Maybe you could stick that debug printout back?
> > >
> > > # ./busybox gunzip <OOo_2.0.2_src.tar.gz | md5sum
> > > gunzip: error len:2
> > > gunzip: error len:2
> > > 85bff4b99d1516ace2e8d11bcc4718f5  -
> >
> > I was afraid this could happen. Either this .gz is compressed with an old
> > faulty gzip or my patch is buggy :(
> > Could you try to recompress the archive with latest gzip?
>
> Recompressed with gzip 1.3.12, which made archive bigger (!!?):

Oops, but that is not my fault :)

>
> # ls -l OOo_2.0.2_src.tar.gz OOo_2.0.2_src.tar1.gz
> -rw-r--r--    1 root     root     290748396 Feb 11 23:49 OOo_2.0.2_src.tar.gz
> -rw-r--r--    1 root     root     290774340 Feb 13 04:28 OOo_2.0.2_src.tar1.gz
>
> # ./busybox_old gunzip <../OOo_2.0.2_src.tar.gz | md5sum
> a1a4e036091b51aa986e4745909e7fac  -
> # ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz | md5sum
> a1a4e036091b51aa986e4745909e7fac  -
>
> Still segfaults:
>
> # ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null
> /bin/bash: line 1: 29005 Segmentation fault      ./busybox gunzip < ../OOo_2.
> 0.2_src.tar1.gz > /dev/null

I see, gzip is different than zlib so I wasn't sure.

>
>
> > I guess we must fix it either way, does this work for you?

> With this fix it works:
>
> # ./busybox gunzip <../OOo_2.0.2_src.tar1.gz | md5sum
> a1a4e036091b51aa986e4745909e7fac  -
>
> I don't see any speed difference, though:
>
> # (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.443s
> # (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.380s
> # (time ./busybox gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.442s
> # (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.477s
> # (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.405s
> # (time ./busybox_old gunzip <../OOo_2.0.2_src.tar1.gz >/dev/null) 2>&1 | grep real
> real    0m13.379s

I think than modern HW will not benefit as much as slower embedded HW
do you have any?
It can also be that gzip is so unoptimized to begin with that my
patch won't matter much.
What version of gzip is bb gzip based on? I got the impression that
bb gzip also have some local hacks.

 Jocke



More information about the busybox mailing list