[PATCH 2/2] (g)unzip: Optimize inflate_codes()
Denys Vlasenko
vda.linux at googlemail.com
Thu Feb 11 22:52:30 UTC 2010
On Thursday 11 February 2010 08:42, Joakim Tjernlund wrote:
> >
> > Great. But looks like you forgot to send the patch without this debug.
> > The only patch I found has this:
> >
> > + /* Align out addr */
> > + if (e < 3)
> > + fprintf(stderr, "error len:%d\n", e);
>
> Hehe, here we go then. Looking at the gzip code I think it is crap though.
> The upstream gzip code is old and unoptimized. One should just scrap
> it and redo it with zlib instead.
>
> Jocke
>
> From cb65f50ec0599ec40677b655149491a2918133bc Mon Sep 17 00:00:00 2001
> From: Joakim Tjernlund <Joakim.Tjernlund at transmode.se>
> Date: Mon, 8 Feb 2010 18:46:38 +0100
> Subject: [PATCH] (g)unzip: Optimize inflate_codes()
>
> Ported the recent optimization from the Linux kernel.
> This will not perform as well as the kernel version, as the
> code structure in busybox is different and I had to adapt
> the optimization to it.
>
> This has seen very little testing and is an RFC only at this point.
> The inflate speed increase in the kernel was 12-15% on ppc.
> ---
>
> V2: Optimize size a bit:
> function                  old     new   delta
> inflate_codes             624     735    +111
>
> archival/libunarchive/decompress_unzip.c | 42 ++++++++++++++++++++++++++++-
> 1 files changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/archival/libunarchive/decompress_unzip.c b/archival/libunarchive/decompress_unzip.c
> index c616202..111d7fc 100644
> --- a/archival/libunarchive/decompress_unzip.c
> +++ b/archival/libunarchive/decompress_unzip.c
> @@ -589,11 +589,49 @@ static NOINLINE int inflate_codes(STATE_PARAM_ONLY)
> w += e;
> dd += e;
> } else {
> + unsigned short *sout;
> + unsigned int loops;
> + union uu {
> + unsigned short us;
> + unsigned char b[2];
> + } mm;
> /* do it slow to avoid memcpy() overlap */
> /* !NOMEMCPY */
> - do {
> + /* minimum length is three */
> + /* Align out addr */
> + if (w & 1) {
> + gunzip_window[w++] = gunzip_window[dd++];
> + --e;
> + }
> + /* Use pre increment as it is faster on some arch's */
> + sout = (unsigned short *) (gunzip_window + w - 2);
> + if (delta > 2) {
> + unsigned short *sfrom;
> +
> + sfrom = (unsigned short *) (gunzip_window + dd - 2);
> + loops = e >> 1;
> + do
> + move_from_unaligned16(*++sout, ++sfrom);
> + while (--loops);
> + } else {
> + unsigned short pat16;
> +
> + pat16 = *sout;
> + if (delta == 1) {
> + /* copy one char pattern to both bytes */
> + mm.us = pat16;
> + mm.b[0] = mm.b[1];
> + pat16 = mm.us;
> + }
It segfaults on a 277 MB gz file (the source tree of an old OpenOffice version):
# time ./busybox_old gunzip <OOo_2.0.2_src.tar.gz >/dev/null
real 0m13.616s
user 0m13.493s
sys 0m0.116s
# time ./busybox gunzip <OOo_2.0.2_src.tar.gz >/dev/null
/bin/bash: line 1: 15635 Segmentation fault ./busybox gunzip < OOo_2.0.2_src.tar.gz > /dev/null
real 0m2.198s
user 0m2.186s
sys 0m0.011s
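
For reference, here is a minimal sketch of the copy semantics any optimized
version has to preserve. It is not the code from the patch: the names
lz_copy_bytes(), lz_copy_16() and lz_copy_selfcheck() are hypothetical, the
window is assumed not to wrap, and dd < w is assumed. The byte-wise loop is
correct for any overlap. The 16-bit variant goes two bytes at a time only
when the distance is at least 2 and falls back to the byte loop otherwise,
which is simpler than the pattern trick the patch uses for small distances.

```c
#include <stddef.h>
#include <string.h>

/* Reference copy: one byte at a time, correct for any overlap.
 * win = output window, w = write index, dd = source index, e = length. */
void lz_copy_bytes(unsigned char *win, size_t w, size_t dd, size_t e)
{
	while (e--)
		win[w++] = win[dd++];
}

/* Hypothetical 16-bit variant. Assumes dd < w and no window wraparound. */
void lz_copy_16(unsigned char *win, size_t w, size_t dd, size_t e)
{
	size_t delta = w - dd;

	if (delta < 2) {
		/* a 2-byte load would read a byte not yet written */
		lz_copy_bytes(win, w, dd, e);
		return;
	}
	/* while (e >= 2) also handles e == 0 and e == 1 safely */
	while (e >= 2) {
		unsigned short t;

		memcpy(&t, win + dd, 2);	/* unaligned-safe load */
		memcpy(win + w, &t, 2);		/* unaligned-safe store */
		w += 2;
		dd += 2;
		e -= 2;
	}
	if (e)
		win[w] = win[dd];		/* odd trailing byte */
}

/* Returns 1 if both copies agree on an overlapping match (distance 3). */
int lz_copy_selfcheck(void)
{
	unsigned char a[16] = "abc";
	unsigned char b[16] = "abc";

	lz_copy_bytes(a, 3, 0, 5);
	lz_copy_16(b, 3, 0, 5);
	return memcmp(a, b, sizeof(a)) == 0;
}
```

Comparing an optimized copy against the byte loop for small lengths and
distances, as lz_copy_selfcheck() does for one case, is a cheap way to catch
overlap and length edge cases before running it on a large archive.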
--
vda