tar and decompression (Re: [BusyBox] My brain hurts again.)

Rob Landley rob at landley.net
Wed Jun 30 05:20:39 UTC 2004


On Tuesday 29 June 2004 21:44, Steve Dover wrote:

> > P.S.  Lots of decision making code is trying to figure out if we should
> > use seek_by_char or seek_by_jump.  I'd like to point out that
> > seek_by_char always works, and that seek_by_jump is a speed optimization
> > that only works for uncompressed tarballs that are read from a file and
> > not a pipe.  Is this common enough with a big enough payoff to really
> > worry about at all?
>
> My vote (I'm old fashioned) is to go with what works and saves code
> space.  While (being old fashioned) I do untar uncompressed files,
> but I can learn to do it the modern way.

It should still work just fine; it's just that it would read the data into a 
buffer and discard it, rather than lseek() forward to skip past it.
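
(Roughly, the fallback amounts to a read loop like the following.  This is 
just a sketch with a made-up name, not the actual busybox source:)

  #include <sys/types.h>
  #include <unistd.h>

  /* Skip "count" bytes by reading them into a scratch buffer and
   * throwing them away.  Unlike lseek(), this works on pipes too. */
  static void skip_by_read(int fd, off_t count)
  {
          char buf[4096];

          while (count > 0) {
                  size_t chunk = count < (off_t)sizeof(buf)
                                  ? (size_t)count : sizeof(buf);
                  ssize_t len = read(fd, buf, chunk);

                  if (len <= 0)
                          break;  /* EOF or read error, give up */
                  count -= len;
          }
  }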

Also, if you mean you do "cat file.tar.gz | gunzip | tar xv", it has to read 
and discard anyway.  You can't lseek on a pipe...
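
(Detecting that case is easy, by the way: lseek() on a pipe fails and sets 
errno to ESPIPE.  Again, a sketch with an invented name, not actual busybox 
code:)

  #include <errno.h>
  #include <unistd.h>

  /* Hypothetical helper: returns nonzero if lseek() works on this fd.
   * On a pipe, lseek() returns -1 with errno set to ESPIPE, so the
   * caller knows it has to read and discard instead of jumping. */
  static int fd_is_seekable(int fd)
  {
          return lseek(fd, 0, SEEK_CUR) != (off_t)-1;
  }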

> The speed optimization is not that important to me, and besides,
> are you really un-tarring that much every hour?
>
> I'd avoid the seek_by_jump optimization then.
> If it turns out to be that needed, it could likely be added
> as a config option.

I'm not so much focusing on removing code _size_ as removing code 
_complexity_.  (For me, complexity comes first, size second, speed third.  All 
of which are balanced against functionality, of course...)

Usually, less code is less complex, though... :)

> Regards,
> Steve

Rob
-- 
www.linucon.org: Linux Expo and Science Fiction Convention
October 8-10, 2004 in Austin Texas.  (I'm the con chair.)
