svn commit: trunk/busybox

Rich Felker dalias at aerifal.cx
Thu May 11 13:33:23 PDT 2006


On Thu, May 11, 2006 at 03:30:50PM -0400, Rob Landley wrote:
> On Thursday 11 May 2006 2:50 pm, Rich Felker wrote:
> > On Thu, May 11, 2006 at 02:40:04PM -0400, Paul Fox wrote:
> > >  > >  > Just remove the 0, it's nonsense. Filenames do not have embedded
> > >  > >  > newlines in them! Anyone who creates such stupid filenames
> > >  > >  > deserves what they get.
> > >  > >
> > >  > > in this case, at least, it protects against spaces in names as
> > >  > > well, which are a lot more common than newlines.  that being
> > >  > > said, this isn't a windows or mac development tree.  okay by me
> > >  > > to drop the 0.
> > >  >
> > >  > protects spaces? how? xargs reads one filename per line, not per word.
> > >
> > > no.  what it reads is whitespace-separated.
> >
> > oh. how incredibly stupid. then just write a script to do what xargs
> > should do, or better yet, use the -exec option to find...
> 
> Or do it the way you were initially objecting to, which separates entries with 
> nul bytes rather than whitespace.
> 
> Files are allowed to have any character except NUL and the directory 
> separator in them. Paths can have any character but NUL.  Therefore, the 
> logical character to use as a separator when you want reliable behavior, no 

However, the relevant tools for dealing with filenames from scripts
process _text files_, which cannot contain NUL. Adding GNU extensions
that require tools to somehow deal with NUL (which greatly complicates
the implementation of these tools and adds to bloat) is hardly
worthwhile, considering that someone who makes a filename containing
\n is just an idiot.

IMO the -print0 option exists for administrators wanting to guard
their scripts against exploits involving filenames with embedded
newlines, not for ordinary programs operating on trusted data (which
should be sanitized).
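
To make the trade-off above concrete, here is a minimal sketch in Python
(not the actual xargs or find code; the function names are hypothetical
and the whitespace splitter ignores the quoting rules real xargs has).
It splits the same three names once on whitespace and once on NUL. Only
the NUL-separated stream gives back the original names when they contain
spaces or newlines.

```python
def split_whitespace(data):
    # Simplified model of default xargs parsing: any run of blanks or
    # newlines separates items. Real xargs also interprets quotes and
    # backslashes, which this sketch deliberately leaves out.
    return data.split()


def split_nul(data):
    # Model of NUL-separated input in the style of find -print0: each
    # name is terminated by one NUL byte, so the empty piece after the
    # final terminator is dropped.
    return [item for item in data.split(b"\0") if item]


names = [b"plain.txt", b"with space.txt", b"with\nnewline.txt"]

# The same names written one per line and NUL-terminated.
newline_stream = b"".join(n + b"\n" for n in names)
nul_stream = b"".join(n + b"\0" for n in names)

print(split_whitespace(newline_stream))  # five fragments, names are broken up
print(split_nul(nul_stream))             # the original three names
```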

BTW, are you sure POSIX is clear on whether control characters are
supposed to be allowed in filenames? I think implementations are
required to support the portable character set in filenames, but I
don't know whether this includes spacing control codes. IMO a sane
implementation would be to only support non-control characters that
are valid multibyte character strings (as opposed to arbitrary byte
strings).
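
A rough sketch of what such a restrictive policy could look like, written
in Python for brevity. The function name is hypothetical, and UTF-8 is
assumed as the multibyte encoding; neither comes from POSIX or from any
existing implementation. It accepts a name only if it decodes cleanly and
contains no control characters.

```python
import unicodedata


def is_sane_filename(name, encoding="utf-8"):
    # Assumption: the multibyte encoding in use is UTF-8.
    # Empty names, the directory separator, and NUL are never valid.
    if not name or b"/" in name or b"\0" in name:
        return False
    try:
        # Strict decoding rejects byte strings that are not valid
        # multibyte character strings in the chosen encoding.
        text = name.decode(encoding)
    except UnicodeDecodeError:
        return False
    # Unicode category "Cc" covers the C0 and C1 control codes,
    # including newline and tab. An ordinary space is "Zs" and passes.
    return all(unicodedata.category(ch) != "Cc" for ch in text)


print(is_sane_filename(b"hello world.txt"))
print(is_sane_filename(b"bad\nname"))
print(is_sane_filename(b"\xff\xfe"))
```

Under this policy a name with a space is still accepted, so it addresses
newlines and invalid byte sequences but not whitespace splitting.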

> matter what kind of strange foreign language filesystem you're being used in 
> even when dealing with absolute paths with unknown upstream components is...

If you work correctly with relative paths you never need to see the
upstream components.
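
One possible way to illustrate this, as a sketch only: the Python code
below assumes a POSIX system where os.listdir accepts a directory file
descriptor and os.stat supports dir_fd. The helper name list_sizes is
hypothetical. Once the directory is open, every lookup is relative to the
descriptor, so the absolute path and its parent components are never
examined.

```python
import os
import tempfile


def list_sizes(dir_fd):
    # All lookups are relative to dir_fd. The absolute path of the
    # directory, and whatever bytes its parent components contain,
    # is never used here.
    sizes = {}
    for name in os.listdir(dir_fd):
        st = os.stat(name, dir_fd=dir_fd, follow_symlinks=False)
        sizes[name] = st.st_size
    return sizes


# Demo setup: a temporary directory containing one five-byte file.
with tempfile.TemporaryDirectory() as top:
    with open(os.path.join(top, "a.txt"), "wb") as f:
        f.write(b"12345")
    fd = os.open(top, os.O_RDONLY)
    try:
        result = list_sizes(fd)
    finally:
        os.close(fd)

print(result)
```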

Rich


