implenting BB variations of standard system calls

Robert P. J. Day rpjday at mindspring.com
Thu Apr 20 10:18:43 UTC 2006


On Thu, 20 Apr 2006, Denis Vlasenko wrote:

> On Wednesday 19 April 2006 21:47, Robert P. J. Day wrote:
> > clearly, it does the same thing but that's entirely forgivable since
> > it's meant to just be a simple extension of strncpy() and it would be
> > silly to change the semantics.  but let's keep going and concentrate
> > on the routines in bb_pwd.c.
> >
> >   consider the BB-specific bb_getug() which calls safe_strncpy():
> >
> > char * bb_getug(char *buffer, char *idname, long id, int bufsize, char
> > prefix)
> > {
> >     if(bufsize > 0 ) {
> >         assert(buffer!=NULL);
> >         if(idname) {
> >             return safe_strncpy(buffer, idname, bufsize);
> >         }
> >         snprintf(buffer, bufsize, "%ld", id);
> >     } else if(bufsize < 0 && !idname) {
> >         bb_error_msg_and_die("unknown %cid %ld", prefix, id);
> >     }
> >     return idname;
> > }
> >
> >   so bb_getug() has the same annoying design redundancy but, in *this*
> > case, it's entirely unnecessary.
>
> returning an int or pointer value requires _at most_ one insn in the
> function body (on i386), sometimes even that is not needed, because
> gcc will rearrange registers such that required value ends up in eax
> register.
>
> More to it, eax is clobbered by function call (even if you call void
> fn), thus you save nothing in terms of register pressure on call
> site.
>
> Thus you won't save much by removing "pointless" return value, you
> will just annoy users who took liberty of using this feature.

careful.  i'm not making this argument in terms of register or stack
space (although i'll keep that in mind).  my point was that, given
that the bb_getug() is an internal BB routine that doesn't seem to be
trying to emulate or enhance any standard library routine, there is
little rationale for it to hang on to this cumbersome way to return a
character pointer which is little more than what you got in the first
place as part of the routine arguments.

more to the point, as i demonstrated, those BB routines seem to
*insist* on continuing to use that technique even to the point where
the calling parms to getpwuid() and bb_getpwuid() are now *totally*
different.  i think that's bad -- for the BB and non-BB variations of
*the same routine* to have such different calling semantics.  it just
introduces complexity where none was necessary.

> They will need to do this:
>
> -   newsurname = strcpy(names[i][surname_ofs+1], "Day");
> +   newsurname = names[i][surname_ofs+1];
> +   strcpy(newsurname, "Day");
>
> You repay 3 bytes you saved inside strcpy in each call site like one
> shown above.
>
> > in short, i think there should be two general rules for writing
> > internal BB routines.  1) if you're emulating a system call, try to
> > keep the same semantics as much as possible, and 2) rewrite these
> > silly routines that return nothing more than what they were passed in
> > the first place, and use those return value for meaningful error
> > codes.
> >
> >   thoughts?
>
> If you need more than one error code, it makes sense. If there only
> one (or none) error condition possible, returning NULL will work as
> good as returning integer error code.

in the first place, i don't think that's a compelling argument even if
there's only one error condition.  lots of C routines still return an
int error code even if there's only one possible error.  the
difference is that returning an int error code is *extensible* whereas
the pointer technique *isn't* (at least, not without getting into some
really disgusting code).

if, in one of these routines, you suddenly decided to recognize a
second error condition, that's easy to do if you're returning an int
error code.  if you've been using that hacky character pointer
technique, you're kind of screwed.

let's not lose sight of my original point.  i think it's proper for a
BB routine to use the same calling semantics as the non-BB routine
it's emulating or trying to enhance, since people will be familiar
with the original form and it will make sense:

  char* strncpy(char*, ...)
  char* safe_strncpy(char*, ...)

that's fine.  but i think it's unnecessarily bad design to do
something like this:

  struct passwd* getpwuid(const char* name) ;
  char *bb_getpwuid(char *name, long uid, int bufsize)

now you have people who might be familiar with the semantics of
getpwuid() looking at the semantics for bb_getpwuid() and thinking,
"what the hell?? ..."

if you choose to define a BB variation of a non-BB routine like that,
it makes sense to try to mimic the semantics.  if, however, you really
do want/need to change the semantics drastically, then give the
routine a name that makes it clear it's *different* and in no way
associated with the original.  perhaps something like
"map_username_to_uid()" or something like that.

rday

p.s.  i apologize for spending so much time nit-picking over
aesthetics and presentation but the more i go over the BB code, the
more frustrated i get.

i appreciate that BB is the product of dozens of different developers.
the problem is that that is *exactly* what it looks like.



More information about the busybox mailing list