[PATCH] unify itoa
Rob Landley
rob at landley.net
Tue Jul 11 12:42:38 PDT 2006
On Tuesday 11 July 2006 10:44 am, Denis Vlasenko wrote:
> On Tuesday 11 July 2006 02:47, Rob Landley wrote:
> > > The attached test program shows that snprintf is 2.5 times slower
> > > than utoa. Operations done per second:
> > > sprintf='12345678' count=27346
> > > utoa_to_buf='12345678' count=69880
> > > utoa='12345678' count=65934
> > >
> > > A few days ago I ran "hexdump | grep" over entire hard drive.
> > > Took a long time. I wish hexdump was optimized a bit better...
> >
> > I haven't really looked closely at hexdump, but what does a base 10 utoa
> > have to do with a base 16 hexdump?
>
> This is a real-life example of a program which does a lot of
> int -> string conversions, so that it actually mattered for me recently.
>
> hexdump is of course using base 16 conversion. tcpdump could be
> an example of a proc which does base 10 conversions, but I didn't
> use it lately.
Understood, but if I wanted to speed up hexdump | grep I'd profile both
hexdump and grep, and probably find out that our grep is dog slow (there's
the pending patch to speed up get_chomped_line_from_file() in my todo heap,
which is waiting for me to get back around to the sed NUL bug and make sure
that they play nice together).
> > The lookup table is a classic size/speed tradeoff, something BusyBox has
> > traditionally come down on the "size" side of. Doing a write() on the
> > result is going to be way slower than the actual conversion anyway. What
> > actual bottleneck are you seeing?
> >
> > I definitely agree the bounds checking needs to be added, though. I'll
> > do that now...
>
> The version in current svn produces _most significant_ digits
> if buffer is too small. IOW: utoa_to_buf_rob(100223,buf,3)=="10".
> The usual practuce is to produce "23".
If the buffer is too small it essentially produces garbage, we just don't want
it to overflow and stomp memory it doesn't own.
> How about this modified version?
...
> It's even a bit smaller. gcc is smart enough to know div instruction
> produces the remainder also, and we get n %= i for free.
Very nice. My only concern is on which platforms does division automatically
produce a remainder? I expect division to be supported everywhere, and to
perform fairly well. I have a vague recollection that in some cases
environments modulus is overlooked and (as a result) kind of evil, but I
don't remember specifics.
> I still dislike the speed, i/=10 eats too much CPU...
It's a tight L1 cache-hot loop. In a pipeline like hexdump | grep it's
unlikely to be even 1% of the total CPU overhead. What platform have you
profiled this on? I don't believe integer division has boiled down to a
subtraction loop on any interesting processor for a quarter-century (I
remember that multiplication could be done in something like 6 clock cycles
without being _that_ clever, although this is a vague recollection from my
undergraduate days circa 1993), but then I'm not entirely certain how some of
the really low-transistor-budget embedded processors (like ARM) are
implemented. I doubt they're _stupid_, though. Many of them have floating
point units, there's no reason for them to stint on integer arithmetic
operations...
Could I have some benchmark numbers on the system you're worried about,
please? I already mentioned I tried the following here:
#include <stdio.h>
// 0x4f bytes
void utoa_to_buf(unsigned n, char *buf, int buflen)
{
int i, out = 0;
for (i=1000000000; i; i/=10) {
int res = n/i;
if (res || out || i == 1) {
out++;
n -= res*i;
*buf++ = '0' + res;
}
}
*buf = 0;
}
int main(int argc, char *argv[])
{
int i;
char blah[16];
for(i=0;i<1000000;i++) {
utoa_to_buf(i, blah, 16);
}
}
Ran that under "time", and got 0.707 seconds. 1000000/.707 is 1.4 million
according to my handy dandy python interpreter (the best calculator Linux
ever shipped with). Adding bounds checking didn't slow it down _that_ much.
What do you get on the system you're worried about the performance of?
Rob
--
Never bet against the cheap plastic solution.
More information about the busybox
mailing list