[PATCH] unify itoa

Tue Jul 11 14:44:48 UTC 2006

On Tuesday 11 July 2006 02:47, Rob Landley wrote:
> > The attached test program shows that snprintf is 2.5 times slower
> > than utoa. Operations done per second:
> > 	sprintf='12345678' count=27346
> > 	utoa_to_buf='12345678' count=69880
> > 	utoa='12345678' count=65934
> >
> > A few days ago I ran "hexdump | grep" over entire hard drive.
> > Took a long time. I wish hexdump was optimized a bit better...
> 
> I haven't really looked closely at hexdump, but what does a base 10 utoa have 
> to do with a base 16 hexdump?

This is a real-life example of a program which does a lot of
int -> string conversions, so that it actually mattered for me recently.

hexdump is of course using base 16 conversion. tcpdump could be
an example of a proc which does base 10 conversions, but I didn't
use it lately.

> The lookup table is a classic size/speed tradeoff, something BusyBox has 
> traditionally come down on the "size" side of.  Doing a write() on the result 
> is going to be way slower than the actual conversion anyway.  What actual 
> bottleneck are you seeing?
> 
> I definitely agree the bounds checking needs to be added, though.  I'll do 
> that now...

The version in current svn produces _most significant_ digits
if buffer is too small. IOW: utoa_to_buf_rob(100223,buf,3)=="10".
The usual practuce is to produce "23".

How about this modified version?

// 0x63 bytes
void utoa_to_buf_rob(unsigned n, char *buf, int buflen)
{
        unsigned i, out;
        if (buflen) {
                buflen -= 11;
                out = 0;
                for (i=1000000000; i; i/=10) {
                        unsigned res = n / i;
                        n %= i; // gcc reuses remainder produced in division
                        if (res || out || i == 1) {
                                out++;
                                if (buflen >= 0)
                                        *buf++ = '0' + res;
                        }
                        buflen++;
                }
                *buf = 0;
        }
}

It's even a bit smaller. gcc is smart enough to know div instruction
produces the remainder also, and we get n %= i for free.

I still dislike the speed, i/=10 eats too much CPU...
--
vda