[Bug 15036] New: [PATCH] Awk: fix bitwise functions when operating with large numbers
bugzilla at busybox.net
Sat Oct 8 17:45:50 UTC 2022
https://bugs.busybox.net/show_bug.cgi?id=15036
Bug ID: 15036
Summary: [PATCH] Awk: fix bitwise functions when operating with
large numbers
Product: Busybox
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: P5
Component: Other
Assignee: unassigned at busybox.net
Reporter: caribpa at outlook.com
CC: busybox-cvs at busybox.net
Target Milestone: ---
Created attachment 9376
--> https://bugs.busybox.net/attachment.cgi?id=9376&action=edit
awk-bitwiseop-fix-arch64.patch
Hi there!
While working on a small awk program on an arm64 machine, I found that bitwise
operations are broken when operating on large numbers.
Looking under the hood, I found that awk numbers are doubles[1], whereas bitwise
operations are performed on unsigned longs[2].
The problem:
- a double typically represents integers exactly only up to 2^53
- unsigned long holds values up to 2^32 - 1 on 32-bit archs
- unsigned long typically holds values up to 2^64 - 1 on 64-bit archs
So the result of an unsigned long bitwise operation is stored in a double.
This means that data is lost on 64-bit archs that use 64-bit unsigned longs when
the result is greater than 2^53. For example, evaluating a simple compl(0)
on arm64 or x64 Linux produces an unexpected result:
awk 'BEGIN{print compl(0)%4}'
It returns 0 instead of 3.
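The arithmetic behind that result can be reproduced in plain C. This is only a sketch, not Busybox code: it models a 64-bit unsigned long with uint64_t, assumes IEEE 754 doubles with the default rounding mode, and the names mod4, compl0_mod4_via_double and compl0_mod4_exact are made up for illustration.

```c
#include <stdint.h>

/* Remainder of x / 4, written by hand so the sketch does not depend on
 * libm's fmod(). Only meant for the non-negative values used here
 * (x must stay below 2^66 so the quotient fits in 64 bits). */
static double mod4(double x)
{
    uint64_t quotient = (uint64_t)(x / 4.0); /* truncates toward zero */
    return x - 4.0 * (double)quotient;
}

/* Hypothetical model of compl(0) % 4 when the 64-bit result of the
 * bitwise operation is stored in a double before the modulo. */
double compl0_mod4_via_double(void)
{
    uint64_t all_ones = ~(uint64_t)0; /* 2^64 - 1, all 64 bits set */
    double stored = (double)all_ones; /* rounds to 2^64: low bits are lost */
    return mod4(stored);
}

/* The same modulo computed on the integer itself. */
unsigned compl0_mod4_exact(void)
{
    return (unsigned)(~(uint64_t)0 % 4u);
}
```

Under those assumptions the first function yields 0 and the second yields 3, which matches the difference described above.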
But it works in GNU Awk. Why?
Well, apparently all gawk bitwise operations return the result of a function
called make_integer[3] which in turn calls another function that fixes the
issue I described above: adjust_uint[4].
adjust_uint basically truncates values wider than 53 bits (such as a 64-bit
unsigned long) to 53 bits from the left, preserving the low-order bits.
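The idea of truncating from the left can be sketched as follows. This is not the gawk function (which handles more cases) nor the code in the attached patch; keep_low_mantissa_bits is a hypothetical name, uint64_t stands in for a 64-bit unsigned long, and the sketch assumes FLT_RADIX == 2 so that DBL_MANT_DIG counts bits.

```c
#include <float.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper: keep only the low-order bits of n that a double
 * can represent exactly, dropping the high-order bits. */
uint64_t keep_low_mantissa_bits(uint64_t n)
{
    int wordbits = (int)(CHAR_BIT * sizeof n); /* 64 for uint64_t */
    int shift = wordbits - DBL_MANT_DIG;       /* 64 - 53 = 11 with IEEE doubles */

    if (shift > 0)
        n = (n << shift) >> shift; /* shift the top bits out, then back */
    return n;
}
```

With IEEE doubles, keep_low_mantissa_bits(~(uint64_t)0) gives 2^53 - 1, which converts to a double without rounding, so taking it modulo 4 gives 3.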
So I went ahead and shamelessly copied adjust_uint into Busybox Awk and it
worked!
And here I am submitting a patch with the changes adapted to Busybox :)
This adaptation includes:
- Replacing uintmax_t with unsigned long in adjust_uint and the
count_trailing_zeros helper, as the result of bitwise operations in Busybox is
an unsigned long
- Replacing GCC __builtin_ctzll (unsigned long long) with GCC __builtin_ctzl
(unsigned long)
- Including float.h for the FLT_RADIX macro
- Removing some macros that adapt adjust_uint for platforms where gawk numbers
are long doubles
- Renaming some macros and their mentions in the original gawk comments
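
As an illustration of the __builtin_ctzl point above, a trailing-zero counter for unsigned long could look roughly like this. It is a sketch under assumptions, not the helper from the patch: trailing_zeros_ulong is a hypothetical name, and the loop fallback for non-GCC compilers is my own addition.

```c
#include <limits.h>

/* Hypothetical helper: number of trailing zero bits in n. */
int trailing_zeros_ulong(unsigned long n)
{
    if (n == 0)
        return (int)(CHAR_BIT * sizeof n); /* the builtin is undefined for 0 */
#if defined(__GNUC__)
    return __builtin_ctzl(n); /* unsigned long variant of the GCC builtin */
#else
    int count = 0;
    while ((n & 1UL) == 0) { /* portable fallback: shift until the low bit is set */
        n >>= 1;
        count++;
    }
    return count;
#endif
}
```
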
Cheers,
Carlos
[1] https://git.busybox.net/busybox/tree/editors/awk.c?id=c8c1fcdba163f264a503380bc63485aacd09214c#n123
[2] https://git.busybox.net/busybox/tree/editors/awk.c?id=c8c1fcdba163f264a503380bc63485aacd09214c#n1048
[3] https://git.savannah.gnu.org/cgit/gawk.git/tree/builtin.c?id=d434cb3ce61e0cc5e26180da914f1a58223897a2#n3565
[4] https://git.savannah.gnu.org/cgit/gawk.git/tree/floatcomp.c?id=d434cb3ce61e0cc5e26180da914f1a58223897a2#n91
--
You are receiving this mail because:
You are on the CC list for the bug.