From bugzilla at busybox.net Mon Nov 1 17:32:57 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Mon, 01 Nov 2021 17:32:57 +0000 Subject: [Bug 14306] New: ash: incorrect tilde expansion Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14306 Bug ID: 14306 Summary: ash: incorrect tilde expansion Product: Busybox Version: 1.33.x Hardware: All OS: Linux Status: NEW Severity: normal Priority: P5 Component: Standard Compliance Assignee: unassigned at busybox.net Reporter: dg+busybox at atufi.org CC: busybox-cvs at busybox.net Target Milestone: --- On busybox 1.33.1, tilde expansion incorrectly alters words when the tilde-prefix matches no valid login name (note the missing ending slash): $ ash -c 'echo ~~nouser/' ~~nouser $ bash -c 'echo ~~nouser/' ~~nouser/ -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Wed Nov 3 04:12:17 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Wed, 03 Nov 2021 04:12:17 +0000 Subject: [Bug 14316] New: get_free_loop needs waiting Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14316 Bug ID: 14316 Summary: get_free_loop needs waiting Product: Busybox Version: 1.33.x Hardware: PC OS: Linux Status: NEW Severity: major Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: aswjh at 163.com CC: busybox-cvs at busybox.net Target Milestone: --- libbb/loop.c: set_loop Sometimes the loop device is not ready after get_free_loop, which raises "can't setup loop device: No such file or directory". It works if we wait (usleep in a retry loop) before "goto open_lfd": try = xasprintf(LOOP_FORMAT, i); for (lc=0; lc<100; lc++) { if (stat(try, &buf2)==0) break; usleep(20); } goto open_lfd; -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Wed Nov 3 04:23:16 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Wed, 03 Nov 2021 04:23:16 +0000 Subject: [Bug 14316] get_free_loop needs waiting In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14316 --- Comment #1 from wjh --- Linux box 5.3.11-tinycore64 #1 SMP Wed Nov 20 08:16:37 CST 2019 x86_64 GNU/Linux -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Fri Nov 5 00:16:49 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Fri, 05 Nov 2021 00:16:49 +0000 Subject: [Bug 14326] New: [PATCH] pkill: add -e to display the name and PID of the process being killed Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14326 Bug ID: 14326 Summary: [PATCH] pkill: add -e to display the name and PID of the process being killed Product: Busybox Version: unspecified Hardware: All OS: Linux Status: NEW Severity: enhancement Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: sautier.louis at gmail.com CC: busybox-cvs at busybox.net Target Milestone: --- Created attachment 9146 --> https://bugs.busybox.net/attachment.cgi?id=9146&action=edit 0001-pkill-add-e-to-display-the-name-and-PID-of-the-proce.patch Hello, I found this pkill feature very useful so I implemented it. Please let me know if the attached patch is OK. -- You are receiving this mail because: You are on the CC list for the bug. 
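For reference, the option proposed in bug 14326 mirrors procps-ng's pkill -e, which echoes each process it signals. A rough usage sketch (output format taken from procps-ng; the attached busybox patch may format it differently, and the PID shown is only illustrative):

```
$ sleep 300 &
$ pkill -e sleep
sleep killed (pid 12345)
```
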
From bugzilla at busybox.net Sun Nov 7 19:29:45 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 07 Nov 2021 19:29:45 +0000 Subject: [Bug 14331] New: tr doesn't understand [:class:] character classes Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14331 Bug ID: 14331 Summary: tr doesn't understand [:class:] character classes Product: Busybox Version: 1.30.x Hardware: All OS: All Status: NEW Severity: critical Priority: P5 Component: Standard Compliance Assignee: unassigned at busybox.net Reporter: calestyo at scientia.org CC: busybox-cvs at busybox.net Target Milestone: --- Hey. Contrary to what POSIX mandates: https://pubs.opengroup.org/onlinepubs/9699919799/utilities/tr.html busybox's tr doesn't seem to understand any of the character classes, and I'd guess it doesn't understand the other formats given in POSIX's EXTENDED DESCRIPTION either. Not only does it not understand them, it even takes such characters literally, so e.g. busybox tr -d '[:alpha:]' will remove 'a' and so on. Cheers, Chris. -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Sun Nov 7 21:30:24 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 07 Nov 2021 21:30:24 +0000 Subject: [Bug 14331] tr doesn't understand [:class:] character classes In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14331 --- Comment #1 from Ron Yorston --- Character classes should work with the default build configuration, though they can be disabled by turning off FEATURE_TR_CLASSES. Is it possible that's the case for the binary you're using? -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Sun Nov 7 22:06:08 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 07 Nov 2021 22:06:08 +0000 Subject: [Bug 14331] tr doesn't understand [:class:] character classes In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14331 Christoph Anton Mitterer changed: What |Removed |Added ---------------------------------------------------------------------------- Resolution|--- |INVALID Status|NEW |RESOLVED --- Comment #2 from Christoph Anton Mitterer --- Indeed, Debian seems to have disabled this. Sorry for the noise. Thanks, Chris. -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Sun Nov 7 22:55:08 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 07 Nov 2021 22:55:08 +0000 Subject: [Bug 14331] tr doesn't understand [:class:] character classes In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14331 --- Comment #3 from Christoph Anton Mitterer --- Just for the record, forwarded downstream to: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=998803 -- You are receiving this mail because: You are on the CC list for the bug. 
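For reference on bug 14331: with FEATURE_TR_CLASSES enabled (the busybox defconfig), a stock binary is expected to handle the classes the same way coreutils tr does. A minimal sanity check along these lines (not run against the Debian build in question):

```
$ echo 'abc-123' | busybox tr -d '[:alpha:]'
-123
$ echo 'abc-123' | busybox tr '[:lower:]' '[:upper:]'
ABC-123
```
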
From bugzilla at busybox.net Sun Nov 7 23:44:35 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 07 Nov 2021 23:44:35 +0000 Subject: [Bug 14336] New: busybox sed differs from GNU sed with respect to NUL (0x00) Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14336 Bug ID: 14336 Summary: busybox sed differs from GNU sed with respect to NUL (0x00) Product: Busybox Version: 1.30.x Hardware: All OS: All Status: NEW Severity: normal Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: calestyo at scientia.org CC: busybox-cvs at busybox.net Target Milestone: --- Hey. Not sure whether this is a "bug" or just something not defined by POSIX (I'm not really sure whether POSIX says anything with respect to sed and NUL),... at least it doesn't seem to be a configure option this time. I've noted a differing behaviour between busybox' sed and GNU sed with respect to 0x00: It seems that GNU sed, leaves any 0x00 (as well as other "binary" characters) in the current line and respects it when matching. busybox' sed OTOH, doesn't do this but seems to terminate the string upon such 0x00. Example Files: $ hd test-with-0x00 00000000 66 6f 6f 0a 62 61 72 0a 7a 65 72 00 0a 62 61 7a |foo.bar.zer..baz| 00000010 0a 7a 65 72 00 0a 65 6e 64 0a |.zer..end.| 0000001a $ hd test-with-lone-0x00 00000000 66 6f 6f 0a 62 61 72 0a 00 0a 62 61 7a 0a 7a 65 |foo.bar...baz.ze| 00000010 72 00 0a 65 6e 64 0a |r..end.| 00000017 $ hd test-with-0x02-and-0x00 00000000 66 6f 6f 0a 62 61 72 0a 7a 65 02 00 0a 62 61 7a |foo.bar.ze...baz| 00000010 0a 7a 65 72 00 0a 65 6e 64 0a |.zer..end.| 0000001a $ hd test-with-0x00-followed-by-alpha 00000000 66 6f 6f 0a 62 61 72 0a 7a 65 72 00 6f 6f 0a 62 |foo.bar.zer.oo.b| 00000010 61 7a 0a 7a 65 72 00 74 74 0a 65 6e 64 0a |az.zer.tt.end.| 0000001e GNU sed: $ sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x00 | hd 00000000 7a 65 72 00 0a |zer..| 00000005 $ sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-lone-0x00 | hd 00000000 00 0a |..| 00000002 $ sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x02-and-0x00 | hd 00000000 7a 65 02 00 0a |ze...| 00000005 $ sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x00-followed-by-alpha | hd 00000000 7a 65 72 00 6f 6f 0a |zer.oo.| 00000007 (Note that GNU sed's -z option is NOT used.) busybox' sed: $ busybox sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x00 | hd $ busybox sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-lone-0x00 | hd $ busybox sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x02-and-0x00 | hd $ busybox sed -n '0,/[^[:alnum:][:space:][:punct:]]/{/[^[:alnum:][:space:][:punct:]]/p}' test-with-0x00-followed-by-alpha | hd $ So it seems that busybox' sed simply does the matching till the 0x00 (which is perhaps used as string terminator), while GNU sed goes fully down the end of line (\n). Though it's worth to bring this to your attention. Cheers, Chris. -- You are receiving this mail because: You are on the CC list for the bug. 
From vda.linux at googlemail.com Tue Nov 9 12:51:22 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 9 Nov 2021 13:51:22 +0100 Subject: [git commit] which: add -a to help text Message-ID: <20211109124747.1A5A48B7EE@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=15f7d618ea7f8c3a0277c98309268b709e20d77c branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta packed_usage 34075 34079 +4 Signed-off-by: Denys Vlasenko --- debianutils/which.c | 5 +++-- 1 file changed, 3 insertions(+), 2 deletions(-) diff --git a/debianutils/which.c b/debianutils/which.c index b9f1b92fd..23692dc6f 100644 --- a/debianutils/which.c +++ b/debianutils/which.c @@ -17,9 +17,10 @@ //kbuild:lib-$(CONFIG_WHICH) += which.o //usage:#define which_trivial_usage -//usage: "COMMAND..." +//usage: "[-a] COMMAND..." //usage:#define which_full_usage "\n\n" -//usage: "Locate COMMAND" +//usage: "Locate COMMAND\n" +//usage: "\n -a Show all matches" //usage: //usage:#define which_example_usage //usage: "$ which login\n" From bugzilla at busybox.net Wed Nov 10 08:18:13 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Wed, 10 Nov 2021 08:18:13 +0000 Subject: [Bug 14341] New: BusyBox – 14 new vulnerabilities Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14341 Bug ID: 14341 Summary: BusyBox – 14 new vulnerabilities Product: Busybox Version: 1.33.x Hardware: All OS: Linux Status: NEW Severity: normal Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: xiechengliang1 at huawei.com CC: busybox-cvs at busybox.net Target Milestone: --- The jfrog website has disclosed 14 vulnerabilities, which were fixed in busybox 1.34.0, but I can't find the fix commit for each CVE. Who can help me? Reference: https://jfrog.com/blog/unboxing-busybox-14-new-vulnerabilities-uncovered-by-claroty-and-jfrog/ -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Sat Nov 20 00:17:12 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sat, 20 Nov 2021 00:17:12 +0000 Subject: [Bug 14361] New: udhcpc ignores T1 and T2 values Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14361 Bug ID: 14361 Summary: udhcpc ignores T1 and T2 values Product: Busybox Version: unspecified Hardware: All OS: Linux Status: NEW Severity: normal Priority: P5 Component: Networking Assignee: unassigned at busybox.net Reporter: luke-jr+busyboxbugs at utopios.org CC: busybox-cvs at busybox.net Target Milestone: --- It seems the code just hard-codes T1 at half the lease time https://git.busybox.net/busybox/tree/networking/udhcp/dhcpc.c?h=1_34_stable#n1802 Values provided by the DHCP server (opt 58, 59) just get ignored... Use case: I issue 24-hour leases, and want the lease to be used that long if necessary (e.g., if the router is down), but I also want DHCP renewals every minute so: 1) I can rapidly give static IP leases and have the clients pick up on them; and 2) router reboots can quickly rebuild their lease table without persistent state. -- You are receiving this mail because: You are on the CC list for the bug. 
From bugzilla at busybox.net Sat Nov 20 11:33:22 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sat, 20 Nov 2021 11:33:22 +0000 Subject: [Bug 13736] LABEL/UUID mount in fstab doesn't work In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=13736 --- Comment #1 from stsp --- Ironically, I reported the same bug to util-linux and it was fixed: https://github.com/util-linux/util-linux/issues/1492 I wonder if busybox's mount also uses libblkid from util-linux, so maybe in busybox this is now fixed too? Or is it the same bug in an entirely different code base? -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Sun Nov 21 10:42:50 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Sun, 21 Nov 2021 10:42:50 +0000 Subject: [Bug 13736] LABEL/UUID mount in fstab doesn't work In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=13736 --- Comment #2 from Fabrice Fontaine --- From my understanding, this is a different code base: https://git.busybox.net/busybox/tree/util-linux -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Tue Nov 23 17:34:49 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Tue, 23 Nov 2021 17:34:49 +0000 Subject: [Bug 14376] New: Tar component in busybox version 1.34.1 has a memory leak bug when trying to unpack a tar file. Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14376 Bug ID: 14376 Summary: Tar component in busybox version 1.34.1 has a memory leak bug when trying to unpack a tar file. Product: Busybox Version: unspecified Hardware: All OS: Linux Status: NEW Severity: major Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: spwpun at gmail.com CC: busybox-cvs at busybox.net Target Milestone: --- Created attachment 9156 --> https://bugs.busybox.net/attachment.cgi?id=9156&action=edit try to unpack this file with cmds above. Hi~ In libbb/xfuncs_printf.c:50, malloc is called twice, for archive_handle and archive_handle->file_header, with 184 and 72 bytes of heap space. Back in the tar_main function, the two pointers (tar_handle, tar_handle->file_header) haven't been freed when it returns. Compile cmds: ``` make O=/path/to/build defconfig make O=/path/to/build menuconfig # and choose ASAN options cd /path/to/build && make -j4 ``` Reproduce cmd: ``` ./busybox_unstripped tar -xf test.tar ``` Backtrace in gdb: ``` [#0] 0x555555e7022e ? tar_main(argc=0x3, argv=0x7fffffffe430) [#1] 0x555555b06aac ? run_applet_no_and_exit(applet_no=0x148, name=0x7fffffffe709 "tar", argv=0x7fffffffe430) [#2] 0x555555b06b6b ? run_applet_and_exit(name=0x7fffffffe709 "tar", argv=0x7fffffffe430) [#3] 0x555555b067cf ? busybox_main(argv=0x7fffffffe430) [#4] 0x555555b06b29 ? run_applet_and_exit(name=0x7fffffffe6f6 "busybox_unstripped", argv=0x7fffffffe428) [#5] 0x555555b06cbf ? 
main(argc=0x4, argv=0x7fffffffe428) ``` LeakSanitizer log: ``` ================================================================= ==120986==ERROR: LeakSanitizer: detected memory leaks Direct leak of 184 byte(s) in 1 object(s) allocated from: #0 0x7efda806bb40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40) #1 0x555577ed8987 in xmalloc /home/zy/packages/dhcp-targets/busybox-1.34.1/libbb/xfuncs_printf.c:50 Indirect leak of 72 byte(s) in 1 object(s) allocated from: #0 0x7efda806bb40 in __interceptor_malloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xdeb40) #1 0x555577ed8987 in xmalloc /home/zy/packages/dhcp-targets/busybox-1.34.1/libbb/xfuncs_printf.c:50 SUMMARY: AddressSanitizer: 256 byte(s) leaked in 2 allocation(s). ``` -- You are receiving this mail because: You are on the CC list for the bug. From vda.linux at googlemail.com Tue Nov 23 04:31:30 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 23 Nov 2021 05:31:30 +0100 Subject: [git commit branch/1_33_stable] unlzma: fix a case where we could read before beginning of buffer Message-ID: <20211124132354.46FE78F1C6@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=d326be2850ea2bd78fe2c22d6c45c3b861d82937 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/1_33_stable Testcase: 21 01 01 00 00 00 00 00 e7 01 01 01 ef 00 df b6 00 17 02 10 11 0f ff 00 16 00 00 Unfortunately, the bug is not reliably causing a segfault, the behavior depends on what's in memory before the buffer. function old new delta unpack_lzma_stream 2762 2768 +6 Signed-off-by: Denys Vlasenko (cherry picked from commit 04f052c56ded5ab6a904e3a264a73dc0412b2e78) --- archival/libarchive/decompress_unlzma.c | 5 ++++- testsuite/unlzma.tests | 17 +++++++++++++---- testsuite/unlzma_issue_3.lzma | Bin 0 -> 27 bytes 3 files changed, 17 insertions(+), 5 deletions(-) diff --git a/archival/libarchive/decompress_unlzma.c b/archival/libarchive/decompress_unlzma.c index 0744f231a..fb5aac8fe 100644 --- a/archival/libarchive/decompress_unlzma.c +++ b/archival/libarchive/decompress_unlzma.c @@ -290,8 +290,11 @@ unpack_lzma_stream(transformer_state_t *xstate) uint32_t pos; pos = buffer_pos - rep0; - if ((int32_t)pos < 0) + if ((int32_t)pos < 0) { pos += header.dict_size; + if ((int32_t)pos < 0) + goto bad; + } match_byte = buffer[pos]; do { int bit; diff --git a/testsuite/unlzma.tests b/testsuite/unlzma.tests index 0e98afe09..fcc6e9441 100755 --- a/testsuite/unlzma.tests +++ b/testsuite/unlzma.tests @@ -8,14 +8,23 @@ # Damaged encrypted streams testing "unlzma (bad archive 1)" \ - "unlzma /dev/null; echo \$?" \ -"1 + "unlzma &1 >/dev/null; echo \$?" \ +"unlzma: corrupted data +1 " "" "" # Damaged encrypted streams testing "unlzma (bad archive 2)" \ - "unlzma /dev/null; echo \$?" \ -"1 + "unlzma &1 >/dev/null; echo \$?" \ +"unlzma: corrupted data +1 +" "" "" + +# Damaged encrypted streams +testing "unlzma (bad archive 3)" \ + "unlzma &1 >/dev/null; echo \$?" 
\ +"unlzma: corrupted data +1 " "" "" exit $FAILCOUNT diff --git a/testsuite/unlzma_issue_3.lzma b/testsuite/unlzma_issue_3.lzma new file mode 100644 index 000000000..cc60f29e4 Binary files /dev/null and b/testsuite/unlzma_issue_3.lzma differ From vda.linux at googlemail.com Tue Nov 23 04:31:30 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 23 Nov 2021 05:31:30 +0100 Subject: [git commit branch/1_33_stable] ash: parser: Fix VSLENGTH parsing with trailing garbage Message-ID: <20211124132354.54F288F1C7@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=5b939a6d290651bcd836083d2a3e6fa6ff7bc636 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/1_33_stable Let's adopt Herbert Xu's patch, not waiting for it to reach dash git: hush already has a similar fix. Signed-off-by: Denys Vlasenko (cherry picked from commit 53a7a9cd8c15d64fcc2278cf8981ba526dfbe0d2) --- shell/ash.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/shell/ash.c b/shell/ash.c index a33ab0626..1ca45f9c1 100644 --- a/shell/ash.c +++ b/shell/ash.c @@ -12635,7 +12635,7 @@ parsesub: { do { STPUTC(c, out); c = pgetc_eatbnl(); - } while (!subtype && isdigit(c)); + } while ((subtype == 0 || subtype == VSLENGTH) && isdigit(c)); } else if (c != '}') { /* $[{[#]][}] */ int cc = c; @@ -12665,11 +12665,6 @@ parsesub: { } else goto badsub; - if (c != '}' && subtype == VSLENGTH) { - /* ${#VAR didn't end with } */ - goto badsub; - } - if (subtype == 0) { static const char types[] ALIGN1 = "}-+?="; /* ${VAR...} but not $VAR or ${#VAR} */ @@ -12726,6 +12721,8 @@ parsesub: { #endif } } else { + if (subtype == VSLENGTH && c != '}') + subtype = 0; badsub: pungetc(); } From vda.linux at googlemail.com Wed Nov 24 13:27:03 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Wed, 24 Nov 2021 14:27:03 +0100 Subject: [git commit branch/1_33_stable] Bump version to 1.33.2 Message-ID: <20211124132354.794FF8F1C6@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=db726ae0c61ffec6b58e19749e0c63aaaf4f6989 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/1_33_stable Signed-off-by: Denys Vlasenko --- Makefile | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Makefile b/Makefile index 35d1589cb..5af09b38c 100644 --- a/Makefile +++ b/Makefile @@ -1,6 +1,6 @@ VERSION = 1 PATCHLEVEL = 33 -SUBLEVEL = 1 +SUBLEVEL = 2 EXTRAVERSION = NAME = Unnamed From vda.linux at googlemail.com Tue Nov 23 04:31:30 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 23 Nov 2021 05:31:30 +0100 Subject: [git commit branch/1_33_stable] hush: fix handling of "cmd && &" Message-ID: <20211124132354.70F2A8F1C7@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=bb612052900542046ce75e61a4e0b030c946984b branch: https://git.busybox.net/busybox/commit/?id=refs/heads/1_33_stable function old new delta done_pipe 213 231 +18 Signed-off-by: Denys Vlasenko (cherry picked from commit 83a4967e50422867f340328d404994553e56b839) --- shell/hush.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-) diff --git a/shell/hush.c b/shell/hush.c index 249728b9d..41a4653ea 100644 --- a/shell/hush.c +++ b/shell/hush.c @@ -3694,9 +3694,10 @@ static void debug_print_tree(struct pipe *pi, int lvl) pin = 0; while (pi) { - fdprintf(2, "%*spipe %d %sres_word=%s followup=%d %s\n", + fdprintf(2, "%*spipe %d #cmds:%d %sres_word=%s followup=%d %s\n", lvl*2, "", pin, + pi->num_cmds, (IF_HAS_KEYWORDS(pi->pi_inverted ? "! 
" :) ""), RES[pi->res_word], pi->followup, PIPE[pi->followup] @@ -3839,6 +3840,9 @@ static void done_pipe(struct parse_context *ctx, pipe_style type) #endif /* Replace all pipes in ctx with one newly created */ ctx->list_head = ctx->pipe = pi; + /* for cases like "cmd && &", do not be tricked by last command + * being null - the entire {...} & is NOT null! */ + not_null = 1; } else { no_conv: ctx->pipe->followup = type; From vda.linux at googlemail.com Tue Nov 23 04:31:30 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 23 Nov 2021 05:31:30 +0100 Subject: [git commit branch/1_33_stable] hush: fix handling of \^C and "^C" Message-ID: <20211124132354.643E18F1C6@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=56a335378ac100d51c30b21eee499a2effa37fba branch: https://git.busybox.net/busybox/commit/?id=refs/heads/1_33_stable function old new delta parse_stream 2238 2252 +14 encode_string 243 256 +13 ------------------------------------------------------------------------------ (add/remove: 0/0 grow/shrink: 2/0 up/down: 27/0) Total: 27 bytes Signed-off-by: Denys Vlasenko (cherry picked from commit 1b7a9b68d0e9aa19147d7fda16eb9a6b54156985) --- shell/ash_test/ash-misc/control_char3.right | 1 + shell/ash_test/ash-misc/control_char3.tests | 2 ++ shell/ash_test/ash-misc/control_char4.right | 1 + shell/ash_test/ash-misc/control_char4.tests | 2 ++ shell/hush.c | 11 +++++++++++ shell/hush_test/hush-misc/control_char3.right | 1 + shell/hush_test/hush-misc/control_char3.tests | 2 ++ shell/hush_test/hush-misc/control_char4.right | 1 + shell/hush_test/hush-misc/control_char4.tests | 2 ++ 9 files changed, 23 insertions(+) diff --git a/shell/ash_test/ash-misc/control_char3.right b/shell/ash_test/ash-misc/control_char3.right new file mode 100644 index 000000000..283e02cbb --- /dev/null +++ b/shell/ash_test/ash-misc/control_char3.right @@ -0,0 +1 @@ +SHELL: line 1: : not found diff --git a/shell/ash_test/ash-misc/control_char3.tests b/shell/ash_test/ash-misc/control_char3.tests new file mode 100755 index 000000000..4359db3f3 --- /dev/null +++ b/shell/ash_test/ash-misc/control_char3.tests @@ -0,0 +1,2 @@ +# (set argv0 to "SHELL" to avoid "/path/to/shell: blah" in error messages) +$THIS_SH -c '\' SHELL diff --git a/shell/ash_test/ash-misc/control_char4.right b/shell/ash_test/ash-misc/control_char4.right new file mode 100644 index 000000000..2bf18e684 --- /dev/null +++ b/shell/ash_test/ash-misc/control_char4.right @@ -0,0 +1 @@ +SHELL: line 1: -: not found diff --git a/shell/ash_test/ash-misc/control_char4.tests b/shell/ash_test/ash-misc/control_char4.tests new file mode 100755 index 000000000..48010f154 --- /dev/null +++ b/shell/ash_test/ash-misc/control_char4.tests @@ -0,0 +1,2 @@ +# (set argv0 to "SHELL" to avoid "/path/to/shell: blah" in error messages) +$THIS_SH -c '"-"' SHELL diff --git a/shell/hush.c b/shell/hush.c index 9fead37da..249728b9d 100644 --- a/shell/hush.c +++ b/shell/hush.c @@ -5235,6 +5235,11 @@ static int encode_string(o_string *as_string, } #endif o_addQchr(dest, ch); + if (ch == SPECIAL_VAR_SYMBOL) { + /* Convert "^C" to corresponding special variable reference */ + o_addchr(dest, SPECIAL_VAR_QUOTED_SVS); + o_addchr(dest, SPECIAL_VAR_SYMBOL); + } goto again; #undef as_string } @@ -5346,6 +5351,11 @@ static struct pipe *parse_stream(char **pstring, if (ch == '\n') continue; /* drop \, get next char */ nommu_addchr(&ctx.as_string, '\\'); + if (ch == SPECIAL_VAR_SYMBOL) { + nommu_addchr(&ctx.as_string, ch); + /* Convert \^C to corresponding special 
variable reference */ + goto case_SPECIAL_VAR_SYMBOL; + } o_addchr(&ctx.word, '\\'); if (ch == EOF) { /* Testcase: eval 'echo Ok\' */ @@ -5670,6 +5680,7 @@ static struct pipe *parse_stream(char **pstring, /* Note: nommu_addchr(&ctx.as_string, ch) is already done */ switch (ch) { + case_SPECIAL_VAR_SYMBOL: case SPECIAL_VAR_SYMBOL: /* Convert raw ^C to corresponding special variable reference */ o_addchr(&ctx.word, SPECIAL_VAR_SYMBOL); diff --git a/shell/hush_test/hush-misc/control_char3.right b/shell/hush_test/hush-misc/control_char3.right new file mode 100644 index 000000000..94b4f8699 --- /dev/null +++ b/shell/hush_test/hush-misc/control_char3.right @@ -0,0 +1 @@ +hush: can't execute '': No such file or directory diff --git a/shell/hush_test/hush-misc/control_char3.tests b/shell/hush_test/hush-misc/control_char3.tests new file mode 100755 index 000000000..4359db3f3 --- /dev/null +++ b/shell/hush_test/hush-misc/control_char3.tests @@ -0,0 +1,2 @@ +# (set argv0 to "SHELL" to avoid "/path/to/shell: blah" in error messages) +$THIS_SH -c '\' SHELL diff --git a/shell/hush_test/hush-misc/control_char4.right b/shell/hush_test/hush-misc/control_char4.right new file mode 100644 index 000000000..698e21427 --- /dev/null +++ b/shell/hush_test/hush-misc/control_char4.right @@ -0,0 +1 @@ +hush: can't execute '-': No such file or directory diff --git a/shell/hush_test/hush-misc/control_char4.tests b/shell/hush_test/hush-misc/control_char4.tests new file mode 100755 index 000000000..48010f154 --- /dev/null +++ b/shell/hush_test/hush-misc/control_char4.tests @@ -0,0 +1,2 @@ +# (set argv0 to "SHELL" to avoid "/path/to/shell: blah" in error messages) +$THIS_SH -c '"-"' SHELL From vda at busybox.net Wed Nov 24 13:27:03 2021 From: vda at busybox.net (Denys Vlasenko) Date: Wed, 24 Nov 2021 14:27:03 +0100 Subject: [tag/1_33_2] new tag created Message-ID: <20211124132406.675128F1C7@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=db726ae0c61ffec6b58e19749e0c63aaaf4f6989 tag: https://git.busybox.net/busybox/commit/?id=refs/tags/1_33_2 Bump version to 1.33.2 From bugzilla at busybox.net Thu Nov 25 11:44:57 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Thu, 25 Nov 2021 11:44:57 +0000 Subject: [Bug 14381] New: busybox awk '$2 == var' can fail to give only lines with given search string Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14381 Bug ID: 14381 Summary: busybox awk '$2 == var' can fail to give only lines with given search string Product: Busybox Version: unspecified Hardware: All OS: Linux Status: NEW Severity: major Priority: P5 Component: Standard Compliance Assignee: unassigned at busybox.net Reporter: ricercar at tuta.io CC: busybox-cvs at busybox.net Target Milestone: --- Created attachment 9161 --> https://bugs.busybox.net/attachment.cgi?id=9161&action=edit .config Version:1.34.1 Expected: this code to only output second line, but it outputs both lines: > printf "8 0091\n9 0133\n"|~/busybox-1.34.1/busybox awk '$2 == 0133' -- You are receiving this mail because: You are on the CC list for the bug. 
From bugzilla at busybox.net Thu Nov 25 11:51:37 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Thu, 25 Nov 2021 11:51:37 +0000 Subject: [Bug 14386] New: ls -sh does not show human readable size Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14386 Bug ID: 14386 Summary: ls -sh does not show human readable size Product: Busybox Version: unspecified Hardware: All OS: Linux Status: NEW Severity: minor Priority: P5 Component: Standard Compliance Assignee: unassigned at busybox.net Reporter: ricercar at tuta.io CC: busybox-cvs at busybox.net Target Milestone: --- Created attachment 9166 --> https://bugs.busybox.net/attachment.cgi?id=9166&action=edit .config Version: 1.34.1 and 1.33.1 Expected: Human readable sizes, but the command below shows the same output as ls -s (only "total:" is human readable): ~/busybox-1.34.1/busybox ls -sh -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Thu Nov 25 15:46:05 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Thu, 25 Nov 2021 15:46:05 +0000 Subject: [Bug 14381] busybox awk '$2 == var' can fail to give only lines with given search string In-Reply-To: References: Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14381 --- Comment #1 from ricercar at tuta.io --- I'm on Alpine Linux btw, which uses musl. -- You are receiving this mail because: You are on the CC list for the bug. From bugzilla at busybox.net Thu Nov 25 21:28:30 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Thu, 25 Nov 2021 21:28:30 +0000 Subject: [Bug 14391] New: sha1sum slow on x64 and possibly others Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14391 Bug ID: 14391 Summary: sha1sum slow on x64 and possibly others Product: Busybox Version: unspecified Hardware: All OS: All Status: NEW Severity: enhancement Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: blazejroszkowski at gmail.com CC: busybox-cvs at busybox.net Target Milestone: --- Created attachment 9171 --> https://bugs.busybox.net/attachment.cgi?id=9171&action=edit dot config file from my build on Fedora VM sha1sum in BusyBox is over twice as slow as a decently optimized C implementation. All tests were done on x86-64 (VMs and real OSes, Windows and Linux, two laptops). I don't know if this is a performance problem on other architectures (I'd guess it is). Test file (1 GiB): dd if=/dev/urandom of=gig.gig bs=1024 count=$((1024 * 1024)) GNU coreutils sha1sum (Git for Windows, no libcrypto use) and my own implementation take (on my personal laptop) 2.8 seconds. BusyBox takes 6.3 seconds. It's around 2x slower on my work laptop as well. This is present in at least versions: 1.34.1 (my own build on Fedora 34, .config attached), 1.34.1 in latest Alpine (3.15.0), 1.34.1 from Fedora 34 repos, 1.34.0 on Windows. I also remember it being present on Ubuntu 18.04 LTS and 20.04 LTS as well (busybox from the repos). My optimized plain C sha1 implementation (that I'm happy to contribute) is here: https://github.com/FRex/blasha1 The only downside I see from using the optimized C version is a potential increase in binary size, since the optimized code is heavily unrolled (but I didn't investigate this increase). I've searched for "performance", "sha1", "sha1sum", and "speed" on this Bugzilla and found nothing about this. I understand it's an old, semi-obsolete algorithm, but if BusyBox provides this util, I assume it's better for it to be fast rather than slow. 
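For anyone reproducing the numbers above, the test boils down to the following (commands and timings taken from the report; they will of course vary by machine):

```
$ dd if=/dev/urandom of=gig.gig bs=1024 count=$((1024 * 1024))
$ time busybox sha1sum gig.gig   # ~6.3 seconds reported
$ time sha1sum gig.gig           # GNU coreutils: ~2.8 seconds reported
```
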
-- You are receiving this mail because: You are on the CC list for the bug. From vda.linux at googlemail.com Sat Nov 27 10:28:11 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 11:28:11 +0100 Subject: [git commit] tls: P256: 64-bit optimizations Message-ID: <20211127105345.39F0C880C3@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=4bc9da10718df7ed9e992b1ddd2e80d53d894177 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_256_proj_point_dbl_8 421 428 +7 sp_256_point_from_bin2x32 78 84 +6 sp_256_cmp_8 38 42 +4 sp_256_to_bin_8 28 31 +3 ------------------------------------------------------------------------------ (add/remove: 0/0 grow/shrink: 4/0 up/down: 20/0) Total: 20 bytes Signed-off-by: Denys Vlasenko --- include/platform.h | 2 + networking/tls_sp_c32.c | 114 +++++++++++++++++++++++++++++++++++++++++------- 2 files changed, 101 insertions(+), 15 deletions(-) diff --git a/include/platform.h b/include/platform.h index 9e1fb047d..ad27bb31a 100644 --- a/include/platform.h +++ b/include/platform.h @@ -239,6 +239,7 @@ typedef uint64_t bb__aliased_uint64_t FIX_ALIASING; # define move_from_unaligned_long(v, longp) ((v) = *(bb__aliased_long*)(longp)) # define move_from_unaligned16(v, u16p) ((v) = *(bb__aliased_uint16_t*)(u16p)) # define move_from_unaligned32(v, u32p) ((v) = *(bb__aliased_uint32_t*)(u32p)) +# define move_from_unaligned64(v, u64p) ((v) = *(bb__aliased_uint64_t*)(u64p)) # define move_to_unaligned16(u16p, v) (*(bb__aliased_uint16_t*)(u16p) = (v)) # define move_to_unaligned32(u32p, v) (*(bb__aliased_uint32_t*)(u32p) = (v)) # define move_to_unaligned64(u64p, v) (*(bb__aliased_uint64_t*)(u64p) = (v)) @@ -250,6 +251,7 @@ typedef uint64_t bb__aliased_uint64_t FIX_ALIASING; # define move_from_unaligned_long(v, longp) (memcpy(&(v), (longp), sizeof(long))) # define move_from_unaligned16(v, u16p) (memcpy(&(v), (u16p), 2)) # define move_from_unaligned32(v, u32p) (memcpy(&(v), (u32p), 4)) +# define move_from_unaligned64(v, u64p) (memcpy(&(v), (u64p), 8)) # define move_to_unaligned16(u16p, v) do { \ uint16_t __t = (v); \ memcpy((u16p), &__t, 2); \ diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 4d4ecdd74..d09f7e881 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -29,6 +29,20 @@ static void dump_hex(const char *fmt, const void *vp, int len) typedef uint32_t sp_digit; typedef int32_t signed_sp_digit; +/* 64-bit optimizations: + * if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff, + * then loads and stores can be done in 64-bit chunks. + * + * A narrower case is when arch is also little-endian (such as x86_64), + * then "LSW first", uint32[8] and uint64[4] representations are equivalent, + * and arithmetic can be done in 64 bits too. + */ +#if defined(__GNUC__) && defined(__x86_64__) +# define UNALIGNED_LE_64BIT 1 +#else +# define UNALIGNED_LE_64BIT 0 +#endif + /* The code below is taken from parts of * wolfssl-3.15.3/wolfcrypt/src/sp_c32.c * and heavily modified. @@ -58,6 +72,22 @@ static const sp_digit p256_mod[8] = { * r A single precision integer. * a Byte array. 
*/ +#if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff +static void sp_256_to_bin_8(const sp_digit* rr, uint8_t* a) +{ + int i; + const uint64_t* r = (void*)rr; + + sp_256_norm_8(rr); + + r += 4; + for (i = 0; i < 4; i++) { + r--; + move_to_unaligned64(a, SWAP_BE64(*r)); + a += 8; + } +} +#else static void sp_256_to_bin_8(const sp_digit* r, uint8_t* a) { int i; @@ -71,6 +101,7 @@ static void sp_256_to_bin_8(const sp_digit* r, uint8_t* a) a += 4; } } +#endif /* Read big endian unsigned byte array into r. * @@ -78,6 +109,21 @@ static void sp_256_to_bin_8(const sp_digit* r, uint8_t* a) * a Byte array. * n Number of bytes in array to read. */ +#if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff +static void sp_256_from_bin_8(sp_digit* rr, const uint8_t* a) +{ + int i; + uint64_t* r = (void*)rr; + + r += 4; + for (i = 0; i < 4; i++) { + uint64_t v; + move_from_unaligned64(v, a); + *--r = SWAP_BE64(v); + a += 8; + } +} +#else static void sp_256_from_bin_8(sp_digit* r, const uint8_t* a) { int i; @@ -90,6 +136,7 @@ static void sp_256_from_bin_8(sp_digit* r, const uint8_t* a) a += 4; } } +#endif #if SP_DEBUG static void dump_256(const char *fmt, const sp_digit* r) @@ -125,6 +172,20 @@ static void sp_256_point_from_bin2x32(sp_point* p, const uint8_t *bin2x32) * return -ve, 0 or +ve if a is less than, equal to or greater than b * respectively. */ +#if UNALIGNED_LE_64BIT +static signed_sp_digit sp_256_cmp_8(const sp_digit* aa, const sp_digit* bb) +{ + const uint64_t* a = (void*)aa; + const uint64_t* b = (void*)bb; + int i; + for (i = 3; i >= 0; i--) { + if (a[i] == b[i]) + continue; + return (a[i] > b[i]) * 2 - 1; + } + return 0; +} +#else static signed_sp_digit sp_256_cmp_8(const sp_digit* a, const sp_digit* b) { int i; @@ -140,6 +201,7 @@ static signed_sp_digit sp_256_cmp_8(const sp_digit* a, const sp_digit* b) } return 0; } +#endif /* Compare two numbers to determine if they are equal. * @@ -196,8 +258,6 @@ static int sp_256_add_8(sp_digit* r, const sp_digit* a, const sp_digit* b) ); return reg; #elif ALLOW_ASM && defined(__GNUC__) && defined(__x86_64__) - /* x86_64 has no alignment restrictions, and is little-endian, - * so 64-bit and 32-bit representations are identical */ uint64_t reg; asm volatile ( "\n movq (%0), %3" @@ -294,8 +354,6 @@ static int sp_256_sub_8(sp_digit* r, const sp_digit* a, const sp_digit* b) ); return reg; #elif ALLOW_ASM && defined(__GNUC__) && defined(__x86_64__) - /* x86_64 has no alignment restrictions, and is little-endian, - * so 64-bit and 32-bit representations are identical */ uint64_t reg; asm volatile ( "\n movq (%0), %3" @@ -440,8 +498,6 @@ static void sp_256_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) r[15] = accl; memcpy(r, rr, sizeof(rr)); #elif ALLOW_ASM && defined(__GNUC__) && defined(__x86_64__) - /* x86_64 has no alignment restrictions, and is little-endian, - * so 64-bit and 32-bit representations are identical */ const uint64_t* aa = (const void*)a; const uint64_t* bb = (const void*)b; uint64_t rr[8]; @@ -551,17 +607,32 @@ static void sp_256_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) } /* Shift number right one bit. Bottom bit is lost. 
*/ -static void sp_256_rshift1_8(sp_digit* r, sp_digit* a, sp_digit carry) +#if UNALIGNED_LE_64BIT +static void sp_256_rshift1_8(sp_digit* rr, uint64_t carry) +{ + uint64_t *r = (void*)rr; + int i; + + carry = (((uint64_t)!!carry) << 63); + for (i = 3; i >= 0; i--) { + uint64_t c = r[i] << 63; + r[i] = (r[i] >> 1) | carry; + carry = c; + } +} +#else +static void sp_256_rshift1_8(sp_digit* r, sp_digit carry) { int i; - carry = (!!carry << 31); + carry = (((sp_digit)!!carry) << 31); for (i = 7; i >= 0; i--) { - sp_digit c = a[i] << 31; - r[i] = (a[i] >> 1) | carry; + sp_digit c = r[i] << 31; + r[i] = (r[i] >> 1) | carry; carry = c; } } +#endif /* Divide the number by 2 mod the modulus (prime). (r = a / 2 % m) */ static void sp_256_div2_8(sp_digit* r, const sp_digit* a, const sp_digit* m) @@ -570,7 +641,7 @@ static void sp_256_div2_8(sp_digit* r, const sp_digit* a, const sp_digit* m) if (a[0] & 1) carry = sp_256_add_8(r, a, m); sp_256_norm_8(r); - sp_256_rshift1_8(r, r, carry); + sp_256_rshift1_8(r, carry); } /* Add two Montgomery form numbers (r = a + b % m) */ @@ -634,15 +705,28 @@ static void sp_256_mont_tpl_8(sp_digit* r, const sp_digit* a /*, const sp_digit* } /* Shift the result in the high 256 bits down to the bottom. */ -static void sp_256_mont_shift_8(sp_digit* r, const sp_digit* a) +#if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff +static void sp_256_mont_shift_8(sp_digit* rr) +{ + uint64_t *r = (void*)rr; + int i; + + for (i = 0; i < 4; i++) { + r[i] = r[i+4]; + r[i+4] = 0; + } +} +#else +static void sp_256_mont_shift_8(sp_digit* r) { int i; for (i = 0; i < 8; i++) { - r[i] = a[i+8]; + r[i] = r[i+8]; r[i+8] = 0; } } +#endif /* Mul a by scalar b and add into r. (r += a * b) */ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) @@ -800,7 +884,7 @@ static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/ goto inc_next_word0; } } - sp_256_mont_shift_8(a, a); + sp_256_mont_shift_8(a); if (word16th != 0) sp_256_sub_8_p256_mod(a); sp_256_norm_8(a); @@ -820,7 +904,7 @@ static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/ goto inc_next_word; } } - sp_256_mont_shift_8(a, a); + sp_256_mont_shift_8(a); if (word16th != 0) sp_256_sub_8_p256_mod(a); sp_256_norm_8(a); From vda.linux at googlemail.com Sat Nov 27 11:03:43 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 12:03:43 +0100 Subject: [git commit] tls: tweak debug printout Message-ID: <20211127105932.4C80A88224@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=446d136109633c12d748d63e2034db238f77ef97 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Signed-off-by: Denys Vlasenko --- networking/tls.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/networking/tls.c b/networking/tls.c index 675ef4b3a..415952f16 100644 --- a/networking/tls.c +++ b/networking/tls.c @@ -1883,10 +1883,12 @@ static void process_server_key(tls_state_t *tls, int len) keybuf += 4; switch (t32) { case _0x03001d20: //curve_x25519 + dbg("got x25519 eccPubKey\n"); tls->flags |= GOT_EC_CURVE_X25519; memcpy(tls->hsd->ecc_pub_key32, keybuf, 32); break; case _0x03001741: //curve_secp256r1 (aka P256) + dbg("got P256 eccPubKey\n"); /* P256 point can be transmitted odd- or even-compressed * (first byte is 3 or 2) or uncompressed (4). 
*/ @@ -1899,7 +1901,6 @@ static void process_server_key(tls_state_t *tls, int len) } tls->flags |= GOT_EC_KEY; - dbg("got eccPubKey\n"); } static void send_empty_client_cert(tls_state_t *tls) From vda.linux at googlemail.com Sat Nov 27 14:06:57 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 15:06:57 +0100 Subject: [git commit] tls: P256: remove constant-time trick in sp_256_proj_point_add_8 Message-ID: <20211127152257.67EC98B52C@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=bbda85c74b7a53d8b2bb46f3b44d8f0932a6e95d branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_256_proj_point_add_8 576 544 -32 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 79 +++++++++++++++++++++++-------------------------- 1 file changed, 37 insertions(+), 42 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 29dd04293..3b0473036 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -1269,52 +1269,47 @@ static NOINLINE void sp_256_proj_point_add_8(sp_point* r, sp_point* p, sp_point* && (sp_256_cmp_equal_8(p->y, q->y) || sp_256_cmp_equal_8(p->y, t1)) ) { sp_256_proj_point_dbl_8(r, p); + return; } - else { - sp_point tp; - sp_point *v; - - v = r; - if (p->infinity | q->infinity) { - memset(&tp, 0, sizeof(tp)); - v = &tp; - } - *r = p->infinity ? *q : *p; /* struct copy */ - /* U1 = X1*Z2^2 */ - sp_256_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t1, t1, v->x /*, p256_mod, p256_mp_mod*/); - /* U2 = X2*Z1^2 */ - sp_256_mont_sqr_8(t2, v->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t4, t2, v->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); - /* S1 = Y1*Z2^3 */ - sp_256_mont_mul_8(t3, t3, v->y /*, p256_mod, p256_mp_mod*/); - /* S2 = Y2*Z1^3 */ - sp_256_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); - /* H = U2 - U1 */ - sp_256_mont_sub_8(t2, t2, t1 /*, p256_mod*/); - /* R = S2 - S1 */ - sp_256_mont_sub_8(t4, t4, t3 /*, p256_mod*/); - /* Z3 = H*Z1*Z2 */ - sp_256_mont_mul_8(v->z, v->z, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(v->z, v->z, t2 /*, p256_mod, p256_mp_mod*/); - /* X3 = R^2 - H^3 - 2*U1*H^2 */ - sp_256_mont_sqr_8(v->x, t4 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(v->y, t1, t5 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_sub_8(v->x, v->x, t5 /*, p256_mod*/); - sp_256_mont_dbl_8(t1, v->y /*, p256_mod*/); - sp_256_mont_sub_8(v->x, v->x, t1 /*, p256_mod*/); - /* Y3 = R*(U1*H^2 - X3) - S1*H^3 */ - sp_256_mont_sub_8(v->y, v->y, v->x /*, p256_mod*/); - sp_256_mont_mul_8(v->y, v->y, t4 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_sub_8(v->y, v->y, t5 /*, p256_mod*/); + if (p->infinity || q->infinity) { + *r = p->infinity ? 
*q : *p; /* struct copy */ + return; } + + /* U1 = X1*Z2^2 */ + sp_256_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t1, t1, r->x /*, p256_mod, p256_mp_mod*/); + /* U2 = X2*Z1^2 */ + sp_256_mont_sqr_8(t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t4, t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); + /* S1 = Y1*Z2^3 */ + sp_256_mont_mul_8(t3, t3, r->y /*, p256_mod, p256_mp_mod*/); + /* S2 = Y2*Z1^3 */ + sp_256_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); + /* H = U2 - U1 */ + sp_256_mont_sub_8(t2, t2, t1 /*, p256_mod*/); + /* R = S2 - S1 */ + sp_256_mont_sub_8(t4, t4, t3 /*, p256_mod*/); + /* Z3 = H*Z1*Z2 */ + sp_256_mont_mul_8(r->z, r->z, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->z, r->z, t2 /*, p256_mod, p256_mp_mod*/); + /* X3 = R^2 - H^3 - 2*U1*H^2 */ + sp_256_mont_sqr_8(r->x, t4 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->y, t1, t5 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sub_8(r->x, r->x, t5 /*, p256_mod*/); + sp_256_mont_dbl_8(t1, r->y /*, p256_mod*/); + sp_256_mont_sub_8(r->x, r->x, t1 /*, p256_mod*/); + /* Y3 = R*(U1*H^2 - X3) - S1*H^3 */ + sp_256_mont_sub_8(r->y, r->y, r->x /*, p256_mod*/); + sp_256_mont_mul_8(r->y, r->y, t4 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sub_8(r->y, r->y, t5 /*, p256_mod*/); } /* Multiply the point by the scalar and return the result. From vda.linux at googlemail.com Sat Nov 27 14:00:14 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 15:00:14 +0100 Subject: [git commit] tls: P256: do not open-code copying of struct variables Message-ID: <20211127152257.5EB0D8B528@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=26c85225229b0a439bcc66c8ee786d16f23be9ed branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_256_ecc_mulmod_8 536 534 -2 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index d09f7e881..29dd04293 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -1361,13 +1361,13 @@ static void sp_256_ecc_mulmod_8(sp_point* r, const sp_point* g, const sp_digit* dump_512("t[1].y %s\n", t[1].y); dump_512("t[1].z %s\n", t[1].z); dbg("t[2] = t[%d]\n", y); - memcpy(&t[2], &t[y], sizeof(sp_point)); + t[2] = t[y]; /* struct copy */ dbg("t[2] *= 2\n"); sp_256_proj_point_dbl_8(&t[2], &t[2]); dump_512("t[2].x %s\n", t[2].x); dump_512("t[2].y %s\n", t[2].y); dump_512("t[2].z %s\n", t[2].z); - memcpy(&t[y], &t[2], sizeof(sp_point)); + t[y] = t[2]; /* struct copy */ n <<= 1; c--; From vda.linux at googlemail.com Sat Nov 27 15:24:49 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 16:24:49 +0100 Subject: [git commit] tls: P256: fix sp_256_div2_8 - it wouldn't use a[] if low bit is 0 Message-ID: <20211127152257.88FB58B52C@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=dcfd8d3d1013ba989fa511f44bb0553a88c1ef10 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master It worked by chance because the only caller passed both parameters as two pointers to the same array. 
My fault (I made this error when converting from 26-bit code). Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 12 +++++++----- 1 file changed, 7 insertions(+), 5 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index baed62f41..b3f7888f5 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -636,12 +636,14 @@ static void sp_256_rshift1_8(sp_digit* r, sp_digit carry) } #endif -/* Divide the number by 2 mod the modulus (prime). (r = a / 2 % m) */ -static void sp_256_div2_8(sp_digit* r, const sp_digit* a, const sp_digit* m) +/* Divide the number by 2 mod the modulus (prime). (r = (r / 2) % m) */ +static void sp_256_div2_8(sp_digit* r /*, const sp_digit* m*/) { + const sp_digit* m = p256_mod; + int carry = 0; - if (a[0] & 1) - carry = sp_256_add_8(r, a, m); + if (r[0] & 1) + carry = sp_256_add_8(r, r, m); sp_256_norm_8(r); sp_256_rshift1_8(r, carry); } @@ -1125,7 +1127,7 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) /* T2 = Y * Y */ sp_256to512z_mont_sqr_8(t2, r->y /*, p256_mod, p256_mp_mod*/); /* T2 = T2/2 */ - sp_256_div2_8(t2, t2, p256_mod); + sp_256_div2_8(t2 /*, p256_mod*/); /* Y = Y * X */ sp_256to512z_mont_mul_8(r->y, r->y, r->x /*, p256_mod, p256_mp_mod*/); /* X = T1 * T1 */ From vda.linux at googlemail.com Sat Nov 27 14:50:40 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 15:50:40 +0100 Subject: [git commit] tls: P256: remove redundant zeroing in sp_256_map_8 Message-ID: <20211127152257.8031F8820C@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=8cbb70365f653397c8c2b9370214d5aed36ec9fa branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Previous change made it obvious that we zero out already-zeroed high bits function old new delta sp_256_ecc_mulmod_8 534 494 -40 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 2 -- 1 file changed, 2 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 74ded2cda..baed62f41 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -1062,7 +1062,6 @@ static void sp_256_map_8(sp_point* r, sp_point* p) /* x /= z^2 */ sp_256to512z_mont_mul_8(r->x, p->x, t2 /*, p256_mod, p256_mp_mod*/); - memset(r->x + 8, 0, sizeof(r->x) / 2); sp_512to256_mont_reduce_8(r->x /*, p256_mod, p256_mp_mod*/); /* Reduce x to less than modulus */ if (sp_256_cmp_8(r->x, p256_mod) >= 0) @@ -1071,7 +1070,6 @@ static void sp_256_map_8(sp_point* r, sp_point* p) /* y /= z^3 */ sp_256to512z_mont_mul_8(r->y, p->y, t1 /*, p256_mod, p256_mp_mod*/); - memset(r->y + 8, 0, sizeof(r->y) / 2); sp_512to256_mont_reduce_8(r->y /*, p256_mod, p256_mp_mod*/); /* Reduce y to less than modulus */ if (sp_256_cmp_8(r->y, p256_mod) >= 0) From vda.linux at googlemail.com Sat Nov 27 14:47:26 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 15:47:26 +0100 Subject: [git commit] tls: P256: explain which functions use double-wide arrays, no code changes Message-ID: <20211127152257.725D88B607@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=4415f7bc06f1ee382bcbaabd86c3d7aca0b46d93 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_512to256_mont_reduce_8 - 243 +243 sp_256to512z_mont_mul_8 - 150 +150 sp_256to512z_mont_sqr_8 - 7 +7 sp_256_mont_sqr_8 7 - -7 sp_256_mont_mul_8 150 - -150 sp_256_mont_reduce_8 243 - -243 ------------------------------------------------------------------------------ (add/remove: 3/3 grow/shrink: 0/0 
up/down: 400/-400) Total: 0 bytes Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 211 +++++++++++++----------------------------------- 1 file changed, 58 insertions(+), 153 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 3b0473036..74ded2cda 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -455,8 +455,10 @@ static void sp_256_sub_8_p256_mod(sp_digit* r) } #endif -/* Multiply a and b into r. (r = a * b) */ -static void sp_256_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) +/* Multiply a and b into r. (r = a * b) + * r should be [16] array (512 bits). + */ +static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) { #if ALLOW_ASM && defined(__GNUC__) && defined(__i386__) sp_digit rr[15]; /* in case r coincides with a or b */ @@ -704,9 +706,11 @@ static void sp_256_mont_tpl_8(sp_digit* r, const sp_digit* a /*, const sp_digit* } } -/* Shift the result in the high 256 bits down to the bottom. */ +/* Shift the result in the high 256 bits down to the bottom. + * High half is cleared to zeros. + */ #if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff -static void sp_256_mont_shift_8(sp_digit* rr) +static void sp_512to256_mont_shift_8(sp_digit* rr) { uint64_t *r = (void*)rr; int i; @@ -717,7 +721,7 @@ static void sp_256_mont_shift_8(sp_digit* rr) } } #else -static void sp_256_mont_shift_8(sp_digit* r) +static void sp_512to256_mont_shift_8(sp_digit* r) { int i; @@ -728,7 +732,10 @@ static void sp_256_mont_shift_8(sp_digit* r) } #endif -/* Mul a by scalar b and add into r. (r += a * b) */ +/* Mul a by scalar b and add into r. (r += a * b) + * a = p256_mod + * b = r[0] + */ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) { // const sp_digit* a = p256_mod; @@ -857,11 +864,11 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) /* Reduce the number back to 256 bits using Montgomery reduction. * - * a A single precision number to reduce in place. + * a Double-wide number to reduce in place. * m The single precision number representing the modulus. * mp The digit representing the negative inverse of m mod 2^n. */ -static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/) +static void sp_512to256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/) { // const sp_digit* m = p256_mod; sp_digit mp = p256_mp_mod; @@ -884,7 +891,7 @@ static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/ goto inc_next_word0; } } - sp_256_mont_shift_8(a); + sp_512to256_mont_shift_8(a); if (word16th != 0) sp_256_sub_8_p256_mod(a); sp_256_norm_8(a); @@ -892,7 +899,7 @@ static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/ else { /* Same code for explicit mp == 1 (which is always the case for P256) */ sp_digit word16th = 0; for (i = 0; i < 8; i++) { - /*mu = a[i];*/ +// mu = a[i]; if (sp_256_mul_add_8(a+i /*, m, mu*/)) { int j = i + 8; inc_next_word: @@ -904,148 +911,46 @@ static void sp_256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/ goto inc_next_word; } } - sp_256_mont_shift_8(a); + sp_512to256_mont_shift_8(a); if (word16th != 0) sp_256_sub_8_p256_mod(a); sp_256_norm_8(a); } } -#if 0 -//TODO: arm32 asm (also adapt for x86?) 
-static void sp_256_mont_reduce_8(sp_digit* a, sp_digit* m, sp_digit mp) -{ - sp_digit ca = 0; - - asm volatile ( - # i = 0 - mov r12, #0 - ldr r10, [%[a], #0] - ldr r14, [%[a], #4] -1: - # mu = a[i] * mp - mul r8, %[mp], r10 - # a[i+0] += m[0] * mu - ldr r7, [%[m], #0] - ldr r9, [%[a], #0] - umull r6, r7, r8, r7 - adds r10, r10, r6 - adc r5, r7, #0 - # a[i+1] += m[1] * mu - ldr r7, [%[m], #4] - ldr r9, [%[a], #4] - umull r6, r7, r8, r7 - adds r10, r14, r6 - adc r4, r7, #0 - adds r10, r10, r5 - adc r4, r4, #0 - # a[i+2] += m[2] * mu - ldr r7, [%[m], #8] - ldr r14, [%[a], #8] - umull r6, r7, r8, r7 - adds r14, r14, r6 - adc r5, r7, #0 - adds r14, r14, r4 - adc r5, r5, #0 - # a[i+3] += m[3] * mu - ldr r7, [%[m], #12] - ldr r9, [%[a], #12] - umull r6, r7, r8, r7 - adds r9, r9, r6 - adc r4, r7, #0 - adds r9, r9, r5 - str r9, [%[a], #12] - adc r4, r4, #0 - # a[i+4] += m[4] * mu - ldr r7, [%[m], #16] - ldr r9, [%[a], #16] - umull r6, r7, r8, r7 - adds r9, r9, r6 - adc r5, r7, #0 - adds r9, r9, r4 - str r9, [%[a], #16] - adc r5, r5, #0 - # a[i+5] += m[5] * mu - ldr r7, [%[m], #20] - ldr r9, [%[a], #20] - umull r6, r7, r8, r7 - adds r9, r9, r6 - adc r4, r7, #0 - adds r9, r9, r5 - str r9, [%[a], #20] - adc r4, r4, #0 - # a[i+6] += m[6] * mu - ldr r7, [%[m], #24] - ldr r9, [%[a], #24] - umull r6, r7, r8, r7 - adds r9, r9, r6 - adc r5, r7, #0 - adds r9, r9, r4 - str r9, [%[a], #24] - adc r5, r5, #0 - # a[i+7] += m[7] * mu - ldr r7, [%[m], #28] - ldr r9, [%[a], #28] - umull r6, r7, r8, r7 - adds r5, r5, r6 - adcs r7, r7, %[ca] - mov %[ca], #0 - adc %[ca], %[ca], %[ca] - adds r9, r9, r5 - str r9, [%[a], #28] - ldr r9, [%[a], #32] - adcs r9, r9, r7 - str r9, [%[a], #32] - adc %[ca], %[ca], #0 - # i += 1 - add %[a], %[a], #4 - add r12, r12, #4 - cmp r12, #32 - blt 1b - - str r10, [%[a], #0] - str r14, [%[a], #4] - : [ca] "+r" (ca), [a] "+r" (a) - : [m] "r" (m), [mp] "r" (mp) - : "memory", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r12", "r14" - ); - - memcpy(a, a + 8, 32); - if (ca) - a -= m; -} -#endif /* Multiply two Montogmery form numbers mod the modulus (prime). * (r = a * b mod m) * * r Result of multiplication. + * Should be [16] array (512 bits), but high half is cleared to zeros (used as scratch pad). * a First number to multiply in Montogmery form. * b Second number to multiply in Montogmery form. * m Modulus (prime). * mp Montogmery mulitplier. */ -static void sp_256_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b +static void sp_256to512z_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b /*, const sp_digit* m, sp_digit mp*/) { //const sp_digit* m = p256_mod; //sp_digit mp = p256_mp_mod; - sp_256_mul_8(r, a, b); - sp_256_mont_reduce_8(r /*, m, mp*/); + sp_256to512_mul_8(r, a, b); + sp_512to256_mont_reduce_8(r /*, m, mp*/); } /* Square the Montgomery form number. (r = a * a mod m) * * r Result of squaring. + * Should be [16] array (512 bits), but high half is cleared to zeros (used as scratch pad). * a Number to square in Montogmery form. * m Modulus (prime). * mp Montogmery mulitplier. 
*/ -static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a +static void sp_256to512z_mont_sqr_8(sp_digit* r, const sp_digit* a /*, const sp_digit* m, sp_digit mp*/) { //const sp_digit* m = p256_mod; //sp_digit mp = p256_mp_mod; - sp_256_mont_mul_8(r, a, a /*, m, mp*/); + sp_256to512z_mont_mul_8(r, a, a /*, m, mp*/); } /* Invert the number, in Montgomery form, modulo the modulus (prime) of the @@ -1068,15 +973,15 @@ static const uint32_t p256_mod_2[8] = { #endif static void sp_256_mont_inv_8(sp_digit* r, sp_digit* a) { - sp_digit t[2*8]; //can be just [8]? + sp_digit t[2*8]; int i; memcpy(t, a, sizeof(sp_digit) * 8); for (i = 254; i >= 0; i--) { - sp_256_mont_sqr_8(t, t /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t, t /*, p256_mod, p256_mp_mod*/); /*if (p256_mod_2[i / 32] & ((sp_digit)1 << (i % 32)))*/ if (i >= 224 || i == 192 || (i <= 95 && i != 1)) - sp_256_mont_mul_8(t, t, a /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t, t, a /*, p256_mod, p256_mp_mod*/); } memcpy(r, t, sizeof(sp_digit) * 8); } @@ -1152,22 +1057,22 @@ static void sp_256_map_8(sp_point* r, sp_point* p) sp_256_mont_inv_8(t1, p->z); - sp_256_mont_sqr_8(t2, t1 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t1, t2, t1 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t2, t1 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t1, t2, t1 /*, p256_mod, p256_mp_mod*/); /* x /= z^2 */ - sp_256_mont_mul_8(r->x, p->x, t2 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->x, p->x, t2 /*, p256_mod, p256_mp_mod*/); memset(r->x + 8, 0, sizeof(r->x) / 2); - sp_256_mont_reduce_8(r->x /*, p256_mod, p256_mp_mod*/); + sp_512to256_mont_reduce_8(r->x /*, p256_mod, p256_mp_mod*/); /* Reduce x to less than modulus */ if (sp_256_cmp_8(r->x, p256_mod) >= 0) sp_256_sub_8_p256_mod(r->x); sp_256_norm_8(r->x); /* y /= z^3 */ - sp_256_mont_mul_8(r->y, p->y, t1 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->y, p->y, t1 /*, p256_mod, p256_mp_mod*/); memset(r->y + 8, 0, sizeof(r->y) / 2); - sp_256_mont_reduce_8(r->y /*, p256_mod, p256_mp_mod*/); + sp_512to256_mont_reduce_8(r->y /*, p256_mod, p256_mp_mod*/); /* Reduce y to less than modulus */ if (sp_256_cmp_8(r->y, p256_mod) >= 0) sp_256_sub_8_p256_mod(r->y); @@ -1202,9 +1107,9 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) } /* T1 = Z * Z */ - sp_256_mont_sqr_8(t1, r->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t1, r->z /*, p256_mod, p256_mp_mod*/); /* Z = Y * Z */ - sp_256_mont_mul_8(r->z, r->y, r->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->z, r->y, r->z /*, p256_mod, p256_mp_mod*/); /* Z = 2Z */ sp_256_mont_dbl_8(r->z, r->z /*, p256_mod*/); /* T2 = X - T1 */ @@ -1212,21 +1117,21 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) /* T1 = X + T1 */ sp_256_mont_add_8(t1, r->x, t1 /*, p256_mod*/); /* T2 = T1 * T2 */ - sp_256_mont_mul_8(t2, t1, t2 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t2, t1, t2 /*, p256_mod, p256_mp_mod*/); /* T1 = 3T2 */ sp_256_mont_tpl_8(t1, t2 /*, p256_mod*/); /* Y = 2Y */ sp_256_mont_dbl_8(r->y, r->y /*, p256_mod*/); /* Y = Y * Y */ - sp_256_mont_sqr_8(r->y, r->y /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(r->y, r->y /*, p256_mod, p256_mp_mod*/); /* T2 = Y * Y */ - sp_256_mont_sqr_8(t2, r->y /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t2, r->y /*, p256_mod, p256_mp_mod*/); /* T2 = T2/2 */ sp_256_div2_8(t2, t2, p256_mod); /* Y = Y * X */ - sp_256_mont_mul_8(r->y, r->y, r->x /*, p256_mod, p256_mp_mod*/); + 
sp_256to512z_mont_mul_8(r->y, r->y, r->x /*, p256_mod, p256_mp_mod*/); /* X = T1 * T1 */ - sp_256_mont_mul_8(r->x, t1, t1 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->x, t1, t1 /*, p256_mod, p256_mp_mod*/); /* X = X - Y */ sp_256_mont_sub_8(r->x, r->x, r->y /*, p256_mod*/); /* X = X - Y */ @@ -1234,7 +1139,7 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) /* Y = Y - X */ sp_256_mont_sub_8(r->y, r->y, r->x /*, p256_mod*/); /* Y = Y * T1 */ - sp_256_mont_mul_8(r->y, r->y, t1 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->y, r->y, t1 /*, p256_mod, p256_mp_mod*/); /* Y = Y - T2 */ sp_256_mont_sub_8(r->y, r->y, t2 /*, p256_mod*/); dump_512("y2 %s\n", r->y); @@ -1279,36 +1184,36 @@ static NOINLINE void sp_256_proj_point_add_8(sp_point* r, sp_point* p, sp_point* } /* U1 = X1*Z2^2 */ - sp_256_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t1, t1, r->x /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t1, t1, r->x /*, p256_mod, p256_mp_mod*/); /* U2 = X2*Z1^2 */ - sp_256_mont_sqr_8(t2, r->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t4, t2, r->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t4, t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); /* S1 = Y1*Z2^3 */ - sp_256_mont_mul_8(t3, t3, r->y /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t3, t3, r->y /*, p256_mod, p256_mp_mod*/); /* S2 = Y2*Z1^3 */ - sp_256_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); /* H = U2 - U1 */ sp_256_mont_sub_8(t2, t2, t1 /*, p256_mod*/); /* R = S2 - S1 */ sp_256_mont_sub_8(t4, t4, t3 /*, p256_mod*/); /* Z3 = H*Z1*Z2 */ - sp_256_mont_mul_8(r->z, r->z, q->z /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(r->z, r->z, t2 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->z, r->z, q->z /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->z, r->z, t2 /*, p256_mod, p256_mp_mod*/); /* X3 = R^2 - H^3 - 2*U1*H^2 */ - sp_256_mont_sqr_8(r->x, t4 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(r->y, t1, t5 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(r->x, t4 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->y, t1, t5 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); sp_256_mont_sub_8(r->x, r->x, t5 /*, p256_mod*/); sp_256_mont_dbl_8(t1, r->y /*, p256_mod*/); sp_256_mont_sub_8(r->x, r->x, t1 /*, p256_mod*/); /* Y3 = R*(U1*H^2 - X3) - S1*H^3 */ sp_256_mont_sub_8(r->y, r->y, r->x /*, p256_mod*/); - sp_256_mont_mul_8(r->y, r->y, t4 /*, p256_mod, p256_mp_mod*/); - sp_256_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(r->y, r->y, t4 /*, p256_mod, p256_mp_mod*/); + sp_256to512z_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); sp_256_mont_sub_8(r->y, r->y, t5 /*, p256_mod*/); } From vda.linux at googlemail.com Sat Nov 27 17:42:27 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 
27 Nov 2021 18:42:27 +0100 Subject: [git commit] tls: P256: do not open-code copying of struct variables Message-ID: <20211127182803.A28D58D5F1@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=9c671fe3dd2e46a28c02d266130f56a1a6296791 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 4 +++- 1 file changed, 3 insertions(+), 1 deletion(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index b3f7888f5..3291b553c 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -865,6 +865,8 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) } /* Reduce the number back to 256 bits using Montgomery reduction. + * Note: the result is NOT guaranteed to be less than p256_mod! + * (it is only guaranteed to fit into 256 bits). * * a Double-wide number to reduce in place. * m The single precision number representing the modulus. @@ -1276,7 +1278,7 @@ static void sp_256_ecc_mulmod_8(sp_point* r, const sp_point* g, const sp_digit* if (map) sp_256_map_8(r, &t[0]); else - memcpy(r, &t[0], sizeof(sp_point)); + *r = t[0]; /* struct copy */ memset(t, 0, sizeof(t)); //paranoia } From vda.linux at googlemail.com Sat Nov 27 18:27:03 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 19:27:03 +0100 Subject: [git commit] tls: P256: change logic so that we don't need double-wide vectors everywhere Message-ID: <20211127182803.AB8928D5F8@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=f92ae1dc4bc00e352e683b826609efa5e1e22708 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Change sp_256to512z_mont_{mul,sqr}_8 to not require/zero upper 256 bits. There is only one place where we actually used that (and that's why there used to be zeroing memset of top half!). Fix up that place. As a bonus, 256x256->512 multiply no longer needs to care for "r overlaps a or b" case. This shrinks sp_point structure as well, not just temporaries. function old new delta sp_256to512z_mont_mul_8 150 - -150 sp_256_mont_mul_8 - 147 +147 sp_256to512z_mont_sqr_8 7 - -7 sp_256_mont_sqr_8 - 7 +7 sp_256_ecc_mulmod_8 494 543 +49 sp_512to256_mont_reduce_8 243 249 +6 sp_256_point_from_bin2x32 73 70 -3 sp_256_proj_point_dbl_8 353 345 -8 sp_256_proj_point_add_8 544 499 -45 ------------------------------------------------------------------------------ (add/remove: 2/2 grow/shrink: 2/3 up/down: 209/-213) Total: -4 bytes Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 178 ++++++++++++++++++++---------------------------- 1 file changed, 72 insertions(+), 106 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 3291b553c..3452b08b9 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -49,9 +49,9 @@ typedef int32_t signed_sp_digit; */ typedef struct sp_point { - sp_digit x[2 * 8]; - sp_digit y[2 * 8]; - sp_digit z[2 * 8]; + sp_digit x[8]; + sp_digit y[8]; + sp_digit z[8]; int infinity; } sp_point; @@ -456,12 +456,11 @@ static void sp_256_sub_8_p256_mod(sp_digit* r) #endif /* Multiply a and b into r. (r = a * b) - * r should be [16] array (512 bits). + * r should be [16] array (512 bits), and must not coincide with a or b. 
*/ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) { #if ALLOW_ASM && defined(__GNUC__) && defined(__i386__) - sp_digit rr[15]; /* in case r coincides with a or b */ int k; uint32_t accl; uint32_t acch; @@ -493,16 +492,15 @@ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) j--; i++; } while (i != 8 && i <= k); - rr[k] = accl; + r[k] = accl; accl = acch; acch = acc_hi; } r[15] = accl; - memcpy(r, rr, sizeof(rr)); #elif ALLOW_ASM && defined(__GNUC__) && defined(__x86_64__) const uint64_t* aa = (const void*)a; const uint64_t* bb = (const void*)b; - uint64_t rr[8]; + const uint64_t* rr = (const void*)r; int k; uint64_t accl; uint64_t acch; @@ -539,11 +537,8 @@ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) acch = acc_hi; } rr[7] = accl; - memcpy(r, rr, sizeof(rr)); #elif 0 //TODO: arm assembly (untested) - sp_digit tmp[16]; - asm volatile ( "\n mov r5, #0" "\n mov r6, #0" @@ -575,12 +570,10 @@ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) "\n cmp r5, #56" "\n ble 1b" "\n str r6, [%[r], r5]" - : [r] "r" (tmp), [a] "r" (a), [b] "r" (b) + : [r] "r" (r), [a] "r" (a), [b] "r" (b) : "memory", "r3", "r4", "r5", "r6", "r7", "r8", "r9", "r10", "r12", "r14" ); - memcpy(r, tmp, sizeof(tmp)); #else - sp_digit rr[15]; /* in case r coincides with a or b */ int i, j, k; uint64_t acc; @@ -600,11 +593,10 @@ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) j--; i++; } while (i != 8 && i <= k); - rr[k] = acc; + r[k] = acc; acc = (acc >> 32) | ((uint64_t)acc_hi << 32); } r[15] = acc; - memcpy(r, rr, sizeof(rr)); #endif } @@ -709,30 +701,11 @@ static void sp_256_mont_tpl_8(sp_digit* r, const sp_digit* a /*, const sp_digit* } /* Shift the result in the high 256 bits down to the bottom. - * High half is cleared to zeros. */ -#if BB_UNALIGNED_MEMACCESS_OK && ULONG_MAX > 0xffffffff -static void sp_512to256_mont_shift_8(sp_digit* rr) +static void sp_512to256_mont_shift_8(sp_digit* r, sp_digit* a) { - uint64_t *r = (void*)rr; - int i; - - for (i = 0; i < 4; i++) { - r[i] = r[i+4]; - r[i+4] = 0; - } + memcpy(r, a + 8, sizeof(*r) * 8); } -#else -static void sp_512to256_mont_shift_8(sp_digit* r) -{ - int i; - - for (i = 0; i < 8; i++) { - r[i] = r[i+8]; - r[i+8] = 0; - } -} -#endif /* Mul a by scalar b and add into r. (r += a * b) * a = p256_mod @@ -868,11 +841,12 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) * Note: the result is NOT guaranteed to be less than p256_mod! * (it is only guaranteed to fit into 256 bits). * - * a Double-wide number to reduce in place. + * r Result. + * a Double-wide number to reduce. Clobbered. * m The single precision number representing the modulus. * mp The digit representing the negative inverse of m mod 2^n. 
*/ -static void sp_512to256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit mp*/) +static void sp_512to256_mont_reduce_8(sp_digit* r, sp_digit* a/*, const sp_digit* m, sp_digit mp*/) { // const sp_digit* m = p256_mod; sp_digit mp = p256_mp_mod; @@ -895,10 +869,10 @@ static void sp_512to256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit goto inc_next_word0; } } - sp_512to256_mont_shift_8(a); + sp_512to256_mont_shift_8(r, a); if (word16th != 0) - sp_256_sub_8_p256_mod(a); - sp_256_norm_8(a); + sp_256_sub_8_p256_mod(r); + sp_256_norm_8(r); } else { /* Same code for explicit mp == 1 (which is always the case for P256) */ sp_digit word16th = 0; @@ -915,10 +889,10 @@ static void sp_512to256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit goto inc_next_word; } } - sp_512to256_mont_shift_8(a); + sp_512to256_mont_shift_8(r, a); if (word16th != 0) - sp_256_sub_8_p256_mod(a); - sp_256_norm_8(a); + sp_256_sub_8_p256_mod(r); + sp_256_norm_8(r); } } @@ -926,35 +900,34 @@ static void sp_512to256_mont_reduce_8(sp_digit* a/*, const sp_digit* m, sp_digit * (r = a * b mod m) * * r Result of multiplication. - * Should be [16] array (512 bits), but high half is cleared to zeros (used as scratch pad). * a First number to multiply in Montogmery form. * b Second number to multiply in Montogmery form. * m Modulus (prime). * mp Montogmery mulitplier. */ -static void sp_256to512z_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b +static void sp_256_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b /*, const sp_digit* m, sp_digit mp*/) { //const sp_digit* m = p256_mod; //sp_digit mp = p256_mp_mod; - sp_256to512_mul_8(r, a, b); - sp_512to256_mont_reduce_8(r /*, m, mp*/); + sp_digit t[2 * 8]; + sp_256to512_mul_8(t, a, b); + sp_512to256_mont_reduce_8(r, t /*, m, mp*/); } /* Square the Montgomery form number. (r = a * a mod m) * * r Result of squaring. - * Should be [16] array (512 bits), but high half is cleared to zeros (used as scratch pad). * a Number to square in Montogmery form. * m Modulus (prime). * mp Montogmery mulitplier. */ -static void sp_256to512z_mont_sqr_8(sp_digit* r, const sp_digit* a +static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a /*, const sp_digit* m, sp_digit mp*/) { //const sp_digit* m = p256_mod; //sp_digit mp = p256_mp_mod; - sp_256to512z_mont_mul_8(r, a, a /*, m, mp*/); + sp_256_mont_mul_8(r, a, a /*, m, mp*/); } /* Invert the number, in Montgomery form, modulo the modulus (prime) of the @@ -964,11 +937,8 @@ static void sp_256to512z_mont_sqr_8(sp_digit* r, const sp_digit* a * a Number to invert. */ #if 0 -/* Mod-2 for the P256 curve. 
*/ -static const uint32_t p256_mod_2[8] = { - 0xfffffffd,0xffffffff,0xffffffff,0x00000000, - 0x00000000,0x00000000,0x00000001,0xffffffff, -}; +//p256_mod - 2: +//ffffffff 00000001 00000000 00000000 00000000 ffffffff ffffffff ffffffff - 2 //Bit pattern: //2 2 2 2 2 2 2 1...1 //5 5 4 3 2 1 0 9...0 9...1 @@ -977,15 +947,15 @@ static const uint32_t p256_mod_2[8] = { #endif static void sp_256_mont_inv_8(sp_digit* r, sp_digit* a) { - sp_digit t[2*8]; + sp_digit t[8]; int i; memcpy(t, a, sizeof(sp_digit) * 8); for (i = 254; i >= 0; i--) { - sp_256to512z_mont_sqr_8(t, t /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t, t /*, p256_mod, p256_mp_mod*/); /*if (p256_mod_2[i / 32] & ((sp_digit)1 << (i % 32)))*/ if (i >= 224 || i == 192 || (i <= 95 && i != 1)) - sp_256to512z_mont_mul_8(t, t, a /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t, t, a /*, p256_mod, p256_mp_mod*/); } memcpy(r, t, sizeof(sp_digit) * 8); } @@ -1056,25 +1026,28 @@ static void sp_256_mod_mul_norm_8(sp_digit* r, const sp_digit* a) */ static void sp_256_map_8(sp_point* r, sp_point* p) { - sp_digit t1[2*8]; - sp_digit t2[2*8]; + sp_digit t1[8]; + sp_digit t2[8]; + sp_digit rr[2 * 8]; sp_256_mont_inv_8(t1, p->z); - sp_256to512z_mont_sqr_8(t2, t1 /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t1, t2, t1 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t2, t1 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t1, t2, t1 /*, p256_mod, p256_mp_mod*/); /* x /= z^2 */ - sp_256to512z_mont_mul_8(r->x, p->x, t2 /*, p256_mod, p256_mp_mod*/); - sp_512to256_mont_reduce_8(r->x /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(rr, p->x, t2 /*, p256_mod, p256_mp_mod*/); + memset(rr + 8, 0, sizeof(rr) / 2); + sp_512to256_mont_reduce_8(r->x, rr /*, p256_mod, p256_mp_mod*/); /* Reduce x to less than modulus */ if (sp_256_cmp_8(r->x, p256_mod) >= 0) sp_256_sub_8_p256_mod(r->x); sp_256_norm_8(r->x); /* y /= z^3 */ - sp_256to512z_mont_mul_8(r->y, p->y, t1 /*, p256_mod, p256_mp_mod*/); - sp_512to256_mont_reduce_8(r->y /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(rr, p->y, t1 /*, p256_mod, p256_mp_mod*/); + memset(rr + 8, 0, sizeof(rr) / 2); + sp_512to256_mont_reduce_8(r->y, rr /*, p256_mod, p256_mp_mod*/); /* Reduce y to less than modulus */ if (sp_256_cmp_8(r->y, p256_mod) >= 0) sp_256_sub_8_p256_mod(r->y); @@ -1091,8 +1064,8 @@ static void sp_256_map_8(sp_point* r, sp_point* p) */ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) { - sp_digit t1[2*8]; - sp_digit t2[2*8]; + sp_digit t1[8]; + sp_digit t2[8]; /* Put point to double into result */ if (r != p) @@ -1101,17 +1074,10 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) if (r->infinity) return; - if (SP_DEBUG) { - /* unused part of t2, may result in spurios - * differences in debug output. Clear it. 
- */ - memset(t2, 0, sizeof(t2)); - } - /* T1 = Z * Z */ - sp_256to512z_mont_sqr_8(t1, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t1, r->z /*, p256_mod, p256_mp_mod*/); /* Z = Y * Z */ - sp_256to512z_mont_mul_8(r->z, r->y, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->z, r->y, r->z /*, p256_mod, p256_mp_mod*/); /* Z = 2Z */ sp_256_mont_dbl_8(r->z, r->z /*, p256_mod*/); /* T2 = X - T1 */ @@ -1119,21 +1085,21 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) /* T1 = X + T1 */ sp_256_mont_add_8(t1, r->x, t1 /*, p256_mod*/); /* T2 = T1 * T2 */ - sp_256to512z_mont_mul_8(t2, t1, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t2, t1, t2 /*, p256_mod, p256_mp_mod*/); /* T1 = 3T2 */ sp_256_mont_tpl_8(t1, t2 /*, p256_mod*/); /* Y = 2Y */ sp_256_mont_dbl_8(r->y, r->y /*, p256_mod*/); /* Y = Y * Y */ - sp_256to512z_mont_sqr_8(r->y, r->y /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(r->y, r->y /*, p256_mod, p256_mp_mod*/); /* T2 = Y * Y */ - sp_256to512z_mont_sqr_8(t2, r->y /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t2, r->y /*, p256_mod, p256_mp_mod*/); /* T2 = T2/2 */ sp_256_div2_8(t2 /*, p256_mod*/); /* Y = Y * X */ - sp_256to512z_mont_mul_8(r->y, r->y, r->x /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->y, r->y, r->x /*, p256_mod, p256_mp_mod*/); /* X = T1 * T1 */ - sp_256to512z_mont_mul_8(r->x, t1, t1 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->x, t1, t1 /*, p256_mod, p256_mp_mod*/); /* X = X - Y */ sp_256_mont_sub_8(r->x, r->x, r->y /*, p256_mod*/); /* X = X - Y */ @@ -1141,7 +1107,7 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) /* Y = Y - X */ sp_256_mont_sub_8(r->y, r->y, r->x /*, p256_mod*/); /* Y = Y * T1 */ - sp_256to512z_mont_mul_8(r->y, r->y, t1 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->y, r->y, t1 /*, p256_mod, p256_mp_mod*/); /* Y = Y - T2 */ sp_256_mont_sub_8(r->y, r->y, t2 /*, p256_mod*/); dump_512("y2 %s\n", r->y); @@ -1155,11 +1121,11 @@ static void sp_256_proj_point_dbl_8(sp_point* r, sp_point* p) */ static NOINLINE void sp_256_proj_point_add_8(sp_point* r, sp_point* p, sp_point* q) { - sp_digit t1[2*8]; - sp_digit t2[2*8]; - sp_digit t3[2*8]; - sp_digit t4[2*8]; - sp_digit t5[2*8]; + sp_digit t1[8]; + sp_digit t2[8]; + sp_digit t3[8]; + sp_digit t4[8]; + sp_digit t5[8]; /* Ensure only the first point is the same as the result. 
*/ if (q == r) { @@ -1186,36 +1152,36 @@ static NOINLINE void sp_256_proj_point_add_8(sp_point* r, sp_point* p, sp_point* } /* U1 = X1*Z2^2 */ - sp_256to512z_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t1, t1, r->x /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t3, t1, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t1, t1, r->x /*, p256_mod, p256_mp_mod*/); /* U2 = X2*Z1^2 */ - sp_256to512z_mont_sqr_8(t2, r->z /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t4, t2, r->z /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t4, t2, r->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t2, t2, q->x /*, p256_mod, p256_mp_mod*/); /* S1 = Y1*Z2^3 */ - sp_256to512z_mont_mul_8(t3, t3, r->y /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t3, t3, r->y /*, p256_mod, p256_mp_mod*/); /* S2 = Y2*Z1^3 */ - sp_256to512z_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t4, t4, q->y /*, p256_mod, p256_mp_mod*/); /* H = U2 - U1 */ sp_256_mont_sub_8(t2, t2, t1 /*, p256_mod*/); /* R = S2 - S1 */ sp_256_mont_sub_8(t4, t4, t3 /*, p256_mod*/); /* Z3 = H*Z1*Z2 */ - sp_256to512z_mont_mul_8(r->z, r->z, q->z /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(r->z, r->z, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->z, r->z, q->z /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->z, r->z, t2 /*, p256_mod, p256_mp_mod*/); /* X3 = R^2 - H^3 - 2*U1*H^2 */ - sp_256to512z_mont_sqr_8(r->x, t4 /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(r->y, t1, t5 /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(r->x, t4 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(t5, t2 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->y, t1, t5 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t5, t5, t2 /*, p256_mod, p256_mp_mod*/); sp_256_mont_sub_8(r->x, r->x, t5 /*, p256_mod*/); sp_256_mont_dbl_8(t1, r->y /*, p256_mod*/); sp_256_mont_sub_8(r->x, r->x, t1 /*, p256_mod*/); /* Y3 = R*(U1*H^2 - X3) - S1*H^3 */ sp_256_mont_sub_8(r->y, r->y, r->x /*, p256_mod*/); - sp_256to512z_mont_mul_8(r->y, r->y, t4 /*, p256_mod, p256_mp_mod*/); - sp_256to512z_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r->y, r->y, t4 /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(t5, t5, t3 /*, p256_mod, p256_mp_mod*/); sp_256_mont_sub_8(r->y, r->y, t5 /*, p256_mod*/); } From vda.linux at googlemail.com Sat Nov 27 18:36:23 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sat, 27 Nov 2021 19:36:23 +0100 Subject: [git commit] tls: P256: trivial x86-64 fix Message-ID: <20211127183312.2BC338DB0C@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=0b13ab66f43fc1a9437361cfcd33b485422eb0ae branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 3452b08b9..4c8f08d4e 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -500,7 +500,7 @@ static void sp_256to512_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b) #elif ALLOW_ASM && 
defined(__GNUC__) && defined(__x86_64__) const uint64_t* aa = (const void*)a; const uint64_t* bb = (const void*)b; - const uint64_t* rr = (const void*)r; + uint64_t* rr = (void*)r; int k; uint64_t accl; uint64_t acch; From vda.linux at googlemail.com Sun Nov 28 01:56:02 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 02:56:02 +0100 Subject: [git commit] tls: P256: pad struct sp_point to 64 bits (on 64-bit arches) Message-ID: <20211128015203.3E0788E29B@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=1b93c7c4ecc47318905b6e6f801732b7dd31e0ee branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta curve_P256_compute_pubkey_and_premaster 198 190 -8 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 14 +++++++++----- 1 file changed, 9 insertions(+), 5 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 4c8f08d4e..37e1cfa1c 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -49,14 +49,19 @@ typedef int32_t signed_sp_digit; */ typedef struct sp_point { - sp_digit x[8]; + sp_digit x[8] +#if ULONG_MAX > 0xffffffff + /* Make sp_point[] arrays to not be 64-bit misaligned */ + ALIGNED(8) +#endif + ; sp_digit y[8]; sp_digit z[8]; int infinity; } sp_point; /* The modulus (prime) of the curve P256. */ -static const sp_digit p256_mod[8] = { +static const sp_digit p256_mod[8] ALIGNED(8) = { 0xffffffff,0xffffffff,0xffffffff,0x00000000, 0x00000000,0x00000000,0x00000001,0xffffffff, }; @@ -903,7 +908,7 @@ static void sp_512to256_mont_reduce_8(sp_digit* r, sp_digit* a/*, const sp_digit * a First number to multiply in Montogmery form. * b Second number to multiply in Montogmery form. * m Modulus (prime). - * mp Montogmery mulitplier. + * mp Montogmery multiplier. */ static void sp_256_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b /*, const sp_digit* m, sp_digit mp*/) @@ -920,7 +925,7 @@ static void sp_256_mont_mul_8(sp_digit* r, const sp_digit* a, const sp_digit* b * r Result of squaring. * a Number to square in Montogmery form. * m Modulus (prime). - * mp Montogmery mulitplier. + * mp Montogmery multiplier. */ static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a /*, const sp_digit* m, sp_digit mp*/) @@ -1145,7 +1150,6 @@ static NOINLINE void sp_256_proj_point_add_8(sp_point* r, sp_point* p, sp_point* return; } - if (p->infinity || q->infinity) { *r = p->infinity ? 
*q : *p; /* struct copy */ return; From rep.dot.nop at gmail.com Sun Nov 28 09:53:22 2021 From: rep.dot.nop at gmail.com (Bernhard Reutner-Fischer) Date: Sun, 28 Nov 2021 10:53:22 +0100 Subject: [git commit] libarchive: remove duplicate forward declaration Message-ID: <20211128095007.D8F8E8D36F@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=bfefa6ab6cf30507009cca7182c7302900fb5534 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Signed-off-by: Bernhard Reutner-Fischer --- include/bb_archive.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/bb_archive.h b/include/bb_archive.h index dc5e55f0a..e0ef8fc4e 100644 --- a/include/bb_archive.h +++ b/include/bb_archive.h @@ -195,7 +195,6 @@ char get_header_ar(archive_handle_t *archive_handle) FAST_FUNC; char get_header_cpio(archive_handle_t *archive_handle) FAST_FUNC; char get_header_tar(archive_handle_t *archive_handle) FAST_FUNC; char get_header_tar_gz(archive_handle_t *archive_handle) FAST_FUNC; -char get_header_tar_xz(archive_handle_t *archive_handle) FAST_FUNC; char get_header_tar_bz2(archive_handle_t *archive_handle) FAST_FUNC; char get_header_tar_lzma(archive_handle_t *archive_handle) FAST_FUNC; char get_header_tar_xz(archive_handle_t *archive_handle) FAST_FUNC; From vda.linux at googlemail.com Sun Nov 28 10:15:34 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 11:15:34 +0100 Subject: [git commit] tls: P256: simplify sp_256_mont_inv_8 (no need for a temporary) Message-ID: <20211128101115.1778F8D47D@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=cfb615781df5c7439fe0060a85e6b6a56d10dc7f branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_256_ecc_mulmod_8 543 517 -26 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 10 ++++------ 1 file changed, 4 insertions(+), 6 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 37e1cfa1c..9bd5c6832 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -938,7 +938,7 @@ static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a /* Invert the number, in Montgomery form, modulo the modulus (prime) of the * P256 curve. (r = 1 / a mod m) * - * r Inverse result. + * r Inverse result. Must not coincide with a. * a Number to invert. */ #if 0 @@ -952,17 +952,15 @@ static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a #endif static void sp_256_mont_inv_8(sp_digit* r, sp_digit* a) { - sp_digit t[8]; int i; - memcpy(t, a, sizeof(sp_digit) * 8); + memcpy(r, a, sizeof(sp_digit) * 8); for (i = 254; i >= 0; i--) { - sp_256_mont_sqr_8(t, t /*, p256_mod, p256_mp_mod*/); + sp_256_mont_sqr_8(r, r /*, p256_mod, p256_mp_mod*/); /*if (p256_mod_2[i / 32] & ((sp_digit)1 << (i % 32)))*/ if (i >= 224 || i == 192 || (i <= 95 && i != 1)) - sp_256_mont_mul_8(t, t, a /*, p256_mod, p256_mp_mod*/); + sp_256_mont_mul_8(r, r, a /*, p256_mod, p256_mp_mod*/); } - memcpy(r, t, sizeof(sp_digit) * 8); } /* Multiply a number by Montogmery normalizer mod modulus (prime). 
From vda.linux at googlemail.com Sun Nov 28 11:21:23 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 12:21:23 +0100 Subject: [git commit] libbb: code shrink in des encryption, in setup_salt() Message-ID: <20211128112225.5A6958F29C@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=00b5051cd25ef7e42ac62637ba16b70d3ac1014a branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta pw_encrypt 978 971 -7 .rodata 108208 108192 -16 des_crypt 1211 1181 -30 ------------------------------------------------------------------------------ (add/remove: 0/0 grow/shrink: 0/3 up/down: 0/-53) Total: -53 bytes Signed-off-by: Denys Vlasenko --- libbb/pw_encrypt_des.c | 29 ++++++++++++++--------------- testsuite/cryptpw.tests | 14 ++++++++++++++ 2 files changed, 28 insertions(+), 15 deletions(-) diff --git a/libbb/pw_encrypt_des.c b/libbb/pw_encrypt_des.c index dcd3521e2..fe8237cfe 100644 --- a/libbb/pw_encrypt_des.c +++ b/libbb/pw_encrypt_des.c @@ -363,7 +363,7 @@ des_init(struct des_ctx *ctx, const struct const_des_ctx *cctx) old_rawkey0 = old_rawkey1 = 0; old_salt = 0; #endif - saltbits = 0; + //saltbits = 0; /* not needed: we call setup_salt() before do_des() */ bits28 = bits32 + 4; bits24 = bits28 + 4; @@ -481,12 +481,11 @@ des_init(struct des_ctx *ctx, const struct const_des_ctx *cctx) return ctx; } - +/* Accepts 24-bit salt at max */ static void setup_salt(struct des_ctx *ctx, uint32_t salt) { - uint32_t obit, saltbit; - int i; + uint32_t invbits; #if USE_REPETITIVE_SPEEDUP if (salt == old_salt) @@ -494,15 +493,15 @@ setup_salt(struct des_ctx *ctx, uint32_t salt) old_salt = salt; #endif - saltbits = 0; - saltbit = 1; - obit = 0x800000; - for (i = 0; i < 24; i++) { - if (salt & saltbit) - saltbits |= obit; - saltbit <<= 1; - obit >>= 1; - } + invbits = 0; + + salt |= (1 << 24); + do { + invbits = (invbits << 1) + (salt & 1); + salt >>= 1; + } while (salt != 1); + + saltbits = invbits; } static void @@ -736,14 +735,14 @@ des_crypt(struct des_ctx *ctx, char output[DES_OUT_BUFSIZE], des_setkey(ctx, (char *)keybuf); /* - * salt_str - 2 bytes of salt + * salt_str - 2 chars of salt (converted to 12 bits) * key - up to 8 characters */ output[0] = salt_str[0]; output[1] = salt_str[1]; salt = (ascii_to_bin(salt_str[1]) << 6) | ascii_to_bin(salt_str[0]); - setup_salt(ctx, salt); + setup_salt(ctx, salt); /* set ctx->saltbits for do_des() */ /* Do it. 
*/ do_des(ctx, /*0, 0,*/ &r0, &r1, 25 /* count */); diff --git a/testsuite/cryptpw.tests b/testsuite/cryptpw.tests index 8ec476c9f..0dd91fe15 100755 --- a/testsuite/cryptpw.tests +++ b/testsuite/cryptpw.tests @@ -7,6 +7,20 @@ # testing "description" "command" "result" "infile" "stdin" +#optional USE_BB_CRYPT +testing "cryptpw des 12" \ + "cryptpw -m des QWErty '123456789012345678901234567890'" \ + '12MnB3PqfVbMA\n' "" "" + +testing "cryptpw des 55" \ + "cryptpw -m des QWErty 55" \ + '55tgFLtkT1Y72\n' "" "" + +testing "cryptpw des zz" \ + "cryptpw -m des QWErty zz" \ + 'zzIZaaXWOkxVk\n' "" "" +#SKIP= + optional USE_BB_CRYPT_SHA testing "cryptpw sha256" \ "cryptpw -m sha256 QWErty '123456789012345678901234567890'" \ From vda.linux at googlemail.com Sun Nov 28 11:55:20 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 12:55:20 +0100 Subject: [git commit] tls: P256: add comment on logic in sp_512to256_mont_reduce_8, no code changes Message-ID: <20211128115118.298808F2A3@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=832626227ea3798403159080532f763a37273a91 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 33 +++++++++++++++++++++++---------- 1 file changed, 23 insertions(+), 10 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index 9bd5c6832..eb6cc2431 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -850,6 +850,20 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) * a Double-wide number to reduce. Clobbered. * m The single precision number representing the modulus. * mp The digit representing the negative inverse of m mod 2^n. + * + * Montgomery reduction on multiprecision integers: + * Montgomery reduction requires products modulo R. + * When R is a power of B [in our case R=2^128, B=2^32], there is a variant + * of Montgomery reduction which requires products only of machine word sized + * integers. T is stored as an little-endian word array a[0..n]. The algorithm + * reduces it one word at a time. First an appropriate multiple of modulus + * is added to make T divisible by B. [In our case, it is p256_mp_mod * a[0].] + * Then a multiple of modulus is added to make T divisible by B^2. + * [In our case, it is (p256_mp_mod * a[1]) << 32.] + * And so on. Eventually T is divisible by R, and after division by R + * the algorithm is in the same place as the usual Montgomery reduction was. + * + * TODO: Can conditionally use 64-bit (if bit-little-endian arch) logic? */ static void sp_512to256_mont_reduce_8(sp_digit* r, sp_digit* a/*, const sp_digit* m, sp_digit mp*/) { @@ -941,15 +955,6 @@ static void sp_256_mont_sqr_8(sp_digit* r, const sp_digit* a * r Inverse result. Must not coincide with a. * a Number to invert. 
*/ -#if 0 -//p256_mod - 2: -//ffffffff 00000001 00000000 00000000 00000000 ffffffff ffffffff ffffffff - 2 -//Bit pattern: -//2 2 2 2 2 2 2 1...1 -//5 5 4 3 2 1 0 9...0 9...1 -//543210987654321098765432109876543210987654321098765432109876543210...09876543210...09876543210 -//111111111111111111111111111111110000000000000000000000000000000100...00000111111...11111111101 -#endif static void sp_256_mont_inv_8(sp_digit* r, sp_digit* a) { int i; @@ -957,7 +962,15 @@ static void sp_256_mont_inv_8(sp_digit* r, sp_digit* a) memcpy(r, a, sizeof(sp_digit) * 8); for (i = 254; i >= 0; i--) { sp_256_mont_sqr_8(r, r /*, p256_mod, p256_mp_mod*/); - /*if (p256_mod_2[i / 32] & ((sp_digit)1 << (i % 32)))*/ +/* p256_mod - 2: + * ffffffff 00000001 00000000 00000000 00000000 ffffffff ffffffff ffffffff - 2 + * Bit pattern: + * 2 2 2 2 2 2 2 1...1 + * 5 5 4 3 2 1 0 9...0 9...1 + * 543210987654321098765432109876543210987654321098765432109876543210...09876543210...09876543210 + * 111111111111111111111111111111110000000000000000000000000000000100...00000111111...11111111101 + */ + /*if (p256_mod_minus_2[i / 32] & ((sp_digit)1 << (i % 32)))*/ if (i >= 224 || i == 192 || (i <= 95 && i != 1)) sp_256_mont_mul_8(r, r, a /*, p256_mod, p256_mp_mod*/); } From vda.linux at googlemail.com Sun Nov 28 14:44:08 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 15:44:08 +0100 Subject: [git commit] tls: P256: add 64-bit montgomery reduce (disabled), small optimization in 32-bit code Message-ID: <20211128144102.094C390981@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=90b0d3304455ad432c49f38e0419ac7820a625f7 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master function old new delta sp_512to256_mont_reduce_8 191 185 -6 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 177 +++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 159 insertions(+), 18 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index eb6cc2431..b1c410037 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -705,36 +705,174 @@ static void sp_256_mont_tpl_8(sp_digit* r, const sp_digit* a /*, const sp_digit* } } -/* Shift the result in the high 256 bits down to the bottom. - */ +/* Shift the result in the high 256 bits down to the bottom. */ static void sp_512to256_mont_shift_8(sp_digit* r, sp_digit* a) { memcpy(r, a + 8, sizeof(*r) * 8); } +// Disabled for now. Seems to work, but ugly and 40 bytes larger on x86-64. +#if 0 //UNALIGNED_LE_64BIT +/* 64-bit little-endian optimized version. + * See generic 32-bit version below for explanation. + * The benefit of this version is: even though r[3] calculation is atrocious, + * we call sp_256_mul_add_4() four times, not 8. 
+ */ +static int sp_256_mul_add_4(uint64_t *r /*, const uint64_t* a, uint64_t b*/) +{ + uint64_t b = r[0]; + +# if 0 + const uint64_t* a = (const void*)p256_mod; +//a[3..0] = ffffffff00000001 0000000000000000 00000000ffffffff ffffffffffffffff + uint128_t t; + int i; + t = 0; + for (i = 0; i < 4; i++) { + uint32_t t_hi; + uint128_t m = ((uint128_t)b * a[i]) + r[i]; + t += m; + t_hi = (t < m); + r[i] = (uint64_t)t; + t = (t >> 64) | ((uint128_t)t_hi << 64); + } + r[4] += (uint64_t)t; + return (r[4] < (uint64_t)t); /* 1 if addition overflowed */ +# else + // Unroll, then optimize the above loop: + //uint32_t t_hi; + //uint128_t m; + uint64_t t64, t64u; + + //m = ((uint128_t)b * a[0]) + r[0]; + // Since b is r[0] and a[0] is ffffffffffffffff, the above optimizes to: + // m = r[0] * ffffffffffffffff + r[0] = (r[0] << 64 - r[0]) + r[0] = r[0] << 64; + //t += m; + // t = r[0] << 64 = b << 64; + //t_hi = (t < m); + // t_hi = 0; + //r[0] = (uint64_t)t; +// r[0] = 0; +//the store can be eliminated since caller won't look at lower 256 bits of the result + //t = (t >> 64) | ((uint128_t)t_hi << 64); + // t = b; + + //m = ((uint128_t)b * a[1]) + r[1]; + // Since a[1] is 00000000ffffffff, the above optimizes to: + // m = b * ffffffff + r[1] = (b * 100000000 - b) + r[1] = (b << 32) - b + r[1]; + //t += m; + // t = b + (b << 32) - b + r[1] = (b << 32) + r[1]; + //t_hi = (t < m); + // t_hi = 0; + //r[1] = (uint64_t)t; + r[1] += (b << 32); + //t = (t >> 64) | ((uint128_t)t_hi << 64); + t64 = (r[1] < (b << 32)); + t64 += (b >> 32); + + //m = ((uint128_t)b * a[2]) + r[2]; + // Since a[2] is 0000000000000000, the above optimizes to: + // m = b * 0 + r[2] = r[2]; + //t += m; + // t = t64 + r[2]; + //t_hi = (t < m); + // t_hi = 0; + //r[2] = (uint64_t)t; + r[2] += t64; + //t = (t >> 64) | ((uint128_t)t_hi << 64); + t64 = (r[2] < t64); + + //m = ((uint128_t)b * a[3]) + r[3]; + // Since a[3] is ffffffff00000001, the above optimizes to: + // m = b * ffffffff00000001 + r[3]; + // m = b + b*ffffffff00000000 + r[3] + // m = b + (b*ffffffff << 32) + r[3] + // m = b + (((b<<32) - b) << 32) + r[3] + //t += m; + // t = t64 + (uint128_t)b + ((((uint128_t)b << 32) - b) << 32) + r[3]; + t64 += b; + t64u = (t64 < b); + t64 += r[3]; + t64u += (t64 < r[3]); + { + uint64_t lo,hi; + //lo = (((b << 32) - b) << 32 + //hi = (((uint128_t)b << 32) - b) >> 32 + //but without uint128_t: + hi = (b << 32) - b; /* form lower 32 bits of "hi" part 1 */ + b = (b >> 32) - (/*borrowed above?*/(b << 32) < b); /* upper 32 bits of "hi" are in b */ + lo = hi << 32; /* (use "hi" value to calculate "lo",... */ + t64 += lo; /* ...consume... */ + t64u += (t64 < lo); /* ..."lo") */ + hi >>= 32; /* form lower 32 bits of "hi" part 2 */ + hi |= (b << 32); /* combine lower and upper */ + t64u += hi; /* consume "hi" */ + } + //t_hi = (t < m); + // t_hi = 0; + //r[3] = (uint64_t)t; + r[3] = t64; + //t = (t >> 64) | ((uint128_t)t_hi << 64); + // t = t64u; + + r[4] += t64u; + return (r[4] < t64u); /* 1 if addition overflowed */ +# endif +} + +static void sp_512to256_mont_reduce_8(sp_digit* r, sp_digit* aa/*, const sp_digit* m, sp_digit mp*/) +{ +// const sp_digit* m = p256_mod; + int i; + uint64_t *a = (void*)aa; + + sp_digit carry = 0; + for (i = 0; i < 4; i++) { +// mu = a[i]; + if (sp_256_mul_add_4(a+i /*, m, mu*/)) { + int j = i + 4; + inc_next_word: + if (++j > 7) { /* a[8] array has no more words? */ + carry++; + continue; + } + if (++a[j] == 0) /* did this overflow too? 
*/ + goto inc_next_word; + } + } + sp_512to256_mont_shift_8(r, aa); + if (carry != 0) + sp_256_sub_8_p256_mod(r); + sp_256_norm_8(r); +} + +#else /* Generic 32-bit version */ + /* Mul a by scalar b and add into r. (r += a * b) * a = p256_mod * b = r[0] */ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) { -// const sp_digit* a = p256_mod; -//a[7..0] = ffffffff 00000001 00000000 00000000 00000000 ffffffff ffffffff ffffffff sp_digit b = r[0]; - uint64_t t; -// t = 0; -// for (i = 0; i < 8; i++) { -// uint32_t t_hi; -// uint64_t m = ((uint64_t)b * a[i]) + r[i]; -// t += m; -// t_hi = (t < m); -// r[i] = (sp_digit)t; -// t = (t >> 32) | ((uint64_t)t_hi << 32); -// } -// r[8] += (sp_digit)t; - +# if 0 + const sp_digit* a = p256_mod; +//a[7..0] = ffffffff 00000001 00000000 00000000 00000000 ffffffff ffffffff ffffffff + int i; + t = 0; + for (i = 0; i < 8; i++) { + uint32_t t_hi; + uint64_t m = ((uint64_t)b * a[i]) + r[i]; + t += m; + t_hi = (t < m); + r[i] = (sp_digit)t; + t = (t >> 32) | ((uint64_t)t_hi << 32); + } + r[8] += (sp_digit)t; + return (r[8] < (sp_digit)t); /* 1 if addition overflowed */ +# else // Unroll, then optimize the above loop: //uint32_t t_hi; uint64_t m; @@ -748,7 +886,8 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) //t_hi = (t < m); // t_hi = 0; //r[0] = (sp_digit)t; - r[0] = 0; +// r[0] = 0; +//the store can be eliminated since caller won't look at lower 256 bits of the result //t = (t >> 32) | ((uint64_t)t_hi << 32); // t = b; @@ -840,6 +979,7 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) r[8] += (sp_digit)t; return (r[8] < (sp_digit)t); /* 1 if addition overflowed */ +# endif } /* Reduce the number back to 256 bits using Montgomery reduction. @@ -861,7 +1001,7 @@ static int sp_256_mul_add_8(sp_digit* r /*, const sp_digit* a, sp_digit b*/) * Then a multiple of modulus is added to make T divisible by B^2. * [In our case, it is (p256_mp_mod * a[1]) << 32.] * And so on. Eventually T is divisible by R, and after division by R - * the algorithm is in the same place as the usual Montgomery reduction was. + * the algorithm is in the same place as the usual Montgomery reduction. * * TODO: Can conditionally use 64-bit (if bit-little-endian arch) logic? */ @@ -914,6 +1054,7 @@ static void sp_512to256_mont_reduce_8(sp_digit* r, sp_digit* a/*, const sp_digit sp_256_norm_8(r); } } +#endif /* Multiply two Montogmery form numbers mod the modulus (prime). * (r = a * b mod m) From vda.linux at googlemail.com Sun Nov 28 20:43:51 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Sun, 28 Nov 2021 21:43:51 +0100 Subject: [git commit] tls: P256: enable 64-bit version of montgomery reduction Message-ID: <20211128204212.CC9C390AC1@busybox.osuosl.org> commit: https://git.busybox.net/busybox/commit/?id=8514b4166d7a9d7720006d852ae67f43baed8ef1 branch: https://git.busybox.net/busybox/commit/?id=refs/heads/master After more testing, (1) I'm more sure it is indeed correct, and (2) it is a significant speedup - we do a lot of those multiplications. 
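[Editor's note, not part of the original mail: for readers following along, the word-by-word reduction these commits keep specializing can be summarized by a small generic sketch. The version below is illustrative only: it uses the textbook form with an explicit mp multiplier and 8x32-bit limbs, whereas the busybox code hardwires p256_mod (where mp == 1 and most modulus words are 0 or 0xffffffff) and, after this commit, also has a 4x64-bit variant. The function name, array layout and return convention here are made up for the example.]

#include <stdint.h>
#include <string.h>

/* Illustrative sketch, not busybox code: reduce a 512-bit value t[0..15]
 * modulo a 256-bit modulus m[0..7], where mp == -m^-1 mod 2^32.
 * Returns 1 if the caller still has to subtract m once more. */
static int mont_reduce_sketch(uint32_t t[16], const uint32_t m[8], uint32_t mp)
{
	uint32_t carry = 0;
	int i, j;

	for (i = 0; i < 8; i++) {
		/* multiple of m that makes t[i] divisible by 2^32 */
		uint32_t mu = t[i] * mp;
		uint64_t acc = 0;

		for (j = 0; j < 8; j++) {
			/* t += (mu * m) << (32*i), one word at a time */
			acc += (uint64_t)mu * m[j] + t[i + j];
			t[i + j] = (uint32_t)acc;
			acc >>= 32;
		}
		/* fold the leftover carry into the next word up; the carry
		 * produced here belongs one word higher still, so defer it
		 * to the next iteration (busybox instead propagates it at
		 * once via its inc_next_word loop) */
		acc += (uint64_t)t[i + 8] + carry;
		t[i + 8] = (uint32_t)acc;
		carry = (uint32_t)(acc >> 32);
	}
	/* t is now divisible by 2^256: the result is its high half */
	memcpy(t, t + 8, 8 * sizeof(t[0]));
	return carry != 0;
}

[With mp fixed to 1 and a modulus whose words are mostly 0 or 0xffffffff, nearly all of the multiplications above collapse into shifts and additions, which is what the unrolled sp_256_mul_add_8 (32-bit) and sp_256_mul_add_4 (64-bit) routines in the patches exploit.]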
function old new delta sp_512to256_mont_reduce_8 191 223 +32 Signed-off-by: Denys Vlasenko --- networking/tls_sp_c32.c | 15 ++++++++------- 1 file changed, 8 insertions(+), 7 deletions(-) diff --git a/networking/tls_sp_c32.c b/networking/tls_sp_c32.c index b1c410037..cb166e413 100644 --- a/networking/tls_sp_c32.c +++ b/networking/tls_sp_c32.c @@ -711,12 +711,13 @@ static void sp_512to256_mont_shift_8(sp_digit* r, sp_digit* a) memcpy(r, a + 8, sizeof(*r) * 8); } -// Disabled for now. Seems to work, but ugly and 40 bytes larger on x86-64. -#if 0 //UNALIGNED_LE_64BIT +#if UNALIGNED_LE_64BIT /* 64-bit little-endian optimized version. * See generic 32-bit version below for explanation. * The benefit of this version is: even though r[3] calculation is atrocious, * we call sp_256_mul_add_4() four times, not 8. + * Measured run time improvement of curve_P256_compute_pubkey_and_premaster() + * call on x86-64: from ~1500us to ~900us. Code size +32 bytes. */ static int sp_256_mul_add_4(uint64_t *r /*, const uint64_t* a, uint64_t b*/) { @@ -794,18 +795,18 @@ static int sp_256_mul_add_4(uint64_t *r /*, const uint64_t* a, uint64_t b*/) t64u = (t64 < b); t64 += r[3]; t64u += (t64 < r[3]); - { - uint64_t lo,hi; + { // add ((((uint128_t)b << 32) - b) << 32): + uint64_t lo, hi; //lo = (((b << 32) - b) << 32 //hi = (((uint128_t)b << 32) - b) >> 32 //but without uint128_t: - hi = (b << 32) - b; /* form lower 32 bits of "hi" part 1 */ + hi = (b << 32) - b; /* make lower 32 bits of "hi", part 1 */ b = (b >> 32) - (/*borrowed above?*/(b << 32) < b); /* upper 32 bits of "hi" are in b */ lo = hi << 32; /* (use "hi" value to calculate "lo",... */ t64 += lo; /* ...consume... */ t64u += (t64 < lo); /* ..."lo") */ - hi >>= 32; /* form lower 32 bits of "hi" part 2 */ - hi |= (b << 32); /* combine lower and upper */ + hi >>= 32; /* make lower 32 bits of "hi", part 2 */ + hi |= (b << 32); /* combine lower and upper 32 bits */ t64u += hi; /* consume "hi" */ } //t_hi = (t < m); From bugzilla at busybox.net Mon Nov 29 08:18:13 2021 From: bugzilla at busybox.net (bugzilla at busybox.net) Date: Mon, 29 Nov 2021 08:18:13 +0000 Subject: [Bug 14401] New: The unzip Binary is from older version which has security vulnerabilities. Message-ID: https://bugs.busybox.net/show_bug.cgi?id=14401 Bug ID: 14401 Summary: The unzip Binary is from older version which has security vulnerabilities. Product: Busybox Version: 1.33.x Hardware: All OS: All Status: NEW Severity: major Priority: P5 Component: Other Assignee: unassigned at busybox.net Reporter: sandep121 at gmail.com CC: busybox-cvs at busybox.net Target Milestone: --- Current unzip binary has following security issues: https://nvd.nist.gov/vuln/detail/CVE-2005-0602 https://nvd.nist.gov/vuln/detail/CVE-2001-1268 https://nvd.nist.gov/vuln/detail/CVE-2001-1269 -- You are receiving this mail because: You are on the CC list for the bug. From vda.linux at googlemail.com Tue Nov 30 22:41:13 2021 From: vda.linux at googlemail.com (Denys Vlasenko) Date: Tue, 30 Nov 2021 23:41:13 +0100 Subject: [git commit] Announce 1.33.2 Message-ID: <20211130223754.BABBD92198@busybox.osuosl.org> commit: https://git.busybox.net/busybox-website/commit/?id=76db5f4e96656f869defbbddf2d572a87daf3ddd branch: https://git.busybox.net/busybox-website/commit/?id=refs/heads/master Signed-off-by: Denys Vlasenko --- news.html | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/news.html b/news.html index bdffc39..fc1f8fb 100644 --- a/news.html +++ b/news.html @@ -34,6 +34,16 @@

+
  • 30 November 2021 -- BusyBox 1.33.2 (stable)

    BusyBox 1.33.2. (git)

    Bug fix release. 1.33.2 has fixes for hush and ash (parsing fixes) and
    unlzma (fix a case where we could read before beginning of buffer).

  • 30 September 2021 -- BusyBox 1.34.1 (stable)

    BusyBox 1.34.1. (git)