[PATCH] Add support for zstd decompression
Norbert Lange
nolange79 at gmail.com
Sun Sep 12 21:12:13 UTC 2021
On Sun, 12 Sep 2021 at 09:00, Jeff Pohlmeyer
<yetanothergeek at gmail.com> wrote:
>
> On Fri, Sep 10, 2021 at 7:52 PM Denys Vlasenko <vda.linux at googlemail.com> wrote:
> > I'm getting this:
>
> > (add/remove: 96/0 grow/shrink: 6/2 up/down: 24743/-98) Total: 24645 bytes
I can bring this down a bit by declaring all functions static;
inlining and constant propagation do the rest.
Using git/busybox as source for busybox (x86_64, gcc 10)
GEN /tmp/build/Makefile
function old new delta
unpack_zstd_stream - 5070 +5070
static.HUF_readDTableX1_wksp_bmi2 - 1755 +1755
static.ZSTD_decompressBlock_internal - 1468 +1468
static.ZSTD_decompressSequences_body - 1429 +1429
ZSTD_decompressContinue - 1062 +1062
HUF_decompress4X1_usingDTable_internal_body - 883 +883
FSE_readNCount_body - 622 +622
ML_defaultDTable - 520 +520
LL_defaultDTable - 520 +520
static.ZSTD_buildFSETable_body - 518 +518
XXH64_digest - 494 +494
static.FSE_decompress_usingDTable_generic - 470 +470
ZSTD_getFrameHeader_advanced - 423 +423
static.XXH64_update_endian - 416 +416
ZSTD_decompressBegin_usingDDict - 391 +391
static.ZSTD_buildSeqTable - 375 +375
ZSTD_execSequenceEnd - 300 +300
OF_defaultDTable - 264 +264
ZSTD_decodeFrameHeader - 259 +259
BIT_initDStream - 258 +258
ZSTD_DCtx_selectFrameDDict - 234 +234
ZSTD_safecopy - 225 +225
ML_bits - 212 +212
ML_base - 212 +212
.rodata 98830 99029 +199
ZSTD_decompressContinueStream - 177 +177
LL_bits - 144 +144
LL_base - 144 +144
OF_bits - 128 +128
OF_base - 128 +128
unzstd_main - 126 +126
static.HUF_decodeStreamX1 - 117 +117
BIT_reloadDStream - 114 +114
ZSTD_overlapCopy8 - 107 +107
ZSTD_clearDict - 105 +105
ZSTD_frameHeaderSize_internal - 103 +103
HUF_decompress1X1_usingDTable_internal_body - 102 +102
ZSTD_wildcopy - 94 +94
static.unzstd_longopts - 81 +81
packed_usage 34120 34198 +78
ZSTD_getcBlockSize - 78 +78
tar_main 1290 1360 +70
FSE_decodeSymbolFast - 58 +58
BIT_reloadDStreamFast - 50 +50
setup_transformer_on_fd 155 204 +49
FSE_decodeSymbol - 44 +44
HUF_decodeSymbolX1 - 39 +39
BIT_readBits - 38 +38
ZSTD_initFseState - 34 +34
static.dec64table - 32 +32
static.dec32table - 32 +32
ZSTD_fcs_fieldSize - 32 +32
ZSTD_did_fieldSize - 32 +32
static.ZSTD_customFree - 27 +27
applet_main 3192 3216 +24
BIT_endOfDStream - 22 +22
applet_names 2747 2767 +20
repStartValue - 12 +12
tar_longopts 314 321 +7
static.CSWTCH - 6 +6
applet_suid 100 101 +1
applet_install_loc 200 201 +1
------------------------------------------------------------------------------
(add/remove: 54/0 grow/shrink: 9/0 up/down: 21035/0) Total: 21035 bytes
   text    data     bss     dec     hex filename
 999282   16443    1856 1017581   f86ed busybox_old
1020376   16467    1856 1038699   fd96b busybox_unstripped
>
> > I suspect Facebook et al do not share busybox's zeal about smaller size.
Speed is one of zstd's main selling points, so that's a bit
beside the point ;)
Ideally we could define some macros to get there.
I believe the simplest explanation is that no one cared enough
to cleanly separate every option.
>
> I found this comment on github[1]:
> "There is no new magic number planned in the foreseeable future.
> 0xFD2FB528 is intended to be the only magic number for zstd frames."
>
> Do you think that implies that at least the basic file format is
> probably stable?
The format is documented and even published as RFC 8878.
Digging through the code, I already found some spots that add code to ensure
no data is produced that old (reference) implementations can't decode
(i.e. workarounds for bugs),
so going with the reference implementation should be rather safe.
Still, I think that being able to track upstream is the best path.
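As an aside on the magic number quoted above: RFC 8878 stores it as a
4-byte little-endian value, so a zstd frame starts with the bytes
28 b5 2f fd. A minimal shell sketch of that check follows. The function
name is_zstd_frame is made up for illustration (it is not part of busybox
or zstd), and it only looks at the first four bytes, so skippable frames
are not recognized.

```shell
#!/bin/sh
# Hypothetical helper: succeed if the file starts with the zstd frame
# magic number 0xFD2FB528 (stored little-endian: 28 b5 2f fd).
is_zstd_frame() {
    magic=$(head -c 4 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "28b52ffd" ]
}

# Demo on a throwaway file that holds only the four magic bytes.
tmp=$(mktemp)
printf '\050\265\057\375' > "$tmp"
if is_zstd_frame "$tmp"; then
    echo "looks like a zstd frame"
else
    echo "not a zstd frame"
fi
rm -f "$tmp"
```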
I did my own patch (some time ago, it just took time to clean it up).
As far as I can see, it has some bits that are missing in Jeff's patch;
the unzstd applet is a bit more full-featured and behaves like the reference.
The concept for the upstream sources would be to use tools/scripts
for most changes (documented in README.source as well).
Extending that to, say, cut out comments or functions that aren't used
(anything related to compression/dictionaries) should result
in something that makes upstream syncs simpler and drops about 2/3 of the lines.
$zstd_path/contrib/freestanding_lib/freestanding.py \
    --source-lib $zstd_path/lib \
    --output-lib zstd \
    -DZSTD_NO_INTRINSICS \
    -DZSTD_NO_UNUSED_FUNCTIONS \
    -DZSTD_LEGACY_SUPPORT=0 \
    -DZSTD_STATIC_LINKING_ONLY \
    -DFSE_STATIC_LINKING_ONLY \
    -DHUF_STATIC_LINKING_ONLY \
    -DXXH_STATIC_LINKING_ONLY \
    -DZSTD_ADDRESS_SANITIZER=0 \
    -DZSTD_MEMORY_SANITIZER=0 \
    -UFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
    -U__cplusplus \
    -UZSTD_DLL_EXPORT \
    -UZSTD_DLL_IMPORT \
    -UZSTD_MULTITHREAD \
    -RZSTDLIB_API=MEM_STATIC \
    -RZSTDLIB_VISIBILITY=MEM_STATIC \
    -RZSTDERRORLIB_VISIBILITY=MEM_STATIC \
    -DZSTD_HAVE_WEAK_SYMBOLS=0 \
    -DZSTD_TRACE=0 \
    -DZSTD_NO_TRACE
sed -e 's,^\([[:alnum:]_\*]* ERR_[[:alnum:]_]*\)(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* FSE_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* HUF_[[:alnum:]_]*\) \?(,static \1(,' \
    -e 's,^\([[:alnum:]_\*]* HIST_[[:alnum:]_]*\)(,static \1(,' \
    -e 's,^\(const \)\?\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1\2(,' \
    -i zstd/*/*.h
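To illustrate what these expressions do, here is a small sketch that runs
the two ZSTD_ ones over a few sample prototype lines instead of the real
headers. The function name make_static and the sample prototypes are made
up for the demo; the \? syntax assumes GNU sed, as does the -i used above.

```shell
#!/bin/sh
# Illustration only: apply the two ZSTD_ expressions to stdin.
make_static() {
    sed -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
        -e 's,^\(const \)\?\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1\2(,'
}

# Sample lines (hypothetical names). Only prototypes that start at
# column 0 match the ^ anchor; the indented line is left alone.
printf '%s\n' \
    'size_t ZSTD_foo(void* dst);' \
    'const char* ZSTD_bar (size_t code);' \
    '    size_t ZSTD_indented(void);' | make_static
```

With GNU sed the first two lines should come out prefixed with "static"
(and the stray space before the parenthesis dropped), while the indented
line stays unchanged.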
Norbert