[PATCH] Add support for zstd decompression

Norbert Lange nolange79 at gmail.com
Sun Sep 12 21:12:13 UTC 2021


On Sun, Sep 12, 2021 at 09:00, Jeff Pohlmeyer
<yetanothergeek at gmail.com> wrote:
>
> On Fri, Sep 10, 2021 at 7:52 PM Denys Vlasenko <vda.linux at googlemail.com> wrote:
> > I'm getting this:
>
> > (add/remove: 96/0 grow/shrink: 6/2 up/down: 24743/-98)      Total: 24645 bytes

I can bring this down a bit by declaring all functions static;
inlining and constant propagation do the rest.

  Using git/busybox as source for busybox (x86_64, gcc 10)
  GEN     /tmp/build/Makefile
function                                             old     new   delta
unpack_zstd_stream                                     -    5070   +5070
static.HUF_readDTableX1_wksp_bmi2                      -    1755   +1755
static.ZSTD_decompressBlock_internal                   -    1468   +1468
static.ZSTD_decompressSequences_body                   -    1429   +1429
ZSTD_decompressContinue                                -    1062   +1062
HUF_decompress4X1_usingDTable_internal_body            -     883    +883
FSE_readNCount_body                                    -     622    +622
ML_defaultDTable                                       -     520    +520
LL_defaultDTable                                       -     520    +520
static.ZSTD_buildFSETable_body                         -     518    +518
XXH64_digest                                           -     494    +494
static.FSE_decompress_usingDTable_generic              -     470    +470
ZSTD_getFrameHeader_advanced                           -     423    +423
static.XXH64_update_endian                             -     416    +416
ZSTD_decompressBegin_usingDDict                        -     391    +391
static.ZSTD_buildSeqTable                              -     375    +375
ZSTD_execSequenceEnd                                   -     300    +300
OF_defaultDTable                                       -     264    +264
ZSTD_decodeFrameHeader                                 -     259    +259
BIT_initDStream                                        -     258    +258
ZSTD_DCtx_selectFrameDDict                             -     234    +234
ZSTD_safecopy                                          -     225    +225
ML_bits                                                -     212    +212
ML_base                                                -     212    +212
.rodata                                            98830   99029    +199
ZSTD_decompressContinueStream                          -     177    +177
LL_bits                                                -     144    +144
LL_base                                                -     144    +144
OF_bits                                                -     128    +128
OF_base                                                -     128    +128
unzstd_main                                            -     126    +126
static.HUF_decodeStreamX1                              -     117    +117
BIT_reloadDStream                                      -     114    +114
ZSTD_overlapCopy8                                      -     107    +107
ZSTD_clearDict                                         -     105    +105
ZSTD_frameHeaderSize_internal                          -     103    +103
HUF_decompress1X1_usingDTable_internal_body            -     102    +102
ZSTD_wildcopy                                          -      94     +94
static.unzstd_longopts                                 -      81     +81
packed_usage                                       34120   34198     +78
ZSTD_getcBlockSize                                     -      78     +78
tar_main                                            1290    1360     +70
FSE_decodeSymbolFast                                   -      58     +58
BIT_reloadDStreamFast                                  -      50     +50
setup_transformer_on_fd                              155     204     +49
FSE_decodeSymbol                                       -      44     +44
HUF_decodeSymbolX1                                     -      39     +39
BIT_readBits                                           -      38     +38
ZSTD_initFseState                                      -      34     +34
static.dec64table                                      -      32     +32
static.dec32table                                      -      32     +32
ZSTD_fcs_fieldSize                                     -      32     +32
ZSTD_did_fieldSize                                     -      32     +32
static.ZSTD_customFree                                 -      27     +27
applet_main                                         3192    3216     +24
BIT_endOfDStream                                       -      22     +22
applet_names                                        2747    2767     +20
repStartValue                                          -      12     +12
tar_longopts                                         314     321      +7
static.CSWTCH                                          -       6      +6
applet_suid                                          100     101      +1
applet_install_loc                                   200     201      +1
------------------------------------------------------------------------------
(add/remove: 54/0 grow/shrink: 9/0 up/down: 21035/0)        Total: 21035 bytes
   text    data     bss     dec     hex filename
 999282   16443    1856 1017581   f86ed busybox_old
1020376   16467    1856 1038699   fd96b busybox_unstripped
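
For comparison with the total Denys quoted above, here is the difference
worked out as a small shell sketch. The variable names are made up for
illustration; the two numbers are the "Total" lines from the two
bloat-o-meter runs in this thread.

```shell
# Totals taken from the two bloat-o-meter summaries in this thread.
before=24645   # total reported by Denys
after=21035    # total with all functions declared static

# Bytes saved, and the saving as an integer percentage (rounded down).
saved=$((before - after))
pct=$((saved * 100 / before))

echo "saved $saved bytes (${pct}% of the original growth)"
```

So the static declarations recover roughly a seventh of the added size.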

>
> > I suspect Facebook et al do not share busybox's zeal about smaller size.

Speed is one of zstd's main selling points, so that's a bit
beside the point ;)
Ideally we could define some macros to get there.
I believe the simplest explanation is that no one cared enough
to cleanly separate every option.

>
> I found this comment on github[1]:
> "There is no new magic number planned in the foreseeable future.
> 0xFD2FB528 is intended to be the only magic number for zstd frames."
>
> Do you think that implies that at least the basic file format is
> probably stable?

The format is documented and even published as RFC 8878.
Digging through the code I already found some spots that add code to ensure
no data is produced that old (reference) implementations can't decode
(i.e. workarounds for bugs).

So going with the reference implementation should be rather safe.
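
As an illustration of the magic number quoted above: the value 0xFD2FB528
is stored little-endian, so a zstd frame starts with the bytes
28 B5 2F FD. A minimal sketch of a check, assuming a POSIX shell and od;
the function name is_zstd_frame is made up here and is not part of any
patch in this thread.

```shell
# is_zstd_frame FILE
# Returns 0 if FILE starts with the zstd frame magic, non-zero otherwise.
# (Hypothetical helper, for illustration only.)
is_zstd_frame() {
    # Dump the first 4 bytes as hex and strip od's spacing and newline.
    magic=$(od -An -tx1 -N4 "$1" | tr -d ' \n')
    # 0xFD2FB528 in little-endian byte order.
    [ "$magic" = "28b52ffd" ]
}
```

Usage would be along the lines of
is_zstd_frame some.tar.zst && echo "looks like zstd".
This only looks at the first frame header and says nothing about whether
the rest of the stream is valid.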

Still, I think being able to track upstream is the best path.

I did my own patch (some time ago, it just took time to clean it up).
As far as I can see it has some bits that are missing in Jeff's patch;
the unzstd applet is a bit more full-featured and behaves like the reference.

The concept for upstream sources would be to use tools/scripts
for most changes (documented in README.source as well).

Extending that to, say, cut out comments or functions that aren't used
(anything related to compression/dictionaries) should make
upstream syncs simpler and drop about two thirds of the lines.

$zstd_path/contrib/freestanding_lib/freestanding.py \
--source-lib $zstd_path/lib \
--output-lib zstd \
-DZSTD_NO_INTRINSICS \
-DZSTD_NO_UNUSED_FUNCTIONS \
-DZSTD_LEGACY_SUPPORT=0 \
-DZSTD_STATIC_LINKING_ONLY \
-DFSE_STATIC_LINKING_ONLY \
-DHUF_STATIC_LINKING_ONLY \
-DXXH_STATIC_LINKING_ONLY \
-DZSTD_ADDRESS_SANITIZER=0 \
-DZSTD_MEMORY_SANITIZER=0 \
-UFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
-U__cplusplus \
-UZSTD_DLL_EXPORT \
-UZSTD_DLL_IMPORT \
-UZSTD_MULTITHREAD \
-RZSTDLIB_API=MEM_STATIC \
-RZSTDLIB_VISIBILITY=MEM_STATIC \
-RZSTDERRORLIB_VISIBILITY=MEM_STATIC \
-DZSTD_HAVE_WEAK_SYMBOLS=0 \
-DZSTD_TRACE=0 \
-DZSTD_NO_TRACE

sed -e 's,^\([[:alnum:]_\*]* ERR_[[:alnum:]_]*\)(,static \1(,' \
  -e 's,^\([[:alnum:]_\*]* FSE_[[:alnum:]_]*\) \?(,static \1(,' \
  -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
  -e 's,^\([[:alnum:]_\*]* HUF_[[:alnum:]_]*\) \?(,static \1(,' \
  -e 's,^\([[:alnum:]_\*]* HIST_[[:alnum:]_]*\)(,static \1(,' \
  -e 's,^\(const \)\?\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1\2(,' \
  -i zstd/*/*.h
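
To show what these expressions do, here is the ZSTD_ rule run on two
sample lines instead of in place on the headers. This is only a sketch:
ZSTD_example is a made-up name, and \? is a GNU sed extension, so GNU sed
is assumed.

```shell
# Hypothetical prototype at the start of a line, as it might appear in a header.
proto='size_t ZSTD_example(const void* src);'
# Hypothetical indented call site; the ^ anchor should leave it alone.
call='    return ZSTD_example(src);'

# Same expression as the ZSTD_ rule above, applied to stdin.
rule='s,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,'

proto_out=$(printf '%s\n' "$proto" | sed -e "$rule")
call_out=$(printf '%s\n' "$call" | sed -e "$rule")

echo "$proto_out"   # prototype gains a leading "static"
echo "$call_out"    # indented call is unchanged
```

The prototype becomes "static size_t ZSTD_example(const void* src);" and
the indented call is left alone, since the pattern is anchored to the
start of the line and expects exactly one word before the function name.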

Norbert

