[PATCH] Add support for zstd decompression

Norbert Lange nolange79 at gmail.com
Tue Sep 28 09:16:29 UTC 2021


On Sun, Sep 12, 2021 at 11:12 PM Norbert Lange <nolange79 at gmail.com> wrote:
>
> On Sun, Sep 12, 2021 at 9:00 AM Jeff Pohlmeyer
> <yetanothergeek at gmail.com> wrote:
> >
> > On Fri, Sep 10, 2021 at 7:52 PM Denys Vlasenko <vda.linux at googlemail.com> wrote:
> > > I'm getting this:
> >
> > > (add/remove: 96/0 grow/shrink: 6/2 up/down: 24743/-98)      Total: 24645 bytes
>
> I can bring this down a bit by declaring all functions static;
> inlining and constant propagation do the rest.
>
>   Using git/busybox as source for busybox (x86_64, gcc 10)
>   GEN     /tmp/build/Makefile
> function                                             old     new   delta
> unpack_zstd_stream                                     -    5070   +5070
> static.HUF_readDTableX1_wksp_bmi2                      -    1755   +1755
> static.ZSTD_decompressBlock_internal                   -    1468   +1468
> static.ZSTD_decompressSequences_body                   -    1429   +1429
> ZSTD_decompressContinue                                -    1062   +1062
> HUF_decompress4X1_usingDTable_internal_body            -     883    +883
> FSE_readNCount_body                                    -     622    +622
> ML_defaultDTable                                       -     520    +520
> LL_defaultDTable                                       -     520    +520
> static.ZSTD_buildFSETable_body                         -     518    +518
> XXH64_digest                                           -     494    +494
> static.FSE_decompress_usingDTable_generic              -     470    +470
> ZSTD_getFrameHeader_advanced                           -     423    +423
> static.XXH64_update_endian                             -     416    +416
> ZSTD_decompressBegin_usingDDict                        -     391    +391
> static.ZSTD_buildSeqTable                              -     375    +375
> ZSTD_execSequenceEnd                                   -     300    +300
> OF_defaultDTable                                       -     264    +264
> ZSTD_decodeFrameHeader                                 -     259    +259
> BIT_initDStream                                        -     258    +258
> ZSTD_DCtx_selectFrameDDict                             -     234    +234
> ZSTD_safecopy                                          -     225    +225
> ML_bits                                                -     212    +212
> ML_base                                                -     212    +212
> .rodata                                            98830   99029    +199
> ZSTD_decompressContinueStream                          -     177    +177
> LL_bits                                                -     144    +144
> LL_base                                                -     144    +144
> OF_bits                                                -     128    +128
> OF_base                                                -     128    +128
> unzstd_main                                            -     126    +126
> static.HUF_decodeStreamX1                              -     117    +117
> BIT_reloadDStream                                      -     114    +114
> ZSTD_overlapCopy8                                      -     107    +107
> ZSTD_clearDict                                         -     105    +105
> ZSTD_frameHeaderSize_internal                          -     103    +103
> HUF_decompress1X1_usingDTable_internal_body            -     102    +102
> ZSTD_wildcopy                                          -      94     +94
> static.unzstd_longopts                                 -      81     +81
> packed_usage                                       34120   34198     +78
> ZSTD_getcBlockSize                                     -      78     +78
> tar_main                                            1290    1360     +70
> FSE_decodeSymbolFast                                   -      58     +58
> BIT_reloadDStreamFast                                  -      50     +50
> setup_transformer_on_fd                              155     204     +49
> FSE_decodeSymbol                                       -      44     +44
> HUF_decodeSymbolX1                                     -      39     +39
> BIT_readBits                                           -      38     +38
> ZSTD_initFseState                                      -      34     +34
> static.dec64table                                      -      32     +32
> static.dec32table                                      -      32     +32
> ZSTD_fcs_fieldSize                                     -      32     +32
> ZSTD_did_fieldSize                                     -      32     +32
> static.ZSTD_customFree                                 -      27     +27
> applet_main                                         3192    3216     +24
> BIT_endOfDStream                                       -      22     +22
> applet_names                                        2747    2767     +20
> repStartValue                                          -      12     +12
> tar_longopts                                         314     321      +7
> static.CSWTCH                                          -       6      +6
> applet_suid                                          100     101      +1
> applet_install_loc                                   200     201      +1
> ------------------------------------------------------------------------------
> (add/remove: 54/0 grow/shrink: 9/0 up/down: 21035/0)        Total: 21035 bytes
>    text    data     bss     dec     hex filename
>  999282   16443    1856 1017581   f86ed busybox_old
> 1020376   16467    1856 1038699   fd96b busybox_unstripped
>
> >
> > > I suspect Facebook et al do not share busybox's zeal about smaller size.
>
> Speed is one of the main selling points of zstd, so that's a bit
> beside the point ;)
> Ideally we could define some macros to get there.
> I believe the simplest explanation is that no one cared enough
> to cleanly separate every option.
>
> >
> > I found this comment on github[1]:
> > "There is no new magic number planned in the foreseeable future.
> > 0xFD2FB528 is intended to be the only magic number for zstd frames."
> >
> > Do you think that implies that at least the basic file format is
> > probably stable?
>
> The format is documented and even published as RFC 8878.
> Digging through the code, I already found some spots that add code to ensure
> no data is produced that old (reference) implementations can't decode
> (i.e. workarounds for bugs).
>
> So going with the reference implementation should be rather safe.
>
> Still I think that being able to track upstream should be the best path.
>
> I did my own patch (some time ago, it just took time to clean it up).
> As far as I can see, it has some bits that are missing in Jeff's patch:
> the unzstd applet is a bit more full-featured and behaves like the reference.
>
> The concept for the upstream sources would be to use tools/scripts
> for most changes (documented in README.source as well).
>
> Extending that to, say, cut out comments or functions that aren't used
> (anything related to compression/dictionaries) should result
> in something that makes upstream syncs simpler and drops about 2/3 of the lines.
>
> $zstd_path/contrib/freestanding_lib/freestanding.py \
> --source-lib $zstd_path/lib \
> --output-lib zstd \
> -DZSTD_NO_INTRINSICS \
> -DZSTD_NO_UNUSED_FUNCTIONS \
> -DZSTD_LEGACY_SUPPORT=0 \
> -DZSTD_STATIC_LINKING_ONLY \
> -DFSE_STATIC_LINKING_ONLY \
> -DHUF_STATIC_LINKING_ONLY \
> -DXXH_STATIC_LINKING_ONLY \
> -DZSTD_ADDRESS_SANITIZER=0 \
> -DZSTD_MEMORY_SANITIZER=0 \
> -UFUZZING_BUILD_MODE_UNSAFE_FOR_PRODUCTION \
> -U__cplusplus \
> -UZSTD_DLL_EXPORT \
> -UZSTD_DLL_IMPORT \
> -UZSTD_MULTITHREAD \
> -RZSTDLIB_API=MEM_STATIC \
> -RZSTDLIB_VISIBILITY=MEM_STATIC \
> -RZSTDERRORLIB_VISIBILITY=MEM_STATIC \
> -DZSTD_HAVE_WEAK_SYMBOLS=0 \
> -DZSTD_TRACE=0 \
> -DZSTD_NO_TRACE
>
> sed -e 's,^\([[:alnum:]_\*]* ERR_[[:alnum:]_]*\)(,static \1(,' \
>   -e 's,^\([[:alnum:]_\*]* FSE_[[:alnum:]_]*\) \?(,static \1(,' \
>   -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' \
>   -e 's,^\([[:alnum:]_\*]* HUF_[[:alnum:]_]*\) \?(,static \1(,' \
>   -e 's,^\([[:alnum:]_\*]* HIST_[[:alnum:]_]*\)(,static \1(,' \
>   -e 's,^\(const \)\?\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1\2(,' \
>   -i zstd/*/*.h
>
> Norbert
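
To illustrate the effect of the sed rules quoted above, here is a minimal
sketch that applies the ZSTD_ rule to a made-up two-line sample
(ZSTD_example_decode is a hypothetical name, not a real zstd function).
It assumes GNU sed, since \? is a GNU extension to basic regular
expressions, and it prints the result instead of editing in place with -i.

```shell
#!/bin/sh
# Hypothetical sample input: one declaration at column 0 and one
# indented call. Neither line is taken from the real zstd sources.
tmp=$(mktemp)
cat > "$tmp" <<'EOF'
size_t ZSTD_example_decode(void* dst, size_t cap);
    return ZSTD_example_decode(dst, cap);
EOF

# The ZSTD_ rule from the quoted command: a line starting with a
# one-word return type followed by a ZSTD_ name and "(" gets "static".
result=$(sed -e 's,^\([[:alnum:]_\*]* ZSTD_[[:alnum:]_]*\) \?(,static \1(,' "$tmp")
rm -f "$tmp"

printf '%s\n' "$result"
```

Only the first line gains the static keyword; the indented call is left
alone because the pattern is anchored at the start of the line. Note that
the rule relies on this formatting: a statement such as "return ZSTD_...("
starting at column 0 would be rewritten too.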

A new version is in a fork: https://github.com/nolange/busybox/commits/zstdapplets
It is down to around 17 KB of added size.
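
As a rough cross-check of the earlier numbers (not a measurement of the
new fork), the size output quoted above can be turned into deltas with
plain shell arithmetic. The figures are copied from the busybox_old and
busybox_unstripped lines; the variable names are only for illustration.

```shell
#!/bin/sh
# text and data columns from the quoted "size" output
old_text=999282;  old_data=16443   # busybox_old
new_text=1020376; new_data=16467   # busybox_unstripped

text_delta=$((new_text - old_text))
data_delta=$((new_data - old_data))
total_delta=$((text_delta + data_delta))

echo "text +$text_delta, data +$data_delta, total +$total_delta bytes"
```

This gives a total slightly above the 21035 bytes reported in the quoted
per-function table.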

A few of the improvements are already upstreamed, and I opened an issue
about how to proceed:
https://github.com/facebook/zstd/issues/2806

I would like to clear up how to proceed.

Norbert

