modules/zstd: Add support for decoding compressed blocks #1857

Closed
wants to merge 103 commits into from

Conversation

lpawelcz
Contributor

@lpawelcz lpawelcz commented Jan 15, 2025

This PR extends the ZstdDecoder with support for decoding compressed blocks.

It supersedes PRs:

The decoder is capable of decoding RAW and RLE literals as well as sequences with predefined FSE tables.
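For readers unfamiliar with the format, here is a minimal Python sketch of how RAW and RLE literals sections are laid out according to RFC 8878. It is not the DSLX code from this PR; `decode_raw_rle_literals` is a hypothetical helper name, and the sketch assumes the input starts at the Literals_Section_Header.

```python
RAW, RLE = 0, 1  # Literals_Block_Type values defined in RFC 8878


def decode_raw_rle_literals(section: bytes) -> bytes:
    """Decode a Raw or RLE literals section (hypothetical helper, not from the PR)."""
    block_type = section[0] & 0x3
    size_format = (section[0] >> 2) & 0x3
    if block_type not in (RAW, RLE):
        raise ValueError("only Raw and RLE literals are handled in this sketch")

    if size_format in (0b00, 0b10):
        # 1-byte header, Regenerated_Size is 5 bits wide
        header_len, size = 1, section[0] >> 3
    elif size_format == 0b01:
        # 2-byte header, Regenerated_Size is 12 bits wide
        header_len, size = 2, (section[0] >> 4) | (section[1] << 4)
    else:
        # 3-byte header, Regenerated_Size is 20 bits wide
        header_len = 3
        size = (section[0] >> 4) | (section[1] << 4) | (section[2] << 12)

    payload = section[header_len:]
    if block_type == RAW:
        return payload[:size]   # literals are stored verbatim
    return payload[:1] * size   # RLE: a single byte repeated `size` times
```

For example, `decode_raw_rle_literals(bytes([0x29]) + b"x")` describes an RLE section with a regenerated size of 5.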
A suite of DSLX tests comprising unit tests of all underlying procs and an integration test was prepared.
The integration test, as in #1654, first generates a random valid ZSTD frame with compressed blocks together with the expected decoded output. The test data is then converted to a DSLX file (example) that is imported by the integration test file.
At the beginning of the test, the default FSE decoding tables are filled with the default distributions from RFC 8878, section 3.1.1.3.2.2 (Default Distributions). Next, the encoded frame is loaded into system memory and the decoder is configured through a set of CSRs to start decoding. The decoder then writes the decoded frame to the output buffer in system memory. Once it finishes, it sends a pulse on the notify channel to signal the end of decoding. The decoder output is compared against the result produced by the reference library.
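To illustrate what "filling the default FSE decoding tables" involves, below is a rough Python sketch that expands the default literals length distribution from RFC 8878 into a decoding table using the table-building procedure described in the RFC. This is only an illustration under that assumption; it does not mirror how the DSLX test preloads the RAMs, and `build_fse_table` is a made-up name.

```python
# Default normalized distribution for literals length codes (RFC 8878),
# used with an accuracy log of 6, i.e. a 64-entry table.
LL_DEFAULT_DIST = [4, 3, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 1, 1,
                   2, 2, 2, 2, 2, 2, 2, 2, 2, 3, 2, 1, 1, 1, 1, 1,
                   -1, -1, -1, -1]
LL_ACCURACY_LOG = 6


def build_fse_table(dist, accuracy_log):
    """Return a list of (symbol, nb_bits, baseline) tuples, one per state."""
    size = 1 << accuracy_log
    # A "-1" entry means "less than 1" probability and takes exactly one cell.
    assert sum(abs(c) for c in dist) == size

    symbols = [None] * size
    high = size - 1
    # "Less than 1" symbols get single cells at the end of the table.
    for sym, count in enumerate(dist):
        if count == -1:
            symbols[high] = sym
            high -= 1

    # Spread the remaining symbols with the fixed step from the RFC,
    # skipping the cells already reserved at the end.
    step = (size >> 1) + (size >> 3) + 3
    pos = 0
    for sym, count in enumerate(dist):
        for _ in range(max(count, 0)):
            symbols[pos] = sym
            pos = (pos + step) & (size - 1)
            while pos > high:
                pos = (pos + step) & (size - 1)

    # Assign the number of bits to read and the baseline for each state.
    next_state = [abs(c) for c in dist]
    table = []
    for sym in symbols:
        x = next_state[sym]
        next_state[sym] += 1
        nb_bits = accuracy_log - (x.bit_length() - 1)
        baseline = (x << nb_bits) - size
        table.append((sym, nb_bits, baseline))
    return table
```

Calling `build_fse_table(LL_DEFAULT_DIST, LL_ACCURACY_LOG)` yields 64 entries; the first few can be cross-checked against the literals length table in the RFC appendix.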

The PR introduces among others:

  • CompressedBlockDecoder - manages both SequenceDecoder and LiteralsDecoder to enable compressed block decoding. Integrated with the top-level ZstdDecoder
  • SequenceDecoder - responsible for decoding the sequence sections of compressed blocks
  • FseDecoder - introduced as the core part of the SequenceDecoder
  • RefillingShiftBuffer - stores bits and outputs an arbitrary number of them, in either forward or backward order, as required by the FSE decoder
  • LiteralsDecoder - capable of decoding RAW, RLE and Huffman-coded literals
  • HuffmanDecoder - used in decoding Huffman-coded literals. Decoded Huffman trees are then used to decode one or four Huffman-coded streams.
  • CommandConstructor - this proc is responsible for sending packets with decoded sequences and literals to the SequenceExecutor proc
  • RamMux and RamDemux - procs used for handling requests/responses to multiple memory models. The procs interface with 3 separate memory buffers for FSE decoding tables.

google-cla bot commented Jan 15, 2025

Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

View this failed invocation of the CLA check for more information.

For the most up to date status, view the checks section at the bottom of the pull request.

@proppy
Member

proppy commented Jan 27, 2025

Can we rebase and consolidate with #1654 ?

@lpawelcz
Contributor Author

lpawelcz commented Feb 5, 2025

> Can we rebase and consolidate with #1654 ?

@proppy Done.
This PR also supersedes #1616, so I closed it in favor of this one.

@@ -0,0 +1,170 @@
# Copyright 2024 The XLS Authors
Member

@proppy proppy Mar 6, 2025

Can you add a docstring about what the module is doing and where it is being used?

Contributor Author

Added the docstring

@@ -0,0 +1,33 @@
# Copyright 2024 The XLS Authors
Member

Can you add a docstring about what those externals are and why they are needed?

Contributor Author

Added the docstring and a License file for the third party sources

@@ -24,6 +26,15 @@ pub struct PlainData<SYMB_WIDTH: u32> {
last: bool, // flush RLE
}

// Structure contains multiple uncompressed symbols.
Member

is that used by zstd?

Contributor Author

It seems like it is not used in this version of the ZSTD Decoder. Let's just remove it for now.

@@ -0,0 +1,753 @@
// Copyright 2024 The XLS Authors
Member

Can you add a docstring about what the module is and why it is needed? (If it's for driving the cocotb test, maybe move it there or into a separate rtl folder?)

Contributor Author

Moved to a separate rtl directory and added short description

@@ -0,0 +1,193 @@
// Copyright 2024 The XLS Authors
Member

Can you move the Verilog file into a separate rtl directory with a README (or a docstring describing the modules)?

Contributor Author

Moved to a separate rtl directory and added short description

@proppy
Member

proppy commented Mar 6, 2025

can you also rebase?

cocotb==1.9.0
cocotbext-axi==0.1.24
cocotb_bus==0.2.1
zstandard==0.23.0
Member

q: does this need to be in sync w/ the C++ dep?

Contributor Author

Since we dropped the C++ tests for the ZSTD Decoder, I believe those don't have to be in sync.

@lpawelcz lpawelcz force-pushed the zstd_compressed_block_dec branch from cae18b3 to 70cbb9d Compare April 9, 2025 14:44
@lpawelcz
Contributor Author

lpawelcz commented Apr 9, 2025

Addressed the review comments. In addition:

  • materialize_internal_fifos had to be disabled for procs with loopback channels
    • Explicitly set materialize_internal_fifos=false for the offending procs and provided a Verilog implementation of the FIFO for the synthesis targets
    • Added the option to procs that had it missing
  • Renamed axi_ram.x file to axi_ram_reader.x
  • Removed CC tests step from the ZSTD module CI workflow
    • No longer applicable as there are no C++ tests in ZSTD now
  • Added missing codegen options (mostly module_name)
  • Removed Reset CSR
    • Adjusted DSLX code, tests and the docs

Member

missing license?

Done - there's a license header now.

@@ -0,0 +1,418 @@
/*
Member

I think this has to go in third_party, since it is not authored by a contributor.

This has been moved to third_party/verilog_axi.

@@ -0,0 +1,391 @@
/*
Member

I think this has to go in third_party, since it is not authored by a contributor.

This has been moved to third_party/verilog_axi.

@@ -0,0 +1,21 @@
pub struct DataArray<BITS_PER_WORD: u32, LENGTH: u32>{
Member

missing license?

Done - there's a license header now.

@@ -0,0 +1,21 @@
pub struct DataArray<BITS_PER_WORD: u32, LENGTH: u32>{
Member

missing license?

Done - there's a license header now.

lpawelcz and others added 7 commits May 16, 2025 17:03
Signed-off-by: Pawel Czarnecki <[email protected]>
* Use materialize_internal_fifos when possible
* Disable the above option for procs with loopback channels
* Add missing module names
* Add xls_fifo_wrapper verilog dependency to the synthesis of the procs without materialized internal fifos

Signed-off-by: Pawel Czarnecki <[email protected]>
No longer applicable - there are no CC tests in ZSTD module

Signed-off-by: Pawel Czarnecki <[email protected]>
Signed-off-by: Wojciech Sipak <[email protected]>
@wsipak wsipak force-pushed the zstd_compressed_block_dec branch from 1f89c87 to 412eb98 Compare May 16, 2025 15:06
@wsipak wsipak force-pushed the zstd_compressed_block_dec branch 3 times, most recently from de1ee9a to d77223d Compare May 19, 2025 16:04
@tmichalak tmichalak force-pushed the zstd_compressed_block_dec branch from d77223d to b327068 Compare May 20, 2025 07:14
@@ -0,0 +1,226 @@
// Copyright 2024 The XLS Authors
Member

Make sure to wrap lines to 100 characters in that file.

@wsipak wsipak force-pushed the zstd_compressed_block_dec branch 2 times, most recently from cb9d65d to 35a215a Compare May 20, 2025 09:46
which did not have the `last` flag set
* DecoderMux
* At the beginning of the simulation or after receiving
`ExtendedBlockDataPacket` with `last` and `last_block` (decoding new ZSTD
Member

justify to 80 columns?

Signed-off-by: Wojciech Sipak <[email protected]>
@wsipak wsipak force-pushed the zstd_compressed_block_dec branch from 35a215a to 6836389 Compare May 20, 2025 12:38
@@ -13,6 +13,11 @@ pyyaml==6.0.1
# We build most of z3 ourselves but building python is really complicated. Just
# use pypi version
z3-solver==4.14.0.0
pytest==8.2.2
cocotb==1.9.0
cocotbext-axi==0.1.24
Member

I don't think this is available in our internal repository, so it would be better to untie this dependency for now and remove the associated script from the PR and import them separately.

@@ -13,6 +13,11 @@ pyyaml==6.0.1
# We build most of z3 ourselves but building python is really complicated. Just
# use pypi version
z3-solver==4.14.0.0
pytest==8.2.2
cocotb==1.9.0
Member

FYI, internally we are on v1.7.1, so if there is anything that's specific to 1.9.0, will need to upgrade the internal copy of cocotb first; as I commented elsewhere it might be better to untie this and the associated python tests from this PR and import this separately.

pytest==8.2.2
cocotb==1.9.0
cocotbext-axi==0.1.24
cocotb_bus==0.2.1
Member

Same here, this is not available yet internally and will need to be imported separately.

cocotb==1.9.0
cocotbext-axi==0.1.24
cocotb_bus==0.2.1
zstandard==0.23.0
Member

FYI, internally this is on 0.19.0

@@ -13,6 +13,11 @@ pyyaml==6.0.1
# We build most of z3 ourselves but building python is really complicated. Just
# use pypi version
z3-solver==4.14.0.0
pytest==8.2.2
Member

Internally this is on 8.0.2

@@ -17,11 +17,11 @@
load("@rules_hdl//place_and_route:build_defs.bzl", "place_and_route")
Member

consider adding https://google.github.io/xls/bazel_rules_macros/#xls_dslx_fmt_test to enforce DSLX formatting.

Member

I think you also want to consider running https://google.github.io/xls/bazel_rules_macros/#xls_dslx_fmt_test to enforce DSLX formatting as part of the CI.

@proppy
Member

proppy commented May 21, 2025

Superseded by #2230

@proppy proppy closed this May 21, 2025