Draft
…mption is that the TCDM is 32b wide with an element width of 8b. However, low-precision workloads might have different memory requirements. This PR handles this issue. Additionally, another stride is added to the address generator to enable input tiling, which is useful for output-stationary dataflow.
d0_stride -> LAYER_C_IN (next element in W to fill the input buffer -> PE_W dimension)
d1_stride -> LAYER_C_IN * W_IN (next element in H to fill the input buffer -> PE_H dimension)
d2_stride -> TP_IN (number of input channels supported in a tile at once -> ceil(LAYER_C_IN//TP_IN) dimension)
d3_stride -> LAYER_C_IN (next element in W -> FS dimension)
d4_stride -> LAYER_C_IN * W_IN (next element in H -> FS dimension)
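The stride table can be read as a five-level nested loop. The Python sketch below is only a behavioral illustration, not the RTL: it assumes d0 is the innermost loop, that addresses are counted in elements, and that the per-level iteration counts are supplied by the caller. The names `layer_strides`, `num_channel_tiles`, and `input_addresses` are hypothetical and do not appear in the PR.

```python
from itertools import product


def layer_strides(layer_c_in, w_in, tp_in):
    """Return [d0..d4] strides in elements, following the table in the PR text."""
    return [
        layer_c_in,          # d0: next element in W (PE_W dimension)
        layer_c_in * w_in,   # d1: next element in H (PE_H dimension)
        tp_in,               # d2: next tile of input channels
        layer_c_in,          # d3: next element in W (FS dimension)
        layer_c_in * w_in,   # d4: next element in H (FS dimension)
    ]


def num_channel_tiles(layer_c_in, tp_in):
    """ceil(LAYER_C_IN / TP_IN) using integer arithmetic only."""
    return -(-layer_c_in // tp_in)


def input_addresses(base, strides, counts):
    """Yield element addresses of a nested loop.

    ASSUMPTION: strides[0]/counts[0] describe the innermost loop; the real
    nesting order of the address generator may differ.
    """
    # product() varies its last range fastest, so the ranges are built
    # outermost-first by reversing counts.
    for idx in product(*(range(c) for c in reversed(counts))):
        # idx is ordered outermost..innermost; flip it back to d0..dN.
        yield base + sum(i * s for i, s in zip(reversed(idx), strides))
```

For example, with `LAYER_C_IN = 4`, `W_IN = 8`, `TP_IN = 2` and a 2x2 input buffer (`counts = [2, 2, 1, 1, 1]`), the sketch walks two elements along W, then steps one row down in H and repeats.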
…of data. In theory, however, a bank does not necessarily need to hold 4 bytes or 4 elements. This commit removes these assumptions by making the bank size configurable via the ADDR_OFFSET parameter, which is derived from ELEMENTS_PER_BANK.
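To make the effect of a configurable bank size concrete, here is a small Python sketch. It rests on assumptions the commit message does not confirm: that ADDR_OFFSET equals log2 of the bank word size in bytes, that the bank word size is ELEMENT_WIDTH * ELEMENTS_PER_BANK bits, and that banks are word-interleaved. The functions `bank_geometry` and `split_address` are hypothetical helpers for illustration only.

```python
def bank_geometry(element_width, elements_per_bank):
    """Return (bank_bits, bank_bytes, addr_offset) for one bank word.

    ASSUMPTION: addr_offset = log2(bytes per bank word); the actual RTL
    derivation of ADDR_OFFSET may differ.
    """
    bank_bits = element_width * elements_per_bank
    if bank_bits <= 0 or bank_bits % 8:
        raise ValueError("bank word must be a positive whole number of bytes")
    bank_bytes = bank_bits // 8
    if bank_bytes & (bank_bytes - 1):
        raise ValueError("bank word size in bytes must be a power of two")
    return bank_bits, bank_bytes, bank_bytes.bit_length() - 1


def split_address(byte_addr, n_banks, addr_offset):
    """Split a byte address into (bank index, row in bank, byte in word).

    ASSUMPTION: consecutive bank words are interleaved across n_banks banks.
    """
    word = byte_addr >> addr_offset
    byte_in_word = byte_addr & ((1 << addr_offset) - 1)
    return word % n_banks, word // n_banks, byte_in_word
```

Under these assumptions, the classic case of four 8b elements per bank gives a 32b word and an offset of 2, while four 4b elements give a 16b word and an offset of 1, so the same byte address lands in a different bank and row.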
This pull request updates the HCI (Heterogeneous Cluster Interconnect) modules to support flexible element and bank widths, improve address generation, and adjust the related data alignment logic. The streaming and routing modules now support parameterized data widths and more generic, scalable memory banking, and the address generation logic moves to version 4. Below are the most important changes grouped by theme:
1. Parameterization and Data Alignment Enhancements
… (ELEMENT_WIDTH, ELEMENTS_PER_BANK, etc.) to both hci_core_sink and hci_core_source modules, enabling more flexible configuration of element and bank widths. This includes calculation of derived localparams for data alignment and addressing. (rtl/core/hci_core_sink.sv [1] [2]; rtl/core/hci_core_source.sv [3] [4] [5])
… (rtl/core/hci_core_sink.sv [1]; rtl/core/hci_core_source.sv [2] [3])
2. Address Generation and Interface Updates
… addressgen_v3 to addressgen_v4 in both sink and source modules, and updated associated control and flag structures in the common package. (rtl/common/hci_package.sv [1]; rtl/core/hci_core_sink.sv [2]; rtl/core/hci_core_source.sv [3])
… (rtl/core/hci_core_sink.sv [1]; rtl/core/hci_core_source.sv [2])
3. Router Module Generalization
… hci_router module to support configurable bank word and element widths, updating all relevant localparams and logic for data, byte enable, and address calculation. (rtl/interco/hci_router.sv [1] [2] [3] [4] [5])
… (rtl/interco/hci_router.sv; rtl/interco/hci_router.sv L193-R209)
4. Miscellaneous and Protocol Updates
… no_valid_transfers flag to the streamer flags structure and its assignment in the sink, to signal when no transfers are in flight. (rtl/common/hci_package.sv [1]; rtl/core/hci_core_sink.sv [2])
… (rtl/core/hci_core_sink.sv [1]; rtl/core/hci_core_source.sv [2])
These updates collectively make the HCI modules more flexible, scalable, and maintainable, supporting a wider range of memory configurations and improving the robustness of data streaming and routing.