[2025 Autumn][T2-1-1] scbz4learning #915
Open
This is the most naive version.
I wanted to port Marlin, but ran out of time.
HONOR_CODE.md
REFERENCE.md
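There are still many optimizations that could be made.
The most important one is a fused operator. Without it, dequantizing first and then running GEMM costs two rounds of device-host communication, which is too much overhead.
The operator itself could also be optimized further, but the comments in the AWQ code provided by the framework come from NVIDIA's official source, and the syntax there is still hard to make sense of. This still needs testing.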
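
To illustrate the fusion idea mentioned above, here is a minimal CPU reference sketch in plain Python. It is not the kernel in this PR. It assumes an AWQ-style group-wise scheme where each weight is recovered as `(q - zero) * scale`, with one scale and zero point per group of `group_size` rows along K, and it ignores int4 packing. All function names (`dequantize`, `gemm`, `fused_dequant_gemm`) are hypothetical. The unfused path materializes the whole float weight matrix before multiplying; the fused path dequantizes each weight inside the accumulation loop, so no intermediate matrix is written out.

```python
# Hypothetical reference sketch: unfused vs. fused dequantize + GEMM.
# Assumption: w[k][n] = (q[k][n] - zeros[g][n]) * scales[g][n], g = k // group_size.

def dequantize(q, scales, zeros, group_size):
    """Unfused step 1: materialize the full K x N float weight matrix."""
    K, N = len(q), len(q[0])
    return [
        [(q[k][n] - zeros[k // group_size][n]) * scales[k // group_size][n]
         for n in range(N)]
        for k in range(K)
    ]

def gemm(x, w):
    """Unfused step 2: plain (M x K) @ (K x N) matrix multiply."""
    M, K, N = len(x), len(w), len(w[0])
    return [
        [sum(x[m][k] * w[k][n] for k in range(K)) for n in range(N)]
        for m in range(M)
    ]

def fused_dequant_gemm(x, q, scales, zeros, group_size):
    """Fused: dequantize each weight on the fly inside the inner loop."""
    M, K, N = len(x), len(q), len(q[0])
    out = [[0.0] * N for _ in range(M)]
    for m in range(M):
        for n in range(N):
            acc = 0.0
            for k in range(K):
                g = k // group_size
                w = (q[k][n] - zeros[g][n]) * scales[g][n]  # never stored
                acc += x[m][k] * w
            out[m][n] = acc
    return out

# Tiny example: M=2, K=4, N=2, two groups of two rows each.
x = [[1.0, 2.0, 3.0, 4.0], [0.5, -1.0, 0.0, 2.0]]
q = [[3, 15], [0, 8], [7, 1], [12, 4]]       # 4-bit values in 0..15
scales = [[0.5, 0.25], [0.125, 1.0]]         # one row per group
zeros = [[8, 8], [8, 0]]                     # one row per group
group_size = 2

unfused = gemm(x, dequantize(q, scales, zeros, group_size))
fused = fused_dequant_gemm(x, q, scales, zeros, group_size)
```

Both paths should produce the same result; on a GPU the fused form would correspond to a single kernel that reads the quantized weights once, rather than two separate steps with an intermediate buffer between them.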