afb5175 to 1a03831
mp-17
left a comment
Hi Navaneeth, thanks! This is probably better than before (at least it's more permissive and does not trap on legal code), but I am not sure it fully complies with the spec's requirements:
- The destination register should be checked against the current LMUL and SEW.
- The source registers should be checked against their EMUL and EEW.
- The rules for overlapping are the following (look at the third requirement):
A destination vector register group can overlap a source vector register group only if one of the following holds:
⚫ The destination EEW equals the source EEW.
⚫ The destination EEW is smaller than the source EEW and the overlap is in the lowest-numbered part of the source register group (e.g., when LMUL=1, vnsrl.wi v0, v0, 3 is legal, but a destination of v1 is not).
⚫ The destination EEW is greater than the source EEW, the source EMUL is at least 1, and the overlap is in the highest-numbered part of the destination register group (e.g., when LMUL=8, vzext.vf4 v0, v6 is legal, but a source of v0, v2, or v4 is not).
What happens if you try out the following?
With LMUL=8:
vzext.vf4 v0, v6 # ok
vzext.vf4 v0, v0 # trap
vzext.vf8 v4, v16 # trap
Do the last two cases trap?
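For reference, the checks listed above can be sketched as a small Python model. This is only an illustration, not the RTL in this PR: the function names are hypothetical, SEW=64 is an assumed value (chosen so that the vf8 source EEW is still at least 8 bits), and the overlap conditions are one reading of the quoted spec text for aligned register groups.

```python
from fractions import Fraction


def group_size(emul):
    """Number of architectural registers occupied by a group with this EMUL."""
    return max(1, int(emul))


def is_aligned(reg, emul):
    """A register group must start at a multiple of its size (EMUL > 1)."""
    return reg % group_size(emul) == 0


def overlap_is_legal(vd, d_eew, d_emul, vs, s_eew, s_emul):
    """Apply the three overlap rules quoted above to aligned groups."""
    d_regs = set(range(vd, vd + group_size(d_emul)))
    s_regs = set(range(vs, vs + group_size(s_emul)))
    if not (d_regs & s_regs):
        return True  # no overlap at all
    if d_eew == s_eew:
        return True  # rule 1
    if d_eew < s_eew:
        # rule 2: destination sits in the lowest-numbered part of the source
        return vd == vs
    # rule 3: source EMUL >= 1 and source sits at the top of the destination
    return s_emul >= 1 and max(s_regs) == max(d_regs)


def vzext_is_legal(vd, vs, factor, sew, lmul):
    """Model of the operand checks for vzext.vf<factor> vd, vs."""
    d_eew, d_emul = Fraction(sew), Fraction(lmul)
    s_eew, s_emul = d_eew / factor, d_emul / factor
    if s_eew < 8 or s_emul < Fraction(1, 8):
        return False  # source EEW or EMUL outside the supported range
    if not (is_aligned(vd, d_emul) and is_aligned(vs, s_emul)):
        return False  # destination checked with LMUL, source with EMUL
    return overlap_is_legal(vd, d_eew, d_emul, vs, s_eew, s_emul)


# The three cases from this comment, with LMUL=8 and an assumed SEW=64.
for vd, vs, factor in [(0, 6, 4), (0, 0, 4), (4, 16, 8)]:
    legal = vzext_is_legal(vd, vs, factor, sew=64, lmul=8)
    print(f"vzext.vf{factor} v{vd}, v{vs}:", "ok" if legal else "trap")
```

In this model the second case is rejected by the overlap rule, while the third is rejected earlier because v4 is not aligned to a destination group of eight registers.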
Anyway, if the legal cases are now passing, we can merge this.
Hi Matteo, you are right, there are more cases to be considered. Here I just want to avoid trapping on legal code. Of the three cases you mentioned, cases 1 and 3 work as expected now. If you are okay with the changes, you can merge them.
Great! Thanks again!
Vector integer extension operations have src EMUL smaller than LMUL and can have registers in the EMUL group. For example, vzext.vf2 v2, v1 when LMUL=2 is a valid instruction.
Changelog
Fixed
lmul_vs2 with the appropriate src EMUL for illegal src operand check
Added
Checklist