feat: RISCOF signature extraction for RISC-V architecture tests #52
codygunton wants to merge 3 commits into brevis-network:main
Thanks @eason1981 for running my workflow. It looks like everything is fine except a linter error that I have fixed. This means I didn't break anything, but it also doesn't show my new work is correct, since I didn't add any tests of the new API endpoint. Is that alright? I think breakages should be rare, but ofc if you'd like help integrating into your own CI, I'd be happy to consult.
Thanks for adding this sub command to pico-sdk CLI and fixing lint!
I looked through the code, it looks great and shouldn’t break anything.
Great! I tried to make this RISCOF testing setup super portable and reproducible here: https://github.com/eth-act/zkevm-test-monitor. You should be able to execute
Is there anything more I can do to help this get merged?
I believe that’s all for now. I’ll ping @succinctli for another review.
Will you please merge this?
I’ve just reviewed it and it looks good to me. The branch is a bit outdated, so I’ll update it with the latest changes and then proceed with merging.
I’ve opened a follow-up PR here that rebases these changes on top of the latest pico (v1.1.9). @codygunton
The follow-up that @kaiwei-0 linked has been merged 🙏
The RISCOF framework is used for differential testing of RISC-V implementations against reference emulators such as the formally specified Sail model. The dashboard here tracks compliance using the canonical RISC-V architecture tests, executed via RISCOF, and in time we may add additional tests in a custom test suite.
In the RISCOF approach, the target implementation ("DUT") and the reference both dump a region of memory to a file (the "signatures") for a direct value-by-value comparison. This pull request adds the signature extraction mechanism for Pico.
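To make the value-by-value comparison concrete, here is a minimal sketch of the idea. It is not RISCOF's own comparison code, and it assumes each signature dump is plain text with one hexadecimal word per line; the names `parse_signature` and `diff_signatures` are hypothetical.

```python
from itertools import zip_longest


def parse_signature(text):
    """Parse a signature dump, assumed to hold one hex word per non-empty line."""
    return [int(line, 16) for line in text.splitlines() if line.strip()]


def diff_signatures(dut_text, ref_text):
    """Return (index, dut_word, ref_word) for every position that differs.

    If one dump is shorter than the other, the missing word is reported as None.
    """
    dut = parse_signature(dut_text)
    ref = parse_signature(ref_text)
    return [
        (i, d, r)
        for i, (d, r) in enumerate(zip_longest(dut, ref))
        if d != r
    ]


def fmt(word):
    """Render a word as 8 hex digits, or 'missing' when the dump ended early."""
    return "missing" if word is None else f"{word:08x}"


# Illustrative dumps only; these are not real test output.
dut_dump = "00000003\ndeadbeef\n00000000\n"
ref_dump = "00000003\ndeadbeef\n00000001\n"

mismatches = diff_signatures(dut_dump, ref_dump)
for index, dut_word, ref_word in mismatches:
    print(f"word {index}: DUT={fmt(dut_word)} REF={fmt(ref_word)}")
```

A test passes only when the list of mismatches is empty, so any single differing word in the dumped region is enough to flag a divergence from the reference.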
Pico is compliant for every rv32im opcode except `fence`. This indicates a very high degree of compliance; for instance, the `add` test suite alone addresses over 500 edge cases. The `fence`s must be unneeded due to the presence of only a single hart, but they are apparently not properly treated as `nop`s. This should not affect proving in most cases (cf. the recent progress on RTP!), but it does complicate the use of certain tools such as fuzzers that may emit such instructions.

If this PR is accepted, the dashboard will be updated to run a nightly job looking for updates to the main branch, and running the tests in the case of a new commit. This is useful for catching regressions, and also for tracking any ISA changes that may occur.
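On the `fence` point above, the sketch below shows what treating the instruction as a no-op could look like in a single-hart interpreter. It is not Pico's code: `is_fence` and `step` are hypothetical names, and only the instruction encoding (major opcode `0001111`, `funct3` of `000`) comes from the RISC-V base ISA.

```python
OPCODE_MISC_MEM = 0b0001111  # major opcode shared by FENCE and FENCE.I


def is_fence(instr):
    """True if the 32-bit instruction word is a base-ISA FENCE."""
    opcode = instr & 0x7F          # bits 6..0
    funct3 = (instr >> 12) & 0x7   # bits 14..12
    return opcode == OPCODE_MISC_MEM and funct3 == 0b000


def step(pc, instr):
    """Hypothetical executor step that handles only FENCE.

    With a single hart there is no other observer to order memory accesses
    against, so the instruction just advances the program counter.
    """
    if is_fence(instr):
        return pc + 4
    raise NotImplementedError(f"instruction {instr:#010x} is not modelled here")


FENCE_IORW_IORW = 0x0FF0000F  # encoding of `fence iorw, iorw`
print(hex(step(0x100, FENCE_IORW_IORW)))
```

Note that `fence.i` (from the Zifencei extension) shares the major opcode but has a `funct3` of `001`, so this check deliberately does not match it.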