Skip to content

emit number(0) for insn like "xor eax, eax" #2639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 24 commits into
base: master
Choose a base branch
from

Conversation

v1bh475u
Copy link
Contributor

closes #2622

Checklist

  • No CHANGELOG update needed
  • No new tests needed
  • No documentation update needed

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hello @v1bh475u, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

Summary of Changes

This pull request addresses issue #2622 and introduces a new feature to emit number(0) for instructions like xor eax, eax. This optimization helps in identifying and representing a common code pattern where a register is zeroed by XORing it with itself. The changes include modifications to capa/features/extractors/viv/insn.py to implement the feature and tests/fixtures.py to add a new test case. A changelog entry is also added.

Highlights

  • New Feature: The pull request introduces a new feature to emit number(0) for instructions like xor eax, eax.
  • Code Modification: The capa/features/extractors/viv/insn.py file is modified to implement the new feature.
  • Test Case: A new test case is added to tests/fixtures.py to verify the new feature.

Changelog

  • CHANGELOG.md
  • capa/features/extractors/viv/insn.py
    • Added logic to detect xor eax, eax pattern and yield Number(0) feature at the instruction address (lines 597-607).
  • tests/fixtures.py
    • Added a new test case for the xor eax, eax instruction pattern to verify the Number(0) feature extraction (line 1024).
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in issue comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist is currently in preview and may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments to provide feedback.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.


Trivia time!

What does XOR stand for in the context of bitwise operations?

Click here for the answer
XOR stands for exclusive OR.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

The pull request introduces a new feature to emit number(0) for instructions like "xor eax, eax", enhancing the detection capabilities of capa. The changes include modifications to the CHANGELOG, the core feature extraction logic, and the test suite. Overall, the changes seem well-structured and address the intended issue. However, there are a few areas that could benefit from further attention.

Summary of Findings

  • Clarity of XOR Instruction Detection Logic: The logic for detecting the XOR instruction pattern could be made more explicit and potentially more robust by using constants for register names instead of directly comparing register objects. This would improve readability and maintainability.
  • Test Coverage: While a new test case has been added, it might be beneficial to include additional test cases that cover different register combinations or edge cases to ensure the feature works as expected across various scenarios.

Merge Readiness

The pull request is almost ready for merging. Addressing the clarity of the XOR instruction detection logic and expanding the test coverage would further enhance the quality and reliability of the changes. I am unable to directly approve the pull request, and users should have others review and approve this code before merging. I recommend addressing the medium severity issues before merging.

@v1bh475u
Copy link
Contributor Author

@williballenthin , I have tested binexport2 for arch64 but could not find any binary with arm architecture. However, I am confident that it will work. But let know me if there is any present in the database. We should add a test for that too.

@mr-tz
Copy link
Collaborator

mr-tz commented Mar 24, 2025

I think we could emit these features from the 'extract_insn_nzxor_characteristic_features' functions to avoid duplicate code.

@v1bh475u
Copy link
Contributor Author

v1bh475u commented Mar 24, 2025

I think we could emit these features from the 'extract_insn_nzxor_characteristic_features' functions to avoid duplicate code.

I think the extract_insn_nzxor_charateristic_features is its own particular route and we should try to emit these features from it. Instead, what we can do is abstract away their logic into new helper functions and use them in place of raw checking. Shall I change to that?

@mr-tz
Copy link
Collaborator

mr-tz commented Mar 25, 2025

Interesting idea, can you show these changes for the vivisect extractor as an example?

Pro of the suggestion is a better separation, con is slightly more code.

@v1bh475u
Copy link
Contributor Author

@mr-tz , can you please review the example implementation? If you approve this, I can do the same for other backends too.

@v1bh475u
Copy link
Contributor Author

@mr-tz , can you please review the example implementation? If you approve this, I can do the same for other backends too.

@mr-tz ?

@v1bh475u
Copy link
Contributor Author

v1bh475u commented Mar 27, 2025

Reason: The previous commit contained a logical bug, fixed in this commit.

return insn.opers[0] == insn.opers[1]


def is_zxor(insn: envi.Opcode):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we're fine without this helper here? The other ones are fine I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fair enough

@v1bh475u
Copy link
Contributor Author

@williballenthin & @mr-tz , I have implemented the requested changes. Please review them whenever you are free.
Also, please run the tests for binary ninja backend and let know me if you run into any issue.

@v1bh475u
Copy link
Contributor Author

v1bh475u commented Apr 2, 2025

@mr-tz @williballenthin, I've made the requested changes. Please review whenever you are free.

Copy link
Collaborator

@mr-tz mr-tz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you ensure CI passes as well please?

@v1bh475u
Copy link
Contributor Author

v1bh475u commented Apr 2, 2025

Done with the changes

@v1bh475u v1bh475u requested a review from mr-tz April 3, 2025 14:33
@v1bh475u
Copy link
Contributor Author

@mr-tz @williballenthin, can you please take a look at the changes that I have made whenever you are free and tell me if there is a need for further modifications?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

emit number(0) (offset(0)??) for instructions like "XOR EAX, EAX"
3 participants