emit number(0) (offset(0)??) for instructions like "XOR EAX, EAX" #2622
Comments
good idea. i tend to think 'number: 0' we should definitely add. offset i'm not so sure. |
As capa doesn’t track register values, so |
Adding number(0) makes sense. As a downside this will create lots of additional features potentially affecting performance?! Maybe we can create stats/benchmarks for a few samples to verify. |
I suspect we'd see this feature about once per function, since I've noticed many compilers will reserve a general purpose register for the zero value, when possible. Sometimes there might be a couple instances per function. imho, this probably won't be a noticeable amount of garbage, but I agree we should figure out a way to benchmark and verify. |
Hey, @williballenthin, @mike-hunhoff and @mr-tz . I would like to work on this issue. I have explored how a few capa backends extract features from the file. I am thinking of using |
I would start by using something like hyperfine to benchmark the invocation of capa against a fairly complex sample, like mimikatz. Only if there's a measurable difference before/after the changes should we dive into the line-level profiling. I think it's likely that the performance delta is so small it is hidden by random noise. |
The first one in each image is from the current master branch and the second one is from my implementation for

    if insn.mnem == "xor" and insn.opers[0].isReg() and insn.opers[1].isReg() and insn.opers[0].reg == insn.opers[1].reg:
        # for pattern like:
        #
        #     xor eax, eax
        #
        yield Number(0), ih.address

    # this is for both x32 and x64
    if not isinstance(oper, (envi.archs.i386.disasm.i386ImmOper, envi.archs.i386.disasm.i386ImmMemOper)):
        return
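The same check can be tried out without vivisect installed. The following is a minimal sketch, assuming a hypothetical `Operand`/`Instruction` model: these classes, `is_zeroing_xor`, and `extract_numbers` are illustrative stand-ins, not capa's or vivisect's real API. It only shows the idea of reporting the number 0 when a register is XORed with itself.

```python
from dataclasses import dataclass
from typing import Iterator, Optional, Tuple


@dataclass(frozen=True)
class Operand:
    # hypothetical stand-in for a disassembler operand (not a vivisect class)
    reg: Optional[str] = None  # register name when the operand is a register
    imm: Optional[int] = None  # immediate value when the operand is a constant

    def is_reg(self) -> bool:
        return self.reg is not None


@dataclass(frozen=True)
class Instruction:
    # hypothetical stand-in for a decoded instruction
    address: int
    mnem: str
    opers: Tuple[Operand, ...]


def is_zeroing_xor(insn: Instruction) -> bool:
    """Return True for `xor reg, reg` where both operands are the same register."""
    return (
        insn.mnem == "xor"
        and len(insn.opers) == 2
        and insn.opers[0].is_reg()
        and insn.opers[1].is_reg()
        and insn.opers[0].reg == insn.opers[1].reg
    )


def extract_numbers(insn: Instruction) -> Iterator[Tuple[int, int]]:
    """Yield (number, address) pairs, treating a zeroing xor as the number 0."""
    if is_zeroing_xor(insn):
        # pattern like: xor eax, eax
        yield 0, insn.address
        return
    for oper in insn.opers:
        # otherwise only immediate operands produce numbers
        if oper.imm is not None:
            yield oper.imm, insn.address
```

With this model, `xor eax, eax` yields the number 0 at the instruction's address, while `xor eax, ebx` yields nothing, because the result depends on register values that are not tracked.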
great @v1bh475u! the implementation looks nice and straightforward - good job. i'm not sure how to read the screenshots, since there's two of them :-) |
I think in the first screenshot, the 2nd test might have been influenced by other processes running on my machine. Let me retry to ensure that. |
one thing that might also come into play is CPU throttling, which may penalize whatever test is going second (when CPU is warmer so might be throttled). so, i'd recommend doing the testing on dedicated non-laptop hardware (if possible), or alternating the tests more so that both cases have a chance to run "second", or letting the system cool down before running new tests. |
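One way to alternate the runs so that neither variant always goes second is sketched below. This is only an illustration using the Python standard library, not the tooling used in this thread; the function names and the command lines in the usage note are assumptions.

```python
import statistics
import subprocess
import time
from typing import Dict, List, Sequence, Tuple


def time_command(cmd: Sequence[str]) -> float:
    """Run cmd once and return the elapsed wall-clock time in seconds."""
    start = time.perf_counter()
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
    return time.perf_counter() - start


def alternating_benchmark(
    commands: Dict[str, Sequence[str]], rounds: int = 5
) -> Dict[str, List[float]]:
    """Time each command `rounds` times, rotating which one runs first."""
    names = list(commands)
    if not names:
        raise ValueError("at least one command is required")
    timings: Dict[str, List[float]] = {name: [] for name in names}
    for i in range(rounds):
        # rotate the order each round so no variant always runs on a warmer CPU
        shift = i % len(names)
        for name in names[shift:] + names[:shift]:
            timings[name].append(time_command(commands[name]))
    return timings


def summarize(timings: Dict[str, List[float]]) -> Dict[str, Tuple[float, float]]:
    """Return (mean, standard deviation) in seconds for each command."""
    return {
        name: (statistics.mean(v), statistics.stdev(v) if len(v) > 1 else 0.0)
        for name, v in timings.items()
    }
```

A call might look like `alternating_benchmark({"master": [...], "patched": [...]}, rounds=10)`, where each list is the command line that runs capa from the corresponding checkout against the same sample (the exact commands depend on the local setup). If the two means differ by less than the standard deviations, the difference is probably noise.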
Agreed. I will run further tests and update them here soon. |
@williballenthin @mike-hunhoff @mr-tz , shall I start working on similar modifications for the other backends that I can test? |
yes, please do! and, please add a test case to demonstrate the new functionality. |
@williballenthin , I have an older version of binaryninja given by one of my friends (version: |
Most likely, @xusheng6 did various updates around feature extraction. |
Also, we want the tests to run as fast as possible, right? |
Yeah version 3.5 is a bit old. Could you please test it against version 4.2 and see if it works? |
@v1bh475u if you don't have a license for Binary Ninja, don't worry about it. Once you have a unit test and an implementation for IDA, I'd be happy to handle the Binary Ninja side. I expect it's only a few lines, like for IDA. |
@williballenthin I have made the PR and also added test case for |
+1 for this feature, just ran into this while working on mandiant/capa-rules#1046. i want to match on 3 arguments to a function being 0 but MSVC is emitting xor's to zero out the registers:
|
see mandiant/capa-rules#993 (comment)