|  | 1 | +#+title: Reverse Engineering | 
|  | 2 | + | 
|  | 3 | +* Start here | 
|  | 4 | +** General tips | 
|  | 5 | +- figure out what the goal is | 
|  | 6 | +  - there is usually a clear "win condition", such as printing a flag | 
|  | 7 | +- figure out what the input is | 
|  | 8 | +  - some parts of the program don't change depending on the input | 
|  | 9 | +  - it might not matter what the input is! | 
|  | 10 | +  - how does the input get used? | 
|  | 11 | +** A note about past meetings | 
|  | 12 | +SIGPwny has already run two meetings on this topic! Check out [[https://sigpwny.com/meetings/fa2023/2023-09-17/][Reverse Engineering Setup]] and [[https://sigpwny.com/meetings/fa2023/2023-09-21/][Reverse Engineering I]]. We have slides and recorded meeting presentations, which you may prefer to these notes. | 
|  | 13 | +* Basics | 
|  | 14 | +** What it is | 
|  | 15 | +Reverse engineering is the process of understanding computer programs. The goal is to figure out what the program does. Usually, programs are difficult to understand, either intentionally or unintentionally. | 
|  | 16 | +** Main types of analysis | 
|  | 17 | +- Static analysis: reading code, using tools to understand code /without running it/ | 
|  | 18 | +  - Good place to start, not great if there's a lot of code | 
|  | 19 | +- Dynamic analysis: running code, inspecting or modifying the program as it's running | 
|  | 20 | +  - Generally faster, and shows the program's actual state (memory, registers, environment) as it runs | 
|  | 21 | +** A word on abstractions | 
|  | 22 | +- Abstract (higher level) programs are easier to understand | 
|  | 23 | +- Languages like Python and JavaScript are higher level | 
|  | 24 | +- Languages like assembly and C are lower level | 
|  | 25 | +- As you modify a program to become more abstract (to better understand it), you lose some information in the process | 
|  | 26 | +* Tools | 
|  | 27 | +** Bytecode viewer | 
|  | 28 | +*** Installation | 
|  | 29 | +- see https://github.com/Konloch/bytecode-viewer | 
|  | 30 | +*** When to use | 
|  | 31 | +Use this program to decompile compiled Java programs, which usually have the =.jar= extension. | 
|  | 32 | +*** How to use | 
|  | 33 | +Simply import the JAR file into Bytecode Viewer to see the decompiled Java code! This works by recovering Java source code from the compiled Java bytecode. | 
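A JAR file is an ordinary ZIP archive of compiled =.class= files, so you can peek inside one before opening a decompiler. The sketch below uses Python's standard =zipfile= module to list the classes in a JAR. The helper names =list_classes= and =is_class_file= and the file name =chal.jar= are made up for illustration.

```python
import zipfile


def list_classes(jar_path):
    """Return the names of the compiled .class files inside a JAR."""
    # A JAR is a ZIP archive, so the standard zipfile module can read it.
    with zipfile.ZipFile(jar_path) as jar:
        return sorted(name for name in jar.namelist() if name.endswith(".class"))


def is_class_file(data):
    """Compiled Java class files start with the magic bytes CA FE BA BE."""
    return data[:4] == b"\xca\xfe\xba\xbe"
```

For example, =list_classes("chal.jar")= returns the class names you can expect to see in the decompiler's file tree.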
|  | 34 | +** Ghidra | 
|  | 35 | +*** Installation | 
|  | 36 | +- see [[https://sigpwny.com/meetings/fa2023/2023-09-17/][Reverse Engineering Setup]] | 
|  | 37 | +- or, just read the [[https://ghidra-sre.org/InstallationGuide.html][installation guide]] | 
|  | 38 | +*** When to use | 
|  | 39 | +Use this tool for binaries (compiled executables), not Python scripts. Ghidra "decompiles", or simplifies, binary programs into more human-readable "pseudo-C" code. | 
|  | 40 | + | 
|  | 41 | +Ghidra is a *static analysis* tool. | 
|  | 42 | +*** Interface | 
|  | 43 | +[[./images/ghidra1.png]] | 
|  | 44 | + | 
|  | 45 | +Once you open a program in Ghidra, click "OK" for all the auto analyze popups (there should be several). Now, the interface should look like the above image. | 
|  | 46 | + | 
|  | 47 | +(1) is the decompiled code output. This is what you will be looking at for the most part. You can rename a variable by clicking it and pressing =L=. Change its type by right-clicking it and selecting =Retype Variable=. | 
|  | 48 | + | 
|  | 49 | +(2) is the assembly instructions. This won't be very helpful if you don't know assembly, and can be mostly ignored for the challenges at Fall CTF. | 
|  | 50 | + | 
|  | 51 | +(3) is the "symbol tree". This shows you the different named values that are present in the file. Click =Functions= and scroll down to select the =main= function. This is usually where the program's own code starts running. | 
|  | 52 | + | 
|  | 53 | +[[./images/ghidra2.png]] | 
|  | 54 | + | 
|  | 55 | +Here we can see the =main= function in the symbol tree. If there is no =main=, click =_start= and see what that function calls. | 
|  | 56 | + | 
|  | 57 | +[[./images/ghdira3.png]] | 
|  | 58 | + | 
|  | 59 | +Above is a picture of the decompilation (disclaimer: this is not a challenge from Fall CTF). Almost every function you see will have an if statement with =__stack_chk_fail= at the bottom. This is a check for the "stack canary", which is not relevant to any challenges here. It may be of more interest in pwn challenges. The ~local_10 = *(long *)(in_FS_OFFSET + 0x28);~ line at the top sets up the stack canary and can also be ignored. | 
|  | 60 | + | 
|  | 61 | +Note that the variables have undescriptive names, such as =iVar1= and =local_28=. This is because the original variable names are usually not stored in the compiled program, so the decompiler has to generate names of its own. | 
|  | 62 | +** GDB | 
|  | 63 | +*** Installation | 
|  | 64 | +- see [[https://sigpwny.com/meetings/fa2023/2023-09-17/][Reverse Engineering Setup]] | 
|  | 65 | +*** When to use | 
|  | 66 | +As with Ghidra, use this tool for binaries, not Python scripts. GDB is a debugger that runs programs, giving you the ability to stop, inspect, and modify a program as it is executing. | 
|  | 67 | + | 
|  | 68 | +GDB is a *dynamic analysis* tool. | 
|  | 69 | +*** Basics | 
|  | 70 | +Run =gdb ./chal= on the command line, where =chal= is the name of the program. Note that you must be on Linux (WSL works too). This will not work for Apple Silicon Mac users. | 
|  | 71 | + | 
|  | 72 | +GDB starts an interactive session with its own prompt, where each line starts with =(gdb)=. You interact with it by typing in commands. | 
|  | 73 | +*** Commands | 
|  | 74 | +- misc | 
|  | 75 | +  - =help <command>=: get help about any of the commands listed here | 
|  | 76 | +- running | 
|  | 77 | +  - =run=: run the program from the start | 
|  | 78 | +  - =quit=: exit GDB | 
|  | 79 | +  - =start=: start the program and break on the =main= function | 
|  | 80 | +- breakpoints | 
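|  | 81 | +  - =break *<func>+<offset>=: set a breakpoint =<offset>= bytes past the start of the function =<func>=. The =*= tells GDB to treat what follows as an address. You can read the offset from the output of the =disas= command. Plain =break <func>= breaks at the start of the function | 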
|  | 82 | +- inspecting program | 
|  | 83 | +  - =disas <func>=: disassemble the =<func>= function | 
|  | 84 | +  - =info reg=: print all the registers | 
|  | 85 | +  - =x=: print data (see =help x= for more info) | 
|  | 86 | +    - =x/4gx 0x1234=: print 4 QWORDS (64-bit values) in hex starting at address =0x1234= | 
|  | 87 | +    - =x/10i $rip=: print 10 instructions starting at =$rip= (current instruction pointer) | 
|  | 88 | +    - =x/7wx $rsp=: print 7 WORDS (32-bit values) in hex starting at =$rsp= (stack pointer) | 
|  | 89 | +    - =x/8bd $rdi=: print 8 bytes in decimal starting at the address in =$rdi= | 
|  | 90 | +  - =set=: set values | 
|  | 91 | +    - ~set $rax=23~: sets =$rax= to 23 | 
|  | 92 | +    - ~set $rip+=4~: adds 4 to =$rip= | 
|  | 93 | +      - this skips the current instruction, if it is 4 bytes long | 
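The size letters in =x= (=b=, =w=, =g=) only change how the same bytes are grouped. As a rough illustration outside of GDB, this Python sketch uses the standard =struct= module to group 16 made-up bytes the way =x/2gx=, =x/4wx=, and =x/8bd= would, assuming a little-endian machine such as x86-64. The sample bytes and the helper name =dump= are invented for the example.

```python
import struct

# 16 made-up bytes standing in for memory at some address.
memory = bytes(range(0x10, 0x20))


def dump(data, count, size_letter):
    """Group `data` like GDB's x command: b = 1 byte, w = 4 bytes, g = 8 bytes."""
    # struct format codes for unsigned little-endian values of each size.
    codes = {"b": "B", "w": "I", "g": "Q"}
    fmt = "<" + str(count) + codes[size_letter]
    return list(struct.unpack_from(fmt, data))


giants = [hex(v) for v in dump(memory, 2, "g")]  # like x/2gx
words = [hex(v) for v in dump(memory, 4, "w")]   # like x/4wx
# Like x/8bd; every sample byte is below 128, so signed and unsigned agree.
first_bytes = dump(memory, 8, "b")

print(giants)
print(words)
print(first_bytes)
```

The first 64-bit value comes out as =0x1716151413121110=: on a little-endian machine the lowest-addressed byte (=0x10=) ends up as the rightmost digits. This is why strings often look reversed when printed with =x/gx=.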
|  | 94 | +*** General workflow | 
|  | 95 | +- first, identify interesting places to set a breakpoint in Ghidra | 
|  | 96 | +- use the assembly instructions window in Ghidra to see the offset to break at | 
|  | 97 | +- run the program in GDB and set a breakpoint | 
|  | 98 | +- modify or print values as desired | 
|  | 99 | +- repeat until solved | 
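The offset in the second step is just a subtraction: the address of the instruction you care about minus the address where its function starts, both read from Ghidra's assembly window. A small Python sketch of that arithmetic follows. The addresses are invented, the helper name =breakpoint_command= is hypothetical, and the =*= in the result is GDB's syntax for breaking at an address expression.

```python
def breakpoint_command(func_name, func_start, target_addr):
    """Build a GDB break command for an instruction inside a function.

    func_start and target_addr are addresses read from Ghidra's listing.
    """
    offset = target_addr - func_start
    if offset < 0:
        raise ValueError("target address is before the start of the function")
    return f"break *{func_name}+{offset}"


# Made-up example: main starts at 0x101189, interesting compare at 0x1011c3.
command = breakpoint_command("main", 0x101189, 0x1011C3)
print(command)
```

Using an offset from the function name, rather than the raw address shown in Ghidra, is helpful because the program may be loaded at a different base address when it actually runs.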