Security: eval() used on benchmark data allows code injection #52

@CrepuscularIRIS

Description

Bug Description

The MMLongBench-Doc benchmark loader uses eval() to parse the evidence_pages and evidence_sources fields from dataset items. If the benchmark dataset is tampered with (e.g., via a man-in-the-middle attack on the download, or a malicious dataset fork), arbitrary Python code is executed at load time.

Location

OmniSimpleMem/omni_memory/evaluation/benchmarks.py:1035,1043

Reproduction

# If an attacker modifies the benchmark dataset JSON to include:
# {"evidence_pages": "__import__('os').system('id')", ...}

# The eval() on line 1035 will execute: __import__('os').system('id')
# This runs arbitrary system commands
# To trigger, run the benchmark evaluation:
cd OmniSimpleMem
python -c "from omni_memory.evaluation.benchmarks import MMLongBenchDocBenchmark; b = MMLongBenchDocBenchmark('/path/to/malicious_data')"

Impact

Arbitrary code execution when loading benchmark data from untrusted sources.

Suggested Fix

# Replace eval() with ast.literal_eval(), which only accepts Python literals
# (strings, bytes, numbers, tuples, lists, dicts, sets, booleans, None)
import ast

# Line 1035
evidence_pages = ast.literal_eval(evidence_pages)

# Line 1043
evidence_sources = ast.literal_eval(evidence_sources)

ast.literal_eval() safely parses Python literal expressions and raises ValueError on anything else, so it cannot execute arbitrary code.
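As a quick sanity check of the proposed fix (using a harmless __import__('os').getpid() payload in place of the os.system call from the reproduction above), the behavioral difference can be demonstrated directly:

```python
import ast

# A well-formed evidence_pages value parses fine with literal_eval.
assert ast.literal_eval("[1, 2, 3]") == [1, 2, 3]

# A tampered field: eval() would execute this call expression,
# but literal_eval() rejects it because a function call is not a literal.
payload = "__import__('os').getpid()"
try:
    ast.literal_eval(payload)
except ValueError:
    # literal_eval raises ValueError ("malformed node or string")
    # for any syntax beyond literal expressions
    print("payload rejected")
```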


Found via automated codebase analysis. Happy to submit a PR if confirmed.
