Bug Description
The MMLongBench-Doc benchmark loader uses eval() to parse evidence_pages and evidence_sources fields from dataset items. If the benchmark dataset is tampered with (e.g., man-in-the-middle on download, or a malicious dataset fork), arbitrary Python code will be executed.
Location
OmniSimpleMem/omni_memory/evaluation/benchmarks.py:1035,1043
Reproduction
# If an attacker modifies the benchmark dataset JSON to include:
# {"evidence_pages": "__import__('os').system('id')", ...}
# The eval() on line 1035 will execute: __import__('os').system('id')
# This runs arbitrary system commands
# To trigger, run the benchmark evaluation:
cd OmniSimpleMem
python -c "from omni_memory.evaluation.benchmarks import MMLongBenchDocBenchmark; b = MMLongBenchDocBenchmark('/path/to/malicious_data')"
Impact
Arbitrary code execution when loading benchmark data from untrusted sources.
Suggested Fix
# Replace eval() with ast.literal_eval(), which evaluates only Python literals
# (strings, numbers, tuples, lists, dicts, sets, booleans, None)
import ast
# Line 1035
evidence_pages = ast.literal_eval(evidence_pages)
# Line 1043
evidence_sources = ast.literal_eval(evidence_sources)
ast.literal_eval() safely parses Python literal expressions without executing arbitrary code.
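A minimal sketch of the behavioral difference, using the payload from the reproduction above (the list value is a made-up example of a legitimate evidence_pages field):

```python
import ast

# Legitimate dataset value: evidence pages stored as a stringified list literal.
benign = "[1, 3, 7]"
print(ast.literal_eval(benign))  # [1, 3, 7]

# Malicious payload from the reproduction above: eval() would execute this,
# but ast.literal_eval() rejects it because it is not a literal expression.
malicious = "__import__('os').system('id')"
try:
    ast.literal_eval(malicious)
except (ValueError, SyntaxError) as exc:
    print(f"rejected: {type(exc).__name__}")
```

Since tampered or malformed fields now raise ValueError/SyntaxError instead of executing, the loader may also want to catch those exceptions and skip or flag the offending item.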
Found via automated codebase analysis. Happy to submit a PR if confirmed.