Add support for additional dataset formats (PyArrow, CSV, TSV, etc.) #6


**Summary**  
Support viewing non-JSONL datasets (PyArrow-based, CSV, TSV) while keeping the inspection experience users already get for JSONL: pretty-printed values and key selection.

**Problem**  
Many real-world ML datasets aren’t stored as JSONL; they may be in PyArrow/Parquet, CSV, or TSV. Right now, the viewer is effectively JSONL-centric, which limits its usefulness for ML engineers working with diverse formats.

**Proposed Solution**  
- Define a minimal abstraction for “row-based dataset” independent of underlying storage.  
- For each supported format:
  - **CSV/TSV**:  
    - Parse headers as column names.  
    - Treat each row as a flat object; if a cell contains JSON, optionally detect and pretty-print it.  
  - **PyArrow / Arrow-backed formats** (initially read-only, assuming well-formed input):  
    - Use a Node/JS-accessible reader or a conversion step to present rows and columns.  
    - Preserve column names and basic types; optionally detect JSON strings similarly.  
- Keep the UI consistent: the same pretty-printing and jq-style key selection apply wherever the format allows.
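
To make the "row-based dataset" abstraction and the CSV/TSV handling above more concrete, here is a minimal TypeScript sketch. All names (`RowDataset`, `parseDelimited`, `maybeParseJson`, `splitLine`) are hypothetical and not part of the existing codebase. It assumes the whole file fits in memory and that quoted fields do not contain newlines; a real implementation would more likely use an established CSV library.

```typescript
// Hypothetical sketch: none of these names exist in the viewer today.

/** One row, keyed by column name. */
type Row = Record<string, unknown>;

/** Storage-independent view of a row-based dataset. */
interface RowDataset {
  columns: string[];
  rows: Row[];
}

/** Return parsed JSON if the cell looks like a JSON object or array, else the raw string. */
function maybeParseJson(cell: string): unknown {
  const t = cell.trim();
  const looksJson =
    (t.startsWith("{") && t.endsWith("}")) || (t.startsWith("[") && t.endsWith("]"));
  if (!looksJson) return cell;
  try {
    return JSON.parse(t);
  } catch {
    return cell; // Not valid JSON after all: keep the original text.
  }
}

/** Split one line on the delimiter, honoring double-quoted fields with "" escapes. */
function splitLine(line: string, delimiter: string): string[] {
  const fields: string[] = [];
  let current = "";
  let inQuotes = false;
  for (let i = 0; i < line.length; i++) {
    const ch = line[i];
    if (inQuotes) {
      if (ch === '"' && line[i + 1] === '"') {
        current += '"'; // Escaped quote inside a quoted field.
        i++;
      } else if (ch === '"') {
        inQuotes = false;
      } else {
        current += ch;
      }
    } else if (ch === '"') {
      inQuotes = true;
    } else if (ch === delimiter) {
      fields.push(current);
      current = "";
    } else {
      current += ch;
    }
  }
  fields.push(current);
  return fields;
}

/** Parse CSV ("," delimiter) or TSV ("\t" delimiter) text; the first line is the header. */
function parseDelimited(text: string, delimiter: "," | "\t", detectJson = true): RowDataset {
  const lines = text.split(/\r?\n/).filter((l) => l.length > 0);
  if (lines.length === 0) return { columns: [], rows: [] };
  const columns = splitLine(lines[0], delimiter);
  const rows = lines.slice(1).map((line) => {
    const cells = splitLine(line, delimiter);
    const row: Row = {};
    columns.forEach((name, i) => {
      const cell = cells[i] ?? ""; // Missing trailing cells become empty strings.
      row[name] = detectJson ? maybeParseJson(cell) : cell;
    });
    return row;
  });
  return { columns, rows };
}
```

Because every format would reduce to the same `RowDataset` shape, the existing pretty-printing and key selection could operate on `rows` without knowing whether the source was JSONL, CSV, TSV, or an Arrow-backed file.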

**Acceptance Criteria**  
- User can open CSV/TSV files and:
  - See rows/columns with column headers.
  - Optionally expand JSON-like cells as nested JSON views.  
- Basic support for at least one Arrow-based dataset path (even if via a conversion step or a restricted subset).  
- Unsupported or malformed files produce a clear error message and do not crash the viewer.
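
One possible way to meet the last criterion is to return errors as values rather than throwing them, so the UI can always render a message. The sketch below is illustrative only: `detectFormat`, `checkRowWidths`, the `Result` type, and the extension table (including treating `.arrow` as the single Arrow-based path) are assumptions, not decisions made in this issue.

```typescript
// Hypothetical sketch: names and the extension table are assumptions.

type DatasetFormat = "jsonl" | "csv" | "tsv" | "arrow";

/** Either a value or a human-readable error; nothing here throws. */
type Result<T> = { ok: true; value: T } | { ok: false; error: string };

const EXTENSION_FORMATS: Record<string, DatasetFormat | undefined> = {
  ".jsonl": "jsonl",
  ".csv": "csv",
  ".tsv": "tsv",
  ".arrow": "arrow",
};

/** Pick a format from the file extension, or explain why the file is unsupported. */
function detectFormat(filename: string): Result<DatasetFormat> {
  const dot = filename.lastIndexOf(".");
  const ext = dot >= 0 ? filename.slice(dot).toLowerCase() : "";
  const format = EXTENSION_FORMATS[ext];
  if (format === undefined) {
    const supported = Object.keys(EXTENSION_FORMATS).join(", ");
    return {
      ok: false,
      error: `Unsupported file type "${ext || "(none)"}" for ${filename}. Supported: ${supported}`,
    };
  }
  return { ok: true, value: format };
}

/** Report the first data row whose cell count differs from the header. */
function checkRowWidths(columns: string[], rows: string[][]): Result<number> {
  for (let i = 0; i < rows.length; i++) {
    if (rows[i].length !== columns.length) {
      return {
        ok: false,
        error: `Row ${i + 1} has ${rows[i].length} cells but the header has ${columns.length} columns`,
      };
    }
  }
  return { ok: true, value: rows.length };
}
```

A caller would branch on `ok` and show `error` in the viewer instead of letting an exception propagate.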
