Skip to content

fix(data-grid): apply improvements from code review findings#1118

Merged
sadmann7 merged 15 commits intomainfrom
sadman/port-north-tsv-parser
Feb 25, 2026
Merged

fix(data-grid): apply improvements from code review findings#1118
sadmann7 merged 15 commits intomainfrom
sadman/port-north-tsv-parser

Conversation

@sadmann7
Copy link
Copy Markdown
Owner

Summary

Apply fixes and improvements discovered during code review and testing. These changes address edge cases and performance issues in the data grid implementation.

Changes

1. Fix prevIsEditingRef initialization

  • Initialize to false instead of isEditing in ShortTextCell, NumberCell, and UrlCell
  • Why: When a cell mounts already in edit mode (e.g., via keyboard navigation), the ref needs to start as false so the focus/selection logic runs correctly on first render

2. Preserve undefined/null entries in onDataUpdate

  • Explicitly preserve undefined/null values when building newData array
  • Why: Maintains correct array structure and indices, prevents data corruption if source data intentionally has gaps

3. Optimize onRowSelect implementation

  • Use single findIndex instead of findIndex + array access
  • Use rowId directly instead of getting currentRow.id
  • Handle edge case where last clicked row was filtered out
  • Why: Cleaner code, better performance on large tables

4. Filter non-navigable columns from copy operation

  • Skip checkbox/select columns when building TSV data for clipboard
  • Why: Prevents column misalignment when pasting, users don't want selection state in clipboard

5. Simplify cell persistence logic

  • Remove unnecessary fallback chains in data update loop
  • Use currentData.length instead of tableRowCount
  • Why: Clearer code, matches original fix intent

Testing

These changes address edge cases found through:

  • Static code analysis
  • Manual testing with filtering, sorting, and editing
  • Edge case testing (cells mounting in edit mode, filtered rows)

Risk Assessment

Low Risk - All changes are refinements to existing fixes that have already been deployed. They address edge cases and improve code quality without changing core behavior.

sadmann7 and others added 2 commits February 25, 2026 13:11
Replace tablecn's heuristic-based parseTsv with North's standard-compliant
parseTsvText implementation. The new parser properly handles:
- Quoted multiline cells
- Escaped quotes ("" -> ")
- Windows line endings (\r\n)
- Empty row filtering
- Mixed quoted and unquoted fields

This brings tablecn in line with TSV/CSV standards and improves paste
reliability for spreadsheet data.

Co-authored-by: Cursor <cursoragent@cursor.com>
Apply improvements identified during the North porting process:

1. Initialize prevIsEditingRef to false
   - Ensures focus/selection logic runs when cell mounts in edit mode
   - Fixes ShortTextCell, NumberCell, and UrlCell

2. Preserve undefined/null entries in onDataUpdate
   - Maintains array structure and indices correctly
   - Prevents data corruption from missing rows

3. Optimize onRowSelect to use single findIndex
   - Use rowId directly instead of getting currentRow first
   - Handle edge case where last clicked row was filtered out

4. Filter non-navigable columns from copy operation
   - Prevents checkbox/select columns in clipboard

5. Simplify cell persistence logic
   - Remove unnecessary fallback chains

These fixes were validated through North's code review and testing.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copilot AI review requested due to automatic review settings February 25, 2026 07:33
@vercel
Copy link
Copy Markdown

vercel bot commented Feb 25, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project Deployment Actions Updated (UTC)
tablecn Ready Ready Preview, Comment Feb 25, 2026 8:12am

Request Review

sadmann7 and others added 2 commits February 25, 2026 13:34
The new parseTsvText function follows standard TSV/CSV format where
multiline content must be quoted. Update test to use proper format.

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR completely rewrites the TSV (Tab-Separated Values) parsing implementation used for clipboard paste operations in the data grid. The function is renamed from parseTsv to parseTsvText and changes from a tab-counting approach to a character-by-character parser that supports RFC 4180 CSV/TSV format with quoted fields.

Changes:

  • Replaced parseTsv with parseTsvText implementing RFC 4180-style parsing with quote handling
  • Removed the fallbackColumnCount parameter, allowing variable-width rows
  • Added support for escaped quotes ("" within quoted fields) and mixed line endings

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.

File Description
src/lib/data-grid.ts Completely rewrites the TSV parsing function to support quoted fields and RFC 4180 format
src/hooks/use-data-grid.ts Updates the import and function call to use the renamed parseTsvText function

Note: The PR description mentions 5 different types of changes (cell ref fixes, onDataUpdate improvements, onRowSelect optimization, etc.), but none of these changes are present in the actual diff. The only changes are to the TSV parsing function.

Comments suppressed due to low confidence (2)

src/lib/data-grid.ts:309

  • The function uses string concatenation (currentField += char) in a character-by-character loop. For very large clipboard contents (e.g., thousands of rows), this could cause performance issues due to string immutability in JavaScript creating a new string object on each concatenation. Consider using an array to collect characters and joining them at the end of each field (e.g., currentFieldChars.push(char) then currentField = currentFieldChars.join("")), or document the expected size limitations of clipboard data.
  while (i < text.length) {
    const char = text[i];
    const nextChar = text[i + 1];

    if (inQuotes) {
      if (char === '"' && nextChar === '"') {
        currentField += '"';
        i += 2;
      } else if (char === '"') {
        inQuotes = false;
        i++;
      } else {
        currentField += char;
        i++;
      }
    } else {
      if (char === '"' && currentField === "") {
        inQuotes = true;
        i++;
      } else if (char === "\t") {
        currentRow.push(currentField);
        currentField = "";
        i++;
      } else if (char === "\n") {
        currentRow.push(currentField);
        if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
          rows.push(currentRow);
        }
        currentRow = [];
        currentField = "";
        i++;
      } else if (char === "\r" && nextChar === "\n") {
        currentRow.push(currentField);
        if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
          rows.push(currentRow);
        }
        currentRow = [];
        currentField = "";
        i += 2;
      } else {
        currentField += char;
        i++;
      }
    }

src/lib/data-grid.ts:309

  • The function handles \n (Unix) and \r\n (Windows) line endings, but does not handle standalone \r (legacy Mac OS 9 line endings). If a user pastes data with \r line endings, each \r character will be treated as regular content and included in the field value, resulting in incorrect parsing. While Mac OS 9 line endings are rare today, consider either adding support for them or documenting that only Unix and Windows line endings are supported.
      } else if (char === "\n") {
        currentRow.push(currentField);
        if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
          rows.push(currentRow);
        }
        currentRow = [];
        currentField = "";
        i++;
      } else if (char === "\r" && nextChar === "\n") {
        currentRow.push(currentField);
        if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
          rows.push(currentRow);
        }
        currentRow = [];
        currentField = "";
        i += 2;
      } else {
        currentField += char;
        i++;
      }
    }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

…ultiline

Support both standard TSV (quoted multiline) and Excel/Sheets format
(unquoted multiline with tab counting):

- Keep parseTsvText for standard quoted format
- Add parseTsvWithTabCounting for unquoted multiline (Excel)
- New parseTsv wrapper detects format and uses appropriate parser
- Add test for unquoted multiline paste

This handles real-world paste scenarios where Excel/Sheets don't
automatically quote multiline cells.

Co-authored-by: Cursor <cursoragent@cursor.com>
sadmann7 and others added 2 commits February 25, 2026 13:41
Make helper functions private - only export parseTsv:
- parseTsvWithQuotes (was parseTsvText) - private
- parseTsvWithTabCounting - private
- countTabs - private
- parseTsv - public (only exported function)

Cleaner API with single entry point for TSV parsing.

Co-authored-by: Cursor <cursoragent@cursor.com>
Combine parseTsvText and parseTsvWithTabCounting into single parseTsv:
- Detects quoted vs unquoted format internally
- Only two exported functions: countTabs and parseTsv
- Cleaner API, same functionality

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Use hasQuotedFields() to detect field-delimiting quotes instead of
just checking if quotes exist anywhere. This prevents false positives
when data contains JSON arrays or other quoted content that isn't
TSV field delimiters.

Detection now checks:
- If line starts with quote
- If line contains tab followed by quote

This correctly handles data with JSON values like ["React","AWS"]
while still supporting properly quoted multiline fields.

Co-authored-by: Cursor <cursoragent@cursor.com>
The tab-counting parser now buffers incomplete lines and checks when
the accumulated tab count reaches the expected column count. This
correctly handles multiline content in middle columns, not just the
last column. split("\t") naturally places the multiline content in
the right column since the newlines are between two tab boundaries.

Co-authored-by: Cursor <cursoragent@cursor.com>
sadmann7 and others added 2 commits February 25, 2026 13:57
Replace countTabs(buf) calls with a running bufTabs counter.
Avoids re-scanning the entire buffer after each buffered line,
reducing from O(n²) to O(n) for multiline cells.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Cover all parsing scenarios:
- Basic single/multi-row TSV
- Quoted multiline fields (standard TSV)
- Escaped quotes and Windows line endings
- Unquoted multiline in last column
- Unquoted multiline in middle columns
- JSON values not triggering quoted parser
- Edge cases (empty input, single cell, no tabs)

Co-authored-by: Cursor <cursoragent@cursor.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.

Comments suppressed due to low confidence (2)

src/lib/data-grid.ts:302

  • The condition for skipping empty rows (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) may inadvertently skip valid single-column rows that contain only empty strings. If a TSV has a single column and that column's value is an empty string for a particular row, that row will be skipped. Consider whether this behavior is intentional. If single-column empty rows should be preserved, the condition should be adjusted.
          if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
            rows.push(currentRow);

src/lib/data-grid.ts:318

  • The quoted field parser only handles Windows line endings (\r\n) but not standalone carriage returns (\r) that were used in classic Mac OS. If pasted data contains standalone \r characters, they will be treated as regular field content rather than row delimiters. Consider adding a case to handle standalone \r for better compatibility, or document this limitation.
        } else if (char === "\r" && nextChar === "\n") {
          currentRow.push(currentField);
          if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
            rows.push(currentRow);
          }
          currentRow = [];
          currentField = "";
          i += 2;
        } else {
          currentField += char;
          i++;
        }

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Merge all improvements into a single branch for one PR.

Co-authored-by: Cursor <cursoragent@cursor.com>
@sadmann7 sadmann7 merged commit 05c55ba into main Feb 25, 2026
7 checks passed
@sadmann7 sadmann7 deleted the sadman/port-north-tsv-parser branch February 25, 2026 08:14
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants