fix(data-grid): apply improvements from code review findings#1118
fix(data-grid): apply improvements from code review findings#1118
Conversation
Replace tablecn's heuristic-based parseTsv with North's standard-compliant
parseTsvText implementation. The new parser properly handles:
- Quoted multiline cells
- Escaped quotes ("" -> ")
- Windows line endings (\r\n)
- Empty row filtering
- Mixed quoted and unquoted fields
This brings tablecn in line with TSV/CSV standards and improves paste
reliability for spreadsheet data.
Co-authored-by: Cursor <cursoragent@cursor.com>
Apply improvements identified during the North porting process: 1. Initialize prevIsEditingRef to false - Ensures focus/selection logic runs when cell mounts in edit mode - Fixes ShortTextCell, NumberCell, and UrlCell 2. Preserve undefined/null entries in onDataUpdate - Maintains array structure and indices correctly - Prevents data corruption from missing rows 3. Optimize onRowSelect to use single findIndex - Use rowId directly instead of getting currentRow first - Handle edge case where last clicked row was filtered out 4. Filter non-navigable columns from copy operation - Prevents checkbox/select columns in clipboard 5. Simplify cell persistence logic - Remove unnecessary fallback chains These fixes were validated through North's code review and testing. Co-authored-by: Cursor <cursoragent@cursor.com>
|
The latest updates on your projects. Learn more about Vercel for GitHub.
|
The new parseTsvText function follows standard TSV/CSV format where multiline content must be quoted. Update test to use proper format. Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Pull request overview
This PR completely rewrites the TSV (Tab-Separated Values) parsing implementation used for clipboard paste operations in the data grid. The function is renamed from parseTsv to parseTsvText and changes from a tab-counting approach to a character-by-character parser that supports RFC 4180 CSV/TSV format with quoted fields.
Changes:
- Replaced
parseTsvwithparseTsvTextimplementing RFC 4180-style parsing with quote handling - Removed the
fallbackColumnCountparameter, allowing variable-width rows - Added support for escaped quotes (
""within quoted fields) and mixed line endings
Reviewed changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 7 comments.
| File | Description |
|---|---|
| src/lib/data-grid.ts | Completely rewrites the TSV parsing function to support quoted fields and RFC 4180 format |
| src/hooks/use-data-grid.ts | Updates the import and function call to use the renamed parseTsvText function |
Note: The PR description mentions 5 different types of changes (cell ref fixes, onDataUpdate improvements, onRowSelect optimization, etc.), but none of these changes are present in the actual diff. The only changes are to the TSV parsing function.
Comments suppressed due to low confidence (2)
src/lib/data-grid.ts:309
- The function uses string concatenation (
currentField += char) in a character-by-character loop. For very large clipboard contents (e.g., thousands of rows), this could cause performance issues due to string immutability in JavaScript creating a new string object on each concatenation. Consider using an array to collect characters and joining them at the end of each field (e.g.,currentFieldChars.push(char)thencurrentField = currentFieldChars.join("")), or document the expected size limitations of clipboard data.
while (i < text.length) {
const char = text[i];
const nextChar = text[i + 1];
if (inQuotes) {
if (char === '"' && nextChar === '"') {
currentField += '"';
i += 2;
} else if (char === '"') {
inQuotes = false;
i++;
} else {
currentField += char;
i++;
}
} else {
if (char === '"' && currentField === "") {
inQuotes = true;
i++;
} else if (char === "\t") {
currentRow.push(currentField);
currentField = "";
i++;
} else if (char === "\n") {
currentRow.push(currentField);
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
}
currentRow = [];
currentField = "";
i++;
} else if (char === "\r" && nextChar === "\n") {
currentRow.push(currentField);
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
}
currentRow = [];
currentField = "";
i += 2;
} else {
currentField += char;
i++;
}
}
src/lib/data-grid.ts:309
- The function handles
\n(Unix) and\r\n(Windows) line endings, but does not handle standalone\r(legacy Mac OS 9 line endings). If a user pastes data with\rline endings, each\rcharacter will be treated as regular content and included in the field value, resulting in incorrect parsing. While Mac OS 9 line endings are rare today, consider either adding support for them or documenting that only Unix and Windows line endings are supported.
} else if (char === "\n") {
currentRow.push(currentField);
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
}
currentRow = [];
currentField = "";
i++;
} else if (char === "\r" && nextChar === "\n") {
currentRow.push(currentField);
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
}
currentRow = [];
currentField = "";
i += 2;
} else {
currentField += char;
i++;
}
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
…ultiline Support both standard TSV (quoted multiline) and Excel/Sheets format (unquoted multiline with tab counting): - Keep parseTsvText for standard quoted format - Add parseTsvWithTabCounting for unquoted multiline (Excel) - New parseTsv wrapper detects format and uses appropriate parser - Add test for unquoted multiline paste This handles real-world paste scenarios where Excel/Sheets don't automatically quote multiline cells. Co-authored-by: Cursor <cursoragent@cursor.com>
Make helper functions private - only export parseTsv: - parseTsvWithQuotes (was parseTsvText) - private - parseTsvWithTabCounting - private - countTabs - private - parseTsv - public (only exported function) Cleaner API with single entry point for TSV parsing. Co-authored-by: Cursor <cursoragent@cursor.com>
Combine parseTsvText and parseTsvWithTabCounting into single parseTsv: - Detects quoted vs unquoted format internally - Only two exported functions: countTabs and parseTsv - Cleaner API, same functionality Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Use hasQuotedFields() to detect field-delimiting quotes instead of just checking if quotes exist anywhere. This prevents false positives when data contains JSON arrays or other quoted content that isn't TSV field delimiters. Detection now checks: - If line starts with quote - If line contains tab followed by quote This correctly handles data with JSON values like ["React","AWS"] while still supporting properly quoted multiline fields. Co-authored-by: Cursor <cursoragent@cursor.com>
The tab-counting parser now buffers incomplete lines and checks when
the accumulated tab count reaches the expected column count. This
correctly handles multiline content in middle columns, not just the
last column. split("\t") naturally places the multiline content in
the right column since the newlines are between two tab boundaries.
Co-authored-by: Cursor <cursoragent@cursor.com>
Replace countTabs(buf) calls with a running bufTabs counter. Avoids re-scanning the entire buffer after each buffered line, reducing from O(n²) to O(n) for multiline cells. Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Cover all parsing scenarios: - Basic single/multi-row TSV - Quoted multiline fields (standard TSV) - Escaped quotes and Windows line endings - Unquoted multiline in last column - Unquoted multiline in middle columns - JSON values not triggering quoted parser - Edge cases (empty input, single cell, no tabs) Co-authored-by: Cursor <cursoragent@cursor.com>
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
Comments suppressed due to low confidence (2)
src/lib/data-grid.ts:302
- The condition for skipping empty rows
(currentRow.length > 1 || currentRow.some((f) => f.length > 0))may inadvertently skip valid single-column rows that contain only empty strings. If a TSV has a single column and that column's value is an empty string for a particular row, that row will be skipped. Consider whether this behavior is intentional. If single-column empty rows should be preserved, the condition should be adjusted.
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
src/lib/data-grid.ts:318
- The quoted field parser only handles Windows line endings (\r\n) but not standalone carriage returns (\r) that were used in classic Mac OS. If pasted data contains standalone \r characters, they will be treated as regular field content rather than row delimiters. Consider adding a case to handle standalone \r for better compatibility, or document this limitation.
} else if (char === "\r" && nextChar === "\n") {
currentRow.push(currentField);
if (currentRow.length > 1 || currentRow.some((f) => f.length > 0)) {
rows.push(currentRow);
}
currentRow = [];
currentField = "";
i += 2;
} else {
currentField += char;
i++;
}
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Merge all improvements into a single branch for one PR. Co-authored-by: Cursor <cursoragent@cursor.com>
Summary
Apply fixes and improvements discovered during code review and testing. These changes address edge cases and performance issues in the data grid implementation.
Changes
1. Fix
prevIsEditingRefinitializationfalseinstead ofisEditingin ShortTextCell, NumberCell, and UrlCellfalseso the focus/selection logic runs correctly on first render2. Preserve undefined/null entries in
onDataUpdatenewDataarray3. Optimize
onRowSelectimplementationfindIndexinstead offindIndex+ array accessrowIddirectly instead of gettingcurrentRow.id4. Filter non-navigable columns from copy operation
5. Simplify cell persistence logic
currentData.lengthinstead oftableRowCountTesting
These changes address edge cases found through:
Risk Assessment
Low Risk - All changes are refinements to existing fixes that have already been deployed. They address edge cases and improve code quality without changing core behavior.