refactor: input and output abstractions (WIP) #715

inishchith · 2025-09-15T19:30:26Z

Changelog

to be added

Additional context (e.g. screenshots, logs, links)

to be added

Checklist

Additional tests added
All CI checks passed
Relevant documentation updated

Copyleft License Compliance

Have you used any code that is subject to a Copyleft license (e.g., GPL, AGPL, LGPL)?
If yes, have you modified the code in the context of this project? Please share additional details.

cursor · 2025-09-15T19:36:48Z

application_sdk/outputs/__init__.py

+                description="Number of errors while writing to files",
+            )
+            logger.error(f"Error writing pandas dataframe to files: {str(e)}")
+            raise


Bug: Missing Attributes in Output Class

The write_dataframe method in the Output base class is now a concrete implementation, but it reads attributes such as chunk_part and metrics and calls methods such as path_gen, _flush_buffer, and _upload_file. None of these members are defined on the Output base class, so calling the method raises AttributeError unless a subclass supplies them.
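One way to avoid this class of error is sketched below, assuming a much simplified base class. `OutputSketch`, `InMemoryOutput`, and the bodies of `path_gen` and `_upload_file` are hypothetical stand-ins and not the SDK's actual code. The idea is that the base class initializes any state its concrete method reads and declares the hooks it calls as abstract, so an incomplete subclass fails at instantiation rather than in the middle of a write.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class OutputSketch(ABC):
    """Hypothetical, simplified stand-in for an output base class."""

    def __init__(self) -> None:
        # State read by write_dataframe is initialized in the base class.
        self.chunk_part = 0

    @abstractmethod
    def path_gen(self, chunk_part: int) -> str:
        """Return the file path for the given chunk."""

    @abstractmethod
    def _upload_file(self, path: str) -> None:
        """Persist the file at `path` to its destination."""

    def write_dataframe(self, rows: List[Dict[str, Any]]) -> str:
        # Concrete method that relies only on members declared above.
        # `rows` is unused here; a real writer would serialize it first.
        path = self.path_gen(self.chunk_part)
        self._upload_file(path)
        self.chunk_part += 1
        return path


class InMemoryOutput(OutputSketch):
    """Hypothetical subclass that records uploads instead of writing files."""

    def __init__(self) -> None:
        super().__init__()
        self.uploaded: List[str] = []

    def path_gen(self, chunk_part: int) -> str:
        return f"chunk-{chunk_part}.json"

    def _upload_file(self, path: str) -> None:
        self.uploaded.append(path)
```

With this layout, instantiating `OutputSketch` directly, or a subclass that omits one of the hooks, raises `TypeError` up front, which is usually easier to diagnose than an `AttributeError` raised during a write.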

cursor · 2025-09-15T19:36:48Z

application_sdk/io/parquet.py

+                # Get the generated file path and rename to final location
+                result_dict = result.to_pydict()
+                generated_file = result_dict["path"][0]
+                os.rename(generated_file, consolidated_file_path)


Bug: Consolidation Fails on Empty Daft Output

The consolidation logic in _consolidate_current_folder assumes that the result of daft_df.write_parquet, once converted with to_pydict(), always contains a "path" key holding a non-empty list. If Daft's output structure changes or the write produces no files, accessing result_dict["path"][0] raises a KeyError or IndexError.

Additional Locations (1)

application_sdk/outputs/parquet.py#L426-L429
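A defensive version of the path lookup might look like the sketch below. It operates on a plain dictionary shaped like the `result_dict` in the diff, so it makes no assumptions about Daft's API beyond what the diff shows. The helper names `first_written_path` and `move_consolidated_file` are hypothetical, and `os.replace` is swapped in for `os.rename` because it overwrites an existing destination on every platform.

```python
import os
from typing import Any, Mapping


def first_written_path(result_dict: Mapping[str, Any]) -> str:
    """Return the first written file path, or raise a descriptive error."""
    # .get() covers a missing "path" key; the falsy check covers an
    # empty list (or None) when the write produced no files.
    paths = result_dict.get("path")
    if not paths:
        raise ValueError(
            "write result contains no file paths; nothing to consolidate"
        )
    return paths[0]


def move_consolidated_file(
    result_dict: Mapping[str, Any], consolidated_file_path: str
) -> str:
    """Move the generated file to its final location and return that path."""
    generated_file = first_written_path(result_dict)
    os.replace(generated_file, consolidated_file_path)
    return consolidated_file_path
```

Raising a single `ValueError` with a clear message lets the caller decide whether an empty write should be skipped or treated as a failure, instead of surfacing a bare `KeyError` or `IndexError`.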

inishchith · 2025-10-20T18:29:12Z

Closing in reference to #755.

inishchith and others added 22 commits September 8, 2025 18:37

fix: Parquet based reader and writers

e10992f

fix: add chunk logic to pandas json writer

80af1a2

fix: update daft json writer logic

6d734fc

fix: breaking json and parquet writers

05e34df

fix: batch bug

d87e440

Merge remote-tracking branch 'origin/main' into fix/chunked-writers

98816a7

fix: chunk_start comment

25d4cf1

fix: add factor

c0482e9

feat: add support for parquet file consolidation

ab4dcab

fix: tests

b0626f9

temp: pause transformation

58d01d9

temp: add back transformer

73eb815

feat: transformer batched processing (buffer_sized)

7f3af54

chore: remove comments

2753a83

feat: draft io json writer

9a12d36

draft: json.py working

6261ee7

fix: json output writer

73fc741

feat: refactor parquet and json outputs to writers

78b3f22

revert: tiny parts

410b2d0

revert: usage of writers and use output

66ac129

exp: check if buffered JsonOutput

1984e2b

chore: exp for buffered JsonOutput (env)

5045b18

cursor bot reviewed Sep 15, 2025

inishchith mentioned this pull request Sep 16, 2025

Improve IO Abstractions #718

Open

inishchith closed this Oct 20, 2025

inishchith deleted the refactor/io branch October 20, 2025 18:29


3 participants
