Allow UDFs to take all columns in reduce() #354

@hombit

Feature request

Currently, reduce() requires the user to list every input column explicitly, for the sake of performance. Let's allow a special case: when no columns are specified, every column of the row is passed to the UDF as a single dictionary.

def udf1(base_col, sub_col):
    # base_col is a single value, sub_col is a numpy array
    ...

# Existing behavior:
nf.reduce(udf1, "ra", "lightcurve.time")

def udf2(row):
    assert isinstance(row, dict)
    assert "ra" in row and "lightcurve.time" in row
    # row is a dictionary: base columns are single values,
    # nested columns are numpy arrays
    ...

# New behavior:
nf.reduce(udf2)
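
To make the requested dispatch concrete, here is a minimal pure-Python sketch of the two call styles. It is not the nested-pandas implementation: reduce_rows is a hypothetical stand-in for reduce(), rows are plain dictionaries, and plain lists stand in for the numpy arrays that nested columns would hold.

```python
# Hypothetical stand-in for reduce(); NOT the nested-pandas implementation.
def reduce_rows(rows, func, *columns):
    """Apply func to each row and collect the results.

    rows is a list of dicts mapping column name -> value. Base columns hold
    single values; nested columns hold sequences (lists here, as an
    assumption standing in for numpy arrays).
    """
    results = []
    for row in rows:
        if columns:
            # Existing behavior: pass only the requested columns, positionally.
            results.append(func(*(row[name] for name in columns)))
        else:
            # Proposed behavior: pass all columns as a single dictionary.
            results.append(func(dict(row)))
    return results


# Toy data, invented for illustration only.
rows = [
    {"ra": 10.0, "lightcurve.time": [1.0, 2.0, 3.0]},
    {"ra": 20.0, "lightcurve.time": [4.0, 5.0]},
]


def udf1(base_col, sub_col):
    # base_col is a single value, sub_col is a sequence
    return base_col + len(sub_col)


def udf2(row):
    # row is a dictionary keyed by column name
    return row["ra"] + len(row["lightcurve.time"])


explicit = reduce_rows(rows, udf1, "ra", "lightcurve.time")
implicit = reduce_rows(rows, udf2)
```

In this sketch both calls give the same per-row results. The dictionary form saves listing columns, but every column has to be materialized for every row, which is the performance cost the explicit form avoids.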

Before submitting
Please check the following:

  • I have described the purpose of the suggested change, specifying what I need the enhancement to accomplish, i.e. what problem it solves.
  • I have included any relevant links, screenshots, environment information, and data relevant to implementing the requested feature, as well as pseudocode for how I want to access the new functionality.
  • If I have ideas for how the new feature could be implemented, I have provided explanations and/or pseudocode and/or task lists for the steps.

Metadata

Labels

enhancement: New feature or request
