tomreitz
Collaborator

This PR implements support for the new-and-improved error messages introduced in Ed-Fi API 7.2.

Previously, POSTing an invalid payload to an Ed-Fi API endpoint would result in a response like

{
  "status": 400,
  "message": "Unable to resolve value `uri://ed-fi.org/EducationOrganizationCategoryDescriptor#Local Education AgencyX` to an existing `EducationOrganizationCategoryDescriptor` resource.",
  ...
}

(usually only the first encountered error is reported, even though the payload may have multiple issues)

In 7.2, a new error response format was introduced, like

{
  "detail": "Data validation failed. See 'validationErrors' for details.",
  "type": "urn:ed-fi:api:bad-request:data-validation-failed",
  "title": "Data Validation Failed",
  "status": 400,
  "correlationId": "ce0fba7c-5be5-48ac-a507-02451cff1ac9",
  "validationErrors": {
    "$.localEducationAgencyCategoryDescriptor": [
      "LocalEducationAgencyCategoryDescriptor value 'uri://ed-fi.org/LocalEducationAgencyCategoryDescriptor#CharterX' does not exist."
    ],
    "$.categories[0].educationOrganizationCategoryDescriptor": [
      "EducationOrganizationCategoryDescriptor value 'uri://ed-fi.org/EducationOrganizationCategoryDescriptor#Local Education AgencyX' does not exist."
    ]
  }
}

(multiple errors can be reported for a single POSTed payload)

Because lightbeam previously looked only for the message key in an error response payload, it would fail with an error on 7.2+ APIs, whose error responses carry detail and validationErrors instead.
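For illustration, here is a minimal sketch of reading either response shape. The function name extract_error_messages is hypothetical and is not taken from lightbeam's source. The "(at <path>)" suffix simply mirrors the messages shown in the results file in this PR, and the fallback to detail when validationErrors is absent is an assumption.

```python
import json


def extract_error_messages(body: str) -> list[str]:
    """Return human-readable error strings from an Ed-Fi error response body.

    Hypothetical helper: handles the pre-7.2 shape ({"message": ...}) and the
    7.2+ shape ({"detail": ..., "validationErrors": {path: [messages]}}).
    """
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON at all: report the raw body rather than raising.
        return [body]
    if not isinstance(payload, dict):
        return [str(payload)]

    # 7.2+ format: one entry per (JSON path, message) pair.
    validation_errors = payload.get("validationErrors")
    if isinstance(validation_errors, dict) and validation_errors:
        return [
            f"{message} (at {path})"
            for path, messages in validation_errors.items()
            for message in messages
        ]

    # Pre-7.2 format: a single top-level message.
    if "message" in payload:
        return [str(payload["message"])]

    # Assumed fallback: a 7.2+ error that has no validationErrors.
    if "detail" in payload:
        return [str(payload["detail"])]

    return [body]
```

Returning a list in both cases lets the caller treat old and new APIs the same way: an old-format response yields a one-element list.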

This PR keeps support for the old error format and adds support for the new format. The main impact is on the --results-file output. Previously it would report at most one error per line of a file:

{
    "started_at": "2025-07-24T14:06:46.011324",
    "working_dir": "/xxxxxxxxx/repos/lightbeam/example",
    "config_file": "./lightbeam.yml",
    "data_dir": "./",
    "api_url": "https://localhost/api",
    "namespace": "ed-fi",
    "resources": {
        "localEducationAgencies": {
            "failures": [
                {
                    "status_code": 400,
                    "message": "400: Unable to resolve value `uri://ed-fi.org/EducationOrganizationCategoryDescriptor#Local Education AgencyX` to an existing `EducationOrganizationCategoryDescriptor` resource.",
                    "file": "./localEducationAgencies.jsonl",
                    "line_numbers":[1],
                    "count": 1
                }
            ],
            "records_processed": 1,
            "records_skipped": 0,
            "records_failed": 1
        }
    },
    "command": "send",
    "completed_at": "2025-07-24T14:08:22.745435",
    "runtime_sec": 96.734111,
    "total_records_processed": 1,
    "total_records_skipped": 0,
    "total_records_failed": 1
}

Now, when communicating with a 7.2+ API, it can report multiple errors for a single line of a file:

{
    "started_at": "2025-07-24T13:13:13.041002",
    "working_dir": "/xxxxxxxxx/repos/lightbeam/example",
    "config_file": "./lightbeam.yml",
    "data_dir": "./",
    "api_url": "https://localhost/api",
    "namespace": "ed-fi",
    "resources": {
        "localEducationAgencies": {
            "failures": [
                {
                    "status_code": 400,
                    "message": "LocalEducationAgencyCategoryDescriptor value 'uri://ed-fi.org/LocalEducationAgencyCategoryDescriptor#CharterX' does not exist. (at $.localEducationAgencyCategoryDescriptor)",
                    "file": "./localEducationAgencies.jsonl",
                    "line_numbers":[1],
                    "count": 1
                },
                {
                    "status_code": 400,
                    "message": "EducationOrganizationCategoryDescriptor value 'uri://ed-fi.org/EducationOrganizationCategoryDescriptor#Local Education AgencyX' does not exist. (at $.categories[0].educationOrganizationCategoryDescriptor)",
                    "file": "./localEducationAgencies.jsonl",
                    "line_numbers":[1],
                    "count": 1
                }
            ],
            "records_processed": 1,
            "records_skipped": 0,
            "records_failed": 1
        }
    },
    "command": "send",
    "completed_at": "2025-07-24T13:13:13.364387",
    "runtime_sec": 0.323385,
    "total_records_processed": 1,
    "total_records_skipped": 0,
    "total_records_failed": 1
}
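A consumer of the results file that wants all errors for a given record can regroup the failures by file and line number. The sketch below assumes only the keys visible in the examples above (resources, failures, file, line_numbers, message); the function name errors_by_line is hypothetical.

```python
from collections import defaultdict


def errors_by_line(results: dict) -> dict:
    """Group failure messages by (file, line number).

    `results` is the parsed --results-file JSON. With a 7.2+ API, one line
    of a file may map to several messages.
    """
    grouped = defaultdict(list)
    for info in results.get("resources", {}).values():
        for failure in info.get("failures", []):
            for line_number in failure.get("line_numbers", []):
                key = (failure.get("file"), line_number)
                grouped[key].append(failure.get("message"))
    return dict(grouped)
```

For the 7.2+ example above, the single key ("./localEducationAgencies.jsonl", 1) would map to both descriptor messages.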

I think this makes sense, but am open to feedback/concerns.

The edu_data_quality_monitoring package unpacks this results-file format; I looked at it and it doesn't seem like this lightbeam change will affect that, but I could be wrong. @rlittle08 could perhaps confirm this change won't break the EDU package?

I've tested this against Ed-Fi API versions 5.3, 6.0, 7.0, 7.1, 7.2, and 7.3; all worked with no errors.

@tomreitz tomreitz requested a review from ejoranlienea July 24, 2025 21:10
@ejoranlienea
Contributor

Re: log parsing, I believe this part will no longer be accurate with this change.

Specifically, the log parsing ultimately returns strings like: 49 records failed for studentAssessments; 2 records failed for assessments, disaggregating the 51 total_records_failed into failures by targeted resource. This depended on the assumption that the count values inside a resource's failures would sum to the total number of failed records for that resource, but here they do not, because we've moved from counting 'failed payloads' to counting 'errors within payloads'.

The records_failed value already answers that question, so there was never really a need to sum. The other object produced in that CTE (resource_summary) appears as though it would still work, but the counts there might be a bit confusing in the new scheme.

So anyway, I think changing the above-linked line from sum(error_count) to any_value(records_failed) would fix it and be backwards compatible.
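To make the discrepancy concrete, here is a small Python sketch (not the EDU package's SQL) that computes both numbers per resource from a parsed results file. The function name failed_counts and its output keys are hypothetical; the input keys are the ones shown in the results-file examples in this PR.

```python
def failed_counts(results: dict) -> dict:
    """Compare summed failure counts with records_failed for each resource."""
    summary = {}
    for resource, info in results.get("resources", {}).items():
        summed = sum(f.get("count", 0) for f in info.get("failures", []))
        summary[resource] = {
            "summed_error_count": summed,  # counts errors within payloads
            "records_failed": info.get("records_failed", 0),  # counts failed payloads
        }
    return summary


# Trimmed version of the 7.2+ results file from the PR description:
# one failed record that produced two validation errors.
sample = {
    "resources": {
        "localEducationAgencies": {
            "failures": [
                {"status_code": 400, "line_numbers": [1], "count": 1},
                {"status_code": 400, "line_numbers": [1], "count": 1},
            ],
            "records_failed": 1,
        }
    }
}
print(failed_counts(sample))
```

Here the summed count is 2 while records_failed is 1, which is why reading records_failed directly gives the per-resource number of failed records in both the old and new schemes.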

…es but allow other Python errors to bubble up
@tomreitz tomreitz merged commit 89eadb1 into main Sep 25, 2025
@tomreitz tomreitz deleted the feature/support_edfi72_errors branch September 25, 2025 16:14