fix: improve harvester reports access and log downloads #773
TahaKhan998 wants to merge 1 commit into CERNDocumentServer:master
if not run:
    return {"message": "Run not found"}, 404

full_index_name = prefix_index(current_app.config["JOBS_LOGGING_INDEX"])
We should use the service layer here.
)
if status in ("FAILED", "PARTIAL_SUCCESS"):
    banner = "Job failed" if status == "FAILED" else "Job partially succeeded"
    header.append("")
What does the end result look like? Can you attach an example file?
can_create = [AuthenticatedRegularUser(), SystemProcess()]
can_read = RDMRecordPermissionPolicy.can_read + [ArchiverRead()]
can_search = RDMRecordPermissionPolicy.can_search + [ArchiverRead()]
can_search_revisions = RDMRecordPermissionPolicy.can_search_revisions + [
So curators can use the "View Changes" button on the Harvester Reports page.
Can you make sure they can only see revisions made by the system user?
If they can see more, let's discuss how to tackle it before jumping to implementation.
)
hidden_params=[
    ["action", "record.publish"],
    ["user_id", "system"],
Did you by any chance check what the REST API returns for audit logs? We need to make sure the REST API does not return wrong entries for the curator role (on the audit logs endpoint directly, not on this admin one).
Yeah I checked the audit logs API. The issue was in HarvesterCurator.query_filter in generators.py. It was looking at identity.provides the wrong way so the filter never really worked and curators could call GET /api/audit-logs and get way too much back. I fixed it to use Permission(harvester_admin_access_action).allows(identity) and only return system user and record.publish like Harvester Reports. Updated test_harvester_curator_permissions too.
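The change described in this reply can be sketched roughly as below. This is a minimal, self-contained illustration and not the code from the PR: `Identity` and `FakePermission` are hypothetical stand-ins for the Flask-Principal identity and for `Permission(harvester_admin_access_action)`, the need name in the usage lines is made up, and the field names `action` and `user_id` are assumed from the `hidden_params` shown in the diff.

```python
from dataclasses import dataclass, field


@dataclass
class Identity:
    """Hypothetical stand-in for a Flask-Principal identity."""

    provides: set = field(default_factory=set)


@dataclass
class FakePermission:
    """Hypothetical stand-in for Permission(harvester_admin_access_action)."""

    need: str

    def allows(self, identity):
        # Ask the permission object whether the identity satisfies it,
        # instead of inspecting identity.provides by hand in the generator.
        return self.need in identity.provides


def curator_query_filter(identity, permission):
    """Return a filter limiting curators to system-user publish entries.

    Returns None when the identity is not a curator, meaning this
    generator contributes no filter for that identity (a sketch-only
    convention, not necessarily what the real generator returns).
    """
    if not permission.allows(identity):
        return None
    return {
        "bool": {
            "filter": [
                {"term": {"action": "record.publish"}},  # assumed field name
                {"term": {"user_id": "system"}},  # assumed field name
            ]
        }
    }


# Usage: only an identity that satisfies the permission gets the filter.
perm = FakePermission("harvester-admin-access")  # hypothetical need name
curator = Identity(provides={"harvester-admin-access"})
anonymous = Identity()
curator_filter = curator_query_filter(curator, perm)
anonymous_filter = curator_query_filter(anonymous, perm)
```

The point of the sketch is that the permission check and the restriction travel together: a curator only ever receives the narrowed filter, never an unfiltered result set.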
return raw

# Group by context.task_id in first-seen order (RunsLogs.js buildLogTree).
task_groups = {}
Wasn't this grouping already done somewhere in the code? I am having flashbacks :)
Yeah, the admin view already does that grouping in RunsLogs.js. The download is just plain text though, so I repeated the same task_id grouping here so the file looks like what you see in admin.
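For reference, grouping by `task_id` in first-seen order needs very little code in Python, since dicts preserve insertion order. The sketch below is an illustration, not the PR's implementation; the entry shape (a `context` dict carrying `task_id`) is assumed from the comment in the diff, and `group_by_task` is a hypothetical name.

```python
def group_by_task(entries):
    """Group log entries by context.task_id, keeping first-seen order."""
    groups = {}  # dicts preserve insertion order (Python 3.7+)
    for entry in entries:
        # Entries without a context or task_id end up under the key None.
        task_id = entry.get("context", {}).get("task_id")
        groups.setdefault(task_id, []).append(entry)
    return groups


# Usage: task "b" is seen first, so it stays first in the result.
sample = [
    {"message": "start", "context": {"task_id": "b"}},
    {"message": "fetch", "context": {"task_id": "a"}},
    {"message": "done", "context": {"task_id": "b"}},
    {"message": "no task"},
]
grouped = group_by_task(sample)
```

If the admin UI and the download need to stay in sync over time, a shared description of the grouping rule (first-seen order, entries without a task kept together) may be easier to maintain than two independent implementations.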
Closes #767
Email template: removed "View Full Details"
The email had a "View Full Details" button pointing to /administration/jobs/<job_id>/<run_id>, but that page requires administration-access, which most recipients don't have, so the link led to a permission error.
Removed the button from both HTML and plain-text templates. Summary + safe links are still there.
"View Changes" works for curators
Curators could access the Harvester Reports page, but "View Changes" failed (blank page / 403) because can_search_revisions did not include the harvester-curator path.
Added HarvesterCurator to can_search_revisions, so curators can now actually see revision history.
Download button now returns real run logs
Previously, the download used audit logs, which only showed what got published.
Now it queries JOBS_LOGGING_INDEX using run_id + job_id and formats the output like the admin UI:
- each line is [timestamp] LEVEL message
- lines are grouped by task_id
- the file includes the run status + failure message
- the max results limit is respected
The .log is now a proper offline copy of the run logs.
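As a rough picture of the file layout described above, here is a minimal sketch of the formatting step. It is not the PR's code: `format_run_log`, the header lines, the entry keys (`timestamp`, `level`, `message`), and the default `max_results` are all assumptions for illustration. Only the banner wording and the `[timestamp] LEVEL message` line shape come from the diff and the description. Grouping by task_id is left out to keep the sketch short.

```python
def format_run_log(run_id, status, message, entries, max_results=1000):
    """Render run metadata and log entries as plain text."""
    header = [f"Run: {run_id}", f"Status: {status}"]
    if status in ("FAILED", "PARTIAL_SUCCESS"):
        banner = "Job failed" if status == "FAILED" else "Job partially succeeded"
        header.append("")
        # Append the failure message to the banner when one is available.
        header.append(f"{banner}: {message}" if message else banner)
    # Respect the result limit, then format each entry on one line.
    lines = [
        f"[{e['timestamp']}] {e['level']} {e['message']}"
        for e in entries[:max_results]
    ]
    return "\n".join(header + [""] + lines) + "\n"


# Usage: two entries, but the limit keeps only the first one.
text = format_run_log(
    "r1",
    "FAILED",
    "boom",
    [
        {"timestamp": "2025-01-01T00:00:00", "level": "INFO", "message": "started"},
        {"timestamp": "2025-01-01T00:00:05", "level": "ERROR", "message": "crashed"},
    ],
    max_results=1,
)
```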
Harvester Reports shows only system publishes
The page was showing all record.publish logs (including manual ones).
It now also filters on user_id = "system" (backend + frontend), so it only shows harvester activity.