Merged
16 changes: 12 additions & 4 deletions site/cds_rdm/tasks.py
@@ -32,15 +32,23 @@
 @shared_task
 def sync_users(since=None, **kwargs):
     """Task to sync users with CERN database."""
-    user_ids = users_sync(identities=dict(since=since))
-    reindex_users.delay(user_ids)
+    try:
+        user_ids = users_sync(identities=dict(since=since))
+        reindex_users.delay(user_ids)
+    except Exception as e:
+        db.session.rollback()
Contributor

Won't this roll back the updates for all users? Can we roll back only the one failed user update?

Contributor Author

@sakshamarora1 Apr 13, 2026

The DB `commit()` happens inside the functions here in invenio-cern-sync, so this will only roll back the user that errored.
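
To illustrate the point about commit placement, here is a minimal sketch. It is not the invenio-cern-sync code: it uses the standard-library `sqlite3` module instead of SQLAlchemy, and the table names and the `sync_one` / `run_sync` helpers are hypothetical. It only shows that when each user is committed individually, a later `rollback()` discards just the uncommitted work of the user that failed.

```python
import sqlite3


def sync_one(conn, user_id, email):
    """Hypothetical per-user sync: two writes, then a commit for this user only."""
    conn.execute("INSERT INTO users (id) VALUES (?)", (user_id,))
    conn.execute(
        "INSERT INTO identities (user_id, email) VALUES (?, ?)", (user_id, email)
    )
    conn.commit()  # this user is persisted before the next one is attempted


def run_sync(conn, users):
    """Same shape as the task: roll back on error, then report the failure."""
    try:
        for user_id, email in users:
            sync_one(conn, user_id, email)
        return [user_id for user_id, _ in users]
    except sqlite3.IntegrityError as exc:
        conn.rollback()  # only the failing user's uncommitted rows are discarded
        print(f"sync failed: {exc}")
        return None


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY)")
conn.execute("CREATE TABLE identities (user_id INTEGER, email TEXT UNIQUE)")
conn.commit()

# User 3 reuses user 1's email, so its identity insert violates the UNIQUE constraint.
users = [
    (1, "a@example.org"),
    (2, "b@example.org"),
    (3, "a@example.org"),
    (4, "d@example.org"),
]
result = run_sync(conn, users)
persisted = [row[0] for row in conn.execute("SELECT id FROM users ORDER BY id")]
print(persisted)
```

In this sketch, users 1 and 2 stay in the database, the half-written row for user 3 is rolled back, and user 4 is never attempted because the loop stops at the first error. Whether the real sync continues past a failing user depends on invenio-cern-sync, which this example does not model.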

+        current_app.logger.exception(e)
Contributor

I think this will also bubble the exception up to Sentry; please make sure that's not the case.

Contributor Author

But in this case, don't we need the exception to reach Sentry? Otherwise, will we go and check the job logs for errors every day?

Contributor

We need to know that the job has failed. We don't necessarily need to learn this from Sentry; we now have email notifications for finished tasks... My concern is that exceptions might be raised incorrectly elsewhere. Have you discussed with Alex or Nico what they expect regarding how exceptions from tasks are handled? Should they be in Sentry? I think all the jobs should behave consistently here, so we need to align with the rest.

Contributor Author

@sakshamarora1 Apr 13, 2026

From what I understand, Sentry exceptions should be raised for things we can fix. As for other jobs such as ORCID spamming Sentry, I discussed this with Pablo, and they seem to be aligned on that as well: most of the Sentry logs for tasks are basically 'spam' because we cannot do anything about them, e.g. when the full name is missing in the ORCID job.

^^ The above is a separate underlying issue in invenio-vocabularies, with a potential fix there, not here.

So in this case, at the task level, if there is a duplicate account, we will see the Sentry notification and go fix the DB, so we can keep it as an exception. But if we want to rely on email notifications, then yes, we can change it to a warning. What do you think?

Contributor Author

Discussed IRL: we will keep this as an exception for the reason above.

The next step is to fix the DB and re-run the task.
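
For reference, the practical difference between the two options discussed in this thread can be sketched with the standard `logging` module alone. The `ErrorEventCollector` handler below is a hypothetical stand-in for an error tracker, not Sentry itself; it assumes the tracker is configured to turn only ERROR-level (and above) log records into events, which as far as I know is the default for Sentry's logging integration but should be checked against the actual deployment.

```python
import logging


class ErrorEventCollector(logging.Handler):
    """Hypothetical stand-in for an error tracker: keeps ERROR-and-above records."""

    def __init__(self):
        super().__init__(level=logging.ERROR)
        self.events = []

    def emit(self, record):
        self.events.append(record)


logger = logging.getLogger("sync_demo")
logger.setLevel(logging.DEBUG)
logger.propagate = False  # keep the demo isolated from the root logger
collector = ErrorEventCollector()
logger.addHandler(collector)


def run(log_as_exception):
    """Simulate a failing sync and log it in one of the two discussed ways."""
    try:
        raise ValueError("duplicate account")
    except Exception as e:
        if log_as_exception:
            logger.exception(e)  # ERROR level, traceback attached
        else:
            logger.warning("sync failed: %s", e)  # WARNING level, no traceback


run(log_as_exception=False)
events_after_warning = len(collector.events)

run(log_as_exception=True)
events_after_exception = len(collector.events)
has_traceback = collector.events[-1].exc_info is not None
```

Under that assumption, the `warning` variant produces no event in the collector, while `logger.exception` produces one with the traceback attached, which matches the decision to keep `exception` so that a duplicate account shows up as something to fix.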



 @shared_task
 def sync_groups(since=None, **kwargs):
     """Task to sync groups with CERN database."""
-    group_ids = groups_sync(groups=dict(since=since))
-    reindex_groups.delay(group_ids)
+    try:
+        group_ids = groups_sync(groups=dict(since=since))
+        reindex_groups.delay(group_ids)
+    except Exception as e:
+        db.session.rollback()
+        current_app.logger.exception(e)


@shared_task()