bulk_update_or_create(model_instances) or bulk_update(model_instances, upsert=True)? #49

Open
candeira opened this issue May 1, 2016 · 9 comments

Comments

candeira commented May 1, 2016

For my current job we need bulk upsert of records, and I'm thinking of forking your package and implementing bulk_upsert myself. If/when I do that, I'd like to do it in the manner that's most likely to be accepted into your project, so as not to maintain an independent fork.

Which syntax do you prefer?

  • bulk_update_or_create(model_instances)
  • bulk_update(model_instances, upsert=True)
  • bulk_upsert(model_instances)

For now I'd only make my changes compatible with Postgres 9.5+, because that's what we're using and because I'm relatively new at this niche.

Any other advice/comment?

aykut (Owner) commented May 2, 2016

Hi,

I'm not sure it is a good idea to include bulk_create in this project. Django already has a built-in bulk_create method. Why not separate the objects into those to create and those to update, then call bulk_create and bulk_update explicitly?
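
A minimal sketch of that split, assuming the primary keys already present in the table have been fetched by a separate query. `split_for_upsert` and `existing_pks` are hypothetical names, not part of Django or this package; the two resulting lists would then go to bulk_create and bulk_update respectively.

```python
from types import SimpleNamespace


def split_for_upsert(instances, existing_pks):
    """Partition instances into (to_create, to_update) by primary key.

    existing_pks is assumed to hold the primary keys already stored,
    e.g. gathered beforehand with one extra SELECT.
    """
    existing = set(existing_pks)
    to_create, to_update = [], []
    for obj in instances:
        # Unsaved objects (pk is None) and unknown keys are new rows.
        if obj.pk is not None and obj.pk in existing:
            to_update.append(obj)
        else:
            to_create.append(obj)
    return to_create, to_update


# Stand-ins for model instances; only the pk attribute matters here.
rows = [SimpleNamespace(pk=1), SimpleNamespace(pk=2), SimpleNamespace(pk=None)]
to_create, to_update = split_for_upsert(rows, existing_pks=[1])
```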

ckcollab commented Jun 1, 2016

@candeira I'm way into that! I could use this on my project, for sure.

@aykut bulk_update_or_create is different from bulk_create?

phlax commented Mar 16, 2017

@candeira @aykut @ckcollab this would be amazingly helpful

phlax commented Mar 16, 2017

@aykut the problem with using bulk_create is that you need to know in advance which records already exist, so it requires an additional query, I think

@mehdipourfar

I need this feature. Any news?

arnau126 (Collaborator) commented Aug 2, 2017

I think it's possible to add this feature. I would call it bulk_update_or_create because Django already has an update_or_create for single instances.

But even if we implement this function here, we will also need to know which instances already exist (performing an additional query). bulk_update_or_create will actually split the list of instances and call bulk_create and bulk_update separately. So each batch will perform 3 queries.

Does that seem reasonable to you? Any better approach?
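
Roughly, the per-batch flow could look like the sketch below. To stay independent of any ORM, the three database operations are passed in as callables; `find_existing_pks`, `create_many`, `update_many`, and the signature itself are hypothetical names for illustration, not this package's API. In a Django project they would presumably wrap a `filter(pk__in=...)` lookup, bulk_create, and bulk_update.

```python
from types import SimpleNamespace


def bulk_update_or_create(instances, find_existing_pks, create_many,
                          update_many, batch_size=100):
    """Upsert instances in batches, issuing up to three queries per batch."""
    created = updated = 0
    for start in range(0, len(instances), batch_size):
        batch = instances[start:start + batch_size]
        # Query 1: which of this batch's keys are already stored?
        keys = [obj.pk for obj in batch if obj.pk is not None]
        existing = set(find_existing_pks(keys))
        to_update = [obj for obj in batch if obj.pk in existing]
        to_create = [obj for obj in batch if obj.pk not in existing]
        # Query 2: insert the new rows.
        if to_create:
            create_many(to_create)
        # Query 3: update the rows that were already there.
        if to_update:
            update_many(to_update)
        created += len(to_create)
        updated += len(to_update)
    return created, updated


# In-memory stand-in for a table, keyed by primary key.
table = {1: "old"}


def save_all(objs):
    for obj in objs:
        table[obj.pk] = obj.name


result = bulk_update_or_create(
    [SimpleNamespace(pk=1, name="a"), SimpleNamespace(pk=2, name="b")],
    find_existing_pks=lambda keys: [k for k in keys if k in table],
    create_many=save_all,
    update_many=save_all,
)
```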

@abdulwahid24

I agree with @arnau126. Any update on this feature?

Bartvds commented Aug 31, 2017

The three-query approach has a race condition: unless you can be sure your program is the only one writing to that table, you'll have to add retry logic around the transaction, because records can be added or removed between your read and create steps.

A SQL-level UPSERT is the way to go for an atomic, single-query update/create.
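
For reference, a rough sketch of what such a statement could look like on Postgres 9.5+, using `INSERT ... ON CONFLICT ... DO UPDATE`. The helper `build_upsert_sql` and the table and column names are made up for illustration. Identifiers are interpolated directly, so they must come from trusted code, and the `%s` placeholders assume a driver that accepts them, such as psycopg2.

```python
def build_upsert_sql(table, columns, conflict_column, row_count):
    """Build one multi-row INSERT ... ON CONFLICT DO UPDATE statement.

    Values are left as %s placeholders for the driver to bind.
    Identifiers are NOT escaped here, so never pass user input as names.
    """
    row = "(" + ", ".join(["%s"] * len(columns)) + ")"
    values = ", ".join([row] * row_count)
    # EXCLUDED refers to the row that was proposed for insertion.
    assignments = ", ".join(
        f"{col} = EXCLUDED.{col}" for col in columns if col != conflict_column
    )
    return (
        f"INSERT INTO {table} ({', '.join(columns)}) VALUES {values} "
        f"ON CONFLICT ({conflict_column}) DO UPDATE SET {assignments}"
    )


sql = build_upsert_sql("app_item", ["id", "name", "price"], "id", row_count=2)
```

The flattened row values would then be passed as the parameter list to `cursor.execute(sql, params)`.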
