Fast upsert mode #34

@jqnatividad

Currently, DP+, like Datapusher and xloader, only does drop & replace; it does not support upserts.

It'd be great if DP+ could support upserts in a performant way.

This can be done by:

  • adding a resource-level metadata field that the Data Publisher can set to enable upsert mode.
  • when a resource has upsert mode enabled, instead of drop & replace, DP+ will:
    • compare the schemas of the existing resource and the new CSV to see if they are identical (qsv can do this very quickly)
    • if they're not, DP+ will abort with an error stating that the resource is in upsert mode and the schemas do not match
    • if the schemas are identical, do a PostgreSQL COPY of the file to be pushed into a temporary table
    • then do an INSERT INTO ... ON CONFLICT DO UPDATE to upsert the temporary table's rows into the existing resource
    • finally, drop the temporary table
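
The steps above could be sketched roughly as follows. This is only an illustration, not DP+ code: the function names (`csv_header`, `quote_ident`, `build_upsert_sql`, `plan_upsert`), the `_upsert_tmp` suffix, and the `key_columns` parameter are all hypothetical. It also assumes the existing table has a unique index or constraint on the key columns, which `ON CONFLICT` requires and which this issue does not yet specify. The header comparison is a simple stand-in for the qsv schema check and only compares column names, not types.

```python
import csv
import io


def csv_header(text: str) -> list[str]:
    """Return the header row of a CSV document given as text."""
    return next(csv.reader(io.StringIO(text)))


def quote_ident(name: str) -> str:
    """Quote a PostgreSQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def build_upsert_sql(target: str, temp: str, columns: list[str],
                     key_columns: list[str]) -> str:
    """Build the INSERT ... ON CONFLICT statement that merges temp into target."""
    col_list = ", ".join(quote_ident(c) for c in columns)
    key_list = ", ".join(quote_ident(c) for c in key_columns)
    non_keys = [c for c in columns if c not in key_columns]
    if non_keys:
        assignments = ", ".join(
            f"{quote_ident(c)} = EXCLUDED.{quote_ident(c)}" for c in non_keys
        )
        action = f"DO UPDATE SET {assignments}"
    else:
        # Every column is part of the key, so there is nothing to update.
        action = "DO NOTHING"
    return (
        f"INSERT INTO {quote_ident(target)} ({col_list}) "
        f"SELECT {col_list} FROM {quote_ident(temp)} "
        f"ON CONFLICT ({key_list}) {action}"
    )


def plan_upsert(existing_columns: list[str], new_csv_text: str,
                target: str, key_columns: list[str]) -> list[str]:
    """Return the SQL statements for one upsert run, or raise on a schema mismatch."""
    new_columns = csv_header(new_csv_text)
    if new_columns != existing_columns:
        raise ValueError(
            "resource is in upsert mode and the schemas do not match"
        )
    temp = f"{target}_upsert_tmp"  # hypothetical naming scheme
    col_list = ", ".join(quote_ident(c) for c in new_columns)
    return [
        # 1. Temporary table with the same column definitions as the target.
        f"CREATE TEMP TABLE {quote_ident(temp)} "
        f"(LIKE {quote_ident(target)} INCLUDING DEFAULTS)",
        # 2. Bulk-load the pushed file into the temporary table.
        f"COPY {quote_ident(temp)} ({col_list}) "
        f"FROM STDIN WITH (FORMAT csv, HEADER true)",
        # 3. Merge the temporary table into the existing resource.
        build_upsert_sql(target, temp, new_columns, key_columns),
        # 4. Clean up.
        f"DROP TABLE {quote_ident(temp)}",
    ]
```

For example, `plan_upsert(["id", "name"], "id,name\n1,a\n", "res", ["id"])` returns four statements to run in order inside one transaction, with the CSV streamed to the `COPY` step by the database driver. One caveat: PostgreSQL rejects an `ON CONFLICT DO UPDATE` that would touch the same target row twice, so the pushed file would need to be free of duplicate keys.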

Labels: enhancement (New feature or request)
