DP+ 3.0 will make the Data Resource Upload Flow (DRUF) that was developed for the Texas Water Data Hub generally available.
The DRUF is interactive FAIRification, featuring:
ckanext-scheming 3.0-powered Dataset Form Pages, with a standard Upload page at the beginning and a Review page at the end.
Metadata formulas - an Excel-like formula DSL using jinja2 that can calculate or suggest metadata. Results can either be precalculated and assigned automatically or offered as suggestions.
Extensible formula helper library - since the formulas are written in jinja2, additional helpers can be added easily. The formula helpers will prioritize precalculating DCAT-US 3 recommended optional properties that are otherwise too hard to calculate manually (DCAT-US 3 in DP+ tracking issue #218).
Suggestion UX/UI - a suggestion UI using Bootstrap popovers that show the suggestion and its formula.
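As a rough illustration of the formula idea above, the sketch below evaluates a jinja2 expression against some per-column statistics and registers one extra helper. The helper `temporal_span`, the layout of `stats`, and the function `evaluate_formula` are hypothetical names made up for this example; they are not the DP+ formula API.

```python
from jinja2 import Environment

# Hypothetical helper (not part of DP+): build an ISO 8601 interval string.
def temporal_span(min_date, max_date):
    return f"{min_date}/{max_date}"

env = Environment()
# Registering a global is one way extra helpers can be exposed to formulas.
env.globals["temporal_span"] = temporal_span

def evaluate_formula(formula, **context):
    """Render a jinja2 formula string against the given context."""
    return env.from_string(formula).render(**context)

# Assumed shape for per-column statistics; the real structure may differ.
stats = {"obs_date": {"min": "2019-01-01", "max": "2023-12-31"}}

formula = '{{ temporal_span(stats["obs_date"]["min"], stats["obs_date"]["max"]) }}'
suggested_temporal = evaluate_formula(formula, stats=stats)
print(suggested_temporal)
```

The same rendered value could be written straight into a metadata field or shown as a suggestion; that choice sits outside the formula itself.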
Commit after truncate to reduce lock contention #263: merged 2025-11-23 (69f5481). The truncate now runs in its own transaction so the AccessExclusive lock is held for the shortest possible window, eliminating UI hangs on large COPY-into-datastore jobs.
Incorporate the TX Form Pages implementation - specifically, the Registration and Upload Form Pages at the beginning, and the Review Form Page at the end.
Extend qsv describegpt so that it can use a controlled vocabulary for suggested tags. The controlled vocabulary will be configurable and will point to a specified CKAN resource with the following columns: tag, description, AI guidance. The AI guidance column will hold customizable instructions on when and how to use the associated tag.
Stash existing data dictionary before doing a DP+ job. If DP+ job fails, the old Data Dictionary should be restored. #265: stash and restore the Data Dictionary across DP+ job failures, landed in fix(#265): stash Data Dictionary before delete, restore on rollback #307 (59e08d7). The Analysis stage writes a small per-resource JSON snapshot to disk before deleting the existing datastore; on failure, _rollback_database reads it back and re-creates the datastore with the original info dicts (and the proper type derived from type_override) and zero rows. A retry branch in AnalysisStage._parse_stats re-loads the stash when no live datastore exists, so the dictionary survives even when the failure happened outside the database transaction. The stash mtime is surfaced in restore logs so operators can distinguish a stale restore from a genuine retry. New config option: ckanext.datapusher_plus.dictionary_stash_dir (defaults to <tempdir>/dpp_dict_stash).
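The stash-and-restore behavior described above can be pictured with a small sketch. This is not the code from #307: the function names, the one-file-per-resource layout, and the shape of the field dicts are assumptions made for illustration. Only the default directory name follows the description.

```python
import json
import tempfile
from pathlib import Path

# Mirrors the described default of <tempdir>/dpp_dict_stash.
DEFAULT_STASH_DIR = Path(tempfile.gettempdir()) / "dpp_dict_stash"

def stash_dictionary(resource_id, fields, stash_dir=DEFAULT_STASH_DIR):
    """Write the data dictionary to a per-resource JSON file (hypothetical layout)."""
    stash_dir.mkdir(parents=True, exist_ok=True)
    path = stash_dir / f"{resource_id}.json"
    tmp = path.with_name(path.name + ".tmp")
    tmp.write_text(json.dumps(fields))
    tmp.replace(path)  # rename into place so readers never see a partial file
    return path

def restore_dictionary(resource_id, stash_dir=DEFAULT_STASH_DIR):
    """Return (fields, mtime) for a stashed dictionary, or None if none exists."""
    path = stash_dir / f"{resource_id}.json"
    if not path.exists():
        return None
    # The mtime lets a caller log how old the stash is before trusting it.
    return json.loads(path.read_text()), path.stat().st_mtime
```

A caller would stash before dropping the datastore and call `restore_dictionary` from its rollback path, logging the returned mtime alongside the restore.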
(DO THIS FIRST, BEFORE ITEMS BELOW)
[...] fdc3714), scheming UI in feat(ai-suggestions): scheming UI for AI-derived metadata (PR #253 follow-up) #302 (ad28ebc), qsv-flag + envelope-shape fixes from LM Studio E2E in fix(ai-suggestions): real-qsv schema + CLI flag corrections (PR #301/#302 follow-up) #303, and Vitest + jsdom unit suite in test(js): vitest + jsdom unit tests for scheming-ai-suggestions.js (+ fixes 1 polling bug) #304 (which also caught a production bug where the polling JS was reading STATUS from the wrong path and would never have terminated). Opt-in via ckanext.datapusher_plus.enable_ai_suggestions (off by default; needs an OpenAI-compatible endpoint configured in qsv's describegpt prompt-file). Per-field opt-in via ai_suggestion: true on the scheming field config; AI buttons appear next to opted-in fields on markdown / text / select form snippets, polling JS reads suggestions from package["dpp_suggestions"]["ai_suggestions"] and applies on click.
[...] efb254f; format_converter.py:319-340 now sniffs and normalizes non-comma CSVs
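One way to picture the "sniffs and normalizes non-comma CSVs" step is with Python's standard `csv.Sniffer`. This sketch does not reproduce format_converter.py: the function name `normalize_to_comma`, the candidate delimiter set, and the 4096-character sample size are assumptions.

```python
import csv
import io

def normalize_to_comma(text, candidates=",;\t|"):
    """Detect the delimiter of CSV text and rewrite it as comma-delimited."""
    sample = text[:4096]  # sniff a prefix rather than the whole file
    try:
        dialect = csv.Sniffer().sniff(sample, delimiters=candidates)
    except csv.Error:
        return text  # delimiter could not be determined; leave input untouched
    if dialect.delimiter == ",":
        return text
    rows = csv.reader(io.StringIO(text, newline=""), dialect)
    out = io.StringIO(newline="")
    # The writer quotes any field that itself contains a comma.
    csv.writer(out, lineterminator="\n").writerows(rows)
    return out.getvalue()

print(normalize_to_comma("id;name\n1;Austin, TX\n2;Dallas\n"))
```

Going through `csv.reader` and `csv.writer`, rather than a plain string replace, keeps fields with embedded commas intact by quoting them on output.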