This repository contains materials related to the mutual mapping between CCMM and DataCite metadata standards, including the technical XSLT transformation scripts.
Important
This XSLT transformation is subject to minor ongoing improvements. While it is fully available for custom modification and use, the final responsibility for the transformation results lies with the user. We highly recommend testing and fine-tuning the script against your own datasets to ensure compatibility with specific metadata profiles.
2026-05-14
DataCite to CCMM XSLT transformation:
- DataCite copyright information is now mapped to the CCMM license instead of access rights.
- Dynamic scheme IRI resolution for identifiers: Replaced hardcoded DOI scheme IRI with dynamic resolution.
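The idea behind dynamic scheme IRI resolution can be sketched in a few lines of Python. This is only an illustration, not the logic of the XSLT script itself: the lookup table `SCHEME_IRIS`, its entries, and the function `resolve_scheme_iri` are hypothetical names chosen for this example.

```python
# Hypothetical lookup table: identifier type -> scheme IRI.
# The entries are illustrative assumptions, not the values used by the XSLT script.
SCHEME_IRIS = {
    "DOI": "https://doi.org/",
    "Handle": "https://hdl.handle.net/",
    "ORCID": "https://orcid.org/",
    "ROR": "https://ror.org/",
}


def resolve_scheme_iri(identifier_type, default=None):
    """Return the scheme IRI for an identifier type, or `default` if it is unknown."""
    normalized = identifier_type.strip().lower()
    for known_type, iri in SCHEME_IRIS.items():
        # Compare case-insensitively so "doi" and "DOI" resolve to the same IRI.
        if known_type.lower() == normalized:
            return iri
    return default


print(resolve_scheme_iri("DOI"))
print(resolve_scheme_iri("LocalID", default="(no known scheme)"))
```

Compared with a hardcoded DOI IRI, a lookup of this kind lets records with other identifier types receive a matching scheme IRI, and it makes the fallback behaviour explicit.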
This document describes the background and the methodology for the design of CCMM and DataCite mapping. The motivation for mapping between CCMM and DataCite is the need to align CCMM metadata with DataCite metadata and vice versa, i.e., a DataCite-compliant representation of CCMM metadata and a CCMM-compliant representation of DataCite metadata. The need arises because metadata catalogs can support either the CCMM or the DataCite model.
"The Czech Core Metadata Model for Research Data (abbreviated as CCMM) is a core metadata model for research data description in the Czech Republic, it is an output of the Czech Academic and Research Discovery Services project, hereinafter referred to as CARDS." [https://www.ccmm.cz/en/core-model-ccmm/model-purpose-and-objectives/]
- Web: https://model.ccmm.cz/research-data/en/#Dataset
- XML Schema: https://model.ccmm.cz/research-data/dataset/schema.xsd
- Examples: https://github.com/techlib/CCMM/tree/main/_metadata-samples/xml
- Visualization of the schema, with mandatory parts emphasized:

"The DataCite Metadata Schema is a list of core metadata properties chosen for accurate and consistent identification of a resource for citation and retrieval purposes, with recommended use instructions in the documentation. The DataCite Metadata Schema is suitable for a wide range of resource types—from samples and images to data and preprints." [https://datacite-metadata-schema.readthedocs.io/en/4.7/introduction/about-schema/] It is currently one of the most widely used metadata schemas for describing and citing research data.
- Web: https://datacite-metadata-schema.readthedocs.io/en/4.7/
- Examples and XML Schema: https://datacite-metadata-schema.readthedocs.io/en/4.7/xml/
- The XML Schema: https://schema.datacite.org/meta/kernel-4.7/metadata.xsd
- Full Example: https://schema.datacite.org/meta/kernel-4.7/example/datacite-example-full-v4.xml
\resource\identifier
\resource\creators
\resource\titles
\resource\publisher
\resource\publicationYear
\resource\resourceType
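A quick way to check a DataCite record for the elements listed above is sketched below, using only the Python standard library. The function name `missing_mandatory`, the sample record, and its DOI are made up for this example; the namespace is the DataCite kernel-4 namespace. This is a presence check only and does not replace validation against the XML Schema.

```python
import xml.etree.ElementTree as ET

# Namespace used by DataCite kernel-4 XML documents.
DATACITE_NS = {"dc": "http://datacite.org/schema/kernel-4"}

# Direct children of <resource> listed above.
MANDATORY = ["identifier", "creators", "titles", "publisher",
             "publicationYear", "resourceType"]


def missing_mandatory(xml_text):
    """Return the names of listed elements that are absent from the record."""
    root = ET.fromstring(xml_text)
    return [name for name in MANDATORY
            if root.find(f"dc:{name}", DATACITE_NS) is None]


# Hypothetical, deliberately incomplete record for demonstration.
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.1234/example</identifier>
  <titles><title>Example</title></titles>
  <publicationYear>2024</publicationYear>
</resource>"""

print(missing_mandatory(SAMPLE))  # ['creators', 'publisher', 'resourceType']
```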
The CCMM-DataCite mapping and transformation methodology is as follows:
1. Initial alignment based on a high-level (vocabulary-level) comparison of the metadata elements defined in CCMM 1.1.0 (https://model.ccmm.cz/research-data/en/dsv.ttl) and in the in-house OWL representation of the DataCite vocabulary (https://model.ccmm.cz/vocabulary/datacite/model.owl.ttl).
2. XML crosswalks
   2.1. XML crosswalks based on sample examples of CCMM metadata and DataCite metadata (initial mappings).
   2.2. XML crosswalks (exhaustive mappings using the full list of elements and attributes for CCMM and DataCite).
3. Operationalize the XML crosswalks using XSLT transformations that transform:
   3.1. CCMM metadata to DataCite metadata, i.e., a DataCite-compliant representation of CCMM metadata.
   3.2. DataCite metadata to CCMM metadata, i.e., a CCMM-compliant representation of DataCite metadata.
The whole approach is iterative.
The initial alignment is documented in the initial-mapping branch.
The sample-based XML crosswalks are partly documented in the initial-mapping branch.
Once the initial mapping and XML crosswalks were completed, we shifted to an exhaustive search that began with DataCite elements and attributes (queried via XPath) to identify the corresponding CCMM nodes.
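The exhaustive pass starts from a full inventory of DataCite element and attribute paths. One way to build such an inventory from a sample record is sketched below. The helper names `local_name` and `collect_paths` and the sample record are assumptions made for this example; the actual inventory may have been produced with different tooling.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the '{namespace}' prefix that ElementTree puts on qualified names."""
    return tag.rsplit("}", 1)[-1]


def collect_paths(element, prefix=""):
    """Return the set of XPath-like paths for an element, its attributes and descendants."""
    path = f"{prefix}/{local_name(element.tag)}"
    paths = {path}
    for attr in element.attrib:
        paths.add(f"{path}/@{local_name(attr)}")
    for child in element:
        paths |= collect_paths(child, path)
    return paths


# Hypothetical record used only to demonstrate the traversal.
SAMPLE = """<resource xmlns="http://datacite.org/schema/kernel-4">
  <identifier identifierType="DOI">10.1234/example</identifier>
  <titles><title xml:lang="en">Example</title></titles>
</resource>"""

for p in sorted(collect_paths(ET.fromstring(SAMPLE))):
    print(p)
```

Each printed path is a candidate row of the crosswalk, to be matched against the corresponding CCMM node.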
The complete list of XML crosswalks is available as a single table or as a TSV file.
Operationalize XML crosswalks using XSLT transformation, aiming at transforming metadata:
- XSLT transformation for transforming CCMM metadata to DataCite metadata, i.e., a DataCite-compliant representation of CCMM metadata
- XSLT transformation for transforming DataCite metadata to CCMM metadata, i.e., a CCMM-compliant representation of DataCite metadata
For validating the XSLT transformation workflow, we employ round‑tripping tests, ensuring that the transformed output can be converted back to its original form:
    +-----------------+        XSLT        +------------------+
    |  CCMM XML (in)  | -----------------> |   DataCite XML   |
    +-----------------+                    +------------------+
                                                    |
                                                    | XSLT
                                                    v
                                           +------------------+
                                           |  CCMM XML (out)  |
                                           +------------------+
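The round-trip check in the diagram can be expressed as a small comparison of canonicalized XML. The sketch below assumes the two XSLT transformations are wrapped as Python callables that take and return XML text. The stand-in functions `fake_to_datacite` and `fake_to_ccmm` only rename the root element and are purely illustrative; in practice they would invoke an XSLT processor.

```python
import xml.etree.ElementTree as ET


def canonical(xml_text):
    """Canonicalize XML (C14N 2.0) so formatting differences do not affect comparison."""
    return ET.canonicalize(xml_text, strip_text=True)


def round_trips(ccmm_xml, to_datacite, to_ccmm):
    """Return True if CCMM -> DataCite -> CCMM reproduces the original document."""
    return canonical(to_ccmm(to_datacite(ccmm_xml))) == canonical(ccmm_xml)


# Stand-ins for the real XSLT transformations (hypothetical, for demonstration only).
def fake_to_datacite(xml_text):
    return xml_text.replace("dataset", "resource")


def fake_to_ccmm(xml_text):
    return xml_text.replace("resource", "dataset")


sample = "<dataset>\n  <title>Example</title>\n</dataset>"
print(round_trips(sample, fake_to_datacite, fake_to_ccmm))  # True
```

A strict equality check like this fails for any element that one model cannot represent, so real round-trip tests may need to compare only the subset of nodes covered by the crosswalk.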
The following real DataCite metadata records were gathered from https://oai.datacite.org/oai for testing the XSLT transformations:
- 15 "randomly" selected DataCite metadata files (v4.6) out of 17,500 DataCite metadata records.
The HEPData repository dominates the sample (11 of the 15 records); the remaining records come from other repositories (Nakala and National Research Council Canada).
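A reproducible way to draw such a sample might look like the sketch below. It only builds an OAI-PMH request URL and picks record identifiers with a fixed seed; it does not perform any network access. The `metadataPrefix` value `oai_datacite`, the function names, and the seed are assumptions for this example and do not describe how the published subset was actually selected.

```python
import random
from urllib.parse import urlencode

OAI_ENDPOINT = "https://oai.datacite.org/oai"


def list_records_url(metadata_prefix="oai_datacite", set_spec=None):
    """Build an OAI-PMH ListRecords request URL (metadata prefix is an assumption)."""
    params = {"verb": "ListRecords", "metadataPrefix": metadata_prefix}
    if set_spec:
        params["set"] = set_spec
    return f"{OAI_ENDPOINT}?{urlencode(params)}"


def pick_sample(record_ids, size=15, seed=0):
    """Draw a reproducible random sample of record identifiers."""
    rng = random.Random(seed)
    return rng.sample(list(record_ids), size)


print(list_records_url())
# Placeholder identifiers standing in for harvested record IDs.
print(len(pick_sample(range(17500))))  # 15
```

Fixing the seed makes the selection repeatable, and sampling per repository (for example via the `set` parameter) would reduce the dominance of a single source.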
The validation report is available.
in progress
in progress
- The XSLT transformations have been validated using our CCMM XML sample datasets and subset01-DataCite-4.6. Ongoing work focuses on evaluating the transformations against subset01-DataCite-4.4 and subset01-DataCite-4.5.
- Mapping between DataCite controlled vocabularies and CCMM controlled vocabularies in both directions, with application in XSLT transformations. Mapping of resource types from DataCite to COAR has already been implemented; mapping from COAR to DataCite is currently in progress. At present, values from controlled vocabularies are otherwise taken directly from the source metadata within the XSLT transformations.
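The vocabulary mapping with pass-through fallback described above can be sketched as a lookup table that is inverted for the opposite direction. The two table entries below are illustrative only and should be verified against the COAR Resource Types vocabulary; the names `DATACITE_TO_COAR`, `COAR_TO_DATACITE`, and `map_resource_type` are hypothetical and do not mirror the XSLT implementation.

```python
# Illustrative excerpt: DataCite resourceTypeGeneral -> COAR resource type IRI.
# Verify the IRIs against the COAR vocabulary before relying on them.
DATACITE_TO_COAR = {
    "Dataset": "http://purl.org/coar/resource_type/c_ddb1",
    "Software": "http://purl.org/coar/resource_type/c_5ce6",
}

# The reverse direction is obtained by inverting the table,
# which works only while the mapping is one-to-one.
COAR_TO_DATACITE = {iri: label for label, iri in DATACITE_TO_COAR.items()}


def map_resource_type(value, table):
    """Map a controlled-vocabulary value; unmapped values pass through unchanged."""
    return table.get(value, value)


print(map_resource_type("Dataset", DATACITE_TO_COAR))
print(map_resource_type("Workflow", DATACITE_TO_COAR))  # unmapped, so passed through
```

Once several DataCite values map to the same COAR concept, simple inversion no longer suffices and the reverse table has to be curated by hand.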