Darwin Core Data Package guide
- Title
- Darwin Core Data Package guide
- Date version issued
- 2026-04-17
- Date created
- 2025-08-12
- Part of TDWG Standard
- http://www.tdwg.org/standards/450
- This version
- http://rs.tdwg.org/dwc/doc/dp/2026-04-17
- Latest version
- http://rs.tdwg.org/dwc/dp/
- Previous version
- http://rs.tdwg.org/dwc/doc/dp/2025-09-10
- Abstract
- Specification for creating a Darwin Core Data Package.
- Contributors
- Peter Desmet (INBO), Tim Robertson (Global Biodiversity Information Facility), John Wieczorek (Rauthiflor LLC, Global Biodiversity Information Facility)
- Creator
- Darwin Core Maintenance Group
- Bibliographic citation
- Darwin Core Maintenance Group. 2026. Darwin Core Data Package guide. Biodiversity Information Standards (TDWG). http://rs.tdwg.org/dwc/doc/dp/2026-04-17.
1 Introduction
Darwin Core Data Package (hereafter referred to as “DwC-DP”) is a community-developed container format to exchange biodiversity data. It extends the Data Package specification (developed by Frictionless Data) as an implementation for the Darwin Core Conceptual Model. This document specifies the requirements for a dataset to comply with DwC-DP.
1.1 Audience (non-normative)
This guide is intended for biodiversity data providers, curators, aggregators, researchers, software implementers, and standards developers who prepare or consume datasets using Darwin Core. It assumes familiarity with tabular data, but not with the Data Package specification. Where helpful, it references relevant parts of the Data Package specification and the Darwin Core standard.
1.2 Status of the content of this document
All sections of this document are normative (define what is required to comply with the standard), except for sections that are explicitly marked as non-normative (support understanding, but are not binding).
1.3 RFC 2119 key words
The key words “MUST”, “MUST NOT”, “REQUIRED”, “SHALL”, “SHALL NOT”, “SHOULD”, “SHOULD NOT”, “RECOMMENDED”, “MAY”, and “OPTIONAL” in this document are to be interpreted as described in RFC 2119.
2 DwC-DP Data Package example (non-normative)
Consider a dataset containing four bird Occurrences observed during a single parent Event. The data can be captured in two CSV files, each representing a DwC-DP table:
event.csv
eventID,eventDate,locationID
S229876476,2025-04-26T20:57:00+02:00,https://ebird.org/hotspot/L43523233
occurrence.csv
occurrenceID,eventID,scientificName,organismQuantity,organismQuantityType
1,S229876476,Apus apus,3,individuals
2,S229876476,Troglodytes troglodytes,1,individuals
3,S229876476,Turdus merula,1,individuals
4,S229876476,Erithacus rubecula,1,individuals
This dataset can be described as a DwC-DP with the following descriptor (datapackage.json):
{
  "profile": "http://rs.tdwg.org/dwc-dp/1.0/dwc-dp-profile.json",
  "id": "https://doi.org/10.9999/dwc-dp-example-dataset-doi",
  "created": "2025-09-08T09:52:03-03:00",
  "version": "1.0",
  "resources": [
    {
      "name": "event",
      "path": "event.csv",
      "profile": "tabular-data-resource",
      "format": "csv",
      "mediatype": "text/csv",
      "schema": {
        "fields": [
          {
            "name": "eventID",
            "title": "Event ID",
            "description": "An identifier for a dwc:Event.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/eventID",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/eventID-2023-06-28"
          },
          {
            "name": "eventDate",
            "title": "Event Date",
            "description": "A date or time interval during which a dwc:Event occurred.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/eventDate",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/eventDate-2025-06-12"
          },
          {
            "name": "locationID",
            "title": "Location ID",
            "description": "An identifier for a dcterms:Location.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/locationID",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/locationID-2023-06-28"
          }
        ],
        "primaryKey": ["eventID"]
      }
    },
    {
      "name": "occurrence",
      "path": "occurrence.csv",
      "profile": "tabular-data-resource",
      "format": "csv",
      "mediatype": "text/csv",
      "schema": {
        "fields": [
          {
            "name": "occurrenceID",
            "title": "Occurrence ID",
            "description": "An identifier for a dwc:Occurrence.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/occurrenceID",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/occurrenceID-2023-06-28"
          },
          {
            "name": "eventID",
            "title": "Event ID",
            "description": "An identifier for a dwc:Event.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/eventID",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/eventID-2023-06-28"
          },
          {
            "name": "scientificName",
            "title": "Scientific Name",
            "description": "A full scientific name, with authorship and date information if known. When forming part of a dwc:Identification, this should be the name in lowest level taxonomic rank that can be determined. This term should not contain identification qualifications, which should instead be supplied in dwc:verbatimIdentification.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/scientificName",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/scientificName-2023-06-28"
          },
          {
            "name": "organismQuantity",
            "title": "Organism Quantity",
            "description": "A number or enumeration value for the quantity of dwc:Organisms.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/organismQuantity",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/organismQuantity-2023-06-28"
          },
          {
            "name": "organismQuantityType",
            "title": "Organism Quantity Type",
            "description": "A type of quantification system used for the quantity of dwc:Organisms.",
            "type": "string",
            "format": "default",
            "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/organismQuantityType",
            "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/organismQuantityType-2023-06-28"
          }
        ],
        "primaryKey": ["occurrenceID"],
        "foreignKeys": [
          {
            "fields": "eventID",
            "predicate": "happened during",
            "reference": {
              "resource": "event",
              "fields": "eventID"
            }
          }
        ]
      }
    }
  ]
}
Together with an eml.xml file containing dataset-level metadata, the dataset would consist of the following files that could be zipped for easier transfer:
datapackage.json
eml.xml
event.csv
occurrence.csv
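To illustrate how the two tables relate, the following minimal Python sketch attaches each Occurrence to its parent Event through the shared eventID. The example rows are inlined as strings rather than read from disk, and the helper names read_rows and join_occurrences_to_events are hypothetical; only the standard library is used.

```python
import csv
import io

# The example tables from this section, inlined so the sketch is self-contained.
EVENT_CSV = """eventID,eventDate,locationID
S229876476,2025-04-26T20:57:00+02:00,https://ebird.org/hotspot/L43523233
"""

OCCURRENCE_CSV = """occurrenceID,eventID,scientificName,organismQuantity,organismQuantityType
1,S229876476,Apus apus,3,individuals
2,S229876476,Troglodytes troglodytes,1,individuals
3,S229876476,Turdus merula,1,individuals
4,S229876476,Erithacus rubecula,1,individuals
"""


def read_rows(text):
    """Parse CSV text with a header row into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


def join_occurrences_to_events(occurrences, events):
    """Add the parent event's eventDate to each occurrence row (hypothetical helper)."""
    events_by_id = {event["eventID"]: event for event in events}
    joined = []
    for occurrence in occurrences:
        event = events_by_id.get(occurrence["eventID"])
        joined.append({**occurrence, "eventDate": event["eventDate"] if event else None})
    return joined


joined = join_occurrences_to_events(read_rows(OCCURRENCE_CSV), read_rows(EVENT_CSV))
for row in joined:
    print(row["scientificName"], row["organismQuantity"], row["eventDate"])
```

All four occurrences resolve to the same event, so each printed row carries the same eventDate.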
3 DwC-DP content
- A DwC-DP MAY have an EML metadata file named eml.xml, which describes the scientific meaning, provenance, stewardship, and contextual interpretation of a dataset. A metadata file MUST follow the Ecological Metadata Language specification.
- A DwC-DP MUST have a JSON descriptor file named datapackage.json, which describes the structure and relational mechanics of the data in the dataset. A descriptor file MUST follow the Data Package specification. See section 3.1.
- A DwC-DP MUST have at least one resource file that represents a data package table and contains data for a dataset. See section 3.2. Resource files MAY be at the root level of the data package.
- A DwC-DP MAY have other resource files that do not represent data package tables. See section 3.2.3. Other resource files MAY be at the root level of the data package.
The entire contents of a data package MAY be compressed using gzip (only) and, if compressed, MUST have a file name that ends with .gz (e.g., example-dwc-dp.gz). When a compressed data package is decompressed, the metadata (eml.xml) and descriptor (datapackage.json) files MUST be at the root level of the data package and MUST NOT be individually compressed.
Whether or not the entire contents of a data package are compressed, resource files MAY be individually compressed using gzip (only). An individually compressed resource file MUST have a name that appends .gz to the name of the compressed file (e.g., event.csv becomes event.csv.gz).
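The naming rule for individually compressed resource files can be sketched in Python as follows. This is an illustration only: gzip_resource is a hypothetical helper, the demo file is written to a temporary directory, and the sketch does not cover compression of an entire data package.

```python
import gzip
import shutil
import tempfile
from pathlib import Path


def gzip_resource(path):
    """Compress a resource file with gzip, appending .gz to its name.

    Hypothetical helper: event.csv becomes event.csv.gz next to the original.
    The original file is left in place.
    """
    source = Path(path)
    target = source.with_name(source.name + ".gz")
    with source.open("rb") as src, gzip.open(target, "wb") as dst:
        shutil.copyfileobj(src, dst)
    return target


# Demo in a temporary directory so nothing in a real package is touched.
workdir = Path(tempfile.mkdtemp())
event_csv = workdir / "event.csv"
event_csv.write_text(
    "eventID,eventDate\nS229876476,2025-04-26T20:57:00+02:00\n", encoding="utf-8"
)
compressed = gzip_resource(event_csv)
print(compressed.name)
```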
3.1 Descriptor content
A DwC-DP descriptor file (named datapackage.json) contains a reference to the profile the dataset conforms to, a list of data files (resources) and (optionally) dataset-level metadata in addition to or instead of the metadata in an eml.xml file. The requirements for these elements of a descriptor file are described below.
All requirements and examples in this guide use version 1 of the Data Package specification, which is RECOMMENDED for DwC-DPs.
- The descriptor MUST have a profile property, with a URL referencing the profile the dataset conforms to. This MUST be a string representing the URL to a DwC-DP profile served from http://rs.tdwg.org. The URL MUST include the version of the profile (e.g., http://rs.tdwg.org/dwc-dp/1.0/dwc-dp-profile.json, where 1.0 is the version).
  (non-normative) The DwC-DP profile imports all Data Package requirements. A dataset that conforms to the DwC-DP profile will therefore also conform to the Data Package requirements. In other words: a DwC-DP is also a Data Package.
- The descriptor SHOULD have an id property, with an identifier for the dataset, preferably a DOI. The id property MUST follow the Data Package specification.
- The descriptor SHOULD have a created property, with a timestamp indicating when the dataset was created. The created property MUST follow the Data Package specification.
- The descriptor SHOULD have a version property, indicating the version of the dataset. The version property MUST follow the Data Package specification.
- The descriptor MUST have a resources property, with an array of data files that are considered part of a dataset. The resources property MUST follow the Data Package specification and MUST contain at least one data resource. See section 3.2 for details.
- The descriptor MAY have additional package-level properties. This includes dataset-level metadata defined by the Data Package specification (e.g., title, description, contributors, sources, licenses) or custom properties.
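The package-level requirements above can be approximated with a small Python check. This is a rough sketch, not a replacement for validating against the DwC-DP profile: check_descriptor is a hypothetical function, it only tests for the presence and basic shape of properties, and it does not verify that the profile URL contains a version.

```python
PROFILE_PREFIX = "http://rs.tdwg.org/"


def check_descriptor(descriptor):
    """Return (errors, warnings) for the package-level properties of a descriptor.

    Hypothetical helper: MUST-level problems go to errors, SHOULD-level ones to warnings.
    """
    errors, warnings = [], []

    profile = descriptor.get("profile")
    if not isinstance(profile, str) or not profile.startswith(PROFILE_PREFIX):
        errors.append("profile MUST be a URL to a DwC-DP profile served from http://rs.tdwg.org")

    resources = descriptor.get("resources")
    if not isinstance(resources, list) or len(resources) == 0:
        errors.append("resources MUST be an array with at least one data resource")

    for key in ("id", "created", "version"):
        if key not in descriptor:
            warnings.append(f"{key} SHOULD be present")

    return errors, warnings


# A deliberately sparse descriptor: valid on MUST-level checks, but missing SHOULD-level ones.
sparse = {
    "profile": "http://rs.tdwg.org/dwc-dp/1.0/dwc-dp-profile.json",
    "resources": [{"name": "event", "path": "event.csv"}],
}
errors, warnings = check_descriptor(sparse)
print(errors)
print(warnings)
```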
3.2 Resources
Each data file included in DwC-DP is a resource. Each resource MUST follow the Data Resource specification.
Of special interest are resources with data organized in tables that implement the Darwin Core Conceptual Model (DwC-CM). These resources/tables (hereafter referred to as “DwC-DP table files”) have additional requirements.
3.2.1 DwC-DP table files
A data file representing a DwC-DP table MUST be a delimited text file (hereafter referred to as “CSV file”, irrespective of the chosen dialect). Table files MUST follow RFC 4180, with the following exceptions:
- A table file MUST be encoded as UTF-8 or, when deviating from that encoding, the DwC-DP table MUST have an appropriate encoding property that MUST follow the Data Resource specification.
- When a table file deviates from RFC 4180 regarding dialect (e.g., line terminators, field delimiters, quote characters), the DwC-DP table MUST have a dialect property describing the dialect. That property MUST follow the CSV Dialect specification. Only dialect properties deviating from the default SHOULD be provided. If the CSV file follows all defaults, a dialect property SHOULD NOT be provided.
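As a sketch of how a consumer might honor these two properties, the Python snippet below decodes a table file with the declared encoding and applies the delimiter and quoteChar dialect properties, falling back to the defaults when they are absent. The read_table helper, the tab-delimited demo file, and its column names are illustrative assumptions; other dialect properties are ignored here.

```python
import csv
import io


def read_table(raw_bytes, resource):
    """Parse the bytes of a table file using the resource's encoding and dialect.

    Hypothetical helper: only the delimiter and quoteChar dialect properties are
    applied; anything not declared falls back to UTF-8 and the CSV defaults.
    """
    encoding = resource.get("encoding", "utf-8")
    dialect = resource.get("dialect", {})
    text = raw_bytes.decode(encoding)
    reader = csv.reader(
        io.StringIO(text, newline=""),
        delimiter=dialect.get("delimiter", ","),
        quotechar=dialect.get("quoteChar", '"'),
    )
    return list(reader)


# Illustrative resource: tab-delimited and not UTF-8, so both properties are declared.
resource = {
    "name": "event",
    "path": "event.tsv",
    "encoding": "iso-8859-1",
    "dialect": {"delimiter": "\t"},
}
raw = "eventID\tlocality\r\nE1\tMálaga\r\n".encode("iso-8859-1")
rows = read_table(raw, resource)
print(rows)
```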
3.2.2 DwC-DP table properties
- A DwC-DP table MUST have a name property, with the name of the table. The name property MUST follow the Data Resource specification and MUST be one of the reserved table names defined in the DwC-DP profile (e.g., event, occurrence). See section 4 for an overview.
- A DwC-DP table MUST have a path property, with the path to the data file. The path property MUST follow the Data Resource specification.
- A DwC-DP table MUST have a profile property, indicating the type of resource. The profile property MUST be the value tabular-data-resource, thereby indicating that it follows the Tabular Data Resource specification.
- A DwC-DP table SHOULD have a format property, indicating the standard file extension of the data file (e.g., csv, tsv). The format property MUST follow the Data Resource specification.
- A DwC-DP table MUST have a mediatype property, indicating the mediatype of the data file (e.g., text/csv). The mediatype property MUST follow the Data Resource specification and MUST be the value text/csv.
- A DwC-DP table MUST have a schema property, with a table schema describing the fields and relationships of the table. The schema property MUST follow the Data Resource specification, and MUST be an object representing the schema (not merely a string referencing it). See section 3.3 for details.
  (non-normative) By verbosely including the schema, a descriptor does not rely on externally hosted files (except for the DwC-DP profile) to describe the data it represents.
- A DwC-DP table MAY have additional properties. This includes those defined by the Data Resource specification (e.g., bytes, hash) or custom properties.
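One way to avoid omitting a required table property is to assemble the resource entry programmatically. The following Python sketch does that under stated assumptions: make_table_resource is a hypothetical helper, the schema passed in the demo is abbreviated to field names only, and the helper does not check that the name is a reserved table name.

```python
import json


def make_table_resource(name, path, schema, file_format="csv"):
    """Build a DwC-DP table entry with the properties listed above.

    Hypothetical helper: the schema must be an inline object, not a string
    referencing an external schema.
    """
    if not isinstance(schema, dict):
        raise TypeError("schema must be an object, not a reference to one")
    return {
        "name": name,
        "path": path,
        "profile": "tabular-data-resource",
        "format": file_format,
        "mediatype": "text/csv",
        "schema": schema,
    }


# Abbreviated schema for illustration; real field descriptors carry more properties.
event_schema = {"fields": [{"name": "eventID"}], "primaryKey": "eventID"}
event_resource = make_table_resource("event", "event.csv", event_schema)
print(json.dumps(event_resource, indent=2))
```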
3.2.3 Other resources
A DwC-DP MAY include other resources that do not represent a DwC-DP table. They MUST NOT have a name that is one of the reserved table names defined in the DwC-DP profile. See section 4 for an overview.
3.3 Table Schemas
A table schema describes the fields, relationships and missing values of a tabular data file. A table schema MUST follow the Table Schema specification.
Table schemas are provided at rs.tdwg.org for each DwC-DP table. See section 4. These table schemas include all possible fields, primary keys and foreign key relationships a table can have. Use these to select the fields and keys that are applicable to your data.
- A DwC-DP table schema MUST have a fields property, with an array of field descriptors describing the fields/columns in the data file. The fields property MUST follow the Table Schema specification. In addition, the order and number of elements in fields MUST be the order and number of fields in the CSV file. See section 3.4 for details.
- Each field in a DwC-DP table schema MUST be described with the field descriptor of the table schema provided at rs.tdwg.org for that table. For example, if you want to describe an eventID field in an event table, you MUST use the field descriptor for eventID in the table schema for event provided at rs.tdwg.org. Fields MUST NOT be misrepresented. Custom fields SHOULD NOT be added.
- A DwC-DP table schema SHOULD have a primaryKey property indicating the field(s) that act as primary keys. A primaryKey property MUST follow the Table Schema specification. The primaryKey property is REQUIRED if the field is referenced by another table. primaryKey values MUST be one or more of the primaryKey values defined in the table schema provided at rs.tdwg.org for that table (i.e., do not define primary keys not defined there). See section 3.3.1.
- A DwC-DP table schema SHOULD have a foreignKeys property with an array of relationships the table has with other tables. It MUST follow the Table Schema specification. If a table has a foreign key relationship with another table, then the foreignKeys property is REQUIRED and every relationship MUST be expressed therein. foreignKeys values MUST be one or more of the foreignKeys values defined in the table schema provided at rs.tdwg.org (i.e., do not define foreign key relationships not defined there). foreignKeys MAY have a predicate property to document relationship semantics. See section 3.3.1.
- A DwC-DP table schema MAY have a missingValues property, indicating what values should be interpreted as null. A missingValues property MUST follow the Table Schema specification.
- A DwC-DP table schema MAY have custom properties.
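Selecting the applicable fields and keys from a reference table schema can be sketched in Python as below. The select_schema helper is hypothetical, and the reference schema in the demo is abbreviated to field names (the schemas at rs.tdwg.org carry full field descriptors). The sketch keeps a foreign key whenever its source fields are present and does not check whether the referenced table is part of the package.

```python
def as_list(value):
    """Table Schema allows key fields as a string or an array; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def select_schema(reference_schema, header):
    """Build a table schema for a CSV header from a reference schema (hypothetical helper).

    Field descriptors are copied unchanged and ordered like the CSV columns.
    """
    by_name = {field["name"]: field for field in reference_schema["fields"]}
    unknown = [column for column in header if column not in by_name]
    if unknown:
        raise ValueError(f"columns without a reference field descriptor: {unknown}")

    schema = {"fields": [by_name[column] for column in header]}
    present = set(header)

    primary_key = reference_schema.get("primaryKey")
    if primary_key and set(as_list(primary_key)) <= present:
        schema["primaryKey"] = primary_key

    foreign_keys = [
        fk for fk in reference_schema.get("foreignKeys", [])
        if set(as_list(fk["fields"])) <= present
    ]
    if foreign_keys:
        schema["foreignKeys"] = foreign_keys
    return schema


# Abbreviated, illustrative reference schema for an event table.
reference = {
    "fields": [{"name": "eventID"}, {"name": "parentEventID"},
               {"name": "eventDate"}, {"name": "locationID"}],
    "primaryKey": "eventID",
    "foreignKeys": [{"fields": "parentEventID",
                     "reference": {"resource": "", "fields": "eventID"}}],
}
selected = select_schema(reference, ["eventID", "eventDate", "locationID"])
print([field["name"] for field in selected["fields"]])
```

Because parentEventID is not among the CSV columns in this demo, the self-referencing foreign key is left out of the selected schema.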
3.3.1 Relationships example (non-normative)
Consider an event table with the following table schema:
{
  "fields": [],
  "primaryKey": "eventID",
  "foreignKeys": [
    {
      "fields": "eventConductedByID",
      "predicate": "conducted by",
      "reference": {
        "resource": "agent",
        "fields": "agentID"
      }
    },
    {
      "fields": "parentEventID",
      "predicate": "happened during",
      "reference": {
        "resource": "",
        "fields": "eventID"
      }
    }
  ]
}
For brevity, let’s name fields as table_name.field_name (e.g., event.eventID refers to the eventID field in the event table). The above schema expresses:
- A relationship between the event and agent tables. For each value in event.eventConductedByID a corresponding value is expected in agent.agentID, linking those records.
- A relationship between the event table and itself. For each value in event.parentEventID a corresponding value is expected in event.eventID, linking those records.
- Since event.eventID is the target of a foreign key relationship, it must be a primary key.
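The relationships above can be checked against actual rows with a short Python sketch. It assumes that an empty "resource" refers to the table itself (as in the schema above) and, as a further assumption of this sketch, that empty values express no link. The check_foreign_keys helper and the demo rows are hypothetical.

```python
def as_list(value):
    """Key fields may be a string or an array; normalize to a list."""
    return [value] if isinstance(value, str) else list(value)


def check_foreign_keys(tables, schemas):
    """Return unresolved links as (table, source_fields, value, target_table) tuples.

    tables maps table name to a list of row dicts; schemas maps table name to its
    table schema. Hypothetical helper for illustration.
    """
    problems = []
    for name, schema in schemas.items():
        for fk in schema.get("foreignKeys", []):
            target = fk["reference"]["resource"] or name  # "" means the same table
            source_fields = as_list(fk["fields"])
            target_fields = as_list(fk["reference"]["fields"])
            existing = {
                tuple(row.get(field) for field in target_fields)
                for row in tables.get(target, [])
            }
            for row in tables[name]:
                key = tuple(row.get(field) for field in source_fields)
                if all(value in (None, "") for value in key):
                    continue  # assumption: empty values express no link
                if key not in existing:
                    problems.append((name, source_fields, key, target))
    return problems


schemas = {
    "event": {
        "fields": [],
        "primaryKey": "eventID",
        "foreignKeys": [
            {"fields": "eventConductedByID",
             "reference": {"resource": "agent", "fields": "agentID"}},
            {"fields": "parentEventID",
             "reference": {"resource": "", "fields": "eventID"}},
        ],
    },
    "agent": {"fields": [], "primaryKey": "agentID"},
}
tables = {
    "event": [
        {"eventID": "E1", "parentEventID": "", "eventConductedByID": "A1"},
        {"eventID": "E2", "parentEventID": "E1", "eventConductedByID": "A1"},
        {"eventID": "E3", "parentEventID": "E9", "eventConductedByID": ""},
    ],
    "agent": [{"agentID": "A1"}],
}
problems = check_foreign_keys(tables, schemas)
print(problems)
```

In the demo data, only the parentEventID value E9 has no matching event.eventID, so it is the single reported problem.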
3.4 Field descriptors
A field descriptor describes a single field in a table schema (e.g., its name, description, format, constraints).
- A field descriptor MUST have a name property, with the name of the field (e.g., eventID). A name property MUST follow the Table Schema specification and SHOULD correspond to the name of the field/column in the data file (if a header is present).
- A field descriptor MUST have a title property, with the label of the field (e.g., Event ID). A title property MUST follow the Table Schema specification.
- A field descriptor MUST have a description property, with a human-readable description of the field, such as a Darwin Core definition. A description property MUST follow the Table Schema specification. The definition of a field MAY differ (in this context) from the original definition for the corresponding term found in the dcterms:isVersionOf property.
- A field descriptor MAY have a comments property, with context-specific usage notes.
- A field descriptor MAY have an examples property, with context-specific examples of content appropriate for the field.
- A field descriptor MUST have a type property, indicating the data type of values in the field (e.g., string, number). A type property MUST follow the Table Schema specification.
- A field descriptor SHOULD have a format property, indicating how values should be parsed. A format property MUST follow the Table Schema specification.
- A field descriptor MAY have a namespace property, with an abbreviation of the namespace of the source term (e.g., dwc, dcterms).
- A field descriptor MUST have a dcterms:isVersionOf property, with the URL of the unversioned source term the field is based on (e.g., http://rs.tdwg.org/dwc/terms/eventID).
- A field descriptor MAY have a dcterms:references property, with the URL of the version of the source term the field is based on (e.g., http://rs.tdwg.org/dwc/terms/version/eventID-2023-06-28).
- A field descriptor MAY have an rdfs:comment property, which MUST contain the canonical definition of the source term found in the dcterms:references property.
- A field descriptor MAY have a constraints property, indicating value requirements that SHOULD be used in validation. A constraints property MUST follow the Table Schema specification.
- A field descriptor MAY have additional properties, including optional properties defined by the Table Schema specification or custom properties.
(non-normative) You will be guaranteed to meet the requirements for field descriptors by copying field descriptors directly from the table schemas provided at rs.tdwg.org.
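For descriptors that are assembled by hand instead, a presence check along the following lines can catch omissions early. This Python sketch only tests whether the MUST-level and SHOULD-level properties listed above exist. It does not validate their values, and check_field_descriptor is a hypothetical helper.

```python
REQUIRED = ("name", "title", "description", "type", "dcterms:isVersionOf")
RECOMMENDED = ("format",)


def check_field_descriptor(field):
    """Return (missing_required, missing_recommended) property names (hypothetical helper)."""
    missing_required = [key for key in REQUIRED if key not in field]
    missing_recommended = [key for key in RECOMMENDED if key not in field]
    return missing_required, missing_recommended


# The eventID field descriptor from the example in section 2.
event_id = {
    "name": "eventID",
    "title": "Event ID",
    "description": "An identifier for a dwc:Event.",
    "type": "string",
    "format": "default",
    "dcterms:isVersionOf": "http://rs.tdwg.org/dwc/terms/eventID",
    "dcterms:references": "http://rs.tdwg.org/dwc/terms/version/eventID-2023-06-28",
}
print(check_field_descriptor(event_id))

# An incomplete descriptor, for contrast.
incomplete = {"name": "eventDate", "type": "string"}
print(check_field_descriptor(incomplete))
```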
4 DwC-DP tables (non-normative)
- Reserved table names: see the enum values for dwc-dp-resource-names in the Darwin Core Profile at http://rs.tdwg.org/dwc-dp/1.0/dwc-dp-profile.json
- Table schemas: see the tableSchemas at http://rs.tdwg.org/dwc-dp/1.0/table-schemas