Create a new Metadata Block

By default there are six different metadata blocks available in the BSC Dataverse, as described in the User Guide here.

If you think your BSC department or group needs a specific new metadata block, contact the BSC Data Management team with the information described below and we will create the new metadata block for you.

Step by step instructions

1) Download this spreadsheet.

2) Populate the spreadsheet (TSV format) with the information for the new metadata block that you want to request.

3) Once the spreadsheet is fully populated, send it to the BSC Data Management team. Please make sure you send the spreadsheet as a TSV file.

How to populate the spreadsheet?

You can find all the relevant information about metadata customization in the original documentation (Admin Guide).

Below, you will find the information related to the Metadata Block TSV. For your convenience, you can download a real example with the Citation metadataBlock here.

The spreadsheet is divided into three sections:

  • metadataBlock
  • datasetField
  • controlledVocabulary

Here we list and describe the purposes of each section and property of the metadata block TSV.

1) metadataBlock

* Purpose: Represents the metadata block being defined.
* Cardinality: 
    * 0 or more per Dataverse installation
    * 1 per Metadata Block definition

2) datasetField

* Purpose: Each entry represents a metadata field to be defined within a metadata block.
* Cardinality: 1 or more per metadataBlock

3) controlledVocabulary

* Purpose: Each entry enumerates an allowed value for a given datasetField.
* Cardinality: zero or more per datasetField
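
To make the three-section layout concrete, here is a minimal Python sketch that splits a metadata block TSV into its sections and checks the cardinalities listed above. It assumes the layout used in the Citation example: each section starts with a header row whose first cell is #metadataBlock, #datasetField, or #controlledVocabulary, and the data rows that follow leave that first cell empty. The function names and the sample block (demoBlock, demoTopic) are hypothetical, and the sample omits most columns for brevity.

```python
import csv
import io

SECTIONS = ("#metadataBlock", "#datasetField", "#controlledVocabulary")


def split_sections(tsv_text):
    """Return {section marker: [row dicts]} for a metadata block TSV.

    Assumption: each section starts with a header row whose first cell is
    the section marker, and data rows leave the first cell empty.
    """
    sections = {name: [] for name in SECTIONS}
    current, header = None, []
    for row in csv.reader(io.StringIO(tsv_text), delimiter="\t"):
        if not any(cell.strip() for cell in row):
            continue  # skip blank lines
        if row[0] in SECTIONS:
            current, header = row[0], row[1:]
            continue
        if current is None:
            raise ValueError("data row found before any section header")
        # Pad short rows so every header column gets a value.
        values = row[1:] + [""] * (len(header) - len(row[1:]))
        sections[current].append(dict(zip(header, values)))
    return sections


def check_cardinality(sections):
    """Collect violations of the cardinalities described above."""
    problems = []
    if len(sections["#metadataBlock"]) != 1:
        problems.append("expected exactly 1 metadataBlock row")
    if len(sections["#datasetField"]) < 1:
        problems.append("expected at least 1 datasetField row")
    return problems


# Hypothetical block, for illustration only (most columns omitted).
SAMPLE = (
    "#metadataBlock\tname\tdataverseAlias\tdisplayName\n"
    "\tdemoBlock\t\tDemo Metadata\n"
    "#datasetField\tname\ttitle\tfieldType\n"
    "\tdemoTopic\tTopic\ttext\n"
    "#controlledVocabulary\tDatasetField\tValue\tidentifier\tdisplayOrder\n"
    "\tdemoTopic\tClimate\t\t0\n"
)

parsed = split_sections(SAMPLE)
print(check_cardinality(parsed))  # prints []
```

Running a check like this before sending the file can catch a missing section or a spreadsheet that was exported with the wrong delimiter.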


Each of the three main sections has its own set of properties:

metadataBlock properties

name
* Purpose: A user-definable string used to identify a #metadataBlock.
* Allowed values and restrictions:
    * No spaces or punctuation, except underscore.
    * By convention, should start with a letter, and use lower camel case.
    * Must not collide with a field of the same name in the same or any other #datasetField definition, including metadata blocks defined elsewhere.

dataverseAlias
* Purpose: If specified, this metadata block will be available only to the Dataverse collection designated here by its alias and to children of that Dataverse collection.
* Allowed values and restrictions: Free text.

displayName
* Purpose: Acts as a brief label for display related to this #metadataBlock.
* Allowed values and restrictions: Should be relatively brief. The limit is 256 characters, but very long names might cause display problems.

displayFacet
* Purpose: Label displayed in the search area when this #metadataBlock is configured as a search facet for a collection.
* Allowed values and restrictions: Should be brief. Long names will cause display problems in the search area.

blockURI
* Purpose: Associates the properties in a block with an external URI. Properties will be assigned the global identifier blockURI<name> in the OAI-ORE metadata and archival Bags.
* Allowed values and restrictions: The citation #metadataBlock has the blockURI https://dataverse.org/schema/citation/ which assigns a default global URI to terms such as https://dataverse.org/schema/citation/subtitle.
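
The naming rules for a metadata block can be checked mechanically before the spreadsheet is sent. The following Python sketch is one possible reading of them: characters other than letters, digits, and underscore are treated as errors, a name that does not start with a lowercase letter only produces a warning (it is a convention, not a hard rule), and displayName is limited to 256 characters. Restricting names to ASCII is an assumption of this sketch, and check_metadata_block is a hypothetical helper, not part of Dataverse.

```python
import re

# Letters, digits and underscore only (ASCII-only is an assumption here).
ALLOWED_CHARS = re.compile(r"[A-Za-z0-9_]+")
MAX_DISPLAY_NAME = 256


def check_metadata_block(name, display_name):
    """Return (errors, warnings) for a metadataBlock name and displayName."""
    errors, warnings = [], []
    if not ALLOWED_CHARS.fullmatch(name):
        errors.append("name: no spaces or punctuation, except underscore")
    elif not name[0].islower():
        # Lower camel case is a convention, so this is only a warning.
        warnings.append("name: by convention, start with a lowercase letter")
    if len(display_name) > MAX_DISPLAY_NAME:
        errors.append("displayName: longer than 256 characters")
    return errors, warnings


print(check_metadata_block("bscClimate", "BSC Climate Metadata"))
print(check_metadata_block("bsc climate", "BSC Climate Metadata"))
```

The first call reports no problems; the second reports an error because the name contains a space.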

datasetField (field) properties

name
* Purpose: A user-definable string used to identify a #datasetField. Maps directly to the field name used by Solr.
* Allowed values and restrictions:
    * (from DatasetFieldType.java) The internal DDI-like name, no spaces, etc.
    * (from Solr) Field names should consist of alphanumeric or underscore characters only and not start with a digit. This is not currently strictly enforced, but other field names will not have first-class support from all components and back compatibility is not guaranteed. Names with both leading and trailing underscores (e.g. _version_) are reserved.
    * Must not collide with a field of the same name in another #metadataBlock or any name already included as a field in the Solr index.

title
* Purpose: Acts as a brief label for display related to this #datasetField.
* Allowed values and restrictions: Should be relatively brief.

description
* Purpose: Used to provide a description of the field.
* Allowed values and restrictions: Free text.

watermark
* Purpose: A string to initially display in a field as a prompt for what the user should enter.
* Allowed values and restrictions: Free text.

fieldType
* Purpose: Defines the type of content that the field, if not empty, is meant to contain.
* Allowed values and restrictions: One of the following (see below for fieldType definitions):
    * none
    * date
    * email
    * text
    * textbox
    * string
    * url
    * int
    * float

displayOrder
* Purpose: Controls the sequence in which the fields are displayed, both for input and for presentation.
* Allowed values and restrictions: Non-negative integer.

displayFormat
* Purpose: Controls how the content is displayed for presentation (not entry). The value may contain special variables (see below). HTML tags may be used in conjunction with these variables to control web UI display.
* Allowed values and restrictions: See below for displayFormat variables.

advancedSearchField
* Purpose: Specify whether this field is available in advanced search.
* Allowed values and restrictions: TRUE (available) or FALSE (not available).

allowControlledVocabulary
* Purpose: Specify whether the possible values of this field are determined by values in the #controlledVocabulary section.
* Allowed values and restrictions: TRUE (controlled) or FALSE (not controlled).

allowMultiples
* Purpose: Specify whether this field is repeatable.
* Allowed values and restrictions: TRUE (repeatable) or FALSE (not repeatable).

facetable
* Purpose: Specify whether the field is facetable (i.e., if the expected values are useful search terms for this field).
* Allowed values and restrictions: TRUE (facetable) or FALSE (not facetable).
    * Recommended for enumerated/controlled vocabulary fields, identifiers (IDs, names, email addresses), and other fields likely to share values across entries.
    * Not recommended for descriptions, floating point numbers, or unique values.

displayOnCreate
* Purpose: Designate fields that should display during the creation of a new dataset, even before the dataset is saved.
* Allowed values and restrictions: TRUE (display during creation) or FALSE (don’t display during creation).

required
* Purpose:
    * For primitive fields: specify whether the field is required.
    * For compound fields: specify if one or more subfields are required or conditionally required.
* Allowed values and restrictions:
    * Primitive fields: TRUE (required) or FALSE (optional).
    * Compound fields, to make one or more subfields optional: parent and subfields = FALSE.
    * Compound fields, to make one or more subfields required: parent and subfields = TRUE.
    * Compound fields, to make one or more subfields conditionally required: parent = FALSE, required subfields = TRUE.
    * At least one instance of a required field must be present. More than one instance may be allowed depending on allowMultiples.

parent
* Purpose: For subfields, specify the name of the parent/containing field.
* Allowed values and restrictions:
    * Must not result in a cyclical reference.
    * Must reference an existing field in the same #metadataBlock.

metadatablock_id
* Purpose: Specify the name of the #metadataBlock that contains this field.
* Allowed values and restrictions:
    * Must reference an existing #metadataBlock.
    * Best practice: reference the block in the current definition (though it is technically possible to reference another existing block).

termURI
* Purpose: Specify a global URI identifying this term in an external community vocabulary. Overrides the default created by appending the property name to the blockURI of the #metadataBlock.
* Allowed values and restrictions: Example: the citation #metadataBlock defines the property title as http://purl.org/dc/terms/title, meaning it can be interpreted as the Dublin Core term title.
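
Several of the datasetField restrictions above are simple enough to check with a script. The Python sketch below validates one row, represented as a dict keyed by property name, and classifies a subfield according to the required rules for compound fields. It is not how Dataverse itself validates a block: check_dataset_field and subfield_requirement are hypothetical helpers, the column names follow the property names used in this guide (the header row of the downloaded spreadsheet may spell them differently), and treating a subfield with required = FALSE as optional even when its parent is TRUE is an assumption, since that combination is not spelled out above.

```python
import re

FIELD_TYPES = {"none", "date", "email", "text", "textbox",
               "string", "url", "int", "float"}
# Assumption: column names match the property names in this guide.
BOOLEAN_COLUMNS = ("advancedSearchField", "allowControlledVocabulary",
                   "allowMultiples", "facetable", "displayOnCreate",
                   "required")
# Alphanumeric or underscore characters only, not starting with a digit.
SOLR_NAME = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_dataset_field(row):
    """Return a list of problems found in one datasetField row (a dict)."""
    problems = []
    name = row.get("name", "")
    if not SOLR_NAME.fullmatch(name):
        problems.append("name: use letters, digits and underscores only, "
                        "and do not start with a digit")
    elif name.startswith("_") and name.endswith("_"):
        problems.append("name: leading and trailing underscores are reserved")
    if row.get("fieldType") not in FIELD_TYPES:
        problems.append("fieldType: not one of the allowed values")
    if not re.fullmatch(r"[0-9]+", row.get("displayOrder", "")):
        problems.append("displayOrder: must be a non-negative integer")
    for column in BOOLEAN_COLUMNS:
        if row.get(column) not in ("TRUE", "FALSE"):
            problems.append(f"{column}: must be TRUE or FALSE")
    return problems


def subfield_requirement(parent_required, subfield_required):
    """Classify a subfield of a compound field from the two required flags."""
    if not subfield_required:
        return "optional"  # assumption when the parent is TRUE
    return "required" if parent_required else "conditionally required"


# Hypothetical field, for illustration only.
row = {
    "name": "demoTopic", "title": "Topic", "fieldType": "text",
    "displayOrder": "0", "advancedSearchField": "TRUE",
    "allowControlledVocabulary": "FALSE", "allowMultiples": "FALSE",
    "facetable": "TRUE", "displayOnCreate": "TRUE", "required": "FALSE",
}
print(check_dataset_field(row))            # prints []
print(subfield_requirement(False, True))   # prints conditionally required
```

Checks that need the whole file, such as name collisions across blocks or cyclical parent references, are deliberately left out of this sketch.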

controlledVocabulary (enumerated) properties

DatasetField
* Purpose: Specifies the #datasetField to which this entry applies.
* Allowed values and restrictions:
    * Must reference an existing #datasetField.
    * Best practice: reference a #datasetField in the current metadata block definition (though technically possible to reference a field from another metadata block).

Value
* Purpose: A short display string, representing an enumerated value for this field. If the identifier property is empty, this value is used as the identifier.
* Allowed values and restrictions: Free text. For booleans, recommended values are True and False; Unknown can be added if needed.

identifier
* Purpose: A string used to encode the selected enumerated value of a field. If empty, the value of the “Value” field is used as the identifier.
* Allowed values and restrictions: Free text.

displayOrder
* Purpose: Controls the order in which the enumerated values are displayed for selection. New values don’t need to be added at the end; existing values can be renumbered to update the display order.
* Allowed values and restrictions: Non-negative integer.
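
The interplay between Value, identifier, and displayOrder can be illustrated with a short Python sketch. It selects the entries for one field, sorts them by displayOrder, and falls back to Value when identifier is empty, as described above. The helper resolve_vocabulary and the field demoIsPublic are hypothetical, and the rows are assumed to be dicts keyed by the property names of this section.

```python
def resolve_vocabulary(rows, field_name):
    """Return (identifier, display value) pairs for one field, in display order."""
    entries = [r for r in rows if r["DatasetField"] == field_name]
    entries.sort(key=lambda r: int(r["displayOrder"]))
    # An empty identifier falls back to the display value.
    return [(r["identifier"] or r["Value"], r["Value"]) for r in entries]


# Hypothetical boolean-style vocabulary, for illustration only.
rows = [
    {"DatasetField": "demoIsPublic", "Value": "Unknown", "identifier": "", "displayOrder": "2"},
    {"DatasetField": "demoIsPublic", "Value": "True", "identifier": "yes", "displayOrder": "0"},
    {"DatasetField": "demoIsPublic", "Value": "False", "identifier": "", "displayOrder": "1"},
    {"DatasetField": "otherField", "Value": "X", "identifier": "", "displayOrder": "0"},
]

print(resolve_vocabulary(rows, "demoIsPublic"))
# prints [('yes', 'True'), ('False', 'False'), ('Unknown', 'Unknown')]
```

Because the order comes from displayOrder and not from the row position, rows can be listed in any order in the spreadsheet.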

FieldType definitions

* none: Used for compound fields, in which case the parent field has no value and displays no data entry control.
* date: A date, expressed in one of three resolutions: YYYY-MM-DD, YYYY-MM, or YYYY.
* email: A valid email address. Not indexed for privacy reasons.
* text: Any text other than newlines may be entered. May also be used to define a boolean (see “Value” under #controlledVocabulary properties).
* textbox: Any text may be entered. Input is presented as a multi-line area that accepts newlines. Any HTML is permitted, but only a subset of tags will be rendered in the UI. (See Supported HTML Tags section.)
* string: Any text may be entered. The value is stored and indexed exactly as provided, with no analysis or transformations.
* url: If not empty, the field must contain a valid URL.
* int: An integer value destined for a numeric field.
* float: A floating-point number destined for a numeric field.
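
As an illustration of what these definitions mean for the values users will later enter, the Python sketch below checks a value against three of the types: date (the three resolutions listed above), text (no newlines), and int. It is only an approximation of the definitions in this section, not the validation Dataverse performs; email, url, and float are left unchecked because their exact rules are not given here, and both helper names are hypothetical.

```python
import re
from datetime import datetime

# Map the length of a well-formed date string to its strptime format.
DATE_FORMATS = {4: "%Y", 7: "%Y-%m", 10: "%Y-%m-%d"}
DATE_PATTERN = re.compile(r"[0-9]{4}(-[0-9]{2}(-[0-9]{2})?)?")


def is_valid_date(value):
    """Accept YYYY, YYYY-MM or YYYY-MM-DD that name a real calendar date."""
    if not DATE_PATTERN.fullmatch(value):
        return False
    try:
        datetime.strptime(value, DATE_FORMATS[len(value)])
    except ValueError:
        return False  # e.g. month 13 or 30 February
    return True


def check_value(field_type, value):
    """Return True if value looks acceptable for field_type (partial check)."""
    if value == "":
        return True  # types only constrain non-empty fields
    if field_type == "date":
        return is_valid_date(value)
    if field_type == "text":
        return "\n" not in value and "\r" not in value
    if field_type == "int":
        return re.fullmatch(r"[+-]?[0-9]+", value) is not None
    return True  # other types are not checked in this sketch


print(check_value("date", "2024-02"))     # prints True
print(check_value("date", "2024-02-30"))  # prints False
print(check_value("text", "two\nlines"))  # prints False
```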

displayFormat variables

These are common ways to use the displayFormat to control how values are displayed in the UI. This list is not exhaustive.

(blank)
* Description: The displayFormat is left blank for primitive fields (e.g., subtitle) and fields that do not take values (e.g., author), since displayFormats do not work for these fields.

#VALUE
* Description: The value of the field (instance level).

#NAME
* Description: The name of the field (class level).

#EMAIL
* Description: For displaying emails.

<a href="#VALUE">#VALUE</a>
* Description: Displays the value as a link (if the value entered is a link).

<a href='URL/#VALUE'>#VALUE</a>
* Description: Displays the value as a link, with the value included in the URL.
* Example: if the URL is http://emsearch.rutgers.edu/atlas/#VALUE_summary.html and the value entered is 1001, the field is displayed as 1001 hyperlinked to http://emsearch.rutgers.edu/atlas/1001_summary.html.

<img src="#VALUE" alt="#NAME" class="metadata-logo"/><br/>
* Description: Displays an image from an entered image URL. Used to display images in producer and distributor logos metadata fields.

#VALUE:
- #VALUE:
(#VALUE)
* Description: Appends and/or prepends characters to the field value.
* Example: if the displayFormat for distributorAffiliation is (#VALUE) and the value entered is University of North Carolina, the UI will display (University of North Carolina).

;
:
,
* Description: Displays the given character (semicolon, colon, comma, etc.) between values of fields in a compound field.
* Example: if the displayFormat for the compound field series is :, and the values entered are seriesName = IMPs and seriesInformation = A collection of NMR data, the UI displays IMPs: A collection of NMR data.
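
The effect of these variables can be previewed with plain string substitution. The Python sketch below reproduces the examples above; it is a simplified model, not the rendering code Dataverse uses. The helpers render_display_format and join_compound are hypothetical, #EMAIL is not modelled, no HTML escaping is performed, and the space inserted after the separator in a compound field is inferred from the series example.

```python
import re

TOKEN = re.compile(r"#(VALUE|NAME)")


def render_display_format(display_format, name, value):
    """Substitute #VALUE and #NAME in a displayFormat (no HTML escaping)."""
    if not display_format:
        return value  # a blank displayFormat shows the value unchanged
    replacements = {"VALUE": value, "NAME": name}
    return TOKEN.sub(lambda match: replacements[match.group(1)], display_format)


def join_compound(separator, values):
    """Join non-empty subfield values with the separator and a space."""
    return (separator + " ").join(v for v in values if v)


print(render_display_format("(#VALUE)", "distributorAffiliation",
                            "University of North Carolina"))
# prints (University of North Carolina)

link = "<a href='http://emsearch.rutgers.edu/atlas/#VALUE_summary.html'>#VALUE</a>"
print(render_display_format(link, "demoField", "1001"))
# prints <a href='http://emsearch.rutgers.edu/atlas/1001_summary.html'>1001</a>

print(join_compound(":", ["IMPs", "A collection of NMR data"]))
# prints IMPs: A collection of NMR data
```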