Datasets

⚠️ Check out the original documentation from the Dataverse Project related to dataset management.

A dataset in a Dataverse installation is a container for your data, documentation, code, and the metadata describing them.

Figure 1: Dataset structure.

A dataset contains three levels of metadata:

  • Citation metadata: any metadata that would be needed for generating a data citation and other general metadata that could be applied to any dataset;
  • Domain specific metadata: with specific support currently for Social Science, Life Science, Geospatial, and Astronomy datasets; and
  • File-level metadata: varies depending on the type of data file (for more details, see the File management section below).

Create a dataset

  1. Navigate to the Dataverse collection in which you want to add a dataset.

  2. Click on the “Add Data” button and select “New Dataset” in the dropdown menu.

💡If you start from the root Dataverse collection or your My Data page, or if you click the “Add Data” link in the navbar, the dataset you create will be hosted in the root Dataverse collection. To change this, select another Dataverse collection in which you have permission to create datasets from the Host Dataverse collection dropdown in the create dataset form. This choice is no longer available once the dataset has been created.
Figure 2: Dataset creation button.
  3. To get started quickly, fill in at least the required fields, which are marked with an asterisk (e.g., the Dataset Title, Author Name, Description Text, Point of Contact Email, and Subject), to get a Data Citation with a permanent link.

  4. Scroll down to the “Files” section and click on “Select Files to Add” to add all the relevant files to your Dataset. You can also upload your files directly from your Dropbox.

💡You can drag and drop or select multiple files at a time from your desktop directly into the upload widget. Your files will appear below the “Select Files to Add” button, where you can add a description and tags (via the “Edit Tag” button) for each file. Additionally, an MD5 checksum will be added for each file. If you upload a tabular file, a Universal Numerical Fingerprint (UNF) will also be added to it.
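
The MD5 checksum mentioned above can be used to confirm that an uploaded file matches your local copy. The following is a minimal Python sketch, not part of the Dataverse software: the function names md5_of_file and matches_checksum are hypothetical, and it assumes you copy the checksum string shown for the file on the dataset page.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of a file, reading it in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_md5):
    """Compare a local file against a checksum copied from the dataset page (assumed input)."""
    return md5_of_file(path) == expected_md5.strip().lower()
```

If matches_checksum returns False, the local file and the uploaded file differ, for example because the file was modified after upload.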
  5. Click the “Save Dataset” button when you are done. Your unpublished dataset is now created.

  6. Once the dataset is created, add further metadata by clicking the Edit button and selecting Metadata from the dropdown menu.
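
The steps above use the web interface. If you prefer to script dataset creation, the same required fields can be expressed as a JSON document. The sketch below builds such a document in Python. It assumes the citation metadata block layout used by the Dataverse native API (field names such as authorName and dsDescriptionValue); the helper names are hypothetical, and you should confirm the field names against the original Dataverse documentation for your installation's version.

```python
import json


def primitive(type_name, value):
    """A single-valued text field."""
    return {"typeName": type_name, "typeClass": "primitive",
            "multiple": False, "value": value}


def compound(type_name, children):
    """A repeatable field holding one group of child fields."""
    group = {child["typeName"]: child for child in children}
    return {"typeName": type_name, "typeClass": "compound",
            "multiple": True, "value": [group]}


def build_dataset_payload(title, author_name, description, contact_email, subjects):
    """Build the JSON body for the five required fields listed in step 3."""
    fields = [
        primitive("title", title),
        compound("author", [primitive("authorName", author_name)]),
        compound("datasetContact",
                 [primitive("datasetContactEmail", contact_email)]),
        compound("dsDescription",
                 [primitive("dsDescriptionValue", description)]),
        {"typeName": "subject", "typeClass": "controlledVocabulary",
         "multiple": True, "value": list(subjects)},
    ]
    return {"datasetVersion": {"metadataBlocks": {"citation": {
        "displayName": "Citation Metadata", "fields": fields}}}}


def missing_required_fields(payload):
    """List required top-level fields that are absent or have an empty value."""
    required = {"title", "author", "datasetContact", "dsDescription", "subject"}
    fields = payload["datasetVersion"]["metadataBlocks"]["citation"]["fields"]
    present = {field["typeName"] for field in fields if field["value"]}
    return sorted(required - present)


if __name__ == "__main__":
    # Placeholder values for illustration only.
    payload = build_dataset_payload(
        "Example title", "Doe, Jane", "Example description",
        "jane.doe@example.org", ["Physics"])
    print(json.dumps(payload, indent=2))
```

A payload like this would typically be sent with an API token to the collection's dataset creation endpoint of the native API. The exact URL and token handling depend on your installation, so treat them as something to look up rather than as given here.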

Edit a dataset

Once you’ve created a dataset, you can edit it at several levels to keep it up to date and well-organized. Here’s how you can make changes and what each option means for managing your dataset:

Figure 3: Dataset edit button.
  • Files (Upload): You can add new files to your dataset, replace outdated ones, or remove files that are no longer needed. This helps ensure your dataset reflects the most current version of your research.

  • Metadata: If you need to update details like the title, description, or authors of your dataset, you can do so by editing the metadata. This allows you to keep your dataset accurate and up-to-date.
    Editing a dataset after it has been created also reveals additional metadata fields that were not shown in the creation form. These fields let you describe your dataset in more detail. If a specific domain (e.g., Social Science or Astronomy) was selected when the Dataverse collection was created, the additional fields will include domain-specific options. Filling them in makes your dataset more complete and easier for others to find and understand.

  • Terms: You can revise the terms of use for your dataset at any time. This includes setting or updating restrictions, licenses, and disclaimers to clarify how others can access and use your data. See detailed documentation here.

⚠️ Datasets should not be published under CC0. By default, make sure to set the License/Data Use Agreement field to CC-BY 4.0. The BSC IPR team will review the licensing of your dataset, and some changes may be made if necessary.
Figure 4: Dataset Terms, CC-BY License.
  • Permissions: Permissions allow you to control who can access or modify your dataset. For instance, you can grant collaborators editing privileges, restrict file access, or make the dataset publicly available. Read more about managing permissions in the Dataset-level section in the original documentation.

  • Private URL: If you need to share your dataset privately before publishing it, you can generate a private URL. This link lets others view or download the dataset even if it isn’t publicly accessible yet. Learn how to create and use private links in the Dataverse Documentation.

  • Thumbnails and Widgets: Make your dataset stand out with a thumbnail image or embed widgets to share it dynamically. A thumbnail adds a visual identifier for your dataset, while widgets allow for live previews or sharing functionality. For setup instructions, see the Dataverse Thumbnails + Widgets section.

  • Delete Dataset: If a dataset is no longer needed, you can delete it. However, keep in mind that deleting a dataset is permanent and can only be done by users with the right permissions. Before deleting, make sure it is no longer required for ongoing projects or collaborations.
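
The licensing rule in the Terms item above (no CC0, CC-BY 4.0 by default) can be checked automatically if you prepare dataset metadata as JSON before uploading it. The following Python sketch is an illustration under stated assumptions: it assumes the license is stored under datasetVersion.license as either a plain name or an object with name and uri keys, and the helper names and the CC_BY_4 constant are hypothetical. It does not replace the review by the BSC IPR team.

```python
# Assumed representation of the default license; check the name and URI
# against the license list configured in your installation.
CC_BY_4 = {"name": "CC BY 4.0",
           "uri": "http://creativecommons.org/licenses/by/4.0"}


def is_cc0(license_entry):
    """Return True if the license entry names a CC0 license."""
    if isinstance(license_entry, dict):
        name = license_entry.get("name", "")
    else:
        name = license_entry or ""
    # Normalise spelling variants such as "CC0 1.0", "cc0" or "CC-0".
    normalized = name.upper().replace(" ", "").replace("-", "")
    return normalized.startswith("CC0")


def with_default_license(payload):
    """Return a copy of the payload with CC BY 4.0 set when no license or CC0 is given."""
    version = dict(payload.get("datasetVersion", {}))
    current = version.get("license")
    if current is None or is_cc0(current):
        version["license"] = dict(CC_BY_4)
    return {**payload, "datasetVersion": version}
```

Any license other than CC0 that is already set is left unchanged, so a deliberate choice is not silently overwritten.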