Supported file formats

The BSC Dataverse accepts files in any format, so researchers can upload a wide range of data. However, to maximize the usability and longevity of your data, we strongly encourage the use of standardized, widely used file formats recognized by your research community.

Examples of recommended formats:

| File Type | Preferred File Formats | Non-Preferred File Formats |
|---|---|---|
| Audio | Uncompressed and lossless WAV or AIFF (.wav, .aiff); Compressed and lossless FLAC (.flac); Compressed MP3 (.mp3) | AAC (.m4a); Ogg Vorbis (.ogg); Windows Media Audio (.wma) |
| Image | Uncompressed TIFF (.tif, .tiff); Compressed and lossless PNG (.png); Compressed JPEG (.jpg, .jpeg) | Adobe Photoshop (.psd); Windows Bitmap (.bmp); Raw Image Data (.raw) |
| Geospatial Data | ESRI Shapefile (.shp); GeoJSON (.geojson); NetCDF (.nc) | Proprietary or unsupported geospatial formats |
| Simulation Data | NetCDF (.nc); HDF5 (.h5); VTK (.vtk); ParaView (.pvd) | Custom binary formats without documentation |
| Spreadsheet/Tabular Data | Plain text with UTF-8 encoding, tab-separated or comma-separated (.tsv, .csv) | Excel (.xlsx) |
| Text | Plain text (.txt, .md); XML (.xml); PDF/A (.pdf) combined with the original file | Microsoft Word (.docx) |
| Code/Scripts | Python (.py); R (.R, .RData); MATLAB (.m, .mat); Jupyter Notebooks (.ipynb) | Non-commented or non-documented scripts |
| Video | MPEG-4 (.mp4) | AVI (.avi); Flash Video (.flv); Windows Media Video (.wmv) |
| Genomics Data | FASTA (.fasta); FASTQ (.fastq); VCF (.vcf) | Unsupported proprietary genomic file formats |
| Statistical Data | R scripts and data (.R, .RData); SPSS scripts (.sps); STATA scripts (.do) | SPSS proprietary data (.sav); STATA proprietary data (.dta) |
| Qualitative Data | Plain text (.txt); PDF/A (.pdf) combined with the original file | Workspace dumps without clear documentation |
| Compressed or Archive Files | ZIP (.zip); GZIP/TAR (.tar.gz) | Proprietary or encrypted compression formats without password-sharing information |
| Visualization Files | PNG (.png); JPEG (.jpg, .jpeg); Scalable Vector Graphics (.svg) | Complex 3D visualization files without metadata |
| Transcription | PDF/A (.pdf) combined with tab-separated or comma-separated values (.csv, .tsv) | Word documents (.docx) |

  • Preferred formats: These formats are widely used, well-supported, and promote long-term accessibility.
  • Non-preferred formats: These formats are less suitable due to their proprietary nature, lack of documentation, or reduced compatibility.
  • Metadata and documentation: Always provide sufficient metadata or documentation to accompany datasets, regardless of file type.
  • Anonymization: Ensure no sensitive or personal data is included in submissions.
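
Before uploading, it can help to check a file list against the table above. The following is a minimal sketch, not a BSC Dataverse tool: the names `PREFERRED`, `NON_PREFERRED`, `classify`, and `report` are hypothetical, and the extension sets cover only a subset of the formats listed above. It compares only the final extension of each file name, so documentation quality and archive contents still need a manual check.

```python
from pathlib import PurePath

# Illustrative subset of the table above (assumption: extend to suit your data).
PREFERRED = {
    ".wav", ".aiff", ".flac", ".tif", ".tiff", ".png",
    ".csv", ".tsv", ".txt", ".md", ".xml", ".nc", ".h5", ".mp4", ".zip",
}
NON_PREFERRED = {
    ".m4a", ".ogg", ".wma", ".psd", ".bmp", ".raw",
    ".xlsx", ".docx", ".avi", ".flv", ".wmv", ".sav", ".dta",
}


def classify(filename: str) -> str:
    """Return 'preferred', 'non-preferred', or 'unlisted' for one file name."""
    # Only the last suffix is inspected, case-insensitively.
    suffix = PurePath(filename).suffix.lower()
    if suffix in PREFERRED:
        return "preferred"
    if suffix in NON_PREFERRED:
        return "non-preferred"
    return "unlisted"


def report(filenames):
    """Group file names by classification, preserving input order."""
    groups = {"preferred": [], "non-preferred": [], "unlisted": []}
    for name in filenames:
        groups[classify(name)].append(name)
    return groups


if __name__ == "__main__":
    files = ["results.csv", "figures/plot.PNG", "notes.docx", "model.bin"]
    for label, names in report(files).items():
        print(f"{label}: {names}")
```

Files reported as non-preferred or unlisted are still accepted; the check only flags candidates for conversion to a preferred format or for extra documentation.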