Frequently Asked Questions

You can contact zdv-forschungsdaten@uni-mainz.de to arrange a meeting with our technical support team.
They can answer any remaining questions and guide you through the process.
Archiving is meant for research data that is part of a published work. Archived data is meant to be used to replicate research results.
It can also be made public for reuse by others, or kept private simply to conform with grant requirements.
If you wish to back up a laboratory PC, see www.zdv.uni-mainz.de/datensicherung instead.
To archive research data you do not need to change the way you conduct your research or any of your existing data workflows.
The data can always be archived afterwards by producing the relevant metadata and technical metadata and then uploading it into the Archive.
The metadata summarizes essential information about the research, e.g. Title, Description, Type, Format, License, Keywords, Contributor, Reference, etc.
Technical metadata describes what the contents of the files represent, how they are used in the research field, how they relate to each other and how they can be processed.
All of this information helps others find and understand the research data.
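As a purely illustrative example, such a metadata record could look like this:

    Title:        Measurement series X under varying temperature
    Description:  Raw and processed measurements underlying publication Y
    Type:         Dataset
    Format:       CSV, TXT
    License:      CC BY 4.0
    Keywords:     temperature, measurement, replication
    Contributor:  Jane Doe (JGU Mainz)
    Reference:    DOI of the associated publication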
All the research data of a single publication that is meant to be archived should be gathered within a single parent directory.
The data within the parent directory can then be sorted into different directories based on some useful logical hierarchy chosen by the authors.
Metadata for the gathered data should then be defined, e.g. Title, Description, Type, Format, License, Keywords, Contributor, Reference, etc.
Technical descriptions of the research data (technical metadata) should also be prepared and included in the parent directory as human-readable files such as plain ".txt" files.
These technical metadata files should describe what the contents of the files represent, how they are used in the research field, how they relate to each other and how they can be processed.
The parent directory should be compressed into an open format if the storage space requirements allow it.
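A possible layout for such a parent directory could look like this (all names are illustrative):

    my-publication-2024/
    ├── raw-data/
    ├── processed-data/
    ├── scripts/
    └── README.txt   (technical metadata)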

Once your research data has been gathered and prepared with metadata, these steps [link] can be followed to archive it.
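The linked steps are the authoritative instructions; as a rough command line sketch, assuming the iRODS icommands are installed and configured and using illustrative names, the upload could look like this:

    # create a collection in the Archive and upload the prepared, compressed data
    imkdir archive-2024
    iput -K my-publication-2024.tar.gz archive-2024

    # verify that the upload arrived
    ils -l archive-2024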

Multiple large files can be archived individually, while large numbers of small files that belong to the same dataset should be compressed into a single archive file (zip, tar, rar).
If possible, all files of the same dataset should be gathered together in the same archived (compressed) file.
This makes it easier to retrieve complete datasets that are meant to be distributed together.
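As a sketch with illustrative names, using standard command line tools:

    # bundle the small files of one dataset into a single open-format archive
    tar -czf my-publication-2024.tar.gz my-publication-2024/

    # alternatively, bundle without compression if the data is already compressed
    tar -cf my-publication-2024.tar my-publication-2024/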

The data should be available to be copied from one of the following:

  • a Linux computer installation provided by the ZDV [link]
  • the Mogon HPC system of the JGU [link]
  • any computer with mounted group storage network drives of the JGU [link]
  • a personal computer with the Kerberos Client authentication packages installed
  • any computer with SSH access to linux.zdv.uni-mainz.de [link] or the MOGON HPC system [link] (see the example below)
  • a personal Windows computer using the Windows Subsystem for Linux or a Linux virtual machine
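For example, the SSH option above could be used like this (replace the account name with your own JGU account):

    # log in to the ZDV Linux login host and copy the data from there
    ssh your-jgu-account@linux.zdv.uni-mainz.de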

The source code needed to process, generate or understand your research data should be included with the data itself whenever available and possible.
The Archive itself is meant for long-term, non-changing data, and it can hold the data for as long as the user requests.
The Archive is meant to be able to store data for decades in order to help researchers comply with funding requirements and to be able to reproduce results in the future.

For archiving datasets over 1TB it is recommended to contact us at zdv-forschungsdaten@uni-mainz.de so that we can assist you and ensure smooth performance and stability during the archival process.
There is currently no limit on the total amount of data that can be archived, but individual files or compressed datasets should not exceed ~8TB.
If the archived datasets are meant to be retrieved through HTTP interfaces, a maximum size of 5GB per file is recommended.
Capacity limitations are otherwise only imposed by the storage location of the original data to be copied, e.g. 100GB in the Home directory on the MOGON systems.
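If a compressed dataset would exceed these limits, it can be split into chunks before archiving; a sketch with illustrative names and sizes:

    # split a large archive into 4GB chunks (e.g. to stay below the 5GB HTTP limit)
    split -b 4G my-publication-2024.tar.gz my-publication-2024.tar.gz.part-

    # reassemble the original file after retrieval
    cat my-publication-2024.tar.gz.part-* > my-publication-2024.tar.gz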

Currently there is no limitation on the number of files that can be archived.
If possible, all files of the same dataset should be gathered together in the same compressed file.
This makes it easier to retrieve a complete dataset that belongs together.

To give other users access to your archived research data, you must follow these steps [link].
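The linked steps are authoritative; as a hypothetical sketch using the iRODS icommands, granting read access could look like this:

    # grant another user read access to an archived collection (names are illustrative)
    ichmod -r read colleague-account archive-2024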

Your research data can be published publicly during the archival process or afterwards by following these steps [link].

Metadata first needs to be gathered, e.g. Title, Description, Type, Format, License, Keywords, Contributor, Reference, etc.
After defining these and any other helpful descriptive fields, you can follow these steps [link] to add the metadata.
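If you prefer the command line, such fields can also be attached with the iRODS imeta icommand; a sketch with illustrative values:

    # attach descriptive metadata to an archived collection
    imeta add -C archive-2024 Title "Measurement series X"
    imeta add -C archive-2024 License "CC BY 4.0"

    # list the metadata attached to the collection
    imeta ls -C archive-2024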
Technical metadata needs to be generated by the owner of the research data.
Technical descriptions of the research data should then be included as human-readable files such as plain ".txt" files alongside the normal data.
These technical metadata files should describe what the contents of the files represent, how they are used in the research field, how they relate to each other and how they can be processed.
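A minimal technical metadata file could be created alongside the data like this (contents are purely illustrative):

    # write a human-readable technical description next to the data
    cat > my-publication-2024/README.txt <<'EOF'
    raw-data/        unprocessed instrument output, one CSV file per measurement run
    processed-data/  cleaned tables derived from raw-data/ using scripts/clean.py
    scripts/         the Python scripts used to process and plot the data
    EOF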
Under normal circumstances the Archive is meant to hold non-changing data for decades.
The archived research data can nevertheless be updated to add or change existing files in certain special cases.
This can happen when wrong data is accidentally uploaded or when the underlying published research changes.
Research data that has been archived long term to comply with government or funding requirements cannot and should not be deleted.
iRODS uses 'Resources' to archive the collections (directories) and data objects (files). The resources are organized hierarchically. The iRODS Archive at the ZDV currently has a compound resource consisting of a cache (unix filesystem) and a tape archive (universal mass storage system). The cache has a size of 8TB and, once it fills up, the oldest data objects are deleted from the cache. If required, they are fetched back from the tape archive.

replResc:replication
├── cephfsResc:unixfilesystem
└── compResc:compound
    ├── netappResc:unixfilesystem
    └── tsmResc:univmss
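
This hierarchy can be inspected from the command line with the ilsresc icommand, which lists the resources of the connected iRODS zone:

    ilsresc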