Research Data Management

Best Practices for Storage and Backup

Where you will store your data throughout your project's lifecycle is an important decision.  There are several things to consider:

  1. Are your data sensitive? You need a storage plan that addresses your and your team’s needs for access and data security during the life of the project. Duke has a formal data classification standard and requirements for storing sensitive data.
  2. How much data do you anticipate generating? Data volume can affect budget, storage options, processing speed, ease of access, and backup strategies.
  3. What type of active data storage environment do you need? Do you need a customized virtual research environment, cluster access, cloud storage, or a protected network environment? Would you benefit from working with a project management tool like LabArchives or the Open Science Framework? Know your resources and talk to people who can help, such as your local department IT support, Duke OIT, or Duke Research Computing.
  4. Institute a backup plan. Duke departments, OIT, Research Computing, and others have different ways to ensure that data are backed up regularly and appropriately. Be sure to establish a backup plan (PDF document) at your project's outset. A general rule for backup is the 3-2-1 rule, which scales depending on how backups are structured: keep 3 copies (1 original and 2 backups), on 2 different types of media, with at least 1 copy physically off-site or in separate, dedicated cloud storage.
  5. Determine project roles to establish governance. Define project roles to ensure that read, write, or execute permissions are assigned appropriately. Design a storage hierarchy that reflects these permissions so that the workflow stays orderly.
  6. Determine where your data will be stored for preservation and access over the long term. Depending on your data sharing and preservation needs, there are various discipline-based repositories that may be used (or that your funder may require you to use). Locally, you may use the Duke Research Data Repository, which has a professionally managed curation program.
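
The 3-2-1 rule in item 4 can be written down as a simple check against a list of the copies you plan to keep. The following is a minimal sketch in Python, not a Duke tool: the `DataCopy` class, the `satisfies_3_2_1` function, and the example storage labels are all hypothetical names chosen for illustration, and the check assumes the original counts as one of the three copies.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    """One copy of the project data (hypothetical structure for illustration)."""
    label: str      # free-text name for the copy
    media: str      # type of storage media, e.g. "local disk" or "cloud"
    offsite: bool   # True if physically off-site or in separate cloud storage


def satisfies_3_2_1(copies):
    """Return True if the copies meet the 3-2-1 rule.

    3 copies in total (the original plus two backups),
    on at least 2 different types of media,
    with at least 1 copy off-site.
    """
    enough_copies = len(copies) >= 3
    enough_media = len({c.media for c in copies}) >= 2
    has_offsite = any(c.offsite for c in copies)
    return enough_copies and enough_media and has_offsite


# Example plan: the labels and media types below are assumptions, not
# a description of any actual Duke storage service.
plan = [
    DataCopy("working copy", media="local disk", offsite=False),
    DataCopy("department share", media="network storage", offsite=False),
    DataCopy("cloud backup", media="cloud", offsite=True),
]

if not satisfies_3_2_1(plan):
    print("Backup plan does not yet meet the 3-2-1 rule.")
```

A check like this is only a planning aid. It says nothing about whether backups run on schedule or can actually be restored, so it complements rather than replaces advice from your IT support.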