7  Research Data Management

Research data management is:

“the active curation of data throughout the research lifecycle”. University of York

7.1 Importance of Data Management

Why is data management so important for your research project?

  • It saves you time and resources by avoiding the duplication of effort
  • It allows you to provide consistency in data handling across your project
  • It provides mechanisms to validate your analysis and ensures research integrity
  • It helps to facilitate the long term preservation of your research datasets
  • It helps to make your data FAIR (see above)
  • It increases the chance of re-use (by you and others), because you have planned ahead

7.2 Effective Data Management Planning

There is a lot to consider when deciding how to manage your research data effectively. Essentially, there are two key tasks to complete: understanding your data and organising your data.

To break this down, ask yourself the questions below to check whether you are understanding and organising your data effectively.

Understanding your data

  • What is its Purpose?
  • What is its Content?
  • What is the Method of Creation?
  • What Attributes does the data have?
  • What Relationships does the data have?
  • What Constraints & Dependencies does the data have?

Organising your data

  • What is your File Structure?
  • How do you define your File names?
  • How do you organise Version Control?
  • What is your Discard policy?
  • What are your Backup and Security policies?
  • What Documentation do you need to create?
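As an illustration of the file-name and version-control questions above, here is a minimal Python sketch of one possible naming convention. The pattern (project, description, ISO date, zero-padded version) and the function name `build_filename` are assumptions chosen for the example, not a standard required by any funder or institution; your own project may need a different scheme.

```python
import re
from datetime import date


def build_filename(project: str, description: str, created: date,
                   version: int, extension: str) -> str:
    """Build a file name following one hypothetical project convention.

    Pattern (an assumption for this sketch):
        <project>_<description>_<YYYY-MM-DD>_v<NN>.<ext>
    """
    def slug(text: str) -> str:
        # Lower-case the text and replace anything that is not a letter
        # or digit with a hyphen, so names are safe on most file systems.
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    ext = extension.lstrip(".").lower()
    return f"{slug(project)}_{slug(description)}_{created.isoformat()}_v{version:02d}.{ext}"


print(build_filename("Soil Survey", "Field notes", date(2024, 3, 1), 2, ".CSV"))
# prints: soil-survey_field-notes_2024-03-01_v02.csv
```

Putting the date in ISO format and padding the version number means that files sort in a sensible order in a directory listing, which makes it easier to see which version is the latest.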

The easiest and clearest way to address these questions is to create a data management plan (DMP).

7.3 Develop a Data Management Plan

What is the purpose of a data management plan?

“A data management plan helps achieve optimal handling, organising, documenting and enhancing of research data. It is particularly important for facilitating data sharing, ensuring the sustainability and accessibility of data in the long-term and allowing data to be reused for future research.” UK Data Service

In basic terms, a Data Management Plan is an iterative document describing how you plan to manage the data gathered through the delivery of a specific project, and what will happen to that data once the project is complete.

The key aspects here are:

  • A DMP is iterative: it changes and develops as the research project progresses
  • The DMP should include all of the data gathered by the project, which, as discussed earlier, could include many different data types from different stages of your project
  • You should consider what happens to the data once your project is completed: can and should you preserve that data for the future?

The UK Data Service quotation above gives a good definition of what a data management plan is and what it does.

7.3.1 Seven key areas of a Data Management Plan

The requirements for Data Management Plans can vary between funding bodies (more below), but overall there are seven key areas of any DMP.

Note

The titles of these seven areas might differ slightly between organisations, but the headings are broadly applicable across research institutions and funding bodies.
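Because a DMP is revised as the project progresses, it can help to check a draft against the seven areas each time it is updated. The following is a minimal Python sketch of such a check. The area titles are taken from the headings in this section; the function name `missing_areas` and the idea of storing a draft as a dictionary of free-text entries are assumptions made for illustration only.

```python
from __future__ import annotations

# The seven key areas, using the headings from this section.
DMP_AREAS = [
    "Context",
    "Roles and Responsibilities",
    "Data Description",
    "Standards and Methods",
    "Ethics and IPR",
    "Sharing, Access and Security",
    "Preservation and Access",
]


def missing_areas(plan: dict[str, str]) -> list[str]:
    """Return the areas that are absent or still empty in a draft plan."""
    return [area for area in DMP_AREAS if not plan.get(area, "").strip()]


# A hypothetical early draft with only one area filled in.
draft = {"Context": "Project name, funder, version 0.1", "Data Description": "  "}
for area in missing_areas(draft):
    print("Still to write:", area)
```

A check like this only confirms that every area has some text. It says nothing about whether the content meets a particular funder's requirements.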

7.3.2 Context

  • Project name
  • Unique identifiers (e.g. DOI)
  • Names of funding bodies
  • Creation date and last-edited date of the document
  • Version number
  • Any related policies/documentation

7.3.3 Roles and Responsibilities

  • Responsible person(s)
  • Roles for activities within the Data Management Plan
  • Names and roles of any collaborative partners
  • Details of any agreements and/or contracts between partners

7.3.4 Data Description

  • What data types/formats will be collected as part of the research
  • Volume/size of data collected
  • Software used for research and justification for each
  • Details about any existing data that is being collated

7.3.5 Standards and Methods

  • Method of data collection/creation
  • Discipline specifics for data standards and methodologies
  • Data structure
  • Version control for data
  • Quality assurance
  • What metadata will be collected

7.3.6 Ethics and IPR

  • Copyright holders
  • Data sharing agreements
  • Ethical considerations/reviews undertaken
  • Personal/sensitive data
  • How you comply with the relevant legislation
  • Data sharing restrictions

7.3.7 Sharing, Access and Security

  • Appropriate storage provision
  • Security/backup procedures
  • Data recovery procedures
  • Access control and data transfer between collaborators
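One common way to support the backup and recovery procedures listed above is to record a checksum for every file and compare the checksums of a backup copy against the originals. The sketch below is a minimal example of that idea in Python, using only the standard library. The function names and the choice of SHA-256 are assumptions for illustration; it is not a procedure prescribed by any particular institution or storage service.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(root: Path) -> dict[str, str]:
    """Map each file's path (relative to root) to its checksum."""
    return {
        path.relative_to(root).as_posix(): sha256_of(path)
        for path in sorted(root.rglob("*"))
        if path.is_file()
    }


def changed_files(original: dict[str, str], backup: dict[str, str]) -> list[str]:
    """List files that are missing from the backup or differ from the original."""
    return sorted(name for name in original if backup.get(name) != original[name])
```

For example, `changed_files(build_manifest(Path("data")), build_manifest(Path("backup/data")))` would list any files whose backup copy is missing or no longer matches, where both directory names are hypothetical.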

7.3.8 Preservation and Access

  • Selection strategy
  • What will happen to data not selected
  • Details of data relationship to Publication or Dissemination
  • Where will your data be deposited and why
  • Data licence
  • Future reuse