RDM/Background Research on RDM/Summary of Findings

Variation in RDM Practices

The main outcome of our study of RDM practices is a multidimensional typology of data handling practices in research disciplines. As the analysis demonstrates, data sharing can be expected to vary along a minimum of three dimensions: 

  • Dimension 1: Data Intensity (high versus low amounts/complexity of data)
  • Dimension 2: Style of Data Handling (intensive care vs. discard)
  • Dimension 3: Reproducibility and Replicability

Intensity refers to (a) the amount of data that research produces, which varies greatly from research group to research group, and (b) the complexity of the data (e.g. in terms of dimensionality). Both variables affect the time and resources needed to manage the data (e.g. defining metadata or uploading data sets to repositories). Style of data handling refers to the time devoted to data handling, i.e. whether data are kept for the long term or, at the other end of the spectrum, discarded after analysis. The third dimension refers to the value accorded to ensuring reproducibility, which varies greatly across research fields and with research aims. Each case (discipline) can be assigned a place in this three-dimensional space. The three dimensions of data practice explain the value accorded to data and hence the propensity of individual researchers to share their data under certain circumstances.
Key findings are summarized below:

  • Variation in Data Practices: Faculties, institutes, and research groups differ with respect to data amounts/complexity, data collection and analysis, and consequently the extent to which data are archived and which databases are used; respondents thus feel that RDM policies should be discipline-specific where needed but as general as possible
  • Disciplinary variation necessitates discipline-specific services, for example in data stewardship; many respondents pointed out that institutes need help with very specific tasks, which is very often an effect of specialization; respondents desire support at the departmental level
  • Data Collection: The bulk of research data are collected by PhD candidates; turnover among PhD candidates is thus a major continuity problem for data management; accordingly, respondents want more effort to be put into proper RDM training for PhD candidates; in general, more time needs to be planned for to guarantee adequate data management
  • Metadata: In many fields, there is no consensus as to which data to share and how to develop metadata schemas; this is especially pertinent in disciplines where there is no culture of sharing research data; here, researchers said they need support, especially with regard to funder mandates (e.g. DMPs); where there are established metadata schemas, researchers want to be able to search repositories by metadata
  • Data Analysis: Some respondents use Docker images to organize their data analysis; this is considered highly desirable as a way to organize the entire research process (storage of data and scripts)
  • Technical Aspects of Data Management: Data loss is a concern among all respondents; as a consequence, data security measures and backups are desired across all faculties; these should be paid for by the university (which respondents recognize requires a cultural shift)
  • Opt-in: All these support structures are preferred as opt-in versions, free to use for those who need them and without introducing any additional (administrative) burden
  • Publishing Research/Archiving Data: The publishing process is rather similar across the faculties (from planning an article to selecting a journal to submission to uploading data). Accordingly, support structures could be bundled in one organizational unit which would help to free up researchers’ time
  • Data Security and Backups: Data loss is a big concern among researchers and research administrators; it has two major causes: turnover among PhD candidates, and inability (technical or on the part of data handlers) to secure data; here, backup options with adequate funding are highly desired
  • Data Sharing: The propensity to share data depends on which data are perceived as valuable and in what way. The value of data in turn depends on the effort that needs to be invested in data collection and data processing. This seems to be especially pertinent in the Life Sciences, where data sharing is a big priority (more so than in the other disciplines we studied). In general, the value ascribed to data in a given field can be explained by reference to three factors:
    • Intensity (amount and complexity of data)
    • Handling (resources put into data handling)
    • Reproducibility (research style: what is the aim of the field)
  • Only disciplines “scoring” high on all three dimensions can be expected to develop a culture of data sharing; consequently, there is a lot of variation across TU Graz with respect to data management practices
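
The typology above can be read as a simple data structure: each discipline occupies a position on three dimensions, and a culture of data sharing is expected only where all three are high. The following is a minimal sketch of that reading, not part of the study. The three-level scale, the class name DataPractice, and the two example cases are assumptions made for illustration; the study reports no numeric scores.

```python
from dataclasses import dataclass

# Hypothetical ordinal scale; the study does not define numeric scores.
LOW, MEDIUM, HIGH = 0, 1, 2


@dataclass(frozen=True)
class DataPractice:
    """Position of one discipline in the three-dimensional space."""
    intensity: int        # Dimension 1: amount and complexity of data
    handling: int         # Dimension 2: resources put into data handling
    reproducibility: int  # Dimension 3: value placed on reproducibility

    def sharing_culture_expected(self) -> bool:
        # Rule from the summary: high on all three dimensions.
        lowest = min(self.intensity, self.handling, self.reproducibility)
        return lowest == HIGH


# Placeholder cases for illustration only, not results from the study.
example_a = DataPractice(intensity=HIGH, handling=HIGH, reproducibility=HIGH)
example_b = DataPractice(intensity=HIGH, handling=LOW, reproducibility=MEDIUM)

print(example_a.sharing_culture_expected())
print(example_b.sharing_culture_expected())
```

Taking the minimum across the three dimensions reflects the "all three" condition: a single low or medium dimension is enough for the rule to return False.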

Contact

Dr. Tony Ross-Hellauer
Digitale TU Graz-Handlungsfeld Forschung

Brockmanngasse 84, 8010 Graz
Phone: +43 316 873 32800
Email: ross-hellauernoSpam@tugraz.at