To attain a representative view of the state of play of Research Data Management at TU Graz, potential candidates for interviews were selected based on two factors: their affiliation (faculty/department) and their position. The target was to interview a minimum of two people per faculty (regardless of department). This target was not met, in part due to recruiting issues, although it should be noted that we ended up interviewing more candidates than we had planned. Additionally, departmental structures do not necessarily reflect differences in data management; therefore, the interviews we were able to conduct provide a good overview of data management practices at TU Graz regardless of institutional affiliations. In total, 13 formal interviews were conducted with a total of 18 respondents holding various positions at their departments/faculties. Three formal meetings held at the Faculties of Architecture, Electrical Engineering, and Mechanical Engineering were included in the analysis (protocols were drawn up during the meetings). Interview partners were identified by manually scanning TUGonline by faculty, initially identifying one to two researchers per faculty, all of them researchers with a teaching qualification according to their TUGonline profiles (Professors and Associate Professors), though it must be stressed that this strategy was not always successful. Additionally, deans of faculty were approached to name potential interviewees, but this strategy did not prove very effective. Researchers were contacted via email with a request for an interview on data practices and, failing a positive response, were asked to name alternative candidates. Those who declined did so for one of two reasons: a (self-ascribed) lack of competence in the subject of data management, or a general refusal to give interviews. Fortunately, those in the first group shared names of potential interviewees they considered to be a better fit.
For the interviews, we used a formalized list of interview questions along with possible follow-up questions to fall back on should the conversation come to a halt at any point. The questions were formulated in an open fashion in order to record as much potential variation in the answers between cases as possible. All interviews were professionally transcribed and coded using the software package RQDA. Coding was done by one researcher, with supervision and feedback from other researchers. The material was analysed paying special attention to data practices (types of data used by disciplines; methods of data collection, storage, and analysis; data sharing routines, or lack thereof). The interview questionnaire was designed to allow reconstruction of data handling practices in their wider institutional, disciplinary, and practical context. The semi-standardized interview questionnaire contained broad questions about data in the context of research, (typical) research aims, data management practices, roles and responsibilities, data storage and data sharing, and research culture more broadly (e.g. publication routines, reputation, and credit). In keeping with the findings from initial gatekeeper contact, the interview questions refrained from using terms such as “research data management”, “data management”, and “policy”, and instead focused on understanding what researchers do with their data, a strategy which proved worthwhile.
Based on preliminary interview findings as well as faculty visits and policy working group consultations, a survey was designed to gain quantitative insights into the way data intensity, data handling, and research styles play out with respect to RDM at TU Graz. The survey was hosted on LimeSurvey and sent out via email to all members of scientific staff at TU Graz in September 2019. In total, the survey was kept open for five weeks. An announcement was sent out one week in advance, and two reminders were sent, the first after two weeks and the second one week before the end of the survey. Consultations were held with the responsible bodies at TU Graz to follow established protocols for surveys and to clarify issues of data protection. The survey was sent out to 1784 scientific staff members. Of these, 498 respondents started the survey and 259 completed the questionnaire. The completed questionnaires were included in the analysis. No incentives were offered to encourage participation. Survey respondents came from all seven faculties and from all academic positions. A more fine-grained analysis (e.g. at the departmental level) was impossible due to data-protection restrictions at TU Graz.
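The participation figures reported above can be turned into rates with simple arithmetic. The following is a minimal sketch only; the function name `percentage` and the variable names are illustrative assumptions, and the resulting rates are derived here rather than reported in the study itself.

```python
def percentage(part: int, whole: int) -> float:
    """Return part/whole as a percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


# Figures as stated in the text
invited = 1784    # scientific staff members who received the survey
started = 498     # respondents who started the survey
completed = 259   # respondents who completed the questionnaire

# Derived rates (not reported in the source; computed for illustration)
start_rate = percentage(started, invited)        # share of invitees who started
completion_rate = percentage(completed, invited)  # share of invitees who completed
retention_rate = percentage(completed, started)   # share of starters who completed

print(f"Started:   {start_rate}% of invitees")
print(f"Completed: {completion_rate}% of invitees")
print(f"Retained:  {retention_rate}% of those who started")
```

Under these figures, roughly 28% of invitees started the survey, about 14.5% completed it, and just over half of those who started went on to finish.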
The survey consisted of 27 questions in five groups:
The first group of questions concerned research outputs generated by researchers at TU Graz, the kinds of data formats typically used, as well as research outputs other than data (e.g. physical samples of any kind). The second group contained questions regarding typical data amounts per year and (average) storage space required. The Data handling group covered practices of data sharing and handling as well as attitudes towards, e.g., data reuse and repositories. The Obstacles to Research Data Management group covered researchers’ experiences with RDM. The Demographics group contained three questions: faculty, position, and role of the respondent. These three items were designed to ensure that data protection was observed while the answers would still allow for meaningful analysis. The survey items were adapted, in part, from a survey on RDM knowledge and practices among ERC grant winners commissioned by the European Research Council and written by the Public Policy and Management Institute (PPMI), the Digital Curation Centre (DCC), Georg-August-Universität Göttingen, and Science-Metrix (PPMI 2018). The survey went through several rounds of refinement (formulations of items, order of items, translation). The survey was developed in English and then translated into German after the items were finalized. One pretest was conducted in which volunteers from the policy working group were asked to complete and comment on the survey. For the pretest, the survey version was hosted on LimeSurvey. Ten responses were received, which contained valuable criticisms and hints as to what should be amended. These suggestions were incorporated into the final survey design.
The main outcome of our study of RDM practices is a multidimensional typology of data handling practices in research disciplines. As the analysis demonstrates, data sharing can be expected to vary along a minimum of three dimensions:
Intensity refers to (a) the amount of data that research produces, which varies greatly from research group to research group, and (b) the complexity of the data (in terms of e.g. dimensionality). Both variables impact the time and resources needed to manage the data (e.g. defining metadata, uploading data sets to repositories, etc.). Style of data handling refers to the time devoted to data handling (i.e. whether data are kept for the long term or, at the other end of the spectrum, discarded after analysis). The third dimension refers to the value accorded to ensuring reproducibility, which varies greatly across research fields and with research aims. Each case (discipline) can be accorded a place in this projected three-dimensional space. The three dimensions of data practice explain the value accorded to data and hence the propensity of individual researchers to share their data under certain circumstances. Key findings are summarized below: