Open Science at CMU Libraries

Ana Van Gulick at Open Science Symposium

What do examining bee behavior, classifying digital document types, analyzing road segmentation for self-driving cars, and decoding language in the brain from MRI scans have in common? They all require large amounts of well-documented, reusable data. Fostering this type of research and data curation is one goal of the growing Open Science movement and of the Carnegie Mellon University Libraries' Open Science Program.

Open Science is a broad term that can refer to many different aspects of research that is collaborative, transparent, and openly accessible. Despite the name, the same principles apply to any field of scholarly research, well beyond the sciences and social sciences. Open Science may incorporate many shifts in the way research is conducted and disseminated, including open access publishing; open educational resources; open source software; digital tools and platforms for research; reproducible research; open data; citizen science; diversity, equity, and inclusion in training and research; preservation and reuse; and metrics and evaluation. These practices allow research to be more collaborative and to harness the power of data science. Some research communities, including particle physics and genetics, were early adopters of open science practices such as sharing preprints of papers before publication or sharing standardized datasets in a repository. Other disciplines, such as psychology and biomedical science, are grappling with the related issues of reproducibility and replication in their work.

Over the past several years, encouraging open research practices has become a growing priority for research communities as well as for research funders, publishers, and institutions. The National Academies of Sciences, Engineering, and Medicine have recently released reports on Open Science by Design and Reproducibility and Replicability in Science. The Association of American Universities (AAU) and the Association of Public and Land-grant Universities (APLU) have held working groups and summits on public access to research findings and data, and have published a report. Funders including NIH, NSF, the Gates Foundation, and Wellcome Trust have been rolling out new requirements for data management planning and for public access to publications and other research products, including data generated from funded work. Similarly, journals including, but not limited to, PLOS, eLife, and PNAS are requiring that the datasets and code supporting published work be made available together with papers. While many of these requirements come at the end of a research project, research communities are recognizing that incorporating open practices such as documentation and scientific programming into training and projects early on benefits both the science and the individual researcher, and makes it easier to share work with collaborators and the public.

In September of 2018, the CMU Libraries launched an Open Science Program to support collaborative, transparent, openly accessible, and reproducible research across all disciplines at Carnegie Mellon University. The program recognizes that having well documented and automated research workflows, code, and datasets is essential to making research more interdisciplinary, efficient, and reusable as well as allowing researchers to leverage data science techniques. This program provides services and infrastructure for open research at CMU through digital tools, training opportunities for research tools and practices, special events and advocacy, and a team of experts available as research consultants and collaborators.

The program team is composed of Libraries faculty and staff with expertise in a variety of research methods and disciplines, many of whom have been researchers themselves and understand the challenges of complex research projects and the time constraints of implementing new lab workflows. The tools they support include open source programs as well as tools licensed for campus by the CMU Libraries, including the LabArchives electronic research notebook, Open Science Framework, KiltHub Research Repository, and Protocols.io. Libraries workshops focus on training for these tools as well as on research and publishing practices. In addition, two-day Software Carpentry and Data Carpentry workshops, which teach introductory scientific programming skills with Python and R, are offered several times a year through The Carpentries, an international non-profit, and have proved to have wide appeal across the CMU community.

In October 2019, the program hosted the 2nd Carnegie Mellon Open Science Symposium, which brought together researchers from across the life sciences as well as 12 guest speakers from across the country to discuss the challenges and opportunities of open research practices. As at the first symposium in November 2018, researchers from cancer biology, neuroscience, psychology, and genetics, as well as practitioners from information science, publishing, and open software and data sharing, presented on their wide-ranging experiences conducting open science. Videos of the talks and slides can be found online on Open Science Framework: OSS 2018, OSS 2019. Informal networking and demo sessions allowed researchers to meet and share their ideas, tools, and experiences with one another. Stay tuned for dates for our 2020 events!

The Libraries Open Science Program looks forward to expanding its services and support in 2020 and is eager to partner with specific labs, departments, and student organizations on their research and training initiatives. You can contact the team at openscience@andrew.cmu.edu.
