From ae5176c1cdbe6bcf13f4f446b60a5ea63bec6e80 Mon Sep 17 00:00:00 2001
From: David Ragnar Nelson <35697532+drnelson6@users.noreply.github.com>
Date: Thu, 9 May 2024 10:00:07 -0400
Subject: [PATCH] Add extended section on reproducible research

---
 content/lessons/repro_research.md | 28 ++++++++++++++++++++++++++++
 1 file changed, 28 insertions(+)

diff --git a/content/lessons/repro_research.md b/content/lessons/repro_research.md
index ae93b86..297d0ff 100644
--- a/content/lessons/repro_research.md
+++ b/content/lessons/repro_research.md
@@ -19,6 +19,34 @@ There are several factors that contribute to producing reproducible research. Th
 - Using a version control system to facilitate collaboration on the project
 - Recording your environment so others can run your code on their machine
 
+### Reproducible research
+
+Reproducible research is a moving target in the humanities. While the social sciences have recently reckoned with a so-called "[replication crisis](https://en.wikipedia.org/wiki/Replication_crisis)," the humanities are only beginning to think about how their research can be reproducible. As the humanities increasingly work with large datasets and computational tools that exceed what can be manually verified by a third-party observer, we need to agree upon best practices that will ensure our peers can trust the validity of our results.
+
+As digital humanists, we can learn several lessons from the social sciences and hard sciences to avoid a "replication crisis" in the humanities. A big step towards producing more reproducible research is writing better code that others can reuse to produce the same results.
+
+Reproducibility comes in multiple forms:
+
+- Someone else wants to download my data and code to verify my results independently
+- Someone else wants to use my code on new data to produce their own research
+- Someone else wants to modify my data and code to test edge cases in my results
+
+Some key aspects of reproducible research include:
+
+- publication of the raw underlying data used to achieve the results;
+- clear documentation of the steps taken to achieve the results;
+- open source release of the code used for data gathering, analysis, and other steps;
+- separation of code by function (i.e. modular code development) so others can interpret and reuse your code;
+- documentation of key decisions and changes in the project (i.e. version control);
+- means to ensure the tools do what they are supposed to (e.g. tests, code review).
+
+In addition, reproducible research follows a set of community-defined best practices to ensure that your project can be understood and used by others. These practices may evolve over time, but these lessons offer a set of basic principles that can guide the development of reproducible research in the Digital Humanities.
+
+##### Resources
+
+- Rik Peels, "Replicability and replication in the humanities." _Research Integrity and Peer Review_ 4 (2019). [https://doi.org/10.1126/science.aac4716](https://doi.org/10.1126/science.aac4716).
+- Joseph Flanagan, "Reproducible research: Strategies, tools, and workflows." _Studies in Variation, Contacts and Change in English_, eds. Turo Hiltunen, Joe McVeigh, Tanja Säily (Helsinki: Research Unit for Variation, Contacts and Change in English, 2017).
+
 ### Organizing your project
 
 Your project will involve a number of components. This may include raw data, processed data, documentation, source code, and code for dependencies. You may be working as part of a team, or you may be writing code that others will use in the future.
In either case, your project should be organized so that someone unfamiliar with the project can quickly find the information they need, whether these are datasets or functions in your code. In order for your research to be reproducible and make sense to others, you should organize your project in a consistent, predictable way.