DataColada have a recent blog post about their groundhog package, intended to aid in reproducible science. It comes more from the perspective of “I have this historical code, how can I try to replicate that researcher’s environment to get the same results”. What I am going to talk about in this post is creating an environment from the get-go that has the info necessary for others to replicate.

Before I get to that though, I have come across people critiquing open science using essentially ‘the perfect is the enemy of the good’ arguments. Even if there are different standards of replicability, some code is quite a bit better than no code. And scientists are not professional programmers; understanding all of this stuff takes time and training, which are often in short supply in academia (hence me blogging about boring stuff like creating environments and using github). If this stuff is over your head, please feel free to email me or ask a question and I can try to help.

At work I have to solve a very similar problem to scientific reproducibility: I need to write code in one environment (a dev environment, or sometimes my laptop), and then have that code run in a production environment. The way we do this at work is either via conda environments (for persistent environments) or docker images (for ephemeral environments). We are currently 100% python for machine learning, but you can use the same workflow for R environments (or have a mashup of R/python). Groundhog doesn’t really solve this all by itself; it doesn’t specify the version of R, for example. But you can use conda directly to set up a reproducible environment from the get-go. (And there are issues with even using dates to try to forensically recreate environments, see the Hackernews thread.) Again, what is good for reproducible science is good for reproducing my work in different environments at my workplace.

I have a github folder that shows the steps, but they are simple enough to list here. To start, have two files at the root of your project directory. One is a requirements.txt file that specifies the R libraries you want. This file may look like:

    # This is the requirements.txt file

Conda has an annoying convention of adding r-* to the front of package names to distinguish R packages from python ones.
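As a rough illustration of the requirements.txt idea in this post, here is a minimal Python sketch that turns a list of CRAN package names into conda-style lines. The helper names, the R version, and the example packages are all hypothetical and not taken from my actual project. It assumes the usual conda-forge convention of a lowercase package name with an r- prefix, and r-base as the package that pins R itself.

```python
# Hypothetical sketch: helper names, the R version, and the package list
# are illustrative assumptions, not the contents of the real project file.

def conda_r_name(cran_name):
    """Map a CRAN package name to its assumed conda name: 'r-' + lowercase."""
    return "r-" + cran_name.lower()


def requirements_lines(r_version, cran_packages):
    """Build the lines of a conda requirements file that pins the R version."""
    lines = ["# This is the requirements.txt file"]
    lines.append("r-base=" + r_version)  # r-base is the conda package for R itself
    lines.extend(conda_r_name(pkg) for pkg in cran_packages)
    return lines


lines = requirements_lines("4.1.3", ["dplyr", "MASS"])

# Write the file at the project root so conda can read it later.
with open("requirements.txt", "w") as handle:
    handle.write("\n".join(lines) + "\n")

print("\n".join(lines))
```

A file like this can then be handed to conda, for example with conda create --name r_env --file requirements.txt --channel conda-forge, where r_env is just a placeholder environment name.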