Reproducibility Guidelines

Reproducible work means that you, or someone else, can rerun your analysis later and get the same result. On a shared, evolving cluster this takes a little discipline, but the building blocks are all available on Anunna.

Pin your software versions

Load specific module versions rather than defaults, and record the bucket and module versions you used. A job that loads Python/3.11.3 from the 2024 bucket should behave the same next year; one that loads whatever happens to be current may not.
For your own environments, pin versions too: record them in a requirements.txt, environment.yml, or equivalent. See Installing Personal Software and Python.
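
As a minimal sketch of recording what a job actually ran with, the Python snippet below collects the loaded modules and the installed Python packages with their exact versions. It assumes the module system exports the colon-separated LOADEDMODULES environment variable, as Environment Modules and Lmod commonly do. The function names and the output file name are hypothetical, chosen for illustration.

```python
import os
from importlib import metadata


def loaded_modules(environ=os.environ):
    """Return the loaded module names, e.g. 'Python/3.11.3'.

    Assumption: the module system sets LOADEDMODULES as a
    colon-separated list. Returns [] when it is unset or empty.
    """
    raw = environ.get("LOADEDMODULES", "")
    return [name for name in raw.split(":") if name]


def pinned_packages():
    """Return sorted 'name==version' lines for installed Python packages."""
    lines = set()
    for dist in metadata.distributions():
        meta = dist.metadata
        name = meta.get("Name") if meta is not None else None
        version = meta.get("Version") if meta is not None else None
        if name and version:  # skip installs with incomplete metadata
            lines.add(f"{name}=={version}")
    return sorted(lines, key=str.lower)


def write_record(path="environment-record.txt"):
    """Write modules and package pins to `path` (hypothetical file name)."""
    with open(path, "w", encoding="utf-8") as handle:
        for module in loaded_modules():
            handle.write(f"# module: {module}\n")
        for line in pinned_packages():
            handle.write(line + "\n")
    return path
```

Calling write_record() at the start of a job and keeping the file next to the results gives you a record to rebuild the environment from. The package lines use the same name==version form as a requirements.txt.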

Use containers for full reproducibility

A container captures an entire software stack (operating-system libraries, tools, and dependencies) in a single image that behaves the same wherever a compatible container runtime is available. On Anunna, use Apptainer (formerly Singularity). A container offers the strongest guarantee that your environment will not drift over time.
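
To illustrate how an analysis step might be run inside a container, the sketch below builds an apptainer exec command line from Python. The image name analysis.sif, the script name analysis.py, and the helper functions are hypothetical. Only the apptainer exec subcommand and its --bind option are taken from Apptainer itself.

```python
import shlex
import subprocess


def apptainer_command(image, argv, binds=()):
    """Build the argument list for `apptainer exec [--bind ...] image argv...`."""
    command = ["apptainer", "exec"]
    for host_path, container_path in binds:
        # Make a host directory visible inside the container.
        command += ["--bind", f"{host_path}:{container_path}"]
    command.append(image)
    command.extend(argv)
    return command


def run_in_container(image, argv, binds=()):
    """Run `argv` inside `image`; raises CalledProcessError on a non-zero exit."""
    return subprocess.run(apptainer_command(image, argv, binds), check=True)


if __name__ == "__main__":
    # Hypothetical image and script names, for illustration only.
    cmd = apptainer_command("analysis.sif", ["python", "analysis.py"])
    print(shlex.join(cmd))  # show the command without running it
```

Keeping the image file alongside the project, and noting which image produced which results, means the same stack can be used again later.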

Keep your code in version control

Track your scripts and pipelines with Git so you have a history of what changed and can return to any version.
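
One way to tie a result to the exact code that produced it is to store the current commit hash with the output. The sketch below is an illustration, not a prescribed method: it asks Git for the commit and for uncommitted changes, and returns None values when the directory is not a repository or Git is unavailable. The function names are hypothetical.

```python
import subprocess


def git_output(args, cwd="."):
    """Run a git command in `cwd`; return stripped stdout, or None on failure."""
    try:
        result = subprocess.run(
            ["git", *args], cwd=cwd, capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        # Git is missing, `cwd` does not exist, or this is not a usable repository.
        return None
    return result.stdout.strip()


def code_version(cwd="."):
    """Return the commit hash and whether the working tree has uncommitted changes."""
    commit = git_output(["rev-parse", "HEAD"], cwd)
    if commit is None:
        return {"commit": None, "dirty": None}
    status = git_output(["status", "--porcelain"], cwd)
    dirty = None if status is None else bool(status)
    return {"commit": commit, "dirty": dirty}
```

Writing the returned dictionary into a log or results file at the start of a run lets you check out the same commit later. A dirty value of True is a warning that the results came from code that was never committed.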

WUR runs a GitLab instance at git.wur.nl where you can host your repositories. Set up SSH keys for it as you would for any SSH service: generate a key pair (see SSH Access), then add the public key to your GitLab account.

Automate the steps

A workflow engine records the exact steps, their order, and their dependencies, so the whole pipeline can be rerun from scratch. This is far more reproducible than running commands by hand.
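
The core idea, steps plus declared dependencies that determine the run order, can be sketched in a few lines of Python using the standard-library graphlib module. This is a toy illustration rather than a substitute for a real workflow engine, which typically also tracks input and output files and can resume partial runs. The step names and the run_pipeline helper are hypothetical.

```python
from graphlib import TopologicalSorter


def run_pipeline(steps, dependencies):
    """Run every step once, after all steps it depends on; return the order used.

    `steps` maps a step name to a zero-argument callable.
    `dependencies` maps a step name to the set of steps that must run first.
    """
    order = list(TopologicalSorter(dependencies).static_order())
    for name in order:
        steps[name]()
    return order


if __name__ == "__main__":
    log = []
    # Hypothetical steps that only record that they ran.
    steps = {
        "download": lambda: log.append("download"),
        "clean": lambda: log.append("clean"),
        "analyse": lambda: log.append("analyse"),
        "report": lambda: log.append("report"),
    }
    dependencies = {
        "clean": {"download"},
        "analyse": {"clean"},
        "report": {"analyse"},
    }
    print(run_pipeline(steps, dependencies))
```

Because the order is derived from the declared dependencies rather than from memory, rerunning the script repeats the same steps in a valid order every time.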
