If you’ve used GerryChain to do some analysis or research, you may want to ensure that your analysis is completely repeatable by anyone else on their own computer. This guide will walk you through the steps required to make that possible.
Use the same versions of all of your dependencies
You will want to make sure that anyone who tries to repeat your analysis by running your code will have the exact same versions of all of the software and packages that you use, including the same version of Python.
The easiest way to do this is to use conda to manage all of your dependencies.
You can use conda to export an environment.yml file that anyone can use to replicate your environment by running the command conda env create -f environment.yml. For instructions on how to do this, see "Sharing your environment" and "Creating an environment from an environment.yml file" in the conda documentation.
If you’ve published your code on GitHub, it is a good idea to include your environment.yml file in the root folder of your code repository.
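As a complement to environment.yml, it can help to record the versions that were actually in use when a result was produced. The following is a minimal sketch using only the Python standard library; the helper name describe_environment and the package list are illustrative assumptions, not part of GerryChain or conda.

```python
import platform
from importlib import metadata


def describe_environment(packages):
    """Return a dict with the Python version and each package's installed version."""
    versions = {"python": platform.python_version()}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # package is not installed in this environment
    return versions


# Illustrative package list; replace it with whatever your analysis imports.
for name, version in describe_environment(["gerrychain", "networkx"]).items():
    print(name, version if version is not None else "not installed")
```

Saving this output next to your results gives readers a quick way to confirm that their environment matches yours.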
gerrychain.random is the single place where GerryChain imports the built-in Python random module and sets a random seed. This ensures that the seed is set before any random numbers are drawn. If you use the random module anywhere in your own code (say, in your own proposal function), replace the line import random with from gerrychain.random import random. This will ensure that your code uses the same random seed as GerryChain.
GerryChain sets a random seed of 2018 after it imports random. If you wish to use a different random seed, set it immediately after importing gerrychain.random, and before you import anything else. That will look like this:
    from gerrychain.random import random
    random.seed(12345678)
    from gerrychain import MarkovChain, Partition
    # and so on...
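To illustrate why the shared, seeded module matters, here is a small sketch of a custom helper that draws from it. The function pick_random_nodes is hypothetical and only stands in for your own code, such as a proposal function. The fallback to the built-in random module is an assumption added so the snippet runs where GerryChain is not installed; in a real analysis, keep only the gerrychain.random import.

```python
try:
    # Shared, already-seeded random module provided by GerryChain.
    from gerrychain.random import random
except ImportError:
    # Fallback for illustration only, so the sketch runs without GerryChain.
    import random


def pick_random_nodes(nodes, k):
    """Hypothetical helper: choose k nodes using the shared random module."""
    # Sort first so the result does not depend on set iteration order.
    return random.sample(sorted(nodes), k)


nodes = {"a", "b", "c", "d", "e"}

random.seed(12345678)
first = pick_random_nodes(nodes, 3)

random.seed(12345678)
second = pick_random_nodes(nodes, 3)

print(first == second)  # True: the same seed gives the same draws
```

Sorting the nodes before sampling is a deliberate choice here: it keeps the draw independent of how Python happens to order the set in memory.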
In addition to the randomness provided by the random module, Python uses a random seed for its hashing algorithm, which affects how objects are stored in sets and dictionaries. This seed must be the same on every run in order for GerryChain runs to be repeatable.
The way to accomplish this is to set the environment variable PYTHONHASHSEED=0.
If you are using conda for managing packages, dependencies, and environments, you can save environment variables in your conda environment.
Otherwise, in macOS or Linux environments you can accomplish this by running the command
    export PYTHONHASHSEED=0
in the Terminal or bash shell before running your code.
In a Windows 10 environment using PowerShell, you can accomplish this by running
    $env:PYTHONHASHSEED=0
before running your code.
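Python reads its hash seed from the PYTHONHASHSEED environment variable when the interpreter starts. The sketch below is one way to check the effect for yourself: it launches fresh Python processes with a fixed seed and compares the hash of the same string across them. The helper name string_hash_in_fresh_interpreter is illustrative, not part of GerryChain.

```python
import os
import subprocess
import sys


def string_hash_in_fresh_interpreter(hash_seed):
    """Start a new Python process with PYTHONHASHSEED set and return hash('gerrychain')."""
    # Copy the current environment and override only the hash seed.
    env = dict(os.environ, PYTHONHASHSEED=str(hash_seed))
    result = subprocess.run(
        [sys.executable, "-c", "print(hash('gerrychain'))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# With a fixed seed, separate interpreter runs agree on string hashes.
print(string_hash_in_fresh_interpreter(0) == string_hash_in_fresh_interpreter(0))  # True
```

The variable must be set before the interpreter starts; changing os.environ from inside a running script does not change that process's hash seed.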