From cd10351a677c56d569c8819384a514eb580344b1 Mon Sep 17 00:00:00 2001
From: Thomas Kluyver <thomas@kluyver.me.uk>
Date: Wed, 25 Aug 2021 14:10:24 +0100
Subject: [PATCH] Update README about running & re-running calibration

---
 README.rst | 75 +++++++++++++++++++++++++-----------------------------
 1 file changed, 35 insertions(+), 40 deletions(-)

diff --git a/README.rst b/README.rst
index 28a68631c..44369bdf1 100644
--- a/README.rst
+++ b/README.rst
@@ -349,56 +349,51 @@ unmodified sections of code being looked at.
 Python Scripted Calibration
 ***************************
 
-**Do not run this on the Maxwell gateway**. Rather, ``salloc`` a node for
-yourself first:
+To launch correction or characterisation jobs, run something like this::
 
-.. code::
-
-  salloc -p exfel/upex -t 01:00:00
+    xfel-calibrate AGIPD CORRECT \
+    --in-folder /gpfs/exfel/exp/SPB/202131/p900215/raw --run 591 \
+    --out-folder /gpfs/exfel/data/scratch/kluyvert/agipd-calib-900215-591 \
+    --karabo-id SPB_DET_AGIPD1M-1 --karabo-id-control SPB_IRU_AGIPD1M1 \
+    --karabo-da-control AGIPD1MCTRL00 --modules 0-4
 
-where `-p` gives the partition to use: exfel **or** upex and `-t` the duration
-the node should be allocated. Then ``ssh`` onto that node.
+The first two arguments refer to a *detector* and an *action*, and are used to
+find the appropriate notebook to run. Most of the optional arguments are
+translated into parameter assignments in the notebook, e.g. ``--modules 0-4``
+sets ``modules = [0, 1, 2, 3]`` (the end of the range is excluded).
 
-Then activate your environment as described above (or just continue if you are
-not using a venv).
+This normally submits jobs to Slurm to do the work; you can check their status
+with ``squeue --me``. If you are working on a dedicated node, you can use the
+``--no-cluster-job`` option to run all the work on that node instead.
 
-If running headless (i.e. without X forwarding), be sure to set
-``MPLBACKEND=Agg``, via:
-
-.. code::
+The notebooks will be used to create a PDF report after the jobs have run.
+This will be placed in ``--out-folder`` by default, though it can be overridden
+with the ``--report-to`` option.
 
-  export MPLBACKEND=Agg
-
-Then start an ``ipcluster``. If you followed the steps above this can be done
-via:
-
-.. code::
-
-  ipcluster start --n=32
-
-
-Finally run the script:
-
-.. code::
+Reproducing calibration
+=======================
 
-  python3 calibrate.py --input /gpfs/exfel/exp/SPB/201701/p002012/raw/r0100 \
-    --output ../../test_out --mem-cells 30 --detector AGIPD --sequences 0,1
+The information needed to re-run the calibration code is saved in a directory
+next to the PDF report, with a name starting with ``slurm_out_``. It can be run
+as a new job like this::
 
-Here ``--input`` should point to a directory of ``RAW`` files for the detector
-you are calibrating. They will be output into the folder specified by
-``--output``, which will have the run number or the last folder in the hierarchy
-of the input appended. Additionally, you need to specify the number of
-``--mem-cells`` used for the run, as well as the ``--detector``. Finally, you
-can optionally specify to only process certain ``--sequences`` of files,
-matching the sequence numbers of the `RAW` input. These should be given as a
-comma-separated list.
+    python3 -m xfel_calibrate.repeat \
+    /gpfs/exfel/data/scratch/kluyvert/agipd-calib-900215-591/slurm_out_AGIPDOfflineCorrection \
+    --out-folder /gpfs/exfel/data/scratch/kluyvert/agipd-calib-900215-591-repro
 
-Finally, there is a ``--no-relgain`` option, which disables relative gain
-correction. This can be useful while we still further characterize the detectors
-to provide accurate relative gain correction constants.
+The information in the directory includes a Pip ``requirements.txt`` file
+listing the packages installed when this task was first set up. For better
+reproducibility, use this to create a similar environment, and pass
+``--python path/to/bin/python`` to run notebooks in that environment.
+Future work will automate this step.
 
-You'll get a series of plots in the output directory as well.
+.. note::
 
+   Our aim here is to run the same code as before, with the same parameters,
+   in a similar software environment. This should produce essentially the same
+   results, but not necessarily identical ones. The code which runs may
+   use external resources or involve some randomness, and even running on
+   different hardware may cause small differences.
 
 Appendix
 ********
-- 
GitLab