Store calibration_metadata.yml primarily in work directory

added Waiting for review label

I have a separate branch where I have written a script to go back and fix calibration_metadata.yml files already affected by this issue. My idea is that we'll run this on all proposals from October 2021 up until we deploy the change in this MR, to get valid metadata where there were multiple detectors being corrected.

https://git.xfel.eu/detectors/pycalibration/-/blob/calmeta-fixup/calmeta-fixup.py

The script tries to fix:

- Parameters (taken from a notebook file using nbparameterise)
- Notebook title & author (extracted from Markdown in notebook using code from calibrate.py)
- Report path (taken from run_calibrate.sh by finding --report-to parameter)
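As an illustration of the last item, here is a minimal sketch of pulling a --report-to value out of a shell script. It is not the code from calmeta-fixup.py: the function name `find_report_path` and the example script contents are hypothetical, and the real run_calibrate.sh may be laid out differently.

```python
import shlex


def find_report_path(script_text):
    """Return the value given to --report-to in a shell script, or None."""
    # Join backslash-continued lines so the command is tokenised as one line.
    joined = script_text.replace("\\\n", " ")
    # comments=True makes shlex skip '#' comments (including the shebang).
    tokens = shlex.split(joined, comments=True)
    for i, tok in enumerate(tokens):
        if tok == "--report-to" and i + 1 < len(tokens):
            return tokens[i + 1]
        if tok.startswith("--report-to="):
            return tok.split("=", 1)[1]
    return None


# Hypothetical script contents, for illustration only.
example = """#!/bin/bash
# made-up paths, not from a real proposal
xfel-calibrate AGIPD CORRECT \\
    --in-folder /path/to/raw --out-folder /path/to/proc \\
    --report-to /path/to/report/AGIPD_correct
"""

print(find_report_path(example))
```

Tokenising with shlex rather than using a regex means quoted paths containing spaces should still come out as one value.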

Most of the other stuff saved in the yml file should either be consistent between detectors (like the Python version used), or relate to options I don't think we're using (like the --not-reproducible flag for running offline correction online). There are a couple of exceptions:

- If REMI processing happens alongside another detector's corrections, one of them may get a bogus 'concurrency' section. @schmidtp if you know of runs where this might be the case, we can check for it and make a rough fix.
- Any detector corrected alongside AGIPD may have got the retrieved constants for AGIPD in the YAML file. This shouldn't be a practical issue, because the other corrections don't yet look in that file for constants, and would probably ignore the 'AGIPD00' names even if they did check.

added 1 commit

c6feeae2 - Use . for metadata folder in notebooks

changed milestone to %3.5.2

Since you've looked at this while working on the PR, could you add a small summary of where things get saved at the various stages of processing?

While reading through the code there were some spots where I got a bit confused about what the relative and temporary directories are actually relative to in the different scenarios of execution via the webservice, a call to xfel-calibrate, a call to repeat, or execution of code inside the notebooks.

Maybe this is so simple that it doesn't need to be explained, but as I understand it: the CLI tools' CWD is the directory you were in when you executed them. The CLI then executes notebooks, where the CWD is where the notebook is running(?). But if you execute a Python script that lives in a different directory, the CWD is still where you were when you executed the script. For me, keeping that in mind and working out what runs what, and what the CWD is at that point, is pretty confusing.

Yup, no problem.

The basic picture is that for code run in notebooks, the CWD should always be the directory where that notebook file is. So when you run the notebooks interactively, it's a subfolder in this repo. Both xfel-calibrate and repeat work by copying notebooks to a new directory, which is then also the CWD for running them. We variously call this the 'work dir', 'slurm out', 'metadata dir', run_tmp_path, and probably half a dozen other names. I'll do an MR at some point to clean up the naming!
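To make the copy-then-run pattern concrete, here's a rough sketch. A plain Python script stands in for a notebook, and `run_in_work_dir` is a hypothetical helper, not how xfel-calibrate or repeat actually launch things. It only shows that the CWD seen by the copied file is the work dir, not the folder the original lives in.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def run_in_work_dir(source: Path, work_dir: Path) -> str:
    """Copy `source` into `work_dir`, run the copy there, return its stdout."""
    work_dir.mkdir(parents=True, exist_ok=True)
    copied = work_dir / source.name
    copied.write_text(source.read_text())
    result = subprocess.run(
        [sys.executable, copied.name],
        cwd=work_dir,  # the child process starts with the work dir as its CWD
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp).resolve()
    # Stand-in for a notebook living in a subfolder of the repo.
    source = tmp / "repo" / "step.py"
    source.parent.mkdir()
    source.write_text("import os; print(os.getcwd())\n")

    work_dir = tmp / "work_dir"
    reported_cwd = Path(run_in_work_dir(source, work_dir))
    # Compare while the temporary directories still exist.
    cwd_is_work_dir = reported_cwd.samefile(work_dir)

print(cwd_is_work_dir)
```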

When you run xfel-calibrate, almost everything that gets created is in this 'work dir'. The exceptions are:

- The HDF5 output files, which go in out_folder
- The PDF report, which ends up adjacent to the work dir

There's a minor anomaly with calibration_metadata.yml. It's mostly created by the machinery in xfel-calibrate, but the AGIPD notebooks also want to write it. If you run those notebooks interactively, without xfel-calibrate, they'll use the specified out_folder for their YAML file to reduce the risk of accidentally carrying over the wrong constants. I don't especially like that, but I haven't found a solution I do like.
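The fallback described here could look roughly like the sketch below. The `metadata_folder` parameter name and the empty-string default are assumptions for illustration, and the actual notebook code may differ. The idea is that xfel-calibrate would pass "." (the work dir, which is the CWD), while an interactive run leaves it empty and falls back to out_folder.

```python
from pathlib import Path


def metadata_yaml_path(out_folder, metadata_folder=""):
    """Pick where calibration_metadata.yml is read from and written to.

    metadata_folder is a hypothetical notebook parameter: "." when run via
    xfel-calibrate, empty when the notebook is run interactively.
    """
    base = Path(metadata_folder) if metadata_folder else Path(out_folder)
    return base / "calibration_metadata.yml"


# Run by xfel-calibrate: relative to the CWD, i.e. the work dir.
print(metadata_yaml_path("/data/proc", metadata_folder="."))
# Run interactively: next to the output files.
print(metadata_yaml_path("/data/proc"))
```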

Sweet, thanks for clearing it up! That's about what I thought.

LGTM

approved this merge request

merged

mentioned in commit 93dac475

mentioned in merge request !674 (merged)

removed Waiting for review label

mentioned in merge request !680 (merged)
