Support for saving metadata fragments & merging into calibration_metadata.yml

Perhaps when merging, we should warn if there are conflicting values in the metadata?

Yes, this would be useful. Also if possible show the conflicting pieces or leave the conflicting file for easy debugging.

added 2 commits

c7bcd46f - Check for conflicts when merging metadata
02e1d3ab - Add test for recursive_update()

Compare with previous version

OK, now if there are conflicts in the metadata when merging, it will print a message and leave the conflicting file (whichever one it hadn't already merged) in the directory for investigation.

mentioned in merge request !767 (merged)

added Testing Waiting for review labels

mentioned in merge request !775 (merged)

Given the uncertainties for 2), I gravitate towards 1). It would have the additional benefit of preserving the result for each correction notebook if something goes awry (who knew that working diligently for reproducibility also yields more debugging data).

I understand the correction notebook would then use the fragment API to save the constants they want to use, and the finalize job gathers those fragments. In an effort to DRY the correction notebooks a bit, it would be great if this is a single line for them.

In an effort to DRY the correction notebooks a bit, it would be great if this is a single line for them.

As this PR stands, the correction notebooks would do something like:

CalibrationMetadata(metadata_folder or output_folder).add_fragment({...})

Though the added fragment then wouldn't be visible through the CalibrationMetadata dict-like interface, until the finalize job merges them all together.

Do you happen to know where this is used? It's a cute idea, but I'm not convinced to be really necessary.

I'm a bit confused - it's a new addition in this MR, so it's not used anywhere yet. We can easily make the API different if you prefer.

I'm sorry, I meant the dict-like interface part and how much would change in those sites.

Given with this MR we can get rid of pre-notebooks, the question is how much correction notebooks themselves reference the CalibrationMetadata and how we can make those access include the fragments.

I've just searched through the correction & summary notebooks, and it appears that the only information they read from the metadata YAML file is the constants found by the pre notebooks, so it should be straightforward to remove the pre notebooks and have the correction notebooks do the lookup themselves.

The AGIPD correction has a custom mechanism to let each correction notebook write its constant timestamps to a separate file, and then pull them together in the summary notebook. This is only for when the pre notebook isn't used, so it's not running normally. We can probably replace it with the mechanism in this PR.

So I don't think there's any great need for the correction notebooks to get back the data they've added to CalibrationMetadata. But perhaps it's confusing anyway that the addition is 'invisible' from the CalibrationMetadata API? I could either make it visible with some more code, building on ChainMap, or change the add_fragment method to a separate function so it's clearer what's going on.

I think I would prefer if the notebooks themselves don't make particular use of the calibration_metadata.yaml file themselves, if it can be avoided. A solution using a ChainMap sounds like the right choice technically, but quite possibly you can safe yourself the time to do that.

Slightly tangential to this discussion, I would really like this metadata to be actually part of the EXDF file... e.g. as a virtual source pointing to the constants. I do not want to replace the .yaml file, it'll stay for compatibility and it's neat mechanism to rest reproducibility on - but it's not particularly accessible or findable by users.

I think I would prefer if the notebooks themselves don't make particular use of the calibration_metadata.yaml file themselves, if it can be avoided.

I think that can quite easily be avoided once this is in. To be clear, are you asking for a change here to separate the add_fragment function from the CalibrationMetadata class?

Slightly tangential to this discussion, I would really like this metadata to be actually part of the EXDF file... e.g. as a virtual source pointing to the constants.

Especially as we move towards calibrated data as a separate source, rather than just replacing the raw data, I think we can expose a lot of these things in RUN as 'properties' of the correction 'device'.

added 1 commit

59ed4a1b - Include Slurm job ID in temporary YAML filenames

Compare with previous version

added 1 commit

d2c92ce2 - Include Slurm job ID in temporary YAML filenames

Compare with previous version

added 1 commit

7b734c8d - Gather metadata fragments in finalize job

Compare with previous version

added 199 commits

7b734c8d...e73a5bc9 - 194 commits from branch master
f86be23d - Add support for saving metadata fragments & merging into calibration_metadata.yml
513b600c - Check for conflicts when merging metadata
a43920b9 - Add test for recursive_update()
cb9ba471 - Include Slurm job ID in temporary YAML filenames
f17abcf1 - Gather metadata fragments in finalize job

Compare with previous version

mentioned in merge request !821 (merged)

Thank you @kluyvert, LGTM

merged

mentioned in commit 4922b975

removed Waiting for review label

changed milestone to %3.10.0

Support for saving metadata fragments & merging into calibration_metadata.yml

Description

How Has This Been Tested?

Types of changes

Checklist:

Reviewers

Activity

Support for saving metadata fragments & merging into calibration_metadata.yml

Description

How Has This Been Tested?

Types of changes

Checklist:

Reviewers

Merge request reports

Activity