Reset calibration_metadata.yml before saving the initial calibration metadata

Merged Karim Ahmed requested to merge fix/saving_malformed_calibration_metadata_yml into master
2 unresolved threads

An issue was reported with repeated dark processing: the second dark request for AGIPD fails.

I ran into the same issue over the last few days while testing LPD, DSSC and AGIPD for our next deployment.

Description

When a dark request comes in, we create a new calibration_metadata.yml file. At the end of DSSC, LPD and AGIPD dark processing, the Overallmodules_darks_summary notebook adds a modules-mapping key to it.

When the same run is then reprocessed into the same out-folder, the calibration_metadata.yml file is not created properly.

https://git.xfel.eu/detectors/pycalibration/-/blob/master/src/xfel_calibrate/calibrate.py#L659

I don't really understand why this happens, given that we open the yml file with "w" when creating it.

https://git.xfel.eu/detectors/pycalibration/-/blob/master/src/cal_tools/tools.py#L794

This is just a workaround: reset the file by saving the empty initialized dict before saving the metadata from calibrate.py, until I understand better why this happens.
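A minimal sketch of the workaround described above, not the actual change in calibrate.py. The function name `reset_then_save` and the `dump` parameter are hypothetical; `json.dump` is used as a stand-in serializer so the sketch runs with the standard library only, and passing `yaml.safe_dump` from PyYAML as `dump` should write YAML instead.

```python
import json
from pathlib import Path


def reset_then_save(path, metadata, dump=json.dump):
    """Reset the metadata file to an empty mapping, then write the real content.

    Hypothetical helper: `dump` is any callable taking (data, file_object).
    """
    path = Path(path)

    # Step 1: overwrite whatever an earlier dark request left behind with
    # an empty mapping. "{}" is an empty mapping in both JSON and YAML.
    with path.open("w") as fd:
        fd.write("{}\n")

    # Step 2: reopen with "w" (truncating again) and write the initial
    # calibration metadata for this request.
    with path.open("w") as fd:
        dump(metadata, fd)
```

On a healthy filesystem step 1 should be redundant, since opening with "w" already truncates the file; the sketch only illustrates the order of operations of the workaround.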

How Has This Been Tested?

Relevant Documents (optional)

I have added the malformed yml files. Line 84 is the line with the issue.

calibration_metadata_dssc.yml

calibration_metadata_lpd.yml

Types of changes

Checklist:

  • Add a comment pointing to this discussion before merging

Reviewers

@kluyvert @schmidtp

Edited by Karim Ahmed

Activity

  • Karim Ahmed changed the description

    • In the meeting today, I think you were able to reproduce the problem consistently? Can you share the commands you ran to do that?

    • So basically any dark request pointing to an out_folder that already contains previous dark processing should do.

      This is the one that I was using.

      xfel-calibrate AGIPD DARK --out-folder /gpfs/exfel/data/scratch/ahmedk/test/deployed_3.5.0a1/HED_DET_AGIPD500K2G/HED_DET_AGIPD500K2G-DARK-ADAPTIVE --in-folder /gpfs/exfel/exp/HED/202131/p900228/raw --run-high 25 --run-med 26 --run-low 27 --sequences 1 --karabo-id-control HED_EXP_AGIPD500K2G --karabo-da-control AGIPD500K2G00 --karabo-id HED_DET_AGIPD500K2G --slurm-mem 750 --h5path-ctrl '/CONTROL/{}/MDL/FPGA_COMP' --report-to HED_DET_AGIPD500K2G-DARK-ADAPTIVE_220210_135604 --slurm-name HED_DET_AGIPD500K2G-DARK-ADAPTIVE

      You can also see the slurm logs for the reported runs in the xcal temp folder:

      /home/xcal/deployments/development/git.xfel.eu/detectors/pycalibration/current/temp/slurm_out_AGIPD_DARK_t220210_122248

      /home/xcal/deployments/development/git.xfel.eu/detectors/pycalibration/current/temp/slurm_out_AGIPD_DARK_t220209_184136

      Edited by Karim Ahmed
    • Resolved by Karim Ahmed

      I managed to reproduce this once, then I started fiddling with the code to figure out what might be happening, and it disappeared. I tried resetting back to the code in master, and so far it still doesn't want to happen again.

      I'm almost wondering if this is somehow an issue in GPFS - if the file contents and the metadata telling it what size it is get out of sync. I can't imagine it would go wrong on something as simple as truncating and overwriting a file, but I also can't see how it could happen if the filesystem was working properly. :confused:
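One way to probe the size-versus-contents hypothesis is to compare what stat() reports with what a read actually returns. This is only a diagnostic sketch: the function name `check_size_consistency` is hypothetical, and treating NUL bytes as a symptom is an assumption (a file whose recorded size exceeds the data written to it would normally read back as zero padding), not something confirmed for GPFS in this thread.

```python
import os


def check_size_consistency(path):
    """Compare the size reported by stat() with the bytes actually read."""
    reported = os.stat(path).st_size
    with open(path, "rb") as fd:
        data = fd.read()
    return {
        "reported_size": reported,
        "bytes_read": len(data),
        # False would mean the metadata and the readable contents disagree.
        "consistent": reported == len(data),
        # NUL bytes are unexpected in a text YAML file.
        "has_nul_bytes": b"\x00" in data,
    }
```

Running this on the malformed calibration_metadata.yml right after the failing request, and again a little later, might show whether the file changes once the filesystem settles.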

  • Karim Ahmed changed title from if new save the empty dictionary to reset calibration_metadata.yml to Reset calibration_metadata.yml before saving the initial calibration metadata

  • Karim Ahmed changed the description

  • Karim Ahmed changed the description

  • Karim Ahmed added 1 commit

    • fff5a3f2 - add a comment pointing to the MR discussion

  • Thank you for the review and thanks @kluyvert for testing and communicating the issue with ITDM. Hopefully we resolve this soon.

    Merging the MR after adding a link to this discussion in the code.

    Edited by Karim Ahmed
  • merged

  • Karim Ahmed mentioned in commit 955e86ac
