Feat/add badpixelff jungfrau

Merged David Hammer requested to merge feat/add_badpixelff_jungfrau into master

Overview

This MR adds retrieval and saving of flat-field badpixel masks (BadPixelsFF) to the Jungfrau correction notebook, as required in https://git.xfel.eu/gitlab/detectors/calibration_workshop/issues/242. Additional refactoring will be included, subject to time constraints.

Tasks

The main focus of this MR is adding BadPixelsFF. There are some refactoring steps which would be fairly obvious to include when modifying the notebook - if time allows. I've listed them below just to keep track / in case someone has opinions on style.

  • Replace ipyparallel with multiprocessing
  • Add BadPixelsFF
    • Load if available
    • OR with existing badpixel mask
    • Investigate data format inconsistency observed in test data (currently moving axis to compensate)
  • Refactor get_constants_for_module
    • Make more multiprocessing-appropriate
    • DRY
    • (Optional) move to library
    • Investigate debug data format inconsistency observed in test data
  • Introduce pathlib
  • Refactor correction cell
    • Use pathlib
    • Untangle nesting
  • Test
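
For the pathlib item, a minimal sketch of how the raw sequence files could be assembled from the notebook parameters. The helper name `sequence_files` is hypothetical and not part of the notebook; the `r{run:04d}` run folder is inferred from the raw file path quoted in the h5path discussion.

```python
from pathlib import Path


def sequence_files(in_folder, run, karabo_da, sequences,
                   path_template="RAW-R{:04d}-{}-S{:05d}.h5"):
    """Return the expected raw file paths for the given sequences.

    Hypothetical helper: the run folder layout (r0014 for run 14) is an
    assumption based on the file path quoted in this MR.
    """
    run_folder = Path(in_folder) / f"r{run:04d}"
    return [run_folder / path_template.format(run, karabo_da, seq)
            for seq in sequences]


files = sequence_files(
    "/gpfs/exfel/exp/SPB/202002/p002697/raw",
    run=14,
    karabo_da="JNGFR01",
    sequences=[0, 1, 2, 3],
)
for f in files:
    # Path objects expose the file name and parent folder without string slicing
    print(f.parent.name, f.name)
```

Compared with string concatenation, the `/` operator and attributes such as `.name` and `.parent` remove most of the manual path formatting in the correction cell.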

Issues

As mentioned under Tasks, I ran into data format issues when trying to run with some test data. These have been resolved for now.

Resolved: Non-matching shapes between badpixel masks for dark and FF

The two badpixel masks found have slightly different shapes. Specifically, the BadPixelsDark mask has shape (512, 1024, 1, 3) whereas the BadPixelsFF one has shape (1024, 512, 1, 3). Update: as it turns out, one is stored as (rows, columns, rest) whereas the other is stored as (columns, rows, rest). Until that changes, swapping the first two axes of one of them suffices (in the code, mask_ff has its axes moved and is then OR'ed with the "existing" mask from darks).
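
A minimal sketch of the axis swap and OR step, using zero-filled stand-in arrays with the shapes reported above. The flag values and pixel positions are arbitrary and for illustration only; the actual notebook code may differ.

```python
import numpy as np

# Stand-ins for the masks retrieved from the calibration database
mask_dark = np.zeros((512, 1024, 1, 3), dtype=np.uint32)  # (rows, columns, rest)
mask_ff = np.zeros((1024, 512, 1, 3), dtype=np.uint32)    # (columns, rows, rest)

# Arbitrary example flags (not real bad-pixel codes)
mask_dark[10, 20, 0, 0] = 0x1  # row 10, column 20
mask_ff[20, 10, 0, 0] = 0x2    # same pixel, stored column-first
mask_ff[700, 5, 0, 1] = 0x4    # column 700, row 5, gain index 1

# Move axis 0 to position 1 so both masks are (rows, columns, rest)
mask_ff_aligned = np.moveaxis(mask_ff, 0, 1)

# A pixel is flagged if either mask flags it
mask = mask_dark | mask_ff_aligned
print(mask.shape)
```

`np.moveaxis` returns a view, so no copy of the flat-field mask is made before the OR.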

Resolved: Seemingly wrong h5path loading image data

The variable h5path describes where to find image data in the h5 files. I had not sufficiently understood how it is generated or how it had changed between cycles. With h5path and receiver_id set appropriately, no complicated changes are needed.
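
To illustrate how receiver_id affects the result, here is a sketch that formats both receiver templates used in the tests. The path layout is an assumption based on the paths observed in the raw file, and `build_h5path` and the module index 1 are hypothetical; how the notebook derives the index is not shown here.

```python
# Assumed layout, based on the paths observed in the raw h5 file
H5PATH_TEMPLATE = "/INSTRUMENT/{karabo_id}/DET/{receiver}:daqOutput/data"


def build_h5path(karabo_id, receiver_id, module_index):
    """Fill the receiver template, then the full path (hypothetical helper)."""
    receiver = receiver_id.format(module_index)
    return H5PATH_TEMPLATE.format(karabo_id=karabo_id, receiver=receiver)


# "Old" receiver template: points to a path that is absent from the SPB file
old_style = build_h5path("SPB_IRDA_JF4M", "RECEIVER-{}", 1)
# "New" receiver template: matches the path found in the SPB file
new_style = build_h5path("SPB_IRDA_JF4M", "JNGFR{:02d}", 1)

print(old_style)
print(new_style)
```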

How this has been tested

After looking around for modules which have BadPixelsFF calibration constants available, I tested during development with the following parameters set in the first code cell:

in_folder = "/gpfs/exfel/exp/SPB/202002/p002697/raw" # the folder to read data from, required
out_folder =  "/gpfs/exfel/data/scratch/hammerd/issue-242"  # the folder to output to, required
sequences = [0, 1, 2, 3] # sequences to correct, set to -1 for all, range allowed
run = 14 # runs to process, required

karabo_id = "SPB_IRDA_JF4M" # karabo prefix of Jungfrau devices
karabo_da = ['JNGFR01'] # data aggregators
receiver_id = "RECEIVER-{}" # inset for receiver devices
receiver_control_id = "CONTROL" # inset for control devices
path_template = 'RAW-R{:04d}-{}-S{:05d}.h5'  # template to use for file name, double escape sequence number
karabo_da_control = "JNGFRCTRL00" # file inset for control data

Test with "new" variables from notebook (run where BadPixelsFF gets found):

TIMESTAMP=$(date "+%Y-%m-%d-%H-%M")
MYNAME=$(basename "$0")
xfel-calibrate JUNGFRAU CORRECT \
        --slurm-name test-242 \
        --in-folder /gpfs/exfel/exp/SPB/202002/p002697/raw \
        --out-folder "/gpfs/exfel/data/scratch/hammerd/issue-242/$MYNAME-$TIMESTAMP-data" \
        --report-to "/gpfs/exfel/data/scratch/hammerd/issue-242/$MYNAME-$TIMESTAMP-report" \
        --run 14 \
        --karabo-id SPB_IRDA_JF4M \
        --karabo-da JNGFR01 \
        --receiver-id "JNGFR{:02d}" \
        --karabo-da-control JNGFRCTRL00 \
        --db-module Jungfrau_M275 \
        --sequences-per-node 1

run-test-1.sh-2021-02-18-17-05-report.pdf

Test with "old" variables from notebook (FXE run where BadPixelsFF does not get found):

TIMESTAMP=$(date "+%Y-%m-%d-%H-%M")
MYNAME=$(basename "$0")
xfel-calibrate JUNGFRAU CORRECT \
        --slurm-name test-242 \
        --in-folder /gpfs/exfel/exp/FXE/201901/p002210/raw \
        --out-folder "/gpfs/exfel/data/scratch/hammerd/issue-242/$MYNAME-$TIMESTAMP-data" \
        --report-to "/gpfs/exfel/data/scratch/hammerd/issue-242/$MYNAME-$TIMESTAMP-report" \
        --run 249 \
        --karabo-id FXE_XAD_JF1M \
        --karabo-da JNGFR01 \
        --receiver-id "RECEIVER-{}" \
        --karabo-da-control JNGFR01 \
        --db-module Jungfrau_M233 \
        --sequences-per-node

run-test-2.sh-2021-02-18-17-06-report.pdf

Currently, get_constant_from_db_and_time (by way of calibrationDBRemote) writes complaints to stdout when it fails to find a constant. @ahmedk what is the preferred way to avoid this? Or rather: what would be the best way to let the user know that no BadPixelsFF constants were found? Right now, the report shows that the creation time was None, but the scary-sounding "error" from calibrationDBRemote also shows up.
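
One possible approach, sketched under assumptions: capture whatever the retrieval prints and replace it with a calmer note in the report. `fake_retrieve` is a hypothetical stand-in and not the real get_constant_from_db_and_time; its `(constant, creation_time)` return value is also an assumption. Note that `contextlib.redirect_stdout` only catches output written through Python's `sys.stdout`.

```python
import contextlib
import io


def retrieve_quietly(retrieve, *args, **kwargs):
    """Call `retrieve` and capture anything it prints to Python's stdout."""
    captured = io.StringIO()
    with contextlib.redirect_stdout(captured):
        result = retrieve(*args, **kwargs)
    return result, captured.getvalue()


def fake_retrieve(constant_name):
    # Hypothetical stand-in that mimics a noisy failed lookup
    print(f"Error: no constant found for {constant_name}")
    return None, None


(constant, when), log = retrieve_quietly(fake_retrieve, "BadPixelsFF")
if constant is None:
    message = "No BadPixelsFF constant found; using the dark badpixel mask only."
else:
    message = f"BadPixelsFF constant created at {when}"
print(message)
```

The captured text in `log` stays available, so it could still be written to a debug log instead of the report.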

Reviewers

@ahmedk @danilevc

Edited by David Hammer

Activity

  • David Hammer changed the description
  • David Hammer changed the description
  • David Hammer changed the description
  • David Hammer marked the checklist item Refactor with pathlib as completed
  • David Hammer added 1 commit

    • a5aa371e - Add pathlib, improve formatting

    • Resolved by David Hammer

      @hammerd regarding the mismatching shapes, it is due to the different conventions used to generate the offset map and the gain map: the first one uses (row, column, cell, gain), the second (column, row, cell, gain). Unfortunately this double standard has propagated in the database long before the issue was raised, and it was handled in both online and offline pipelines with ad hoc solutions (i.e. axes swaps). This problem should be addressed sooner or later, but probably not now.

      Edited by Marco Ramilli
  • David Hammer added 1 commit
  • David Hammer added 1 commit
  • David Hammer added 1 commit
  • David Hammer added 1 commit
  • David Hammer changed the description
  • David Hammer changed the description
  • David Hammer marked the checklist item DRY as completed
  • David Hammer marked the checklist item (Important) debug data format inconsistency observed in test data (currently moving axis to compensate) as completed
  • David Hammer added 1 commit
  • David Hammer changed the description

    • Resolved by David Hammer

      Seemingly wrong h5path loading image data

      The variable h5path describes where to find image data in the h5 files. With the formatting as programmed in the existing notebook, the variable gets set to a path that does not exist in /gpfs/exfel/exp/SPB/202002/p002697/raw/r0014/RAW-R0014-JNGFR01-S00000.h5:

      /INSTRUMENT/SPB_IRDA_JF4M/DET/RECEIVER-1:daqOutput/data

      which instead contains the path:

      /INSTRUMENT/SPB_IRDA_JF4M/DET/JNGFR01:daqOutput/data

      I remember there were some changes to the paths inside the h5 files for several detectors, such as AGIPD and Jungfrau.

      I would check the YAML file for this proposal in https://git.xfel.eu/gitlab/detectors/calibration_configurations to see which receiver is expected. If there is no YAML file for the specific proposal and cycle (as is the case for 202002/p002697), the configuration is read from default.yaml (data-mapping key).
