
WIP: Validate sequences prior to correction

Cyril Danilevski requested to merge fix/empty_sequences_bug into master

Description

We ran into an issue where sequences without data would raise a ValueError and make a Slurm job hang. The following is an attempt at fixing that.

The dataset /gpfs/exfel/exp/SPB/202030/p900138/raw/r0167 has no data in sequence 2:

/gpfs/exfel/exp/SPB/202030/p900138/raw/r0167/RAW-R0167-AGIPD00-S00002.h5 (1 attributes)
├INSTRUMENT
│ └SPB_DET_AGIPD1M-1
│   └DET
│     └0CH0:xtdf
│       ├header (3 attributes)
│       │ ├magicNumberBegin     [int8: 0 × 8]
│       │ ├majorTrainFormatVersion      [uint32: 0]
│       │ ├minorTrainFormatVersion      [uint32: 0]
│       │ ├trainId      [uint64: 0]
│       │ ├linkId       [uint64: 0]
│       │ ├pulseCount   [uint64: 0]
│       │ ├dataId       [uint64: 0]
│       │ └reserved     [uint8: 0 × 16]
│       ├image (3 attributes)
│       │ ├data [uint16: 0 × 2 × 512 × 128]
│       │ ├pulseId      [uint64: 0 × 1]
│       │ ├status       [uint16: 0 × 1]
│       │ ├length       [uint32: 0 × 1]
│       │ ├cellId       [uint16: 0 × 1]
│       │ └trainId      [uint64: 0 × 1]
│       ├detector (3 attributes)
│       │ ├data [uint8: 0 × 5408]
│       │ └trainId      [uint64: 0]
│       └trailer (3 attributes)
│         ├checksum     [int8: 0 × 16]
│         ├magicNumberEnd       [int8: 0 × 8]
│         ├status       [uint64: 0]
│         └trainId      [uint64: 0]

As this is the last sequence in the folder, I theorize that acquisition was stopped just before any data became available for this sequence.

Here, we aim to curb this issue by checking that sequences contain data before calibrating them: when the sequences are not explicitly defined, we check each file for data.
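To make the idea concrete, here is a minimal sketch of such a check. It is not the code in this merge request: the names `file_has_data` and `sequences_with_data` are hypothetical, and it assumes that a zero-length `image/trainId` dataset (as in the listing above) is a sufficient sign that a file is empty.

```python
from typing import Callable, Dict, List

# Dataset path taken from the file listing above (AGIPD-specific).
TRAIN_ID_PATH = "INSTRUMENT/SPB_DET_AGIPD1M-1/DET/0CH0:xtdf/image/trainId"


def file_has_data(path: str, dataset: str = TRAIN_ID_PATH) -> bool:
    """Return True if `dataset` exists in the file and has at least one entry."""
    # Imported lazily so the filtering logic below can be exercised
    # on machines where h5py is not installed.
    import h5py

    with h5py.File(path, "r") as f:
        return dataset in f and f[dataset].shape[0] > 0


def sequences_with_data(
    files_by_sequence: Dict[int, List[str]],
    has_data: Callable[[str], bool] = file_has_data,
) -> List[int]:
    """Keep a sequence only if every one of its files contains data."""
    return sorted(
        seq
        for seq, files in files_by_sequence.items()
        if files and all(has_data(f) for f in files)
    )
```

The `has_data` argument is only there so the selection logic can be tried with a stand-in predicate instead of real files.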

This has two shortcomings.

  • This fix is AGIPD-specific; it would be better to have a generic function.
  • If one file does not have data, the whole sequence is skipped, even for other modules of a detector.

I would prefer for a file to be skipped if there's no data in it and continue with the other files in the bunch, but it's proven tricky to do.
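For illustration only, the preferred per-file behaviour could look roughly like the sketch below. `drop_empty_files` and the `has_data` predicate are hypothetical names, and the sketch deliberately ignores the tricky part, namely how the remaining files are handed on to the correction jobs.

```python
from typing import Callable, Dict, List


def drop_empty_files(
    files_by_sequence: Dict[int, List[str]],
    has_data: Callable[[str], bool],
) -> Dict[int, List[str]]:
    """Remove empty files; drop a sequence only when none of its files remain."""
    kept: Dict[int, List[str]] = {}
    for seq, files in sorted(files_by_sequence.items()):
        non_empty = [f for f in files if has_data(f)]
        if non_empty:
            kept[seq] = non_empty
    return kept
```

With this approach, a sequence in which a single module has an empty file would still be processed for the other modules.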

How Has This Been Tested?

There's an integration test using the "broken" run.

I'm also using the following to run a complete test:

xfel-calibrate AGIPD CORRECT \
    --slurm-mem 750 \
    --slurm-name spb_debug \
    --report-to /gpfs/exfel/data/scratch/danilevc/debug/ \
    --receiver-id {}CH0 \
    --h5path-ctrl /CONTROL/{}/MDL/FPGA_COMP \
    --in-folder /gpfs/exfel/exp/SPB/202030/p900138/raw/ \
    --out-folder /gpfs/exfel/data/scratch/danilevc/debug/ \
    --karabo-id SPB_DET_AGIPD1M-1 \
    --karabo-id-control SPB_IRU_AGIPD1M1 \
    --karabo-da-control AGIPD1MCTRL00 \
    --run 167

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the code style of this project.

Reviewers

@ahmedk @kamile

Edited by Cyril Danilevski
