feat[jungfrau][correct]: use new correct data source and link to old data source
Description
Update the Jungfrau correction to store corrected data under the new CORR data source and link the legacy DET source to it.
Relevant to this issue: https://git.xfel.eu/calibration/planning/-/issues/170
How Has This Been Tested?
CORR-R9034-JNGFR02-S00000.h5
+ NEW: INDEX/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/count
+ NEW: INDEX/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/first
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/adc
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/frameNumber
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/gain
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/mask
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/memoryCell
+ NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/trainId
- MISSING: INDEX/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/count
- MISSING: INDEX/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/first
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/adc
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/frameNumber
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/gain
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/mask
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/memoryCell
- MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/trainId
~ CHANGED: METADATA/dataSources/dataSourceId (Shape: (2,) -> (3,))
~ CHANGED: METADATA/dataSources/deviceId (Shape: (2,) -> (3,))
~ CHANGED: METADATA/dataSources/root (Shape: (2,) -> (3,))
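A listing like the one above can be reproduced by comparing the dataset paths and shapes of a reference file and a newly corrected file. The following is only a minimal sketch: the tool actually used for this comparison is not stated here, `diff_layouts` is a hypothetical helper, and the two dictionaries are a small hand-written excerpt (the `(500,)` shapes are assumed) rather than the real file contents.

```python
def diff_layouts(old, new):
    """Compare two {dataset path: shape} mappings and return report lines."""
    lines = []
    # Paths that exist only in the new file.
    for path in sorted(new.keys() - old.keys()):
        lines.append(f"+ NEW: {path}")
    # Paths that exist only in the reference file.
    for path in sorted(old.keys() - new.keys()):
        lines.append(f"- MISSING: {path}")
    # Paths present in both files whose shape differs.
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            lines.append(
                f"~ CHANGED: {path} (Shape: {old[path]} -> {new[path]})"
            )
    return lines


# Hypothetical excerpts of the reference and the new file layout.
reference = {
    "INDEX/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/count": (500,),
    "METADATA/dataSources/root": (2,),
}
corrected = {
    "INDEX/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/count": (500,),
    "METADATA/dataSources/root": (3,),
}

report = diff_layouts(reference, corrected)
print("\n".join(report))
```

The DET paths are probably reported as MISSING because the comparison does not follow links; that is an assumption, since the legacy DET group still resolves to the CORR data when opened.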
Relevant Documents (optional)
- INSTRUMENT
FROM
└FXE_XAD_JF1M
  ├DET
  │ └JNGFR01:daqOutput
  │   └data
  │     ├adc [float32: 500 × 16 × 512 × 1024]
  │     ├frameNumber [uint64: 500 × 16]
  │     ├gain [uint8: 500 × 16 × 512 × 1024]
  │     ├mask [uint32: 500 × 16 × 512 × 1024]
  │     ├memoryCell [uint8: 500 × 16]
  │     └trainId [uint64: 500]
  └ROIPROC
    └JNGFR01:output
      └data
        ├roi1
        │ └data [float32: 500 × 16 × 512]
        └roi2
          └data [float32: 500 × 16 × 512]
TO
└FXE_XAD_JF1M
  ├CORR
  │ └JNGFR01:daqOutput
  │   └data
  │     ├adc [float32: 500 × 16 × 512 × 1024]
  │     ├frameNumber [uint64: 500 × 16]
  │     ├gain [uint8: 500 × 16 × 512 × 1024]
  │     ├mask [uint32: 500 × 16 × 512 × 1024]
  │     ├memoryCell [uint8: 500 × 16]
  │     └trainId [uint64: 500]
  ├DET
  │ └JNGFR01:daqOutput -> /INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR01:daqOutput
  └ROIPROC
    └JNGFR01:output
      └data
        ├roi1
        │ └data [float32: 500 × 16 × 512]
        └roi2
          └data [float32: 500 × 16 × 512]
- INDEX
FROM
├FXE_XAD_JF1M
│ ├DET
│ │ └JNGFR01:daqOutput
│ │   └data
│ │     ├count [uint64: 500]
│ │     └first [uint64: 500]
│ └ROIPROC
│   ├JNGFR01
│   │ ├count [uint64: 500]
│   │ └first [uint64: 500]
│   └JNGFR01:output
│     └data
│       ├count [uint64: 500]
│       └first [uint64: 500]
├flag [int32: 500]
├origin [int32: 500]
├timestamp [uint64: 500]
└trainId [uint64: 500]
TO
├FXE_XAD_JF1M
│ ├CORR
│ │ └JNGFR01:daqOutput
│ │   └data
│ │     ├count [uint64: 500]
│ │     └first [uint64: 500]
│ ├DET
│ │ └JNGFR01:daqOutput -> /INDEX/FXE_XAD_JF1M/CORR/JNGFR01:daqOutput
│ └ROIPROC
│   ├JNGFR01
│   │ ├count [uint64: 500]
│   │ └first [uint64: 500]
│   └JNGFR01:output
│     └data
│       ├count [uint64: 500]
│       └first [uint64: 500]
├flag [int32: 500]
├origin [int32: 500]
├timestamp [uint64: 500]
└trainId [uint64: 500]
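The two `->` entries in the layouts above follow one pattern: under both INSTRUMENT and INDEX, the legacy DET group points at the CORR group of the same name. The sketch below only derives those link paths. `legacy_links` is a hypothetical helper for illustration and is not the `create_legacy_source` implementation from this merge request.

```python
def legacy_links(legacy_source, target_source, roots=("INSTRUMENT", "INDEX")):
    """Map each legacy group path to the absolute path it should link to."""
    return {
        f"/{root}/{legacy_source}": f"/{root}/{target_source}"
        for root in roots
    }


links = legacy_links(
    "FXE_XAD_JF1M/DET/JNGFR01:daqOutput",   # legacy source name
    "FXE_XAD_JF1M/CORR/JNGFR01:daqOutput",  # new corrected source name
)
for link, target in links.items():
    print(f"{link} -> {target}")
```

Assuming the links are HDF5 soft links, each pair could be written with h5py as `f[link] = h5py.SoftLink(target)`, so that code reading the old DET path transparently gets the CORR data.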
Types of changes
- New feature (non-breaking change which adds functionality)
Checklist:
Activity
mentioned in merge request !1028 (merged)
added 11 commits
- 0e77a9e9...1d83b9db - 9 commits from branch master
- 47d51b87 - feat[jungfrau][correct]: use new correct data source and link to old data source
- 56b9be43 - use create legacy source
added Test CAL Data Affected Testing labels
assigned to @ahmedk
- Resolved by Karim Ahmed
added 1 commit
- e44e048e - refactor: rename outp_source to instr_src_group
- Resolved by Karim Ahmed
added 1 commit
- 3325d9cc - refactor: improve and place variable in 1st nb cell
added 1 commit
- 6a215845 - fix: move back metadata creation at the end and add comments
 " if output_src != input_src:\n",
 " outp_file.create_legacy_source(input_src_kda, output_src_kda)\n",
 "\n",
+" # Create METADATA datasets at the end to pick the roi, if configured.\n",
+" outp_file.create_metadata(\n",
+" like=seq_dc,\n",
+" sequence=seq_file.sequence,\n",
+" # preserve channels order\n",
+" instrument_channels=sorted({f'{output_src_kda}/data', f'{input_src_kda}/data'})\n",
changed this line in version 11 of the diff
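The `sorted({...})` expression in the diff above builds the channel list from a set, which removes the duplicate when input and output source are the same and gives a deterministic order otherwise. A minimal sketch of that behaviour, using a hypothetical helper name and source names borrowed from the layouts in the description:

```python
def instrument_channels(output_src, input_src):
    """Return the de-duplicated, sorted '<source>/data' channel names."""
    return sorted({f"{output_src}/data", f"{input_src}/data"})


corr = "FXE_XAD_JF1M/CORR/JNGFR01:daqOutput"
det = "FXE_XAD_JF1M/DET/JNGFR01:daqOutput"

# Different sources: both channels are listed, CORR before DET.
both = instrument_channels(corr, det)
# Same source on both sides: the set collapses to a single channel.
single = instrument_channels(det, det)

print(both)
print(single)
```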
And that's not entirely an assumption - train IDs used to be spread throughout the file, and I think we saw a substantial improvement in the speed of reading them when we put them all together at the beginning.
I think GPFS works with chunks of large files. So ideally all the metadata we need is in one chunk at the beginning - but if it's one chunk at the beginning and one at the end, that's probably not too bad.
added 1 commit
- 5db19d25 - fix: remove adding instrument_channels as channels added automatically
mentioned in commit ffd7d60a
changed milestone to %3.15.0