feat[jungfrau][correct]: use new correct data source and link to old data source

Merged Karim Ahmed requested to merge feat/new_corrected_data_source_jungfrau into master
1 unresolved thread

Description

Update the Jungfrau correction to store corrected data in a new CORR output data source and link the legacy DET source to it.

Relevant to this issue: https://git.xfel.eu/calibration/planning/-/issues/170

How Has This Been Tested?

CORR-R9034-JNGFR02-S00000.h5
  + NEW: INDEX/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/count
  + NEW: INDEX/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/first
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/adc
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/frameNumber
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/gain
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/mask
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/memoryCell
  + NEW: INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR02:daqOutput/data/trainId
  - MISSING: INDEX/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/count
  - MISSING: INDEX/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/first
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/adc
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/frameNumber
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/gain
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/mask
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/memoryCell
  - MISSING: INSTRUMENT/FXE_XAD_JF1M/DET/JNGFR02:daqOutput/data/trainId
  ~ CHANGED: METADATA/dataSources/dataSourceId (Shape: (2,) -> (3,))
  ~ CHANGED: METADATA/dataSources/deviceId (Shape: (2,) -> (3,))
  ~ CHANGED: METADATA/dataSources/root (Shape: (2,) -> (3,))
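
A report like the one above can be approximated with a short script. This is only a sketch, not the tool that produced the output: `read_layout` and `diff_layouts` are hypothetical names, and the report format is copied from the listing. `read_layout` relies on h5py's `visititems`, which does not follow soft links; that would be consistent with the DET paths showing up as MISSING, but this is a guess about how the comparison was made.

```python
from typing import Dict, List, Tuple

# Maps a dataset path inside the file to its shape.
Layout = Dict[str, Tuple[int, ...]]


def read_layout(filename: str) -> Layout:
    """Collect the shape of every dataset reachable without following links."""
    import h5py  # imported lazily so diff_layouts() can be used without h5py

    layout: Layout = {}

    def visit(name, obj):
        if isinstance(obj, h5py.Dataset):
            layout[name] = obj.shape

    with h5py.File(filename, "r") as f:
        f.visititems(visit)
    return layout


def diff_layouts(old: Layout, new: Layout) -> List[str]:
    """Report datasets that are new, missing, or changed in shape."""
    lines = []
    for path in sorted(new.keys() - old.keys()):
        lines.append(f"+ NEW: {path}")
    for path in sorted(old.keys() - new.keys()):
        lines.append(f"- MISSING: {path}")
    for path in sorted(old.keys() & new.keys()):
        if old[path] != new[path]:
            lines.append(
                f"~ CHANGED: {path} (Shape: {old[path]} -> {new[path]})"
            )
    return lines
```

Calling `diff_layouts(read_layout(reference_file), read_layout(new_file))` on a reference file and a newly corrected file would then give one line per difference.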

Relevant Documents (optional)

  • INSTRUMENT

FROM

└FXE_XAD_JF1M
  ├DET
  │ └JNGFR01:daqOutput
  │   └data
  │     ├adc	[float32: 500 × 16 × 512 × 1024]
  │     ├frameNumber	[uint64: 500 × 16]
  │     ├gain	[uint8: 500 × 16 × 512 × 1024]
  │     ├mask	[uint32: 500 × 16 × 512 × 1024]
  │     ├memoryCell	[uint8: 500 × 16]
  │     └trainId	[uint64: 500]
  └ROIPROC
    └JNGFR01:output
      └data
        ├roi1
        │ └data	[float32: 500 × 16 × 512]
        └roi2
          └data	[float32: 500 × 16 × 512]

TO

└FXE_XAD_JF1M
  ├CORR
  │ └JNGFR01:daqOutput
  │   └data
  │     ├adc	[float32: 500 × 16 × 512 × 1024]
  │     ├frameNumber	[uint64: 500 × 16]
  │     ├gain	[uint8: 500 × 16 × 512 × 1024]
  │     ├mask	[uint32: 500 × 16 × 512 × 1024]
  │     ├memoryCell	[uint8: 500 × 16]
  │     └trainId	[uint64: 500]
  ├DET
  │ └JNGFR01:daqOutput	-> /INSTRUMENT/FXE_XAD_JF1M/CORR/JNGFR01:daqOutput
  └ROIPROC
    └JNGFR01:output
      └data
        ├roi1
        │ └data	[float32: 500 × 16 × 512]
        └roi2
          └data	[float32: 500 × 16 × 512]
  • INDEX

FROM

├FXE_XAD_JF1M
│ ├DET
│ │ └JNGFR01:daqOutput
│ │   └data
│ │     ├count	[uint64: 500]
│ │     └first	[uint64: 500]
│ └ROIPROC
│   ├JNGFR01
│   │ ├count	[uint64: 500]
│   │ └first	[uint64: 500]
│   └JNGFR01:output
│     └data
│       ├count	[uint64: 500]
│       └first	[uint64: 500]
├flag	[int32: 500]
├origin	[int32: 500]
├timestamp	[uint64: 500]
└trainId	[uint64: 500]

TO

├FXE_XAD_JF1M
│ ├CORR
│ │ └JNGFR01:daqOutput
│ │   └data
│ │     ├count	[uint64: 500]
│ │     └first	[uint64: 500]
│ ├DET
│ │ └JNGFR01:daqOutput	-> /INDEX/FXE_XAD_JF1M/CORR/JNGFR01:daqOutput
│ └ROIPROC
│   ├JNGFR01
│   │ ├count	[uint64: 500]
│   │ └first	[uint64: 500]
│   └JNGFR01:output
│     └data
│       ├count	[uint64: 500]
│       └first	[uint64: 500]
├flag	[int32: 500]
├origin	[int32: 500]
├timestamp	[uint64: 500]
└trainId	[uint64: 500]
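
At the HDF5 level, the `->` entries in the trees above look like soft links from the legacy DET group to the new CORR group, in both INSTRUMENT and INDEX. A minimal sketch of that layout with plain h5py follows. It assumes soft links are what the notebook's `create_legacy_source` call ends up writing, which this page does not confirm; `legacy_links` and `apply_links` are hypothetical helper names.

```python
from typing import Dict


def legacy_links(new_source: str, legacy_source: str) -> Dict[str, str]:
    """Map each legacy group path to the new group it should point at."""
    return {
        f"/{section}/{legacy_source}": f"/{section}/{new_source}"
        for section in ("INSTRUMENT", "INDEX")
    }


def apply_links(filename: str, links: Dict[str, str]) -> None:
    """Create one soft link per entry in an existing HDF5 file."""
    import h5py  # imported lazily so legacy_links() can be used without h5py

    with h5py.File(filename, "a") as f:
        for link_path, target in links.items():
            # A soft link stores only the target path, so no data is copied.
            f[link_path] = h5py.SoftLink(target)
```

For the source shown in the trees, `legacy_links("FXE_XAD_JF1M/CORR/JNGFR01:daqOutput", "FXE_XAD_JF1M/DET/JNGFR01:daqOutput")` yields the two link paths and their targets.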

Types of changes

  • New feature (non-breaking change which adds functionality)

Edited by Karim Ahmed

Activity

  • Karim Ahmed added 1 commit


  • Karim Ahmed added 1 commit

    • e44e048e - refactor: rename outp_source to instr_src_group

  • Karim Ahmed added 1 commit

    • 3325d9cc - refactor: improve and place variable in 1st nb cell

  • Karim Ahmed changed the description


  • Karim Ahmed added 1 commit


  • Karim Ahmed added 1 commit

    • 6a215845 - fix: move back metadata creation at thend and add comments

  • Diff excerpt under discussion (added lines marked with +):

          if output_src != input_src:
              outp_file.create_legacy_source(input_src_kda, output_src_kda)

    +     # Create METADATA datasets at the end to pick the roi, if configured.
    +     outp_file.create_metadata(
    +         like=seq_dc,
    +         sequence=seq_file.sequence,
    +         # preserve channels order
    +         instrument_channels=sorted({f'{output_src_kda}/data', f'{input_src_kda}/data'})

    • Since create_metadata() is now called after the sources are set up, it shouldn't be necessary to pass the source/channel names here; the file already knows about them.

    • Aha!

      Can you also remind me why we prefer creating metadata at the beginning?

    • For efficiency... we assume that GPFS will likely serve the first few bytes quicker than the rest when opening a file.

    • Karim Ahmed changed this line in version 11 of the diff


    • Thank you both!

    • And that's not entirely an assumption: train IDs used to be spread throughout the file, and I think we saw a substantial improvement in the speed of reading them when we put them all together at the beginning.

      I think GPFS works with chunks of large files. So ideally all the metadata we need is in one chunk at the beginning - but if it's one chunk at the beginning and one at the end, that's probably not too bad.

    • I suspect that HDF can jump directly to wherever METADATA is located and only attempt to read those bits? Or does it need to traverse some tree structure potentially scattered throughout the file?

    • There is a tree, but as METADATA is in the root group and that's quite small, it's hopefully just one or two layers of indirection to follow.

  • Karim Ahmed added 1 commit

    • 5db19d25 - fix: remove adding instrument_channels as channels added automatically

  • Thanks Karim, LGTM

  • Karim Ahmed changed the description


  • merged

  • Karim Ahmed mentioned in commit ffd7d60a


  • Thank you for the review!

  • Philipp Schmidt changed milestone to %3.15.0

