[JUNGFRAU][CORRECT] Use DataCollection.from_paths for reading JF CORR files for plots

added Bug Testing Waiting for review labels

assigned to @ahmedk

changed title from use DataCollection.from_paths for reading JF CORR files for plots to [JUNGFRAU][CORRECT] Use DataCollection.from_paths for reading JF CORR files for plots

LGTM

I'm curious how big the performance difference of opening all sequences instead of just the first one is, but since the code's already there LGTM!

So here the goal isn't to avoid opening multiple sequences. Rather, we avoid opening the sequence files of all N detectors that share the same folder. And there is no clear performance difference between opening the 3 files of the 3 HED detectors and opening just one.

The MR is about something else, sure, but you wouldn't need the fnmatch filter if you just open the list of all corrected files that you already got with corrected_files.
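To make the two options concrete, here is a minimal sketch of the filtering step. The file names and the `select_files` / `open_files` helpers are made up for illustration and are not taken from the notebook; only `fnmatch` and `DataCollection.from_paths` are real calls.

```python
from fnmatch import fnmatch
from pathlib import Path


def select_files(corrected_files, pattern):
    """Keep only the files whose base name matches the glob pattern."""
    return [f for f in corrected_files if fnmatch(Path(f).name, pattern)]


def open_files(paths):
    """Open an explicit list of files as one EXtra-data DataCollection."""
    # Imported lazily so the filtering above can be tried without EXtra-data.
    from extra_data import DataCollection
    return DataCollection.from_paths(paths)


# Hypothetical corrected files for three detectors sharing one folder.
corrected_files = [
    "CORR-R0052-JNGFR01-S00000.h5",
    "CORR-R0052-JNGFR01-S00001.h5",
    "CORR-R0052-JNGFR02-S00000.h5",
    "CORR-R0052-JNGFR03-S00000.h5",
]

# Option A: keep the previous behavior, first sequence of one detector only.
first_seq_jf1 = select_files(corrected_files, "*JNGFR01*S00000*")

# Option B (the suggestion above): skip the filter and pass corrected_files
# to open_files() directly.
```

With option A, `open_files(first_seq_jf1)` would open a single file; with option B, `open_files(corrected_files)` would open every corrected file in the list.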

Ah, OK, I thought you meant it for the MR changes.

So I used fnmatch to keep the same behavior as before. Previously I used include to open only the first sequence file per node. I did this for several reasons:

Each correction notebook run covers the sequences processed on one node.
Previously (before EXtra-data) the notebook was designed to plot the first 100 trains. Afterwards, I tried to plot all trains for all sequences per node.
With many trains across the N sequences per node, plotting can fail because of memory issues. This is why I switched to reading and plotting only the first sequence per node.
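As a rough sketch of "first sequence per node": assuming each node is handed a list of sequence numbers and the file names carry an `S` plus five-digit sequence suffix, the glob for that node's first sequence could be built as below. The helper name, the aggregator name and the file names are hypothetical.

```python
from fnmatch import fnmatch


def first_sequence_glob(karabo_da, sequences):
    """Build a glob matching only the lowest-numbered sequence of one node."""
    if not sequences:
        raise ValueError("no sequences assigned to this node")
    return f"*{karabo_da}*S{min(sequences):05d}*"


# Hypothetical case: this node processes sequences 4 to 6 of one module.
pattern = first_sequence_glob("JNGFR01", [4, 5, 6])

node_files = [
    "CORR-R0052-JNGFR01-S00004.h5",
    "CORR-R0052-JNGFR01-S00005.h5",
    "CORR-R0052-JNGFR01-S00006.h5",
]
# Only the first sequence is kept for plotting, which bounds memory use.
to_plot = [f for f in node_files if fnmatch(f, pattern)]
```

The same pattern string could be used either with `fnmatch` on a file list, as here, or as the `include` argument when opening a run directory.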

You make a good point that it would be nice to know the performance difference between opening all sequences and opening just one, but unfortunately I don't remember the relevant numbers.

removed Testing label

Thank you for the review!

merged

mentioned in commit 4e31209f

removed Waiting for review label

changed milestone to %3.10.0
