Skip to content
Snippets Groups Projects

[LPD] [CORRECT] Create virtual CXI files by sequence rather than by job chunk

Merged Philipp Schmidt requested to merge feat/LPD-cxi-per-sequence into master
2 unresolved threads

Description

After discussions with users, they prefer to have a CXI file per sequence rather than for job's batch of sequences (which may be more than one). There's also been an unfortunate bug in the prior CXI MR that re-used an LPD1M object that was created from only a single train...

How Has This Been Tested?

Running interactively on p3073, r10.

Types of changes

  • Bug fix (non-breaking change which fixes an issue)
  • Breaking change (fix or feature that would cause existing functionality to not work as expected)

Reviewers

@kluyvert @dallanto

Edited by Philipp Schmidt

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
608 608 " vcxi_folder = Path(create_virtual_cxi_in)\n",
609 609 " vcxi_folder.mkdir(parents=True, exist_ok=True)\n",
610 610 " \n",
611 " if not sequences or sequences == [-1]:\n",
612 " seqs = 'all'\n",
613 " else:\n",
614 " seqs = '_'.join(str(s) for s in sequences)\n",
611 " def sort_files_by_seq(by_seq, outp_path):\n",
612 " by_seq.setdefault(int(outp_path.stem[-5:]), []).append(outp_path)\n",
613 " return by_seq\n",
615 614 " \n",
616 " det.write_virtual_cxi(vcxi_folder / f'r{run}_seqs_{seqs}.cxi')"
615 " from functools import reduce\n",
616 " reduce(sort_files_by_seq, output_paths, output_by_seq := {})\n",
  • That's the crucial step to split the job sequences into single sequence, before actually retrieving the data collection, right? ... a method that's so far not in my "active vocabulary" but seems very useful.

    Edited by Fabio Dall'Antonia
  • There are less functional ways to do the same, but it seemed like a nice opportunity to use this function. I'm not super stoked on the method of retrieving the sequence number (proper regexp would be better), but let's assume this feature is only used with "proper" EuXFEL data files for now.

  • Please register or sign in to reply
  • I've looked into possibilties to make the path optionally relative to the proposal folder. The problem right now is that notebooks have no concept of the proposal folder, in_folder and out_folder only contain it implicitly, and may not in interactive use. It would be safe to add formatting placeholders and go up from either in the calibration configuration given it always runs via the webservice (e.g. {in_folder}/../scratch/cxi), but this does not work since in_folder is resolved to an entirely different partition, making such relative paths invalid.

    One could of course match either against a known pattern, but I'm not a huge fan of these things, at least not in notebook code. The proper solution might be to introduce a generic argument proposal_folder, which is empty by default but passed by the webservice. Until then specific paths are just fine, especially in this case it's mostly an academic question.

  • Philipp Schmidt changed title from Create virtual CXI files by sequence rather than by job chunk to [LPD] [CORRECT] Create virtual CXI files by sequence rather than by job chunk

    changed title from Create virtual CXI files by sequence rather than by job chunk to [LPD] [CORRECT] Create virtual CXI files by sequence rather than by job chunk

  • Hi @schmidtp, overall LGTM ... any comments from @kluyvert ?

  • I don't find the reduce() especially easy to follow, but it's not enough of an issue to be worth changing.

    LGTM

  • added 1 commit

    • 20dab91b - Add run and proposal_folder placeholder in create_virtual_cxi_in argument

    Compare with previous version

  • Philipp Schmidt mentioned in commit fa628d20

    mentioned in commit fa628d20

  • Please register or sign in to reply
    Loading