Skip to content
Snippets Groups Projects

Refactor stacking for reuse and overlappability

Merged David Hammer requested to merge refactor-stacking into master

Currently, the stacking feature adds a lot of code to various parts of ShmemTrainMatcher. It would be nice if this could be cleaner, for:

  • more maintainable ShmemTrainMatcher
  • reusability: the DetectorAssembler also does stacking except in its own less good way
  • reusability: we will at some point probably split ShmemTrainMatcher and some of the resulting matching-and-processing devices will want to stack
  • improvability: the current implementation is pretty naive and though the handle_source is intended to be run in a thread pool, only one train can be processed at a time; this requires rethinking of buffer handling to change
    • bonus concern: the stacking buffers, if we start overlapping processing and decoupling this from sending, would not satisfy the safeNDArray flag in the current version

Plan for this MR:

  • Move stacking functionality to "friend" class
  • Get rid of global stacking buffers; idea: create, per train, a stacking context with fresh buffers
  • Add some testing of stacking (not just utils, but the friend class working on tables)
  • (Out of scope) Test that it works in a overlapping trains setting
  • (Out of scope) Start overlapping train processing
  • Refactor DetectorAssembler to use StackingFriend to stack before assembly
  • Change the configuration that the manager passes to DetectorAssembler correspondingly
  • Nevermind, DetectorAssembler is not a great fit actually; maybe use xarray and exploit extra-geom's partial data assembly functionality (Out of scope; see !74 (closed))
  • Test everything with actual Karabo (pending node allocation)
Edited by David Hammer

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • assigned to @hammerd

  • Worth noting: if we do go for overlapping train processing, then we could also decide that we don't need multi-threading over sources within a given train. The current implementation uses that (thread pool used to parallelize over source) to be fast enough per-train, but implementation would be simpler if we accept higher latency and only parallelize over trains in a short queue. Then the stacking could be naive np.stack and such.

    As I'm offended by latency, I'll first try to get it running with the option to parallelize both over sources within a train and over trains potentially in a queue.

  • David Hammer added 1 commit

    added 1 commit

    • 4c1d521f - Simplify merge index setting, per-train preparatory steps

    Compare with previous version

  • David Hammer added 1 commit

    added 1 commit

    Compare with previous version

  • David Hammer added 4 commits

    added 4 commits

    • cdb91936 - Similarly start compartmentalizing frame selection functionality
    • d47a9039 - Handle configuration of frame selection
    • ca46ba88 - Always run thread pool, enable configuring number of workers
    • c3e1c48f - WIP: restructure / simplify stacking execution

    Compare with previous version

  • David Hammer added 1 commit

    added 1 commit

    • ad4606b8 - Finished overhaul of source stacking, added test

    Compare with previous version

  • David Hammer added 2 commits

    added 2 commits

    • 2b5117a6 - Improve testing, fix some edge cases
    • 20e9ce2f - Fix up key stacking, improve testing

    Compare with previous version

  • David Hammer added 1 commit

    added 1 commit

    • 95bbab69 - Include thread pool in tests

    Compare with previous version

  • David Hammer marked the checklist item Move stacking functionality to "friend" class (in progress) as completed

    marked the checklist item Move stacking functionality to "friend" class (in progress) as completed

  • David Hammer marked the checklist item Get rid of global stacking buffers; idea: create, per train, a stacking context with fresh buffers as completed

    marked the checklist item Get rid of global stacking buffers; idea: create, per train, a stacking context with fresh buffers as completed

  • David Hammer changed the description

    changed the description

  • So, the overall design for a stacking friend (and a frame selection friend but that one is trivial) is now in place. I ended up moving away from the context manager (would have held the buffers prepared for a single train) to instead let the friend do set up and process sources immediately, optionally using a passed thread pool (similar to assembly in extra_geom). This means more submissions to the thread pool, but has some advantages:

    • can run the entire stacking process after frame selection, so the two don't clash (shouldn't happen on the same device in the future, but nice that it's orthogonal)
    • it allows alternative execution pattern described below

    Other interesting change: stacking within handle_source used to be called based on every source present and so we'd want to keep track of who's missing and fill them in afterwards. The version in this MR will instead call auxiliary functions for each copy operation that was supposed to happen; if a given source is not there, filling missing values happens immediately. Also, would allow splitting work if multiple keys were stacked within a source. Key stacking not quite as sophisticated, though.

  • David Hammer added 2 commits

    added 2 commits

    • 867150d8 - Reformat
    • 0fde35f1 - Update ShmemTrainMatcher's use of stacking friend

    Compare with previous version

  • David Hammer added 1 commit

    added 1 commit

    • 9d5ba45f - Make the DetectorAssembler use stacking

    Compare with previous version

  • David Hammer added 1 commit

    added 1 commit

    • 45067015 - Add processing time measurements, add exponential averaging to ToF

    Compare with previous version

  • David Hammer marked the checklist item Refactor DetectorAssembler to use StackingFriend to stack before assembly as completed

    marked the checklist item Refactor DetectorAssembler to use StackingFriend to stack before assembly as completed

  • David Hammer marked the checklist item Change the configuration that the manager passes to DetectorAssembler correspondingly as completed

    marked the checklist item Change the configuration that the manager passes to DetectorAssembler correspondingly as completed

  • David Hammer changed the description

    changed the description

  • David Hammer added 2 commits

    added 2 commits

    • 76847eeb - Small fixes from testing with Karabo
    • 2af3eb7b - Revert "Make the DetectorAssembler use stacking"

    Compare with previous version

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
Please register or sign in to reply
Loading