Skip to content
Snippets Groups Projects

Fix stacking buffer shape issue

Merged Egor Sobolev requested to merge fix/stacking-buffer-shape into master
3 unresolved threads

This add stacking preparation which checks data across all stacking sources and prepares accordingly stacking buffers.

This fixes #31 (closed)

The original stacking code tries to assign data to the stacking buffer. If this assignment raises the IndexError then the code reallocates buffer with proper shape. The issue related to the numpy array broadcasting. If the current train has only one frame due to the filtering, then all arrays in image.* broadcast to buffer from previous train with any shape. If the previous train has zero frames than data just throw away.

@hammerd

Edited by Egor Sobolev

Merge request reports

Loading
Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Egor Sobolev changed title from Add stacking preparation which checks data across all stacking sources and... to Fix stacking buffer shape issue

    changed title from Add stacking preparation which checks data across all stacking sources and... to Fix stacking buffer shape issue

  • Egor Sobolev changed the description

    changed the description

347 individual_shape,
348 self._source_stacking_group_sizes[(new_source, key)],
349 axis=axis,
350 ),
351 dtype=dtype,
332 def _check_stacking_data(self, sources, frame_selection_mask):
333 if frame_selection_mask is not None:
334 orig_size = len(frame_selection_mask)
335 result_size = np.sum(frame_selection_mask)
336 stacking_data_shapes = {}
337 ignore_stacking = {}
338 for source, keys in self._source_stacking_sources.items():
339 if source not in sources:
340 for key, new_source, _, _ in keys:
341 ignore_stacking[(new_source, key)] = "Some source is missed"
342 continue
  • Comment on lines +339 to +342

    We could actually use this to keep track parts of the resulting stacked array not getting filled (there was a TODO about zeroing unfilled entries).

  • Author Maintainer

    Indeed, I forgot that we have such an option. So we need instead to record the missed indices and fill buffer with zeros later.

  • Zeros or something configurable (could even add a column to the stacking configuration table for what missing entries should be in each stacking group). In my own test environment, I've at some point hotpatched to zero-initialize for testing LPD + CrystFEL because geometry expects shape for 16 modules worth of data - with only 15 modules, there would be a lot of arbitrarily initialized data coming through. But generally, yes, we should handle missing data per train.

  • Author Maintainer

    I've added filling with zeros

  • Please register or sign in to reply
  • 342 continue
    343 data_hash, timestamp = sources[source]
    344 filtering = (
    345 frame_selection_mask is not None and
    346 self._frame_selection_source_pattern.match(source)
    352 347 )
    348 for key, new_source, merge_method, axis in keys:
    349 merge_data_shape = None
    350 if key in data_hash:
    351 merge_data = data_hash[key]
    352 merge_data_shape = merge_data.shape
    353
    354 if merge_data_shape is None:
    355 ignore_stacking[(new_source, key)] = "Some data is missed"
    356 continue
    357
    • Comment on lines +349 to +357

      Can avoid checking the shape variable for exiting;

      if key in data_hash:
          merge_data = data_hash[key]
          merge_data_shape = merge_data.shape
      else:
          ignore_stacking[(new_source, key)] = "Some data is missed"
          continue
    • Author Maintainer

      Added this, thanks

    • Please register or sign in to reply
  • David Hammer
  • Nice catch that implicit broadcasting breaks the try/except solution! The stacking implementation is generally a bit verbose and not very self-contained (have been meaning to refactor), but checking up front and then stacking is a reasonable approach to fix the current issue. Have added some comments.

  • Egor Sobolev mentioned in commit 0ef3efad

    mentioned in commit 0ef3efad

  • Egor Sobolev added 3 commits

    added 3 commits

    • af4f1f23 - Add filling of places for missed sources in stacked data with zeros
    • 0ef3efad - Avoid shape check as David suggested in !72 (merged)
    • 828bbce4 - Add logging of the stacking failures

    Compare with previous version

  • 339 self._source_stacking_group_sizes[(new_source, key)],
    340 axis=axis,
    341 ),
    342 dtype=dtype,
    343 )
    344 else:
    345 self._stacking_buffers[(new_source, key)] = np.empty(
    346 shape=utils.interleaving_buffer_shape(
    347 individual_shape,
    348 self._source_stacking_group_sizes[(new_source, key)],
    349 axis=axis,
    350 ),
    351 dtype=dtype,
    332 def _check_stacking_data(self, sources, frame_selection_mask):
    333 if frame_selection_mask is not None:
    334 orig_size = len(frame_selection_mask)
  • The fix LGTM.

  • Egor Sobolev added 1 commit

    added 1 commit

    • cb7013db - Ignore filtering if mask doesn't match data size

    Compare with previous version

  • Egor Sobolev added 1 commit

    added 1 commit

    • 65f724b0 - Ignore filtering if mask doesn't match data size

    Compare with previous version

  • merged

  • David Hammer mentioned in commit c3a136fc

    mentioned in commit c3a136fc

  • Please register or sign in to reply
    Loading