WIP: Throw pasha at computational bottleneck

David Hammer requested to merge feat/try-pasha-agipd-dark into master

Overview

While working on the AGIPD darks notebook (https://git.xfel.eu/gitlab/detectors/pycalibration/merge_requests/438), I found that a lot of time was being spent inside characterize_module with a single thread basically just running np.median for each memory cell. Seems ripe for parallelization.

The offending loop:

for cc in np.unique(cellIds[cellIds < mcells]):
    cellidx = cellIds == cc
    offset[...,cc] = np.median(im[..., cellidx], axis=2)
    noise[...,cc] = np.std(im[..., cellidx], axis=2)
    gains[...,cc] = np.median(ga[..., cellidx], axis=2)
    gains_std[...,cc] = np.std(ga[..., cellidx], axis=2)

This can be replaced with the following (psh refers to the pasha package):

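# pasha calls the kernel as kernel(worker_id, index, value); the value taken
# from np.arange(num_cells) is the memory cell number.
def process_cell(worker_id, array_index, cell_number):
    cell_slice_index = (cellIds == cell_number)
    # Boolean-mask indexing copies, so slice once and reuse for both statistics.
    im_slice = im[..., cell_slice_index]
    offset[..., cell_number] = np.median(im_slice, axis=2)
    noise[..., cell_number] = np.std(im_slice, axis=2)
    ga_slice = ga[..., cell_slice_index]
    gains[..., cell_number] = np.median(ga_slice, axis=2)
    gains_std[..., cell_number] = np.std(ga_slice, axis=2)

# Unlike the loop above, this visits every cell in range(num_cells).
psh.map(process_cell, np.arange(num_cells))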

This already makes things much faster (see the changes for the full diff). Testing the loop in isolation, the parallel version was around 5x faster. Because fancy indexing creates copies, each sliced array is reused for both the median and the standard deviation; I have not yet worked out an approach that copies less.
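The equivalence of the two loops can be checked in isolation with a sketch like the one below. It swaps pasha for the standard library's ThreadPoolExecutor so that it runs without extra dependencies, and it only covers the offset and noise arrays. The array shapes, the worker count, and the names sequential and parallel are assumptions for illustration, not the notebook's actual values.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Hypothetical small test data: 8 memory cells, 20 frames per cell.
rng = np.random.default_rng(0)
num_cells = 8
cell_ids = np.tile(np.arange(num_cells), 20)
im = rng.normal(size=(16, 32, cell_ids.size))  # (x, y, frame)


def sequential(im, cell_ids, num_cells):
    """Per-cell median and std, one cell after another."""
    offset = np.zeros(im.shape[:2] + (num_cells,))
    noise = np.zeros_like(offset)
    for cc in np.unique(cell_ids[cell_ids < num_cells]):
        im_slice = im[..., cell_ids == cc]
        offset[..., cc] = np.median(im_slice, axis=2)
        noise[..., cc] = np.std(im_slice, axis=2)
    return offset, noise


def parallel(im, cell_ids, num_cells, workers=4):
    """Same computation, with one task per cell on a thread pool."""
    offset = np.zeros(im.shape[:2] + (num_cells,))
    noise = np.zeros_like(offset)

    def process_cell(cc):
        # Each task writes to its own cell index, so tasks do not overlap.
        im_slice = im[..., cell_ids == cc]
        offset[..., cc] = np.median(im_slice, axis=2)
        noise[..., cc] = np.std(im_slice, axis=2)

    cells = np.unique(cell_ids[cell_ids < num_cells])
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() drains the iterator so worker exceptions are re-raised here.
        list(pool.map(process_cell, cells))
    return offset, noise


seq = sequential(im, cell_ids, num_cells)
par = parallel(im, cell_ids, num_cells)
results_match = all(np.array_equal(a, b) for a, b in zip(seq, par))

# Boolean-mask indexing returns a copy rather than a view of im.
mask_is_copy = not np.shares_memory(im, im[..., cell_ids == 0])
```

Whether threads give a speedup comparable to the one reported above depends on how much of the per-cell work releases the GIL, so this sketch is meant for checking correctness rather than for benchmarking.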

Experiment

With new "default values" in the notebook, I ran a darks job for one module:

#!/usr/bin/env bash
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M")
MYNAME=$(basename "$0")
xfel-calibrate AGIPD DARK \
    --slurm-name "$MYNAME" \
    --in-folder "/gpfs/exfel/d/raw/CALLAB/202031/p900113" \
    --out-folder "/gpfs/exfel/data/scratch/hammerd/test/agipd-fixed-gain/$MYNAME-$TIMESTAMP-data" \
    --modules 0 \
    --run-high 9985 \
    --run-med 9984 \
    --run-low 9983 \
    --karabo-id "HED_DET_AGIPD500K2G" \
    --karabo-id-control "HED_EXP_AGIPD500K2G" \
    --karabo-da-control "AGIPD500K2G00" \
    --h5path-ctrl "/CONTROL/{}/MDL/FPGA_COMP" \
    --no-db-output

I ran this with the current version in this MR ("par"), with the version from https://git.xfel.eu/gitlab/detectors/pycalibration/merge_requests/438 ("seq"), and with master of pycalibration ("master"). I compared the resulting .h5 files (const_BadPixelsDark..., const_ThresholdsDark_...) using h5diff and they did not differ except for the report paths.
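For reference, a comparison similar to the h5diff check could be sketched in Python with h5py. This is not what was run for this MR; the helper names and the "report" substring used to skip report paths are assumptions, and the real dataset names in the constant files may differ.

```python
import numpy as np


def load_datasets(path):
    """Read every dataset of an HDF5 file into a dict keyed by dataset path."""
    import h5py  # imported lazily so the comparison helper works without it

    data = {}
    with h5py.File(path, "r") as f:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                data[name] = obj[()]
        f.visititems(visit)
    return data


def differing_datasets(a, b, ignore=("report",)):
    """Return sorted names of datasets that differ between two dicts.

    Names containing any substring in `ignore` are skipped. Note that NaN
    entries compare as unequal with np.array_equal.
    """
    diffs = []
    for name in sorted(set(a) | set(b)):
        if any(token in name for token in ignore):
            continue
        if name not in a or name not in b:
            diffs.append(name)
        elif not np.array_equal(np.asarray(a[name]), np.asarray(b[name])):
            diffs.append(name)
    return diffs


# Hypothetical in-memory stand-ins for two loaded constant files.
file_a = {"data": np.arange(6).reshape(2, 3), "report": np.array(b"/path/a")}
file_b = {"data": np.arange(6).reshape(2, 3), "report": np.array(b"/path/b")}
demo_diffs = differing_datasets(file_a, file_b)
```

With real files, the two dicts would come from load_datasets called on the corresponding constant files from each output folder.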

The plot below shows the calibration job runtime as found in {out_folder}/calibration_metadata.yml:

[Plot: calibration job runtime for "par", "seq", and "master"]

Pretty sweet. Helps max-exfl189 stay warm:

[Screenshot: 2021-03-05-124619_3328x2026_scrot]

Edited by David Hammer
