WIP: Throw pasha at computational bottleneck
Overview
While working on the AGIPD darks notebook (https://git.xfel.eu/gitlab/detectors/pycalibration/merge_requests/438), I found that a lot of time was being spent inside characterize_module, with a single thread essentially just running np.median for each memory cell. This seemed ripe for parallelization.
The offending loop:
for cc in np.unique(cellIds[cellIds < mcells]):
    cellidx = cellIds == cc
    offset[..., cc] = np.median(im[..., cellidx], axis=2)
    noise[..., cc] = np.std(im[..., cellidx], axis=2)
    gains[..., cc] = np.median(ga[..., cellidx], axis=2)
    gains_std[..., cc] = np.std(ga[..., cellidx], axis=2)
This can be replaced with the following (note that psh refers to pasha):
def process_cell(worker_id, array_index, cell_number):
    cell_slice_index = (cellIds == cell_number)
    im_slice = im[..., cell_slice_index]
    offset[..., cell_number] = np.median(im_slice, axis=2)
    noise[..., cell_number] = np.std(im_slice, axis=2)
    ga_slice = ga[..., cell_slice_index]
    gains[..., cell_number] = np.median(ga_slice, axis=2)
    gains_std[..., cell_number] = np.std(ga_slice, axis=2)

psh.map(process_cell, np.arange(num_cells))
This alone makes things considerably faster (see the changes for the full diff); testing it in isolation, it sped up this loop by roughly 5x. Note that the fancy indexing creates copies, so the sliced arrays are at least reused for both the median and the std; a less copy-heavy approach is left for later.
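For reference, here is a minimal, self-contained sketch of the pasha pattern used above, with synthetic data standing in for the real im array (shapes and names are made up for illustration, and it assumes pasha's alloc helper for output arrays that all workers can write to):

import numpy as np
import pasha as psh

# Synthetic stand-ins for the real data: 16x16 pixels, 1000 frames,
# cycling through 4 memory cells (shapes are illustrative only).
num_cells = 4
im = np.random.rand(16, 16, 1000)
cellIds = np.arange(1000) % num_cells

# Output arrays allocated via pasha so every worker can write to them.
offset = psh.alloc(shape=(16, 16, num_cells), dtype=np.float64)
noise = psh.alloc(shape=(16, 16, num_cells), dtype=np.float64)

def process_cell(worker_id, array_index, cell_number):
    # Each worker handles one memory cell at a time.
    cell_slice = im[..., cellIds == cell_number]
    offset[..., cell_number] = np.median(cell_slice, axis=2)
    noise[..., cell_number] = np.std(cell_slice, axis=2)

# Distribute the cells over the worker pool.
psh.map(process_cell, np.arange(num_cells))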
Experiment
With the new "default values" in the notebook, I ran a darks job for one module:
#!/usr/bin/env bash
TIMESTAMP=$(date "+%Y-%m-%d-%H-%M")
MYNAME=$(basename "$0")
xfel-calibrate AGIPD DARK \
--slurm-name "$MYNAME" \
--in-folder "/gpfs/exfel/d/raw/CALLAB/202031/p900113" \
--out-folder "/gpfs/exfel/data/scratch/hammerd/test/agipd-fixed-gain/$MYNAME-$TIMESTAMP-data" \
--modules 0 \
--run-high 9985 \
--run-med 9984 \
--run-low 9983 \
--karabo-id "HED_DET_AGIPD500K2G" \
--karabo-id-control "HED_EXP_AGIPD500K2G" \
--karabo-da-control "AGIPD500K2G00" \
--h5path-ctrl "/CONTROL/{}/MDL/FPGA_COMP" \
--no-db-output
I ran this with the current version in this MR ("par"), with the version from https://git.xfel.eu/gitlab/detectors/pycalibration/merge_requests/438 ("seq"), and with pycalibration's master branch ("master"). I compared the resulting .h5 files (const_BadPixelsDark..., const_ThresholdsDark_...) using h5diff and they did not differ except for the report paths.
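For completeness, a rough Python equivalent of that check using h5py (the actual comparison was done with h5diff on the command line; file paths below are placeholders, and the sketch assumes both files have the same layout):

import h5py
import numpy as np

def differing_datasets(path_a, path_b):
    # Return the names of datasets whose contents differ between the
    # two files. Assumes both files contain the same groups/datasets.
    mismatches = []

    with h5py.File(path_a, 'r') as fa, h5py.File(path_b, 'r') as fb:
        def visit(name, obj):
            if isinstance(obj, h5py.Dataset):
                if not np.array_equal(obj[()], fb[name][()]):
                    mismatches.append(name)

        fa.visititems(visit)

    return mismatches

# Placeholder paths standing in for the "par" and "seq" outputs.
print(differing_datasets('par-out/const_BadPixelsDark.h5',
                         'seq-out/const_BadPixelsDark.h5'))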
The plot below shows the calibration job runtime as found in {out_folder}/calibration_metadata.yml: