Skip to content

[AGIPD] Add native implementation for transposition of constants

Philipp Schmidt requested to merge feat/AGIPD-cython-transpose into master

Description

Profiling indicates that transposing constants amounts to approximately half of the runtime of AgipdCorrections.init_constants (12s-15s for all constants) after applyingthe prior sanitization bugfix. While probably not really necessary in all cases (why are they saved in a different order than they are used?), such a change could be fairly intrusive. Instead, this MR introduces a special native function to do the transposition in place in Cython rather than using np.transpose. It is parallelized with OpenMP and outperforms the numpy function even at a single core (np.transpose takes ~1.1s for a (3, 352, 512, 128) constant):

transpose_openmp_scaling

I've optimized the loop order by empirical testing and this ended up as the fastest, quite likely due to things like cache locality. Preliminary searching indicates that this could be optimized a bit further by running the second-innermost loop block wise, but especially with OpenMP that benefit will diminish significantly.

It brings the runtime of init_calibrations down to 6-8s for all corrections, reducing the total runtime for single train HED runs to < 30s. The next biggest runtime contribution is now actually finding the median in the slopes PC constant.

How Has This Been Tested?

Comparing the result with np.transpose as implemented before, passing np.testing.assert_allclose

Types of changes

  • Refactor (refactoring code with no functionality changes)

Reviewers

@ahmedk @kluyvert @hammerd

Merge request reports