Speed up prediction and outlier detection.

Merged Danilo Enoque Ferreira de Lima requested to merge speedup into main

Using threading for parallel processing speeds up prediction by a factor of 100: most of the time was being spent passing data to the worker processes. Outlier detection switched from the EllipticEnvelope to an IQR-based sigma estimate, followed by a chi^2 test. Since this happens after the PCA, the data is already decorrelated.
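For reference, the threading change can be sketched roughly as follows. This is a minimal illustration, not the code in this merge request: `predict_chunk`, `predict_threaded`, the `weights` matrix and the `n_workers` default are all hypothetical names, and the prediction step is replaced by a plain matrix product. The point it shows is that threads share the parent's memory, so the chunks are not pickled and copied as they would be for worker processes.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def predict_chunk(chunk: np.ndarray, weights: np.ndarray) -> np.ndarray:
    """Hypothetical stand-in for the model's prediction on one chunk."""
    return chunk @ weights


def predict_threaded(data: np.ndarray, weights: np.ndarray,
                     n_workers: int = 4) -> np.ndarray:
    """Split `data` along axis 0 and predict each chunk in a worker thread.

    The chunks are views of `data` in shared memory, so nothing is
    serialized when they are handed to the workers.
    """
    chunks = np.array_split(data, n_workers, axis=0)
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # map() preserves input order, so results concatenate directly.
        results = list(pool.map(lambda c: predict_chunk(c, weights), chunks))
    return np.concatenate(results, axis=0)
```

Threads only help here if the per-chunk work releases the GIL, which is the case for most heavy NumPy operations; that is an assumption about the workload, not something stated in this description.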

The chi^2 test computes the test statistic and uses the number of degrees of freedom to calculate the chi^2 mean and variance, then applies a cut at chi^2 mean + sigma * sqrt(chi^2 variance). This assumes that the PCA-decorrelated data is Gaussian, which is not true, since we know it is Poisson. Nevertheless, in the limit in which we have a lot of data (i.e. the XGM intensity is above the 500 uJ cut-off), this is probably a good approximation.
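A minimal sketch of the outlier cut described above, under stated assumptions: the function names, the use of the median as the centre, and the `n_sigma` default are illustrative choices, not taken from this merge request. It relies on two standard facts: for a Gaussian the IQR is about 1.349 sigma, and a chi^2 distribution with k degrees of freedom has mean k and variance 2k.

```python
import numpy as np


def iqr_sigma(x: np.ndarray) -> np.ndarray:
    """Robust per-component sigma estimate from the interquartile range.

    For a Gaussian, IQR is approximately 1.349 * sigma.
    """
    q25, q75 = np.percentile(x, [25, 75], axis=0)
    return (q75 - q25) / 1.349


def chi2_outliers(x: np.ndarray, n_sigma: float = 5.0) -> np.ndarray:
    """Flag rows of `x` (samples x PCA components) as outliers.

    A row is flagged when its chi^2 statistic exceeds
    chi^2 mean + n_sigma * sqrt(chi^2 variance).
    """
    center = np.median(x, axis=0)  # assumption: robust centre per component
    sigma = iqr_sigma(x)
    # Components are treated as independent, which is what the PCA
    # decorrelation is meant to justify.
    chi2 = np.sum(((x - center) / sigma) ** 2, axis=1)
    ndof = x.shape[1]
    chi2_mean = ndof        # mean of chi^2 with ndof degrees of freedom
    chi2_var = 2 * ndof     # variance of chi^2 with ndof degrees of freedom
    cut = chi2_mean + n_sigma * np.sqrt(chi2_var)
    return chi2 > cut
```

Because the sigma estimate comes from quartiles, a small number of extreme samples has little effect on it, so those samples stand out clearly in the chi^2 statistic.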

Edited by Danilo Enoque Ferreira de Lima
