Using joblib instead of multiprocessing
With the multiprocessing approach we are limited to about 60 bins before the binning crashes. Switching to joblib allows 257 bins or more.
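A minimal sketch of what the switch might look like, not the actual binning code from this change. The per-bin function `bin_mean`, the random input data and `n_jobs=2` are all hypothetical stand-ins; only `Parallel` and `delayed` are real joblib names.

```python
import numpy as np
from joblib import Parallel, delayed


def bin_mean(values, edges, i):
    """Hypothetical per-bin task: mean of the values that fall in bin i."""
    mask = (values >= edges[i]) & (values < edges[i + 1])
    return float(values[mask].mean()) if mask.any() else float("nan")


# Made-up input data, used only to make the sketch runnable.
rng = np.random.default_rng(0)
values = rng.random(10_000)

n_bins = 257
edges = np.linspace(0.0, 1.0, n_bins + 1)

# One delayed call per bin; joblib returns the results in submission order.
results = Parallel(n_jobs=2)(
    delayed(bin_mean)(values, edges, i) for i in range(n_bins)
)
```

Each bin becomes one task, and joblib handles dispatching the tasks to worker processes, so the number of bins is independent of the number of workers.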
Note that tqdm no longer tracks progress accurately: the bar reaches 100% before all jobs are done.