Commit 37123a4a authored by Astrid Muennich

put in concurrency

parent c8a40c96
1 merge request: !15 Tutorial
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
# Tutorial Calculation # # Tutorial Calculation #
Author: Astrid Muennich, Version 0.1 Author: Astrid Muennich, Version 0.1
A small example of how to adapt a notebook to run with the offline calibration package "pycalibration". A small example of how to adapt a notebook to run with the offline calibration package "pycalibration".
The first cell contains all parameters that should be exposed to the command line. The first cell contains all parameters that should be exposed to the command line.
To run this notebook with several different input parameters in parallel by submitting multiple SLURM jobs, for example with various random seeds, we can do the following:
xfel-calibrate TUTORIAL TEST --random-seed 1,2,3,4
or
xfel-calibrate TUTORIAL TEST --random-seed 1-5
Either form will produce 4 jobs; for the first one the output is:
Parsed input 1,2,3,4 to [1, 2, 3, 4]
Submitted job: 1169340
Submitted job: 1169341
Submitted job: 1169342
Submitted job: 1169343
Submitted the following SLURM jobs: 1169340,1169341,1169342,1169343
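The parser used by `xfel-calibrate` is not part of this commit. As a rough sketch of how such a parameter string might be expanded into one value per job, the snippet below uses a hypothetical helper `parse_int_list`. It assumes that a range `a-b` excludes `b`, which is consistent with the 4 jobs reported for `1-5`, and it does not handle negative numbers.

```python
def parse_int_list(text):
    """Expand a string such as "1,2,3,4" or "1-5" into a list of ints.

    Assumption: a range "a-b" excludes b, matching the 4 jobs
    reported for "1-5". This helper is hypothetical.
    """
    values = []
    for part in text.split(","):
        part = part.strip()
        if "-" in part:
            start, stop = (int(p) for p in part.split("-"))
            values.extend(range(start, stop))  # upper bound excluded
        else:
            values.append(int(part))
    return values


for seeds in ("1,2,3,4", "1-5"):
    print("Parsed input {} to {}".format(seeds, parse_int_list(seeds)))
```

Under these assumptions both inputs expand to the same four seeds, so each would lead to four submitted jobs.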
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
out_folder = "/gpfs/exfel/data/scratch/amunnich/tutorial" # output folder out_folder = "/gpfs/exfel/data/scratch/amunnich/tutorial" # output folder
sensor_size = [10, 30] # defining the picture size sensor_size = [10, 30] # defining the picture size
random_seed = 2345 # random seed for filling of fake data array. Change it to produce different results. random_seed = [2345] # random seed for filling of fake data array. Change it to produce different results, range allowed
runs = 500 # how many iterations to fill histograms runs = 500 # how many iterations to fill histograms
cluster_profile = "tutorial" cluster_profile = "tutorial"
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
First include what we need and set up the cluster profile. Everything that has a written response in a cell will show up in the report, e.g. prints but also return values or errors. First include what we need and set up the cluster profile for parallel processing on one node utilising more than one core.
Everything that has a written response in a cell will show up in the report, e.g. prints but also return values or errors.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
import matplotlib import matplotlib
%matplotlib inline %matplotlib inline
import numpy as np import numpy as np
import matplotlib.pyplot as plt import matplotlib.pyplot as plt
# if not using slurm: make sure a cluster is running with # if not using slurm: make sure a cluster is running with
# ipcluster start --n=4 --profile=tutorial # ipcluster start --n=4 --profile=tutorial
# give it a while to start # give it a while to start
from ipyparallel import Client from ipyparallel import Client
print("Connecting to profile {}".format(cluster_profile)) print("Connecting to profile {}".format(cluster_profile))
view = Client(profile=cluster_profile)[:] view = Client(profile=cluster_profile)[:]
view.use_dill() view.use_dill()
``` ```
%% Output %% Output
Connecting to profile tutorial Connecting to profile tutorial
<AsyncResult: use_dill> <AsyncResult: use_dill>
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Create some random data ## Create some random data
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
def data_creation(random_seed): def data_creation(random_seed):
np.random.seed(random_seed) np.random.seed(random_seed)
return np.random.random((sensor_size)) return np.random.random((sensor_size))
``` ```
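Seeding only takes effect when `np.random.seed` is called as a function. A small check of that behaviour, assuming the same `sensor_size` as in the parameter cell (the helper name `seeded_image` is made up for this sketch):

```python
import numpy as np

sensor_size = [10, 30]  # same shape as in the parameter cell


def seeded_image(seed):
    # Calling np.random.seed resets the state of the global generator,
    # so the same seed always yields the same image.
    np.random.seed(seed)
    return np.random.random(tuple(sensor_size))


first = seeded_image(2345)
second = seeded_image(2345)
other = seeded_image(2355)

print("same seed, same image:", np.array_equal(first, second))
print("different seed, same image:", np.array_equal(first, other))
```

Images created with the same seed are identical, while a different seed gives different content. This is what makes one job per seed a meaningful way to vary the results.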
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
# in order to run several random seeds in parallel the parameter has to be a list. To use the current single value in this
# notebook we use the first entry in the list
random_seed_single = random_seed[0]
fake_data = [] fake_data = []
for i in range(runs): for i in range(runs):
fake_data.append(data_creation(random_seed+10*i)) fake_data.append(data_creation(random_seed_single+10*i))
``` ```
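If the notebook should also accept a plain integer, for instance when run interactively, the first-entry logic could be wrapped in a small helper. This is only a sketch; `first_seed` is a hypothetical name and not part of the package.

```python
def first_seed(random_seed):
    """Return a single seed from either a scalar or a list of seeds."""
    if isinstance(random_seed, (list, tuple)):
        return random_seed[0]
    return random_seed


runs = 5  # shortened here; the notebook uses 500
seeds = [first_seed([2345]) + 10 * i for i in range(runs)]
print(seeds)
```

Each iteration then gets its own seed, offset by 10 from the previous one, as in the loop in the cell.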
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
Plot the random image. everything we write here in the markup cells will show up as text in the report. Create some random images and plot them. Everything we write here in the markup cells will show up as text in the report.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
plt.subplot(211) plt.subplot(211)
plt.imshow(fake_data[0], interpolation="nearest") plt.imshow(fake_data[0], interpolation="nearest")
plt.title('Random Image') plt.title('Random Image')
plt.ylabel('sensor height') plt.ylabel('sensor height')
plt.subplot(212) plt.subplot(212)
plt.imshow(fake_data[5], interpolation="nearest") plt.imshow(fake_data[5], interpolation="nearest")
plt.xlabel('sensor width') plt.xlabel('sensor width')
plt.ylabel('sensor height') plt.ylabel('sensor height')
plt.subplots_adjust(bottom=0.1, right=0.8, top=0.9) plt.subplots_adjust(bottom=0.1, right=0.8, top=0.9)
cax = plt.axes([0.85, 0.1, 0.075, 0.9]) cax = plt.axes([0.85, 0.1, 0.075, 0.9])
plt.colorbar(cax=cax).ax.set_ylabel("# counts") plt.colorbar(cax=cax).ax.set_ylabel("# counts")
plt.show() plt.show()
``` ```
%% Output %% Output
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
These plots show two randomly filled sensor images. We can use markup cells also as captions for images. These plots show two randomly filled sensor images. We can use markup cells also as captions for images.
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
## Simple Analysis ## Simple Analysis
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
mean = [] mean = []
std = [] std = []
for im in fake_data: for im in fake_data:
mean.append(im.mean()) mean.append(im.mean())
std.append(im.std()) std.append(im.std())
``` ```
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
To parallelise jobs we use the ipyparallel client. To parallelise jobs we use the ipyparallel client. This will run an ipcluster on one node with the number of cores specified in xfel_calibrate/notebooks.py.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
from functools import partial from functools import partial
def parallel_stats(input): def parallel_stats(input):
return input.mean(), input.std() return input.mean(), input.std()
p = partial(parallel_stats) p = partial(parallel_stats)
results = view.map_sync(p, fake_data) results = view.map_sync(p, fake_data)
p_mean= [ x[0] for x in results ] p_mean= [ x[0] for x in results ]
p_std= [ x[1] for x in results ] p_std= [ x[1] for x in results ]
``` ```
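The map pattern used here can be tried without a running ipcluster. The sketch below swaps in `concurrent.futures.ThreadPoolExecutor` from the standard library as a local stand-in for `view.map_sync`, and plain lists with the `statistics` module instead of NumPy arrays. It is not the ipyparallel API, only an illustration of mapping one function over all images and unpacking the results.

```python
import random
import statistics
from concurrent.futures import ThreadPoolExecutor


def image_stats(image):
    # Population standard deviation, like numpy's default std()
    return statistics.mean(image), statistics.pstdev(image)


rng = random.Random(2345)
# 20 fake "images", each flattened to 300 pixel values
fake_data = [[rng.random() for _ in range(300)] for _ in range(20)]

serial = [image_stats(im) for im in fake_data]

with ThreadPoolExecutor(max_workers=4) as pool:
    # map() returns results in input order, as map_sync does
    parallel = list(pool.map(image_stats, fake_data))

p_mean = [m for m, _ in parallel]
p_std = [s for _, s in parallel]
print("serial and parallel results agree:", serial == parallel)
```

Because the results come back in input order, the serial and parallel lists can be compared element by element.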
%% Cell type:markdown id: tags: %% Cell type:markdown id: tags:
We calculate the mean value of all images, as well as the standard deviation. We calculate the mean value of all images, as well as the standard deviation.
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
plt.subplot(221) plt.subplot(221)
plt.hist(mean, 50) plt.hist(mean, 50)
plt.xlabel('mean') plt.xlabel('mean')
plt.ylabel('counts') plt.ylabel('counts')
plt.title('Mean value') plt.title('Mean value')
plt.subplot(222) plt.subplot(222)
plt.hist(p_mean, 50) plt.hist(p_mean, 50)
plt.xlabel('mean parallel') plt.xlabel('mean parallel')
plt.ylabel('counts') plt.ylabel('counts')
plt.title('Parallel Mean value') plt.title('Parallel Mean value')
plt.subplot(223) plt.subplot(223)
plt.hist(std, 50) plt.hist(std, 50)
plt.xlabel('std') plt.xlabel('std')
plt.ylabel('counts') plt.ylabel('counts')
plt.title('Std value') plt.title('Std value')
plt.subplot(224) plt.subplot(224)
plt.hist(p_std, 50) plt.hist(p_std, 50)
plt.xlabel('std parallel') plt.xlabel('std parallel')
plt.ylabel('counts') plt.ylabel('counts')
plt.title('Parallel Std value') plt.title('Parallel Std value')
plt.subplots_adjust(top=0.99, bottom=0.01, left=0.01, right=0.99, hspace=0.7, wspace=0.35) plt.subplots_adjust(top=0.99, bottom=0.01, left=0.01, right=0.99, hspace=0.7, wspace=0.35)
plt.show() plt.show()
``` ```
%% Output %% Output
%% Cell type:code id: tags: %% Cell type:code id: tags:
``` python ``` python
``` ```
......
...@@ -84,7 +84,7 @@ notebooks = { ...@@ -84,7 +84,7 @@ notebooks = {
"TUTORIAL": { "TUTORIAL": {
"TEST": { "TEST": {
"notebook": "notebooks/Tutorial/calversion.ipynb", "notebook": "notebooks/Tutorial/calversion.ipynb",
"concurrency": {"parameter": None, "concurrency": {"parameter": "random_seed",
"default concurrency": None, "default concurrency": None,
"cluster cores": 32}, "cluster cores": 32},
}, },
......
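How the `"concurrency"` entry is turned into jobs is not shown in this hunk. As an assumption-laden sketch, a launcher could split the values of the named parameter into one parameter set per job, passing each value as a one-element list so the notebook can still take the first entry. The function `expand_jobs` and the `params` dictionary are hypothetical.

```python
concurrency = {"parameter": "random_seed",
               "default concurrency": None,
               "cluster cores": 32}


def expand_jobs(params, concurrency):
    """Return one parameter dict per job (hypothetical helper)."""
    name = concurrency["parameter"]
    if name is None:
        # No concurrency parameter configured: a single job
        return [dict(params)]
    values = params[name]
    if not isinstance(values, (list, tuple)):
        values = [values]
    # Each job receives a one-element list for the split parameter
    return [dict(params, **{name: [v]}) for v in values]


params = {"random_seed": [1, 2, 3, 4], "runs": 500}
jobs = expand_jobs(params, concurrency)
for job in jobs:
    print(job)
```

With `"parameter": None`, as before this change, the same call would yield a single job containing all values.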