Showing
with 1845 additions and 639 deletions
docs/static/myMDC/run_9037_general_status.png

86.3 KiB

docs/static/tests/given_argument_example.png

40.3 KiB

docs/static/tests/manual_action.png

87.1 KiB

docs/static/webservice_job_db.png

59 KiB

docs/static/xfel_calibrate_diagrams/overview_all_services.png

48.8 KiB

docs/static/xfel_calibrate_diagrams/xfel-calibrate_cli_process.png

34.5 KiB

site_name: EuXFEL Offline calibration
theme:
  name: "material"
  features:
    - navigation.tabs
    - navigation.tabs.sticky
    - navigation.sections
    - navigation.top
    - navigation.instant
    - navigation.tracking
    - search.suggest
    - search.highlight
    - content.tabs.link
    - content.code.annotation
    - content.code.copy
    - navigation.indexes
    - content.tooltips
    - toc.follow
  language: en
  palette:
    - scheme: light
      toggle:
        icon: material/lightbulb
        name: Switch to dark mode
    - scheme: slate
      toggle:
        icon: material/lightbulb-outline
        name: Switch to light mode
markdown_extensions:
  - abbr
  - pymdownx.highlight:
      linenums_style: pymdownx-inline
      anchor_linenums: true
  - pymdownx.superfences
  - pymdownx.inlinehilite
  - pymdownx.snippets:
      auto_append:
        - docs/includes/abbreviations.md
  - pymdownx.tasklist
  - pymdownx.arithmatex:
      generic: true
  - pymdownx.tabbed:
      alternate_style: true
  - pymdownx.details
  - pymdownx.mark
  - pymdownx.emoji:
  - attr_list
  - def_list
  - footnotes
  - md_in_html
  - toc:
      permalink: "%"
      permalink: True
  - admonition
  - tables
  - codehilite
extra_css:
  - css/extra.css
  - css/custom.css
plugins:
  - glightbox
  - search
  - autorefs
  - gen-files:
      scripts:
        - docs/gen_ref_pages.py
  - literate-nav:
      nav_file: SUMMARY.md
  - section-index
  - mkdocstrings:
      handlers:
        python:
          import:
            - https://docs.python-requests.org/en/master/objects.inv
          # paths: [src/cal_tools]
          docstring_style: "sphinx"
          docstring_section_style: "list"
repo_url: https://git.xfel.eu/calibration/pycalibration
nav:
  - index.md
  - Operation:
    - CALCAT: operation/calibration_database.md
    - myMDC: operation/myMDC.md
    - Available Calibration notebooks: operation/available_notebooks.md
    - Calibration webservice:
      - The webservice: operation/webservice.md
      - Calibration Configuration: operation/calibration_configurations.md
  - Development:
    - Installation: development/installation.md
    - Workflow: development/workflow.md
    - How to write a notebook: development/how_to_write_xfel_calibrate_notebook_NBC.md
    - Configuration: development/configuration.md
    - Automated tests: development/testing_pipeline.md
  - Code Reference: reference/
  - Reference:
    - FAQ: references/faq.md
    - Changelog: references/changelog.md
copyright: |
  &copy; 2018 <a href="https://www.xfel.eu/" target="_blank" rel="noopener">European XFEL</a>
\ No newline at end of file
%% Cell type:markdown id: tags:
# DSSC Characterize Dark Images #
Author: S. Hauf, Version: 0.1
The following code analyzes a set of dark images taken with the DSSC detector to deduce detector offsets and noise. Data for the detector is recorded in a single run; the detector does not acquire multiple gain stages.
The notebook explicitly does what pyDetLib provides in its offset calculation method for streaming data.
%% Cell type:code id: tags:
``` python
cluster_profile = "noDB" # The ipcluster profile to use
in_folder = "/gpfs/exfel/exp/SQS/202131/p900210/raw" # path to input data, required
out_folder = "/gpfs/exfel/data/scratch/samartse/data/DSSC" # path to output to, required
metadata_folder = "" # Directory containing calibration_metadata.yml when run by xfel-calibrate
sequences = [0] # sequence files to evaluate.
modules = [-1] # modules to run for
run = 20 # run number in which data was recorded, required
karabo_id = "SQS_DET_DSSC1M-1" # karabo_id (detector identifier)
karabo_da = ['-1'] # a list of data aggregators names, Default [-1] for selecting all data aggregators
receiver_id = "{}CH0" # inset for receiver devices
path_template = 'RAW-R{:04d}-{}-S{:05d}.h5' # the template to use to access data
h5path = '/INSTRUMENT/{}/DET/{}:xtdf/image' # path in the HDF5 file to images
h5path_idx = '/INDEX/{}/DET/{}:xtdf/image' # path in the HDF5 file to image indices
slow_data_pattern = 'RAW-R{}-DA{}-S00000.h5'
use_dir_creation_date = True # use the dir creation date for determining the creation time
cal_db_interface = "tcp://max-exfl-cal001:8020" # the database interface to use
cal_db_timeout = 3000000 # timeout on caldb requests
local_output = True # output constants locally
db_output = False # output constants to database
mem_cells = 0 # number of memory cells used, set to 0 to automatically infer
bias_voltage = 100 # detector bias voltage
rawversion = 2 # RAW file format version
thresholds_offset_sigma = 3. # thresholds in terms of n sigma noise for offset deduced bad pixels
thresholds_offset_hard = [4, 125] # thresholds in absolute ADU terms for offset deduced bad pixels,
# minimal threshold at 4 is set at hardware level, DSSC full range 0-511
thresholds_noise_sigma = 3. # thresholds in terms of n sigma noise for noise deduced bad pixels
thresholds_noise_hard = [0.001, 3] # thresholds in absolute ADU terms for noise deduced bad pixels
offset_numpy_algorithm = "mean"
high_res_badpix_3d = False # set this to True if you need high-resolution 3d bad pixel plots. Runtime: ~ 1h
slow_data_aggregators = [1,1,1,1] # quadrant/aggregator
slow_data_path = 'SQS_NQS_DSSC/FPGA/PPT_Q'
operation_mode = '' # Detector operation mode, optional
```
%% Cell type:code id: tags:
``` python
import os
import warnings
# imports and things that do not usually need to be changed
from datetime import datetime
warnings.filterwarnings('ignore')
from collections import OrderedDict
import h5py
import matplotlib
from ipyparallel import Client
from IPython.display import Latex, Markdown, display
matplotlib.use('agg')
import matplotlib.pyplot as plt
%matplotlib inline
import numpy as np
import tabulate
import yaml
from iCalibrationDB import Conditions, Constants, Detectors, Versions
from cal_tools.dssclib import get_dssc_ctrl_data, get_pulseid_checksum
from cal_tools.enums import BadPixels
from cal_tools.plotting import (
    create_constant_overview,
    plot_badpix_3d,
    show_overview,
    show_processed_modules,
)
from cal_tools.tools import (
    get_dir_creation_date,
    get_from_db,
    get_notebook_name,
    get_pdu_from_db,
    get_random_db_interface,
    get_report,
    map_gain_stages,
    parse_runs,
    run_prop_seq_from_path,
    save_const_to_h5,
    send_to_db,
)
view = Client(profile=cluster_profile)[:]
view.use_dill()
# make sure a cluster is running with ipcluster start --n=32, give it a while to start
h5path = h5path.format(karabo_id, receiver_id)
h5path_idx = h5path_idx.format(karabo_id, receiver_id)
gain_names = ['High', 'Medium', 'Low']
if karabo_da[0] == '-1':
    if modules[0] == -1:
        modules = list(range(16))
    karabo_da = ["DSSC{:02d}".format(i) for i in modules]
else:
    modules = [int(x[-2:]) for x in karabo_da]
max_cells = mem_cells
offset_runs = OrderedDict()
offset_runs["high"] = run
creation_time = None
if use_dir_creation_date:
    creation_time = get_dir_creation_date(in_folder, run)
    print(f"Using {creation_time} as creation time of constant.")
run, prop, seq = run_prop_seq_from_path(in_folder)
dinstance = "DSSC1M1"
print(f"Detector in use is {karabo_id}")
cal_db_interface = get_random_db_interface(cal_db_interface)
```
%% Cell type:code id: tags:
``` python
print("Parameters are:")
print(f"Proposal: {prop}")
print(f"Memory cells: {mem_cells}/{max_cells}")
print("Runs: {}".format([ v for v in offset_runs.values()]))
print(f"Sequences: {sequences}")
print(f"Using DB: {db_output}")
print(f"Input: {in_folder}")
print(f"Output: {out_folder}")
print(f"Bias voltage: {bias_voltage}V")
file_loc = f'proposal:{prop} runs:{[ v for v in offset_runs.values()][0]}'
report = get_report(metadata_folder)
```
%% Cell type:markdown id: tags:
The following lines create a queue of files, which will then be processed module-parallel, distinguishing between different gains.
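
As a rough illustration of the queueing described above, the sketch below builds one FIFO queue per module name and drains it. The helper `build_module_queues` and the file names are hypothetical; in the notebook this job is done by `map_gain_stages`, whose internals are not shown here.

```python
from queue import Queue


def module_name(index: int) -> str:
    """Map a module index (0-15) to its quadrant/module name, e.g. 5 -> 'Q2M2'."""
    return f"Q{index // 4 + 1}M{index % 4 + 1}"


def build_module_queues(files_by_module: dict) -> dict:
    """Hypothetical helper: put each module's files into its own FIFO queue."""
    queues = {}
    for index, files in files_by_module.items():
        q = Queue()
        for fname in files:
            q.put(fname)
        queues[module_name(index)] = q
    return queues


# Made-up file names, for illustration only.
queues = build_module_queues({
    0: ["RAW-R0020-DSSC00-S00000.h5"],
    5: ["RAW-R0020-DSSC05-S00000.h5", "RAW-R0020-DSSC05-S00001.h5"],
})

# Drain every queue, remembering which module each file belongs to.
processed = []
for qm, q in queues.items():
    while not q.empty():
        processed.append((qm, q.get()))
```

Keeping one queue per module means each module can be handed to a separate worker without the workers competing for the same files.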
%% Cell type:code id: tags:
``` python
# set everything up filewise
os.makedirs(out_folder, exist_ok=True)
gmf = map_gain_stages(in_folder, offset_runs, path_template, karabo_da, sequences)
gain_mapped_files, total_sequences, total_file_size = gmf
print(f"Will process a total of {total_sequences} files.")
```
%% Cell type:markdown id: tags:
## Calculate Offsets, Noise and Thresholds ##
The calculation is performed per pixel and per memory cell. The offset is the mean (or the median, depending on `offset_numpy_algorithm`) of a set of dark data, and the noise is the standard deviation of the same data. Bad pixels are then flagged where offset or noise fall outside the sigma-based or absolute thresholds.
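
The calculation can be sketched on synthetic data as follows. This is a simplified illustration, not the notebook's implementation: the `Flags` enum is a stand-in for `cal_tools.enums.BadPixels` with made-up bit values, `dark_constants` is a hypothetical name, and only the sigma-based thresholds are applied.

```python
from enum import IntFlag

import numpy as np


class Flags(IntFlag):
    """Stand-in for cal_tools.enums.BadPixels; the bit values here are made up."""
    OFFSET_OUT_OF_THRESHOLD = 1
    NOISE_OUT_OF_THRESHOLD = 2
    EVAL_ERROR = 4


def dark_constants(images, cell_ids, n_cells, sigma=3.0, use_median=False):
    """images: (x, y, frames); cell_ids: (frames,). Returns offset, noise, bad pixels."""
    offset = np.zeros(images.shape[:2] + (n_cells,), dtype=np.float64)
    noise = np.zeros_like(offset)
    reduce = np.median if use_median else np.mean
    for cc in np.unique(cell_ids[cell_ids < n_cells]):
        frames = images[..., cell_ids == cc]
        offset[..., cc] = reduce(frames, axis=2)
        noise[..., cc] = np.std(frames, axis=2)

    bp = np.zeros(offset.shape, dtype=np.uint32)
    for const, flag in ((offset, Flags.OFFSET_OUT_OF_THRESHOLD),
                        (noise, Flags.NOISE_OUT_OF_THRESHOLD)):
        # Per-cell centre and spread over all pixels of the module.
        centre = np.nanmedian(const, axis=(0, 1))
        spread = np.nanstd(const, axis=(0, 1))
        outlier = (const < centre - sigma * spread) | (const > centre + sigma * spread)
        bp[outlier] |= flag.value
        bp[~np.isfinite(const)] |= Flags.EVAL_ERROR.value
    return offset, noise, bp


rng = np.random.default_rng(0)
n_cells, n_frames_per_cell = 2, 50
cell_ids = np.tile(np.arange(n_cells), n_frames_per_cell)
images = rng.normal(50.0, 1.0, size=(16, 16, cell_ids.size))
images[3, 4, :] += 40.0  # one pixel with a shifted pedestal

offset, noise, bp = dark_constants(images, cell_ids, n_cells)
```

Because the flags are combined with a bitwise OR, a pixel can carry several reasons at once, which is what the `MIXED` colour in the bad pixel plots further down relies on.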
%% Cell type:code id: tags:
``` python
import copy
from functools import partial
def characterize_module(cells, bp_thresh, rawversion, karabo_id, h5path, h5path_idx, inp):
    import copy

    import h5py
    import numpy as np
    from cal_tools.enums import BadPixels
    from cal_tools.dssclib import get_num_cells

    filename, channel = inp
    h5path = h5path.format(channel)
    h5path_idx = h5path_idx.format(channel)

    if cells == 0:
        cells = get_num_cells(filename, h5path)
        if cells is None:
            raise ValueError(f"ERROR! Empty image data file for channel {channel}")

    print(f"Using {cells} memory cells")

    pulseid_checksum = None

    thresholds_offset_hard, thresholds_offset_sigma, thresholds_noise_hard, thresholds_noise_sigma = bp_thresh

    infile = h5py.File(filename, "r")
    if rawversion == 2:
        count = np.squeeze(infile[f"{h5path_idx}/count"])
        first = np.squeeze(infile[f"{h5path_idx}/first"])
        last_index = int(first[count != 0][-1]+count[count != 0][-1])
        first_index = int(first[count != 0][0])
    else:
        status = np.squeeze(infile[f"{h5path_idx}/status"])
        if np.count_nonzero(status != 0) == 0:
            return
        last = np.squeeze(infile[f"{h5path_idx}/last"])
        first = np.squeeze(infile[f"{h5path_idx}/first"])
        last_index = int(last[status != 0][-1]) + 1
        first_index = int(first[status != 0][0])
    im = np.array(infile[f"{h5path}/data"][first_index:last_index,...])
    cellIds = np.squeeze(infile[f"{h5path}/cellId"][first_index:last_index,...])
    infile.close()

    pulseid_checksum = get_pulseid_checksum(filename, h5path, h5path_idx)

    im = im[:, 0, ...].astype(np.float32)

    im = np.rollaxis(im, 2)
    im = np.rollaxis(im, 2, 1)

    mcells = cells
    offset = np.zeros((im.shape[0], im.shape[1], mcells), dtype = np.float64)
    noise = np.zeros((im.shape[0], im.shape[1], mcells), dtype = np.float64)

    for cc in np.unique(cellIds[cellIds < mcells]):
        cellidx = cellIds == cc
        if offset_numpy_algorithm == "mean":
            offset[...,cc] = np.mean(im[..., cellidx], axis=2)
        else:
            offset[...,cc] = np.median(im[..., cellidx], axis=2)
        noise[...,cc] = np.std(im[..., cellidx], axis=2)

    # bad pixels
    bp = np.zeros(offset.shape, np.uint32)
    # offset related bad pixels
    offset_mn = np.nanmedian(offset, axis=(0,1))
    offset_std = np.nanstd(offset, axis=(0,1))

    bp[(offset < offset_mn-thresholds_offset_sigma*offset_std) |
       (offset > offset_mn+thresholds_offset_sigma*offset_std)] |= BadPixels.OFFSET_OUT_OF_THRESHOLD.value
    bp[(offset < thresholds_offset_hard[0]) | (offset > thresholds_offset_hard[1])] |= BadPixels.OFFSET_OUT_OF_THRESHOLD.value
    bp[~np.isfinite(offset)] |= BadPixels.OFFSET_NOISE_EVAL_ERROR.value

    # noise related bad pixels
    noise_mn = np.nanmedian(noise, axis=(0,1))
    noise_std = np.nanstd(noise, axis=(0,1))

    bp[(noise < noise_mn-thresholds_noise_sigma*noise_std) |
       (noise > noise_mn+thresholds_noise_sigma*noise_std)] |= BadPixels.NOISE_OUT_OF_THRESHOLD.value
    bp[(noise < thresholds_noise_hard[0]) | (noise > thresholds_noise_hard[1])] |= BadPixels.NOISE_OUT_OF_THRESHOLD.value
    bp[~np.isfinite(noise)] |= BadPixels.OFFSET_NOISE_EVAL_ERROR.value

    return offset, noise, bp, cells, pulseid_checksum


offset_g = OrderedDict()
noise_g = OrderedDict()
gain_g = OrderedDict()
badpix_g = OrderedDict()
gg = 0

start = datetime.now()
all_cells = []
checksums = {}

try:
    tGain, encodedGain, operatingFreq = get_dssc_ctrl_data(in_folder + "/r{:04d}/".format(offset_runs["high"]),
                                                           slow_data_pattern,
                                                           slow_data_aggregators,
                                                           offset_runs["high"], slow_data_path)
except IOError:
    print("ERROR: Couldn't access slow data to read tGain, encodedGain, and operatingFreq \n")

for gain, mapped_files in gain_mapped_files.items():
    inp = []
    dones = []
    for i in modules:
        qm = "Q{}M{}".format(i//4 +1, i % 4 + 1)
        if qm in mapped_files and not mapped_files[qm].empty():
            fname_in = mapped_files[qm].get()
            print("Process file: ", fname_in)
            dones.append(mapped_files[qm].empty())
        else:
            continue
        inp.append((fname_in, i))

    p = partial(characterize_module, max_cells,
                (thresholds_offset_hard, thresholds_offset_sigma,
                 thresholds_noise_hard, thresholds_noise_sigma), rawversion, karabo_id, h5path, h5path_idx)
    results = list(map(p, inp))

    for ii, r in enumerate(results):
        # Take the module index from the input tuple, so that skipped
        # modules do not shift the mapping between results and modules.
        i = inp[ii][1]
        offset, noise, bp, thiscell, pulseid_checksum = r
        all_cells.append(thiscell)
        qm = "Q{}M{}".format(i//4 +1, i % 4 + 1)

        if qm not in offset_g:
            offset_g[qm] = np.zeros((offset.shape[0], offset.shape[1], offset.shape[2]))
            noise_g[qm] = np.zeros_like(offset_g[qm])
            badpix_g[qm] = np.zeros_like(offset_g[qm], np.uint32)
            checksums[qm] = pulseid_checksum

        offset_g[qm][...] = offset
        noise_g[qm][...] = noise
        badpix_g[qm][...] = bp

    gg +=1

if len(all_cells) > 0:
    max_cells = np.max(all_cells)
    print(f"Using {max_cells} memory cells")
else:
    raise ValueError("0 processed memory cells. No raw data available.")
```
%% Cell type:code id: tags:
``` python
# TODO: add db_module when received from myMDC
# Create the modules dict of karabo_das and PDUs
qm_dict = OrderedDict()
for i, k_da in zip(modules, karabo_da):
    qm = f"Q{i//4+1}M{i%4+1}"
    qm_dict[qm] = {"karabo_da": k_da,
                   "db_module": ""}
```
%% Cell type:code id: tags:
``` python
# Retrieve existing constants for comparison
clist = ["Offset", "Noise"]
old_const = {}
old_mdata = {}
print('Retrieve pre-existing constants for comparison.')
for qm in offset_g.keys():
    old_const[qm] = {}
    old_mdata[qm] = {}
    qm_db = qm_dict[qm]
    karabo_da = qm_db["karabo_da"]
    for const in clist:
        dconst = getattr(Constants.DSSC, const)()
        condition = Conditions.Dark.DSSC(memory_cells=max_cells,
                                         bias_voltage=bias_voltage,
                                         pulseid_checksum=checksums[qm],
                                         acquisition_rate=operatingFreq[qm],
                                         target_gain=tGain[qm],
                                         encoded_gain=encodedGain[qm])

        # This should be used in case of running notebook
        # by a different method other than myMDC which already
        # sends CalCat info.
        # TODO: Set db_module to "" by default in the first cell
        if not qm_db["db_module"]:
            qm_db["db_module"] = get_pdu_from_db(karabo_id, karabo_da, dconst,
                                                 condition, cal_db_interface,
                                                 snapshot_at=creation_time)[0]
        data, mdata = get_from_db(karabo_id, karabo_da,
                                  dconst,
                                  condition,
                                  None,
                                  cal_db_interface, creation_time=creation_time,
                                  verbosity=2, timeout=cal_db_timeout)

        old_const[qm][const] = data

        if mdata is None or data is None:
            old_mdata[qm][const] = {
                "timestamp": "Not found",
                "filepath": None,
                "h5path": None
            }
        else:
            old_mdata[qm][const] = {
                "timestamp": mdata.calibration_constant_version.begin_at.isoformat(),
                "filepath": os.path.join(
                    mdata.calibration_constant_version.hdf5path,
                    mdata.calibration_constant_version.filename,
                ),
                "h5path": mdata.calibration_constant_version.h5path,
            }

    with open(f"{out_folder}/module_metadata_{qm}.yml", "w") as fd:
        yaml.safe_dump(
            {"module": qm, "pdu": qm_db["db_module"], "old-constants": old_mdata[qm]},
            fd,
        )
```
%% Cell type:code id: tags:
``` python
res = OrderedDict()
for i in modules:
    qm = f"Q{i//4+1}M{i%4+1}"
    try:
        res[qm] = {'Offset': offset_g[qm],
                   'Noise': noise_g[qm],
                   }
    except Exception as e:
        print(f"Error: No constants for {qm}: {e}")
```
%% Cell type:code id: tags:
``` python
# Push the same constant two different times.
# One with the generated pulseID check sum setting for the offline calibration.
# And another for the online calibration as it doesn't have this pulseID checksum, yet.
md = None
for dont_use_pulseIds in [True, False]:
    for qm in res.keys():
        karabo_da = qm_dict[qm]["karabo_da"]
        db_module = qm_dict[qm]["db_module"]
        for const in res[qm].keys():
            dconst = getattr(Constants.DSSC, const)()
            dconst.data = res[qm][const]

            opfreq = None if dont_use_pulseIds else operatingFreq[qm]
            targetgain = None if dont_use_pulseIds else tGain[qm]
            encodedgain = None if dont_use_pulseIds else encodedGain[qm]
            pidsum = None if dont_use_pulseIds else checksums[qm]

            # set the operating condition
            condition = Conditions.Dark.DSSC(memory_cells=max_cells,
                                             bias_voltage=bias_voltage,
                                             pulseid_checksum=pidsum,
                                             acquisition_rate=opfreq,
                                             target_gain=targetgain,
                                             encoded_gain=encodedgain)
            for parm in condition.parameters:
                if parm.name == "Memory cells":
                    parm.lower_deviation = max_cells
                    parm.upper_deviation = 0

            if db_output:
                md = send_to_db(db_module, karabo_id, dconst, condition, file_loc, report,
                                cal_db_interface, creation_time=creation_time, timeout=cal_db_timeout)

            if local_output and dont_use_pulseIds:  # Don't save the constant locally twice.
                md = save_const_to_h5(db_module, karabo_id, dconst, condition,
                                      dconst.data, file_loc, report,
                                      creation_time, out_folder)
                print(f"Calibration constant {const} is stored locally.\n")

        if not dont_use_pulseIds:
            print("Constants parameter conditions are:\n")
            print(f"• memory_cells: {max_cells}\n• bias_voltage: {bias_voltage}\n"
                  f"• pulseid_checksum: {pidsum}\n• acquisition_rate: {opfreq}\n"
                  f"• target_gain: {targetgain}\n• encoded_gain: {encodedgain}\n"
                  f"• creation_time: {creation_time}\n")
```
%% Cell type:code id: tags:
``` python
mnames = []
for i in modules:
    qm = f"Q{i//4+1}M{i % 4+1}"
    display(Markdown(f'## Position of the module {qm} and its ASICs##'))
    mnames.append(qm)
show_processed_modules(dinstance=dinstance, constants=None, mnames=mnames, mode="position")
show_processed_modules(karabo_id, constants=None, mnames=mnames, mode="position")
```
%% Cell type:markdown id: tags:
## Single-Cell Overviews ##
Single-cell overviews make it possible to identify potential effects common to all memory cells, e.g. at the sensor level. They also serve as a first sanity check on expected behaviour, e.g. whether structuring on the ASIC level is visible in the offsets while no other immediate artifacts appear.
%% Cell type:code id: tags:
``` python
cell = 9
gain = 0
out_folder = None
show_overview(res, cell, gain, out_folder=out_folder, infix="_{}".format(run))
```
%% Cell type:code id: tags:
``` python
cols = {BadPixels.NOISE_OUT_OF_THRESHOLD.value: (BadPixels.NOISE_OUT_OF_THRESHOLD.name, '#FF000080'),
        BadPixels.OFFSET_NOISE_EVAL_ERROR.value: (BadPixels.OFFSET_NOISE_EVAL_ERROR.name, '#0000FF80'),
        BadPixels.OFFSET_OUT_OF_THRESHOLD.value: (BadPixels.OFFSET_OUT_OF_THRESHOLD.name, '#00FF0080'),
        BadPixels.OFFSET_OUT_OF_THRESHOLD.value | BadPixels.NOISE_OUT_OF_THRESHOLD.value: ('MIXED', '#DD00DD80')}

if high_res_badpix_3d:
    display(Markdown("""
## Global Bad Pixel Behaviour ##
The following plots show the results of bad pixel evaluation for all evaluated memory cells.
Cells are stacked in the Z-dimension, while pixel values in x/y are rebinned with a factor of 2.
This excludes single bad pixels present only in disconnected pixels.
Hence, any bad pixels spanning at least 4 pixels in the x/y-plane, or across at least two memory cells are indicated.
Colors encode the bad pixel type, or mixed type.
"""))
    # set rebin_fac to 1 to avoid rebinning and
    # losing the real values of bad pixels (high resolution).
    gain = 0
    for mod, data in badpix_g.items():
        plot_badpix_3d(data, cols, title=mod, rebin_fac=2)
        plt.show()
```
%% Cell type:markdown id: tags:
## Aggregate values, and per Cell behaviour ##
The following tables and plots give an overview of statistical aggregates for each constant, as well as per cell behavior.
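
To make the aggregates concrete, here is a minimal sketch on synthetic arrays. It is not how `create_constant_overview` works internally (that is not shown in this diff); excluding flagged pixels from the statistics is a choice made for this example only.

```python
import numpy as np

rng = np.random.default_rng(1)
offset = rng.normal(50.0, 2.0, size=(8, 8, 4))   # (x, y, cells), synthetic
badpix = np.zeros(offset.shape, dtype=np.uint32)
badpix[0, 0, :] = 1      # one pixel flagged in every cell
badpix[1, 1, 2] = 2      # another pixel flagged in cell 2 only

# Any non-zero bit marks the pixel as bad; the mean of the 0/1 mask is the fraction.
bad_fraction = (badpix != 0).astype(float).mean(axis=(0, 1))

# Per-cell aggregates over good pixels only (an assumption of this sketch).
good = np.where(badpix == 0, offset, np.nan)
per_cell_median = np.nanmedian(good, axis=(0, 1))
per_cell_std = np.nanstd(good, axis=(0, 1))
```

Reducing over the two pixel axes leaves one value per memory cell, which is the quantity the per-cell plots are meant to show.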
%% Cell type:code id: tags:
``` python
create_constant_overview(offset_g, "Offset (ADU)", max_cells, entries=1)
```
%% Cell type:code id: tags:
``` python
create_constant_overview(noise_g, "Noise (ADU)", max_cells, 0, 100, entries=1)
```
%% Cell type:code id: tags:
``` python
bad_pixel_aggregate_g = OrderedDict()
for m, d in badpix_g.items():
    # np.bool and np.float were removed in NumPy 1.24; use the builtins instead.
    bad_pixel_aggregate_g[m] = d.astype(bool).astype(float)
create_constant_overview(bad_pixel_aggregate_g, "Bad pixel fraction", max_cells, entries=1)
```
%% Cell type:markdown id: tags:
## Summary tables ##
The following tables show summary information for the evaluated module. Values for currently evaluated constants are compared with values for pre-existing constants retrieved from the calibration database.
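
The comparison can be pictured as one row per statistic, with the old value replaced by a dash when no pre-existing constant was found. The sketch below uses plain NumPy and made-up arrays; `summary_rows` is a hypothetical helper, and the notebook itself renders its table with `tabulate`.

```python
import numpy as np

STATS = {"Median": np.nanmedian, "Mean": np.nanmean, "Std": np.nanstd,
         "Min": np.nanmin, "Max": np.nanmax}


def summary_rows(new, old=None, cell=0):
    """Build [name, new value, old value] rows for one memory cell."""
    rows = []
    for name, func in STATS.items():
        new_val = f"{func(new[..., cell]):6.1f}"
        old_val = f"{func(old[..., cell]):6.1f}" if old is not None else "-"
        rows.append([name, new_val, old_val])
    return rows


# Made-up constants with shape (x, y, cells).
new = np.full((4, 4, 2), 50.0)
new[0, 0, 0] = 58.0
old = np.full((4, 4, 2), 49.0)

rows = summary_rows(new, old)
rows_without_old = summary_rows(new)
```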
%% Cell type:code id: tags:
``` python
time_summary = []
for qm, qm_data in old_mdata.items():
    time_summary.append(f"The following pre-existing constants are used for comparison for module {qm}:")
    for const, const_data in qm_data.items():
        time_summary.append(f"- {const} created at {const_data['timestamp']}")
display(Markdown("\n".join(time_summary)))
```
%% Cell type:code id: tags:
``` python
header = ['Parameter',
          "New constant", "Old constant ",
          "New constant", "Old constant ",
          "New constant", "Old constant "]

for const in ['Offset', 'Noise']:
    table = [['','High gain', 'High gain']]
    for qm in res.keys():
        data = np.copy(res[qm][const])

        if old_const[qm][const] is not None:
            dataold = np.copy(old_const[qm][const])

        f_list = [np.nanmedian, np.nanmean, np.nanstd, np.nanmin, np.nanmax]
        n_list = ['Median', 'Mean', 'Std', 'Min', 'Max']

        for i, f in enumerate(f_list):
            line = [n_list[i]]
            line.append('{:6.1f}'.format(f(data[...,gain])))
            if old_const[qm][const] is not None:
                line.append('{:6.1f}'.format(f(dataold[...,gain])))
            else:
                line.append('-')

            table.append(line)

    display(Markdown('### {} [ADU], good and bad pixels ###'.format(const)))
    md = display(Latex(tabulate.tabulate(table, tablefmt='latex', headers=header)))
```
%% Cell type:markdown id: tags:
# Jungfrau Dark Image Characterization #
Author: European XFEL Detector Group, Version: 2.0
Analyzes Jungfrau dark image data to deduce offset, noise, and the resulting bad pixel maps.
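
As a hedged sketch of the per-cell step this notebook performs further down, the example below selects the frames of one storage cell, drops frames that are entirely zero, and takes the mean and standard deviation over the remaining frames. The function name `cell_dark` and the tiny arrays are illustrative assumptions, not part of `cal_tools`.

```python
import numpy as np


def cell_dark(images, cell_table, cell_number):
    """images: (x, y, frames); cell_table: (frames,). Returns (offset, noise) or None."""
    selected = cell_table == cell_number
    if selected.sum() == 0:
        return None  # no dark data for this cell
    frames = images[..., selected]
    # Keep only frames that contain at least one non-zero pixel.
    nonzero = np.any(frames != 0, axis=(0, 1))
    frames = frames[..., nonzero]
    offset = np.mean(frames, axis=2, dtype=np.float32)
    noise = np.std(frames, axis=2, dtype=np.float32)
    return offset, noise


images = np.full((4, 4, 6), 100, dtype=np.uint16)
images[..., 1::2] = 104   # alternate between two ADC levels
images[..., 5] = 0        # frame 5 is an empty image
cell_table = np.zeros(6, dtype=np.uint8)

offset, noise = cell_dark(images, cell_table, 0)
```

Without the empty-frame filter, the single all-zero frame would pull the offset down and inflate the noise for every pixel.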
%% Cell type:code id: tags:
``` python
in_folder = '/gpfs/exfel/exp/SPB/202130/p900204/raw/' # folder under which runs are located, required
out_folder = '/gpfs/exfel/data/scratch/ahmedk/test/remove' # path to place reports at, required
metadata_folder = '' # Directory containing calibration_metadata.yml when run by xfel-calibrate
run_high = 141 # run number for G0 dark run, required
run_med = 142 # run number for G1 dark run, required
run_low = 143 # run number for G2 dark run, required
# Parameters used to access raw data.
karabo_da = ['JNGFR01', 'JNGFR02','JNGFR03','JNGFR04', 'JNGFR05', 'JNGFR06','JNGFR07','JNGFR08'] # list of data aggregators, which corresponds to different JF modules
karabo_id = 'SPB_IRDA_JF4M' # karabo_id (detector identifier) prefix of Jungfrau detector to process.
karabo_id_control = '' # if control is on a different ID, set to empty string if it is the same as karabo_id
receiver_template = 'JNGFR{:02}' # inset for receiver devices
instrument_source_template = '{}/DET/{}:daqOutput' # template for instrument source name (filled with karabo_id & receiver_id). e.g. 'SPB_IRDA_JF4M/DET/JNGFR01:daqOutput'
ctrl_source_template = '{}/DET/CONTROL' # template for control source name (filled with karabo_id_control)
# Parameters for calibration database and storing constants.
use_dir_creation_date = True # use dir creation date
cal_db_interface = 'tcp://max-exfl-cal001:8016#8045' # calibrate db interface to connect to
cal_db_timeout = 300000 # timeout on caldb requests
local_output = True # output constants locally
db_output = False # output constants to database
# Parameters affecting creating dark calibration constants.
badpixel_threshold_sigma = 5. # bad pixels defined by values outside n times this std from median
offset_abs_threshold_low = [1000, 10000, 10000] # absolute bad pixel threshold in terms of offset, lower values
offset_abs_threshold_high = [8000, 15000, 15000] # absolute bad pixel threshold in terms of offset, upper values
max_trains = 1000 # Maximum trains to process darks. Set to 0 to process all available train images. 1000 trains is enough resolution to create the dark constants
min_trains = 100 # Minimum number of trains to process dark constants. Raise a warning if the run has fewer trains.
manual_slow_data = False # if true, use manually entered bias_voltage and integration_time values
time_limits = 0.025 # to find calibration constants later on, the integration time is allowed to vary by 0.5 us
# Parameters to be used for injecting dark calibration constants.
integration_time = -1 # Integration time in us. Set to -1 to overwrite by value in file.
gain_setting = -1 # 0 for dynamic, forceswitchg1, forceswitchg2, 1 for dynamichg0, fixgain1, fixgain2. Set to -1 to overwrite by value in file.
gain_mode = -1 # 1 if medium and low runs are fixgain1 and fixgain2, otherwise 0. Set to -1 to overwrite by value in file.
bias_voltage = -1 # sensor bias voltage in V, will be overwritten by value in file
memory_cells = -1 # Number of memory cells.
# Parameters used for plotting
detailed_report = False
# TODO: this is used for only Warning check at AGIPD dark.
# Need to rethink if it makes sense to use it here as well.
operation_mode = 'ADAPTIVE_GAIN' # Detector operation mode, optional
```
%% Cell type:code id: tags:
``` python
import os
import warnings
from logging import warning
warnings.filterwarnings('ignore')
import matplotlib
import matplotlib.pyplot as plt
import multiprocessing
import numpy as np
import pasha as psh
import yaml
from IPython.display import Markdown, display
from extra_data import RunDirectory
matplotlib.use('agg')
%matplotlib inline
from XFELDetAna.plotting.heatmap import heatmapPlot
from XFELDetAna.plotting.histogram import histPlot
from cal_tools import step_timing
from cal_tools.jungfrau import jungfraulib
from cal_tools.enums import BadPixels, JungfrauGainMode
from cal_tools.tools import (
    get_dir_creation_date,
    get_pdu_from_db,
    get_random_db_interface,
    get_report,
    save_const_to_h5,
    send_to_db,
)
from iCalibrationDB import Conditions, Constants
```
%% Cell type:code id: tags:
``` python
# Constants relevant for the analysis
run_nums = [run_high, run_med, run_low] # run number for G0/HG0, G1, G2
sensor_size = (1024, 512)
gains = [0, 1, 2]
fixed_settings = [
    JungfrauGainMode.FIX_GAIN_1.value, JungfrauGainMode.FIX_GAIN_2.value]
dynamic_settings = [
    JungfrauGainMode.FORCE_SWITCH_HG1.value, JungfrauGainMode.FORCE_SWITCH_HG2.value]
old_fixed_settings = ["fixgain1", "fixgain2"]

creation_time = None
if use_dir_creation_date:
    creation_time = get_dir_creation_date(in_folder, run_high)
    print(f"Using {creation_time} as creation time")
os.makedirs(out_folder, exist_ok=True)
cal_db_interface = get_random_db_interface(cal_db_interface)
print(f'Calibration database interface: {cal_db_interface}')

if karabo_id_control == "":
    karabo_id_control = karabo_id
```
%% Cell type:code id: tags:
``` python
proposal = list(filter(None, in_folder.strip('/').split('/')))[-2]
file_loc = f"proposal:{proposal} runs:{run_high} {run_med} {run_low}"
report = get_report(metadata_folder)
step_timer = step_timing.StepTimer()
```
%% Cell type:markdown id: tags:
## Reading control data
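
The cell below follows one recurring pattern: a parameter left at a negative value is read from the run's control data, while a non-negative value is treated as a manual override. A minimal sketch of that pattern, with a hypothetical `resolve_parameter` helper and made-up control values standing in for `jungfraulib.JungfrauCtrl`:

```python
def resolve_parameter(name, manual_value, read_from_file, unit=""):
    """Return manual_value unless it is negative, in which case call read_from_file()."""
    if manual_value < 0:
        value = read_from_file()
        print(f"{name} is {value} {unit}".rstrip() + ".")
    else:
        value = manual_value
        print(f"{name} is manually set to {value} {unit}".rstrip() + ".")
    return value


# Stand-in for values stored in the run's control source (made-up numbers).
fake_ctrl = {"integration_time": 10.0, "bias_voltage": 180}

integration_time = resolve_parameter(
    "Integration time", -1, lambda: fake_ctrl["integration_time"], "us")
bias_voltage = resolve_parameter(
    "Bias voltage", 90, lambda: fake_ctrl["bias_voltage"], "V")
```

Passing the reader as a callable means the control data is only touched when the value is actually needed.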
%% Cell type:code id: tags:
``` python
step_timer.start()
gain_runs = dict()
med_low_settings = []
ctrl_src = ctrl_source_template.format(karabo_id_control)
run_nums = jungfraulib.sort_runs_by_gain(
    raw_folder=in_folder,
    runs=run_nums,
    ctrl_src=ctrl_src,
)
_gain_mode = None
for gain, run_n in enumerate(run_nums):
    run_dc = RunDirectory(f"{in_folder}/r{run_n:04d}/")
    gain_runs[run_n] = [gain, run_dc]
    ctrl_data = jungfraulib.JungfrauCtrl(run_dc, ctrl_src)
    # Read control data for the high gain run only.
    if gain == 0:
        run_mcells, sc_start = ctrl_data.get_memory_cells()

        if integration_time < 0:
            integration_time = ctrl_data.get_integration_time()
            print(f"Integration time is {integration_time} us.")
        else:
            print(f"Integration time is manually set to {integration_time} us.")

        if bias_voltage < 0:
            bias_voltage = ctrl_data.get_bias_voltage()
            print(f"Bias voltage is {bias_voltage} V.")
        else:
            print(f"Bias voltage is manually set to {bias_voltage} V.")

        if gain_setting < 0:
            gain_setting = ctrl_data.get_gain_setting()
            print(f"Gain setting is {gain_setting} ({ctrl_data.run_settings})")
        else:
            print(f"Gain setting is manually set to {gain_setting}.")

        if run_mcells == 1:
            memory_cells = 1
            print('Dark runs in single cell mode, '
                  f'storage cell start: {sc_start:02d}')
        else:
            memory_cells = 16
            print('Dark runs in burst mode, '
                  f'storage cell start: {sc_start:02d}')
    else:  # medium and low gain
        _gain_mode = ctrl_data.get_gain_mode()
        med_low_settings.append(ctrl_data.run_mode)

# TODO: consider updating this cell into something similar to agipdlib.AgipdCtrlsRuns()
if gain_mode < 0:
    gain_mode = _gain_mode
    print(f"Gain mode is {gain_mode} ({med_low_settings})")
else:
    print(f"Gain mode is manually set to {gain_mode}.")
step_timer.done_step(f'Reading control data.')
```
%% Cell type:code id: tags:
``` python
step_timer.start()
# set the operating condition
condition = Conditions.Dark.jungfrau(
    memory_cells=memory_cells,
    bias_voltage=bias_voltage,
    integration_time=integration_time,
    gain_setting=gain_setting,
    gain_mode=gain_mode,
)
db_modules = get_pdu_from_db(
    karabo_id=karabo_id,
    karabo_da=karabo_da,
    constant=Constants.jungfrau.Offset(),
    condition=condition,
    cal_db_interface=cal_db_interface,
    snapshot_at=creation_time)
step_timer.done_step('Set conditions and get PDU names from CalCat.')
```
%% Cell type:code id: tags:
``` python
# Start retrieving existing constants for comparison
step_timer.start()
mod_x_const = [(mod, const) for const in ["Offset", "Noise", "BadPixelsDark"] for mod in karabo_da]
from cal_tools.tools import get_from_db
from datetime import timedelta
def retrieve_old_constant(mod, const):
    dconst = getattr(Constants.jungfrau, const)()

    data, mdata = get_from_db(
        karabo_id=karabo_id,
        karabo_da=mod,
        constant=dconst,
        condition=condition,
        empty_constant=None,
        cal_db_interface=cal_db_interface,
        creation_time=creation_time-timedelta(seconds=60) if creation_time else None,
        strategy="pdu_prior_in_time",
        verbosity=1,
        timeout=cal_db_timeout
    )
    if mdata is None or data is None:
        timestamp = "Not found"
        filepath = None
        h5path = None
    else:
        timestamp = mdata.calibration_constant_version.begin_at.isoformat()
        filepath = os.path.join(
            mdata.calibration_constant_version.hdf5path,
            mdata.calibration_constant_version.filename
        )
        h5path = mdata.calibration_constant_version.h5path
    return data, timestamp, filepath, h5path


old_retrieval_pool = multiprocessing.Pool()
old_retrieval_res = old_retrieval_pool.starmap_async(
    retrieve_old_constant, mod_x_const
)
old_retrieval_pool.close()
step_timer.done_step('Retrieved old dark constants for comparison.')
```
%% Cell type:code id: tags:
``` python
# Use only high gain threshold for all gains in case of fixed_gain.
if gain_mode:  # fixed_gain
    offset_abs_threshold = [[offset_abs_threshold_low[0]]*3, [offset_abs_threshold_high[0]]*3]
else:
    offset_abs_threshold = [offset_abs_threshold_low, offset_abs_threshold_high]
```
%% Cell type:code id: tags:
``` python
context = psh.context.ThreadContext(num_workers=memory_cells)
```
%% Cell type:code id: tags:
``` python
"""
All Jungfrau runs are taken in a single acquisition, except for the forceswitch runs.
For non-fixed dark runs, a procedure of multiple acquisitions is used to switch the storage cell indices.
This is done only for the medium and low gain dynamic dark runs [forceswitchg1, forceswitchg2]:
Switching the cell indices in burst mode is a workaround for a hardware procedure
deficiency that produces wrong data for dark runs, except for the first storage cell.
This is why multiple acquisitions are taken to switch the used storage cells and
acquire data through two cells for each of the 16 cells, instead of acquiring darks through all 16 cells.
"""
print(f"Maximum trains to process is set to {max_trains}")
noise_map = dict()
offset_map = dict()
bad_pixels_map = dict()
for mod in karabo_da:
step_timer.start()
instrument_src = instrument_source_template.format(
karabo_id, receiver_template.format(int(mod[-2:])))
print(f"\n- Instrument data path for {mod} is {instrument_src}.")
# (1024, 512, 1 or 16, 3)
offset_map[mod] = context.alloc(
shape=(sensor_size+(memory_cells, 3)), fill=0, dtype=np.float32)
noise_map[mod] = context.alloc(like=offset_map[mod], fill=0)
bad_pixels_map[mod] = context.alloc(shape=offset_map[mod].shape, dtype=np.uint32, fill=0)
for run_n, [gain, run_dc] in gain_runs.items():
def process_cell(worker_id, array_index, cell_number):
cell_slice_idx = acelltable == cell_number
if cell_slice_idx.sum() == 0:
# This cell is not in the data (or it is deliberately excluded)
bad_pixels_map[mod][..., cell_number, gain] = BadPixels.NO_DARK_DATA.value
offset_map[mod][..., cell_number, gain] = np.nan
noise_map[mod][..., cell_number, gain] = np.nan
return
thiscell = images[..., cell_slice_idx] # [1024, 512, n_trains]
# Identify trains whose image for this cell contains only zero-valued pixels.
# TODO: An investigation is ongoing by DET to identify reason for these empty images.
nonzero_adc = np.any(thiscell != 0, axis=(0, 1)) # [n_trains]
# Exclude empty (all-zero) images before calculating offset and noise.
thiscell = thiscell[..., nonzero_adc]
offset_map[mod][..., cell_number, gain] = np.mean( # [1024, 512]
thiscell, axis=2, dtype=np.float32)
noise_map[mod][..., cell_number, gain] = np.std( # [1024, 512]
thiscell, axis=2, dtype=np.float32)
del thiscell
# Check for wrong gain values.
# 1. Exclude empty images.
# 2. Indicate pixels with wrong gain value for any train for each cell.
# TODO: mean is used to use thresholds for accepting gain values, even if not 0 mean value.
gain_avg = np.mean( # [1024, 512]
gain_vals[..., cell_slice_idx][..., nonzero_adc],
axis=2, dtype=np.float32
)
# Assign WRONG_GAIN_VALUE for a pixel in a badpixel map for all gains.
bad_pixels_map[mod][:, :, cell_number][gain_avg != raw_g] |= BadPixels.WRONG_GAIN_VALUE.value
print(f"Gain stage {gain}, run {run_n}")
# load shape of data for memory cells, and detector size (imgs, cells, x, y)
n_trains = run_dc[instrument_src, "data.adc"].shape[0]
# Total number of trains in the run, including trains with empty data.
all_trains = len(run_dc.train_ids)
instr_dc = run_dc.select(instrument_src, require_all=True)
empty_trains = all_trains - n_trains
if empty_trains != 0:
print(f"{mod} has {empty_trains} empty trains out of {all_trains} trains")
if max_trains > 0:
n_trains = min(n_trains, max_trains)
print(f"Processing {n_trains} images.")
if n_trains == 0:
raise ValueError(f"{run_n} has no trains to process.")
if n_trains < min_trains:
warning(f"Fewer than {min_trains} trains are available in RAW data.")
# Select only requested number of images to process darks.
instr_dc = instr_dc.select_trains(np.s_[:n_trains])
images = np.transpose(
instr_dc[instrument_src, "data.adc"].ndarray(), (3, 2, 1, 0))
acelltable = np.transpose(instr_dc[instrument_src, "data.memoryCell"].ndarray())
gain_vals = np.transpose(
instr_dc[instrument_src, "data.gain"].ndarray(), (3, 2, 1, 0))
# define gain value as saved in raw gain map
raw_g = 3 if gain == 2 else gain
if memory_cells == 1:
acelltable -= sc_start
# Only for dynamic medium and low gain runs [forceswitchg1, forceswitchg2] in burst mode.
if (
gain_mode == 0 and  # dynamic gain mode
gain > 0 and  # Medium and low runs
memory_cells == 16 and  # Burst mode
acelltable.shape[0] == 2  # forceswitchg1 and forceswitchg2 acquired with the MDL device.
):
# Mark the unused cells with 255, the value the receiver
# also uses to indicate a cell without an image.
# Image shape for forceswitchg1 and forceswitchg2 is (1024, 512, 2, trains),
# compared to the expected shape of (1024, 512, 16, trains) for a high gain run.
acelltable[1:] = 255
# Calculate offset and noise maps
context.map(process_cell, range(memory_cells))
cells_missing = (bad_pixels_map[mod][0, 0, :, gain] & BadPixels.NO_DARK_DATA) > 0
if np.any(cells_missing):
print(f"No dark data in gain stage {gain} found for cells", np.nonzero(cells_missing)[0])
del images
del acelltable
del gain_vals
step_timer.done_step('Creating Offset and noise constants for a module.')
```
%% Cell type:code id: tags:
``` python
if detailed_report:
display(Markdown("## Offset and Noise Maps:"))
display(Markdown(
"Below, offset and noise maps for the high ($g_0$) gain stage are shown, "
"alongside the distribution of these values. One expects block-like "
"structures mapping to the ASICs of the detector."))
g_name = ['G0', 'G1', 'G2']
g_range = [(0, 8000), (8000, 16000), (8000, 16000)]
n_range = [(0., 50.), (0., 50.), (0., 50.)]
unit = '[ADCu]'
# TODO: Fix plots arrangement and speed for Jungfrau burst mode.
step_timer.start()
for pdu, mod in zip(db_modules, karabo_da):
for g_idx in gains:
for cell in range(0, memory_cells):
f_o0 = heatmapPlot(
np.swapaxes(offset_map[mod][..., cell, g_idx], 0, 1),
y_label="Row",
x_label="Column",
lut_label=unit,
aspect=1.,
vmin=g_range[g_idx][0],
vmax=g_range[g_idx][1],
title=f'Pedestal {g_name[g_idx]} - Cell {cell:02d} - Module {mod} ({pdu})')
fo0, ax_o0 = plt.subplots()
res_o0 = histPlot(
ax_o0, offset_map[mod][..., cell, g_idx],
bins=800,
range=g_range[g_idx],
facecolor='b',
histotype='stepfilled',
)
ax_o0.tick_params(axis='both', which='major', labelsize=15)
ax_o0.set_title(
f'Module pedestal distribution - Cell {cell:02d} - Module {mod} ({pdu})',
fontsize=15)
ax_o0.set_xlabel(f'Pedestal {g_name[g_idx]} {unit}', fontsize=15)
ax_o0.set_yscale('log')
f_n0 = heatmapPlot(
np.swapaxes(noise_map[mod][..., cell, g_idx], 0, 1),
y_label="Row",
x_label="Column",
lut_label=unit,
aspect=1.,
vmin=n_range[g_idx][0],
vmax=n_range[g_idx][1],
title=f"RMS noise {g_name[g_idx]} - Cell {cell:02d} - Module {mod} ({pdu})",
)
fn0, ax_n0 = plt.subplots()
res_n0 = histPlot(
ax_n0,
noise_map[mod][..., cell, g_idx],
bins=100,
range=n_range[g_idx],
facecolor='b',
histotype='stepfilled',
)
ax_n0.tick_params(axis='both', which='major', labelsize=15)
ax_n0.set_title(
f'Module noise distribution - Cell {cell:02d} - Module {mod} ({pdu})',
fontsize=15)
ax_n0.set_xlabel(
f'RMS noise {g_name[g_idx]} ' + unit, fontsize=15)
plt.show()
step_timer.done_step('Plotting offset and noise maps.')
```
%% Cell type:markdown id: tags:
## Bad Pixel Map
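The bad pixel map is derived by comparing the offset and noise of each pixel ($v_i$) against the median over all pixels ($v_k$) for the same memory cell and gain stage ($g$). A pixel is flagged if it deviates from that median by more than $n$ standard deviations, where $n$ is set by `badpixel_threshold_sigma`: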
$$
v_i > \mathrm{median}(v_{k,g}) + n \sigma_{v_{k,g}}
$$
or
$$
v_i < \mathrm{median}(v_{k,g}) - n \sigma_{v_{k,g}}
$$
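The criterion above can be sketched in NumPy as follows. This is a minimal illustration on synthetic data, not the notebook's own code path: `flag_outliers`, the array sizes, and the injected outlier are assumptions made for the example. The median and standard deviation are taken over the two pixel axes, separately for each memory cell and gain stage.

```python
import numpy as np


def flag_outliers(values, n_sigma):
    """Flag pixels further than n_sigma standard deviations from the median.

    `values` is assumed to have shape (x, y, cells, gains). The statistics
    are computed over the pixel axes (0, 1) for every (cell, gain) pair,
    ignoring NaN entries.
    """
    median = np.nanmedian(values, axis=(0, 1), keepdims=True)
    sigma = np.nanstd(values, axis=(0, 1), keepdims=True)
    too_high = values > median + n_sigma * sigma
    too_low = values < median - n_sigma * sigma
    return too_high | too_low


# Hypothetical toy data: 32 x 16 pixels, 1 memory cell, 3 gain stages.
rng = np.random.default_rng(0)
offsets = rng.normal(loc=1000.0, scale=5.0, size=(32, 16, 1, 3))
offsets[3, 7, 0, 1] = 5000.0  # inject one obvious outlier in gain stage 1

mask = flag_outliers(offsets, n_sigma=5)
print("Outlier flagged:", bool(mask[3, 7, 0, 1]))
print("Flagged pixels in gain stage 1:", int(mask[..., 1].sum()))
```

Note that a single extreme pixel inflates the standard deviation for its cell and gain stage, so milder outliers in the same group may escape this relative threshold.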
Values are encoded in a 32-bit mask. For bad pixels deduced from dark images, the following non-zero entries are relevant:
%% Cell type:code id: tags:
``` python
def print_bp_entry(bp):
print("{:<30s} {:032b} -> {}".format(bp.name, bp.value, int(bp.value)))
print_bp_entry(BadPixels.OFFSET_OUT_OF_THRESHOLD)
print_bp_entry(BadPixels.NOISE_OUT_OF_THRESHOLD)
print_bp_entry(BadPixels.OFFSET_NOISE_EVAL_ERROR)
print_bp_entry(BadPixels.NO_DARK_DATA)
print_bp_entry(BadPixels.WRONG_GAIN_VALUE)
def eval_bpidx(d):
mdn = np.nanmedian(d, axis=(0, 1))[None, None, :, :]
std = np.nanstd(d, axis=(0, 1))[None, None, :, :]
idx = (d > badpixel_threshold_sigma*std+mdn) | (d < (-badpixel_threshold_sigma)*std+mdn)
return idx
```
%% Cell type:code id: tags:
``` python
step_timer.start()
for pdu, mod in zip(db_modules, karabo_da):
display(Markdown(f"### Badpixels for module {mod} ({pdu}):"))
offset_abs_threshold = np.array(offset_abs_threshold)
bad_pixels_map[mod][eval_bpidx(offset_map[mod])] |= BadPixels.OFFSET_OUT_OF_THRESHOLD.value
bad_pixels_map[mod][~np.isfinite(offset_map[mod])] |= BadPixels.OFFSET_NOISE_EVAL_ERROR.value
bad_pixels_map[mod][eval_bpidx(noise_map[mod])] |= BadPixels.NOISE_OUT_OF_THRESHOLD.value
bad_pixels_map[mod][~np.isfinite(noise_map[mod])] |= BadPixels.OFFSET_NOISE_EVAL_ERROR.value
bad_pixels_map[mod][(offset_map[mod] < offset_abs_threshold[0][None, None, None, :]) | (offset_map[mod] > offset_abs_threshold[1][None, None, None, :])] |= BadPixels.OFFSET_OUT_OF_THRESHOLD.value # noqa
if detailed_report:
for g_idx in gains:
for cell in range(memory_cells):
bad_pixels = bad_pixels_map[mod][:, :, cell, g_idx]
fn_0 = heatmapPlot(
np.swapaxes(bad_pixels, 0, 1),
y_label="Row",
x_label="Column",
lut_label=f"Badpixels {g_name[g_idx]} [ADCu]",
aspect=1.,
vmin=0, vmax=5,
title=f'G{g_idx} Bad pixel map - Cell {cell:02d} - Module {mod} ({pdu})')
step_timer.done_step('Creating bad pixels constant')
```
%% Cell type:markdown id: tags:
## Inject and save calibration constants
%% Cell type:code id: tags:
``` python
step_timer.start()
for mod, db_mod in zip(karabo_da, db_modules):
constants = {
'Offset': np.moveaxis(offset_map[mod], 0, 1),
'Noise': np.moveaxis(noise_map[mod], 0, 1),
'BadPixelsDark': np.moveaxis(bad_pixels_map[mod], 0, 1),
}
md = None
for key, const_data in constants.items():
const = getattr(Constants.jungfrau, key)()
const.data = const_data
for parm in condition.parameters:
if parm.name == "Integration Time":
parm.lower_deviation = time_limits
parm.upper_deviation = time_limits
if db_output:
md = send_to_db(
db_module=db_mod,
karabo_id=karabo_id,
constant=const,
condition=condition,
file_loc=file_loc,
report_path=report,
cal_db_interface=cal_db_interface,
creation_time=creation_time,
timeout=cal_db_timeout,
)
if local_output:
md = save_const_to_h5(
db_module=db_mod,
karabo_id=karabo_id,
constant=const,
condition=condition,
data=const.data,
file_loc=file_loc,
report=report,
creation_time=creation_time,
out_folder=out_folder,
)
print(f"Calibration constant {key} is stored locally at {out_folder}.\n")
print("Constants parameter conditions are:\n")
print(
f"• Bias voltage: {bias_voltage}\n"
f"• Memory cells: {memory_cells}\n"
f"• Integration time: {integration_time}\n"
f"• Gain setting: {gain_setting}\n"
f"• Gain mode: {gain_mode}\n"
f"• Creation time: {md.calibration_constant_version.begin_at if md is not None else creation_time}\n") # noqa
step_timer.done_step("Injecting constants.")
```
%% Cell type:code id: tags:
``` python
print(f"Total processing time {step_timer.timespan():.01f} s")
step_timer.print_summary()
```
%% Cell type:code id: tags:
``` python
# now we need the old constants
old_const = {}
old_mdata = {}
old_retrieval_res.wait()
for (mod, const), (data, timestamp, filepath, h5path) in zip(
mod_x_const, old_retrieval_res.get()):
old_const.setdefault(mod, {})[const] = data
old_mdata.setdefault(mod, {})[const] = {
"timestamp": timestamp,
"filepath": filepath,
"h5path": h5path,
}
```
%% Cell type:code id: tags:
``` python
display(Markdown("## The following pre-existing constants are used for comparison:"))
for mod, consts in old_mdata.items():
pdu = db_modules[karabo_da.index(mod)]
display(Markdown(f"- {mod} ({pdu})"))
for const in consts:
display(Markdown(f" - {const} at {consts[const]['timestamp']}"))
# saving locations of old constants for summary notebook
with open(f"{metadata_folder or out_folder}/module_metadata_{mod}.yml", "w") as fd:
yaml.safe_dump(
{
"module": mod,
"pdu": pdu,
"old-constants": old_mdata[mod],
},
fd,
)
```