Compare revisions
Target project: calibration/pycalibration
Commits on Source (334)
Showing with 1830 additions and 236 deletions
......@@ -8,8 +8,6 @@
*.npy
*.out
*.pkl
*.png
*.png
*.secrets.yaml
*.so
*.tar
......
......@@ -55,7 +55,6 @@ pytest:
automated_test:
variables:
OUTPUT: $CI_MERGE_REQUEST_SOURCE_BRANCH_NAME
REFERENCE: reference_folder
DETECTORS: all
CALIBRATION: all
stage: automated_test
......@@ -65,10 +64,10 @@ automated_test:
<<: *before_script
script:
- export LANG=C # Hopefully detect anything relying on locale
- python3 -m pip install ".[automated_test]"
- python3 -m pip install ".[test]"
- echo "Running automated test. This can take sometime to finish depending on the test data."
- echo "Given variables are REFERENCE=$REFERENCE, OUTPUT=$OUTPUT, DETECTORS=$DETECTORS, CALIBRATION=$CALIBRATION"
- python3 -m pytest ./tests/test_reference_runs --color yes --verbose --release-test --reference-folder /gpfs/exfel/data/scratch/xcaltst/test/$REFERENCE --out-folder /gpfs/exfel/data/scratch/xcaltst/test/$OUTPUT --detectors $DETECTORS --calibration $CALIBRATION --find-difference
- python3 -m pytest ./tests/test_reference_runs --color yes --verbose --release-test --reference-folder /gpfs/exfel/d/cal_tst/reference_folder --out-folder /gpfs/exfel/data/scratch/xcaltst/test/$OUTPUT --detectors $DETECTORS --calibration $CALIBRATION
timeout: 24 hours
cython-editable-install-test:
......
......@@ -17,7 +17,7 @@ repos:
# If `CI_MERGE_REQUEST_TARGET_BRANCH_SHA` env var is set then this will
# run flake8 on the diff of the merge request, otherwise it will run
# flake8 as it would usually execute via the pre-commit hook
entry: bash -c 'if [ -z ${CI_MERGE_REQUEST_TARGET_BRANCH_SHA} ]; then (flake8 "$@"); else (git diff $CI_MERGE_REQUEST_TARGET_BRANCH_SHA...$CI_MERGE_REQUEST_SOURCE_BRANCH_SHA | flake8 --diff); fi' --
entry: bash -c 'if [ -z ${CI_MERGE_REQUEST_TARGET_BRANCH_SHA} ]; then (flake8 "$@" --max-line-length 88); else (git diff $CI_MERGE_REQUEST_TARGET_BRANCH_SHA...$CI_MERGE_REQUEST_SOURCE_BRANCH_SHA | flake8 --diff --max-line-length 88); fi' --
- repo: https://github.com/myint/rstcheck
rev: 3f92957478422df87bd730abde66f089cc1ee19b # commit where pre-commit support was added
hooks:
......
......@@ -6,9 +6,8 @@
version: 2
# Build documentation in the docs/ directory with Sphinx
sphinx:
configuration: docs/source/conf.py
fail_on_warning: false
mkdocs:
configuration: mkdocs.yml
# Optionally set the version of Python and requirements required to build your docs
python:
......@@ -17,5 +16,3 @@ python:
- requirements: docs/requirements.txt
- method: pip
path: .
extra_requirements:
- docs
# Makefile for Sphinx documentation
#
# You can set these variables from the command line.
SPHINXOPTS =
SPHINXBUILD = sphinx-build
PAPER =
BUILDDIR = build
# Internal variables.
PAPEROPT_a4 = -D latex_paper_size=a4
PAPEROPT_letter = -D latex_paper_size=letter
ALLSPHINXOPTS = -d $(BUILDDIR)/doctrees $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
# the i18n builder cannot share the environment and doctrees with the others
I18NSPHINXOPTS = $(PAPEROPT_$(PAPER)) $(SPHINXOPTS) source
.PHONY: help
help:
@echo "Please use \`make <target>' where <target> is one of"
@echo " html to make standalone HTML files"
@echo " dirhtml to make HTML files named index.html in directories"
@echo " singlehtml to make a single large HTML file"
@echo " pickle to make pickle files"
@echo " json to make JSON files"
@echo " htmlhelp to make HTML files and a HTML help project"
@echo " qthelp to make HTML files and a qthelp project"
@echo " applehelp to make an Apple Help Book"
@echo " devhelp to make HTML files and a Devhelp project"
@echo " epub to make an epub"
@echo " epub3 to make an epub3"
@echo " latex to make LaTeX files, you can set PAPER=a4 or PAPER=letter"
@echo " latexpdf to make LaTeX files and run them through pdflatex"
@echo " latexpdfja to make LaTeX files and run them through platex/dvipdfmx"
@echo " text to make text files"
@echo " man to make manual pages"
@echo " texinfo to make Texinfo files"
@echo " info to make Texinfo files and run them through makeinfo"
@echo " gettext to make PO message catalogs"
@echo " changes to make an overview of all changed/added/deprecated items"
@echo " xml to make Docutils-native XML files"
@echo " pseudoxml to make pseudoxml-XML files for display purposes"
@echo " linkcheck to check all external links for integrity"
@echo " doctest to run all doctests embedded in the documentation (if enabled)"
@echo " coverage to run coverage check of the documentation (if enabled)"
@echo " dummy to check syntax errors of document sources"
.PHONY: clean
clean:
rm -rf $(BUILDDIR)/*
.PHONY: html
html:
$(SPHINXBUILD) -b html $(ALLSPHINXOPTS) $(BUILDDIR)/html
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/html."
.PHONY: dirhtml
dirhtml:
$(SPHINXBUILD) -b dirhtml $(ALLSPHINXOPTS) $(BUILDDIR)/dirhtml
@echo
@echo "Build finished. The HTML pages are in $(BUILDDIR)/dirhtml."
.PHONY: singlehtml
singlehtml:
$(SPHINXBUILD) -b singlehtml $(ALLSPHINXOPTS) $(BUILDDIR)/singlehtml
@echo
@echo "Build finished. The HTML page is in $(BUILDDIR)/singlehtml."
.PHONY: pickle
pickle:
$(SPHINXBUILD) -b pickle $(ALLSPHINXOPTS) $(BUILDDIR)/pickle
@echo
@echo "Build finished; now you can process the pickle files."
.PHONY: json
json:
$(SPHINXBUILD) -b json $(ALLSPHINXOPTS) $(BUILDDIR)/json
@echo
@echo "Build finished; now you can process the JSON files."
.PHONY: htmlhelp
htmlhelp:
$(SPHINXBUILD) -b htmlhelp $(ALLSPHINXOPTS) $(BUILDDIR)/htmlhelp
@echo
@echo "Build finished; now you can run HTML Help Workshop with the" \
".hhp project file in $(BUILDDIR)/htmlhelp."
.PHONY: qthelp
qthelp:
$(SPHINXBUILD) -b qthelp $(ALLSPHINXOPTS) $(BUILDDIR)/qthelp
@echo
@echo "Build finished; now you can run "qcollectiongenerator" with the" \
".qhcp project file in $(BUILDDIR)/qthelp, like this:"
@echo "# qcollectiongenerator $(BUILDDIR)/qthelp/EuropeanXFELOfflineCalibration.qhcp"
@echo "To view the help file:"
@echo "# assistant -collectionFile $(BUILDDIR)/qthelp/EuropeanXFELOfflineCalibration.qhc"
.PHONY: applehelp
applehelp:
$(SPHINXBUILD) -b applehelp $(ALLSPHINXOPTS) $(BUILDDIR)/applehelp
@echo
@echo "Build finished. The help book is in $(BUILDDIR)/applehelp."
@echo "N.B. You won't be able to view it unless you put it in" \
"~/Library/Documentation/Help or install it in your application" \
"bundle."
.PHONY: devhelp
devhelp:
$(SPHINXBUILD) -b devhelp $(ALLSPHINXOPTS) $(BUILDDIR)/devhelp
@echo
@echo "Build finished."
@echo "To view the help file:"
@echo "# mkdir -p $$HOME/.local/share/devhelp/EuropeanXFELOfflineCalibration"
@echo "# ln -s $(BUILDDIR)/devhelp $$HOME/.local/share/devhelp/EuropeanXFELOfflineCalibration"
@echo "# devhelp"
.PHONY: epub
epub:
$(SPHINXBUILD) -b epub $(ALLSPHINXOPTS) $(BUILDDIR)/epub
@echo
@echo "Build finished. The epub file is in $(BUILDDIR)/epub."
.PHONY: epub3
epub3:
$(SPHINXBUILD) -b epub3 $(ALLSPHINXOPTS) $(BUILDDIR)/epub3
@echo
@echo "Build finished. The epub3 file is in $(BUILDDIR)/epub3."
.PHONY: latex
latex:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo
@echo "Build finished; the LaTeX files are in $(BUILDDIR)/latex."
@echo "Run \`make' in that directory to run these through (pdf)latex" \
"(use \`make latexpdf' here to do that automatically)."
.PHONY: latexpdf
latexpdf:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through pdflatex..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
.PHONY: latexpdfja
latexpdfja:
$(SPHINXBUILD) -b latex $(ALLSPHINXOPTS) $(BUILDDIR)/latex
@echo "Running LaTeX files through platex and dvipdfmx..."
$(MAKE) -C $(BUILDDIR)/latex all-pdf-ja
@echo "pdflatex finished; the PDF files are in $(BUILDDIR)/latex."
.PHONY: text
text:
$(SPHINXBUILD) -b text $(ALLSPHINXOPTS) $(BUILDDIR)/text
@echo
@echo "Build finished. The text files are in $(BUILDDIR)/text."
.PHONY: man
man:
$(SPHINXBUILD) -b man $(ALLSPHINXOPTS) $(BUILDDIR)/man
@echo
@echo "Build finished. The manual pages are in $(BUILDDIR)/man."
.PHONY: texinfo
texinfo:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo
@echo "Build finished. The Texinfo files are in $(BUILDDIR)/texinfo."
@echo "Run \`make' in that directory to run these through makeinfo" \
"(use \`make info' here to do that automatically)."
.PHONY: info
info:
$(SPHINXBUILD) -b texinfo $(ALLSPHINXOPTS) $(BUILDDIR)/texinfo
@echo "Running Texinfo files through makeinfo..."
make -C $(BUILDDIR)/texinfo info
@echo "makeinfo finished; the Info files are in $(BUILDDIR)/texinfo."
.PHONY: gettext
gettext:
$(SPHINXBUILD) -b gettext $(I18NSPHINXOPTS) $(BUILDDIR)/locale
@echo
@echo "Build finished. The message catalogs are in $(BUILDDIR)/locale."
.PHONY: changes
changes:
$(SPHINXBUILD) -b changes $(ALLSPHINXOPTS) $(BUILDDIR)/changes
@echo
@echo "The overview file is in $(BUILDDIR)/changes."
.PHONY: linkcheck
linkcheck:
$(SPHINXBUILD) -b linkcheck $(ALLSPHINXOPTS) $(BUILDDIR)/linkcheck
@echo
@echo "Link check complete; look for any errors in the above output " \
"or in $(BUILDDIR)/linkcheck/output.txt."
.PHONY: doctest
doctest:
$(SPHINXBUILD) -b doctest $(ALLSPHINXOPTS) $(BUILDDIR)/doctest
@echo "Testing of doctests in the sources finished, look at the " \
"results in $(BUILDDIR)/doctest/output.txt."
.PHONY: coverage
coverage:
$(SPHINXBUILD) -b coverage $(ALLSPHINXOPTS) $(BUILDDIR)/coverage
@echo "Testing of coverage in the sources finished, look at the " \
"results in $(BUILDDIR)/coverage/python.txt."
.PHONY: xml
xml:
$(SPHINXBUILD) -b xml $(ALLSPHINXOPTS) $(BUILDDIR)/xml
@echo
@echo "Build finished. The XML files are in $(BUILDDIR)/xml."
.PHONY: pseudoxml
pseudoxml:
$(SPHINXBUILD) -b pseudoxml $(ALLSPHINXOPTS) $(BUILDDIR)/pseudoxml
@echo
@echo "Build finished. The pseudo-XML files are in $(BUILDDIR)/pseudoxml."
.PHONY: dummy
dummy:
$(SPHINXBUILD) -b dummy $(ALLSPHINXOPTS) $(BUILDDIR)/dummy
@echo
@echo "Build finished. Dummy builder generates no files."
div.autodoc-docstring {
padding-left: 20px;
margin-bottom: 30px;
border-left: 5px solid rgba(230, 230, 230);
}
div.autodoc-members {
padding-left: 20px;
margin-bottom: 15px;
}
/* :root > * {
--md-primary-fg-color: #152066;
--md-primary-fg-color--light:#3c3b72;
--md-primary-fg-color--dark: #000020;
--md-primary-bg-color: #ffffff;
--md-primary-bg-color--light:#B2B2B2;
--md-footer-bg-color: #000020;
--md-accent-fg-color: #f39200;
--md-accent-fg-color--transparent: #f3920085;
--md-accent-bg-color: #ffffff;
--md-accent-bg-color--light: #ffffff;
} */
[data-md-color-scheme="light"] {
/* // Default color shades */
--md-primary-fg-color: #152066;
--md-primary-fg-color--light:#3c3b72;
--md-primary-fg-color--dark: #000020;
--md-primary-bg-color: #ffffff;
--md-primary-bg-color--light:#B2B2B2;
--md-footer-bg-color: #000020;
--md-accent-fg-color: #f39200;
--md-accent-fg-color--transparent: #f3920085;
--md-accent-bg-color: #ffffff;
--md-accent-bg-color--light: #ffffff;
--md-typeset-a-color: #2840dd;
}
[data-md-color-scheme="slate"] {
--md-primary-fg-color: #f39200;
--md-primary-fg-color--light:#f3920085;
--md-primary-fg-color--dark: #da996f;
--md-primary-bg-color: #ffffff;
--md-primary-bg-color--light:#B2B2B2;
--md-footer-bg-color: #000020;
--md-accent-fg-color: #fcda9d;
--md-accent-fg-color--transparent: #3c3b72;
--md-accent-bg-color: #ffffff;
--md-accent-bg-color--light: #ffffff;
/* // Default color shades */
--md-default-fg-color: hsla(0, 0%, 100%, 1);
--md-default-fg-color--light: hsla(0, 0%, 100%, 0.87);
--md-default-fg-color--lighter: hsla(0, 0%, 100%, 0.32);
--md-default-fg-color--lightest: hsla(0, 0%, 100%, 0.12);
--md-default-bg-color: hsla(232, 15%, 21%, 1);
--md-default-bg-color--light: hsla(232, 15%, 21%, 0.54);
--md-default-bg-color--lighter: hsla(232, 15%, 21%, 0.26);
--md-default-bg-color--lightest: hsla(232, 15%, 21%, 0.07);
/* // Code color shades */
--md-code-bg-color: hsla(232, 15%, 18%, 1);
--md-code-fg-color: hsla(60, 30%, 96%, 1);
/* // Text color shades */
--md-text-color: var(--md-default-fg-color--light);
--md-text-link-color: var(--md-primary-fg-color);
/* // Admonition color shades */
--md-admonition-bg-color: hsla(0, 0%, 100%, 0.025);
--md-admonition-fg-color: var(--md-default-fg-color);
/* // Footer color shades */
--md-footer-bg-color: hsla(230, 9%, 13%, 0.87);
--md-footer-bg-color--dark: hsla(232, 15%, 10%, 1);
--md-typeset-a-color: #f39200;
}
\ No newline at end of file
code, .rst-content tt, .rst-content code {
white-space: pre;
}
\ No newline at end of file
.. _advanced_topics:
# Advanced Topics
Advanced Topics
===============
!!! warning
The following tasks should only be carried out by trained staff.
The following tasks should only be carried out by trained staff.
Request dark characterization
-----------------------------
## Extending Correction Notebooks on User Request
The script runs dark characterization notebook with default parameters via web service. The data needs to be transferred via the MDC, however, the web service will wait for any transfer to be completed. The detector is chosen automatically with respect to given instrument (`--instrument`). For AGIPD, LPD, Jungfrau runs for the three gain stages need to be given (`--run-high`, `--run-med`, `--run-low`). For FastCCD and ePIX only a single run needs to be given (`--run`).
The complete list of parameters is::
-h, --help show this help message and exit
--proposal PROPOSAL The proposal number, without leading p, but with
leading zeros
--instrument {SPB,MID,FXE,SCS,SQS,HED}
The instrument
--cycle CYCLE The facility cycle
--run-high RUN_HIGH Run number of high gain data as an integer
--run-med RUN_MED Run number of medium gain data as an integer
--run-low RUN_LOW Run number of low gain data as an integer
--run RUN Run number as an integer
The path to data files is defined from script parameters. A typical data path, which can be found in the MDC is::
/gpfs/exfel/exp/MID/201930/p900071/raw
Where `MID` is an instrument name, `201930` is a cycle, `900071` is a proposal number.
Extending Correction Notebooks on User Request
----------------------------------------------
Internally, each automated correction run will trigger `calibrate_nbc.py` to be called
anew on the respective notebook. This means that any changes to save to this notebook
will be picked up for subsequent runs.
Internally, each automated correction run will trigger
`calibrate_nbc.py` to be called anew on the respective
notebook. This means that any changes you save to this notebook will be
picked up for subsequent runs.
This can be useful to add user requests while running. For this:
1. create a working copy of the notebook in question, and create a commit of the the
production notebook to fall back to in case of problems::
1. create a working copy of the notebook in question, and create a
commit of the production notebook to fall back to in case of
problems:
git add production_notebook_NBC.py
git commit -m "Known working version before edits"
cp production_notebook_NBC.py production_notebook_TEST.py
    git add production_notebook_NBC.py
    git commit -m "Known working version before edits"
    cp production_notebook_NBC.py production_notebook_TEST.py
2. add any feature there and *thouroughly* test them
3. when you are happy with the results, copy them over into the production notebook and
save.
2. add any feature there and *thoroughly* test them
3. when you are happy with the results, copy them over into the
production notebook and save.
.. warning::
Live editing of correction notebooks is fully at your responsiblity. Do not do it
if you are not 100% sure you know what you are doing.
???+ warning
4. If it fails, revert back to the original state, ideally via git::
Live editing of correction notebooks is fully at your responsibility. Do
not do it if you are not 100% sure you know what you are doing.
git checkout HEAD -- production_notebook_NBC.py
1. If it fails, revert to the original state, ideally via git:
5. Any runs which did not correct do to failures of the live edit can then be relaunched
manually, assuming the correction notebook allows run and overwrite paramters::
git checkout HEAD -- production_notebook_NBC.py
xfel-calibrate ...... --run XYZ,ZXY-YYS --overwrite
2. Any runs which did not correct due to failures of the live edit can
then be relaunched manually, assuming the correction notebook allows
run and overwrite parameters:
xfel-calibrate ...... --run XYZ,ZXY-YYS --overwrite
Using a Parameter Generator Function
------------------------------------
By default, the parameters to be exposed to the command line are deduced from the
first code cell of the notebook, after resolving the notebook itself from the
detector and characterization type. For some applications it might be beneficial
to define a context-specific parameter range within the same notebook, based on
additional user input. This can be done via a parameter generation function which
is defined in one of the code cell::
By default, the parameters to be exposed to the command line are deduced
from the first code cell of the notebook, after resolving the notebook
itself from the detector and characterization type. For some
applications it might be beneficial to define a context-specific
parameter range within the same notebook, based on additional user
input. This can be done via a parameter generation function which is
defined in one of the code cell:
def extend_parms(detector_instance):
from iCalibrationDB import Conditions
......@@ -110,20 +84,22 @@ is defined in one of the code cell::
return "\n".join(all_conditions)
.. note::
???+ note
Note how all imports are inlined, as the function is executed outside
the notebook context.
Note how all imports are inlined, as the function is executed outside the
notebook context.
In the example, the function generates a list of additional parameters depending
on the `detector_instance` given. Here, `detector_instance` is defined in the first
code cell the usual way. Any other parameters defined such, that have names matching
those of the generator function signature are passed to this function. The function
should then return a string containing additional code to be appended to the first
code cell.
In the example, the function generates a list of additional parameters
depending on the `detector_instance` given. Here,
`detector_instance` is defined in the first code cell in the
usual way. Any other parameters defined in this way that have names matching
those of the generator function signature are passed to this function.
The function should then return a string containing additional code to
be appended to the first code cell.
To make use of this functionality, the parameter generator function needs to be
configured in `notebooks.py`, e.g. ::
To make use of this functionality, the parameter generator function
needs to be configured in `notebooks.py`, e.g.:
...
"GENERIC": {
......@@ -137,8 +113,8 @@ configured in `notebooks.py`, e.g. ::
}
...
To generically query which parameters are defined in the first code cell, the
code execution history feature of iPython can be used::
To generically query which parameters are defined in the first code
cell, the code execution history feature of iPython can be used:
ip = get_ipython()
session = ip.history_manager.get_last_session_id()
......@@ -157,5 +133,5 @@ code execution history feature of iPython can be used::
if parms[n] == "None" or parms[n] == "'None'":
parms[n] = None
This will create a dictionary `parms` which contains all parameters either
as `float` or `str` values.
This will create a dictionary `parms` which contains all
parameters either as `float` or `str` values.
# xfel-calibrate configuration
The European XFEL offline calibration is executed using a command line interface.
Running `xfel-calibrate DETECTOR CALIBRATION --<configurations>` submits the notebook of the selected detector calibration to [SLURM][slurm] nodes on Maxwell.
The offline calibration CLI machinery consists of several configuration pieces that are necessary for the calibration process. These files define which notebook to process and how to submit it on Maxwell resources.
- `settings.py`: Consists of the tool's environment definitions.
- `notebooks.py`: The module where every calibration notebook is connected to a detector calibration for the CLI.
## Settings
`settings.py` is a Python configuration file which configures the tool's environment.
```py
# path into which temporary files from each run are placed
temp_path = "{}/temp/".format(os.getcwd())
# Path to use for calling Python. If the environment is correctly set, simply the command
python_path = "python"
# Path to store reports in
report_path = "{}/calibration_reports/".format(os.getcwd())
# Also try to output the report to an out_folder defined by the notebook
try_report_to_output = True
# the command to run this concurrently. It is prepended to the actual call
launcher_command = "sbatch -p exfel -t 24:00:00 --mem 500G --mail-type END --requeue --output {temp_path}/slurm-%j.out"
```
## Notebooks
The `notebooks.py` module is responsible for configuring the connection between the notebooks and the command line. It does this with a nested dictionary structure: the first level contains a key for each detector, the second level contains keys for the calibration types, and the third level contains the names of the notebooks (`notebook`, `pre_notebooks`, and `dep_notebooks`) along with the relevant concurrency parameters. By organizing the configuration in this way, the `notebooks.py` module provides a clear and flexible way of connecting the notebooks to the command line.
!!! example "Example for `xfel-calibrate/notebooks.py`"
```python
notebooks = {
"AGIPD": {
"DARK": {
"notebook":
"notebooks/AGIPD/Characterize_AGIPD_Gain_Darks_NBC.ipynb",
"dep_notebooks": [
"notebooks/generic/overallmodules_Darks_Summary_NBC.ipynb"],
"concurrency": {"parameter": "modules",
"use function": "find_modules",
"cluster cores": 8},
},
"PC": {
"notebook": "notebooks/AGIPD/Chracterize_AGIPD_Gain_PC_NBC.ipynb",
"concurrency": {"parameter": "modules",
"default concurrency": 16,
"cluster cores": 32},
},
"CORRECT": {
"pre_notebooks": ["notebooks/AGIPD/AGIPD_Retrieve_Constants_Precorrection.ipynb"],
"notebook": "notebooks/AGIPD/AGIPD_Correct_and_Verify.ipynb",
"dep_notebooks": [
"notebooks/AGIPD/AGIPD_Correct_and_Verify_Summary_NBC.ipynb"],
"concurrency": {"parameter": "sequences",
"use function": "balance_sequences",
"default concurrency": [-1],
"cluster cores": 16},
},
...
}
}
```
As previously explained, the DARK and CORRECT nested dictionaries that correspond to different calibration types contain references to the notebooks that will be executed and specify the concurrency settings for the main calibration notebook.
- `notebook`: The main calibration notebook; this is the notebook affected by the concurrency configuration.
- `pre_notebooks`: These notebooks run before the main notebook, as they usually prepare essential data for it before it runs over multiple [SLURM][slurm] nodes, e.g. retrieving calibration constant file paths before processing.
- `dep_notebooks`: These notebooks depend on the processing of all [SLURM][slurm] nodes running the main notebook, e.g. producing summary plots over the processed files.
!!! tip
It is good practice to name command line enabled notebooks with an `_NBC` suffix as shown in the above example.
- `concurrency` dictionary:
    - `parameter`: The parameter name used to distribute the notebook's processing across multiple computing resources. This parameter should be of type list.
    - `use function`: If a function in the notebook should determine how the work is split across [SLURM][slurm] nodes, it can be named here. The function is expected to be defined in the first cell of the main notebook, e.g. [balance_sequences](../reference/xfel-calibrate/calibrate.md#balance_sequences). A hypothetical sketch of such a helper follows this list.

        !!! Note
            The function only needs to be defined, not executed, within the notebook context itself.

    - `default concurrency`: The default concurrency to use if not defined by the user, e.g. `default concurrency: 16` with a `parameter` named `modules` of type `list(int)` leads to running 16 concurrent SLURM jobs with `modules` values 0 to 15, one per node.
    - `cluster cores`: Only for notebooks using ipcluster; the number of cores to use.
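The following is a hypothetical sketch of what such a splitting helper might do. The name, signature, and chunking strategy are illustrative assumptions only; the real helpers (e.g. `find_modules` or `balance_sequences`) are defined in the first cell of the corresponding notebooks and have their own signatures.

```python
# Hypothetical sketch of a concurrency "use function": split the concurrency
# parameter (here a list of module indices) into chunks, one per SLURM job.
# Name and signature are illustrative assumptions, not the real cal_tools helpers.
def split_modules(modules, max_nodes=8):
    """Split a list of module indices into chunks, one chunk per SLURM job."""
    if not modules:
        return []
    n_chunks = min(max_nodes, len(modules))
    chunk_size = -(-len(modules) // n_chunks)  # ceiling division
    return [modules[i:i + chunk_size] for i in range(0, len(modules), chunk_size)]


# Example: 16 AGIPD modules over at most 4 nodes -> 4 jobs of 4 modules each.
print(split_modules(list(range(16)), max_nodes=4))
```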
[slurm]: https://slurm.schedmd.com/documentation.html
# Contributing to GitLab
- Branches prefixes:
- Feature: feat/
- Fix: fix/
- Documentation: doc/
- Refactoring: refactor/
- Use short, well-defined branch names.
- Avoid multiple unrelated changes in one merge request.
- Add common abbreviations to the merge request name, e.g. `DETECTOR` `CALIBRATION` >> `AGIPD` `DARK`.
- Add docstrings to all functions, comments to complicated lines of code, and explanations for fixed (magic) numbers. Google style docstrings are recommended.
- WIP merge requests are those not yet finished by the author.
- A non-WIP MR is considered finished; adding newly requested features during the review should be avoided.
  If a bug or change is fixed by the author during the review, the reviewers should be pointed to the new change.
- Add a description to your merge request: summary, tests, reviewers, ...
- Link the merge request to any related external documents and GitLab issues or MRs.
- Add more than one reviewer per merge request.
- Merge requests can be merged only after receiving 2 approvals or LGTMs, and all discussions are resolved.
- Delete source branch during merging to master.
- Reviewers are free to comment on anything, from importing libraries in alphabetical order to software design (suggestions are very welcome).
- Notebooks should be added to a branch/master without cell outputs.
\ No newline at end of file
# How to start writing a calibration notebook
Author: European XFEL Detector Group, Version 0.1
This is an example notebook that points out some common practices used in production notebooks.
It uses ePix100 detector RAW data to apply offset and gain corrections.
It is meant as a starting point for writing calibration notebooks that can run in production through the
`xfel-calibrate` CLI. However, it is recommended to have a look at the production notebooks residing
in the `/notebooks/` directory, which demonstrate more advanced practices that can help you during your notebook development.
```python
# This first code cell must always contain the global notebook parameters.
# The parameters are parsed as input arguments for the `xfel-calibrate` command line interface.
# It is very important to have a comment for each parameter. The comments are not only helpful within the notebook,
# but they are used as the parameter description when `xfel-calibrate DETECTOR CALIBRATION --help` is used.
in_folder = "/gpfs/exfel/exp/CALLAB/202130/p900203/raw/" # directory to read data from, required
out_folder = "/gpfs/exfel/exp/CALLAB/202130/p900203/scratch/how_write_xfel_calibrate_NBC" # directory to output to, required
# Adding `required` at the comment here forces the user to add the parameter through the command line,
# ignoring the default value and only using it as an indication of the expected type.
run = 9046 # runs to process, required
# Parameters for accessing the raw data.
karabo_da = "EPIX01" # Data aggregator names. For multi-modular detectors like AGIPD and LPD, this is a list.
# To access the correct data files and calibration constants. The karabo_id string is used as a detector identifier.
karabo_id = "HED_IA1_EPX100-1" # Detector karabo_id name
# Boolean parameter can be set to False from xfel-calibrate by adding `no-` at the beginning of the boolean parameter name.
gain_correction = True # Proceed with gain correction.
# Parameters for the calibration database.
creation_time = "" # The timestamp to use with Calibration DB. Required Format: "YYYY-MM-DD hh:mm:ss" e.g. 2019-07-04 11:02:41
# It is preferred if operating conditions are read from RAW data instead of passed as an input argument.
bias_voltage = 200 # RAW data bias voltage.
in_vacuum = False # Detector operated in vacuum
photon_energy = 8.048 # Photon energy used for gain calibration
fix_temperature = 290 # fixed temperature value in Kelvin.
# Parameters affecting writing corrected data.
chunk_size_idim = 1 # H5 chunking size of output data
```
```python
from pathlib import Path
# This cell comes right after the first notebook cell. It is good practice to import all needed libraries and modules here,
# just as in a normal Python module.
import matplotlib.pyplot as plt
import numpy as np
# To access data `extra_data` is used to read RAW/CORR data.
from extra_data import RunDirectory # https://extra-data.readthedocs.io/en/latest/
from extra_geom import Epix100Geometry # https://extra-geom.readthedocs.io/en/latest/
# For parallelization with a notebook it is suggested to use multiprocessing.
import multiprocessing # or
import pasha as psh # https://github.com/European-XFEL/pasha
# This library uses multiprocessing and provides tight integration with extra_data.
# `cal_tools` directory consists of multiple useful functions that are used in many notebooks.
import cal_tools.restful_config as rest_cfg
# `calcat_interface` is the main module with functions to retrieve calibration constants from CALCAT.
from cal_tools.calcat_interface import EPIX100_CalibrationData
from cal_tools.epix100 import epix100lib
# `cal_tools.files` is recommended to write corrected data.
from cal_tools.files import DataFile
# An internal class to record computation time.
from cal_tools.step_timing import StepTimer
# `tools` contains various functions to read files, wrappers for iCalibrationDB, etc.
from cal_tools.tools import (
    calcat_creation_time,
)
```
## Prepare global variables
In the following cells it is common practice to start assigning global variables,
such as converting `in_folder` and `out_folder` to `Path` objects or initializing the `step_timer` object.
```python
# Convert main folders to Paths.
in_folder = Path(in_folder)
out_folder = Path(out_folder)
# This is only needed when running the notebook interactively. Otherwise, the machinery takes care of this.
out_folder.mkdir(parents=True, exist_ok=True)
run_folder = in_folder / f"r{run:04d}"
# Initiate the main Run data collection.
run_dc = RunDirectory(
    run_folder, include="*S00000*").select(f"*{karabo_id}*", require_all=True)
print(f"The available source to correct for {karabo_id} are {list(run_dc.all_sources)}")
step_timer = StepTimer()
```
The available source to correct for HED_IA1_EPX100-1 are ['HED_IA1_EPX100-1/DET/CONTROL', 'HED_IA1_EPX100-1/DET/RECEIVER', 'HED_IA1_EPX100-1/DET/RECEIVER:daqOutput']
## Read operating conditions from RAW data
It is recommended to read the calibration constants' operating conditions directly from RAW data,
to avoid wrong values being passed in through the notebook's input arguments.
Unfortunately, these conditions may not be stored in RAW data
while the detector is in its early operation stages.
Below we give an example of reading the integration time of the data. There are multiple functions, and classes similar
to `epix100Ctrl` for other detectors, that are used for the same purpose.
```python
# Read control data.
data_source = "HED_IA1_EPX100-1/DET/RECEIVER:daqOutput"
ctrl_data = epix100lib.epix100Ctrl(
    run_dc=run_dc,
    instrument_src=data_source,
    ctrl_src=f"{karabo_id}/DET/CONTROL",
)
integration_time = ctrl_data.get_integration_time()
```
## Retrieve needed calibration constants
Usually there is a cell in which we retrieve calibration constants before correction,
and sometimes before processing new calibration constants.
In this example we use the `EPIX100_CalibrationData` class to initialize an object with
the necessary operating conditions and creation time.
Below, operating condition values like `integration_time` and `sensor_temperature` are hard coded to specific values.
In production notebooks this is done differently.
```python
# Run creation time is important to get the correct calibration constant versions.
creation_time = calcat_creation_time(in_folder, run, creation_time)
print(f"Using {creation_time.isoformat()} as creation time")
epix_cal = EPIX100_CalibrationData(
    detector_name=karabo_id,
    sensor_bias_voltage=bias_voltage,
    integration_time=integration_time,
    sensor_temperature=fix_temperature,
    in_vacuum=in_vacuum,
    source_energy=photon_energy,
    event_at=creation_time,
    client=rest_cfg.calibration_client(),
)
const_data = epix_cal.ndarray_map()[karabo_da]
print(f"Retrieved calibrations for {karabo_id}: {list(const_data.keys())}")
```
Reading creation_date from input files metadata `INDEX/timestamp`
Using 2021-09-19T14:39:26.744069+00:00 as creation time
Retrieved calibrations for HED_IA1_EPX100-1: ['BadPixelsDarkEPix100', 'NoiseEPix100', 'OffsetEPix100', 'RelativeGainEPix100']
## Correcting Raw data
```python
data_key = "data.image.pixels"
raw_data = run_dc[data_source, data_key].ndarray()
dshape = raw_data.shape # Raw input data shape.
print(f"Number of trains to correct is {len(run_dc.train_ids)}")
def correct_train(wid, index, d):
    """Correct one train for ePix100 detector."""
    d -= const_data["OffsetEPix100"][..., 0]
    if gain_correction:
        d /= const_data["RelativeGainEPix100"]
    corr_data[index, ...] = d  # store the corrected train in the shared output array
step_timer.start()
context = psh.context.ThreadContext(num_workers=10)
corr_data = context.alloc(shape=dshape, dtype=np.float32)
context.map(correct_train, raw_data.astype(np.float32))
step_timer.done_step('Correcting data')
```
Number of trains to correct is 1000
Correcting data: 0.4 s
## Writing corrected data
```python
# Storing data.
out_file = out_folder / "CORR-R9046-EPIX01-S00000.h5"
instrument_source = "HED_IA1_EPX100-1/DET/RECEIVER:daqOutput"
image_counts = run_dc[instrument_source, "data.image.pixels"].data_counts(labelled=False)
step_timer.start()
raw_file = run_dc.files[0] # FileAccess object
with DataFile(out_file, "w") as ofile:
    # Create INDEX datasets.
    ofile.create_index(run_dc.train_ids, from_file=raw_file)
    # Create METADATA datasets.
    ofile.create_metadata(
        like=run_dc,
        sequence=raw_file.sequence,
        instrument_channels=(f"{instrument_source}/data",)
    )
    # Create INSTRUMENT section to later add corrected datasets.
    outp_source = ofile.create_instrument_source(instrument_source)
    # Create count/first datasets at INDEX source.
    outp_source.create_index(data=image_counts)
    # Add the main corrected `data.image.pixels` dataset and store corrected data.
    outp_source.create_key(
        "data.image.pixels", data=corr_data, chunks=((chunk_size_idim,) + dshape[1:]))
step_timer.done_step('Writing corrected data')
```
Writing corrected data: 0.5 s
# Plotting results
```python
geom = Epix100Geometry.from_origin()
fig, (ax1, ax2) = plt.subplots(nrows=1, ncols=2, figsize=(25, 10))
# Plotting mean data for RAW and CORRECTED across trains
geom.plot_data(np.mean(raw_data, axis=0), ax=ax1)
ax1.set_title("Mean RAW across trains")
geom.plot_data(np.mean(corr_data, axis=0), ax=ax2)
ax2.set_title("Mean CORR across trains")
plt.show()
```
![png](how_to_write_xfel_calibrate_notebook_NBC_files/how_to_write_xfel_calibrate_notebook_NBC_14_0.png)
# Installation
It's recommended to install the offline calibration (pycalibration)
package on Maxwell, using the anaconda/3 environment.
The following instructions clone from the EuXFEL GitLab instance using
SSH remote URLs, this assumes that you have set up SSH keys for use with
GitLab already. If you have not then read the appendix section on [SSH
Key Setup for GitLab](#ssh-key-setup-for-gitlab) for instructions on how
to do this.
## Installation using python virtual environment - recommended
`pycalibration` uses Python 3.8. Currently, the default
Python installation on Maxwell is still Python 3.6.8, so Python 3.8
needs to be loaded from a different location.

One option is to use the Maxwell Spack installation: running `module
load maxwell` will activate the test Spack instance from
DESY, then you can use `module load
python-3.8.6-gcc-10.2.0-622qtxd` to load Python 3.8. Note that
this Spack instance is currently in a trial phase and may not be stable.
Another option is to use `pyenv`. We provide a pyenv
installation at `/gpfs/exfel/sw/calsoft/.pyenv`, which we use
to manage different versions of Python. It can be activated with
`source /gpfs/exfel/sw/calsoft/.pyenv/bin/activate`.
A quick setup would be:
1. `source /gpfs/exfel/sw/calsoft/.pyenv/bin/activate`
2. `git clone ssh://git@git.xfel.eu:10022/detectors/pycalibration.git && cd pycalibration` -
clone the offline calibration package from EuXFEL GitLab
3. `pyenv shell 3.8.11` - load required version of python
4. `python3 -m venv .venv` - create the virtual environment
5. `source .venv/bin/activate` - activate the virtual environment
6. `python3 -m pip install --upgrade pip` - upgrade version of pip
7. `python3 -m pip install .` - install the pycalibration package (add
`-e` flag for editable development installation)
Copy/paste script:
```bash
source /gpfs/exfel/sw/calsoft/.pyenv/bin/activate
git clone ssh://git@git.xfel.eu:10022/detectors/pycalibration.git
cd pycalibration
pyenv shell 3.8.11
python3 -m venv .venv
source .venv/bin/activate
python3 -m pip install --upgrade pip
python3 -m pip install . # `-e` flag for editable install, e.g. `pip install -e .`
```
## Creating an ipython kernel for virtual environments
To create an ipython kernel with pycalibration available you should (if
using a venv) activate the virtual environment first, and then run:
```bash
python3 -m pip install ipykernel # If not using a venv add `--user` flag
python3 -m ipykernel install --user --name pycalibration --display-name "pycalibration" # If not using a venv pick different name
```
This can be useful for Jupyter notebook tools such as [max-jhub](https://max-jhub.desy.de/hub/login); see the [max-jhub documentation](https://rtd.xfel.eu/docs/data-analysis-user-documentation/en/latest/jhub/).
## SSH Key Setup for GitLab
It is highly recommended to set up SSH keys for access to GitLab as this
simplifies the setup process for all of our internal software present on
GitLab.
To set up the keys:
1. Connect to Maxwell
2. Generate a new keypair with `ssh-keygen -o -a 100 -t ed25519`, you
can either leave this in the default location (`~/.ssh/id_ed25519`)
or place it into a separate directory to make management of keys
easier if you already have multiple ones. If you are using a
password for your keys please check this page to learn how to manage
them:
<https://docs.github.com/en/github/authenticating-to-github/generating-a-new-ssh-key-and-adding-it-to-the-ssh-agent#adding-your-ssh-key-to-the-ssh-agent>
3. Add the public key (`id_ed25519.pub`) to your account on GitLab:
<https://git.xfel.eu/gitlab/profile/keys>
4. Add the following to your `~/.ssh/config` file
```bash
# Special flags for gitlab over SSH
Host git.xfel.eu
User git
Port 10022
ForwardX11 no
IdentityFile ~/.ssh/id_ed25519
```
Once this is done you can clone repositories you have access to from
GitLab without having to enter your password each time. As
`pycalibration` requirements are installed from SSH remote URLs, having
SSH keys set up is a requirement for installing pycalibration.
# pyCalibration automated tests
## Objective
Test the available detector calibrations against picked reference data
to prove software consistency and avoid sneaky bugs that would
affect the quality of the produced data.
1. Test the successful processing of the executed SLURM jobs.
    1. The `xfel-calibrate` CLI runs successfully.
    2. SLURM jobs are executed on Maxwell.
    3. SLURM jobs are COMPLETED at the end.
2. Test the availability of files in the out-folder.
    1. Validate the presence of a PDF report.
    2. Validate the number of HDF5 files against the number of HDF5 files in the reference folder.
    3. Validate the numerical values of the processed data against the reference data.
These tests are meant to run on all detector calibrations before any release, as well as per branch on the selected detectors/calibrations affected by the branch's changes.
## Current state
- Tests are defined in a dictionary in `callab_test.py`.
```py
{
"<test-name>(`<detector-calibration-operationMode>`)": {
"det-type": "<detectorType>",
"cal_type": "<calibrationType>",
"config": {
"in-folder": "<inFolderPath>",
"out-folder": "<outFolderPath>",
"run": "<runNumber>",
"karabo-id": "detector",
"<config-name>": "<config-value>",
...: ...,
},
"reference-folder": "{}/{}/{}",
}
...
}
```
- Tests are triggered using GitLab's manual CI pipeline trigger.
After opening a merge request on GitLab, the CI initiates the unit-test pipeline automatically. After these tests are finished, you get an option to trigger the automated test manually.
![manual action](../static/tests/manual_action.png)
Using this GitLab feature you can run the tests with no configuration; this results in running all of the test runs. This is usually appropriate if the change affects all detectors and all calibration notebooks, or before deploying new releases.
Otherwise, you can restrict the test to a specific `CALIBRATION` (DARK or CORRECT) and/or configure the list of detectors to run the tests for.
![Put arguments](../static/tests/given_argument_example.png)
!!! warning
    It is not recommended to run automated tests for more than one merge request at the same time, as these tests are memory consuming and all tests run on the same test node.
The progress report of the GitLab test pipeline jobs is used to collect useful information about the test results and the reasons for failure of any of the tested calibration runs.
- Tests can also be triggered locally using the CLI:
```bash
pytest tests/test_reference_runs/test_pre_deployment.py \
--release-test \
--reference-folder <reference-folder-path> \
--out-folder <out-folder-path>
```
- Arguments:
- required arguments:
- `release-test`: needed to trigger the automated test; this boolean flag exists to avoid triggering the test as part of the normal GitLab CI.
- `reference-folder`: the reference folder path. The reference folder is expected to have exactly the same structure as the out-folder; usually the reference folders are out-folders from previously tested, successful releases.
- `out-folder`: the output folder path for saving the test output files. The structure is `<detector>/<test-name>/[PDF, HDF5, YAML, SlurmOut, ...]`.
- optional arguments:
- `picked-test`: run the test for a single `<test-name>` only.
- `calibration`: pick only one calibration type (dark or correct) to run the test for.
- `detector`: pick the detectors to run the test for and skip the rest.
- `no-numerical-validation`: numerical validation is done by default; this argument skips it, stopping after executing the calibration and making sure that the SLURM jobs were COMPLETED.
- `validation-only`: if the test output and reference files are already available and only a validation check is needed, this argument runs the validation checks without executing any calibration jobs.
- Below are the steps taken to fully test the calibration files:
1. Run `xfel-calibrate DET CAL ...`, this will result in Slurm calibration jobs.
2. Check the Slurm jobs state after they finish processing.
3. Confirm that there is a PDF available in the output folder.
4. Validate the HDF5 files.
1. Compare the MD5 checksum for the output and reference.
2. Find the datasets/attributes that differ between the two files.
If any step fails, the whole test is marked as failed and the next test starts. A minimal sketch of the HDF5 validation step is shown below.
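The sketch below illustrates the idea behind the HDF5 validation step, using `hashlib` and `h5py` directly. It is illustrative only; the actual implementation in `tests/test_reference_runs` may differ.

```python
# Sketch: compare MD5 checksums first and, if they differ, walk both files to
# list datasets that are missing or whose values differ. Illustrative only.
import hashlib

import h5py
import numpy as np


def md5sum(path, blocksize=1024 * 1024):
    """MD5 checksum of a file, read in blocks to keep memory usage low."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(blocksize), b""):
            md5.update(block)
    return md5.hexdigest()


def diff_datasets(out_path, ref_path):
    """Return dataset names that are missing from or different in the output file."""
    differences = []
    with h5py.File(out_path, "r") as fout, h5py.File(ref_path, "r") as fref:
        ref_datasets = set()
        fref.visit(
            lambda name: ref_datasets.add(name)
            if isinstance(fref[name], h5py.Dataset) else None)
        for name in sorted(ref_datasets):
            if name not in fout:
                differences.append((name, "missing in output"))
            elif not np.array_equal(fout[name][()], fref[name][()]):
                differences.append((name, "values differ"))
    return differences


# Usage idea: a test would first compare md5sum(out_file) with md5sum(ref_file)
# and only fall back to diff_datasets(out_file, ref_file) when the checksums differ.
```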
# Development Workflow
We welcome contributions to the pipeline if you have calibration notebooks or algorithms that you believe could be useful. In order to facilitate the development process, we have provided a section that outlines the key points to consider during the development of new features. This section is designed to assist you throughout the development and review process, and ensure that your contributions are consistent with the pipeline's requirements. We believe that these guidelines will be helpful in creating a seamless development process and result in high-quality contributions that benefit the pipeline. If you have any questions or concerns regarding the development process, please do not hesitate to reach out to us for assistance. We look forward to working with you to enhance the pipeline's capabilities.
## Developing a notebook from scratch
Developing a notebook from scratch can be a challenging but rewarding process. Here are some key steps to consider:
1. Define the purpose
Start by identifying what you are trying to solve and the task you want to perform with your notebook.
- Does the user need to execute the notebook interactively?
- Should it run the same way as the production notebooks? It is recommended that the notebook is executed in the same way as the production notebooks through xfel-calibrate CLI.
??? Note "`xfel-calibrate` CLI is essential"
If the `xfel-calibrate` CLI is essential, you need to follow the guidelines on where and how to write the variables in the first notebook cell and how to include the notebook as one of the CLI calibration options to execute.
- Does the notebook need to generate a report at the end to display its results or can it run without any user interaction?
??? Note "A report is needed"
If a report is needed you should make sure to provide sufficient guidance and textual details using markdown cells and clear prints within the code. You should also structure the notebook cells into appropriate subsections.
2. Plan your workflow
Map out the steps your notebook will take, from data ingestion to analyzing results and visualization.
- What are the required data sources that the notebook needs to access or utilize? For example, GPFS or calibration database.
- Can the notebook's internal concurrency be optimized through the use of multiprocessing or is it necessary to employ host-level cluster computing with SLURM to achieve higher performance?
??? Note "SLURM concurrency is needed"
If SLURM concurrency is needed, you need to identify the variable on which the notebook will be replicated to split the processing.
- What visualization tools or techniques are necessary to provide an overview of the processing results generated by the notebook? Can you give examples of charts, graphs, or other visual aids that would be useful for understanding the output?
3. Write the code and include documentation
Begin coding your notebook based on your workflow plan. Use comments to explain code blocks and decisions.
- [PEP 8](https://peps.python.org/pep-0008/) code styling is highly recommended. It leads to code that is easier to read, understand, and maintain. Additionally, it is a widely accepted standard in the Python community, and following it makes your code more accessible to other developers and improves collaboration.
- [Google style docstrings](https://google.github.io/styleguide/pyguide.html) are our recommended way of documenting the code. By providing clear and concise descriptions of your functions and methods, including input and output parameters, potential exceptions, and other important details, you make it easier for other developers to understand the code, and for the mkdocs documentation to [reference it](SUMMARY.md).
4. Document the notebook and split into sections.
Enriching a notebook with documentation is an important step in creating a clear and easy-to-follow guide for others to use:
- Use Markdown cells to create titles and section headings: By using Markdown cells, you can create clear and descriptive headings for each section of your notebook. This makes it easier to navigate and understand the content of the notebook, but more importantly these are parsed while creating the PDF report using [sphinx][sphinx].
- Add detailed explanations to each section.
- Add comments to your code.
5. Test and refine
Test your notebook thoroughly to identify any issues. Refine your code and documentation as needed to ensure your notebook is accurate, efficient, and easy to use.
6. Share and collaborate
Share your notebook on [GitLab](https://git.xfel.eu/) to start seeking feedback and begin the reviewing process.
## Write notebook to execute using xfel-calibrate
To start developing a new notebook, you either create it in an existing detector directory or create a new directory for it with the new detector's name. Give it a suffix `_NBC` to denote that it is enabled for the tool chain.
You should then start writing your code following these [guidelines](how_to_write_xfel_calibrate_notebook_NBC.ipynb):
- The first markdown cell is for the title, author, and notebook description. This is automatically parsed in the report.
- The first code cell must have all parameters that will be exposed to the `xfel-calibrate` CLI.
- The second code cell is for importing all needed libraries and methods.
- The following code and markdown cells are for data ingestion, data processing, and data visualization. Markdown cells are very important, as they are parsed as the main source of report text and documentation after the calibration notebook is executed.
## Exposing parameters to `xfel-calibrate`
The European XFEL Offline Calibration toolkit automatically deduces
command line arguments from Jupyter notebooks. It does this with an
extended version of [nbparameterise][nbparameterise], originally written by Thomas
Kluyver.
Parameter deduction tries to parse all variables defined in the first
code cell of a notebook. The following variable types are supported:
* Numbers(INT or FLOAT)
* Booleans
* Strings
* Lists of the above
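As a rough illustration of what this deduction does, the upstream [nbparameterise][nbparameterise] API can be used to inspect a notebook's first code cell. This is only a sketch: pycalibration uses an extended version, and the notebook path below is a placeholder.

```python
# Sketch: inspect the parameters nbparameterise finds in a notebook's first code
# cell, then override one value the way the command line would. The notebook
# filename is a placeholder; the extended version used by pycalibration adds
# more behaviour on top of this.
import nbformat
from nbparameterise import extract_parameters, parameter_values, replace_definitions

nb = nbformat.read("some_correction_notebook_NBC.ipynb", as_version=4)

params = extract_parameters(nb)  # parses the first code cell
for p in params:
    print(p.name, type(p.value).__name__, p.value)

# Override a value (as the CLI would) and write the definitions back into the notebook.
new_params = parameter_values(params, run=[100])
nb = replace_definitions(nb, new_params)
```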
You should avoid having `import` statements in this cell. Line comments
can be used to define the help text provided by the command line interface, and to signify if lists can be constructed from ranges and if parameters are
required:
in_folder = "" # directory to read data from, required
out_folder = "" # directory to output to, required
metadata_folder = "" # directory containing calibration metadata file when run by xfel-calibrate
run = [820, ] # runs to use, required, range allowed
sequences = [0, 1, 2, 3, 4] # sequences files to use, range allowed
modules = [0] # modules to work on, required, range allowed
karabo_id = "MID_DET_AGIPD1M-1" # Detector karabo_id name
karabo_da = [""] # a list of data aggregators names, Default [-1] for selecting all data aggregators
skip_plots = False # exit after writing corrected files and metadata
The above are some examples of parameters from AGIPD correction notebook.
- Here, `in_folder` and `out_folder` are set as `required` string values.
Values for required parameters have to be given when executing from the command line.
This means that any defaults given in the first cell of the code are ignored
(they are only used to derive the type of the parameter).
- `modules` and `sequences` are lists of integers, which from the command line could also be assigned using a range expression,
e.g. `5-10,12,13,18-21`, which would translate to `5,6,7,8,9,12,13,18,19,20` (a sketch of how such an expression expands is shown after this list).
!!! Warning
[nbparameterise][nbparameterise] can only parse the mentioned subset of variable types. An expression that evaluates to such a type will not be recognized. e.g. `a = list(range(3))` will not work!
- `karabo_id` is a string value indicating the detector to be processed.
- `karabo_da` is a list of strings indicating the detector's modules to be processed. `karabo_da` and `modules`
are two different variables pointing to the same physical parameter; in later notebook cells both parameters are synced
before usage.
- `skip_plots` is a boolean for skipping the notebook plots to save time and deliver the report as soon as the data are processed.
To set `skip_plots` to False from the command line, `--no-skip-plots` is used.
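Below is a small sketch of how such a range expression could be expanded into the list the notebook parameter receives. It is illustrative only; the actual parsing is done by the xfel-calibrate machinery.

```python
# Sketch of expanding a command-line range expression such as "5-10,12,13,18-21"
# into a list of integers. Note that, as in the example above, "5-10" expands
# end-exclusively to 5, 6, 7, 8, 9.
def expand_ranges(expression):
    values = []
    for part in expression.split(","):
        if "-" in part:
            start, end = part.split("-")
            values.extend(range(int(start), int(end)))
        else:
            values.append(int(part))
    return values


print(expand_ranges("5-10,12,13,18-21"))  # [5, 6, 7, 8, 9, 12, 13, 18, 19, 20]
```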
The table below provides a set of recommended parameter names to ensure consistency across all notebooks.
| Parameter name | To be used for | Special purpose |
| ----------- | --------------------------------------------------------------------- |----------------------------|
| `in_folder` | the input path data resides in, usually without a run number. ||
| `out_folder` | path to write data out to, usually without a run number. | reports can be placed here |
| `metadata_folder` | directory path for calibration metadata file with local constants. ||
| `run(s)` | which XFEL DAQ runs to use, often ranges are allowed. ||
| `karabo_id` | detector karabo name to access detector files and constants. ||
| `karabo_da` | refers to detector's modules data aggregator names to process. ||
| `modules` | refers to the detector's modules indices to process, ranges often ok. ||
| `sequences` | sequence files for the XFEL DAQ system, ranges are often ok. ||
| `local_output` | write calibration constants to file, not to the database. ||
| `db_output` | write calibration constants to the database, not to file. | protects the database from unintentional constant injections during development or testing |
## External Libraries
You may use a wide variety of libraries available in Python, but keep in mind that others wanting to run the tool will need to install these requirements as well. Therefore:
- It is generally advisable to avoid using specialized tools or libraries unless there is a compelling reason to do so. Instead, it is often better to use well-established and widely-accepted alternatives that are more likely to be familiar to other developers and easier to install and use. For example, when creating visualizations, it is recommended to use the popular and widely-used library, [matplotlib][matplotlib] for charts, graphs and other visualisation. Similarly, [numpy][numpy] is widely used when performing numerical processing tasks.
- When developing software, it is important to keep in mind the runtime and library requirements for your application. In particular, if you are using a library that performs its own parallelism, you will need to ensure that it can either set up this parallelism programmatically or do so automatically. If you need to start your application from the command line, there may be additional challenges to consider.
- Reading out EXFEL RAW data is encouraged to be done using [extra_data][extra_data]. This tool is designed to facilitate efficient access to data structures stored in HDF5 format. By simplifying the process of accessing RAW or CORRECTED datasets, it allows users to quickly and easily select and filter the specific trains, cells, or pixels of interest. This can greatly reduce the complexity and time required for data analysis, and enables researchers to more effectively explore and analyze large datasets.
## Writing out data
If your notebook produces output data, consider writing data out as early as possible, such that it is available as soon as possible. Detailed plotting and inspection can be done later on in the notebook.
Also use HDF5 via [h5py][h5py] as your output format. If you correct or calibrate input data, which adheres to the XFEL naming convention, you should maintain the convention in your output data. You should not touch any data that you do not actively work on and should assure that the `INDEX` and identifier entries are synchronized with respect to your output data. E.g. if you remove pulses from a train, the `INDEX/.../count` section should reflect this. [`cal_tools.files`](../reference/src/cal_tools/files.md) module helps you achieve this easily.
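For example, if pulses are removed from a train, the per-train counts written to `INDEX/.../count` must be recomputed to match the data actually written. A minimal sketch of that bookkeeping is shown below; the dataset layout is an illustrative assumption, and `cal_tools.files` handles the real bookkeeping for you.

```python
# Minimal sketch: recompute per-train pulse counts after masking out pulses, so
# that INDEX/<source>/count stays consistent with the data actually written.
# Illustrative only; cal_tools.files handles the real bookkeeping.
import numpy as np


def filtered_counts(counts, keep_mask):
    """Per-train counts after dropping pulses flagged False in keep_mask."""
    counts = np.asarray(counts)
    keep_mask = np.asarray(keep_mask, dtype=bool)
    assert keep_mask.size == counts.sum(), "mask must cover every original pulse"
    # Split the flat pulse mask into per-train chunks and count the survivors.
    boundaries = np.cumsum(counts)[:-1]
    return np.array([chunk.sum() for chunk in np.split(keep_mask, boundaries)])


# Example: 3 trains with 2, 3 and 1 pulses; drop the second pulse of train 1.
print(filtered_counts([2, 3, 1], [True, False, True, True, True, True]))  # [1 3 1]
```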
## Plotting
When creating plots, make sure that the plot is either self-explanatory or add markdown comments with adequate description. Do not add "free-floating" plots, always put them into a context. Make sure to label your axes.
Also make sure the plots are readable on an A4-sized PDF page; this is the format the notebook will be rendered to for report outputs. Specifically, this means that figure sizes should not exceed approx 15x15 inches.
The report will contain 150 dpi PNG images of your plots. If you need higher quality output of individual plot files you should save these separately, e.g. via `fig.savefig(...)` yourself.
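A short sketch of keeping a figure within the recommended bounds and saving a higher-quality copy separately; the plotted data here is arbitrary.

```python
# Sketch: keep figure sizes A4-friendly for the rendered report and save a
# higher-resolution copy yourself if needed. The plotted data is arbitrary.
import matplotlib.pyplot as plt
import numpy as np

fig, ax = plt.subplots(figsize=(10, 6))  # well below the ~15x15 inch limit
x = np.linspace(0, 10, 100)
ax.plot(x, np.sin(x))
ax.set_xlabel("x")                       # always label the axes
ax.set_ylabel("signal [arb. u.]")
ax.set_title("Example plot for the report")

# The report itself uses 150 dpi PNGs; save a higher-quality copy separately.
fig.savefig("example_plot_hires.png", dpi=300)
plt.show()
```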
## xfel-calibrate execution
The package utilizes tools such as [nbconvert](https://github.com/jupyter/nbconvert) and
[nbparameterise][nbparameterise] to expose [Jupyter](http://jupyter.org/) notebooks to a command line interface. In the process reports are generated from these notebooks.
The general interface is:

    % xfel-calibrate DETECTOR TYPE

where `DETECTOR` and `TYPE` specify the task to be performed.
Additionally, it leverages the DESY/XFEL Maxwell cluster to run these jobs in parallel via [SLURM][slurm].
Here is a list of [available_notebooks](../operation/available_notebooks.md).
## Interaction with the calibration database
During development, it is advised to work with local constant files first, before injecting any calibration constants into the production database. Once the notebook's algorithms and arguments have matured, one can switch over to the test database and then to the production database.
The reason for this is to avoid injecting wrong constants that can affect production calibration, and to avoid unnecessary intervention to disable wrong or unused injected calibration constants.
Additionally, the [calibration database](../operation/calibration_database.md) is limited to XFEL networks, so independent development improves the workflow.
## Testing
The most important test is that your notebook completes flawlessly outside any special tool chain feature. After all, the tool chain will only replace parameters, then launch a concurrent job and generate a report out of the notebook. If it fails to run in the normal Jupyter notebook environment, it will certainly fail in the tool chain environment.
Once you are satisfied with your current state of initial development, you can add it to the list of notebooks as mentioned in the [configuration](configuration.md#notebooks) section.
Any changes you now make in the notebook will be automatically propagated to the command line.
Specifically, you should verify that all arguments are parsed correctly, e.g. by calling:
```bash
xfel-calibrate DETECTOR NOTEBOOK_TYPE --help
```
From then on, check whether parallel [SLURM][slurm] jobs are executed correctly and whether a report is generated at the end.
Finally, you should verify that the report contains the information you'd like to convey and is intelligible to people other than you.
???+ note
You can run the `xfel-calibrate` command without starting a [SLURM][slurm] cluster job, giving you direct access to console output, by adding the `--no-cluster-job` option.
## Documenting
Most documentation should be done in the notebook itself. Any notebooks specified in the `notebooks.py` file will automatically show up in the [Available Notebooks](../operation/available_notebooks.md) section of this documentation.
[nbparameterise]: https://github.com/takluyver/nbparameterise
[ipcluster]: https://ipyparallel.readthedocs.io/en/latest/
[matplotlib]: https://matplotlib.org/
[numpy]: http://www.numpy.org/
[h5py]: https://www.h5py.org/
[iCalibrationDB]: https://git.xfel.eu/detectors/cal_db_interactive
[extra_data]: https://extra-data.readthedocs.io/en/latest/
[extra-geom]: https://extra-geom.readthedocs.io/en/latest/
[pasha]: https://github.com/European-XFEL/pasha
[slurm]: https://slurm.schedmd.com/documentation.html
[sphinx]: https://www.sphinx-doc.org/en/master/
\ No newline at end of file
# https://mkdocstrings.github.io/recipes/
from pathlib import Path
import mkdocs_gen_files
src = Path(__file__).parent.parent / "src"
nav = mkdocs_gen_files.Nav()
for path in sorted(src.rglob("*.py")):
    module_path = path.relative_to(src).with_suffix("")
    doc_path = path.relative_to(src).with_suffix(".md")
    full_doc_path = Path("reference", doc_path)
    parts = list(module_path.parts)
    # Skip modules that should not appear in the generated API reference.
    if parts[-1] in [
        "notebooks", "settings",
        "restful_config", "__main__",
        "__init__", "listen_kafka",
        "sqlite_view", "manual_launch",
        "messages",
    ]:
        continue
    nav[parts] = doc_path.as_posix()
    # Write a stub page containing a mkdocstrings directive for the module.
    with mkdocs_gen_files.open(full_doc_path, "w") as fd:
        ident = ".".join(parts)
        fd.write(f"::: {ident}")
    mkdocs_gen_files.set_edit_path(full_doc_path, path)

# Write the navigation file consumed by the literate-nav plugin.
with mkdocs_gen_files.open("reference/SUMMARY.md", "w") as nav_file:
    nav_file.writelines(nav.build_literate_nav())
*[PDU]: Physical Detector Unit is the name given for the hardware module.
*[CC]: Calibration Constant
*[CCV]: Calibration Constant Version
*[CLI]: Command Line Interface
*[myMDC]: Metadata Catalog
*[CALCAT]: Calibration Catalog is the web interface for the calibration database.
\ No newline at end of file
European XFEL Offline Calibration
=================================
The European XFEL Offline Calibration (pyCalibration) is a Python package that consists of different services responsible for applying most of the offline calibration and characterization for the detectors.
- [Overview](overview.md) Offline calibration overview
- [Installation](development/installation.md) How to install pycalibration