Commit 30a2ff36 authored by Steffen Hauf

More docs

parent 090f28ee
1 merge request: !5 Clean
@@ -96,7 +96,7 @@ which would translate to `5,6,7,8,9,12,13,18,19,20`. It is also a required param
The parameter `local_output` is a Boolean.
The `cluster_profile` parameter is a bit special, in that the tool kit expects exactly this
-name to provide the profile name for an `ipcluster_` being run. Hence, if you use `ipcluster`
+name to provide the profile name for an ipcluster_ being run. Hence, if you use `ipcluster`
for parallelisation, define your profile name in this variable.
The excerpt above is from a flat field characterization notebook for AGIPD. The code would lead
@@ -182,7 +182,7 @@ You may use a wide variety of libraries available in Python, but keep in mind th
wanting to run the tool will need to install these requirements as well. Thus,
* do not use a specialized tool if an accepted alternative exists. Plots e.g. should usually
-be created using `matplotlib_` and numerical processing should be done in `numpy_`.
+be created using matplotlib_ and numerical processing should be done in numpy_.
* keep runtimes and library requirements in mind. A library doing its own parallelism either
needs to be able to set this up programmatically, or do so automatically. If you need to
@@ -196,7 +196,7 @@ If your notebook produces output data, consider writing data out as early as pos
such that it is available as soon as possible. Detailed plotting and inspection can
possibly be done later on in a notebook.
-Also consider using HDF5 via `h5py_` as your output format. If you correct or calibrate
+Also consider using HDF5 via h5py_ as your output format. If you correct or calibrate
input data which adheres to the XFEL naming convention, you should maintain the convention
in your output data. You should not touch any data that you do not actively work on and
should ensure that the `INDEX` and identifier entries are synchronized with respect to
@@ -241,8 +241,77 @@ The report will contain 150 dpi png images of your plots. If you need higher qua
of individual plot files you should save these separately, e.g. via `fig.savefig(...)` yourself.
Calibration Database Interaction
--------------------------------
Tasks which require or produce calibration constants should retrieve or store them by
interacting with the European XFEL calibration database.
In terms of development workflow it is usually easier to work with file-based I/O first and
only switch over to the database after the algorithmic part of the notebook has matured.
Reasons for this include:

* when developing against the database, new constants have to be integrated into it first
* if the parameters a constant depends on change a lot during early development, these
  updates always have to be propagated to the database manually
* database access is limited to the XFEL networks, making offline development more difficult.

Once a stable point is reached, database access can be enabled according to the iCalibrationDB_
documentation.
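The file-first workflow described above could be sketched as follows. This is only an illustration: `load_constant`, `save_constant`, the `source` switch and the JSON file layout are hypothetical names chosen for this example and are not part of the tool chain or of iCalibrationDB. The database branch is deliberately left as a placeholder.

```python
import json
from pathlib import Path


def save_constant(name, values, directory="."):
    """Write a constant to a local JSON file so later runs can reuse it."""
    path = Path(directory) / f"{name}.json"
    with path.open("w") as handle:
        json.dump(values, handle)
    return path


def load_constant(name, source="file", directory="."):
    """Return a calibration constant from a local file or, later, the database.

    All names here are hypothetical and only serve this sketch.
    """
    if source == "file":
        # File-based I/O: works offline and needs no database entries.
        path = Path(directory) / f"{name}.json"
        with path.open() as handle:
            return json.load(handle)
    if source == "db":
        # Placeholder: to be filled in according to the iCalibrationDB
        # documentation once the algorithmic part has matured.
        raise NotImplementedError("database access not wired up yet")
    raise ValueError(f"unknown constant source: {source!r}")
```

Keeping the switch in one function means the rest of the notebook does not change when database access is enabled later on.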
Providing Performance Statistics
--------------------------------
The final step in notebook development should be to inject performance parameters into the
InfluxDB_ installation tracking these. This can be done relatively easily via the interfaces
provided in the `cal_tools` subpackage::

    from datetime import datetime

    from cal_tools.cal_tools import get_notebook_name
    from cal_tools.influx import InfluxLogger

    # instrument, mem_cells, proposal, total_sequences and total_file_size
    # are expected to be defined earlier in the notebook
    logger = InfluxLogger(detector="LPD", instrument=instrument, mem_cells=mem_cells,
                          notebook=get_notebook_name(), proposal=proposal)

    start = datetime.now()
    # ... do something that takes time
    duration = (datetime.now()-start).total_seconds()

    logger.runtime_summary_entry(success=True, runtime=duration,
                                 total_sequences=total_sequences,
                                 filesize=total_file_size)

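If several steps are to be timed, the start/stop bookkeeping above can be wrapped in a small helper. The following is a minimal sketch using only the standard library. `timed_step` is a hypothetical name and not part of `cal_tools`; it assumes the reporting callable accepts `success` and `runtime` keyword arguments, as `runtime_summary_entry` does in the excerpt above.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_step(report):
    """Time the enclosed block and pass the outcome to `report`.

    `report` is any callable accepting `success` and `runtime` keywords
    (hypothetical helper, not part of cal_tools).
    """
    start = time.perf_counter()
    success = False
    try:
        yield
        success = True
    finally:
        # Runs on success and on failure, so failed runs are recorded too;
        # an exception raised in the block still propagates afterwards.
        report(success=success, runtime=time.perf_counter() - start)
```

A possible use would be `with timed_step(report): ...`, where `report` could for instance be built with `functools.partial(logger.runtime_summary_entry, total_sequences=total_sequences, filesize=total_file_size)`.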
Testing
-------
The most important test is that your notebook completes flawlessly outside any special
tool chain feature. After all, the tool chain will only replace parameters, then
launch concurrent jobs and generate a report out of the notebook. If it fails to run in the
normal Jupyter notebook environment, it will certainly fail in the tool chain environment.
Once you are satisfied with your current state of initial development, you can add it
to the list of notebooks as mentioned in the configuration_ section.
Any changes you now make in the notebook will be automatically propagated to the command line.
Specifically, you should verify that all arguments are parsed correctly, e.g. by calling::

    python calibrate_nbc.py DETECTOR NOTEBOOK_TYPE --help

Further checks include whether parallel Slurm jobs are executed correctly and whether a report
is generated at the end.
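The `--help` check can also be scripted, e.g. to repeat it for several notebooks. The sketch below assumes the command exits with status 0 after printing its help text, as argparse-based scripts do; whether `calibrate_nbc.py` behaves this way is an assumption. `help_runs_cleanly` is a hypothetical helper name.

```python
import subprocess


def help_runs_cleanly(command):
    """Run `command` with --help appended; return (ok, help_text)."""
    result = subprocess.run(
        list(command) + ["--help"],
        capture_output=True,
        text=True,
    )
    # A zero exit status indicates the arguments could be set up and printed.
    return result.returncode == 0, result.stdout
```

It could be called as `help_runs_cleanly([sys.executable, "calibrate_nbc.py", "DETECTOR", "NOTEBOOK_TYPE"])`, with `DETECTOR` and `NOTEBOOK_TYPE` being placeholders as in the command above. The returned help text can then be inspected for the expected parameter names.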
Finally, you should verify that the report contains the information you'd like to convey and
is intelligible to people other than you.
.. _nbparameterise: https://github.com/takluyver/nbparameterise
.. _ipcluster: https://ipyparallel.readthedocs.io/en/latest/
.. _matplotlib: https://matplotlib.org/
.. _numpy: http://www.numpy.org/
-.. _h5py: https://www.h5py.org/
\ No newline at end of file
+.. _h5py: https://www.h5py.org/
+.. _iCalibrationDB: https://in.xfel.eu/readthedocs/docs/icalibrationdb/en/latest/
+.. _InfluxDB: https://www.influxdata.com/
\ No newline at end of file