Within the framework of the [extra_data](https://extra-data.readthedocs.io/en/latest/) package, which the SCS ToolBox is built upon, the European XFEL data is organized in a hierarchical structure, in which a *source* (for instance, a motor, or the output of a digitizer) contains a few datasets, each accessed with a *key* (the actual position of the motor, the various channels of the digitizer). The ToolBox *mnemonics* are simple words that represent frequently used variables at the SCS instrument. Each mnemonic is associated with a dictionary containing the source, the key and the dimension names of the variable.
The mnemonics are stored in a dictionary, accessible as `toolbox_scs.mnemonics`. Let us read the content of the mnemonic `SCS_SA3`, which corresponds to the pulse energy of the SASE 3 pulses measured by the XGM in the SCS experiment hutch:
%% Cell type:code id: tags:
``` python
import toolbox_scs as tb
tb.mnemonics['SCS_XGM']
```
%% Output
({'source': 'SCS_BLU_XGM/XGM/DOOCS:output',
'key': 'data.intensityTD',
'dim': ['XGMbunchId']},)
%% Cell type:markdown id: tags:
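A mnemonic entry maps directly onto the source and key that `extra_data` uses to address a dataset. The minimal sketch below illustrates this with a hand-copied dictionary that mirrors the output above; it does not import the ToolBox, and `resolve` is a hypothetical helper, not part of `toolbox_scs`.

```python
# Hand-copied stand-in for toolbox_scs.mnemonics (assumption: each mnemonic
# maps to a tuple of dictionaries, as in the output shown above).
mnemonics = {
    "SCS_XGM": (
        {
            "source": "SCS_BLU_XGM/XGM/DOOCS:output",
            "key": "data.intensityTD",
            "dim": ["XGMbunchId"],
        },
    ),
}


def resolve(name, table=mnemonics):
    """Hypothetical helper: return (source, key, dim) of the first entry."""
    entry = table[name][0]
    return entry["source"], entry["key"], entry["dim"]


source, key, dim = resolve("SCS_XGM")
print(f"{source} -> {key}, dimensions {dim}")
```
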
The list of available mnemonics can vary from run to run, depending on which sources were recorded. The function `mnemonics_for_run` returns the mnemonics that correspond to actual data sources in a run. The input parameters can be the proposal and run numbers of the run or the run itself (`extra_data` `DataCollection`):
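Conceptually, this amounts to filtering the mnemonic table by the sources that were actually recorded. The sketch below shows that idea in plain Python; it is an assumption about the principle, not the ToolBox implementation. The table uses flat dictionaries for brevity, and `available_mnemonics`, `example_motor` and the source `HYPOTHETICAL/MOTOR/X` are invented for illustration.

```python
# Hypothetical mnemonic table; only SCS_XGM is taken from the text above,
# the second entry is invented to show a source that was not recorded.
mnemonics = {
    "SCS_XGM": {
        "source": "SCS_BLU_XGM/XGM/DOOCS:output",
        "key": "data.intensityTD",
        "dim": ["XGMbunchId"],
    },
    "example_motor": {
        "source": "HYPOTHETICAL/MOTOR/X",
        "key": "actualPosition.value",
        "dim": None,
    },
}

# Hypothetical set of sources recorded in one run.
recorded_sources = {"SCS_BLU_XGM/XGM/DOOCS:output"}


def available_mnemonics(table, sources):
    """Keep only the mnemonics whose source was recorded in the run."""
    return {
        name: entry
        for name, entry in table.items()
        if entry["source"] in sources
    }


run_mnemonics = available_mnemonics(mnemonics, recorded_sources)
print(sorted(run_mnemonics))
```
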
%% Cell type:code id: tags:
``` python
run, _ = tb.load(2212, 213)
# providing the proposal and run numbers
run_mnemonics = tb.mnemonics_for_run(2212, 213)
# alternative, providing the DataCollection as input argument
run_mnemonics = tb.mnemonics_for_run(run)
```
%% Cell type:markdown id: tags:
<div>
The mnemonics are by no means an exhaustive list of the contents of a run, but rather convenient shortcuts to the most frequently used data sources at SCS. Please refer to the [extra_data](https://extra-data.readthedocs.io/en/latest/) package to access the full list of data sources present in a run.
</div>
%% Cell type:markdown id: tags:
## The `load` function
%% Cell type:markdown id: tags:
The `load` function of the ToolBox loads the variables recorded in a run into memory. In its simplest form, the function takes a proposal number, a run number, and a list of mnemonics as the `fields` argument. The data associated with the mnemonics is loaded, and all variables are aligned by train Id and pulse Id.
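To illustrate what aligning by train Id means, here is a minimal sketch in plain Python. It assumes an inner join (only train Ids present in every variable are kept), which may differ from what `load` actually does; the helper `align_by_train_id`, the variable `example_motor` and all values are hypothetical.

```python
def align_by_train_id(variables):
    """Inner-join several {trainId: value} mappings on their shared train Ids."""
    common = set.intersection(*(set(values) for values in variables.values()))
    train_ids = sorted(common)
    aligned = {"trainId": train_ids}
    for name, values in variables.items():
        aligned[name] = [values[t] for t in train_ids]
    return aligned


# Hypothetical per-train values; train 1002 is missing from the second variable.
variables = {
    "nrj": {1001: 778.0, 1002: 778.5, 1003: 779.0},
    "example_motor": {1001: 0.1, 1003: 0.3},
}

aligned = align_by_train_id(variables)
print(aligned)
```

After alignment, every variable has one value per retained train Id, so the entries at a given position all belong to the same train.
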
The function returns an `extra_data` `DataCollection` (`run`) and an `xarray` `Dataset` (`data`, which is displayed here in a summarized form). The `DataCollection` is the key element of the `extra_data` package and is used in many functions of the ToolBox. It contains information on the run and enables data handling and loading (see the `extra_data` [documentation](https://extra-data.readthedocs.io/en/latest/) for details). The `Dataset` `data` is the main result of the loading operation. In it, we can find:
* Dimensions: `pulse_slot`, `trainId`, `sa3_pId`.
* Coordinates: `trainId` and `sa3_pId`, the train Id values and the SASE 3 pulse Id values.
* Data variables: the loaded data arrays. In this example, `nrj` is the monochromator energy, in eV, for each train. `MCP3peaks` is the data from one of the MCPs of the TIM detector. `SCS_SA3` is the pulse energy of the SASE 3 pulses measured by the XGM in the SCS hutch. The `bunchPatternTable` is loaded by default. It is an array of 2700 values per train (the maximum number of pulses at 4.5 MHz provided by the machine) and contains information on how the pulses are distributed among SASE 1, 2, 3, and the various lasers at European XFEL. The `sa3_pId` coordinates are extracted from this table.
* Attribute `runFolder`, the name of the folder that contains the raw files of the run. It can be accessed via: `data.attrs['runFolder']`.
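The extraction of pulse Ids from a bunch pattern row can be sketched as follows. This is an illustration only: the flag bit `SA3_FLAG` is hypothetical (the real encoding of the bunch pattern table is not described here), and `pulse_ids` is an invented helper. The sketch also checks the arithmetic that 2700 slots at 4.5 MHz span 600 µs.

```python
MAX_PULSES_PER_TRAIN = 2700     # from the text: table length per train
MAX_REPETITION_RATE_HZ = 4.5e6  # from the text: 4.5 MHz

# Hypothetical flag bit marking a SASE 3 pulse; the real encoding of the
# bunch pattern table is not described here.
SA3_FLAG = 0b1


def pulse_ids(pattern_row, flag):
    """Return the slot indices of one train whose entry has `flag` set."""
    return [i for i, value in enumerate(pattern_row) if value & flag]


# One synthetic train: pulses in slots 0, 8, 16 and 24, all other slots empty.
row = [0] * MAX_PULSES_PER_TRAIN
for slot in (0, 8, 16, 24):
    row[slot] = SA3_FLAG

sa3_pId = pulse_ids(row, SA3_FLAG)
train_window_us = MAX_PULSES_PER_TRAIN / MAX_REPETITION_RATE_HZ * 1e6
print(sa3_pId, len(sa3_pId), train_window_us)
```
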
%% Cell type:markdown id: tags:
The (maximum) number of pulses per train is given by `data.sa3_pId.size`.
%% Cell type:markdown id: tags:
## Accessing the raw arrays
%% Cell type:markdown id: tags:
The function `load`, by default, loads the raw arrays using the `get_array` function of `extra_data`, and extracts only the relevant data from them, according to the bunch pattern table. It may be required, in some cases, to access the raw array of a specific mnemonic. For this, we can use the `DataCollection` returned earlier by the call to `load`:
The `raw_traces` `DataArray` contains the digitizer raw traces generated by the MCP 2 of the TIM detector. The array has dimensions `trainId` and `samplesId` (the latter given by `tb.mnemonics['MCP2raw']['dim']`). Quick visual inspection of the trace of the first train can be performed using the built-in plotting function of `xarray`: