Stop execution on notebook errors

We can test this patch neatly on Wednesday, but I'm wondering whether it makes sense to have it as a toggleable option in the configuration. We don't have a proper way to set global defaults right now, but an option might still let us change the behaviour quickly during operations. Alternatively, we could simply revert this change in such cases.

I agree that it's probably useful to be able to undo the change quickly if it causes a problem. Three options in order of increasing technical complexity:

1. Just revert the change as a hotfix if needed: add --on-error-resume-next back to the script and deploy pycalibration again.
2. Add an option like xfel-calibrate --ignore-nb-errors, which could be added to webservice config to go back globally. Change the YAML file, restart the webservice.
3. Allow controlling it per instrument/proposal/detector via calibration_configurations. Use update_config.py to flip the switch as needed.
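
For illustration, option 2 might look roughly like the sketch below. It is not pycalibration's actual code: the --ignore-nb-errors flag is only proposed in this thread, the `build_princess_cmd` helper is hypothetical, and the bare `princess <notebook>` invocation is an assumption. Only the --on-error-resume-next flag is taken from the discussion.

```python
import argparse


def make_parser():
    # Hypothetical subset of the xfel-calibrate command line.
    parser = argparse.ArgumentParser(prog="xfel-calibrate")
    parser.add_argument(
        "--ignore-nb-errors",
        action="store_true",
        help="Keep executing a notebook after a cell raises an error",
    )
    return parser


def build_princess_cmd(notebook_path, ignore_nb_errors=False):
    """Assemble the command that runs one notebook (hypothetical helper)."""
    # Assumed invocation; the real script may pass further arguments.
    cmd = ["princess", notebook_path]
    if ignore_nb_errors:
        # Restore the old behaviour: carry on after a failing cell.
        cmd.append("--on-error-resume-next")
    return cmd


if __name__ == "__main__":
    args = make_parser().parse_args()
    print(build_princess_cmd("notebook.ipynb", args.ignore_nb_errors))
```

With this shape, stopping on errors is the default and the old behaviour has to be requested explicitly, which is what would let a single config change switch it back globally.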

I'm leaning towards 1. I think the main reason to add configuration options would be as a signal that we're serious about not breaking things - but so long as we can fix any issues quickly, I'm not sure anyone cares about how we do it. And if we add a config option, there's a risk that it becomes basically permanent.

So currently with this MR, a PDF will still be produced (without the error message), along with the Slurm output files (error message and logs in slurm***.out). How are you planning to address this later, as you mentioned?
My vote is for (1):

Just revert the change as a hotfix if needed: add --on-error-resume-next back to the script and deploy pycalibration again.

in case a lot of unexpected errors occur. We should aim to fix the errors before reverting this.

You're right, the notebook which errors, and therefore that section of the PDF report, won't contain any output. I thought it would be saved with the output up to that point, but I had remembered that wrong. Maybe we need to fix that in princess first.

https://github.com/European-XFEL/princess/pull/1

added 1 commit

900df369 - Use princess 0.3 to save notebook after error


I have merged the change in princess, and updated the version we use, so the notebooks should still be saved after an error, and therefore the output should appear in the PDF report.
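
The behaviour described here (stop at the first failing cell, but still save what ran) can be illustrated with a toy executor. This is not how princess is implemented: `run_cells` and the JSON output format are made up for illustration, and the "cells" are plain Python strings rather than a real notebook.

```python
import json
import traceback


def run_cells(cells, out_path):
    """Run code strings in order, stop at the first error, always save results."""
    namespace = {}
    results = []
    ok = True
    try:
        for source in cells:
            record = {"source": source, "status": "ok"}
            results.append(record)
            try:
                exec(source, namespace)
            except Exception:
                # Keep the traceback so it appears in the saved output.
                record["status"] = "error"
                record["traceback"] = traceback.format_exc()
                ok = False
                break  # stop execution: later cells are not run
    finally:
        # Save whatever was executed, even if a cell failed.
        with open(out_path, "w") as f:
            json.dump(results, f, indent=2)
    return ok
```

The point of the `finally` block is that the saved file contains the failing cell and its traceback, so a report built from it can show where execution stopped.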

added 36 commits

900df369...7acf1828 - 34 commits from branch master
03046a38 - Stop execution on notebook errors
b4049d62 - Use princess 0.3 to save notebook after error


Thanks, LGTM.

So what is the plan now: which option do we go with? It would be good to have this merged before the next deployment so we can test it along with the other MRs in development.

I think we're OK to just revert the change if needed - no-one seems to have really argued that that's insufficient. But I'd like at least @schmidtp to OK that before I merge it.

I like it, let's do it.

merged

mentioned in commit 06f3c810

mentioned in merge request !525 (merged)

changed milestone to %3.4.1
