[Webservice] Monitor Slurm jobs in separate process (!668) · Merge requests · calibration / pycalibration

Merged Thomas Kluyver requested to merge separate-job-monitor into master 2 years ago

Description

Now that we have a process supervisor for the webservice & serve_overview processes, it makes sense for the Slurm job monitor to be a separate process as well, rather than a thread in the webservice process (which was convenient when we launched it manually). The diff here is mostly just moving existing code into a separate file for clarity.

This means the job monitor's logs will be visible separately, and the supervisor can restart the job monitor if it fails. The overview server (http://max-exfl016.desy.de:8008/) will no longer show log messages from job monitoring, which it currently does. We could add them as a separate block if needed, or use caldeploy logs to look at them.
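For readers unfamiliar with the pattern, here is a minimal sketch of what a standalone job monitor process could look like. It is not the code in this MR: the names parse_squeue, query_slurm, run_monitor and main, the --interval option and the logger name are all hypothetical, and the only external call assumed is Slurm's squeue with its --noheader and --format options.

```python
import argparse
import logging
import subprocess
import time

# Hypothetical logger name; a separate process gets its own log stream.
log = logging.getLogger("job_monitor")


def parse_squeue(text):
    """Turn lines of 'JOBID STATE' into a {job_id: state} dict."""
    states = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:  # skip blank or malformed lines
            states[parts[0]] = parts[1]
    return states


def query_slurm():
    """Ask Slurm for job states. Requires squeue on PATH."""
    res = subprocess.run(
        ["squeue", "--noheader", "--format=%i %T"],
        stdout=subprocess.PIPE, text=True, check=True,
    )
    return parse_squeue(res.stdout)


def run_monitor(query=query_slurm, interval=30.0, max_polls=None):
    """Poll repeatedly; log a failed poll instead of exiting on it.

    max_polls=None means run until the process is stopped.
    Returns the number of polls performed.
    """
    polls = 0
    while max_polls is None or polls < max_polls:
        try:
            states = query()
            log.info("Tracking %d Slurm jobs", len(states))
        except Exception:
            log.error("Slurm query failed", exc_info=True)
        polls += 1
        if max_polls is None or polls < max_polls:
            time.sleep(interval)
    return polls


def main(argv=None):
    """Entry point that a process supervisor would launch."""
    ap = argparse.ArgumentParser(description="Monitor Slurm jobs (sketch)")
    ap.add_argument("--interval", type=float, default=30.0)
    args = ap.parse_args(argv)
    logging.basicConfig(
        level=logging.INFO,
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
    run_monitor(interval=args.interval)
```

A supervisor would start this by calling main() in its own process, so an unhandled crash only takes down the monitor and the supervisor can restart it without touching the webservice. Passing the query function as a parameter is only there to make the loop easy to exercise without a Slurm cluster.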

A related change will be needed in the deployment tools.

How Has This Been Tested?

Run on max-exfl017, see comment below.

Types of changes

Refactor (refactoring code with no functionality changes)

Checklist:

My code follows the code style of this project.

Reviewers

@schmidtp @roscar

Edited 2 years ago by Thomas Kluyver

Activity

