Relates to https://in.xfel.eu/redmine/issues/76705
The main change is splitting the part that periodically checks the status of Slurm jobs into a separate process, rather than a thread in the webservice. The two were bundled together to make it convenient to launch & kill them together, but once we're using a service manager, it's more useful for each component to have its own status (e.g. either can be automatically restarted on crash). We had already found that these parts are only connected by an SQLite database (and Slurm's own database), so no extra work is needed to run them in separate processes.
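As a rough illustration of why the split needs no extra plumbing, here is a minimal sketch of a standalone monitor loop that only talks to the shared SQLite database. The table and column names (`slurm_jobs`, `job_id`, `state`, `finished`) and the `get_state` callable are hypothetical, not the actual schema or code in this MR; in practice `get_state` would ask Slurm for the job's state.

```python
import sqlite3
import time
from typing import Callable

# Slurm job states treated as terminal in this sketch.
FINISHED_STATES = {"COMPLETED", "FAILED", "CANCELLED", "TIMEOUT"}


def update_job_states(conn: sqlite3.Connection,
                      get_state: Callable[[int], str]) -> int:
    """Refresh every unfinished job once; return how many were checked.

    The slurm_jobs table and its columns are hypothetical.
    """
    rows = conn.execute(
        "SELECT job_id FROM slurm_jobs WHERE finished = 0"
    ).fetchall()
    for (job_id,) in rows:
        state = get_state(job_id)
        conn.execute(
            "UPDATE slurm_jobs SET state = ?, finished = ? WHERE job_id = ?",
            (state, int(state in FINISHED_STATES), job_id),
        )
    conn.commit()
    return len(rows)


def run_monitor(db_path: str, get_state: Callable[[int], str],
                interval: float = 30.0) -> None:
    """Entry point for the separate process: poll forever."""
    conn = sqlite3.connect(db_path)
    while True:
        update_job_states(conn, get_state)
        time.sleep(interval)
```

Because the loop holds no references into the webservice, a service manager can start, stop, or restart it independently.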
I split the code into two files as well, to make navigation easier.
I also made logging go to stderr by default, rather than a file. Within systemd, this means that logging will go to the journal, to be available through journalctl. We could also remove the timestamps from the log formatting, as systemd adds timestamps externally, but I haven't done this yet.
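For reference, a minimal sketch of what stderr logging with optional timestamps could look like using the standard `logging` module. The function names and the format string are illustrative assumptions, not the ones used in this MR.

```python
import logging
import sys


def make_stderr_handler(include_timestamps: bool = True) -> logging.Handler:
    """Build a handler that writes to stderr (hypothetical helper).

    With include_timestamps=False the timestamp is left out, on the
    assumption that the systemd journal records its own.
    """
    fmt = "%(levelname)s %(name)s: %(message)s"
    if include_timestamps:
        fmt = "%(asctime)s " + fmt
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(logging.Formatter(fmt))
    return handler


def configure_logging(include_timestamps: bool = True,
                      level: int = logging.INFO) -> None:
    """Attach the stderr handler to the root logger."""
    root = logging.getLogger()
    root.setLevel(level)
    root.addHandler(make_stderr_handler(include_timestamps))
```

Calling `configure_logging(include_timestamps=False)` at startup would give the timestamp-free variant discussed above.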
TBD