Webservice: update Slurm jobs status in a separate thread
Description
The task to periodically query Slurm about calibration jobs and update myMdC with the results makes several blocking calls to metadata_client and sqlite3. Doing this in an asyncio task means either blocking the event loop (e.g. if myMdC is slow to respond), or wrapping all these calls in asyncio's thread executor.
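For reviewers less familiar with the trade-off, here is a minimal sketch of the two options described above. It is not code from this PR: slow_blocking_call is a hypothetical stand-in for a blocking metadata_client or sqlite3 call, and heartbeat stands in for any other task sharing the event loop.

```python
import asyncio
import contextlib
import time


def slow_blocking_call(delay=0.2):
    """Hypothetical stand-in for a blocking metadata_client or sqlite3 call."""
    time.sleep(delay)
    return "done"


async def heartbeat(ticks, interval=0.02):
    """Another task on the same loop; it only ticks while the loop is free."""
    while True:
        ticks.append(time.monotonic())
        await asyncio.sleep(interval)


async def run(use_executor):
    ticks = []
    hb = asyncio.create_task(heartbeat(ticks))
    await asyncio.sleep(0)  # let the heartbeat record its first tick

    if use_executor:
        # Option 2: hand the blocking call to the default thread pool.
        loop = asyncio.get_running_loop()
        result = await loop.run_in_executor(None, slow_blocking_call)
    else:
        # Option 1: call it directly, which stalls every other task.
        result = slow_blocking_call()

    hb.cancel()
    with contextlib.suppress(asyncio.CancelledError):
        await hb
    return result, len(ticks)
```

Calling `asyncio.run(run(False))` should report far fewer heartbeat ticks than `asyncio.run(run(True))`, because the direct call holds the event loop for its whole duration. Wrapping every such call this way works, but it adds noise at each call site, which is part of the motivation for this change.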
The Slurm monitoring (update_job_db) and the ZMQ server to accept requests from myMdC (ActionsServer) are largely independent. The only thing both interact with is an sqlite database. So the simple option is to run them in separate threads. update_job_db runs as a single logical task, so I turned it back into plain non-async code. ActionsServer still uses asyncio, to allow multiple tasks waiting for runs to be transferred and sending status updates to myMdC.
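The overall shape of this arrangement can be sketched roughly as follows. This is an illustration under assumptions, not the code in this PR: monitor_jobs, serve_requests, run_demo and the jobs table are hypothetical names, the ZMQ server is replaced by a plain coroutine, and the Slurm query is replaced by a single UPDATE statement. Each thread opens its own sqlite3 connection, since connections are not shared across threads by default.

```python
import asyncio
import os
import sqlite3
import tempfile
import threading


def monitor_jobs(db_path, stop, interval=0.05):
    """Plain blocking loop, standing in for the Slurm monitoring thread."""
    conn = sqlite3.connect(db_path)  # this thread's own connection
    try:
        while True:
            stopping = stop.is_set()
            with conn:  # commits on success
                conn.execute(
                    "UPDATE jobs SET status = 'COMPLETED' WHERE status = 'PENDING'"
                )
            if stopping:
                break  # one final pass has run after the stop request
            stop.wait(interval)
    finally:
        conn.close()


async def serve_requests(db_path, n_jobs):
    """Stands in for the asyncio server: records newly requested jobs."""
    conn = sqlite3.connect(db_path)
    try:
        for job_id in range(n_jobs):
            with conn:
                conn.execute("INSERT INTO jobs VALUES (?, 'PENDING')", (job_id,))
            await asyncio.sleep(0.01)
    finally:
        conn.close()


def run_demo(n_jobs=3):
    fd, db_path = tempfile.mkstemp(suffix=".sqlite")
    os.close(fd)
    try:
        conn = sqlite3.connect(db_path)
        with conn:
            conn.execute("CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, status TEXT)")
        conn.close()

        stop = threading.Event()
        monitor = threading.Thread(target=monitor_jobs, args=(db_path, stop))
        monitor.start()
        try:
            asyncio.run(serve_requests(db_path, n_jobs))  # asyncio side, main thread
        finally:
            stop.set()  # launch and stop the two components together
            monitor.join()

        conn = sqlite3.connect(db_path)
        statuses = [r[0] for r in conn.execute("SELECT status FROM jobs ORDER BY job_id")]
        conn.close()
        return statuses
    finally:
        os.remove(db_path)
```

In this sketch the database file is the only shared state, so no locks or queues are needed between the two sides beyond the stop event used for shutdown.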
The two components could even be separate processes. It seemed simpler to launch & kill them together, but it would be a simple change if we ever wanted to manage them separately.
How Has This Been Tested?
Ran on max-exfl017, submitted a correction job and a dark processing one, watched logs & myMdC as they completed.
I've also documented this testing procedure in the README in this folder.
Types of changes
- Refactoring
- Documentation
Checklist:
- My code follows the code style of this project.