Launch jobs with variable nice value
Description
Based on the discussion in https://git.xfel.eu/calibration/planning/-/issues/110, this MR introduces a flexible --nice
value passed to sbatch
depending on the number of jobs running in the selected partition for this instrument. Originally only meant for active proposals, the code is cleaner when just using the partition
value determined before and is currently changed by ITDM's script anyway shortly after submission. I will clarify this point before merging.
How Has This Been Tested?
Tests! Unit tests!
Types of changes
- New feature (non-breaking change which adds functionality)
Reviewers
Merge request reports
Activity
changed milestone to %3.5.2
added Waiting for review label
- Resolved by Thomas Kluyver
I don't get it, those tests are trolling me. They work perfectly if running in my test environment, but
get_slurm_nice
seems to always return0
in the CI?!
- Resolved by Thomas Kluyver
Nice! I've been wary of trying to manage priorities because that seems like Slurm's job, but this looks pretty clean and testable.
At present, this mechanism is used to give dark processing priority over correction. Is it intentional to get rid of that, or should we include an adjustment for dark runs?
Is it possible that the fake subprocess doesn't catch the subprocess call in CI? Do you have the same version of Python (3.8.10) in your test environment?
This is pretty weird. I thought it could have been an issue caused by the containers in singularity, but I checked the pytest supbrocess code and it works by faking popen instead of messing with
PATH
and adding in shims, so it's really not clear to me what could cause this.I've got some semi-urgent stuff to work on right now for Thomas (M) so I'll try and finish that asap and check this out later today (as well as some pending MRs).
We figured out that another test was replacing the
run_proc_async
function, so the new tests failed if that test was run before them in the same session. I've fixed that by using pytest'smonkeypatch
fixture to replace it temporarily - when the test in question finishes, monkeypatch sets it back to the original value.With @kluyvert knowing how to properly use
--nice
, we finally learned it is directly substracted from the priority. I was going to rebalance the numbers and formula after this to ensure active proposals may not fall belowexfel
jobs, but it seems there partitions are assigned a PriorityTier not the same as a job's Priority.Looking at this value's documentation:
PriorityTier: Configure the partition's PriorityTier setting relative to other partitions to control the preemptive behavior when PreemptType=preempt/partition_prio. This option is not relevant if PreemptType=preempt/qos. If two jobs from two different partitions are allocated to the same resources, the job in the partition with the greater PriorityTier value will preempt the job in the partition with the lesser PriorityTier value. If the PriorityTier values of the two partitions are equal then no preemption will occur. The default PriorityTier value is 1.
NOTE: In addition to being used for partition based preemption, PriorityTier also has an effect on scheduling. The scheduler will evaluate jobs in the partition(s) with the highest PriorityTier before evaluating jobs in other partitions, regardless of which jobs have the highest Priority. The scheduler will consider the job priority when evaluating jobs within the partition(s) with the same PriorityTier.
The configuration on Maxwell right now is:
-
upex-high
: 1000 -
upex-middle
: 900 -
exfel
: 500
I would conclude that we can freely substract values from a job's priority through
--nice
without changing the priority order of partitions, i.e. we do not have to careful to substract too much from anupex-middle
job to fall belowexfel
.Do you agree with this conclusion?
-
686 penalty of 25 running jobs. 687 688 :param partition: Partition to run jobs in. 689 :param instrument: Instrument to run jobs for. 690 :param cycle: Cycle of proposal to run jobs for. 691 :param job_penalty: Scaling factor per job, 2 by default. 692 :param commissioning_penalty: Base penalty for commissioning, 693 1250 by default. 694 :return: Nice value to be passed to sbatch --nice 695 """ 696 697 # List all names for jobs running in the specified partition. 698 code, job_names = await run_proc_async( 699 ['squeue', '-h', '-o', '%.20j', '-p', partition, '--me']) 700 701 if code != 0: changed this line in version 9 of the diff
added 1 commit
- 77d6c5d0 - (fixup) Log error when squeue returns non-zero