
Add --slurm-partition CLI parameter to xfel-calibrate

Merged Thomas Kluyver requested to merge config-slurm-partition into master
2 unresolved threads

Description

We want to control the Slurm partition used per proposal and per action (correct vs dark). Making this a command-line option for xfel-calibrate should allow it to be set in proposal YAML config files with an entry like slurm-partition: upex-high.
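
For illustration, a hedged sketch of what such an entry could look like in a proposal YAML file (the per-action grouping shown here is an assumption for this example; only the slurm-partition key and the upex-high value come from this MR):

    # Hypothetical proposal YAML snippet - grouping by action is illustrative only
    correct:
      slurm-partition: upex-high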

How Has This Been Tested?

I have tested launching xfel-calibrate with --slurm-partition upex-high. As expected, it passes --partition upex-high to sbatch, and then fails because I don't have permission to use that partition. I also checked that without supplying --slurm-partition, it still submits to the exfel partition as before.
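
A hedged sketch of that test (the detector and action arguments are placeholders; only the --slurm-partition flag and the resulting sbatch option are taken from this MR):

    # Invocation sketch - positional arguments are placeholders
    xfel-calibrate AGIPD CORRECT ... --slurm-partition upex-high

    # Expected effect on the generated submission command:
    #   sbatch ... --partition upex-high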

Types of changes

  • New feature (non-breaking change which adds functionality)

Checklist:

  • My code follows the code style of this project.

Reviewers

@schmidtp @ahmedk

Edited by Thomas Kluyver


Activity

     :return: List of commands and parameters to be used by subprocess
     """

-    launcher_slurm = launcher_command.format(temp_path=temp_path)
+    launcher_slurm = launcher_command.format(temp_path=temp_path).split()

-    # calculate number of general nodes available
-    free = int(check_output(free_nodes_cmd, shell=True).decode('utf8'))
-    preempt = int(check_output(preempt_nodes_cmd, shell=True).decode('utf8'))
-    ureservation = args['reservation']
-    priority = args['priority']
-
-    if (ureservation == "" and
-            (free + preempt >= max_reserved or
-             priority > 1 or
-             reservation == "")):
  • I have tried to preserve the behaviour in all cases where the new --slurm-partition option is not used, but I've rearranged the logic in this if to a form that I find easier to follow (see the sketch after this thread).

  • It took me some time to understand this logic and compare it to the simpler one you introduced instead. Nice update.
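
    For anyone skimming the thread, here is a minimal sketch of the kind of precedence the rearranged check expresses. It reuses the names visible in the diff above (partition being the new --slurm-partition value; ureservation, reservation, priority, free, preempt and max_reserved come from the existing code); the function itself and its exact control flow are an assumption, not the MR's actual code:

        # Hedged sketch only - not copied from the MR.
        def choose_slurm_target(partition, ureservation, reservation,
                                priority, free, preempt, max_reserved):
            """Return the extra sbatch options for one job (illustrative)."""
            if ureservation:
                # An explicit --reservation (e.g. from proposal YAML) wins and
                # overrides any partition setting.
                return ["--reservation", ureservation]
            if partition:
                # A partition from the new --slurm-partition option is used
                # unconditionally - no free-node check.
                return ["--partition", partition]
            if reservation and priority <= 1 and (free + preempt) < max_reserved:
                # Only the fallback reservation from settings.py is subject to
                # the "how many nodes are free?" check.
                return ["--reservation", reservation]
            # Otherwise add nothing and let the default partition (exfel) apply.
            return []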

  • Thomas Kluyver changed the description

    • I've done a bit of interactive testing (details in the description).

      It's worth noting that when a partition or a reservation is specified in the proposal YAML, that will be used for all jobs of that type. There's some logic in the code for checking how many nodes are free and using that to decide when to use a reservation, but I'm assuming that the options configured in the proposal should be used unconditionally.

    • Definitely. The idea behind the new partition scheme is to take care of proper allocation on a higher level, so we can simply queue up and let SLURM handle it.

  • Great, thanks!

    Final question, sorry for not looking myself: this covers the partition and !467 (merged) covers the number of nodes - what about --reservation? Are we using a fixed one? Can we configure this in the YAML file as well?

  • The reservation can already be configured in the YAML file (reservation: x -> --reservation x command line option). This overrides any partition setting if it's used.

    A reservation can also be configured in the settings.py file in this package. This has lower priority than the command line arguments (and the equivalents in proposal YAML config). And this is also where the 'how many nodes are free?' check applies.
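
    A hedged illustration of that precedence in a proposal YAML file (the reservation name below is a placeholder, not a real one):

        # Hypothetical snippet: with both keys present, the job is submitted
        # with --reservation and the partition setting is ignored.
        slurm-partition: upex-high
        reservation: some_reservation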

  • Great, then we should be set with this configuration to realize whatever the users come up with. Except for being able to queue as a user, but that's something to tackle later.

    It's still not clear whether users really want a single job to take up to 96 nodes, or would rather run several jobs at the same time.

    I'll try to review today, but if anyone from @calibration can spare a moment, let's get this merged.

  • The code changes LGTM. Letting get_launcher_command handle a list the entire time instead of splitting at the end is definitely an improvement.
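
    A small sketch of that point (the command template and values below are placeholders; only the .split() change itself is visible in the diff):

        # Hedged sketch; the template and values are placeholders.
        launcher_command = "sbatch --requeue --output {temp_path}/slurm-%j.out"
        temp_path = "/tmp/xfel-calibrate"
        partition = "upex-high"

        # Before: carry the command around as one string and split it only at the end.
        launcher_slurm_str = launcher_command.format(temp_path=temp_path)

        # After: split once, then treat it as a list everywhere, so options such as
        # the new partition flag can be appended without any shell-quoting concerns.
        launcher_slurm = launcher_command.format(temp_path=temp_path).split()
        launcher_slurm += ["--partition", partition]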

  • Thanks all :-)

  • Thomas Kluyver mentioned in commit 02d1e4bb
