Skip to content
Snippets Groups Projects

Feat/user notebooks

Merged Robert Rosca requested to merge feat/user-notebooks into master

Description

Adds in support for notebooks in user-defined spaces. Idea is that you can, in xfel_calibrate/notebooks.py add in a configuration like:

{
    ...
    "REMI": {
        "CORRECT": {
            "notebook": None,
            "user": {
                "notebook": "/gpfs/exfel/exp/{instrument}/{cycle}/p{proposal}/usr/calibration/notebooks/correct.ipynb",
                "venv": "/gpfs/exfel/sw/software/exfel_environments/sqs-remi-preview"
            },
            "concurrency": {
                "parameter": None,
                "use function": None,
                "default concurrency": None,
                "cluster cores": 1
            },
        },
    },
}

If ["notebook"] is set to None then user_notebook gets parsed and templated by CLI arguments, then the "notebook" key is set to this string and the rest continues as normal.

How Has This Been Tested?

  • Add in tests
  • Deploy and test with REMI CALLAB data on max-exfl17 with mymdc test instance

Manual test setup involved adding a notebook to /gpfs/exfel/exp/CALLAB/202130/p900203/usr/calibration/notebooks/correct.ipynb (this matches the REMI user notebook path pattern), then running xfel-calibrate REMI CORRECT --instrument CALLAB --cycle 202130 --proposal 900203.

This will then execute the test notebook.

The test notebook does... nothing really, it just creates an empty file called touch in the in_folder directory. If the SLURM job finishes without errors, the PDF renders correctly, and the touch file is present, then the user notebook with a user venv executed correctly.

Relevant Documents (optional)

Part of SQS Moonshot, see:

Types of changes

  • New feature (non-breaking change which adds functionality)
  • Test (additional or refactored tests)

Checklist:

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I added tests where appropriate.

Reviewers

@calibration

Notes:

Edited by Robert Rosca

Merge request reports

Checking pipeline status.

Merged by Robert RoscaRobert Rosca 3 years ago (May 7, 2021 8:33am UTC)

Loading

Activity

Filter activity
  • Approvals
  • Assignees & reviewers
  • Comments (from bots)
  • Comments (from users)
  • Commits & branches
  • Edits
  • Labels
  • Lock status
  • Mentions
  • Merge request status
  • Tracking
  • Robert Rosca changed target branch from master to test/add-xfel-calibrate-cli-tests

    changed target branch from master to test/add-xfel-calibrate-cli-tests

  • Robert Rosca changed the description

    changed the description

  • Robert Rosca changed milestone to %SQS Moonshot

    changed milestone to %SQS Moonshot

  • Robert Rosca changed the description

    changed the description

  • assigned to @roscar

  • Robert Rosca added 12 commits

    added 12 commits

    Compare with previous version

  • Where is this meant to fall on the scale between 'quick & dirty' and 'long term solution'?

    I think we should aim to get away from having a hardcoded list of notebooks which the calibration machinery is allowed to run. At the level of the xfel-calibrate CLI, perhaps it could accept an arbitrary path at the command line (maybe via a separate xfel-calibrate-nb command). For the webservice, it could be configured through the proposal YAML config.

    Obviously we may want to take a shortcut for a prototype, but I don't think it's actually much extra complexity to do what I'm thinking of. And my understanding of 'moonshot' is that we're meant to be building things to keep, not just prototypes.

  • So this is somewhat a result of me figuring out what REMI actually needs as I developed it. Initially I was certain that there will actually be configuration in the notebook (lacking a more sophisticated way of providing parameters) the scientists will want to change. But the design evolved into a much more usable way, which does not need that ultimately, but is nicely served by a YAML file (which is exposed to users though!). Thus, the notebook might as well have been in the repository to begin with, not requiring any changes at all. Now we are left with the half-hearted solution I thought I once needed.

    (Somewhat) addressing your original question: Non-hacky short-term solution

    That being said, of course I fully agree with your premise of not having hardcoded machinery at all. We've actually had a nice discussion over this in today's detector meeting :grin:

  • added Waiting for review label and removed In Progress label

  • Robert Rosca added 5 commits

    added 5 commits

    • 52488c28 - Add logic for user notebooks
    • 65a5f03f - Create usr and scratch directories in mock proposal
    • e66665a2 - Add test-cli notebook, both normal and user dir configs
    • 1639338f - Make the user path templating more flexible
    • d9bbb032 - Add user notebook tests

    Compare with previous version

  • Robert Rosca added 1 commit

    added 1 commit

    Compare with previous version

  • Robert Rosca added 1 commit

    added 1 commit

    • c54734e6 - 1 commit from branch test/add-xfel-calibrate-cli-tests

    Compare with previous version

  • Robert Rosca added 8 commits

    added 8 commits

    • 63ed5875 - 1 commit from branch test/add-xfel-calibrate-cli-tests
    • e8279768 - Allow for notebooks with `None` path
    • f5ffbd47 - Fix version parsing
    • 4c1814e5 - Add logic for user notebooks
    • 005a3cb3 - Create usr and scratch directories in mock proposal
    • f66da840 - Add test-cli notebook, both normal and user dir configs
    • 3a403ff4 - Make the user path templating more flexible
    • 5352c57c - Add user notebook tests

    Compare with previous version

  • Robert Rosca added 1 commit

    added 1 commit

    Compare with previous version

  • Robert Rosca added 9 commits

    added 9 commits

    • 5cac9af0 - 1 commit from branch test/add-xfel-calibrate-cli-tests
    • 1c6d436a - Allow for notebooks with `None` path
    • 34d1f639 - Fix version parsing
    • a3956d7c - Add logic for user notebooks
    • 9d3725bd - Create usr and scratch directories in mock proposal
    • 6287f072 - Add test-cli notebook, both normal and user dir configs
    • 9a78101a - Make the user path templating more flexible
    • 8dfe44c4 - Add user notebook tests
    • 54507606 - Adjust formatting

    Compare with previous version

  • Where is this meant to fall on the scale between 'quick & dirty' and 'long term solution'?

    I'd say somewhere around half way between the two :wink: I think the solution of allowing for variables in the path which are then filled in based on the command line arguments is a good one. But the implementation is quick and dirty because the way that the parser gets generated, and then modified, and the separate independent uses of sys.argv, etc... made it pretty hard to come up with a better solution that didn't require a lot of changes.

    I think we should aim to get away from having a hardcoded list of notebooks which the calibration machinery is allowed to run. At the level of the xfel-calibrate CLI, perhaps it could accept an arbitrary path at the command line (maybe via a separate xfel-calibrate-nb command). For the webservice, it could be configured through the proposal YAML config.

    Wouldn't moving it to the configs (I assume by proposal yaml config you mean the ones here https://git.xfel.eu/gitlab/detectors/calibration_configurations) just move them from one hardcoded place to another?

    The approach I'm taking here is to allow for fstrings in the notebook path, which then gets filled in from the CLI arguments. If they're arguments that normally get passed through the webservice anyway (e.g. cycle, proposal, and run) then you can have pretty generic paths like the one I currently have for REMI:

        "REMI": {
            "CORRECT": {
                "notebook": None,
                "user_notebook":
                    "/gpfs/exfel/exp/{instrument}/{cycle}/p{proposal}"
                    "/usr/calibration/notebooks/correct.ipynb",
                "concurrency": {
                    "parameter": None,
                    "use function": None,
                    "default concurrency": None,
                    "cluster cores": 1
                },
            },

    Which gives you a decent amount of flexibility and required the fewest changes.

    Obviously we may want to take a shortcut for a prototype, but I don't think it's actually much extra complexity to do what I'm thinking of. And my understanding of 'moonshot' is that we're meant to be building things to keep, not just prototypes.

    True, but going back to your first question, for me the long-term solution is completely changing how the CLI portion works. The current version is, to me (granted maybe I'm stupid), extremely difficult to read, nevermind understand or change.

    image

    There are a lot of other functions which are pretty complex and (at least now with modern python packages) could be refactored to be much simpler or could be completely removed. The main long term result of this moonshot work is, to me, the tests added in this PR and in !464 (merged) which lay groundwork for further tests, as well as a future refactoring of the file; and learning more about what requirements users and scientists have.

    The nice thing with this calibrate.py file is that it's not imported by anything else so changing how it works internally, as long as the exposed command line calls remain the same, ought to be pretty safe.

  • Robert Rosca unmarked as a Work In Progress

    unmarked as a Work In Progress

  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
  • Loading
Please register or sign in to reply
Loading