[webservice] Introduce Global Logger for xfel-calibrate
Description
This merge request introduces a comprehensive logging system for errors and warnings in the offline calibration pipeline. The goal is to improve transparency and provide more detailed feedback to users through myMDC.
```mermaid
graph TD
    A[Start] --> B[Initialize Logging]
    B --> D[Execute xfel-calibrate and wait for processing to finish]
    D --> E{Check if ALL jobs COMPLETED?}
    E -->|Yes| F[Read error logs for all job_ids]
    E -->|No| G[Read error logs for completed job_ids]
    F & G --> H{Any error logs non-empty?}
    H -->|Yes| I[Set Specific Error]
    H -->|No| J[Read warning logs for all job_ids]
    J --> L{Any warning logs non-empty?}
    L -->|No| M[Set Success]
    L -->|Yes| N[Set Warning]
    I & M & N --> O[Update myMDC]
    O --> P[End]

    classDef processNode fill:#f9f,stroke:#333,stroke-width:2px;
    classDef decisionNode fill:#bbf,stroke:#333,stroke-width:2px;
    classDef statusNode fill:#bfb,stroke:#333,stroke-width:2px;
    class B,D,F,G,J processNode;
    class E,H,L decisionNode;
    class I,M,N statusNode;
```
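For illustration, the decision step in this flow could be reduced to something like the following sketch, assuming one error/warning log file per job in the calibration working directory. The file naming and the function itself are illustrative, not existing webservice code:

```python
from pathlib import Path


def decide_state(working_dir: Path, job_ids: list) -> str:
    """Map per-job error/warning logs to a myMDC state (illustrative only)."""
    def any_non_empty(kind: str) -> bool:
        # Hypothetical naming scheme: one log file per job and level.
        paths = (working_dir / f"{kind}_{job_id}.log" for job_id in job_ids)
        return any(p.exists() and p.stat().st_size > 0 for p in paths)

    if any_non_empty("errors"):
        return "error"    # Set Specific Error
    if any_non_empty("warnings"):
        return "warning"  # Set Warning
    return "success"      # Set Success
```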
Related Issues
- Follows and is inspired by the `feat/errors-and-warnings` branch
- Addresses the issue of propagating warnings/errors back to myMDC: https://git.xfel.eu/calibration/planning/-/issues/149
Changes
```mermaid
graph TD
    A[xfel-calibrate CLI] --> B[Copy setup_logging.py to the current working dir]
    B --> C[Princess runs setup-logging-nb.py before notebook execution]
    C --> D[Set up Error, Warning, and Info Handlers]
    D --> E[Override IPython Exception Handling]
    E --> F[Execute Calibration Notebooks]
```
1. New Logging Setup Module
   - Introduces a new Python module, `setup_logging.py`, for configuring the logging system
   - Uses Princess to run this setup module before executing any notebooks
2. Log File Structure
   - Creates three separate log files:
     - `errors.log`: captures all error-level messages
     - `warnings.log`: captures all warning-level messages
     - `info.log`: captures DEBUG and INFO level messages, including logs from APIs like `calibration_client`
3. Implementation Details
   - Custom `ContextFilter` to add notebook name and directory to log records
   - Custom log formatter with detailed information:
     `%(asctime)s - %(levelname)s - %(filename)s:%(lineno)d - [Notebook: %(notebook)s] - [Directory: %(directory)s] - %(message)s`
   - Environment variables for the notebook name (`CAL_NOTEBOOK_NAME`) and working directory (`CAL_WORKING_DIR`)
   - Separate file handlers for errors, warnings, and info logs
   - Custom error handling with a `safe_handle_error` function
   - Custom warning handling with a `handle_warning` function
   - Override of IPython's exception handling
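For orientation, these pieces could fit together roughly as in the following sketch. It is not the actual module contents: the names `ContextFilter`, `safe_handle_error`, `handle_warning`, and the environment variables come from this description, while the handler wiring and helper below are assumptions.

```python
import logging
import os
import warnings

LOG_FORMAT = (
    "%(asctime)s - %(levelname)s - %(filename)s:%(lineno)d - "
    "[Notebook: %(notebook)s] - [Directory: %(directory)s] - %(message)s"
)


class ContextFilter(logging.Filter):
    """Attach notebook name and working directory to every log record."""

    def filter(self, record):
        record.notebook = os.environ.get("CAL_NOTEBOOK_NAME", "unknown")
        record.directory = os.environ.get("CAL_WORKING_DIR", os.getcwd())
        return True


def _file_handler(path, level):
    handler = logging.FileHandler(path)
    handler.setLevel(level)
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    handler.addFilter(ContextFilter())
    return handler


root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(_file_handler("errors.log", logging.ERROR))
# Handler levels only set a minimum; keeping higher-level records out of
# warnings.log/info.log needs an extra filter (see the discussion below).
root.addHandler(_file_handler("warnings.log", logging.WARNING))
root.addHandler(_file_handler("info.log", logging.DEBUG))


def handle_warning(message, category, filename, lineno, file=None, line=None):
    """Route Python warnings into the logging system."""
    root.warning("%s: %s (%s:%d)", category.__name__, message, filename, lineno)


def safe_handle_error(shell, etype, value, tb, tb_offset=None):
    """Log uncaught notebook exceptions, then show IPython's usual traceback."""
    root.error("An error occurred. Exception type: %s, Message: %s",
               etype.__name__, value, exc_info=(etype, value, tb))
    shell.showtraceback((etype, value, tb), tb_offset=tb_offset)


warnings.showwarning = handle_warning

# Override IPython's exception handling so uncaught errors are logged as well.
try:
    get_ipython().set_custom_exc((Exception,), safe_handle_error)
except NameError:
    pass  # Not running inside IPython.
```

The real module writes into `CAL_WORKING_DIR` and, as of the later commits in this MR, emits JSON per line split by job ID; the plain-text formatter here only mirrors the format string listed above.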
4. Next steps
   - Update the webservice:
     - Check the error/warning log files.
     - Read the errors/warnings, if available.
     - Update myMDC states (start using the warning state for corrections, not only errors).
Implementation Steps
- Copy `src/xfel_calibrate/setup_logging.py` to `CAL_WORKING_DIR` as `setup-logging-nb.py` in `calibrate.py`
- Use Princess's `--run-before` option to execute the handler before the xfel-calibrate CAL action
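The wiring could look roughly like this (a sketch assuming `calibrate.py` knows the working directory; the actual Princess invocation is not reproduced here):

```python
import shutil
from pathlib import Path


def stage_logging_setup(cal_working_dir) -> Path:
    """Copy setup_logging.py into the working dir as setup-logging-nb.py."""
    src = Path(__file__).with_name("setup_logging.py")
    dest = Path(cal_working_dir) / "setup-logging-nb.py"
    shutil.copy2(src, dest)
    return dest

# The staged file is then passed to Princess via its --run-before option, so the
# logging handlers are installed before any calibration notebook cell runs.
```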
How Has This Been Tested?
Tested manually with the test notebooks by Philipp from `feat/errors-and-warnings`, covering all log levels (no automated tests yet).
- Add a check in the tests to make sure warnings and errors are properly logged separately (a possible shape for this check is sketched below).
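A possible shape for such a check (hypothetical pytest sketch; it reproduces the handler setup inline instead of importing the real module):

```python
import logging


def test_warnings_and_errors_are_split(tmp_path):
    # Mirror the file-handler setup: ERROR-only file and WARNING-only file.
    logger = logging.getLogger("cal-logging-test")
    logger.setLevel(logging.DEBUG)

    err = logging.FileHandler(tmp_path / "errors.log")
    err.setLevel(logging.ERROR)
    warn = logging.FileHandler(tmp_path / "warnings.log")
    warn.setLevel(logging.WARNING)
    warn.addFilter(lambda record: record.levelno == logging.WARNING)
    logger.addHandler(err)
    logger.addHandler(warn)

    logger.warning("just a warning")
    logger.error("a real error")
    for handler in (err, warn):
        handler.close()

    assert "a real error" in (tmp_path / "errors.log").read_text()
    assert "just a warning" in (tmp_path / "warnings.log").read_text()
    assert "a real error" not in (tmp_path / "warnings.log").read_text()
    assert "just a warning" not in (tmp_path / "errors.log").read_text()
```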
Activity
added 1 commit
- bf1a2cab - rename test logging notebook and add a comment back
added 1 commit
- 637a9ed1 - rename test logging notebook and add a comment back
added 1 commit
- a448d172 - rename test logging notebook and add a comment back
- Resolved by Thomas Kluyver
Comments from the first discussion:
- Consider using JSON instead of plain text for easy log ingestion.
- We can save two log formats (JSON and plain text), or store only JSON and rely on another tool to produce a human-readable format from it.
- Writing multiple log files per job would help in identifying duplicated and unique warnings/errors between different jobs.
- Two approaches to identify a warning/error for a calibration process:
  - Either we only write a log file if there is a corresponding log-level message, so file presence is the indication we need.
  - Or we go with what we currently have and check whether the log files are empty or not.
- Use the job ID to identify logs in other tools: webservice (job monitor), Graylog, ...
- info.log currently contains all log levels. I need to add a filter, like the one for warnings, to avoid logging errors and warnings again in info.log (see the sketch below).
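One possible shape for that filter (a sketch; the class name is invented here):

```python
import logging


class MaxLevelFilter(logging.Filter):
    """Let records through only up to a maximum level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level

# Attached to the info.log handler, WARNING and ERROR records stay out of it:
# info_handler.addFilter(MaxLevelFilter(logging.INFO))
```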
Edited by Karim Ahmed
- Resolved by Thomas Kluyver
Do we have a particular purpose in mind for capturing info & debug level logs? These can be very verbose depending on the libraries that you're using. It will be small compared to the detector data we're working with, but I'd still rather not create a bunch of files without some concrete reason.
If there's something we want to record directly from the calibration code, in most cases we can just `print()` it so it shows up in the relevant position in the `slurm-12345.out` files and in the PDF report. We could also expose selected logging that way, but I don't think we'd want to catch everything.
- Resolved by Karim Ahmed
- Resolved by Karim Ahmed
- Resolved by Thomas Kluyver
added 53 commits
- a448d172...3077e2ad - 50 commits from branch `master`
- 273b23f6 - Intoduce Global Logger for xfel-calibrate
- 057cb66a - store info logs separately as well instead of console and refactor for pep8
- 41f381b2 - rename test logging notebook and add a comment back
- Resolved by Karim Ahmed
While trying `python-json-logger`: it stores each log entry as a separate JSON object in the file, so to read it back one has to loop over the entries.
EDIT:
1st format
{"timestamp": 1727090237.1661031, "level": "ERROR", "filename": "1266379894.py", "lineno": 1, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240923_111712.297118", "job_id": "9628486", "message": "Logging some (ERROR) without failing the notebook"} {"timestamp": 1727090237.1708908, "level": "ERROR", "filename": "4120722563.py", "lineno": 108, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240923_111712.297118", "job_id": "9628486", "message": "An error occurred. Exception type: ValueError, Message: FAIL", "exc_info": "Traceback (most recent call last):\n File \"/home/ahmedk/calibration/pycalibration/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py\", line 3577, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n File \"/tmp/ipykernel_85641/3022636508.py\", line 1, in <module>\n raise ValueError('FAIL')\nValueError: FAIL"}
With a derived class to override the FileHandler, I can create an array-like format and have a full JSON file.
2nd format
[ { "timestamp": "2024-09-24 12:46:06,419", "level": "ERROR", "filename": "1266379894.py", "lineno": 1, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240924_104550.513219", "job_id": "9636995", "class": "DefaultClass", "message": "Logging some (ERROR) without failing the notebook" }, { "timestamp": "2024-09-24 12:46:06,426", "level": "ERROR", "filename": "1852384038.py", "lineno": 114, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240924_104550.513219", "job_id": "9636995", "class": "ValueError", "message": "An error occurred. Exception type: ValueError, Message: FAIL", "exc_info": "Traceback (most recent call last):\n File \"/home/ahmedk/calibration/pycalibration/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py\", line 3577, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n File \"/tmp/ipykernel_677383/3022636508.py\", line 1, in <module>\n raise ValueError('FAIL')\nValueError: FAIL" } ]
I even tried this kind of format, where logs are grouped by class:
3rd format
{ "DefaultClass": [ { "timestamp": "2024-09-24 13:01:42,849", "level": "ERROR", "filename": "1266379894.py", "lineno": 1, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240924_110139.632251", "job_id": "9637021", "class": "DefaultClass", "message": "Logging some (ERROR) without failing the notebook" } ], "ValueError": [ { "timestamp": "2024-09-24 13:01:42,856", "level": "ERROR", "filename": "2356325268.py", "lineno": 120, "notebook": "TEST-LOGGING__NONE__NONE", "directory": "/gpfs/exfel/data/scratch/ahmedk/test/remove/log/TEST-TEST-LOGGING-240924_110139.632251", "job_id": "9637021", "class": "ValueError", "message": "An error occurred. Exception type: ValueError, Message: FAIL", "exc_info": "Traceback (most recent call last):\n File \"/home/ahmedk/calibration/pycalibration/.venv/lib/python3.11/site-packages/IPython/core/interactiveshell.py\", line 3577, in run_code\n exec(code_obj, self.user_global_ns, self.user_ns)\n File \"/tmp/ipykernel_2395076/3022636508.py\", line 1, in <module>\n raise ValueError('FAIL')\nValueError: FAIL" } ] }
The question here is which format would be the most efficient for our use case and for the webservice.
Edited by Karim Ahmed
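For reference, consuming the first (JSON-per-line) format from another tool stays simple; a hedged sketch, with an illustrative per-job file name:

```python
import json
from pathlib import Path


def read_json_lines(log_path):
    """Read a JSON-per-line log file into a list of record dicts."""
    records = []
    for line in Path(log_path).read_text().splitlines():
        if line.strip():
            records.append(json.loads(line))
    return records

# e.g. collect the distinct error messages of one job (file name is hypothetical):
# messages = {rec["message"] for rec in read_json_lines("errors_9628486.log")}
```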
added 1 commit
- 28309d57 - feat: store logs in JSON format and split in different files for job_id
added 1 commit
- e0440a0b - feat: simplify implementation to JSON per line
- Resolved by Karim Ahmed