@grusin-db (Contributor) commented on Jun 6, 2025

TL;DR

Adds thread- and async-safe support for passing a logging context into log records, exceptions, and tracebacks.

This makes it clear under which context a given log message, exception, or traceback occurred.

This is not a breaking change. Existing code works as-is, with no change to logger output. The change only affects code that runs under the scope of the context manager and/or decorator.

How to use

Decorator

Meant for automatically adding function parameters to the logging context. Parameters that should be skipped just need the SkipLogging annotation, and they won't be added to the context.

import logging

from databricks.labs.blueprint.logger import logging_context_params, SkipLogging

logger = logging.getLogger()  # root logger, matching the [root] in the output below

@logging_context_params(foo="bar")
# `foo=bar` will be added to the context, because it's defined at decorator level
# `a` will be added; it will always have the value of parameter `a` at execution time
# `b` will be skipped from auto-adding to the context because it is wrapped in `SkipLogging`
def do_math(a: int, b: SkipLogging[int]):
    r = pow(a, b)
    logger.info(f"result of {a}**{b} is {r}")
    return r

do_math(2, 8)
>>> 16:25:30     INFO [root] result of 2**8 is 256 (foo='bar', a=2)

Threads

Threads from databricks.labs.blueprint.parallel has been patched to support the logging context. Hence, if the functions being executed are decorated with logging_context_params, or run inside a with logging_context(...) block, the logger, exceptions, and tracebacks will all be enriched.

Implementation note: by default, threads don't inherit ContextVars from the parent thread; only asyncio handles this, because it was part of the ContextVar PEP (PEP 567). I've patched the Threads code to do the same.
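The inheritance gap can be seen with plain `contextvars`, independent of this library: a freshly started `Thread` does not see the parent's value, while running the target through `contextvars.copy_context()` carries it across.

```python
# Standalone demonstration of the thread-inheritance gap described above.
import contextvars
import threading

var: contextvars.ContextVar[str] = contextvars.ContextVar("var", default="unset")
var.set("parent-value")

seen = {}


def worker(label: str) -> None:
    seen[label] = var.get()


# A plain thread starts with a fresh context, so it sees only the default.
t1 = threading.Thread(target=worker, args=("plain",))
t1.start()
t1.join()

# Running inside a copied context carries the parent's value across.
ctx = contextvars.copy_context()
t2 = threading.Thread(target=lambda: ctx.run(worker, "copied"))
t2.start()
t2.join()

print(seen)  # {'plain': 'unset', 'copied': 'parent-value'}
```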

from databricks.labs.blueprint.parallel import Threads
from functools import partial

# do_math is decorated with logging_context_params, without SkipLogging to show all parameters
Threads.strict("logging_show_off", [
    partial(do_math, 2, 2),
    partial(do_math, 2, 4), 
    partial(do_math, 2, 8), 
    partial(do_math, 2, 16), 
])
>>> 16:30:07    DEBUG [d.l.blueprint.parallel] Starting 4 tasks in 16 threads
>>> 16:30:07     INFO [root][logging_show_off_0] result of 2**2 is 4 (foo='bar', a=2, b=2)
>>> 16:30:07     INFO [root][logging_show_off_1] result of 2**4 is 16 (foo='bar', a=2, b=4)
>>> 16:30:07     INFO [root][logging_show_off_2] result of 2**8 is 256 (foo='bar', a=2, b=8)
>>> 16:30:07     INFO [root][logging_show_off_3] result of 2**16 is 65536 (foo='bar', a=2, b=16)
>>> 16:30:07     INFO [d.l.blueprint.parallel][logging_show_off_3] logging_show_off 4/4, rps: 417.493/sec
>>> 16:30:07     INFO [d.l.blueprint.parallel] Finished 'logging_show_off' tasks: 100% results available (4/4). Took 0:00:00.010311

Context manager

Context managers can be nested; the same applies to decorators. If parameters have overlapping names, the behaviour is like local variables: the most recent definition in scope wins.

from databricks.labs.blueprint.logger import logging_context

with logging_context(foo="bar", a=2):
    logger.info("hello")
>>> 16:26:17     INFO [root] hello (foo='bar', a=2)
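The shadowing rule can be sketched with a minimal stand-in for the context manager (hypothetical; the library's actual implementation may differ):

```python
# Minimal stand-in for logging_context to show nested shadowing; not the
# library's code.
import contextvars
from contextlib import contextmanager

_ctx: contextvars.ContextVar[dict] = contextvars.ContextVar("_ctx", default={})


@contextmanager
def logging_context(**params):
    # Merge onto the current context; reset restores the outer mapping on exit.
    token = _ctx.set({**_ctx.get(), **params})
    try:
        yield
    finally:
        _ctx.reset(token)


snapshots = []
with logging_context(foo="bar", a=2):
    with logging_context(a=99):          # inner `a` shadows the outer one
        snapshots.append(dict(_ctx.get()))
    snapshots.append(dict(_ctx.get()))   # outer `a` is restored on exit

print(snapshots)  # [{'foo': 'bar', 'a': 99}, {'foo': 'bar', 'a': 2}]
```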

Exceptions

Exceptions carry the context they occurred in; this feature is available on Python 3.11+.

If an exception bubbles up across many try/except context scopes, only the original context is stored, to provide the most meaningful information about the origin of the exception.

from databricks.labs.blueprint.logger import logging_context

with logging_context(user='Alice', action='read'):
    logger.info("inside of first context")
    with logging_context(top_secret="47"):
        1/0
>>> 16:26:36     INFO [root] inside of first context (user='Alice', action='read')
>>> ---------------------------------------------------------------------------
>>> ZeroDivisionError                         Traceback (most recent call last)
>>> File <command-3369811628408695>, line 4
>>>           2 logger.info("inside of first context")
>>>           3 with logging_context(top_secret="47"):
>>> ---->     4     1/0
>>>
>>> ZeroDivisionError: division by zero
>>> Context: user='Alice', action='read', top_secret='47'
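On 3.11+, this kind of context attachment can be done with `BaseException.add_note` (PEP 678). The helper below is a hypothetical sketch to illustrate the mechanism, not the library's API:

```python
# Hypothetical helper showing how context can be attached as an exception note
# (PEP 678, Python 3.11+); not the library's actual API.
def attach_context(exc: BaseException, context: dict) -> None:
    if hasattr(exc, "add_note"):  # no-op on Python 3.10, where notes don't exist
        exc.add_note("Context: " + ", ".join(f"{k}={v!r}" for k, v in context.items()))


try:
    1 / 0
except ZeroDivisionError as e:
    attach_context(e, {"user": "Alice", "action": "read", "top_secret": "47"})
    notes = getattr(e, "__notes__", [])

print(notes)
```

On 3.11+ the traceback machinery prints notes below the exception message, which matches the `Context: ...` line in the output above.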

Version compatibility

  • Python 3.10 is supported, but with the notes feature disabled: neither exceptions nor tracebacks will carry the attached logging context at the time of the exception. Other logger features work the same way as on 3.11+.
  • Python 3.11+ is fully supported, without any limitations.

PR state:

  • code works & tests pass
  • add logger-formatting validation tests for sample contexts
  • mypy complains about one of the Annotations... need to fix that for the sake of static checking
  • docs are done in the PR
  • docs in README.md

@grusin-db grusin-db requested a review from nfx as a code owner June 6, 2025 10:29
@grusin-db grusin-db marked this pull request as draft June 6, 2025 11:51
@grusin-db grusin-db marked this pull request as ready for review June 20, 2025 08:46
@gueniai gueniai requested review from asnare and removed request for nfx June 24, 2025 20:19
@asnare asnare added the enhancement New feature or request label Jun 25, 2025
@asnare (Contributor) left a comment
I started reviewing this a while ago, and apologise for taking so long to come back to it. I'm open to the idea of including context in the logging: this is essentially similar to log4j's MDC which is nice, but rarely used in practice because it turns out to be a lot of work to provide context all the time.

Stepping back a bit, there's a movement towards treating tracing as separate from logging, but there's obviously overlap: typically what we're calling the context in this PR is related to tracing, and it's useful for logging to be aware of the tracing context. Simultaneously, the move to structured logging makes this a lot simpler to deal with. Food for thought, I guess.

Getting back to this PR, I've reviewed the technical aspects of the implementation, but haven't looked at the tests yet. My broad take is:

  • I like that this is opt-in: existing users of the library shouldn't notice any changes: log formatting will be unaffected and overhead looks minimal. (This was a major concern when I first saw the PR come in: MDC in log4j tends to suffer from simply being noise in the logs for the most common situation where there is no context.)
  • There are some issues related to mutability and safety of the state; I think the inline comments are enough to explain the issue, and we need to address them.
  • Please type-hint all functions/methods where this is possible, as well as globals. This makes review much easier and later refactoring safer.
  • I'd like to see the changes to the parallel module made more generic, and split into a separate PR. When submitting tasks the context should be captured, and when the tasks run within the pool it should be within this captured context.
  • The tests will also need to be updated to reflect these changes.

@@ -0,0 +1,179 @@
"""internall plumbing for passing logging context (dict) to logger instances"""

Suggested change
"""internall plumbing for passing logging context (dict) to logger instances"""
"""Internal plumbing for passing logging context (dict) to logger instances."""

from types import MappingProxyType
from typing import TYPE_CHECKING, Annotated, Any, TypeVar, get_origin

if TYPE_CHECKING:

In general we don't use TYPE_CHECKING guards: our IDEs and mypy work fine without it, and it can interfere with some refactoring operations.

I think this code will work fine without the guard?

return Annotated[item, SkipLogging()]


_CTX: ContextVar = ContextVar("ctx", default={})

The name of the variable and the first parameter are supposed to match, for example:

Suggested change
_CTX: ContextVar = ContextVar("ctx", default={})
_CTX: ContextVar = ContextVar("_CTX", default={})

That said, there's a bigger safety issue we need to think about here, because the dictionary holding the context is mutable. This leads to some weird consequences:

  • The default of {} will be problematic: I expect it will cause the context to be shared between contexts: context $a$ obtains the default (via current_context(), for example) and updates it to include $k=v$; if context $b$ then obtains the default, it will include the $k=v$ from context $a$, because it was all the same dictionary instance.
  • The .reset() semantics can be surprising because they assume immutable values, but the dictionary is not.

From a practical point of view, I can see most of the code is avoiding this issue by creating new dictionaries, but there's a path to problems via current_context().

Some things that would help here:

  • Type-hint _CTX as a ContextVar[Mapping[str,Any]]: this will (I think?) allow the type-checker to detect updates to the returned value, useful for verifying our own code doesn't accidentally update the instance.
  • Either make current_context() protected or return a defensive copy of the internal dictionary.
  • A comment explaining the situation so that the next maintainer doesn't accidentally break it. ;)


[Later note: I've thought about this some more, and also think it's safest to have no default value, but rather use .get(None) when fetching, and initialise if None is returned.]

@grusin-db (Author) replied:

Good catch; let me handle the default internally and not worry about {}.

yield name


def _skip_dict_key(params: dict, keys_to_skip: set):

Can we type-hint this properly, please?

return ", ".join(f"{k}={v!r}" for k, v in params.items())


def _get_skip_logging_param_names(sig: inspect.Signature):

Please annotate the return-type.

class LoggingContextInjectingFilter(logging.Filter):
"""Adds current_context() to the log record."""

def filter(self, record):

Type-hints, please.

Comment on lines +157 to +158
record.context = f"{_params_str(ctx)}" if ctx else ""
record.context_msg = f" ({record.context})" if record.context else ""

I'm curious why we're attaching the context rendered as a string? I would think attaching it as a dictionary is preferable?

Maybe something like:

Suggested change
record.context = f"{_params_str(ctx)}" if ctx else ""
record.context_msg = f" ({record.context})" if record.context else ""
record.context = ctx
record.context_msg = f" ({_params_str(ctx)})" if ctx else ""

@grusin-db (Author) replied:

I've put it as string to remove the quotes around name of key, it takes less space this way and feels easier to read:

d = { 'process': 'a', 'name': 'Alice', 'priority': 70}

# {'process': 'a', 'name': 'Alice', 'priority': 70}
print(d) 

# {'process': 'a', 'name': 'Alice', 'priority': 70}
print(repr(d)) 

# process = 'a', name = 'Alice', priority = 70
print(", ".join(f"{k} = {v!r}" for k, v in d.items()))

but all in all, I don't have a strong opinion.

Comment on lines +162 to +179
class LoggingThreadPoolExecutor(ThreadPoolExecutor):
"""ThreadPoolExecutor drop in replacement that will apply current loging context to all new started threads."""

def __init__(self, max_workers=None, thread_name_prefix="", initializer=None, initargs=()):
self.__current_context = current_context()
self.__wrapped_initializer = initializer

super().__init__(
max_workers=max_workers,
thread_name_prefix=thread_name_prefix,
initializer=self._logging_context_init,
initargs=initargs,
)

def _logging_context_init(self, *args):
_CTX.set(self.__current_context)
if self.__wrapped_initializer:
self.__wrapped_initializer(*args)

I was mildly surprised that we need this; I thought that the threading module and friends were updated with contextvars support, but it looks like only asyncio was retrofitted.

With this in mind, you're correct that context variables aren't properly transferred into execution contexts in other threads. I suspect this is partly because it's not always clear at which point the context should be captured.

For example, using a basic Thread you could choose to capture the context during __init__() or during start(). Ensuring this happens means doing something like:

class ContextThread(threading.Thread):
    def __init__(self, *args, **kw) -> None:
        self._captured_ctx = contextvars.copy_context()
        super().__init__(*args, **kw)

    def run(self):
        transferred_ctx = self._captured_ctx
        del self._captured_ctx
        return transferred_ctx.run(super().run)

Returning to the situation at hand: ThreadPoolExecutor. For this what really matters is the context when a task is submitted to the pool. That is, during .submit() the context of the caller should be captured so that when the submitted function is executed the captured context of the submitter is in place. (This is a bit fiddly, but my example above shows the basics of it… it's even cleaner because there's no temporary state required on the pool instance.)

I'd like to see this implemented for contextvars in general, irrespective of the logging changes that we're making. Maybe it should be a separate PR, just to limit the scope and ensure we get that right independently of these changes. What do you think?
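A sketch of the submit-time capture described above (`ContextThreadPoolExecutor` is a hypothetical name, not the PR's class):

```python
# Sketch of capturing the caller's context at submit() time, so the task runs
# on the worker thread inside the submitter's context. Hypothetical class name.
import contextvars
from concurrent.futures import ThreadPoolExecutor


class ContextThreadPoolExecutor(ThreadPoolExecutor):
    def submit(self, fn, /, *args, **kwargs):
        # Copy the caller's context now; the worker thread then executes the
        # task inside that snapshot via Context.run().
        ctx = contextvars.copy_context()
        return super().submit(ctx.run, fn, *args, **kwargs)


var: contextvars.ContextVar[str] = contextvars.ContextVar("var", default="unset")
var.set("submitter-value")

with ContextThreadPoolExecutor(max_workers=1) as pool:
    result = pool.submit(var.get).result()

print(result)  # submitter-value
```

Because a fresh copy is taken per `submit()` call, there is no shared mutable state on the pool instance, which is the "cleaner" property the comment above alludes to.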

@grusin-db (Author) replied:

I would prefer to patch this in this PR; otherwise it's quite hard to prove the usefulness of this feature when we don't have tests for multithreading, and only with multithreading do I see the benefits of this feature. In non-multithreaded code, just reading the logs top-down is enough to understand the context.

I can make it work in submit(); will update the code soon.

Comment on lines +108 to +113

# safe check, just in case injection filter is removed
context_repr = record.context if hasattr(record, "context") else ""
context_msg = f" {self.GRAY}({context_repr}){self.RESET}" if context_repr else ""

return f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}{context_msg}"

Hmm, how about?

Suggested change
# safe check, just in case injection filter is removed
context_repr = record.context if hasattr(record, "context") else ""
context_msg = f" {self.GRAY}({context_repr}){self.RESET}" if context_repr else ""
return f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}{context_msg}"
context_msg = getattr(record, "context_msg", "")
if context_msg:
    formatted = f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.GRAY}{context_msg}{self.RESET}"
else:
    formatted = f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}"
return formatted


Here the plot thickens… ideally we'd capture the context (as discussed earlier) when the task to run is being created, but by the time we reach here that's already happened. That said, we should be capturing the caller context during task submission and ensuring that when each task runs on the other thread, it starts within the captured context. (The task may modify it further, of-course.)
