@grusin-db (Contributor) commented on Jun 6, 2025

TL;DR

Adds thread- and async-safe support for passing a logging context into log records, exceptions, and tracebacks.

This makes it clear under which context a given log message, exception, or traceback occurred.

This is not a breaking change. Existing code works as-is, with no change to logger output. The change only affects code that runs under the scope of the context manager and/or decorator.

How to use

Decorator

Meant for automatically adding function parameters to the logging context. Parameters that should be skipped just need the SkipLogging annotation, and they won't be added to the context.

import logging

from databricks.labs.blueprint.logger import logging_context_params, SkipLogging

logger = logging.getLogger()  # root logger, matching the [root] in the output below

@logging_context_params(foo="bar")
# `foo=bar` will be added to the context, because it's defined at decorator level
# `a` will be added; it will always have the value of parameter `a` at execution time
# `b` will be skipped from auto-adding to the context because it is wrapped in `SkipLogging`
def do_math(a: int, b: SkipLogging[int]):
    r = pow(a, b)
    logger.info(f"result of {a}**{b} is {r}")
    return r

do_math(2, 8)
>>> 16:25:30     INFO [root] result of 2**8 is 256 (foo='bar', a=2)

Threads

Threads from databricks.labs.blueprint.parallel has been patched to support the logging context. Hence, if the functions being executed are decorated with logging_context_params, or run inside a with logging_context(...) block, the logger, exceptions, and tracebacks will all be enriched.

Implementation note: by default, threads don't inherit ContextVars from the parent thread; only asyncio handles this, because it was part of the ContextVar PEP (PEP 567). I've patched the Threads code to do the same.
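The inheritance gap can be seen with plain `contextvars`, independent of this library: a freshly started `Thread` does not see the parent's value, while running the target through `contextvars.copy_context()` carries it across.

```python
# Standalone demonstration of the thread-inheritance gap described above.
import contextvars
import threading

var: contextvars.ContextVar[str] = contextvars.ContextVar("var", default="unset")
var.set("parent-value")

seen = {}


def worker(label: str) -> None:
    seen[label] = var.get()


# A plain thread starts with a fresh context, so it sees only the default.
t1 = threading.Thread(target=worker, args=("plain",))
t1.start()
t1.join()

# Running inside a copied context carries the parent's value across.
ctx = contextvars.copy_context()
t2 = threading.Thread(target=lambda: ctx.run(worker, "copied"))
t2.start()
t2.join()

print(seen)  # {'plain': 'unset', 'copied': 'parent-value'}
```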

from databricks.labs.blueprint.parallel import Threads
from functools import partial

# do_math is decorated with logging_context_params, without SkipLogging to show all parameters
Threads.strict("logging_show_off", [
    partial(do_math, 2, 2),
    partial(do_math, 2, 4), 
    partial(do_math, 2, 8), 
    partial(do_math, 2, 16), 
])
>>> 16:30:07    DEBUG [d.l.blueprint.parallel] Starting 4 tasks in 16 threads
>>> 16:30:07     INFO [root][logging_show_off_0] result of 2**2 is 4 (foo='bar', a=2, b=2)
>>> 16:30:07     INFO [root][logging_show_off_1] result of 2**4 is 16 (foo='bar', a=2, b=4)
>>> 16:30:07     INFO [root][logging_show_off_2] result of 2**8 is 256 (foo='bar', a=2, b=8)
>>> 16:30:07     INFO [root][logging_show_off_3] result of 2**16 is 65536 (foo='bar', a=2, b=16)
>>> 16:30:07     INFO [d.l.blueprint.parallel][logging_show_off_3] logging_show_off 4/4, rps: 417.493/sec
>>> 16:30:07     INFO [d.l.blueprint.parallel] Finished 'logging_show_off' tasks: 100% results available (4/4). Took 0:00:00.010311

Context manager

Context managers can be nested; the same applies to decorators. If parameters have overlapping names, the behaviour is like local variables: the most recent definition in scope wins.

from databricks.labs.blueprint.logger import logging_context

with logging_context(foo="bar", a=2):
    logger.info("hello")
>>> 16:26:17     INFO [root] hello (foo='bar', a=2)
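The shadowing rule can be sketched with a minimal stand-in for the context manager (hypothetical; the library's actual implementation may differ):

```python
# Minimal stand-in for logging_context to show nested shadowing; not the
# library's code.
import contextvars
from contextlib import contextmanager

_ctx: contextvars.ContextVar[dict] = contextvars.ContextVar("_ctx", default={})


@contextmanager
def logging_context(**params):
    # Merge onto the current context; reset restores the outer mapping on exit.
    token = _ctx.set({**_ctx.get(), **params})
    try:
        yield
    finally:
        _ctx.reset(token)


snapshots = []
with logging_context(foo="bar", a=2):
    with logging_context(a=99):          # inner `a` shadows the outer one
        snapshots.append(dict(_ctx.get()))
    snapshots.append(dict(_ctx.get()))   # outer `a` is restored on exit

print(snapshots)  # [{'foo': 'bar', 'a': 99}, {'foo': 'bar', 'a': 2}]
```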

Exceptions

Exceptions carry the context they occurred in; this feature is available on Python 3.11+.

If an exception bubbles up across many try/except context scopes, only the original context is stored, to provide the most meaningful information about the origin of the exception.

from databricks.labs.blueprint.logger import logging_context

with logging_context(user='Alice', action='read'):
    logger.info("inside of first context")
    with logging_context(top_secret="47"):
        1/0
>>> 16:26:36     INFO [root] inside of first context (user='Alice', action='read')
>>> ---------------------------------------------------------------------------
>>> ZeroDivisionError                         Traceback (most recent call last)
>>> File <command-3369811628408695>, line 4
>>>           2 logger.info("inside of first context")
>>>           3 with logging_context(top_secret="47"):
>>> ---->     4     1/0
>>>
>>> ZeroDivisionError: division by zero
>>> Context: user='Alice', action='read', top_secret='47'
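On 3.11+, this kind of context attachment can be done with `BaseException.add_note` (PEP 678). The helper below is a hypothetical sketch to illustrate the mechanism, not the library's API:

```python
# Hypothetical helper showing how context can be attached as an exception note
# (PEP 678, Python 3.11+); not the library's actual API.
def attach_context(exc: BaseException, context: dict) -> None:
    if hasattr(exc, "add_note"):  # no-op on Python 3.10, where notes don't exist
        exc.add_note("Context: " + ", ".join(f"{k}={v!r}" for k, v in context.items()))


try:
    1 / 0
except ZeroDivisionError as e:
    attach_context(e, {"user": "Alice", "action": "read", "top_secret": "47"})
    notes = getattr(e, "__notes__", [])

print(notes)
```

On 3.11+ the traceback machinery prints notes below the exception message, which matches the `Context: ...` line in the output above.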

Version compatibility

  • Python 3.10 is supported, but with the notes feature disabled: neither exceptions nor tracebacks will carry the attached logging context at the time of the exception. Other logger features work the same way as on 3.11+.
  • Python 3.11+ is fully supported, without any limitations.

PR state:

  • code works & tests pass
  • add logger-formatting validation tests for sample contexts
  • mypy complains about one of the Annotations... need to fix that for the sake of static checking
  • docs are done in the PR
  • docs in README.md

@grusin-db grusin-db requested a review from nfx as a code owner June 6, 2025 10:29
@grusin-db grusin-db marked this pull request as draft June 6, 2025 11:51
@grusin-db grusin-db marked this pull request as ready for review June 20, 2025 08:46
@gueniai gueniai requested review from asnare and removed request for nfx June 24, 2025 20:19
@asnare asnare added the enhancement New feature or request label Jun 25, 2025
@asnare (Contributor) left a comment
I started reviewing this a while ago, and apologise for taking so long to come back to it. I'm open to the idea of including context in the logging: this is essentially similar to log4j's MDC which is nice, but rarely used in practice because it turns out to be a lot of work to provide context all the time.

Stepping back a bit, there's a movement towards treating tracing as separate from logging, but there's obviously overlap: typically what we're calling the context in this PR is related to tracing, and it's useful for logging to be aware of the tracing context. Simultaneously, the move to structured logging makes this a lot simpler to deal with. Food for thought, I guess.

Getting back to this PR, I've reviewed the technical aspects of the implementation, but haven't looked at the tests yet. My broad take is:

  • I like that this is opt-in: existing users of the library shouldn't notice any changes: log formatting will be unaffected and overhead looks minimal. (This was a major concern when I first saw the PR come in: MDC in log4j tends to suffer from simply being noise in the logs for the most common situation where there is no context.)
  • There are some issues related to mutability and safety of the state; I think the inline comments are enough to explain the issue, and we need to address them.
  • Please type-hint all functions/methods where this is possible, as well as globals. This makes review much easier and later refactoring safer.
  • I'd like to see the changes to the parallel module made more generic, and split into a separate PR. When submitting tasks the context should be captured, and when the tasks run within the pool it should be within this captured context.
  • The tests will also need to be updated to reflect these changes.

@@ -0,0 +1,179 @@
"""internall plumbing for passing logging context (dict) to logger instances"""

Suggested change
"""internall plumbing for passing logging context (dict) to logger instances"""
"""Internal plumbing for passing logging context (dict) to logger instances."""

from types import MappingProxyType
from typing import TYPE_CHECKING, Annotated, Any, TypeVar, get_origin

if TYPE_CHECKING:

In general we don't use TYPE_CHECKING guards: our IDEs and mypy work fine without it, and it can interfere with some refactoring operations.

I think this code will work fine without the guard?

return Annotated[item, SkipLogging()]


_CTX: ContextVar = ContextVar("ctx", default={})

The name of the variable and the first parameter are supposed to match, for example:

Suggested change
_CTX: ContextVar = ContextVar("ctx", default={})
_CTX: ContextVar = ContextVar("_CTX", default={})

That said, there's a bigger safety issue we need to think about here, because the dictionary holding the context is mutable. This leads to some weird consequences:

  • The default of {} will be problematic: I expect it will cause the context to be shared between contexts: context $a$ obtains the default (via current_context(), for example) and updates it to include $k=v$; if context $b$ then obtains the default, it will include the $k=v$ from context $a$, because it was all the same dictionary instance.
  • The .reset() semantics can be surprising because they assume immutable values, but the dictionary is not.

From a practical point of view, I can see most of the code is avoiding this issue by creating new dictionaries, but there's a path to problems via current_context().

Some things that would help here:

  • Type-hint _CTX as a ContextVar[Mapping[str,Any]]: this will (I think?) allow the type-checker to detect updates to the returned value, useful for verifying our own code doesn't accidentally update the instance.
  • Either make current_context() protected or return a defensive copy of the internal dictionary.
  • A comment explaining the situation so that the next maintainer doesn't accidentally break it. ;)


[Later note: I've thought about this some more, and also think it's safest to have no default value, but rather use .get(None) when fetching, and initialise if None is returned.]

@grusin-db (Author) replied:

Good catch; let me handle the default internally and not worry about {}.

yield name


def _skip_dict_key(params: dict, keys_to_skip: set):

Can we type-hint this properly, please?

return ", ".join(f"{k}={v!r}" for k, v in params.items())


def _get_skip_logging_param_names(sig: inspect.Signature):

Please annotate the return-type.

class LoggingContextInjectingFilter(logging.Filter):
"""Adds current_context() to the log record."""

def filter(self, record):

Type-hints, please.

Comment on lines +157 to +158
record.context = f"{_params_str(ctx)}" if ctx else ""
record.context_msg = f" ({record.context})" if record.context else ""

I'm curious why we're attaching the context rendered as a string? I would think attaching it as a dictionary is preferable?

Maybe something like:

Suggested change
record.context = f"{_params_str(ctx)}" if ctx else ""
record.context_msg = f" ({record.context})" if record.context else ""
record.context = ctx
record.context_msg = f" ({_params_str(ctx)})" if ctx else ""

@grusin-db (Author) replied:

I've put it as string to remove the quotes around name of key, it takes less space this way and feels easier to read:

d = { 'process': 'a', 'name': 'Alice', 'priority': 70}

# {'process': 'a', 'name': 'Alice', 'priority': 70}
print(d) 

# {'process': 'a', 'name': 'Alice', 'priority': 70}
print(repr(d)) 

# process = 'a', name = 'Alice', priority = 70
print(", ".join(f"{k} = {v!r}" for k, v in d.items()))

but all in all, I don't have a strong opinion.

Comment on lines +162 to +179
class LoggingThreadPoolExecutor(ThreadPoolExecutor):
"""ThreadPoolExecutor drop in replacement that will apply current loging context to all new started threads."""

def __init__(self, max_workers=None, thread_name_prefix="", initializer=None, initargs=()):
self.__current_context = current_context()
self.__wrapped_initializer = initializer

super().__init__(
max_workers=max_workers,
thread_name_prefix=thread_name_prefix,
initializer=self._logging_context_init,
initargs=initargs,
)

def _logging_context_init(self, *args):
_CTX.set(self.__current_context)
if self.__wrapped_initializer:
self.__wrapped_initializer(*args)

I was mildly surprised that we need this; I thought that the threading module and friends were updated with contextvars support, but it looks like only asyncio was retrofitted.

With this in mind, you're correct that context variables aren't properly transferred into execution contexts in other threads. I suspect this is partly because it's not always clear at which point the context should be captured.

For example, using a basic Thread you could choose to capture the context during __init__() or during start(). Ensuring this happens means doing something like:

class ContextThread(threading.Thread):
    def __init__(self, *args, **kw) -> None:
        self._captured_ctx = contextvars.copy_context()
        super().__init__(*args, **kw)

    def run(self):
        transferred_ctx = self._captured_ctx
        del self._captured_ctx
        return transferred_ctx.run(super().run)

Returning to the situation at hand: ThreadPoolExecutor. For this what really matters is the context when a task is submitted to the pool. That is, during .submit() the context of the caller should be captured so that when the submitted function is executed the captured context of the submitter is in place. (This is a bit fiddly, but my example above shows the basics of it… it's even cleaner because there's no temporary state required on the pool instance.)

I'd like to see this implemented for contextvars in general, irrespective of the logging changes that we're making. Maybe it should be a separate PR, just to limit the scope and ensure we get that right independently of these changes. What do you think?
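A sketch of the submit-time capture described above (`ContextThreadPoolExecutor` is a hypothetical name, not the PR's class):

```python
# Sketch of capturing the caller's context at submit() time, so the task runs
# on the worker thread inside the submitter's context. Hypothetical class name.
import contextvars
from concurrent.futures import ThreadPoolExecutor


class ContextThreadPoolExecutor(ThreadPoolExecutor):
    def submit(self, fn, /, *args, **kwargs):
        # Copy the caller's context now; the worker thread then executes the
        # task inside that snapshot via Context.run().
        ctx = contextvars.copy_context()
        return super().submit(ctx.run, fn, *args, **kwargs)


var: contextvars.ContextVar[str] = contextvars.ContextVar("var", default="unset")
var.set("submitter-value")

with ContextThreadPoolExecutor(max_workers=1) as pool:
    result = pool.submit(var.get).result()

print(result)  # submitter-value
```

Because a fresh copy is taken per `submit()` call, there is no shared mutable state on the pool instance, which is the "cleaner" property the comment above alludes to.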

@grusin-db (Author) replied:

I would prefer to patch this in this PR; otherwise it's quite hard to prove the usefulness of this feature when we don't have tests for multithreading, and only with multithreading do I see the benefits of this feature. In non-multithreaded code, just reading the logs top-down is enough to understand the context.

I can make it work in submit(); will update the code soon.

Comment on lines +108 to +113

# safe check, just in case injection filter is removed
context_repr = record.context if hasattr(record, "context") else ""
context_msg = f" {self.GRAY}({context_repr}){self.RESET}" if context_repr else ""

return f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}{context_msg}"

Hmm, how about?

Suggested change
# safe check, just in case injection filter is removed
context_repr = record.context if hasattr(record, "context") else ""
context_msg = f" {self.GRAY}({context_repr}){self.RESET}" if context_repr else ""
return f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}{context_msg}"
context_msg = getattr(record, "context_msg", "")
if context_msg:
    formatted = f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.GRAY}{context_msg}{self.RESET}"
else:
    formatted = f"{self.GRAY}{timestamp}{self.RESET} {level} {color_marker}[{name}]{thread_name} {msg}{self.RESET}"
return formatted


Here the plot thickens… ideally we'd capture the context (as discussed earlier) when the task to run is being created, but by the time we reach here that's already happened. That said, we should be capturing the caller context during task submission and ensuring that when each task runs on the other thread, it starts within the captured context. (The task may modify it further, of-course.)
