[Python Client] Sensitive Data Exposure in Debug Logs - No Built-in Redaction Mechanism #1025

Open
2 tasks done
ganeshrvel opened this issue Jan 4, 2025 · 3 comments

Comments

@ganeshrvel

  • I confirm this is a bug with Supabase, not with my own application.
  • I confirm I have searched the Docs, GitHub Discussions, and Discord.

Describe the bug

The Supabase Python client exposes sensitive data (tokens, query parameters) in debug logs without providing any built-in mechanism to redact this information. This was previously reported in discussion https://github.com/orgs/supabase/discussions/31019 but remains unresolved. This is a security concern as sensitive tokens and data are being logged in plaintext, potentially exposing them in log files.

To Reproduce

  1. Set up a Python application using the Supabase client
  2. Enable debug logging for the client
  3. Make any API call that includes sensitive data (like authentication tokens)
  4. Check debug logs to see exposed sensitive information:
import logging
import supabase

# Configure logging
logging.basicConfig(level=logging.DEBUG)

# Initialize Supabase client
client = supabase.create_client(...)

# Make any API call
result = client.from_('sensitive_table').select('*').execute()

The debug logs will show sensitive information like:

[DEBUG] [hpack.hpack] Decoded (b'content-location', b'/sensitive_table?sensitive_token=eq.abc-1234-567899888-23333-33333-333333-333333')

Expected behavior

The Supabase Python client should:

  1. Provide built-in configuration options to redact sensitive data in debug logs
  2. Either mask sensitive tokens and parameters by default, or
  3. Provide clear documentation on how to configure logging so that sensitive data is protected

System information

  • OS: Linux
  • Version of supabase-py: latest
  • Version of Python: 3.11

Additional context

Standard Python logging filters don't work effectively as the logs are generated by underlying libraries (httpx, httpcore, hpack). This is a security issue that needs proper handling at the client library level. Custom filters like:

import logging
import re

class SensitiveDataFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        if isinstance(record.msg, str):
            record.msg = re.sub(r"abc-[0-9a-f\-]+", "[REDACTED-TOKEN]", record.msg)
        return True

don't fully address the issue as they can't catch all instances of sensitive data exposure.

This issue was previously raised in discussion https://github.com/orgs/supabase/discussions/31019 without any resolution, hence filing it as a bug report given its security implications.
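
One reason a filter like the one above can miss library output is a documented property of Python's logging module: a filter attached to a logger is consulted only for records logged directly on that logger, while a filter attached to a handler sees every record that handler receives, including records propagated from httpx, httpcore, and hpack. The following is a minimal sketch of that handler-level approach, not a fix provided by supabase-py. The class name `RedactingFilter` is hypothetical, and the token pattern is the example format from this report. The filter redacts the fully formatted message, so tokens inside bytes, tuples, or URL objects passed as log arguments are covered as well.

```python
import logging
import re

# Assumption: tokens look like the example in this report.
TOKEN_PATTERN = re.compile(r"abc-[0-9a-f\-]+")


class RedactingFilter(logging.Filter):
    """Hypothetical helper that redacts the fully formatted log message."""

    def filter(self, record: logging.LogRecord) -> bool:
        # getMessage() merges msg and args, so values hidden inside
        # bytes, tuples, or URL objects are plain text at this point.
        message = record.getMessage()
        record.msg = TOKEN_PATTERN.sub("[REDACTED-TOKEN]", message)
        record.args = None  # args are already merged into msg
        return True


# Attach the filter to the handler, not to an individual logger, so it
# applies to records from every logger that propagates to the root.
handler = logging.StreamHandler()
handler.addFilter(RedactingFilter())
logging.basicConfig(level=logging.DEBUG, handlers=[handler])
```

The trade-off is that the record is rewritten in place, so any handler that processes the record afterwards also sees the redacted text, and handlers without the filter still see the original.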

@ganeshrvel added the "bug" (Something isn't working) label Jan 4, 2025
@juancarlospaco
Contributor

juancarlospaco commented Jan 6, 2025

The DEBUG log level should not be used in production; it should only be used for debugging. It is meant to "print as much as possible" for debugging purposes, and performance may also be poor in DEBUG mode.

Please don't use DEBUG mode for public stuff and you should be OK.

@silentworks removed the "bug" (Something isn't working) label Jan 8, 2025
@silentworks
Contributor

Yes, don't use DEBUG mode in production or for anything public, as @juancarlospaco said.

If you are looking for a filter that works with the INFO logger, you can use this. A lot of this code was lifted from this PR: supabase/realtime-py#217

import logging
import re

import httpx

# Token pattern from the example in this issue; adjust it to match your own secrets
TOKEN_PATTERN = re.compile(r"abc-[0-9a-f\-]+")


class SensitiveDataFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        record.msg = self.sanitize_line(record.msg)
        record.args = self.sanitize_args(record.args)
        return True

    @staticmethod
    def sanitize_args(d):
        # build new containers so the caller's original objects are never modified
        if isinstance(d, dict):
            return {k: SensitiveDataFilter.sanitize_line(v) for k, v in d.items()}
        if isinstance(d, tuple):
            return tuple(SensitiveDataFilter.sanitize_line(v) for v in d)
        return d

    @staticmethod
    def sanitize_line(line):
        if isinstance(line, str):
            return TOKEN_PATTERN.sub("[REDACTED-TOKEN]", line)
        if isinstance(line, httpx.URL):
            # httpx passes the request URL as an httpx.URL object, not a str
            return httpx.URL(TOKEN_PATTERN.sub("[REDACTED-TOKEN]", str(line)))
        return line  # leave other types (ints, None, ...) untouched


# Applying the filter
logging.getLogger("httpx").addFilter(SensitiveDataFilter())

# Configure logging
logging.basicConfig(level=logging.INFO)
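
The pattern abc-[0-9a-f\-]+ used in the filter above matches only the example token format from the report. Where real tokens have no fixed shape, one alternative is to redact by query parameter name instead. The sketch below illustrates that idea under stated assumptions: the parameter names in `SENSITIVE_PARAMS` and the helper names are hypothetical, and the pattern also removes the PostgREST operator prefix (such as `eq.`) along with the value.

```python
import re

# Assumption: names of query parameters whose values must never be logged.
SENSITIVE_PARAMS = ("sensitive_token", "apikey")


def build_param_pattern(names):
    """Build a regex matching '<name>=<value>' for the given parameter names."""
    joined = "|".join(re.escape(name) for name in names)
    # The value runs until the next '&', whitespace, or quote character.
    return re.compile(rf"\b({joined})=([^&\s'\"]+)")


PARAM_PATTERN = build_param_pattern(SENSITIVE_PARAMS)


def redact_params(text: str) -> str:
    """Replace the value of each sensitive parameter, keeping its name."""
    return PARAM_PATTERN.sub(r"\1=[REDACTED]", text)


print(redact_params("/sensitive_table?sensitive_token=eq.abc-1234&limit=10"))
```

A function like `redact_params` could be called from a logging filter in place of, or in addition to, the token pattern.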

@ganeshrvel
Author

@silentworks We don't use DEBUG mode in production, as @juancarlospaco mentioned. This was mainly in reference to dev builds.
