Skip to content

Fix: prevent signed integer overflow in OP_MSG message sizes#2693

Open
HyperPS wants to merge 11 commits intomongodb:masterfrom
HyperPS:fix/int32-overflow-opmsg
Open

Fix: prevent signed integer overflow in OP_MSG message sizes#2693
HyperPS wants to merge 11 commits intomongodb:masterfrom
HyperPS:fix/int32-overflow-opmsg

Conversation

@HyperPS
Copy link

@HyperPS HyperPS commented Jan 30, 2026

Changes in this PR

This change fixes multiple signed integer overflow risks when computing MongoDB wire protocol message lengths in the C extension (pymongo/_cmessagemodule.c).

Previously, several message size calculations were performed using int and written directly into int32 fields without validating bounds. In edge cases involving large buffers or payloads, this could lead to signed integer truncation, incorrect message lengths, and potential memory corruption.

This patch:

  • Introduces a shared _check_int32_size() helper to validate all computed
    message and section sizes before downcasting to int32_t
  • Converts intermediate size calculations to size_t
  • Applies consistent bounds checks across:
    • OP_QUERY
    • OP_GET_MORE
    • OP_MSG (including type 1 payloads)
    • Batched OP_MSG writes
    • Batched legacy write commands

Security Impact

This issue was reported via Huntr as a potential integer overflow leading to malformed MongoDB wire protocol messages. While exploitation requires crafted inputs, validating message sizes defensively prevents undefined behavior and improves robustness of the PyMongo C extension.

Huntr report:

@HyperPS HyperPS requested a review from a team as a code owner January 30, 2026 15:31
@HyperPS HyperPS requested a review from sleepyStick January 30, 2026 15:31
Copy link
Contributor

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for identifying this code improvement.

Overall, Logic looks good, but please fix formatting to match _cmessagemodule.c and confirm all length writes now use _check_int32_size.

@HyperPS
Copy link
Author

HyperPS commented Feb 1, 2026

This fixes signed int32 overflow issues in OP_MSG and buffer size calculations by validating message and section lengths before casting to int32, preventing integer truncation and potential memory corruption.
Please let me know if you’d like any adjustments.

HyperPS and others added 2 commits February 1, 2026 15:14
Co-authored-by: Jib <Jibzade@gmail.com>
Co-authored-by: Jib <Jibzade@gmail.com>
@HyperPS HyperPS requested a review from Jibola February 1, 2026 20:24
@HyperPS
Copy link
Author

HyperPS commented Feb 21, 2026

Sir @Jibola — any updates on this please? Could you also confirm whether the Huntr report has been fully validated and if this is being tracked as a potential CVE-worthy issue, given the integer overflow → possible memory corruption risk?

Copy link
Member

@ShaneHarvey ShaneHarvey left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From a quick review it looks like we're changing a lot of places that either can't overflow/underflow or already have proper checks. Could you clarify my questions?

@ShaneHarvey
Copy link
Member

It looks like we only call buffer_write_bytes_ssize_t in a few places and only with the collection or id name:

$ git grep -A2 buffer_write_bytes_ssize_t | tee
pymongo/_cmessagemodule.c:61:static int buffer_write_bytes_ssize_t(buffer_t buffer, const char* data, Py_ssize_t size) {
pymongo/_cmessagemodule.c-62-    int downsize = _downcast_and_check(size, 0);
pymongo/_cmessagemodule.c-63-    if (size == -1) {
--
pymongo/_cmessagemodule.c:115:        !buffer_write_bytes_ssize_t(buffer, collection_name,
pymongo/_cmessagemodule.c-116-                                    collection_name_length + 1) ||
pymongo/_cmessagemodule.c-117-        !buffer_write_int32(buffer, (int32_t)num_to_skip) ||
--
pymongo/_cmessagemodule.c:191:        !buffer_write_bytes_ssize_t(buffer,
pymongo/_cmessagemodule.c-192-                                    collection_name,
pymongo/_cmessagemodule.c-193-                                    collection_name_length + 1) ||
--
pymongo/_cmessagemodule.c:290:        if (!buffer_write_bytes_ssize_t(buffer, identifier, identifier_length + 1)) {
pymongo/_cmessagemodule.c-291-            goto fail;
pymongo/_cmessagemodule.c-292-        }
--
pymongo/_cmessagemodule.c:716:        !buffer_write_bytes_ssize_t(buffer, ns, ns_len + 1) || /* namespace */
pymongo/_cmessagemodule.c-717-        !buffer_write_bytes(buffer,
pymongo/_cmessagemodule.c-718-                            "\x00\x00\x00\x00"  /* skip */

Even if we set the collection name to be larger that a >32 int (so that it will overflow), the C extensions properly error:

>>> name = '1'*1024*1024*1024*2
>>> len(name)
2147483648
>>> coll = client.t[name]
>>> coll.insert_one({})
Traceback (most recent call last):
  File "<python-input-23>", line 1, in <module>
    coll.insert_one({})
    ~~~~~~~~~~~~~~~^^^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/collection.py", line 891, in insert_one
    self._insert_one(
    ~~~~~~~~~~~~~~~~^
        document,
        ^^^^^^^^^
    ...<5 lines>...
        comment=comment,
        ^^^^^^^^^^^^^^^^
    ),
    ^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/collection.py", line 831, in _insert_one
    self._database.client._retryable_write(
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^
        acknowledged, _insert_command, session, operation=_Op.INSERT
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/mongo_client.py", line 2083, in _retryable_write
    return self._retry_with_session(retryable, func, s, bulk, operation, operation_id)
           ~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/mongo_client.py", line 1968, in _retry_with_session
    return self._retry_internal(
           ~~~~~~~~~~~~~~~~~~~~^
        func=func,
        ^^^^^^^^^^
    ...<4 lines>...
        operation_id=operation_id,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/shane/git/mongo-python-driver/pymongo/_csot.py", line 125, in csot_wrapper
    return func(self, *args, **kwargs)
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/mongo_client.py", line 2014, in _retry_internal
    ).run()
      ~~~^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/mongo_client.py", line 2763, in run
    return self._read() if self._is_read else self._write()
                                              ~~~~~~~~~~~^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/mongo_client.py", line 2899, in _write
    return self._func(self._session, conn, self._retryable)  # type: ignore
           ~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/collection.py", line 819, in _insert_command
    result = conn.command(
        self._database.name,
    ...<5 lines>...
        retryable_write=retryable_write,
    )
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/helpers.py", line 47, in inner
    return func(*args, **kwargs)
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/pool.py", line 426, in command
    self._raise_connection_failure(error)
    ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^^^^^^^
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/pool.py", line 398, in command
    return command(
        self,
    ...<20 lines>...
        write_concern=write_concern,
    )
  File "/Users/shane/git/mongo-python-driver/pymongo/synchronous/network.py", line 148, in command
    request_id, msg, size, max_doc_size = message._op_msg(
                                          ~~~~~~~~~~~~~~~^
        flags, spec, dbname, read_preference, codec_options, ctx=compression_ctx
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/Users/shane/git/mongo-python-driver/pymongo/message.py", line 419, in _op_msg
    return _op_msg_uncompressed(flags, command, identifier, docs, opts)
bson.errors.InvalidStringData: String length must be <= 2147483647

So I think it's a bug fix that we should merge but clarify that the bug can never actually happen in practice.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR updates the PyMongo C extension’s message-building code paths, primarily around validating/handling downcasted sizes before writing to the internal buffer.

Changes:

  • Fixes buffer_write_bytes_ssize_t to correctly treat _downcast_and_check() failures as errors.
  • Removes/adjusts a few stray whitespace-only lines (plus a trailing newline change in bson/buffer.c).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
pymongo/_cmessagemodule.c Corrects error handling for downcasted byte lengths in a buffer write helper; minor whitespace cleanup.
bson/buffer.c Adds a trailing newline (no functional change).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 61 to 66
static int buffer_write_bytes_ssize_t(buffer_t buffer, const char* data, Py_ssize_t size) {
int downsize = _downcast_and_check(size, 0);
if (size == -1) {
if (downsize == -1) {
return 0;
}
return buffer_write_bytes(buffer, data, downsize);
Copy link

Copilot AI Feb 26, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR title/description indicate broader signed integer overflow hardening (e.g., introducing a shared _check_int32_size() helper and converting OP_MSG/OP_QUERY/etc size computations to size_t with int32 bounds checks). In this diff/repo state, those changes are not present (no _check_int32_size symbol, and message length/payload length calculations still downcast directly to int32_t without centralized bounds validation). Please either include the missing overflow-checking changes or update the PR title/description to match the actual scope.

Copilot uses AI. Check for mistakes.
Copy link
Contributor

@Jibola Jibola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@HyperPS Thank you for your contribution, I've got a couple more changes I'd need reverted on this PR before it's good to go.

To answer your earlier question, after thorough review by our team, we've concluded this issue is not indicative of a vulnerability, nor is the finalized code change representative of a bug present within the intended and expected usage of the PyMongo library. As such, while we do appreciate the outline and assistance in identifying a code change we plan to move forward with, the thesis that this mitigates memory corruption is false as we have data validation checks either before or after the CPython stage call.

For future reference, if you do believe you have found a vulnerability, please go follow the vulnerability report steps found here as that is our official channel for submitting reports such as this.

Again, thank you for your contribution to our codebase hygiene!

void pymongo_buffer_update_position(buffer_t buffer, buffer_position new_position) {
buffer->position = new_position;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -352,7 +351,6 @@ _set_document_too_large(int size, long max) {
#define _DELETE 2

/* OP_MSG ----------------------------------------------- */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* OP_MSG ----------------------------------------------- */
/* OP_MSG ----------------------------------------------- */

@@ -333,7 +333,6 @@ static PyObject* _cbson_op_msg(PyObject* self, PyObject* args) {
return result;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -658,7 +656,6 @@ _cbson_batched_op_msg(PyObject* self, PyObject* args) {
}

/* End OP_MSG -------------------------------------------- */
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/* End OP_MSG -------------------------------------------- */
/* End OP_MSG -------------------------------------------- */

@@ -862,7 +859,6 @@ _batched_write_command(
Py_XDECREF(iterator);
return 0;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
}
}

@@ -1017,7 +1013,6 @@ _cmessage_exec(PyObject *m)
INITERROR;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -1029,7 +1024,6 @@ static PyModuleDef_Slot _cmessage_slots[] = {
{0, NULL},
};

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change

@@ -1046,4 +1040,3 @@ PyMODINIT_FUNC
PyInit__cmessage(void)
{
return PyModuleDef_Init(&moduledef);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
return PyModuleDef_Init(&moduledef);
return PyModuleDef_Init(&moduledef);
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants