-
Notifications
You must be signed in to change notification settings - Fork 4
TBaseString: Fix possible override #41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
WalkthroughInitialize TStringBase's inline buffer first element to '\0'; when copying, allocate one extra element (size+1), set capacity accordingly, and write a null terminator at index size after memcpy. Tests updated to expect default capacity 512 for Changes
Estimated code review effort🎯 2 (Simple) | ⏱️ ~10 minutes Poem
Tip 🔌 Remote MCP (Model Context Protocol) integration is now available!Pro plan users can now connect to remote MCP servers from the Integrations page. Connect with popular remote MCPs such as Notion and Linear to add more context to your reviews and chats. ✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR/Issue comments)Type Other keywords and placeholders
CodeRabbit Configuration File (
|
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
include/cppcore/Common/TStringBase.h (1)
162-183
: Equality can break after set/append because mHashId is not updated
set()
andoperator+=
callcopyFrom
, butcopyFrom
does not updatemHashId
. Sinceoperator==
short-circuits to false on differing hashes, comparing a string built withset()
to one built via the constructor (which hashes) will likely fail even when contents match.Centralize hash maintenance in
copyFrom
so all writers keepmHashId
in sync. This preserves existing constructor behavior (double-hashing is harmless; you can remove it later if desired).Apply this diff:
inline void TStringBase<T>::copyFrom(TStringBase<T> &base, const T *ptr, size_t size) { @@ memcpy(targetPtr, ptr, size * sizeof(T)); targetPtr[size] = T(0); base.mSize = size; + // Keep hash consistent for all write paths (ctor, set, operator+=, etc.) + base.mHashId = THash<HashId>::toHash(base.c_str(), size); }Optionally, you can remove the redundant hashing in the constructor:
inline TStringBase<T>::TStringBase(const T *ptr, size_t size) { copyFrom(*this, ptr, size); - mHashId = THash<HashId>::toHash(c_str(), size); }
If you prefer not to touch
copyFrom
, add the hashing line inset()
and bothoperator+=
overloads instead. I can provide that patch too.
🧹 Nitpick comments (4)
include/cppcore/Common/TStringBase.h (2)
95-95
: Prefer value-initialization over char literal for template generality
T mBuffer[InitSize] = {'\0'};
works forchar
but subtly couples the template to character-likeT
. Using{}
value-initializes the entire buffer to zero for any trivially value-initializableT
(includingwchar_t
,char16_t
, etc.) and reads clearer.Apply this diff:
- T mBuffer[InitSize] = {'\0'}; + T mBuffer[InitSize] = {};
121-126
: Ensure empty c_str() stays NUL-terminated after reset
reset()
setsmSize = 0
but doesn’t explicitly write a terminator into the active buffer. If the active buffer is the small buffer and previously held data,c_str()
will still return a pointer to an old non-empty C-string even thoughsize() == 0
. This is surprising and can lead to misuse by callers that trustc_str()
.Consider forcing the first element to zero in
reset()
:inline void TStringBase<T>::reset() { mSize = 0u; mHashId = 0; + if (mStringBuffer == nullptr) { + mBuffer[0] = T(0); + } else if (mCapacity > 0) { + mStringBuffer[0] = T(0); + } }test/common/TStringBaseTest.cpp (2)
47-47
: Brittle assertion on exact capacityAsserting an exact capacity (512) tightly couples the test to an internal implementation detail and will break on legitimate tuning. Prefer asserting invariants (e.g., capacity is at least the small-buffer size or doesn’t shrink across reset) instead of the precise number.
For example:
- EXPECT_GE(string_1.capacity(), 4);
- EXPECT_EQ_AFTER_RESET(capacity, initial_capacity); // if you capture and compare
I can refactor this test to verify behavior rather than the literal constant.
57-57
: Same as above: avoid hard-coding InitSize in testsMirrors the previous comment. Unless you need to lock the ABI, asserting the exact small-buffer capacity makes tests noisy for non-functional changes.
Recommendation: assert relative properties (e.g., capacity unchanged by clear; capacity sufficient for the written data) rather than the magic number 512.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
include/cppcore/Common/TStringBase.h
(2 hunks)test/common/TStringBaseTest.cpp
(2 hunks)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
include/cppcore/Common/TStringBase.h (4)
121-126
: reset() should preserve the c_str() invariant (always NUL-terminated)After reset(), c_str() may still point to mStringBuffer with stale data at index 0. Ensure the first element is 0 on the active buffer so c_str() is always a valid empty string.
inline void TStringBase<T>::reset() { mSize = 0u; mHashId = 0; + // Keep storage but enforce empty-string invariant for c_str() + if (mStringBuffer) { + mStringBuffer[0] = T(0); + } else { + mBuffer[0] = T(0); + } }
205-213
: operator+= overwrites instead of appendingoperator+=(const TStringBase&) currently replaces the content with rhs instead of appending it. This is a semantic bug and will violate user expectations.
If you want to keep copyFrom() as the central routine, implement append here:
inline TStringBase<T> &TStringBase<T>::operator += (const TStringBase<T> &rhs) { if (rhs.isEmpty()) { return *this; } - - copyFrom(*this, rhs.c_str(), rhs.size()); + const size_t newSize = mSize + rhs.mSize; + // Ensure capacity (including terminator) + if (newSize + 1 > mCapacity) { + T* newBuf = new T[newSize + 1]; + // copy existing content + memcpy(newBuf, c_str(), mSize * sizeof(T)); + // copy rhs + memcpy(newBuf + mSize, rhs.c_str(), rhs.mSize * sizeof(T)); + newBuf[newSize] = T(0); + if (mStringBuffer) delete [] mStringBuffer; + mStringBuffer = newBuf; + mCapacity = newSize + 1; + } else { + // Append into active buffer + T* dst = mStringBuffer ? mStringBuffer : mBuffer; + memcpy(dst + mSize, rhs.c_str(), rhs.mSize * sizeof(T)); + dst[newSize] = T(0); + } + mSize = newSize; + mHashId = THash<HashId>::toHash(c_str(), mSize); return *this; }Optionally, extract the growth logic into a private ensureCapacity(size_t needed) helper to avoid duplication.
216-219
: operator+=(char) also overwrites instead of appendingSame semantic issue as the string append. This should append one character, not replace the whole content.
inline TStringBase<T>& TStringBase<T>::operator+=(char c) { - copyFrom(*this, &c, 1); - return *this; + const size_t newSize = mSize + 1; + if (newSize + 1 > mCapacity) { + T* newBuf = new T[newSize + 1]; + memcpy(newBuf, c_str(), mSize * sizeof(T)); + if (mStringBuffer) delete [] mStringBuffer; + mStringBuffer = newBuf; + mCapacity = newSize + 1; + } + T* dst = mStringBuffer ? mStringBuffer : mBuffer; + dst[mSize] = static_cast<T>(c); + dst[newSize] = T(0); + mSize = newSize; + mHashId = THash<HashId>::toHash(dst, mSize); + return *this; }
221-224
: operator[] reads from the wrong buffer (potentially garbage)operator[] always indexes mBuffer even when the active storage is mStringBuffer. This will return stale/garbage data for heap-backed strings.
inline T TStringBase<T>::operator[](size_t index) const { - return mBuffer[index]; + // Optionally add bounds checks in debug builds + return c_str()[index]; }
♻️ Duplicate comments (1)
include/cppcore/Common/TStringBase.h (1)
167-183
: Fix remaining overflow risk, active-buffer copy bug, and keep hash in syncThree issues persist in copyFrom:
- Capacity check ignores the terminator. If size == mCapacity (e.g., stack buffer of 512 and size 512), you write targetPtr[size] past the end. Condition must be size + 1 > mCapacity. [duplicate of prior review, still unresolved]
- targetPtr is always initialized to mBuffer, so when mStringBuffer already exists and has enough capacity you still copy into mBuffer, but c_str() continues to point to mStringBuffer (stale content). Choose the active buffer first.
- For size == 0, you return early without guaranteeing that c_str() points to a valid empty, null-terminated buffer. Ensure [0] = 0 on the active buffer.
- Minor: write the terminator as T(0) instead of '\0' for type correctness.
- Keep mHashId consistent; otherwise operator== can spuriously fail.
Apply this diff:
- T *targetPtr = base.mBuffer; - if (size == 0) { - return; - } - - if (size > base.mCapacity) { + // Use the active buffer and ensure capacity includes room for the terminator + T *targetPtr = base.mStringBuffer ? base.mStringBuffer : base.mBuffer; + if (size == 0) { + targetPtr[0] = T(0); + base.mSize = 0; + base.mHashId = 0; // empty string; adjust if your hash for empty is non-zero + return; + } + + if (size + 1 > base.mCapacity) { if (base.mStringBuffer != nullptr) { delete [] base.mStringBuffer; } - base.mStringBuffer = new T[size + 1]; - base.mCapacity = size + 1; + base.mStringBuffer = new T[size + 1]; + base.mCapacity = size + 1; targetPtr = base.mStringBuffer; } memcpy(targetPtr, ptr, size * sizeof(T)); - targetPtr[size] = '\0'; + targetPtr[size] = T(0); base.mSize = size; + base.mHashId = THash<HashId>::toHash(base.c_str(), base.mSize);
🧹 Nitpick comments (2)
include/cppcore/Common/TStringBase.h (2)
95-95
: Prefer T(0) over '\0' for template-friendly zero-initializationUsing '\0' (char) in a templated array can trigger narrowing or warnings when T ≠ char. Use T(0) (or simply {}) to make intent explicit and type-safe.
- T mBuffer[InitSize] = {'\0'}; + T mBuffer[InitSize] = {T(0)};
25-25
: Use instead of "string.h" in C++ headersPrefer the C++ header and qualified names (std::memcpy/std::memcmp). This avoids namespace pollution and is more idiomatic in C++.
-#include "string.h" +#include <cstring>Optionally:
- memcpy(targetPtr, ptr, size * sizeof(T)); + std::memcpy(targetPtr, ptr, size * sizeof(T));and similarly for memcmp.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (1)
include/cppcore/Common/TStringBase.h
(2 hunks)
Summary by CodeRabbit
Improvements
Bug Fixes
Tests