add mem alloc and derive macro #2299
Cargo.toml
Outdated
"common/io",
"common/management",
"common/mem",
"common/mem-derive",
Merge the common/mem-derive crate into common/mem? That would look cleaner.
OK, I will do that later. This PR is not finished yet, but I have started integrating it.
OK, this is just an early suggestion while the PR is in draft state :)
Thanks for the contribution! Please review the labels and make any necessary changes.
I updated them.
common/mem/Cargo.toml
Outdated
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
jemalloc-sys = { version = "0.3.2" }
jemalloc-sys is unmaintained. Change to tikv-jemalloc-sys.
Thanks for the guidance. I will check the license and code lint in the next commit.
@BohuTANG Excuse me, I have a question about debugging tests.
How can I see the output when debugging a databend unit test?
I guess we should use
let session_size = malloc_size(&session);

assert!(session_size > 3000);
👍
Codecov Report
@@ Coverage Diff @@
## main #2299 +/- ##
=======================================
+ Coverage 69% 70% +1%
=======================================
Files 647 624 -23
Lines 36262 34301 -1961
=======================================
- Hits 25041 24152 -889
+ Misses 11221 10149 -1072
Continue to review full report at Codecov.
        self.sessions.get_user_manager()
    }

    pub fn get_memory_usage(self: &Arc<Self>) -> usize {
Great job!!!
How do we get the memory usage of a query running inside a session?
For example:
SELECT number FROM numbers(10) GROUP BY number
-- This query creates a custom HashMap to perform the aggregation (it is not a Session field), which can use several megabytes.
For now we track only the Session object itself. #1148 (comment)
In detail, every member variable of Session that is not marked with ignore_malloc_size_of is counted. If the custom HashMap is stored within the Session object, it is counted too.
Any type that derives the MallocSizeOf trait can report its memory usage.
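As a rough illustration of the rule described above, the sketch below hand-writes what a derive of MallocSizeOf might expand to. The trait is a simplified stand-in, and the names heap_size, DemoSession, and total_size are hypothetical, not the PR's actual API. The skipped scratch field plays the role of a member marked with ignore_malloc_size_of.

```rust
use std::mem::size_of;

/// Simplified stand-in for the MallocSizeOf trait (assumption: the real
/// trait in the PR has a different method signature).
pub trait MallocSizeOf {
    /// Approximate heap bytes owned by `self`, not counting `self` itself.
    fn heap_size(&self) -> usize;
}

impl MallocSizeOf for u64 {
    fn heap_size(&self) -> usize {
        0 // plain integers own no heap memory
    }
}

impl MallocSizeOf for String {
    fn heap_size(&self) -> usize {
        self.capacity() // one byte per reserved slot
    }
}

impl<T: MallocSizeOf> MallocSizeOf for Vec<T> {
    fn heap_size(&self) -> usize {
        // The buffer itself plus whatever each element owns on the heap.
        self.capacity() * size_of::<T>()
            + self.iter().map(|item| item.heap_size()).sum::<usize>()
    }
}

/// Hypothetical session type used only for this example.
pub struct DemoSession {
    pub id: String,
    pub rows: Vec<u64>,
    /// Treated as if it were marked with ignore_malloc_size_of.
    pub scratch: Vec<u64>,
}

// What a derive macro could generate: sum every field that is not ignored.
impl MallocSizeOf for DemoSession {
    fn heap_size(&self) -> usize {
        self.id.heap_size() + self.rows.heap_size()
    }
}

/// Hypothetical helper: inline size of the value plus the heap it owns.
pub fn total_size<T: MallocSizeOf>(value: &T) -> usize {
    size_of::<T>() + value.heap_size()
}
```

With this shape, a HashMap used for aggregation would be counted only if it were a field of the session type and had its own MallocSizeOf impl, which matches the limitation discussed in this thread.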
OK, the current implementation can serve as the basic utility for counting the malloc size inside Session.
The current implementation is great. An idea (any opinions welcome):
In databend query, we use the tokio runtime to handle the interaction with user sessions (MySQL, ClickHouse). Maybe we can monitor the memory usage of the tokio runtime in some way?
For example:
struct RuntimeTracker {
    parent: Option<Arc<RuntimeTracker>>,
    memory_usage: AtomicUsize,
}
struct WorkerTracker {
    runtime_tracker: Arc<RuntimeTracker>,
}
thread_local! {
    worker_tracker: Option<WorkerTracker> = None;
}
impl WorkerTracker {
    pub fn instance() -> WorkerTracker {
        worker_tracker.unwrap()
    }
}
struct MyAllocator(DefaultAllocator);
impl Allocator for MyAllocator {
    #[inline]
    fn alloc(&self, layout: Layout) -> *mut u8 {
        // Charge the allocation to the runtime that owns the current worker thread.
        WorkerTracker::instance().runtime_tracker.memory_usage.fetch_add(layout.size(), Ordering::Relaxed);
        self.0.alloc(layout)
    }
    ...
}
// in common_base/runtime.rs:
impl Runtime {
    pub fn new() -> Result<Runtime> {
        // StdThread call Runtime::new()
        // RuntimeThread call Runtime::new()
        // Chain for runtime_tracker if need
        let outer_tracker = WorkerTracker::instance();
        let runtime_builder = ...;
        runtime_builder.on_thread_start(move || {
            worker_tracker.with(|tracker| tracker.runtime_tracker.parent = Some(outer_tracker.runtime_tracker));
        })
    }
    pub fn get_runtime_tracker(&self) -> RuntimeTracker {
        self.tracker
    }
}
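The counting step in the allocator idea above can be tried in isolation with the standard GlobalAlloc trait. This is a minimal sketch under stated assumptions: CountingAllocator is a hypothetical name, it wraps std::alloc::System rather than jemalloc, and it keeps one counter per allocator instance instead of per-runtime trackers.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical wrapper around the system allocator that counts live bytes.
pub struct CountingAllocator {
    in_use: AtomicUsize,
}

impl CountingAllocator {
    pub const fn new() -> Self {
        Self {
            in_use: AtomicUsize::new(0),
        }
    }

    /// Bytes currently allocated through this wrapper and not yet freed.
    pub fn in_use(&self) -> usize {
        self.in_use.load(Ordering::Relaxed)
    }
}

unsafe impl GlobalAlloc for CountingAllocator {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        let ptr = unsafe { System.alloc(layout) };
        if !ptr.is_null() {
            // Only count allocations that actually succeeded.
            self.in_use.fetch_add(layout.size(), Ordering::Relaxed);
        }
        ptr
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) };
        self.in_use.fetch_sub(layout.size(), Ordering::Relaxed);
    }
}

// To count every allocation in a program, an instance could be registered:
// #[global_allocator]
// static GLOBAL: CountingAllocator = CountingAllocator::new();
```

The counter must not allocate while updating itself, which is why a plain AtomicUsize is used here; looking up a thread-local tracker inside alloc, as proposed above, would need the same care.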
Since this borrows a lot of code from https://crates.io/crates/parity-util-mem, is it possible to use the crate directly?
now we will track Session object itself. #1148 (comment)
For detail, any member variable in Session without macro ignore_malloc_size_of will be computed. If the custom Hashmap to store within the session object, it can be computed.
Any object that derive the trait MallocSizeOf can get their memory usage.
Sorry, I missed this comment. Tracking only the Session object in this PR is OK.
@zhang2014 I understand what you mean, but I am not clear on why RuntimeTracker has a parent field.
The idea is great! We could track all the memory we are interested in. Maybe we can finish it in another PR?
@sundy-li Maybe we should not use this crate directly. Here are the reasons:
- For common/mem/mem-derive, we may not be able to use it directly because we need to change the derive macro for MallocSizeOf so that it refers to databend's private crate name (code in here). The macro's output depends on the crate name.
- For common/mem/mem-allocator, the reason is similar. If we use the crate from https://crates.io/crates/parity-util-mem directly, we cannot implement the MallocSizeOf trait for our own types, because mod alloc_size is private in parity-util-mem (here) and databend has its own Mutex impl (here).
Another reason is that this crate defines a global memory allocator (here), so we may not be able to define databend's own allocator if we later want to extend functionality through databend's global allocators.
If you have any concerns, please let me know.
@zhang2014 I understand what you are meaning, but I am not clear about why RuntimeTracker has a parent field ?
We need recursive tracking if a runtime is created again inside another runtime's worker.
let runtime_1 = Runtime::create()?;
runtime_1.spawn(async {
    let runtime_2 = Runtime::create()?;
    runtime_2.spawn(async {
        // alloc memory
    });
});
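One way to read the parent field in light of the nested runtimes above is as a chain that lets an inner runtime's allocations also be charged to the outer one. The sketch below shows only that bookkeeping, with no tokio involved. The constructor and the record_alloc and memory_usage methods are hypothetical names, not taken from the proposal.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Arc;

/// Tracker with an optional parent, mirroring the struct proposed in this thread.
pub struct RuntimeTracker {
    parent: Option<Arc<RuntimeTracker>>,
    memory_usage: AtomicUsize,
}

impl RuntimeTracker {
    /// Hypothetical constructor: pass the enclosing runtime's tracker, if any.
    pub fn new(parent: Option<Arc<RuntimeTracker>>) -> Arc<Self> {
        Arc::new(Self {
            parent,
            memory_usage: AtomicUsize::new(0),
        })
    }

    /// Add `bytes` to this tracker and to every ancestor in the chain.
    pub fn record_alloc(&self, bytes: usize) {
        let mut current = Some(self);
        while let Some(tracker) = current {
            tracker.memory_usage.fetch_add(bytes, Ordering::Relaxed);
            current = tracker.parent.as_deref();
        }
    }

    pub fn memory_usage(&self) -> usize {
        self.memory_usage.load(Ordering::Relaxed)
    }
}
```

For example, if the tracker of runtime_2 is created with the tracker of runtime_1 as its parent, memory recorded on the inner tracker shows up in both totals, while memory recorded on the outer tracker stays out of the inner one.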
CI Passed
Thank you @Veeupup
I hereby agree to the terms of the CLA available at: https://databend.rs/policies/cla/
Summary
Summary about this PR:
Changelog
Related Issues
Fixes #1148
Test Plan
Unit Tests
Stateless Tests