-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(dipu): add static_value_array and simplify code #920
base: main
Are you sure you want to change the base?
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,53 @@ | ||
#pragma once | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 是否应该放到 utils/ 目录下 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 我也在犹豫放哪里,有没有啥规范或者惯例,还有就是文件名字有没有更好的 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 附议 |
||
#include <array> | ||
#include <utility> | ||
|
||
namespace dipu { | ||
|
||
template <typename T, typename S> | ||
struct make_static_value_array {}; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 要是写新项目,我肯定不让你用下划线类名 |
||
|
||
template <typename T, std::size_t... I> | ||
struct make_static_value_array<T, std::index_sequence<I...>> { | ||
using value_type = std::decay_t<decltype(T::template value<0>)>; | ||
using type = std::array<value_type, sizeof...(I)>; | ||
type static inline constexpr value{T::template value<I>...}; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. static inline constexpr 为啥要放到 type 后面? 这是啥规范, 我咋觉得 放到前面更容易懂? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 这里是个人习惯,C++ 的修饰词太多了,auto、static、inline、constexpr、const、thread_local……通常来说我就习惯写成“类型+其它类型标识”的风格,例如“char const *” 这样。 有个好处是“从右往左”看时顺序非常清晰,比如这种情况: There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 类内定义方法和非特化模板函数都是隐式 inline,不需要加,下面也有几个 case 不重复提了 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 这里如果不写 inline,编译器就会要求这里只声明,需要在类外定义静态成员变量,否则违反 odr There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 哦哦 static 不是方法,下面几个应该是 |
||
}; | ||
|
||
// Create an array of static values. Each type T should provide a static | ||
// 'value()' function to return a reference of the underlying static variable. | ||
// | ||
// In this way, value can be lazily initialized. | ||
template <typename T, std::size_t N> | ||
// Note: in C++17 "inline constexpr" (without "static") is necessary to make | ||
// sure there is only one entity. | ||
auto inline constexpr make_static_value_array_v = | ||
make_static_value_array<T, std::make_index_sequence<N>>::value; | ||
|
||
// A CRTP helper to simplify access method of static value holder. | ||
// | ||
// Usage: | ||
// | ||
// ``` | ||
// struct integer : static_value_array<foo, 16> { | ||
// template <std::size_t I> auto static value() -> int & { | ||
// auto static instance = 0; | ||
// return instance; | ||
// } | ||
// }; | ||
// | ||
// Then we can use `integer::get(1)` to access `integer::value<1>()`. | ||
template <typename T, std::size_t N> | ||
struct static_value_array { | ||
auto static array() -> decltype(auto) { | ||
auto static constexpr instance = make_static_value_array_v<T, N>; | ||
return instance; | ||
} | ||
template <typename... A> | ||
auto static get(std::size_t index, A&&... args) -> decltype(auto) { | ||
return array()[index](std::forward<A>(args)...); | ||
} | ||
}; | ||
|
||
} // namespace dipu |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,102 +1,77 @@ | ||
#include "DIPUEventPool.h" | ||
|
||
#include <deque> | ||
#include <functional> | ||
#include <iostream> | ||
#include <mutex> | ||
|
||
namespace dipu { | ||
#include "csrc_dipu/base/basedef.h" | ||
#include "csrc_dipu/base/utility.h" | ||
#include "csrc_dipu/runtime/device/deviceapis.h" | ||
|
||
template <typename T> | ||
class EventPool final { | ||
protected: | ||
std::deque<T> event_pool_; | ||
unsigned int allocate_num_ = 0; | ||
namespace { | ||
|
||
std::function<void(T&)> allocator_; | ||
std::function<void(T&)> deleter_; | ||
using mutex_t = std::recursive_mutex; | ||
mutex_t event_mutex_; | ||
class EventPool { | ||
std::deque<dipu::deviceEvent_t> pool; | ||
std::recursive_mutex mutex; | ||
|
||
public: | ||
EventPool(const std::function<void(T&)>& allocator, | ||
const std::function<void(T&)>& deleter) | ||
: allocator_(allocator), deleter_(deleter) {} | ||
|
||
EventPool(const EventPool&) = delete; | ||
EventPool(EventPool&&) = delete; | ||
EventPool& operator=(const EventPool&) = delete; | ||
EventPool& operator=(EventPool&&) = delete; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 有锁作为成员,EventPool 自动变得无法拷贝和移动,所以我都删掉了 |
||
|
||
~EventPool() = default; | ||
|
||
void release() { | ||
std::lock_guard<mutex_t> _(event_mutex_); | ||
for (auto& event : event_pool_) { | ||
deleter_(event); | ||
allocate_num_--; | ||
} | ||
event_pool_.clear(); | ||
} | ||
|
||
void get(T& event) { | ||
bool need_allocator = false; | ||
void acquire(dipu::deviceEvent_t& event) { | ||
{ | ||
std::lock_guard<mutex_t> _(event_mutex_); | ||
if (event_pool_.empty()) { | ||
need_allocator = true; | ||
} else { | ||
event = event_pool_.back(); | ||
event_pool_.pop_back(); | ||
std::scoped_lock _(mutex); | ||
if (!pool.empty()) { | ||
event = pool.back(); | ||
pool.pop_back(); | ||
return; | ||
} | ||
} | ||
if (need_allocator) { | ||
allocator_(event); | ||
} | ||
|
||
dipu::devapis::createEvent(&event); | ||
} | ||
|
||
void release(dipu::deviceEvent_t& event) { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 新的名字确实比原来的规范 |
||
std::scoped_lock _(mutex); | ||
pool.emplace_back(event); | ||
} | ||
|
||
void restore(T& event) { | ||
std::lock_guard<mutex_t> _(event_mutex_); | ||
event_pool_.emplace_back(event); | ||
void clear() { // should it called inside destructor? | ||
std::scoped_lock _(mutex); | ||
for (auto& event : pool) { | ||
dipu::devapis::destroyEvent(event); | ||
} | ||
pool.clear(); | ||
} | ||
}; | ||
|
||
EventPool<deviceEvent_t>* getEventPool() { | ||
const int index = devproxy::current_device(); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 我是觉得 event pool 不应该调用 devproxy 命名空间的方法,所以把它移到外面了 |
||
// GlobalEventPool for different cards , construct when really needed | ||
#define dispatch_event_pool(device_id) \ | ||
if (index == (device_id)) { \ | ||
static EventPool<deviceEvent_t> gDIPUEventPool( \ | ||
[](deviceEvent_t& event) { devapis::createEvent(&event); }, \ | ||
[](deviceEvent_t& event) { devapis::destroyEvent(event); }); \ | ||
return &gDIPUEventPool; \ | ||
struct StaticEventPoolArray | ||
: dipu::static_value_array<StaticEventPoolArray, dipu::kMaxDeviceNumber> { | ||
template <std::size_t I> | ||
inline static auto& value() { | ||
auto static instance = EventPool(); | ||
return instance; | ||
} | ||
}; | ||
|
||
dispatch_event_pool(0); | ||
dispatch_event_pool(1); | ||
dispatch_event_pool(2); | ||
dispatch_event_pool(3); | ||
dispatch_event_pool(4); | ||
dispatch_event_pool(5); | ||
dispatch_event_pool(6); | ||
dispatch_event_pool(7); | ||
dispatch_event_pool(8); | ||
dispatch_event_pool(9); | ||
dispatch_event_pool(10); | ||
dispatch_event_pool(11); | ||
dispatch_event_pool(12); | ||
dispatch_event_pool(13); | ||
dispatch_event_pool(14); | ||
dispatch_event_pool(15); | ||
TORCH_CHECK(false, "support up to 16 cards"); | ||
auto event_pool(int index) -> EventPool& { | ||
TORCH_CHECK(0 <= index and index < dipu::kMaxDeviceNumber, | ||
"device index out of range"); | ||
return StaticEventPoolArray::get(index); | ||
} | ||
|
||
void getEventFromPool(deviceEvent_t& event) { getEventPool()->get(event); } | ||
} // namespace | ||
|
||
namespace dipu { | ||
|
||
void event_pool_acquire(int index, deviceEvent_t& event) { | ||
event_pool(index).acquire(event); | ||
} | ||
|
||
void restoreEventToPool(deviceEvent_t& event) { | ||
getEventPool()->restore(event); | ||
void event_pool_release(int index, deviceEvent_t& event) { | ||
event_pool(index).release(event); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 参数加了 index, 上下层的职责划分也比原来明确些。 |
||
} | ||
|
||
void releaseAllEvent() { getEventPool()->release(); } | ||
void event_pool_clear() { | ||
for (auto& getter : StaticEventPoolArray::array()) { | ||
getter().clear(); | ||
} | ||
} | ||
|
||
} // namespace dipu |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,10 +6,8 @@ | |
|
||
namespace dipu { | ||
|
||
void getEventFromPool(deviceEvent_t& event); | ||
|
||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 其实原来的驼峰也没哈问题,真没必要都改成下划线,纯个人习惯问题。。。 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 确实,这里是习惯了 |
||
void restoreEventToPool(deviceEvent_t& event); | ||
|
||
void releaseAllEvent(); | ||
void event_pool_acquire(int index, deviceEvent_t& event); | ||
void event_pool_release(int index, deviceEvent_t& event); | ||
void event_pool_clear(); | ||
|
||
} // namespace dipu |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -5,6 +5,8 @@ | |
#include <c10/core/Device.h> | ||
#include <c10/util/flat_hash_map.h> | ||
|
||
#include "csrc_dipu/base/basedef.h" | ||
#include "csrc_dipu/base/utility.h" | ||
#include "csrc_dipu/runtime/core/DIPUEvent.h" | ||
|
||
#include "DIPUAsyncResourcePool.h" | ||
|
@@ -213,48 +215,33 @@ struct RawAllocator<at::DeviceType::CPU> { | |
using type = DIPURawHostAllocator; | ||
}; | ||
|
||
template <typename AllocatorImpl, class AsyncMemPoolImpl, int device_id> | ||
c10::Allocator* get_allocator_impl(c10::Allocator* raw_allocator) { | ||
// Construct when really needed | ||
// async_mem_pool is used when cache_allocator being destructed so it should | ||
// be destructed after cache_allocator | ||
static AsyncMemPoolImpl async_mem_pool; | ||
static AllocatorImpl cache_allocator; | ||
static int n = [&]() { | ||
cache_allocator.set_raw_allocator(raw_allocator); | ||
cache_allocator.set_async_mem_pool(&async_mem_pool); | ||
return 0; | ||
}(); | ||
return &cache_allocator; | ||
} | ||
|
||
template <class AllocatorImpl, class AsyncMemPoolImpl> | ||
c10::Allocator* get_allocator(int device_id, c10::Allocator* raw_allocator) { | ||
#define DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(id) \ | ||
if (device_id == (id)) { \ | ||
return get_allocator_impl<AllocatorImpl, AsyncMemPoolImpl, id>( \ | ||
raw_allocator); \ | ||
template <typename AllocatorImpl, class AsyncMemPoolImpl> | ||
struct StaticAllocatorArray | ||
: dipu::static_value_array< | ||
StaticAllocatorArray<AllocatorImpl, AsyncMemPoolImpl>, | ||
dipu::kMaxDeviceNumber> { | ||
template <std::size_t I> | ||
inline static c10::Allocator* value(c10::Allocator* raw_allocator) { | ||
// Construct when really needed | ||
// async_mem_pool is used when cache_allocator being destructed so it should | ||
// be destructed after cache_allocator | ||
static AsyncMemPoolImpl async_mem_pool; | ||
static AllocatorImpl cache_allocator; | ||
static int n = [&]() { | ||
cache_allocator.set_raw_allocator(raw_allocator); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 这个 precondition 好强啊,没有任何警示吗 There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. 这块没想到怎么改合适……索性直接复制之前的实现 |
||
cache_allocator.set_async_mem_pool(&async_mem_pool); | ||
return 0; | ||
}(); | ||
return &cache_allocator; | ||
} | ||
}; | ||
|
||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(0); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(1); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(2); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(3); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(4); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(5); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(6); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(7); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(8); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(9); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(10); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(11); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(12); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(13); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(14); | ||
DIPU_ALLOCATOR_DISPATCH_DEVICE_ID(15); | ||
TORCH_CHECK(false, "support up to 16 cards"); | ||
template <class Allocator, class MemPool> | ||
c10::Allocator* get_allocator(int index, c10::Allocator* raw_allocator) { | ||
TORCH_CHECK(0 <= index and index < dipu::kMaxDeviceNumber, | ||
"device index out of range"); | ||
return StaticAllocatorArray<Allocator, MemPool>::get(index, raw_allocator); | ||
} | ||
#undef DIPU_ALLOCATOR_DISPATCH_DEVICE_ID | ||
|
||
#define DIPU_REGISTER_ALLOCATOR(name, device_type, CachingAllocator, \ | ||
algorithm, priority) \ | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
为啥要新建个 ns, 这和上面的不是一样的吗?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
我当时是看到两边命名风格不一样,所以隔开了一下