implement pthread_workqueue within libdispatch #206
Conversation
This is only a first pass, and I haven't tried to grok the algorithms yet.
My biggest remark is with naming. Short term, FWIW, I would have namespaced everything in that module as _dispatch_wq_*, which would yield function names such as _dispatch_wq_add_workqueue instead of _dispatch_workqueue_add_wq.
Second, a workqueue is not the pool; it is the thread that is used to run queues. I think the pools should be called thread pools (dispatch_wq_pool is fine since, as said, a work queue is a thread, so there is no need to qualify it as dispatch_wq_thread_pool), or some such, to make the code easier to read for newcomers.
If you don't like dispatch_wq_ you can use dispatch_workq_, which isn't that much longer and would pair nicely with the namespace of many of the pre-existing WORKQ_* flags of the interface.
I'll try to have a more thoughtful look at the various algorithms, atomics barriers and the like later, but that all looks very promising!
src/queue.c (outdated):

    @@ -159,7 +160,7 @@ struct dispatch_root_queue_context_s {
     #if HAVE_PTHREAD_WORKQUEUES
     	qos_class_t dgq_qos;
     	int dgq_wq_priority, dgq_wq_options;
    -#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL
    +#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL || HAVE_INTERNAL_PTHREAD_WORKQUEUE
80 col wrap
src/queue.c (outdated):

    @@ -719,7 +720,7 @@ _dispatch_root_queues_init_workq(int *wq_supported)
     		result = !r;
     	}
     #endif // HAVE_PTHREAD_WORKQUEUE_SETDISPATCH_NP
    -#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL
    +#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL || HAVE_INTERNAL_PTHREAD_WORKQUEUE
ditto
src/queue.c (outdated):

    #if HAVE_INTERNAL_PTHREAD_WORKQUEUE
    	if (!disable_wq && qc->dgq_wq_priority != WORKQ_PRIO_INVALID) {
    		int priority = qc->dgq_wq_priority;
    		int overcommit = qc->dgq_wq_options & WORKQ_ADDTHREADS_OPTION_OVERCOMMIT;
80 columns
src/workqueue/workqueue.c (outdated):

    @@ -0,0 +1,849 @@
    +/*
    + * Copyright (c) 2008-2017 Apple Inc. All rights reserved.
that 2008 looks incorrect
src/workqueue/workqueue.c (outdated):

     */

    /////////////////////////////////
we never use //// like that; the rule is /* */ for things outside of functions and // comments when inline inside a function. Here what you mean is:

    #pragma mark - ...

which is an Xcode thing but is easy to wire in vim ;)
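A minimal sketch of those conventions in one place (the function name is hypothetical):

    #pragma mark - worker lifecycle

    /*
     * Outside of function bodies, documentation uses classic block
     * comments like this one.
     */
    static void
    _dispatch_wq_example(void)
    {
    	// inline comments inside a function use the // form
    }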
src/workqueue/workqueue.c (outdated):

    	if (kext_active) {
    		_dispatch_spawn_thread(dispatch_thread_monitor_main_kext);
    	} else
just do an early return and remove the weird } else, and un-indent the code below it.
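A minimal sketch of the suggested early-return shape, reusing the names from the snippet above:

    	if (kext_active) {
    		_dispatch_spawn_thread(dispatch_thread_monitor_main_kext);
    		return;
    	}
    	// the non-kext path continues here, one indentation level shallower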
src/workqueue/workqueue.c (outdated):

    	if (kext_active) {
    		_dispatch_worker_register_kext();
    		rc = 0;
    	} else
ditto
src/workqueue/workqueue.c (outdated):

    #if DISPATCH_ENABLE_PWQ_KEXT
    	if (kext_active) {
    		_dispatch_worker_register_kext();
    	} else
ditto
(and two times more below)
src/workqueue/workqueue.c (outdated):

    		scheduler.registered_workers[last] = 0;
    		scheduler.num_registered_workers--;
    	} else {
    		if (fcntl(fd, F_SETFL, O_NONBLOCK) == 0) {
doesn't linux support O_NONBLOCK as an argument to open() these days?
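A minimal sketch of the suggested alternative - requesting O_NONBLOCK at open() time instead of a follow-up fcntl() call (the path here is illustrative):

    #include <fcntl.h>

    // the descriptor is non-blocking from the start; no fcntl() round trip
    int fd = open("/proc/self/stat", O_RDONLY | O_NONBLOCK);
    if (fd < 0) {
    	// handle the error
    }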
src/workqueue/workqueue.c (outdated):

    	return NULL;
    }

    	while(true) {
we tend to use for (;;) { for this
src/workqueue/workqueue_internal.h (outdated):

    typedef struct _dispatch_workqueue * pthread_workqueue_t;

    /* Work queue priority attributes. */
    #define WORKQ_HIGH_PRIOQUEUE 0
I'm not 100% sure that's the right interface, I think we should build something on QoS instead of the legacy interface.
I pushed a revision that tries to address most of the review comments. The main things I know I didn't do in this revision are (a) dealing with the code duplication in the normal/overcommit worker main functions, (b) the legacy API (6 priorities, not QoS), and (c) some of the overuse of 'queues'.

Stepping back, I think the next big step after this PR is to try to move forward to an API where we avoid the extra level of indirection of having the workq level allocate its own workitems that just wrap a root_queue_drain function on a dq, and have workq go after the root_queues more directly. If we can do that, then the duplicated code changes quite a bit (and maybe mostly disappears). I can take a look in more depth, but I'm tempted to leave the duplication alone since I'm hoping that in a week or so all of that code is different anyway. But if you disagree, I don't mind attempting a cleanup to cut down on the duplication tomorrow.
Just a short comment based on some previous experience here (just as food for thought for such future optimisations and enhancements): it would be nice to let e.g. a class reserve the space needed for enqueueing it, to avoid potential memory allocations both in dispatch and in the work queue code - it would be helpful for certain interesting use cases. Anyway, nice to see this WIP!
src/workqueue/workqueue.c (outdated):

    	int priority;
    	bool overcommit;
    } dispatch_workq_workqueue_s, *dispatch_workq_workqueue_t;
I'm not thrilled with the name dispatch_workq_workqueue_t, but am at a bit of a loss for anything better. Fortunately, this should be a temporary type. If we can eliminate the dispatch_workq_item_t, we won't need this type either (we won't need to have queues of dispatch_workq_item_t's).
I would call this a dispatch_workq_thread_context_t or even dispatch_workq_context_t.
Wait, no, I'm confused - that's the thing queued at the bottom for the picking. Hmmm. I'm quite unsure why you need that type; I'll have to pay attention to the algorithm in depth, and I haven't had that kind of time quite yet, so let's ignore my comments ;)
went with dispatch_workq_context_t for now (missed the edited comment).
I don't expect to actually need this type in the long run. For the initial merge, I was staying pretty close to the libpwq structure which has a redundant layer of queuing. I was planning on eliminating that redundancy in a second round of changes. We can talk on Monday about whether this plan makes sense or if we should jump to something closer to the final structure in one big step.
I'll work on rebasing this PR to current master and updating to use dispatch_priority_t tomorrow. Then we can figure out whether it makes sense to merge something like this in the near future, or if we want to go directly to a more optimized version that eliminates the extra level of workitems/queuing that was present in libpwq and replicated here. In either case, I don't think we should rush this into Swift 3.1.0 (maybe 3.1.1 if there is one).
I think it's too late for Swift 3.1. My experience with thread pools is that the evil you know is better than brand new code; we should, though, start right away for Swift 4, and if there's a Swift 3.1+ then we can always backport it at that time.
(force-pushed from cf4c265 to 99b8625)
I did rebase this to post epoll-merge master today. But, unless we are in a rush to eliminate the libpwq dependency, I think it probably makes sense for me to rework to non-legacy APIs and try to eliminate the extra level of workitems/queuing before you review further.
Agreed.
oh and btw, now that we have src/event/, I would move this code there: the eventing and threading subsystems are quite tied together, and that puts everything that is strongly tied to the kernel in its own place.
(force-pushed from fe6185a to 534dd0a)
@MadCoder, I've done a rework to something that I think is more plausible. Much less new code, and the extra level of queuing/workitems inherited from libpwq is gone. The oversubscription heuristics may still need tweaking to maximize performance, especially at small core counts (I'm running some experiments over the weekend), but I think the approach/algorithms are ready for review when you have time.

The basic idea is to use the pthread_root_queue functionality already in libdispatch, augmented with new code that periodically determines how many worker threads are actually runnable. If it looks like we are under target, then additional workers are gradually created to modestly oversubscribe the thread pool (see the sketch below). The main changes in queue.c are making dgq_thread_pool_size signed, to enable using negative values to encode the degree of oversubscription, and some other minor adjustments. As written, it is only monitoring/oversubscribing the default non-overcommit queue. But if the approach is acceptable and we want to be able to oversubscribe the other non-overcommit queues, the approach should generalize OK.
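A minimal sketch of the Linux-specific runnable check such a monitor could rely on, assuming the per-worker tids are known; the helper name and parsing are illustrative, not the PR's actual code:

    #include <stdio.h>
    #include <stdbool.h>
    #include <sys/types.h>

    /*
     * Hypothetical helper: report whether the thread with the given tid is
     * currently runnable by reading the state field of
     * /proc/self/task/<tid>/stat. A monitor can call this for each
     * registered worker and compare the runnable total against its target.
     */
    static bool
    example_thread_is_runnable(pid_t tid)
    {
    	char path[64];
    	char state = '\0';
    	snprintf(path, sizeof(path), "/proc/self/task/%d/stat", (int)tid);
    	FILE *f = fopen(path, "r");
    	if (!f) return false;
    	// skip the tid and the (comm) field, then read the state character;
    	// a robust parser must also handle spaces inside the (comm) field
    	if (fscanf(f, "%*d %*s %c", &state) != 1) state = '\0';
    	fclose(f);
    	return state == 'R'; // 'R' means running/runnable
    }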
@dgrove-oss sounds nice. I won't likely have the time to do the review it requires before mid-next week, but I'll try to skim through it over the weekend.
(force-pushed from 0ea540c to 12f6f01)
(force-pushed from 2fb99da to e80a9dd)
@MadCoder, I got busy with a few other things for the last month, but I have time again for the next few weeks. If you could find time to review, I'd really like to push through the libpwq removal work so we have plenty of time to shake out any problems before the next release. Thanks!
I was very busy too but should have some bandwidth early next week; I'll have a closer look then.
On top of all the comments I made, I see several issues with that branch that need addressing IMO
- we need one regulator per priority "bucket" and maybe even a global one (full bucket isolation was always quite the mistake in dispatch's initial design IMO)
- you're not taking into account whether thread requests are in flight, to maintain concurrency and in other places, which will cause idle processes to burn CPU for no good reason; the whole monitor/regulator logic should use it more. I see that you shortcut starting new threads if !_dispatch_workq_root_queue_has_work(), but that is not sufficient and is, I think, racy. I need to think about that more. It kind of goes with the notion I have that the queue.c code should know when the root queue implements a real workqueue bucket and do something special with the workqueue.c code when the thing becomes non-empty. In any case, the timer that goes with it should be suspended if there's no work, and restarted if there is and there is no thread creation in flight.
- which segues into the fact that you need to account for threads being created: pre-post thread creation when you do it, and account for these threads until you observe them registering with the monitor (see the sketch after this list). Otherwise, if thread creation is stalling for any reason (and given how expensive it is, it has tons of reasons to), you will end up with thread explosion. A huge lesson we learned in dispatch is that the thread pool should not create new threads in a given bucket until it has observed that thread reach userland. You need to implement this policy.
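A minimal sketch of the pre-posted accounting policy described above, with hypothetical names; the count is bumped before pthread_create() and only released once the new worker checks in from userland, so a stalled creation holds off further spawns:

    #include <pthread.h>
    #include <stdatomic.h>
    #include <stdbool.h>

    // threads created but not yet observed to reach userland
    static atomic_int example_pending_creates;

    // hypothetical spawn path: pre-post the thread before creating it
    static bool
    example_spawn_worker(void *(*worker_main)(void *))
    {
    	pthread_t pth;
    	atomic_fetch_add(&example_pending_creates, 1);
    	if (pthread_create(&pth, NULL, worker_main, NULL) != 0) {
    		atomic_fetch_sub(&example_pending_creates, 1); // undo on failure
    		return false;
    	}
    	return true;
    }

    // hypothetical worker prologue: the regulator counts the thread as
    // "arrived" only once it runs this, closing the in-flight window during
    // which no additional threads should be created for the bucket
    static void
    example_worker_did_start(void)
    {
    	atomic_fetch_sub(&example_pending_creates, 1);
    }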
src/Makefile.am (outdated):

    @@ -77,6 +84,7 @@ AM_OBJCFLAGS=$(DISPATCH_CFLAGS) $(CBLOCKS_FLAGS)
     AM_CXXFLAGS=$(PTHREAD_WORKQUEUE_CFLAGS) $(DISPATCH_CFLAGS) $(CXXBLOCKS_FLAGS)
     AM_OBJCXXFLAGS=$(DISPATCH_CFLAGS) $(CXXBLOCKS_FLAGS)

    +if !HAVE_INTERNAL_PTHREAD_WORKQUEUES
nit: I think I'd move that up as an else. May not be relevant for very long if we move to cmake long term anyway.
src/shims.h (outdated):

    @@ -42,8 +42,12 @@
     #if __has_include(<pthread/workqueue_private.h>)
     #include <pthread/workqueue_private.h>
     #else
    +#if HAVE_INTERNAL_PTHREAD_WORKQUEUE
nit: #elif?
    	pthread_workqueue_t dgq_kworkqueue;
    #endif
    #endif // HAVE_PTHREAD_WORKQUEUES
    #if DISPATCH_USE_PTHREAD_POOL
    	void *dgq_ctxt;
    -	uint32_t volatile dgq_thread_pool_size;
    +	int32_t volatile dgq_thread_pool_size;
why?
ok I think I get it, your "oversubscribe" thing is about pretending the threads don't exist because they are blocked, so you need to allow negative numbers, makes sense.
src/queue.c (outdated):

    -		(uint8_t)(flags & ~_DISPATCH_PTHREAD_ROOT_QUEUE_FLAG_POOL_SIZE) : 0;
    +	uint8_t pool_size_u = flags & _DISPATCH_PTHREAD_ROOT_QUEUE_FLAG_POOL_SIZE ?
    +		(int8_t)(flags & ~_DISPATCH_PTHREAD_ROOT_QUEUE_FLAG_POOL_SIZE) : 0;
    +	int32_t pool_size = (int32_t)(uint32_t)pool_size_u;
you don't need the double cast; uint8_t promotes naturally to any 32-bit integer type (signed or not)
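For illustration, the promotion in question (variable names borrowed from the snippet above):

    	uint8_t pool_size_u = 8;
    	int32_t pool_size = pool_size_u; // implicit widening; no casts needed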
src/queue.c (outdated):

    @@ -2071,6 +2077,12 @@ dispatch_pthread_root_queue_copy_current(void)
     	return (dispatch_queue_t)_os_object_retain_with_resurrect(dq->_as_os_obj);
     }

    +int32_t
    +_dispatch_pthread_root_queue_thread_pool_size(dispatch_queue_t dq) {
brace on the next line please
src/event/workqueue.c (outdated):

    bool
    dispatch_workq_worker_register(dispatch_queue_t root_q)
    {
    	if (root_q != _dispatch_workq_get_default_root_queue()) {
why? it feels like you want to do this for any root queue that is in the "root queue array", as in _dispatch_is_in_root_queues_array(), and have several monitors (one per bucket). I would also make this check the responsibility of the caller, since you hook that into pthread root queues and these should know what kind of thing they implement (custom pthread root queues, or the backend for a workqueue pool).
src/queue.c (outdated):

    @@ -5475,6 +5517,10 @@ _dispatch_worker_thread(void *context)
     	(void)dispatch_assume_zero(r);
     	_dispatch_introspection_thread_add();

    +#if HAVE_INTERNAL_PTHREAD_WORKQUEUE
    +	bool registered = dispatch_workq_worker_register(dq);
as I commented in the .c, I think this should only be called if dq is a workqueue root queue; this code should (and has to) know which kind of thing it's implementing rather than doing spurious calls here IMO
src/event/workqueue.c (outdated):

    	}

    	_dispatch_workq_count_runnable_workers(mgr);
    	int32_t count = _dispatch_pthread_root_queue_thread_pool_size(dq);
given that you cheat with that count when you "over-subscribe" and that it is inherently an atomic value that is constantly moving, I'd really only use things that the monitor knows from under the lock
IOW that count should only be about "do not go above the max concurrency" in the pthread root queue code, but not used for logic here at all.
(it goes with my general review remark about the fact you need to count threads in a pre-posted way anyway to avoid in-flight thread explosion)
src/event/workqueue.c (outdated):

    	_dispatch_debug("workq: %s is non-empty with pool count %d (%d runnable)",
    			dq->dq_label, count, mgr->runnable_workers);

    	if (mgr->runnable_workers < mgr->target_runnable_workers) {
you want to do that only if there are pending thread requests
src/queue.c (outdated):

    		_dispatch_root_queue_debug("pthread pool is full for root queue: "
    				"%p", dq);
    		return;
    	}
    -	j = i > t_count ? t_count : i;
    +	j = (int32_t)i > t_count ? t_count : (int32_t)i;
please avoid this cast by making i signed, which will mean fixing _dispatch_global_queue_poke to take an int too, all the way down to the root of that need, which is da_thr_cnt. This is fine, as dispatch really can only manage on the order of thousands of threads, so making this signed is not an issue, and it would avoid these ugly-looking casts.
Thanks Pierre! I'll work through the detailed comments; on first read I think I understand your points. On the overall comments, I agree we need to have a regulator for all the priority buckets. Since it looks like the overall approach is plausible, I'll go ahead and do that as part of the next revision. I was stalling on tackling all the buckets until I was sure that trying to use dispatch's pthread_root_queue code wasn't going to be a non-starter. On overall points 2 & 3, I agree there are timing holes in the code where it could easily create extra threads (or be slow by one timer invocation of the monitor to notice that the system was totally idle due to blocked workers and needed to be kicked). The libpwq code did do the more complicated in-flight accounting you described to contain thread explosions. I should be able to do something similar.
FWIW, for the regulators: overcommit doesn't need one; the root queue implementation as it is is "good enough". The regulator is only really meaningful for non-overcommit buckets. However, the overcommit workqueue threads may want to be accounted for globally, to have a global thread limit (like Darwin has) that spans several buckets. Our belief today, TBH, is that the limit for threads and concurrency should encompass any thread that is at a higher priority - meaning that if you are already using NCPU runnable threads at the highest priority, other non-overcommit buckets should get nothing. The current approach you're taking doesn't allow that, but I wouldn't try to do it on the first try either; I'm just explaining what we believe to be the right direction overall, should it influence some choices you're making.
I've made a pass through and handled almost all of the review comments. The open items are:
Thanks!
I want to test more thoroughly on Monday, but I realized the pending thread logic was actually already in place for the pthread_root_queues. I just needed to add the right cmpxchg on dgq_pending in the new oversubscribe routine and it should just work.
I think that the pthread root queues should have all it takes, and if we need more, we can certainly move the extra struct definition that is stuck at the end of a dispatch_queue_s to workqueue_internal.h if that helps with things. To me, "workqueue" is "below" src/queue.c in the same way event/event* is below source.c, so having workqueue.c call into queue.c is inverted layering. However, what you're doing so far feels OK to me. I had a hunch that dgq_pending has what you need, and I expect that when you give back threads, if you're above the original limit you can just cap the counter (see the sketch below). IOW if the limit was 10 and you oversubscribed by, say, 5, so you have 15 threads at the moment: as they go back, eventually the count will become 10, then 11, then ... then 15 - well, you want to drop anything that's above the limit of 10 (which may require keeping a copy of that original limit if it doesn't already exist as a thing).
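A minimal sketch of the capping behavior being described, with hypothetical names; a retiring worker returns its pool slot only while the counter is below the original limit, so oversubscribed slots simply evaporate:

    #include <stdatomic.h>
    #include <stdbool.h>

    // returns true if the slot was returned, false if it was dropped
    // because the pool is already back at its original limit
    static bool
    example_worker_retire(atomic_int *pool_size, int original_limit)
    {
    	int old = atomic_load(pool_size);
    	do {
    		if (old >= original_limit) {
    			return false; // oversubscribed slot: drop it
    		}
    	} while (!atomic_compare_exchange_weak(pool_size, &old, old + 1));
    	return true;
    }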
Just a short comment on 2 for your consideration - not regarding the mechanism used for removing the threads, but rather the decision logic on when to remove them and how quickly. The implementation we did in libpwq used hysteresis for the backoff of threads, to avoid degenerate behaviour when you have workloads that ramp the thread count up/down within fairly short time spans. We just had a simple aggregation of the number of threads that were oversubscribed per time unit, and a heuristic of "x idle-thread-seconds" accumulated being required for the backoff of one thread - in practice it meant that if you had many extra threads idling, we would deallocate them more aggressively in the beginning and then taper off as we got closer to the desired number. Essentially the model used for heating system regulation (accumulating the delta between actual and desired temperatures over time and taking action at certain thresholds). This approach smoothed out the thread creation/deallocation problems we observed for certain workloads (initially we just naively deallocated threads immediately when there was a surplus) and it proved quite good in practice - hope the explanation makes sense.
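A minimal sketch of that accumulate-and-threshold idea, assuming a periodic monitor tick; the names and the 2.0 idle-thread-seconds threshold are illustrative, not libpwq's actual constants:

    // idle-thread-seconds a surplus worker must accumulate before one retires
    #define EXAMPLE_IDLE_SECONDS_PER_RETIRE 2.0

    static double example_idle_accum; // accumulated surplus, in thread-seconds

    // called once per monitor tick of `interval_sec` seconds; returns how
    // many surplus workers to retire this tick
    static int
    example_workers_to_retire(int current_workers, int target_workers,
    		double interval_sec)
    {
    	int surplus = current_workers - target_workers;
    	if (surplus <= 0) {
    		example_idle_accum = 0.0; // back at (or below) target: reset
    		return 0;
    	}
    	// a larger surplus accumulates faster, so backoff is aggressive at
    	// first and tapers off as the pool approaches the target size
    	example_idle_accum += (double)surplus * interval_sec;
    	int retire = (int)(example_idle_accum / EXAMPLE_IDLE_SECONDS_PER_RETIRE);
    	example_idle_accum -= (double)retire * EXAMPLE_IDLE_SECONDS_PER_RETIRE;
    	return retire;
    }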
@hassila thanks for the description of libpwq's control strategy; makes sense.
@MadCoder I've reworked the code and it is ready for you to take another look. In addition to the localized fixes, I've made some adjustments to how the workq_monitor system interacts with the global queues to address the oversubscription/pending-thread-create issue. As a result, the delta got smaller and, I think, easier to understand/maintain. The key change is that I added a 'floor' parameter to _dispatch_global_queue_poke, enabling it to be used both for "normal" poke operations (floor of 0) and for oversubscription requests (floor < 0), thus funneling all the thread creation requests through a single code path (see the sketch below).

Right now, the only mechanism for reducing the thread pool size is the 5-second timeout on idle threads. Looking through the code in queue.c, I was wondering if a possible approach would be to adapt the HAVE_PTHREAD_WORKQUEUE_NARROWING path to also work on Linux with the workq_monitor (if the monitor sees overload, it would post a narrow request)? Unless you think it is critical to get this support into the initial changes, I'd be inclined to handle this in a follow-up change.
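A minimal sketch of how a floor parameter can funnel both paths through one slot-claiming routine; the semantics follow the description above, but the names and body are illustrative, not the PR's actual poke code:

    #include <stdatomic.h>

    // claim up to `n` new worker slots, allowing the remaining-slot counter
    // to go as low as `floor` (0 for a normal poke; negative to let the
    // monitor oversubscribe the pool by up to -floor threads)
    static int
    example_poke_claim_slots(atomic_int *remaining, int n, int floor)
    {
    	int old = atomic_load(remaining);
    	int claimed;
    	do {
    		claimed = (old - floor < n) ? old - floor : n;
    		if (claimed <= 0) {
    			return 0; // at or below the floor: spawn nothing
    		}
    	} while (!atomic_compare_exchange_weak(remaining, &old, old - claimed));
    	return claimed; // caller spawns this many workers
    }

    // enqueue-side poke:   example_poke_claim_slots(&slots, 1, 0);
    // monitor-side poke:   example_poke_claim_slots(&slots, 1, -2);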
@MadCoder, would be great if you could take a look at the current proposed changes this week and let me know if there is more that is needed before it can be tested/merged. I would like to get this (and ideally the followup changes to modernize the priority/QoS API on Linux) into the Swift 4 beta. Thanks!
I don't think it's urgent to handle the HAVE_PTHREAD_WORKQUEUE_NARROWING case for now.
configure.ac (outdated):

    )
    AM_CONDITIONAL(HAVE_INTERNAL_PTHREAD_WORKQUEUES, $have_internal_pthread_workqueues)
DISPATCH_USE_INTERNAL_WORKQUEUE as well ? (also avoids the confusing singular/plural difference between the define and the conditional)
src/queue.c (outdated):

    @@ -32,6 +32,7 @@
     #endif
     #if HAVE_PTHREAD_WORKQUEUES && (!HAVE_PTHREAD_WORKQUEUE_QOS || DISPATCH_DEBUG) && \
     	!HAVE_PTHREAD_WORKQUEUE_SETDISPATCH_NP && \
    +	!HAVE_INTERNAL_PTHREAD_WORKQUEUE && \
I'm not very clear on why HAVE_PTHREAD_WORKQUEUES is true for the internal workqueue case; that doesn't seem quite right - that define is intended to guard the use of the specific pthread_workqueue API being used (which the internal version doesn't emulate). It would be easier to understand if the cases where you need something that is guarded by HAVE_PTHREAD_WORKQUEUES just added a || HAVE_INTERNAL_PTHREAD_WORKQUEUE instead (and then you wouldn't need the confusing addition here).
Alternatively, could add a new DISPATCH_USE_WORKQUEUES at the top here that guards the specific bits that are needed for both HAVE_PTHREAD_WORKQUEUES && HAVE_INTERNAL_PTHREAD_WORKQUEUE
resp s/HAVE_INTERNAL_PTHREAD_WORKQUEUE/DISPATCH_USE_INTERNAL_WORKQUEUE/ as discussed elsewhere
src/queue.c (outdated):

    @@ -635,7 +639,8 @@ _dispatch_root_queues_init_workq(int *wq_supported)
     		result = !r;
     	}
     #endif // HAVE_PTHREAD_WORKQUEUE_SETDISPATCH_NP
    -#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL
    +#if DISPATCH_USE_LEGACY_WORKQUEUE_FALLBACK || DISPATCH_USE_PTHREAD_POOL || \
    +		HAVE_INTERNAL_PTHREAD_WORKQUEUE
shouldn't be necessary, you already have DISPATCH_USE_PTHREAD_POOL defined
src/queue.c (outdated):

    	// because the fastpath of _dispatch_global_queue_poke didn't
    	// know yet that we're using the internal pool implementation
    	// we have to undo its setting of dgq_pending
    	qc->dgq_pending = 0;
don't you want to do this only if !pwq? (to match the check in _dispatch_global_queue_poke) Otherwise you'll wipe out the pending info for the legacy fallback above.
this is really the only one of my comments that requires fixing, the rest of my comments are more about fitting in better with the existing configuration/define scheme and some style nits that we can also address later if necessary
src/queue.c (outdated):

    @@ -5475,6 +5509,15 @@ _dispatch_worker_thread(void *context)
     	(void)dispatch_assume_zero(r);
     	_dispatch_introspection_thread_add();

    +#if HAVE_INTERNAL_PTHREAD_WORKQUEUE
    +	bool overcommit = qc->dgq_wq_options & WORKQ_ADDTHREADS_OPTION_OVERCOMMIT;
    +	bool manager = dq == &_dispatch_mgr_root_queue;
please add ()
a few minor comments, but overall the additions to the existing code look great to me!
(force-pushed from af4a6d9 to 8b235c5)
Thanks! I guarded the clearing of qc->dgq_pending, renamed to DISPATCH_USE_INTERNAL_WORKQUEUE, and added () at lines 5513 and 5514. Squashed back into a single commit. I'd like to clean up HAVE_PTHREAD_WORKQUEUES being defined when using the internal workqueue in a follow-up PR; I think it would make sense to look at this and at switching to more of a QoS-based priority definition (Pierre's comment about the priority levels being defined in workqueue_internal.h) as a single change.
looks great, thanks! cleaning up the conflation with HAVE_PTHREAD_WORKQUEUES as a followup makes sense to me.
@swift-ci please test
@swift-ci please test
Looks like the build got into a confused state about whether libpwq was included or not. I'm going to try pulling in dgrove-oss@626f8a8, which removes libpwq entirely, and see if that is enough. If it isn't, we might need to figure out how to force a clean workspace on the build machines.
@swift-ci please test
@swift-ci please test
Yeah! Build worked. Do you want me to squish to 1 commit, or leave as two?
please squash it all and I'll merge.
Provide an implementation of the pthread_workqueue functionality (libpwq) as a fully integrated component of libdispatch. Integration of the workqueue implementation into the libdispatch code base simplifies the process of evolving the APIs between the two layers and thus prepares for future optimization and enhancements. The overall design is to augment libdispatch's existing pthread_root_queue option with periodic user-space monitoring of the status of a root queue's worker threads and to oversubscribe the workqueue when not enough workers are observed to be runnable when the queue has available work. Initially, the integrated pthread_workqueue is only enabled by default on Linux. The current monitoring implementation relies on Linux-specific code to dynamically estimate the number of runnable worker threads by reading /proc. Porting the monitoring code to non-Linux platforms would entail providing similar functionality for those platforms. Remove libpwq git submodule and autotools support for building libpwq from source as part of building libdispatch.
(force-pushed from 897ab1a to 703b1d5)
squashed; ready to merge. Thanks!
this causes build failures on Darwin due to numerous unsigned-vs-signed build warnings (which we have enabled as fatal) from the uint32_t -> int32_t change.
implement pthread_workqueue within libdispatch

Signed-off-by: Daniel A. Steffen <[email protected]>
Provide an implementation of the pthread_workqueue
functionality as a fully integrated component of
libdispatch. Integration of the workqueue implementation
into the dispatch code base simplifies the process of
evolving the APIs between the two layers and thus
prepares for future optimization and enhancements.
Initially, the integrated pthread_workqueue is only
enabled by default on Linux. Most of the internals are
built on pthread APIs and thus should be fairly portable.
However, Linux-specific code is used to dynamically estimate
the number of runnable worker threads by reading /proc.
Porting the thread pool management code to non-Linux
platforms would entail providing similar functionality
for those platforms (or otherwise restructuring the manager).