Maximum concurrency throttle

2023-09-03 07:59:27 Author: 温酒叙余生

I expect there are many possible solutions to this question. I can come up with a few myself, some clearly better than others, but none that I am certain are optimal, so I'm interested in hearing from you real multithreading gurus out there.

I have circa 100 pieces of work that can be executed concurrently, as there are no dependencies between them. If I execute these sequentially, my total execution time is approximately 1 minute 30 seconds. If I queue each piece of work in the thread pool, it takes approximately 2 minutes, which suggests to me that I am trying to do too much at once and that context switching between all these threads is negating the advantage of having them.

So, based on the assumption (please feel free to shoot me down if this is wrong) that queuing only as many pieces of work at any one time as there are cores in my system (8 on this machine) will reduce context switching and thus improve overall efficiency (other processes' threads notwithstanding, of course), can anyone suggest the optimal pattern/technique for doing this?

BTW, I am using smartthreadpool.codeplex.com, but I don't have to.

Recommended answer

A good thread pool already tries to have one active thread per available core. This isn't a matter of having one worker thread per core, though: if a thread is blocking (most classically on I/O), you want another thread using that core.

It might be worth trying the .NET thread pool instead, or the Parallel class.
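For illustration only, here is a minimal sketch of the "no more workers than cores" idea from the question. It is written in Python rather than .NET, and `do_work` and `run_bounded` are hypothetical names made up for this example. Note that in CPython, CPU-bound work would need a process pool rather than threads to actually occupy several cores, so treat this as a picture of the pattern, not a benchmark.

```python
import os
from concurrent.futures import ThreadPoolExecutor


def do_work(item):
    # Hypothetical stand-in for one independent piece of work.
    return item * item


def run_bounded(items, max_workers=None):
    # Cap the pool at the number of cores. Items beyond that wait in the
    # executor's internal queue instead of each getting a thread of its own.
    workers = max_workers or os.cpu_count() or 1
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() yields results in the same order as the inputs.
        return list(pool.map(do_work, items))


results = run_bounded(range(100))
```

The point of the design is that the queue, not the scheduler, absorbs the backlog: all 100 items are submitted up front, but at most `workers` of them are in flight at once.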

If your CPU is hyper-threaded (8 virtual cores on 4 physical), this could be an issue. On average hyper-threading makes things faster, but there are plenty of cases where it makes them worse. Try setting affinity to every other core and see if it gives you an improvement - if it does, then this is likely a case where hyper-threading is bad.

Do you have to gather results together again, or share any resources between the different tasks? The cost of doing this could well be greater than the savings from multi-threading. Perhaps those costs are unnecessary, though - e.g. if you are locking on shared data but that data is only ever read, you don't actually need to lock with most data structures (most but not all are safe for concurrent reads if there are no writes).

The partitioning of the work could be an issue too. Say the single-threaded approach works its way through an area of memory, but the multi-threaded approach hands each thread its next bit of memory to work on round-robin. Here there'd be more cache-flushing per core, as the "good next bit" is actually being used by another core. In this situation, splitting the work into bigger chunks can fix it.
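As a rough sketch of the difference between the two partitioning schemes, the following Python snippet shows only which indices each worker would receive. The function names are hypothetical, and whether contiguous chunks actually help with caching depends on the language and on how the data is laid out in memory.

```python
def round_robin(items, workers):
    # Worker k gets items k, k + workers, k + 2 * workers, ...
    # Neighbouring items end up on different workers.
    return [items[k::workers] for k in range(workers)]


def contiguous_chunks(items, workers):
    # Each worker gets one contiguous run of items; the first `extra`
    # workers take one additional item so that nothing is left over.
    size, extra = divmod(len(items), workers)
    chunks, start = [], 0
    for k in range(workers):
        end = start + size + (1 if k < extra else 0)
        chunks.append(items[start:end])
        start = end
    return chunks


data = list(range(10))
rr = round_robin(data, 4)        # [[0, 4, 8], [1, 5, 9], [2, 6], [3, 7]]
cc = contiguous_chunks(data, 4)  # [[0, 1, 2], [3, 4, 5], [6, 7], [8, 9]]
```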

There are plenty of other factors that can make a multi-threaded approach perform worse than a single-threaded one, but those are a few I can think of immediately.

Edit: If you are writing to a shared store, it could be worth trying a run where you just throw away any results. That could narrow down whether that's where the issue lies.
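One way such an experiment might be set up is sketched below in Python. Everything here is an assumption made for illustration: `shared_store`, `compute`, `task` and `timed_run` are hypothetical names, and the lock-protected list merely stands in for whatever shared store the real code writes to.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

shared_store = []              # hypothetical shared store
store_lock = threading.Lock()  # guards writes to shared_store


def compute(item):
    # Hypothetical stand-in for the real piece of work.
    return item * 2


def task(item, keep_results):
    value = compute(item)
    if keep_results:
        # Normal path: every worker contends for the same lock.
        with store_lock:
            shared_store.append(value)
    # Diagnostic path: the value is simply thrown away.


def timed_run(items, keep_results, workers=4):
    # Run all items on a small pool and return the elapsed time in seconds.
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        list(pool.map(lambda it: task(it, keep_results), items))
    return time.perf_counter() - start


with_store = timed_run(range(100), keep_results=True)
without_store = timed_run(range(100), keep_results=False)
print(f"with store: {with_store:.4f}s, without: {without_store:.4f}s")
```

If the run that discards results is markedly faster on the real workload, the shared store is a likely suspect; if the two timings are close, the slowdown probably lies elsewhere.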