I have a C# .NET 4.5 application heavily using the Task Parallel Library that eventually ends up starved for threads after days of operation.
When I grab a HANG dump from AdPlus and look at the threads via Visual Studio, I see 43 threads with no apparent origin in my code:
ntdll.dll!_NtWaitForSingleObject@12() + 0x15 bytes
ntdll.dll!_NtWaitForSingleObject@12() + 0x15 bytes
kernel32.dll!@BaseThreadInitThunk@12() + 0x12 bytes
ntdll.dll!___RtlUserThreadStart@8() + 0x27 bytes
ntdll.dll!__RtlUserThreadStart@8() + 0x1b bytes
Why do these threads show no managed origin in their stack trace?
All threads in a given process, even TPL threads, have this startup procedure. When you start a thread, the CLR eventually calls the OS to create it. What you're looking at are the functions that the thread executes at startup. If you suspend any managed process, you'll see unmanaged calls at the bottom of every stack. The reason you don't see a managed start procedure is that each thread gets its own stack, created by the OS when it creates the thread, so the frames of the code that requested the thread never appear on it.
For example, running the following:
for (int i = 0; i < 10; i++)
{
    // Each thread just sleeps so it can be inspected in the debugger.
    Thread t = new Thread(new ThreadStart(() => Thread.Sleep(100000)));
    t.Start();
}
Console.ReadKey();
then breaking into the process using WinDbg and looking at one of the sleeping threads gives a call stack that looks like this (all of the threads have the same two functions at the bottom; I'm only dumping one for this exercise):
0:012> !dumpstack
OS Thread Id: 0x3694 (12)
Current frame: ntdll!ZwDelayExecution+0xa
Child-SP RetAddr Caller, Callee
000000001dc8ea70 000007fefd1c1203 KERNELBASE!SleepEx+0xab, calling ntdll!NtDelayExecution
000000001dc8eae0 000007fefd1c38fb KERNELBASE!SleepEx+0x12d, calling ntdll!RtlActivateActivationContextUnsafeFast
000000001dc8eb10 000007fed860a888 clr!CExecutionEngine::ClrSleepEx+0x29, calling KERNEL32!SleepExStub
000000001dc8eb40 000007fed874d483 clr!Thread::UserSleep+0x7c, calling clr!ClrSleepEx
000000001dc8eba0 000007fed874d597 clr!ThreadNative::Sleep+0xb7, calling clr!Thread::UserSleep
[... removed some frames for clarity ...]
000000001dc8f6f0 000007fed874fcb6 clr!Thread::intermediateThreadProc+0x7d
000000001dc8faf0 000007fed874fc9f clr!Thread::intermediateThreadProc+0x66, calling clr!alloca_probe
000000001dc8fb30 0000000077195a4d KERNEL32!BaseThreadInitThunk+0xd
000000001dc8fb60 00000000773cb831 ntdll!RtlUserThreadStart+0x1d
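The managed side of the same point can be checked from code. The following is only a minimal sketch, not part of the original repro: the names ThreadStackDemo, Worker, and SpawnFromHere are made up for illustration. It captures the managed stack trace from inside a new thread; since the thread has its own stack, the method that created it is not expected to show up in that trace.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

static class ThreadStackDemo
{
    // Holds the trace captured on the worker thread (illustrative only).
    private static string capturedTrace;

    // Thread entry point: records the managed frames of the current thread.
    private static void Worker()
    {
        capturedTrace = new StackTrace().ToString();
    }

    // Hypothetical helper: starts Worker on a new thread and returns its trace.
    public static string SpawnFromHere()
    {
        Thread t = new Thread(new ThreadStart(Worker));
        t.Start();
        t.Join(); // wait until Worker has finished writing capturedTrace
        return capturedTrace;
    }

    static void Main()
    {
        string trace = SpawnFromHere();
        Console.WriteLine(trace);
        // Worker is on the new thread's stack; its creator is not.
        Console.WriteLine("Contains Worker: " + trace.Contains("Worker"));
        Console.WriteLine("Contains SpawnFromHere: " + trace.Contains("SpawnFromHere"));
    }
}
```

The trace should list Worker and some runtime thread-start helpers, but not SpawnFromHere or Main. The native frames such as BaseThreadInitThunk are also absent, because a managed StackTrace reports managed frames only.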
For reference, this is the Thread object wrapping the thread whose stack we dumped:
0:012> !do 2a23e08
Name: System.Threading.Thread
MethodTable: 000007fed76522f8
EEClass: 000007fed7038200
Size: 96(0x60) bytes
File: C:\Windows\Microsoft.Net\assembly\GAC_64\mscorlib\v4.0_4.0.0.0__b77a5c561934e089\mscorlib.dll
Fields:
MT Field Offset Type VT Attr Value Name
000007fed763eca8 4000765 8 ....Contexts.Context 0 instance 0000000000000000 m_Context
000007fed765a958 4000766 10 ....ExecutionContext 0 instance 0000000000000000 m_ExecutionContext
000007fed7650e08 4000767 18 System.String 0 instance 0000000000000000 m_Name
000007fed76534a8 4000768 20 System.Delegate 0 instance 0000000000000000 m_Delegate
000007fed7655390 4000769 28 ...ation.CultureInfo 0 instance 0000000000000000 m_CurrentCulture
000007fed7655390 400076a 30 ...ation.CultureInfo 0 instance 0000000000000000 m_CurrentUICulture
000007fed76513e8 400076b 38 System.Object 0 instance 0000000000000000 m_ThreadStartArg
000007fed7654a00 400076c 40 System.IntPtr 1 instance 24a5ed0 DONT_USE_InternalThread
000007fed7653980 400076d 48 System.Int32 1 instance 2 m_Priority
000007fed7653980 400076e 4c System.Int32 1 instance 12 m_ManagedThreadId
000007fed7658c48 400076f 50 System.Boolean 1 instance 1 m_ExecutionContextBelongsToOuterScope
000007fed7672e70 4000770 378 ...LocalDataStoreMgr 0 shared static s_LocalDataStoreMgr
>> Domain:Value 00000000005f40b0:NotInit <<
000007fed7672df0 4000771 8 ...alDataStoreHolder 0 shared TLstatic s_LocalDataStore
>> Thread:Value <<
The System.IntPtr brillantly named DONT_USE_InternalThread holds the pointer to the OS thread. (My guess is that it's probably the handle from CreateThread, but I didn't investigate it too much.)
(Note to editors: brillant is spelled that way intentionally. Please don't 'fix' it.)
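One observation from the dump above: m_Name is null because the sample threads were never named. For threads you create yourself, giving each one a name is a cheap way to make its origin easier to spot in a dump or in the Visual Studio Threads window. The sketch below is an assumption-laden illustration, not part of the original repro; NamedThreadDemo, StartNamedSleeper, and the "sleeper-N" naming scheme are hypothetical.

```csharp
using System;
using System.Threading;

static class NamedThreadDemo
{
    // Thread body: sleeps so the thread can be inspected in a debugger.
    private static void SleepForInspection()
    {
        Thread.Sleep(100000);
    }

    // Hypothetical helper: starts a sleeping thread with an identifying name.
    public static Thread StartNamedSleeper(int index)
    {
        Thread t = new Thread(new ThreadStart(SleepForInspection));
        t.Name = "sleeper-" + index; // name must be set before it is useful in a dump
        t.IsBackground = true;       // do not keep the process alive on exit
        t.Start();
        return t;
    }

    static void Main()
    {
        for (int i = 0; i < 10; i++)
        {
            StartNamedSleeper(i);
        }
        Console.ReadKey();
    }
}
```

This only helps for threads the application creates explicitly. Thread-pool threads used by the TPL are created by the runtime, so they would still need to be identified some other way.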