How is the performance of .NET JIT compilation (including dynamic methods) affected by the C# compiler's image debug options?

2023-09-04 07:11:43 Author: 寒尘

I am trying to optimize my application so that it performs well right after it is started. At the moment, its distribution contains 304 binaries (including external dependencies) totaling 57 megabytes. It is a WPF application doing mostly database access, without any significant calculations.

I discovered that the Debug configuration gives much better times (roughly a 5x gain) for most operations when they are performed for the first time during the lifetime of the application's process. For example, opening a specific screen within the app takes 0.3 seconds for NGENed Debug, 0.5 seconds for JITted Debug, 1.5 seconds for NGENed Release and 2.5 seconds for JITted Release.

I understand that the gap in JIT compilation time is caused by the JIT compiler applying more aggressive optimizations for the Release binaries. From what I can tell, Debug and Release configurations differ by the /p:DebugType and /p:Optimize switches passed to the C# compiler, but I see the same performance gap even if I build the application with /p:Configuration=Release /p:DebugType=full /p:Optimize=false – that is, the same image debug options as in /p:Configuration=Debug.

I confirm that the options were applied by looking at the DebuggableAttribute applied to the resulting assembly. Observing the NGEN output, I see <debug> added to the names of some assemblies being compiled – how does NGEN distinguish between debug and non-debug assemblies? The operation being tested uses dynamic code generation – what level of optimization is applied to dynamic code?
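The check described above can be scripted rather than done by hand. The following is a minimal sketch, assuming the assembly can be loaded into the current process; the class name `DebuggableInfo` and the command-line path argument are hypothetical and only for illustration. It reads the `DebuggableAttribute` and prints the JIT-related flags it carries.

```csharp
using System;
using System.Diagnostics;
using System.Reflection;

// Hypothetical helper: reports the JIT-related flags stored in an assembly's DebuggableAttribute.
public static class DebuggableInfo
{
    public static string Describe(Assembly assembly)
    {
        string name = assembly.GetName().Name;
        var attr = (DebuggableAttribute)Attribute.GetCustomAttribute(
            assembly, typeof(DebuggableAttribute));

        if (attr == null)
            return name + ": no DebuggableAttribute present";

        return string.Format(
            "{0}: IsJITOptimizerDisabled={1}, IsJITTrackingEnabled={2}, DebuggingFlags={3}",
            name, attr.IsJITOptimizerDisabled, attr.IsJITTrackingEnabled, attr.DebuggingFlags);
    }

    public static void Main(string[] args)
    {
        // args[0] is assumed to be the path of an assembly to inspect;
        // with no argument, the sketch inspects itself.
        Assembly assembly = args.Length > 0
            ? Assembly.LoadFrom(args[0])
            : Assembly.GetExecutingAssembly();

        Console.WriteLine(Describe(assembly));
    }
}
```

Running this over each binary in the distribution would show whether every assembly, including the external dependencies, really carries the same flags in both builds.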

Note: I am using the 32-bit framework due to external dependencies. Should I expect different results on x64?

Note: I also do not use conditional compilation. So the compiled source is the same for both configurations.

Recommended answer

If, as you say, you have 304 assemblies to be loaded, then this is likely a cause of your app running slowly. That is an extremely high number of assemblies to load.

Each time the CLR reaches code from another assembly that's not already loaded in the AppDomain, it has to load it from disk.
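One way to see how much of this is happening during a slow operation is to log assembly loads as they occur. The sketch below is only an illustration: the class name `AssemblyLoadLog` is made up, and it writes to the console, which a WPF application would probably replace with its own logger. It subscribes to the `AppDomain.AssemblyLoad` event and timestamps each load.

```csharp
using System;
using System.Diagnostics;
using System.Threading;

// Hypothetical helper: logs every assembly loaded into the current AppDomain after Attach() is called.
public static class AssemblyLoadLog
{
    private static readonly Stopwatch Clock = Stopwatch.StartNew();
    private static int loadCount;

    // Number of assemblies seen since Attach(); earlier loads are not counted.
    public static int LoadCount
    {
        get { return Volatile.Read(ref loadCount); }
    }

    public static void Attach()
    {
        AppDomain.CurrentDomain.AssemblyLoad += (sender, e) =>
        {
            int n = Interlocked.Increment(ref loadCount);
            Console.WriteLine("[{0,6} ms] #{1} {2}",
                Clock.ElapsedMilliseconds, n, e.LoadedAssembly.FullName);
        };
    }

    public static void Main()
    {
        Attach();
        // In a real application, the operation being measured (for example,
        // opening the slow screen) would run here.
        Console.WriteLine("Assemblies loaded since Attach(): " + LoadCount);
    }
}
```

Calling `Attach()` as early as possible during startup and comparing the log before and after opening the slow screen would show how many assemblies that single operation pulls in.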

You might consider using ILMerge to merge some of those assemblies. This will reduce the delay in loading the assemblies from disk (you take one, larger, disk hit up-front).

It may require some experimentation, as not everything likes being merged (particularly those which use Reflection, and depend upon the assembly filename never changing). It may also result in very large assemblies.
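As a rough illustration of the merge suggested above, an ILMerge invocation for a few library assemblies might look like the following. The file names are hypothetical placeholders, and which assemblies can safely be merged has to be found by experiment, as noted above.

```shell
# Hypothetical file names; merges three library assemblies into one DLL.
ILMerge.exe /target:library /out:Merged.dll DataAccess.dll Helpers.dll Models.dll
```

Projects that referenced the individual DLLs would then need to reference `Merged.dll` instead.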

 