Performance of .Include() vs. .Load() in Entity Framework
Tags: performance, EntityFramework, Include, Load

2023-09-02 10:36:18 Author: 一曲一长叹

When querying a large table whose navigation properties I need to access later in code (I explicitly don't want to use lazy loading), which will perform better: .Include() or .Load()? And why would you use one over the other?

In this example the included tables each have only about 10 entries, except employees, which has about 200. Most of those rows will probably be loaded by Include anyway, because they are referenced by measurements that match the where clause.

Context.Measurements.Include(m => m.Product)
                    .Include(m => m.ProductVersion)
                    .Include(m => m.Line)
                    .Include(m => m.MeasureEmployee)
                    .Include(m => m.MeasurementType)
                    .Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))
                    .ToList();

or

Context.Products.Load();
Context.ProductVersions.Load();
Context.Lines.Load();
Context.Employees.Load();
Context.MeasurementType.Load();

Context.Measurements.Where(m => m.MeasurementTime >= DateTime.Now.AddDays(-1))
                    .ToList();

Solution

The answer is "it depends, try both".

When using Include(), you get the benefit of loading all of your data in a single call to the underlying data store. If this is a remote SQL Server, for example, that can be a major performance boost.

The downside is that Include() queries tend to get really complicated, especially if you have any filters (Where() calls, for example) or try to do any grouping. EF will generate very heavily nested queries using sub-SELECT and APPLY statements to get the data you want. It is also much less efficient in terms of data transferred: each row you get back contains every possible child-object column, so the data for your top-level objects is repeated many times. (For example, a single parent object with 10 children will produce 10 rows, each with the same data in the parent object's columns.) I've had single EF queries get so complex that they caused deadlocks when running at the same time as EF update logic.
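The row repetition in the parenthetical example can be sketched with plain LINQ to Objects. This is only an illustration of the shape of a joined result set, not the SQL that EF actually generates; the Parent and Child classes and the Flatten helper are hypothetical names introduced for this sketch.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class JoinShapeDemo
{
    // Hypothetical entities, used only for illustration.
    public sealed class Parent
    {
        public int Id { get; set; }
        public string Name { get; set; }
    }

    public sealed class Child
    {
        public int Id { get; set; }
        public int ParentId { get; set; }
        public string Label { get; set; }
    }

    // Flattens parents and children the way a joined result set does:
    // one output row per matching child, with the parent's columns
    // copied onto every one of those rows.
    public static List<(int ParentId, string ParentName, int ChildId, string ChildLabel)> Flatten(
        IEnumerable<Parent> parents, IEnumerable<Child> children)
    {
        return parents
            .Join(children,
                  p => p.Id,
                  c => c.ParentId,
                  (p, c) => (ParentId: p.Id, ParentName: p.Name, ChildId: c.Id, ChildLabel: c.Label))
            .ToList();
    }
}
```

Calling Flatten with one parent and ten children that reference it returns ten rows, and every row carries the same ParentId and ParentName. The wider the parent, the more bytes are repeated per child.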

The Load() method is much simpler. Each query is a single, easy, straightforward SELECT statement against a single table. These are much easier in every possible way, except you have to do many of them (possibly many times more). If you have nested collections of collections, you may even need to loop through your top level objects and Load their sub-objects. It can get out of hand.
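How quickly per-parent loading "gets out of hand" comes down to simple arithmetic, which the sketch below makes explicit. It does not use Entity Framework: CountingStore is a hypothetical in-memory stand-in for a database that merely counts the queries it serves, and PerParentLoadingDemo is likewise an illustrative name.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for a database that counts the queries it serves.
// It is not part of Entity Framework; it only illustrates the round-trip arithmetic.
public sealed class CountingStore
{
    private readonly Dictionary<int, List<string>> _childrenByParent =
        new Dictionary<int, List<string>>();

    public int QueryCount { get; private set; }

    public CountingStore(int parentCount, int childrenPerParent)
    {
        for (int p = 1; p <= parentCount; p++)
        {
            _childrenByParent[p] = Enumerable.Range(1, childrenPerParent)
                .Select(c => "child " + p + "." + c)
                .ToList();
        }
    }

    // One simulated SELECT that returns every parent key.
    public List<int> LoadParentIds()
    {
        QueryCount++;
        return _childrenByParent.Keys.ToList();
    }

    // One simulated SELECT per call, returning the children of a single parent.
    public List<string> LoadChildrenOf(int parentId)
    {
        QueryCount++;
        return _childrenByParent[parentId].ToList();
    }
}

public static class PerParentLoadingDemo
{
    // Loads the parents once, then loads each parent's children separately,
    // and returns the total number of queries that took.
    public static int CountQueries(int parentCount, int childrenPerParent)
    {
        var store = new CountingStore(parentCount, childrenPerParent);
        foreach (int parentId in store.LoadParentIds())
        {
            store.LoadChildrenOf(parentId);
        }
        return store.QueryCount;
    }
}
```

Under these assumptions, PerParentLoadingDemo.CountQueries(200, 10) reports 201 queries: one for the parents plus one per parent. Each query is trivial, but against a remote server every one of them pays a network round trip.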

As a quick rule of thumb, I try to avoid having more than three Include calls in a single query. I find that EF's queries get too ugly to recognize beyond that; it also matches my rule of thumb for SQL Server queries, that up to four JOIN statements in a single query work very well, but after that it's time to consider refactoring.

However, all of that is only a starting point. It depends on your schema, your environment, your data, and many other factors. In the end, you will just need to try it each way. Pick a reasonable "default" pattern to use, see if it's good enough, and if not, optimize to taste.

 