How to troubleshoot intermittent SQL timeout errors

2023-09-03 08:28:28 Author: Gentle.(西格.)

We've been having a few instances per day where we get a slew of SQL Timeout errors from multiple applications (System.Data.SqlClient.SqlException: Timeout expired. The timeout period elapsed prior to completion of the operation or the server is not responding.) We have over 100 different applications on our network, both web and desktop apps. Everything from VB6 and Classic ASP to .NET 4. I can find all kinds of data that show the side effects but can't pinpoint what is causing this. Our DBA says nothing is wrong with the SQL server, and IT says there's nothing wrong with the web servers or network, so of course I'm left in the middle trying to troubleshoot this.

I'm really just looking for suggestions on what other troubleshooting I can do to try and track this down.

We're running SQL Server 2008 R2 in a cluster. There are a handful of different servers that connect to it, ranging from Windows Server 2003 to 2008 in different editions.

Here's what I've done so far:

- Run SQL trace of long-running queries and deadlocks. This shows no deadlocks at the times of the problems, and long-running queries all coincide with our timeout errors, but look to be a side effect, and not the cause. Queries that are very basic and typically return instantly end up taking 30, 60 or 120 seconds to run at times. This happens for a few minutes, then everything picks up and works fine after that.
- Use Performance Monitor to track connection pool connections. This sometimes shows some spikes in the number of connections near the times of the timeouts, but still not even halfway to the default 100 connection limit. Again, nothing here that seems to point to a cause.
- Separate web applications into different App Pools. We tried to narrow down the apps we thought may be the main problem (most chatty, etc.) and put them in separate Application Pools, but that doesn't seem to affect anything or help us narrow down anything.
- Monitor disk usage on SQL Server. We've done some monitoring on the SQL server and see no spikes or any signs of problems when these timeouts are occurring.
- Verified TempDB was not the cause of the problem.

I'll come back and add more if I think of what else we've tried. Please let me know some ideas on what to troubleshoot next.

Solution

> Run SQL trace of long running queries and deadlocks. This shows no deadlocks at the times of the problems, and long running queries all coincide with our timeout errors, but look to be a side effect, and not the cause. Queries that are very basic that typically return instantly end up taking 30, 60 or 120 seconds to run at times. This happens for a few minutes then everything picks up and works fine after that.

It looks like some queries or transactions are holding locks in your database until they finish. You have to find out which queries are doing the blocking, then rewrite them or run them at another time so they don't block other processes. At the moment, the waiting queries simply time out.
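
One way to start finding the blockers is to snapshot which session is waiting on which while the timeouts are happening. SQL Server exposes this through the session_id and blocking_session_id columns of sys.dm_exec_requests. The sketch below is Python rather than T-SQL and assumes you have already fetched those two columns as (session_id, blocking_session_id) pairs; the function name and the sample session ids are made up for illustration. It walks each blocking chain back to the session at its head, which is usually the one worth investigating.

```python
from collections import defaultdict


def find_head_blockers(requests):
    """Group waiting sessions under the session at the head of their blocking chain.

    requests: iterable of (session_id, blocking_session_id) pairs, e.g. rows
    read from sys.dm_exec_requests. A blocking id of 0 or None means the
    request is not blocked; negative ids (special cases) are ignored here.
    """
    blocked_by = {
        sid: blocker for sid, blocker in requests if blocker and blocker > 0
    }

    def head_of(sid):
        # Follow the chain until we reach a session that is not itself blocked.
        # 'seen' guards against looping forever if the snapshot contains a cycle.
        seen = set()
        while sid in blocked_by and sid not in seen:
            seen.add(sid)
            sid = blocked_by[sid]
        return sid

    chains = defaultdict(list)
    for sid in blocked_by:
        chains[head_of(sid)].append(sid)
    return {head: sorted(waiters) for head, waiters in chains.items()}


# Hypothetical snapshot: 53 waits on 52, which waits on 51; 55 and 56 wait on 60.
snapshot = [(51, 0), (52, 51), (53, 52), (54, 0), (55, 60), (56, 55)]
print(find_head_blockers(snapshot))  # {51: [52, 53], 60: [55, 56]}
```

Note that session 60 heads a chain without appearing in the snapshot itself. That can happen when the blocker is an idle session holding an open transaction, since it has no active request at that moment, so the sketch deliberately accepts head blockers that are missing from the input.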

Another point to dig into is the autogrowth increment of your transaction log and database files. Set it to a fixed size instead of a percentage of the current file size. With percentage growth, each increment gets bigger as the files get bigger, so the time it takes to allocate the space can eventually exceed your timeout, and your database comes to a halt while it waits.
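
To make the difference between percentage and fixed growth concrete, here is a small arithmetic sketch in Python. The starting size (10,000 MB), the 10% setting, and the 512 MB fixed increment are illustrative assumptions, not values from the question, and the function name is hypothetical. It only computes how large each successive growth step would be; it does not measure how long the allocation takes on your storage.

```python
def growth_steps(initial_mb, steps, percent=None, fixed_mb=None):
    """Return the size in MB of each successive autogrowth increment.

    Pass exactly one of percent (growth as % of current size) or
    fixed_mb (constant growth increment).
    """
    if (percent is None) == (fixed_mb is None):
        raise ValueError("pass exactly one of percent or fixed_mb")

    size = float(initial_mb)
    increments = []
    for _ in range(steps):
        if percent is not None:
            inc = size * percent / 100  # grows along with the file
        else:
            inc = float(fixed_mb)       # stays the same at every step
        increments.append(inc)
        size += inc
    return increments


# Assumed example: a 10,000 MB file growing three times.
print(growth_steps(10000, 3, percent=10))    # [1000.0, 1100.0, 1210.0]
print(growth_steps(10000, 3, fixed_mb=512))  # [512.0, 512.0, 512.0]
```

With the percentage setting, every growth event has to allocate more space than the previous one, while the fixed setting keeps the work per growth event constant and predictable.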