Use SET STATISTICS IO and SET STATISTICS TIME to Help Tune Your SQL Server Queries

This is not an article on how to tune queries (that subject would take a book), but an article on how to use the often overlooked Transact-SQL SET STATISTICS IO and the SET STATISTICS TIME commands to help us tune our queries.

On the surface, the goal of performance tuning a query seems simple enough. Essentially, we want our queries to run faster. Whether it is trimming a query that takes 10 minutes down to 1 minute, or trimming a 2 second query down to 1 second, our ultimate goal is to reduce how long it takes our queries to run.

Given the simplicity of our goal, why is it often so hard to accomplish? While there are many reasons why query performance tuning is difficult, this article focuses on one of them: query tuning takes place in a fluid environment that is often changing from second to second, making it hard to know exactly what is going on.

So what do I mean by this? If you are like most people who are trying to performance tune your queries, you are either performing your work on a test server with test data or a copy of your live data, or a production server with live data. In either case, neither server is ever exactly the same from one test run of your query to another test run. Remember, SQL Server dynamically adjusts itself as the demands on its resources change from moment to moment. While query tuning on a test server will certainly be less dynamic than on a production server, virtually any SQL Server will present similar problems.

If you don’t believe it, run the same query over and over again on a busy server, and in most cases, the time it takes to run from execution to execution will vary. Sure, not always by a lot, but it will change enough that it makes query tuning more difficult than it should be. For example, as you are query tuning on a production server, you may notice that your original version runs in about 10 seconds. You review the query and notice that you can make a slight change to the query that should speed it up. When you run it, the query now takes 12 seconds.

What happened? Was your idea wrong, or was the server experiencing a heavier load than usual at the moment you ran the new query, causing the increased run time? While you can run your queries over and over again to get an average time, that is a lot of work. What you need is a more scientific way to compare the performance of one query version to another.

Measuring SQL Server Resources Is Key to Easing Query Tuning Problems
When you execute a query on SQL Server, various server resources are used. One of those resources is a specific amount of CPU time. If you run the same query over and over again (assuming the database has not changed), the CPU resources needed to run the query will remain very close to the same from execution to execution, usually with less than 500 milliseconds of variation from run to run. I am not talking about the amount of time it takes for a query to run from beginning to end, but the amount of CPU resources needed by the query to run. The amount of time that it takes a query to run to completion will vary based on how busy the server is.

Another resource that SQL Server needs is IO. Whenever you run a query, SQL Server must retrieve data from the data cache (logical reads), and if the data it needs is not in the data cache, then it must read it from disk (physical reads).

As must be obvious from this discussion, the more CPU and IO resources a query needs to run, the slower the query will perform. So another way to describe the task of query tuning is that you want to rewrite your queries in such a way that they use fewer CPU and IO resources. If you accomplish this, then the performance of your query will have improved.

If you think of query tuning in terms of reducing the server resources a query uses (rather than in terms of how long it takes to run), it becomes much easier to measure whether the actions you take to tune your query are actually helping or hurting, especially on a busy server whose resource usage is changing all the time. Let’s find out how we can measure these resources so we can tell if our queries are using more or fewer resources as we tune them.

Both the SET STATISTICS IO and the SET STATISTICS TIME Transact-SQL commands have been around a long time. But for some reason, they are often ignored by many DBAs when it comes time to performance tune queries. Perhaps it is because they aren’t very “sexy”. But whatever the reason, they are still handy, as we will see.

Both commands can be “turned on” with Transact-SQL commands in Query Analyzer (7.0 and 2000), or by setting the appropriate connection properties of the current connection in Query Analyzer. In this article, we will use the Transact-SQL commands for illustration.

The SET STATISTICS IO and the SET STATISTICS TIME commands act like settings, turning on and off various reported information on the resources our queries use. By default, these settings are not turned on. So let’s look at an example of how to turn these commands on, and take a look at what information they provide.

To begin our example, bring up Query Analyzer and connect to a SQL Server. For this example, let’s use the perennial favorite, Northwind, so go ahead and make it your default database for this connection.

Next, run this query:

SELECT * FROM [order details]

Assuming you haven’t altered this table, 2,155 rows will be returned. This is your typical result, which you have seen many times over in Query Analyzer.

Now, let’s run the same query, but this time, let’s run the SET STATISTICS IO and the SET STATISTICS TIME commands before we run the query. Keep in mind that these two commands “turn on” performance tuning data only for the current connection. If you run one or both of these commands, then drop the connection and start another, you will have to reissue them for the new connection. If you ever need to “turn off” these commands during the current connection, you can do so by substituting OFF for ON in the commands.
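For example, to stop collecting this data later in the same session, you would run:

SET STATISTICS IO OFF
SET STATISTICS TIME OFF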

Before we begin our example, run these two commands first (don’t run them on a production server). These two commands clear out SQL Server’s data and procedure caches. This helps ensure that we are starting from a level playing field. If you don’t do this every time before you run the following examples, your results may not be consistent.

DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

Next, enter and run the following Transact-SQL commands:

SET STATISTICS IO ON
SET STATISTICS TIME ON

Once you have done this, then run this command again:

SELECT * FROM [order details]

(If you run all of the above commands at the same time, then your output will look different than mine, and it will be harder to see what is happening.)

After you run these commands, you will see new information in the results window that you did not see before. At the very top of the window, you will see something like:

SQL Server parse and compile time: 
CPU time = 10 ms, elapsed time = 61 ms.

SQL Server parse and compile time: 
CPU time = 0 ms, elapsed time = 0 ms.

After the above data, the rows resulting from the query will be displayed. At the very bottom of the 2,155 rows will be this additional new information displayed:

Table ‘Order Details’. Scan count 1, logical reads 10, physical reads 1, read-ahead reads 9.

SQL Server Execution Times:
CPU time = 30 ms, elapsed time = 387 ms.

(The results you get may vary somewhat, but we will talk about this below as we explain everything you see above.)

So what does all of this mean?

SET STATISTICS TIME Results
The SET STATISTICS TIME command is used to measure a variety of times, some of which may or may not be of much interest to you when performance tuning queries. This command is responsible for the following information displayed on your screen:

From the top of the output:

SQL Server parse and compile time: 
CPU time = 10 ms, elapsed time = 61 ms.

SQL Server parse and compile time: 
CPU time = 0 ms, elapsed time = 0 ms.

From the bottom of the output:

SQL Server Execution Times:
CPU time = 30 ms, elapsed time = 387 ms.

At the top of the output you see what appears to be the same time measurement twice, but the first one most likely will have some CPU and elapsed time, and the second one most likely will not. What is this telling us?

“SQL Server parse and compile time” tells us how much CPU time and total time it took SQL Server to parse and compile the Transact-SQL statement, “SELECT * FROM [order details]”, and then put the query plan that was created into SQL Server’s procedure cache for use by SQL Server.

In my case, it took 10 ms of CPU time, and a total of 61 ms of elapsed time. Your CPU time will probably vary because your server is different than mine, and your elapsed time will vary because both your server and your server’s load will be different than mine.

The second time you see “SQL Server parse and compile time,” what this is telling you is the amount of time it took for SQL Server to retrieve the plan from the procedure cache so that it can be executed. Generally, the times here will be 0 because of how quick this task is.

If you should rerun the Transact-SQL command, “SELECT * FROM [order details]” again without clearing the cache (as recommended above), most likely both instances of CPU and compile time will be 0, as the cached query plan will be reused, so no new compile time is required.

So how is this information useful to you when performance tuning? Honestly, not a whole lot, but I needed to explain what this information means. You would be surprised how many DBAs don’t know what it means.

What is of most interest to us is the time information displayed at the bottom of the query’s output. Here it is again:

SQL Server Execution Times:
CPU time = 30 ms, elapsed time = 387 ms.

What this tells you is how much CPU time was used to execute the query, and how long the query took to run. Of these two pieces of information, only the first is of much use. The CPU time is a relatively consistent measurement of the amount of CPU resources it takes for your query to run, and it is relatively independent of how busy your CPU is. Sure, you will probably see some variation in this number, but it won’t be significant, not as wildly changing as the elapsed time number. The elapsed time number is simply wall-clock time from start to finish, which includes any time spent waiting on locks, reads, or other resources. This number will jump around a lot because of the ever-changing load on your server.

Because the CPU time is relatively consistent, you can use it as a way to help you determine whether the changes you make in your queries, during performance tuning, are actually helping or hurting. More on this later.
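For example, suppose you are deciding between two ways of writing the same date filter against Northwind’s Orders table (a hypothetical illustration, not part of the original example). Running each version with the setting on lets you compare their reported CPU times directly:

SET STATISTICS TIME ON

-- Version 1: applies a function to the column
SELECT * FROM Orders WHERE YEAR(OrderDate) = 1997

-- Version 2: rewritten as a range test, which can use an index on OrderDate
SELECT * FROM Orders WHERE OrderDate >= '19970101' AND OrderDate < '19980101'

Whichever version consistently reports the lower CPU time is the one using fewer CPU resources, regardless of how busy the server happened to be during each run.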

SET STATISTICS IO Results
SET STATISTICS IO information is displayed at the bottom of the output results. Here it is again:

Table ‘Order Details’. Scan count 1, logical reads 10, physical reads 1, read-ahead reads 9.

Some of this information is very useful, and some of it is not. Let’s take a look at each piece and learn what it means.

Scan Count: Refers to the number of times that tables referenced in the query have been accessed. In our example, the table in our query was only accessed 1 time. For queries that don’t include joins, this information is not too helpful. But if your query has one or more joins, then this information may be of use to you.

For example, an outer table of a nested loop would have a Scan Count of 1. But for an inner table, the Scan Count refers to the number of times through the nested loop that the table was accessed. As you might imagine, the smaller the Scan Count for the inner table, the less resources it uses and the faster the performance of the query. As you tune a query with joins, you may want to pay attention to the Scan Count, watching to see if it is increasing or decreasing as you tune the query.
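For example, a join against Northwind produces a separate statistics line, including a Scan Count, for each table involved (a sketch; the exact counts depend on the join plan SQL Server chooses):

SET STATISTICS IO ON

SELECT o.OrderID, od.ProductID, od.Quantity
FROM Orders o
INNER JOIN [Order Details] od ON od.OrderID = o.OrderID

The output will include one “Table …” line for Orders and one for Order Details, so you can watch how each table’s Scan Count changes as you rework the join.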

Logical Reads: This is the most useful piece of data provided by the SET STATISTICS IO command. As you probably know, before SQL Server can do anything with data, that data must be in its data cache. In addition, you probably know that when SQL Server reads data from the data cache, it reads it in 8K pages.

So what is a logical read? The logical reads figure is the number of pages SQL Server had to read from the data cache in order to produce the results specified by your query. SQL Server never reads more or fewer data pages than it needs when executing a query. Because of this, the same number of logical reads will occur every time the same query runs against the same data, assuming the query plan has not changed.

Why is knowing the number of logical reads SQL Server takes to execute a query important to your query tuning? Because it is the one thing that never changes from one execution of a query to another execution of the same query. And because of this, it is the perfect gauge to use when you are query tuning to determine if you are succeeding or failing.

As you tune a query, if the number of logical reads goes down, then you know that your query is using less server resources and will result in increased performance. And if the number of logical reads goes up, then whatever you are doing to “tune” your query is hurting the performance of your query. The fewer the logical reads performed by a query, the more efficient it is, and the faster it will perform, assuming all other things are held equal.
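For example, compare the full table scan from earlier with a more selective version (a hypothetical rewrite; Northwind’s Order Details table has a clustered primary key on OrderID and ProductID, so the filtered query can seek to the rows it needs instead of scanning):

SET STATISTICS IO ON

-- Reads every page of the table
SELECT * FROM [order details]

-- Touches only the page(s) holding this order's rows
SELECT * FROM [order details] WHERE OrderID = 10248

The second query should report far fewer logical reads than the 10 reported for the full scan, which tells you directly that it uses fewer server resources.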

Physical Reads: What I am about to tell you here may be a little confusing, but it should eventually make sense once you think about it. First of all, a physical read refers to when SQL Server must go to disk to get data pages it needs to put into the data cache before it can execute the query. As I mentioned above, all data must be in the data cache before SQL Server can execute a query. Whenever SQL Server begins to execute a query, the first thing it does is to check and see if the data pages it needs are in the data cache. If they are, great. But if they are not, then SQL Server must go to disk and then move the pages it needs from disk into the data cache.

As you can imagine, it takes a lot more server resources for SQL Server to perform a physical read than a logical read. Ideally, for best performance, we should do our best to avoid physical reads when we can.

Now here is the part that is confusing. You should ignore physical reads when performance tuning your queries, and instead focus only on logical reads. How can this be, when I just said that physical reads are much more resource intensive than logical reads?

This is a true statement, but the number of physical reads SQL Server uses to execute a query cannot be reduced by query tuning. Reducing physical reads is also an important task for the DBA, but it is a focus on whole server tuning, not query tuning. When you tune queries, you have no control over the size of the data cache or how busy the server is, or whether or not the data your query needs is in the data cache or on disk. The only thing you can directly control through query tuning is the number of logical reads that are required to produce the results you need.

Because of this, you can safely ignore the Physical Reads data provided by the SET STATISTICS IO command. (Side note: one way to help reduce physical reads, and speed up your SQL Server, is to ensure your server has plenty of physical RAM.)

Read-Ahead Reads: Like physical reads, this data won’t be of use to you when query tuning. This value tells you the number of physical page reads that SQL Server performed as part of its read-ahead mechanism. To help optimize its performance, SQL Server reads physical data pages ahead of when it thinks your queries might need the data. The pages that are read-ahead may or may not be used, based on how well SQL Server guessed what your needs were.

In my example, notice that the read-ahead reads were 9, the physical reads were 1, and the logical reads were 10. Do these numbers add up? Yes they do. Here’s what happened when I ran the query on my server. First, SQL Server checked to see if the data pages it needed to fulfill the query were in the data cache. At about the same time, SQL Server figured out that they were not, and the read-ahead mechanism kicked in and read the first 9 of the 10 data pages it needed from disk, placing them in the data cache. When SQL Server checked to see if all 10 data pages were in the cache, it found that 9 of them were already there (because of the read-ahead mechanism), but one was missing. Because of this, SQL Server then had to retrieve the last data page itself and place it in the data cache. Once all of the data pages were in the cache, SQL Server was able to process the query. SQL Server is smarter than you thought, isn’t it?

So What Does All This Mean to You?
As I said at the beginning of this article, it is important when performance tuning queries that you have some scientific basis on which to judge whether or not your tuning efforts are paying off. The problem, as you remember, is that SQL Servers are dynamic, and using total query run time as your sole measurement of whether a particular query you are tuning is performing better or worse is not a fair way to judge your efforts.

A better way is to compare more scientific values, such as the number of logical reads or the CPU time it takes a query to run. So when you are performance tuning queries, you may want to begin using the SET STATISTICS IO and the SET STATISTICS TIME commands to provide you with the solid data you need in order to know exactly how successful your query tuning efforts really are.
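Putting it all together, a typical measurement session might look like this (again, never run the DBCC commands on a production server):

-- Clear the caches so each test run starts cold (test servers only)
DBCC DROPCLEANBUFFERS
DBCC FREEPROCCACHE

-- Turn on resource reporting for this connection
SET STATISTICS IO ON
SET STATISTICS TIME ON

-- Run the query being tuned, and record its logical reads and CPU time
SELECT * FROM [order details]

-- ...modify the query, clear the caches again, rerun, and compare...

-- Turn reporting back off when finished
SET STATISTICS IO OFF
SET STATISTICS TIME OFF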

Reposted from http://www.sql-server-performance.com/2002/statistics-io-time/

posted @ 2014-04-20 21:40 princessd8251