When I posted about Empirical Evidence and finding a Troubleshooting Methodology, I talked about implementing a solution just because it seemed to work. This can manifest itself in different ways. Some are alright and harmless (when your shotgun blast happened to hit the right fix) and some can be dangerous and hide the real problem.
One area I see this displayed is query tuning. In my experience with SQL Server clients and employers, I often find developers overcomplicate or oversimplify the job of tuning (I really don’t know which it is 🙂 ).
A recent example of this is a query I was reviewing. There were HASH join hints in several of the joins. When asked why, the response was basically, “it worked, it helped force a join order and the query comes back faster”.
Looking through the query with a slightly more informed view, we saw that the real problem was not fixed with a join hint. Instead it was the common query culprits: poor indexes, out-of-date statistics on one of the tables, and some code that could be better organized. Adding some indexes and a slight change to the proc improved it dramatically, and it only took 20 minutes of experimenting to get it to work. We knew what to look for and try because of the query plan. Prior to simplifying and looking at the query plan, the simple things were missed and a join hint was added, masking the original problem.
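As a sketch of that kind of fix (the table, column, and index names here are made up for illustration, not from the actual query):

```sql
-- Before: a HASH hint masking the real problem
-- SELECT o.OrderID, c.CustomerName
-- FROM   dbo.Orders o
--        INNER HASH JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
-- WHERE  o.OrderDate >= '20080101';

-- After: remove the hint and address the underlying culprits instead
UPDATE STATISTICS dbo.Orders;  -- refresh the out-of-date statistics

CREATE NONCLUSTERED INDEX IX_Orders_OrderDate
    ON dbo.Orders (OrderDate)
    INCLUDE (CustomerID);      -- support the predicate and the join

SELECT o.OrderID, c.CustomerName
FROM   dbo.Orders o
       INNER JOIN dbo.Customers c ON c.CustomerID = o.CustomerID
WHERE  o.OrderDate >= '20080101';
```

With the statistics refreshed and a sensible index in place, the optimizer is free to pick the join order and join type on its own, which is usually what you want.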
My take on query tuning? It’s not that tough, it isn’t magic, and it normally doesn’t involve trickery. Let’s talk through some “pseudo-tuning”. There are plenty of posts out there going in depth on the operators, the plans and the fine print. I may post my own follow-up with more gory details as well, but check out the blog roll on this site for some great links. My goal in this post is to demystify and take the fear out of query tuning.
Keep it Simple Stupid
I enjoyed reading several of Dick Marcinko’s books when I was younger. He talked a lot about “Keeping It Simple, Stupid”. It makes sense in nearly every industry. If we presume something is difficult, we will make it difficult to grasp from the get-go. We will try to overcomplicate a solution or approach because “it can’t be that simple”.
Studying a query problem is no different. Yes, as you get deeper, some of the knowledge goes deeper, but the basic concepts remain simple. Without getting bogged down by operators yet in this post, let’s talk through a tuning experiment and some of the approaches I try to take. I will follow up with posts that go through some specifics with some example screenshots at some point.
Simplify the problem: A query is taking longer than it should. What do we look at first?
You can look at a few things at this point. If you are in a shared environment, you might want to quickly rule out blocking. Open a new query window so you are in a new session. In SQL Server 2005 and greater you can query one of several DMVs.
I normally take a look at sys.dm_exec_requests or sys.dm_os_waiting_tasks. In either of these, I am looking for my session and looking at the blocking_session_id column. If there is no value in there, you aren’t being blocked.
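That blocking check might look something like this sketch (the session ID 53 is just a placeholder; run SELECT @@SPID in the slow query’s window to get the real one):

```sql
-- Check whether a given session is blocked (SQL Server 2005+).
-- Replace 53 with the session_id of the slow query.
SELECT session_id,
       blocking_session_id,  -- 0 means not blocked
       wait_type,
       wait_time
FROM   sys.dm_exec_requests
WHERE  session_id = 53;
```

If blocking_session_id comes back 0 (or the session has no active request), blocking isn’t your problem and you can move on.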
Alright, so we ruled out a block really simply. Now what?
Oh No! The Query Plan, that XML or those lines, those words, those arrows!!!!
Boston Legal was on in the background when I started this post, so I actually am picturing a developer walking around their cube, nervously shuffling their feet like Jerry Espenson on the show.
So you’ve ruled out a block, what now? Some developers will just start adding huge indexes that include nearly every (if not every) column in the table. Some will add join hints, some will move join order around hoping for a change, some will start pacing (see above). But most developers will hopefully look at what SQL is trying to do, and look at the estimated query plan for the execution that just isn’t going anywhere. (Looking at statistics is not a horrible option here either, nor is looking at indexes that are sensible for your query; I’m not saying the only right option is looking at the plan.)
Alright, so I’ll look at the query plan
Good. So we ask for an estimated query plan. We see a series of operators and arrows. It can be overwhelming to look at a larger query’s plan for the first time. It can also be overwhelming to try to understand every operator you are seeing, but what are some basic principles to apply to our “pseudo-tuning” example?
- Read your query plan from right to left. The right-most operator is the first one working with the data, and the data is passed from right to left to the next operator.
- The size of the arrows indicates the amount of data being passed to the next operator. Hover over an arrow for details on the estimated number of rows and other information. Does it make sense?
- Look for the higher-cost operators. Each operator has an operator cost. The total of these costs makes up the total subtree cost on the left-most operator; the more expensive operators will have a higher cost percentage.
- A high cost percentage isn’t bad in and of itself. Even in a simple query with a low total subtree cost, the percentages must total 100%, so there may be a relatively expensive operator.
- But if you are having issues, the higher-cost operators should help point to a direction or an issue. The cost percentage itself isn’t the concern. Look at what the operator is, look at the number of rows, etc.
- When you look at your operators, look for scans, hash joins, and hash match aggregates. These are not all bad in and of themselves, but they can point to a potential missing index or an index not being utilized in a join or WHERE clause.
- When you see an expensive scan, look at your underlying index, and look at the predicate or join conditions for that table. Would an index on the columns involved make a difference? Can you get a seek instead of a scan by creating an index?
- Where you have an index that appears not to be used, ask what you are returning. Are you returning all the rows anyway? Maybe your index won’t be useful. But if your index is on a selective enough column and it isn’t being used, ask if you are doing something preventing it from being used. Is your search argument truly a search argument?
- Look for bookmark lookups (Gail Shaw has a post on this).
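To tie those pointers together, here is a sketch (with hypothetical table, column, and index names) of asking SQL Server for the estimated plan, and of the kind of index change that can turn a scan into a seek:

```sql
-- Ask for the estimated plan as XML instead of executing the query
SET SHOWPLAN_XML ON;
GO
SELECT ProductID, Quantity
FROM   dbo.OrderDetails
WHERE  ProductID = 870;   -- likely a scan if ProductID isn't indexed
GO
SET SHOWPLAN_XML OFF;
GO

-- An index on the predicate column, covering the query, can produce a seek
CREATE NONCLUSTERED INDEX IX_OrderDetails_ProductID
    ON dbo.OrderDetails (ProductID)
    INCLUDE (Quantity);

-- Keep the search argument a true search argument:
-- WHERE ProductID = 870 can use the index seek, but wrapping the
-- column in an expression or function generally cannot.
```

You can also click the “Display Estimated Execution Plan” button in Management Studio for the graphical version of the same plan; the XML form is handy when you want to save or share it.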
Those pointers are meant to show you that looking at a query plan and getting a direction isn’t complicated. You can of course go into much more detail, and that will happen as you start using the plans. Once you start with the simple stuff you will, if you have any interest, build on that knowledge, start searching for terms you find in query plans, and hopefully start understanding a lot of the nuances of SQL Server.
The point is to get in there and learn. Start with the simple and build on it. Once you get your information from the query plan, experiment with the solutions. Look at writing more efficient SQL, look at better indexing strategies, and experiment. Look at the query plan each time and see the effect.
Here are some resources I just found doing a quick search of my own memory, my feed reader and Google:
Inside SQL Server 2005 – T-SQL Querying. If you don’t have this book, you should. Query processing is explained as well as T-SQL constructs and pointers for performance.
Craig Freedman’s blog. Craig is blogging from the inside (like many developers/program managers at Microsoft) and has excellent posts on why the optimizer does what it does.
Denny Cherry has an excellent blog and his posts tagged with “back to basics” are just that. Some good information on indexing in particular. I seem to remember he had a good walk through of a query plan with pictures but I can’t seem to find it.
I list every feed I subscribe to in my Blog Roll also.
Hopefully this helps take away some of the fear of, or inattention to, query plans. It really is just a matter of applying some specific knowledge to a simple approach. When we go through some of the specifics in future posts, you’ll see that the same basic steps are used.