How many times have you heard or uttered those words? You work hard and understand the business requirements, you try and get the logic to work to make the procedure return what was expected or update the data as expected. You do some unit tests in your environment, it may not scream but it comes back in what seams like a reasonable time, the results appear right and you say, “Ship It!” It passes QA (if you are someplace that has a proper QA process) and it gets to production. Then the DBA walks by (or a user calls you) and something is wrong… It isn’t returning, or it is taking far too long or timing out even.
Why? It “Worked fine in development”!!!
If you have experienced that with SQL Server development (and who hasn’t at least once) then I am hopefully writing to you here. I will warn you ahead of time, though: some of the things in this post may seem obvious. I think we need to take a quick back to basics approach here though based on most of the cases I see…
The Basics
- How much data is in dev?- Seems kind of basic but it is often overlooked. Especially without a production mirroring Test/UAT/Performance environment (Which more environments than you think suffer from). If your development database has a fraction of the data in production you may be in for some surprises.
- Parallelism – Most machines now have more than one (physical or logical) CPU but what if your development environment doesn’t and prod does? (or vice versa which would be kind of odd). What if you don’t mirror your Degree of Parallelism settings? You could go to production and a query that seemed to perform well is now not. One recent example I saw had a query plan that was influenced by parallelism choosing an index scan when it should have been doing a seek. It was reading far too much data but all things were the same between dev and prod except for the number of procs (1 to 8). The fix wasn’t to disable parallelism, the fix was an important missing index. In development the plan was still inefficient but it didn’t “feel” inefficient to the developer.
- You do look at your query plans, right? – As I blogged about here, reading a query plan isn’t rocket science. Once you start looking at them they become clearer over time. The real danger, however, is not looking at them. Even if your query comes back in half a second in dev, look at the plan and make sure you aren’t doing anything silly.
- While you are at it what about IO Statistics or Profiler? – Whenever I am tuning a query, I generally have SET STATISTICS IO ON at the top of that query. I want to see what the reads look like. You will find debates about tuning to reads but from my point of view it is important to keep your reads lower. Not to the point where doing so increases duration because of some crazy logic. Generally it is my finding that the less reads a query does, the less duration it takes. Not always true but look at your reads in dev. If it is high there and in production there are a lot of similar queries and more than the one user in dev you may be asking the production server to flush pages from memory at a faster rate than you would otherwise like. This means more physical I/Os, which are not as happy as logical I/Os. Profiler is a great tool (as Gail blogged about here) to show you the impact of a UDF. Your query plan and I/O Statistics are not going to show you the impact of that UDF you have in your select list because you are a procedural programmer turned SQL Developer. In fact don’t just look at the duration/reads in profiler but look through Gail’s post and look at the statement completed events to see just how many times that UDF code that was so efficient alone is being called.
- You do keep your statistics updated, right? – Sounds like a “No Duh” but it isn’t in a lot of places. How often do you update your statistics in the database? When you refresh dev, do you update statistics? Do you have a job to do so in Prod as well? If your statistics are out of date, your query plans may not make a lot of sense and they could be sub-optimal. The optimizer is a powerful tool but it can only be as good as the information it has when figuring out an approach.
- How often do you compare schemas? – Your dev environment does look like production, right? What about your indexes? Do you ever run a tool like Visual Studio Team System for Database Professionals or SQL Compare from Redgate (not a paid endorser but this is the tool I personally use for schema comparisons, even with VSTS DB in house which I like for other reasons), etc? You may be surprised but I have seen a lot of cases where a developer has created an index in dev or even in QA/UAT/Performance (Why, oh why, did you give the developers permissions to those environments…) and got quite busy and forgot that they meant to check that the index was scripted and included in a deployment plan.
There are other causes and there are even some strange occurrences that take a bit of time digging into. Generally, though, these are the reasons I tend to uncover in this scenario. Another cause is not meant as a slight against any one role in a project, it is something we all mutually contribute to:
The development is NOT done when the expected results are returned or the expected actions occur!
I have seen overworked developers and DBAs, under funded project managers (resource and time wise), over zealous management (as far as this date will be met, no matter how realistic it is) and a variety of other causes to that rant above. The simple truth is, development is done when it performs the expected action, the unit tests in dev have verified it and the unit tests also looked at performance and the piece of code is performing as best as it possibly can given the circumstances. Low hanging fruit type problems abound in most environments I have encountered. I know the projects never have enough time but sometimes 20 minutes extra on a procedure could have meant that one wouldn’t have showed up on a “worst offenders” list.
I don’t want to put all of the onus on development either. DBAs should setup standards and best practices and have code reviews. They should mentor their teams in performance areas. Senior developers should do the same.
Management needs to put an emphasis on proper testing, including performance testing (either in dedicated environments that mirror production hardware and data as best as possible or in a test environment that is as close as possible). There will always be performance problems that slip past development just like non-performance bugs. Without searching for them before production, it’s your users that will be your bug reporters. That could mean the difference between retaining a client or system adoption.
Ha, I see this all the time 😉
I wrote about something similar in this postYour testbed has to have the same volume of data as on production in order to simulate normal usage where I demonstrate that badly written queries on a system with a lot more data will perform even worse than on a system with a lot less data
Our development environment is all virtual and, therefore, is all 32-bit. All of our production servers are 64-bit. We literally HAVE to test performance enhancements in production because of the difference…
Kevin, while I agree that 32bit and 64bit and virtual environments/physical environments have their differences that is still not really a valid reason for not testing something first, in my opinion. Sure you may not be able to do a complete and thorough performance analysis going through all aspects of what a performance test would look like.
You can still replicate activity, trace it, look at the query durations. look at the reads performed by the query. You can also at least look at your query plans, your IO STATISTICS, TIME STATISTICS and make sure you are doing some level of tuning.
You could even run perf testing scenarios through your application in dev and just realize that your hardware may provide you with some false bottlenecks but interpret the results with that in mind.
It’s better than just saying "Will this index help the reads and will it hurt any insert/update/delete operations?" and putting it directly into prod. Same with proc changes, etc.
I think you agree with that and you are more or less talking about the true thorough performance test but just in case someone misreads 🙂 Can’t you have 64 bit virtual environments also? I am fairly confident you can with VMware and would imagine you can also with Hyper-V?