A Picture That Says, “Don’t Shrink!” (and other things…)

So I have written a lot of posts about shrinking (here, here, here). Some of those posts were turned into a wiki article on SQLServerPedia. I thought I was done posting about this, but here is one more.

I talked about why we shouldn’t do it on a regular basis, and why it should really be reserved for dev environments or for extenuating circumstances caused by a lack of proper planning. I also talked about why it is bad to let your databases grow in small chunks.

I can’t stress it enough: right-sizing your database is critical. That requires PLANNING. If you don’t plan for performance, performance will kill you in no time. If your company or your project manager says something like, “We don’t have time to plan for that,” then you will pay for it later at least 75% of the time, and you will pay more than you would have up front.

The picture below is from a system I got involved in after it had been “alive” (persistent vegetative state…) for a couple of years. It was set to do a DB shrink each night after a transaction log truncate (again –> don’t do this! If you want to recover to a point in time, use the FULL recovery model and TAKE LOG BACKUPS at a frequency that manages your recovery needs first and your transaction log growth second). This was done on every database on the instance.
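
If you’re not sure what that looks like in practice, here is a minimal T-SQL sketch. The database name, backup paths, and the 15-minute frequency are placeholders; pick values that match your own recovery requirements.

    -- Switch to the FULL recovery model (YourDB and the paths are hypothetical).
    ALTER DATABASE YourDB SET RECOVERY FULL;
    GO
    -- Take a full backup first so the log backup chain has a starting point.
    BACKUP DATABASE YourDB TO DISK = N'X:\Backups\YourDB_full.bak';
    GO
    -- Then back up the log on a schedule (for example, every 15 minutes via a
    -- SQL Agent job) so the log stays manageable and you can restore to a point in time.
    BACKUP LOG YourDB TO DISK = N'X:\Backups\YourDB_log.trn';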

Lots of Red and Lots of Fragments

This is from the Windows defragmenter tool. A quick reminder: blue (there are about 5 or 6 skinny blue lines in the picture above) indicates contiguous space with no physical file fragmentation; red indicates fragments. No, I didn’t play with the colors. Most of the other drives on this system remind me of Christmas, too.

What does this mean? It means that I/O operations against this drive have the deck stacked against them. Reads will not perform as well as they should. (For instance, copying a 1 GB .ldf file off of this data file drive to move it to a dedicated log file drive took 4 minutes with no other activity on the drive. That works out to roughly 4 MB/s.)

How did it get this way? Every night the files on this data file drive were shrunk as part of a shrink database operation. Every night after that shrink, an ETL process loads the databases back up. The files then grow in chunks during the ETL (and some of them grow during the day with use, like the report server temp database). That constant cycle of shrinking and growing leads to physical file fragmentation.
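
If you want to see how your files are sized and how big those growth chunks are, sys.master_files will show you. A quick sketch (size and growth are stored in 8 KB pages):

    SELECT DB_NAME(database_id) AS database_name,
           name AS logical_file,
           physical_name,
           size * 8 / 1024 AS size_mb,
           CASE WHEN is_percent_growth = 1
                THEN CAST(growth AS varchar(10)) + ' %'
                ELSE CAST(growth * 8 / 1024 AS varchar(10)) + ' MB'
           END AS growth_setting
    FROM sys.master_files
    ORDER BY database_name, type_desc;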

How long did it take? Much like my current status on the scale (14.6 stone when I should be 12.5… I’ll let you Bing, Google, or ask Wolfram Alpha if you care), this didn’t happen overnight. It took a period of neglect. Neglect including:

  • The nightly shrinks and the incremental growths that followed them
  • Never planning for the expected space and proper sizing in the first place
  • Allocating the bare minimum to the drive (88% full right now) and adding more space from the SAN only as “needed” (a quick way to check free space from inside SQL Server is sketched just after this list)
  • Never running a defragmentation pass during maintenance
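
For the drive space piece, you don’t even have to leave Management Studio to get a rough read. The undocumented but long-standing xp_fixeddrives procedure lists free megabytes per drive letter:

    -- Free space (MB) per drive letter as seen by the SQL Server service.
    EXEC master.dbo.xp_fixeddrives;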

What do we do? Well, for the short term we are going to grow the databases to an appropriate size (after getting more space allocated to the drive first, obviously), back them up, stop the SQL Server services, remove any files that shouldn’t be on this drive, and defragment the volume.
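
Growing the files up front, instead of letting them creep along in small chunks, is just ALTER DATABASE … MODIFY FILE. A sketch with made-up logical file names and sizes (check yours with sp_helpfile or sys.master_files first):

    -- Pre-grow the data and log files to a planned size with a sensible growth increment.
    ALTER DATABASE YourDB
        MODIFY FILE (NAME = N'YourDB_Data', SIZE = 50GB, FILEGROWTH = 1GB);
    ALTER DATABASE YourDB
        MODIFY FILE (NAME = N'YourDB_Log', SIZE = 10GB, FILEGROWTH = 1GB);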

What do we do in the future? The shrinking operations were killed a while ago (except for one that was hidden in an ETL process, since removed). I will make sure a process is put in place to check for fragmentation more frequently, and I will keep fighting the fight to properly size databases from the beginning.
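
The shrinks in this case were scheduled jobs and an ETL step rather than the AUTO_SHRINK database option, but while you are hunting them down it is worth confirming that option is off everywhere too:

    -- Find any database with auto-shrink enabled (there should be none).
    SELECT name, is_auto_shrink_on
    FROM sys.databases
    WHERE is_auto_shrink_on = 1;

    -- Turn it off for any offender (YourDB is a placeholder).
    ALTER DATABASE YourDB SET AUTO_SHRINK OFF;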

Not My Job

Few phrases can irk me quite like that one. You don’t even have to say it; you can just live it. And I am guilty here also. In this environment, the databases are managed from SQL Server outward by the DBA, while the servers, storage, and so on are managed by the storage and server team. I put my trust in that team and never really looked at fragmentation. Oops. Strictly speaking it isn’t my job, and normally I go out of my way to worry about anything that could affect databases I care about, yet for some reason I never checked fragmentation here.

When talking to the sysadmin about I/O issues, I mentioned a desire to split some of the report server databases onto their own drives (large reports were running, and we were seeing a lot of blocking from the built-in report server procedures that read from and write to those databases). The reaction was immediate: “No, you have high fragmentation… let me show you,” and we opened the Windows defragmenter and looked at the fragmentation. My first reaction was guilt and shame for not checking. My second reaction, internally, was “How did you know this and not say anything?!” The conversation revealed that this had been a known issue for some time. Before the DBA role was filled, there was no one else on the applications team looking at database performance. How did this sysadmin expect project managers and developers without much systems or DBA knowledge to play Carnac and figure out the fragmentation?

The attitude that did the damage here? “I just create the disks and give them to the people who request them. I give them the space they asked for, and that’s it.”

You can’t do that. If you want to be a good DBA, you need to look above and below your part of the food chain, and if you see an area that needs improvement, SPEAK UP. The same goes for sysadmins and storage teams. You are one company with the same customers and the same goal: happy customers.

Reminder –> Are you looking at your physical file fragmentation levels? I am sure you are a great DBA who updates statistics and rebuilds indexes when appropriate. But do you look at your I/O subsystem? Do you check your physical fragmentation? Do you get blinded to the simple while hunting for a complex explanation? I do at times. Check your fragmentation levels and run some benchmarks. Fix this headache before it becomes a resume-altering situation.
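
If you want a quick, rough benchmark from inside SQL Server, per-file read and write latency from sys.dm_io_virtual_file_stats is a reasonable place to start. A sketch, not a substitute for a real I/O benchmark; the numbers are cumulative since the last restart:

    SELECT DB_NAME(vfs.database_id) AS database_name,
           mf.physical_name,
           vfs.num_of_reads,
           vfs.io_stall_read_ms / NULLIF(vfs.num_of_reads, 0) AS avg_read_ms,
           vfs.num_of_writes,
           vfs.io_stall_write_ms / NULLIF(vfs.num_of_writes, 0) AS avg_write_ms
    FROM sys.dm_io_virtual_file_stats(NULL, NULL) AS vfs
    JOIN sys.master_files AS mf
      ON mf.database_id = vfs.database_id
     AND mf.file_id = vfs.file_id
    ORDER BY avg_read_ms DESC;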
