
Temptation – When The Cloud Looks Like Heaven (but isn’t…)


A heaven? A gateway? A hope? Let’s continue the SQL Server Regrets series… The planned next post was about the cloud anyway; my reminder for this post was the really long day a lot of tech folks relying heavily on AWS experienced today… Another cloud-based, DNS-related outage/wave of interruptions.

A heaven… A hope? Someone else can worry about the tech stack – we just worry about our little section and save all that administrative overhead. The closer you get to “serverless” or Platform as a Service DB options, the less risk. Right? All of the problems and headaches are gone. Automatic backups, SLAs met, simple and always-on technology that just works and gets billed by the hour and by the usage. What could go wrong?

Once again, today, we all got to taste a little bit of that gateway… For us – some clients who use Zoom had issues on the meetings where we weren’t using Teams (but that’s only because today’s outage was AWS’ and not Azure’s, and they trade places…), and our ticketing and PSA tool had some note-writing and e-mail-writing features down, so we couldn’t make proper notes in tickets. The kids had some trouble with their online learning at home and I had some other minor annoyances throughout the day. That little piece of heaven was interrupted for millions today.

“Temptation…” It’s that little whisper in every architecture review: “You don’t need to understand all of this. Someone smarter already figured it out for you!” It’s the comfort of abstraction… Just trust the service without testing for failure. Assuming the cloud will keep you safe from the very entropy you used to prepare for and drill for.

But that’s not true faith (we’ll have to feel so extraordinary in a later post about security, though…) That’s neglect wrapped in convenience.

This is not an anti-cloud post. It’s not an anti-PaaS post. It’s a post to inspire you to think before you find yourself all twisted up, down, and turned around by not thinking as you plan your migrations to the cloud…

The Temptation of Ease

It usually starts with good intentions. You’re tired of managing patch nights, failed jobs, and weekend DR drills. The budget presentation slides write themselves (less maintenance, predictable costs, scalability, and innovation all with the click of a button and charge on a credit card!) You can focus on what matters… Your business. Your clients.
And that’s true, at first glance.

But the same abstraction that makes it appear effortless also blinds you to the details that keep you safe. When the console hides the storage layer and your engineers no longer know where your backups live, you’ve traded understanding for convenience.

“Oh, you’ve got green eyes… Oh you’ve got blue eyes… Oh you’ve got grey eyes…”

-The promise looks different from every angle, but it’s still temptation.

The closer you get to “serverless,” the more faith you place in someone else’s SRE, someone else’s DR plans, someone else’s idea of acceptable risk. That’s fine, we support a lot of clients in PaaS – and we’ve helped many go there when it makes sense. It’s perfect… if you’ve read their fine print and tested your plan B.

The Layers of Responsibility

The move from your own datacenter to IaaS to PaaS is often described as a ladder of progress. It’s also a sliding scale of control and responsibility and ability to fix crap.

On-Prem / Data Center – You own the hardware, the OS, the SQL installation, and the consequences. It’s all yours… every patch and outage, but also every choice and contingency. In your own data center – the buck truly stops with you and you are as good as your team and emergency team.

IaaS (Infrastructure-as-a-Service) – The cloud provider takes on the physical work (power, cooling, hardware failure and hardware updates.) The logical stack is still yours, though. You patch, you configure, you tune. It feels familiar, and that’s its appeal to many. But when a region or AZ goes dark, you still need to know how your availability and backups are handled. And maybe before a region goes dark – you could have sorted out how to handle a DNS or Compute issue in an entire region or zone in a region…

PaaS (Platform-as-a-Service) – This is where temptation really blinks the most at you. You have less responsibility, but you also lose meaningful control (if you are into controlling and fully understanding your infrastructure… pros and cons – I get it – and you have to pick your poison.) You can’t RDP into a box and fix what broke. You’re designing around someone else’s engineers, deployment cadence, and limitations – even when they roll out patching (when we recently pushed the critical SQL Server security patches, our RDS clients had to wait to get them until the folks at AWS agreed they were important and ready.) SLAs improve on paper, but real resilience still depends on your design. There can be “truth” in providers telling their customers yesterday, “well it wasn’t our fault… AWS had this big outage….” – but not everyone experienced it the same – not every AWS-reliant shop was down – they built their redundancies in. They didn’t fall victim to the temptation of ignoring it all and leaving it in AWS’ hands.

Some say AWS support is better than Azure’s. Others swear the opposite. After helping clients in both (and adding GCP to the mix), I can tell you: support in all three leaves a lot to be desired when things are truly down. You may get polite tickets and escalation paths, but in the heat of an outage, the only guaranteed resource you have is your own preparation. Sadly, we’ve seen clients burned by AWS support just as badly as by Azure support.

Before you go too far down the PaaS path, ask yourself: how hard will it be to leave?
I’ve got a draft post titled, “Azure Managed Instances: You Can Check Out But You Can Never Leave.” It’s funny until you live it. Migrating into the cloud is rarely hard. Migrating out can be a different and even more painful story… Microsoft has since made it easier to leave Managed Instance – so that post will stay in drafts – but it’s not always as easy as one thinks.

The pro of PaaS is clear: less operational support needed from your team, automatic patching, reduced overhead. The cost is freedom and control. Make the trade knowingly.

Building Paranoid Resilience

A healthy paranoia is part of any good DBA’s DNA. That mindset shouldn’t fade just because your servers live in someone else’s datacenter.
Build like everything will fail… someday it will. “Bolts from above hurt the people down below…” Failing to plan for failure is the biggest sin any organization can make.

  • Design for multi-region awareness.
  • Keep backups under your own encryption keys.
  • Test recovery through deliberate chaos (not just tabletop exercises – but do those, too!).
  • Question the assumptions baked into your cloud provider’s SLA language – remember they are basically promising to give you a few bucks back if they are down… Remember 99% latency guarantees on GP3 drives in EC2 mean you can have terrible IO for 14 minutes a day and they don’t even breach their SLA…
  • Make sure your monitoring still works when DNS doesn’t. Because this lesson is at least an annual lesson for the world…. And your customers are not stupid. They won’t buy “but the devil [cloud vendor] made me do it.” as an excuse….
  • Before you go PaaS, make sure you understand what “restore” actually means and how fast it can happen.
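
The SLA arithmetic in the list above is worth running for your own numbers. Here is a minimal sketch in Python; the function name and the 30-day month are illustrative assumptions, and how a real SLA measures its window and defines a miss varies by provider and service, so read the actual terms:

```python
def allowed_minutes_outside_sla(sla_percent: float,
                                period_minutes: float = 24 * 60) -> float:
    """Minutes per period a service can miss its target without breaching the SLA.

    Pure arithmetic: it assumes the SLA is a simple percentage of the period,
    which is an assumption, not any provider's actual measurement method.
    """
    if not 0 <= sla_percent <= 100:
        raise ValueError("sla_percent must be between 0 and 100")
    return period_minutes * (100 - sla_percent) / 100


# Compare a few common targets per day and per (assumed) 30-day month.
for sla in (99.0, 99.9, 99.99):
    per_day = allowed_minutes_outside_sla(sla)
    per_month = allowed_minutes_outside_sla(sla, period_minutes=30 * 24 * 60)
    print(f"{sla}% -> {per_day:.1f} min/day, {per_month:.0f} min per 30-day month")
```

A 99% target works out to 14.4 minutes a day, which is where the “14 minutes” in the bullet comes from. Put the number next to your own recovery time objective before you decide the guarantee is good enough.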

The clients who weather outages best are the ones who never stopped thinking like DBAs even after moving to the cloud.
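
One way to keep monitoring useful when DNS doesn’t work is to have the probe fall back to a pinned IP address and report the DNS failure separately from the database failure. This is a rough sketch, not a finished monitoring tool; the `probe` function, the `ProbeResult` fields, and the idea of keeping a pinned IP per server are all assumptions for illustration:

```python
import socket
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ProbeResult:
    dns_ok: bool             # did the hostname resolve?
    tcp_ok: bool             # did a TCP connection open?
    address: Optional[str]   # the IP that was actually tried


def resolve(hostname: str, port: int) -> str:
    """Return the first IP address the system resolver gives for hostname."""
    infos = socket.getaddrinfo(hostname, port, type=socket.SOCK_STREAM)
    return infos[0][4][0]


def can_connect(ip: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to ip:port opens within the timeout."""
    try:
        with socket.create_connection((ip, port), timeout=timeout):
            return True
    except OSError:
        return False


def probe(hostname: str, port: int, pinned_ip: str,
          resolver: Callable[[str, int], str] = resolve,
          connector: Callable[[str, int], bool] = can_connect) -> ProbeResult:
    """Check a server by name, falling back to a pinned IP if DNS fails."""
    try:
        address = resolver(hostname, port)
        dns_ok = True
    except socket.gaierror:
        # DNS is broken: test the server itself via the last known address.
        address = pinned_ip
        dns_ok = False
    return ProbeResult(dns_ok, connector(address, port), address)
```

A call might look like `probe("sql01.example.internal", 1433, "10.0.0.5")`, where the hostname and IP are made up. A result of `dns_ok=False, tcp_ok=True` tells you the database is fine and name resolution is the problem, which is a very different page to send at 2 AM. The pinned IP only helps if you keep it current, and the alert itself has to travel over a path that doesn’t depend on the same DNS.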

Before You Make the Move: A Quick Checklist

Map dependencies. Know what talks to what… Know where it breaks if a region fails.
Define ownership. Write down exactly who’s responsible for backups, patches, and DR… You or the provider?
Export what matters. Keep critical data and scripts retrievable outside the same cloud credentials. Consider cross-cloud backups of your data in an immutable backup solution at another vendor. Save your runbooks. LOCALLY. SOMEPLACE NOT IN A CLOUD (Remember SharePoint and Jira are in the cloud… Your ticketing tool IS IN THE CLOUD… Your email provider IS IN THE CLOUD…. Realize that when the crap hits the fan in the cloud it’s not just your environment – it’s your support infrastructure… Assume it will all fail – and build around it. TODAY… Or learn the lesson again when we all laugh about DNS in 3, 6, 9, or 12 months…)
Test failover regularly. Practice, don’t just trust documentation.
Watch your costs. Set budgets and alerts; performance spikes can bankrupt confidence. I’ve not yet met a customer who went PaaS who hasn’t been shocked by the costs to get the performance they thought they were going to get… And listen to your DBAs and consultants – tuning queries will reduce your costs – and fast…
Document reality. After migration, update your runbooks. Old assumptions kill fast during an outage.
Keep local expertise. Don’t outsource your knowledge along with your hardware. I’m partial to smart consulting firms like ours – and we would love to partner with you in our DBA as a Service and you should consider us – but keep knowledge in house as well. Don’t outsource it all… Work with folks who want you to keep control and knowledge.
Count the cost. Moving fast feels good; paying for recovery later doesn’t. Understand the financial and operational weight of each choice before you click “migrate.”
Understand Your Support. Do you even have the right support level? Directly with the cloud vendor? Or through a third party?
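
The “map dependencies” item in the checklist above can start as something very small. Here is a minimal sketch, assuming you can write down which region each component lives in and what it depends on; every component name and region assignment below is hypothetical, and the code treats every dependency as a hard one with no failover:

```python
from collections import deque

# Hypothetical inventory: component -> region it runs in.
REGION = {
    "web": "us-west-2",
    "auth": "us-east-1",
    "sql-primary": "us-east-1",
    "sql-replica": "us-west-2",
    "reporting": "us-west-2",
    "ticketing": "us-east-1",
}

# Hypothetical map: component -> components it cannot run without.
DEPENDS_ON = {
    "web": ["auth", "sql-primary"],
    "reporting": ["sql-replica"],
}


def impacted_by_region(failed_region, region, depends_on):
    """Return components in the failed region plus everything that
    depends on them, directly or through a chain of dependencies."""
    # Invert the edges so we can walk from a failed component to its dependents.
    dependents = {}
    for component, deps in depends_on.items():
        for dep in deps:
            dependents.setdefault(dep, set()).add(component)

    impacted = {c for c, r in region.items() if r == failed_region}
    queue = deque(impacted)
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in impacted:
                impacted.add(dependent)
                queue.append(dependent)
    return impacted


print(sorted(impacted_by_region("us-east-1", REGION, DEPENDS_ON)))
```

With this made-up inventory, losing `us-east-1` takes out `web` even though it runs in another region, because it leans on `auth` and `sql-primary`. Note that `ticketing` is on the list too, which is the support-infrastructure point from the checklist. The hard part is keeping the inventory honest; the code is the easy part.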

It’s Not the Last Time

When it comes to these outages, we’ll never get to sing, “oh it’s the last time” and for a lot of future outages we’ll never even get to say “oh, I’ve never met an outage like you before…” These issues are here – and much like how all that we care about in this world is being consolidated to a few PE companies – most of the infrastructure is being consolidated to a few clouds. Plan for downtime. Build with resiliency in mind. Care and prepare for the worst. Failing to plan for failure is the big failure. And, for now, we humans still have agency over that. Avoid the temptation of thinking a heavenly gateway of hope called the magic of the cloud is magic…

We can help you on your journey to the cloud. We’ve helped some folks plan ahead. We’ve sadly helped many more who called us after they made their move and realized maybe they should’ve talked to folks who’ve seen the good, the bad, and the ugly before chasing the dreams and hope. I’d love to say that each call is the last time… But sadly it isn’t. Get some help before you make the move. The worst that happens is you made a small investment in hearing that your plans are sound. The best that happens is you have your plans shored up and avoid the costly regret that so many have experienced. We won’t let you hit the ground. Call us for some help – the contact buttons are all over the website.

Mike Walsh
Article by Mike Walsh
Mike loves mentoring clients on the right Systems or High Availability architectures because he enjoys those lightbulb moments and loves watching the right design and setup come together for a client. He loves the architecture talks about the cloud - and he's enjoying building a Managed SQL Server DBA practice that is growing while maintaining values and culture. He started Straight Path in 2010 when he decided that after over a decade working with SQL Server in various roles, it was time to try and take his experience, passion, and knowledge to help clients of all shapes and sizes. Mike is a husband, and father to four great children and lives in the middle of nowhere NH.
