If you're looking for help with C#, .NET, Azure, Architecture, or would simply value an independent opinion then please get in touch here or over on Twitter.
One of the primary concerns of any application is how to store and retrieve information and Azure offers many options both familiar and new.
Although some of the newer cloud oriented options such as Table Storage can be more cost effective and scalable, relational databases are still an important part of most cloud applications and Azure presents a number of options for managing relational data with the most accessible and familiar probably being SQL Database. However SQL Database is most definitely not SQL Server as you know it and comes with a number of limitations and differences that will mean you either need to develop your application using techniques you may not have with SQL Server or consider other approaches. Picking up an existing codebase built against SQL Server and dropping it straight into SQL Database might get you started but is likely to run into difficulties.
Below I’ve looked at some of the options and hopefully my experiences with them will help you decide which way to go. I’ve got the most to say about SQL Database simply because it’s the one I’ve used the most and I suspect will be what most people are interested in.
The most important thing to realise about SQL Database is that it’s basically a $5 database running in the cloud, it’s not SQL Server running on a big piece of tin in your data centre dedicated to serving requests for your application.
That said the vast majority of the time in practice I’ve found it to perform extremely well but, like all cloud services, you do need to begin using it with your eyes open.
When you create a SQL Database in Azure your database is running on what is effectively a multi-tenanted SQL Server cluster that is attempting to balance your needs with the other tenants while still allowing Microsoft to operate this at the price it does. You have a SQL database but you have no guarantees over server resources (there is now a Premium Database option that I will touch on towards the end of this post).
The first place you’ll catch sight of this is in the SLA:
A 5-minute interval is marked as unavailable if all the customer’s attempts to establish a connection to SQL Azure fail or take longer than 30 seconds to succeed, or if all basic valid read and write operations (as described in our technical documentation) fail after connection is established.
What does this mean in practice? Lets say you have a website that is moderately active and user activity results in it making 1000 requests to your database every 5 minutes and those requests are all short, low impact, read operations that typically take a few milliseconds to resolve in normal circumstances.
However lets say Microsoft suffer some network difficulty within their data centre or another tenant causes issues with the server and that causes a dramatic change in behaviour of your database requests resulting in 999 of those requests taking 40 seconds to succeed and 1 of them takes 30 seconds to succeed.
At this point your customers are probably giving up on your website but SQL Database is still within it’s SLA and Microsoft are under no obligation to either give you service credits (which is probably the least of your concerns at this point) or more importantly resolve the issue.
How likely is this to happen? I’m not sure I can give you odds but I can say that it’s rare based on sustained usage however it’s possible and it’s happened to me once and it lasted, intermittently, for 48 hours. On that occasion despite it being an acknowledged fault in the data centre (firewall) the Azure Operations Team initially refused to resolve it as the issue was within SLA. Fortunately common sense prevailed and despite being within the “letter” of the SLA it was not within the “spirit” of Azure and the situation was ultimately resolved.
What is far more likely, actually it’s a certainty, is that you will find yourself being throttled by SQL Database either due to your own usage or the usage of another tenant. This will present itself to your application as a SQL exception with a specific error code and a request to retry in 10 seconds time. These errors are known as transient faults and in most situations you try again and things will go through fine.
Because SQL Database is multi-tenanted it’s almost impossible to say when you’ll be throttled but something I’ve seen from experimentation and observation is that as the number of simple low read select queries per second grows past 220 throttling begins to kick in. That’s by no means hard and fast but it’s the best data I have to share.
However another cause of retry that I’ve seen on consistently busy systems seems to be caused by SQL Database doing a failover to another node on what looks to be a scheduled basis as it has a pretty consistent cadence of around 23 to 24 hours, presumably this is Microsoft moving the active node around to enable maintenance on the inactive nodes. This transient fault can be a little more problematic to deal with as it can take longer than 30 seconds for the failover to complete.
Ultimately you need to design your application to function with failure – this can mean simply accepting that sometimes things will fail or building in retry logic (it’s fairly easy to do and is generally recommended) and considering how long you want to wait balancing off the expense of immediate failure against the expense of recovery. In either case ultimately you do have to design for failure – but this is no different to a sound data centre applications architecture.
Finally I want to touch on database size – SQL Database is limited to an individual database size of 150Gb and at that capacity is an expensive monthly resource when compared to, for example, Table Storage. By comparison 150Gb of SQL Database will cost, at the time of writing, $225.78 per month whereas 150Gb of Table Storage will cost $14.25 per month.
SQL Database has many more limitations when compared to a big tin SQL Server installation and it’s well worth familiarising yourself with these in depth and the documentation is a good starting point.
As long as you make yourself aware of these limitations and plan to work with them (and there are many techniques to deal with these constraints that I’ll go through in more depth in the future) then SQL Database is still an effective solution for relational data within Azure and, certainly if you’re starting with a greenfield project, will probably be your best option for relational data.
Premium SQL Database
Recently Microsoft introduced Premium SQL Database which comes in two flavours both of which give you a guarantee of a minimum level of resource however it comes at a steep price. The service is currently running in preview with a 50% discount on the price but even at that a single 1.6GHz core with 8Gb of memory will cost you $479.25 per month for a single 150Gb database. A dual 1.6GHz server with 16Gb of memory will cost basically twice that at $944.25.
You still need to be aware that this is still a multi-tenanted offering and if you exceed the resource limits you have paid for then your application is subject to the same restrictions as standard SQL Database and in fact you may suffer transient faults in any case due to things like server failover.
As the service is in preview there isn’t an SLA available yet but I would expect it to be very similar to the existing SQL Database SLA, certainly in terms of service availability.
My recommendation would be to only use the Premium Database if you are porting a project from a traditional data centre into Azure or have a very specific dependency on SQL but I would look carefully at if this makes economic sense and consider concluding the port with code change to support standard SQL Database.
It’s fairly straightforward to run SQL Server inside Azure using Virtual Machines and in fact you can deploy this using a gallery item that also includes the SQL Server license. This approach can work but there are a number of things to be aware of.
Firstly the cost – you need a SQL Server license to deploy this and, certainly compared, to SQL Database that isn’t cheap. A two node SQL Server cluster using the Standard edition of SQL Server and large virtual machines (4x 1.6GHz cores and 7Gb of memory) will cost you $1354.08 per month and that’s before you factor in storage costs but it does include licensing.
Secondly on top of that you have to manage this yourself just as you would a standard SQL Server and finally the real kicker: it still might not behave as your on premise SQL Server behaves as you are running on top of Azure storage and the network latencies and other characteristics will almost certainly be (at best) different to your on premise solution.
Like Premium SQL Database I would really only recommend this if you are porting a project from a traditional data centre into Azure or you have a very specific requirement (for example there are many SQL Server features not available in SQL Database) but you really need to look and see if it makes economic sense.
MySQL has always been a popular database in the open source world, at least until Oracle bought it! It and it’s derivatives are completely viable to run within Azure and may well be a good option for you. I’m not going to go into the in’s and out’s of MySQL but there are two fairly straightforward options for running this within Azure.
Firstly you can run your own virtual machine and install MySQL – personally if doing this I would run a Linux virtual machine, it’s cheaper than an equivalent Windows machine and is MySQLs natural home. Like running SQL Server yourself if you take this route you are responsible for all the management activities you would need to undertake if you were running it on premise.
Alternatively cleardb provide a MySQL as a service option available from the Azure store. I can’t give any real information based on experience however this blog is running WordPress underpinned by this service and setup was simple. As a service administration and backup activities are taken care of for you and they offer a free option to get you started.
Azure offers a variety of ways for you to handle relational data, and ultimately through virtual machines you can host any database you like, but for most applications SQL Database is going to prove a cost effective and reliable option with the added benefit of familiarity to most .Net developers. Just make sure you bear in mind it’s limitations and design your application accordingly. I’ll be coming back to this in a future post.