Monthly Archives: October 2013

Saving Images to a Blob Container with Azure SDK 2.2 and VS2013

On the face of it this is quite an obscure post, but since it has bitten me and image upload is so common I figure it will bite others too. And to be fair, I have called this blog Azure From The Trenches, so inevitably it’s going to get a bit grubby every now and then!

I’ve been quite happily working on my companion application for this blog over the last few evenings, and tonight I ported it all over to VS2013, Azure SDK 2.2 and the various NuGet updates released alongside all that, and hit an odd problem: a simple image upload that worked previously no longer does.

Part of my application uploads images into blob storage via Web API and I’d been using code that looks somewhat like the below:

Image myImage = Image.FromFile("photo.png"); // paraphrasing here
using (CloudBlobStream uploadStream = blobRef.OpenWrite())
{
    myImage.Save(uploadStream, ImageFormat.Png);
}

This has been working with no issues at all (on my laptop, desktop and in Azure) until I ran the app after the upgrade this evening and then I began to get a NotSupportedException with the following root cause:

Microsoft.WindowsAzure.Storage.Blob.BlobWriteStreamBase.get_Length() is not supported

Cracking out dotPeek to take a look at the storage client code in the latest NuGet package reveals that yes, getting the length of a stream writing to a block blob (the kind I am using) is unsupported:

public override long Length
{
    get
    {
        if (this.pageBlob != null)
            return this.pageBlobSize;
        throw new NotSupportedException();
    }
}

However this is the exact same implementation as in the previous version of the client, so it’s not that – and it makes sense, right? You can’t really expect to get the length of a stream that is writing to a remote server.

This led me off into System.Drawing. I’ll spare you the grisly details, but if your Image instance doesn’t have raw data (and chances are it doesn’t) and you call Save without an EncoderParameters argument (and even sometimes when you do), then a native method called GdipSaveImageToStream is called, and this expects a COM stream. Your .NET stream is converted to a COM stream using a class called ComStreamFromDataStream, and unfortunately this class has a nasty habit of calling get_Length() on your .NET stream.

And this is what causes the crash.

It’s easy to fix: grab yourself a byte array or memory stream first and upload that. I’ve wrapped this into my application framework with an UploadImage method but essentially this does the following:

Image myImage = Image.FromFile("photo.png"); // paraphrasing here
using (MemoryStream ms = new MemoryStream())
{
    myImage.Save(ms, ImageFormat.Png);
    ms.Position = 0;
    blobRef.UploadFromStream(ms);
}
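Wrapped up as a helper, the fix looks something like the sketch below. The ImageUploadHelpers and ToPngStream names are mine, not part of the SDK; the returned stream can then be handed to the blob reference’s UploadFromStream method.

```csharp
using System.Drawing;
using System.Drawing.Imaging;
using System.IO;

public static class ImageUploadHelpers
{
    // Save to an in-memory stream first so that System.Drawing never
    // calls get_Length() on the non-seekable blob write stream.
    public static MemoryStream ToPngStream(Image image)
    {
        MemoryStream ms = new MemoryStream();
        image.Save(ms, ImageFormat.Png);
        ms.Position = 0; // rewind so the storage client reads from the start
        return ms;
    }
}
```

From there the upload itself is a one-liner: blobRef.UploadFromStream(ImageUploadHelpers.ToPngStream(myImage)).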

I’m still none the wiser as to why this has become a problem now but I guess with the move to VS2013 there is a lot of change and the layers of abstraction are now so deep that without spending hours reading Microsoft code I’m never going to know.

Relational Data in Azure

One of the primary concerns of any application is how to store and retrieve information and Azure offers many options both familiar and new.

Although some of the newer cloud-oriented options such as Table Storage can be more cost effective and scalable, relational databases are still an important part of most cloud applications, and Azure presents a number of options for managing relational data, with the most accessible and familiar probably being SQL Database. However, SQL Database is most definitely not SQL Server as you know it and comes with a number of limitations and differences, meaning you will either need to develop your application using techniques you may not have used with SQL Server or consider other approaches. Picking up an existing codebase built against SQL Server and dropping it straight into SQL Database might get you started but is likely to run into difficulties.

Below I’ve looked at some of the options and hopefully my experiences with them will help you decide which way to go. I’ve got the most to say about SQL Database simply because it’s the one I’ve used the most and I suspect will be what most people are interested in.

SQL Database

The most important thing to realise about SQL Database is that it’s basically a $5 database running in the cloud; it’s not SQL Server running on a big piece of tin in your data centre dedicated to serving requests for your application.

That said the vast majority of the time in practice I’ve found it to perform extremely well but, like all cloud services, you do need to begin using it with your eyes open.

When you create a SQL Database in Azure your database is running on what is effectively a multi-tenanted SQL Server cluster that is attempting to balance your needs with the other tenants while still allowing Microsoft to operate this at the price it does. You have a SQL database but you have no guarantees over server resources (there is now a Premium Database option that I will touch on towards the end of this post).

The first place you’ll catch sight of this is in the SLA:

A 5-minute interval is marked as unavailable if all the customer’s attempts to establish a connection to SQL Azure fail or take longer than 30 seconds to succeed, or if all basic valid read and write operations (as described in our technical documentation) fail after connection is established.

What does this mean in practice? Let’s say you have a website that is moderately active, and user activity results in it making 1,000 requests to your database every 5 minutes, all of which are short, low-impact read operations that typically take a few milliseconds in normal circumstances.

However, let’s say Microsoft suffers some network difficulty within their data centre, or another tenant causes issues with the server, and that causes a dramatic change in the behaviour of your database requests, resulting in 999 of those requests taking 40 seconds to succeed and 1 of them taking 30 seconds to succeed.

At this point your customers are probably giving up on your website, but SQL Database is still within its SLA and Microsoft are under no obligation to give you service credits (which is probably the least of your concerns at this point) or, more importantly, resolve the issue.

How likely is this to happen? I’m not sure I can give you odds, but I can say that based on sustained usage it’s rare. It is possible though, and it’s happened to me once; it lasted, intermittently, for 48 hours. On that occasion, despite it being an acknowledged fault in the data centre (a firewall), the Azure Operations Team initially refused to resolve it as the issue was within SLA. Fortunately common sense prevailed: despite being within the “letter” of the SLA it was not within the “spirit” of Azure, and the situation was ultimately resolved.

What is far more likely (actually it’s a certainty) is that you will find yourself being throttled by SQL Database, either due to your own usage or the usage of another tenant. This will present itself to your application as a SQL exception with a specific error code and a request to retry in 10 seconds’ time. These errors are known as transient faults, and in most situations if you try again things will go through fine.

Because SQL Database is multi-tenanted it’s almost impossible to say when you’ll be throttled, but from experimentation and observation I’ve seen throttling begin to kick in as the number of simple, low-impact select queries grows past around 220 per second. That’s by no means hard and fast, but it’s the best data I have to share.

Another cause of retries that I’ve seen on consistently busy systems appears to be SQL Database failing over to another node on what looks to be a scheduled basis: it has a pretty consistent cadence of around 23 to 24 hours, presumably Microsoft moving the active node around to enable maintenance on the inactive nodes. This transient fault can be a little more problematic to deal with, as it can take longer than 30 seconds for the failover to complete.

Ultimately you need to design your application to function in the face of failure. This can mean simply accepting that sometimes things will fail, or building in retry logic (which is fairly easy to do and generally recommended) and considering how long you are prepared to wait, balancing the expense of immediate failure against the expense of recovery. In either case you do have to design for failure, but this is no different to sound data centre application architecture.
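As a sketch of that retry logic (the TransientRetry name and the retry policy here are mine; real code would inspect SqlException.Number for the documented throttling codes such as 40501, “the service is currently busy”, rather than take a generic predicate):

```csharp
using System;
using System.Threading;

public static class TransientRetry
{
    // Runs an operation, retrying when isTransient says the failure is
    // worth another attempt. For SQL Database the predicate would check
    // SqlException.Number against throttling codes such as 40501.
    public static T Execute<T>(Func<T> operation,
                               Func<Exception, bool> isTransient,
                               int maxAttempts = 3,
                               int delayMilliseconds = 100)
    {
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return operation();
            }
            catch (Exception ex)
            {
                if (attempt >= maxAttempts || !isTransient(ex))
                    throw; // permanent fault, or out of attempts
                Thread.Sleep(delayMilliseconds); // back off before retrying
            }
        }
    }
}
```

Microsoft’s Transient Fault Handling Application Block does the SQL Database error-code detection for you and is worth a look before rolling your own.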

Finally I want to touch on database size: SQL Database is limited to an individual database size of 150GB, and at that capacity it is an expensive monthly resource when compared to, for example, Table Storage. At the time of writing, 150GB of SQL Database will cost $225.78 per month, whereas 150GB of Table Storage will cost $14.25 per month.

SQL Database has many more limitations when compared to a big-tin SQL Server installation, and it’s well worth familiarising yourself with these in depth; the documentation is a good starting point.

As long as you make yourself aware of these limitations and plan to work with them (there are many techniques for dealing with these constraints that I’ll go through in more depth in the future), SQL Database is still an effective solution for relational data within Azure and, certainly if you’re starting with a greenfield project, will probably be your best option.

Premium SQL Database

Recently Microsoft introduced Premium SQL Database, which comes in two flavours, both of which guarantee a minimum level of resource, but at a steep price. The service is currently running in preview with a 50% discount, but even at that a single 1.6GHz core with 8GB of memory will cost you $479.25 per month for a single 150GB database. A dual-core 1.6GHz server with 16GB of memory will cost basically twice that at $944.25.

Be aware that this is still a multi-tenanted offering: if you exceed the resource limits you have paid for then your application is subject to the same restrictions as standard SQL Database, and in fact you may suffer transient faults in any case due to things like server failover.

As the service is in preview there isn’t an SLA available yet but I would expect it to be very similar to the existing SQL Database SLA, certainly in terms of service availability.

My recommendation would be to only use Premium SQL Database if you are porting a project from a traditional data centre into Azure or have a very specific dependency on SQL Server, but I would look carefully at whether this makes economic sense and consider concluding the port with code changes to support standard SQL Database.

SQL Server

It’s fairly straightforward to run SQL Server inside Azure using Virtual Machines and in fact you can deploy this using a gallery item that also includes the SQL Server license. This approach can work but there are a number of things to be aware of.

Firstly the cost: you need a SQL Server license to deploy this and, certainly compared to SQL Database, that isn’t cheap. A two-node SQL Server cluster using the Standard edition of SQL Server and large virtual machines (4x 1.6GHz cores and 7GB of memory) will cost you $1,354.08 per month, and that’s before you factor in storage costs, though it does include licensing.

Secondly, on top of that you have to manage this yourself just as you would a standard SQL Server. And finally the real kicker: it still might not behave as your on-premise SQL Server behaves, as you are running on top of Azure storage and the network latencies and other characteristics will almost certainly be (at best) different to your on-premise solution.

Like Premium SQL Database, I would really only recommend this if you are porting a project from a traditional data centre into Azure or you have a very specific requirement (for example, there are many SQL Server features not available in SQL Database), but you really need to look and see if it makes economic sense.


MySQL

MySQL has always been a popular database in the open source world, at least until Oracle bought it! It and its derivatives are completely viable to run within Azure and may well be a good option for you. I’m not going to go into the ins and outs of MySQL, but there are two fairly straightforward options for running it within Azure.

Firstly, you can run your own virtual machine and install MySQL. Personally, if doing this I would run a Linux virtual machine: it’s cheaper than an equivalent Windows machine and is MySQL’s natural home. As with running SQL Server yourself, if you take this route you are responsible for all the management activities you would need to undertake if you were running it on premise.

Alternatively, ClearDB provide MySQL as a service, available from the Azure store. I can’t give much information based on experience; however, this blog is running WordPress underpinned by this service and setup was simple. As a service, administration and backup activities are taken care of for you, and they offer a free option to get you started.


Azure offers a variety of ways for you to handle relational data, and ultimately through virtual machines you can host any database you like, but for most applications SQL Database is going to prove a cost-effective and reliable option with the added benefit of familiarity to most .NET developers. Just make sure you bear in mind its limitations and design your application accordingly. I’ll be coming back to this in a future post.


Welcome (and a Table Storage date sorting tip)

Welcome to Azure From The Trenches, a blog where I hope to share information learned from using Azure to develop and run real production applications. The applications themselves shall largely remain nameless, but to illustrate some points I have a sample application to go along with this blog, which I’ll be publishing on GitHub shortly.

I really want to cover Azure in a broad sense so expect to see topics on subjects such as:

  • Architecture
  • Business
  • Code
  • Cost modelling
  • Deployment
  • Operating an Azure system
  • Scaling
  • SLAs
  • Testing

As well as sharing information I’d love for this blog to be visited and commented on by others working on Azure so that I can learn from you and improve my own understanding and future work.

As my companion application isn’t quite ready I thought I’d kick things off with a short tip for sorting table storage within a partition by date in a descending order. This had me scratching my head for a while when I first wanted to do this during my early use of Table Storage.

The solution relies on the fact that rows in a partition are stored sorted by the row key in ascending order. The row key is a string, so you can’t just drop a date in there, but as long as you’re dealing with recent dates you can quickly and easily place a string in the row key that will sort in ascending date order: the Ticks property of the DateTime class.

This is a long value that counts the number of ticks (100-nanosecond intervals) that have elapsed since midnight on 1st January 0001. If we zero-pad that as a string then the partition will be sorted by date:

RowKey = String.Format("{0:D19}", DateTime.UtcNow.Ticks);

To sort in a descending date order we simply need to subtract the current number of ticks from the number of ticks for the maximum date:

RowKey = String.Format("{0:D19}", DateTime.MaxValue.Ticks - DateTime.UtcNow.Ticks);
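To sanity-check the descending trick, here it is wrapped in a method (the RowKeys name is mine; it’s just the formula above):

```csharp
using System;

public static class RowKeys
{
    // Zero-padded "reverse ticks" key: later dates yield lexically
    // smaller strings, so ascending row-key order is newest-first.
    public static string Descending(DateTime when)
    {
        return String.Format("{0:D19}", DateTime.MaxValue.Ticks - when.Ticks);
    }
}
```

The key for an event one second later compares ordinally less than the key for the earlier event, so a plain partition scan hands you the newest rows first.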

When I first began to use Azure and Table Storage my instinct was always to run back to SQL but with some lateral thinking you can often bend Table Storage to your will and get better scalability, reliability and cheaper running costs – a topic I plan on coming back to.