Category Archives: Blob Storage

Uploading an image to a Blob Container via Web API

Handling image (or other binary object) uploads via Web API for storing in Azure blob storage without using the local file system (handy if, for example, you’re using Azure Websites) seems to be a frequently asked question.

I’ve posted my own attempt at solving this issue as a gist on GitHub. It seems to work, but I’ve not tested it in anger yet, only in fairly limited scenarios.

If you use my Azure Application Framework I’ve also added a GetMultipartStreamProvider method to the IAsynchronousBlockBlobRepository interface that provides a pre-configured implementation for a given blob container.
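For readers who want a feel for the approach, here is a minimal sketch of a stream provider that hands Web API a blob write stream instead of a file stream. It is not the gist or the framework implementation: the class name `BlobMultipartStreamProvider` and the generated blob names are assumptions made for illustration, and it assumes the `Microsoft.WindowsAzure.Storage` client and the Web API `MultipartStreamProvider` base class.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Net.Http.Headers;
using Microsoft.WindowsAzure.Storage.Blob;

// Hypothetical class for illustration only
public class BlobMultipartStreamProvider : MultipartStreamProvider
{
    private readonly CloudBlobContainer _container;

    public BlobMultipartStreamProvider(CloudBlobContainer container)
    {
        if (container == null) throw new ArgumentNullException("container");
        _container = container;
    }

    // Web API calls this once per MIME body part and writes that part's
    // bytes into whatever stream is returned.
    public override Stream GetStream(HttpContent parent, HttpContentHeaders headers)
    {
        // Assumption: every part is an upload and gets a generated blob name
        string blobName = Guid.NewGuid().ToString("N");
        CloudBlockBlob blob = _container.GetBlockBlobReference(blobName);

        // Carry the part's content type over to the blob when one was sent
        if (headers.ContentType != null)
        {
            blob.Properties.ContentType = headers.ContentType.MediaType;
        }

        // The returned stream sends data to blob storage as it is written,
        // so nothing touches the local file system.
        return blob.OpenWrite();
    }
}
```

In a controller this would be used along the lines of `await Request.Content.ReadAsMultipartAsync(new BlobMultipartStreamProvider(container));`. The blob is only committed when its write stream is closed, so it is worth checking that the uploads actually appear in the container in your own setup.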

Hope that’s helpful.

Using Blob Leases to Manage Concurrency with Table Storage

Azure’s table storage service allows for highly scalable and reliable access to large quantities of data, but if you come from a SQL background it can seem very primitive. There is essentially no support for transactions (OK, you can transact a batch, but that’s not often that useful) and only optimistic concurrency is supported within table storage itself. You can’t do much about the former, though there are some strategies that help (a topic for a future blog post), but there is a technique you can use if optimistic concurrency isn’t good enough and you want exclusive access to table storage resources for a period of time: essentially obtaining a lock.

The trick is in coupling table storage with blob storage to take advantage of the leasing functionality available on blobs. I frequently use this technique when I want to access or perform an update on data across multiple tables and be certain the data is going to be consistent.

There is a simple example hosted on GitHub here from which I’m going to highlight some of the code to illustrate how this approach works in practice.

Firstly we need to create a table entry and a blob to go with it:

[Code listing image: lease1]

You can see that this is fairly standard code for uploading a blob and inserting an entity. Note, however, that we’ve given the blob a name that matches the key of our table entity. We have no row key here, but if you did you’d form the blob name from the composite of the partition and row keys (unless you were interested in locking a range).
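As a rough illustration of that setup step, the sketch below inserts an entity and uploads a blob whose name is the entity's partition key. It assumes the `Microsoft.WindowsAzure.Storage` client and the storage emulator. The `SimpleEntity` type, the `LeaseSetup` class and the lowercase container name `leaseobjects` are assumptions for the example (container names have to be lowercase), not the code from the GitHub sample.

```csharp
using System.IO;
using System.Text;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Table;

// Hypothetical entity: partition key only, with an empty row key
public class SimpleEntity : TableEntity
{
    public SimpleEntity() { }
    public SimpleEntity(string key) : base(key, string.Empty) { }
    public int Value { get; set; }
}

public static class LeaseSetup
{
    public static void CreateEntityAndLeaseBlob(string key)
    {
        // Assumption: running against the local storage emulator
        CloudStorageAccount account = CloudStorageAccount.DevelopmentStorageAccount;

        // Insert the table entity
        CloudTable table = account.CreateCloudTableClient().GetTableReference("entities");
        table.CreateIfNotExists();
        table.Execute(TableOperation.Insert(new SimpleEntity(key) { Value = 0 }));

        // Upload a blob named after the entity's key; its content is
        // irrelevant, it only exists so that it can be leased.
        CloudBlobContainer container =
            account.CreateCloudBlobClient().GetContainerReference("leaseobjects");
        container.CreateIfNotExists();
        CloudBlockBlob leaseBlob = container.GetBlockBlobReference(key);
        using (var content = new MemoryStream(Encoding.UTF8.GetBytes(key)))
        {
            leaseBlob.UploadFromStream(content);
        }
    }
}
```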

Now let’s look at the code for a lease-protected table access:

[Code listing image: lease2]

The code inside the try block is the familiar code for accessing and updating entities in table storage. Before we touch table storage, however, we get a reference to our blob, again using the entity’s key as its name, and then call AcquireLease on the blob.

Importantly we do this with a timespan. It’s possible to indefinitely acquire a lease on a blob but this is not usually a good idea: if you suffer a crash (either your code or an Azure failure) you’re going to have a real problem on your hands as the blob will be leased by something that no longer exists.
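A minimal sketch of a lease-protected update along these lines, assuming the `Microsoft.WindowsAzure.Storage` client. The class and method names, the `Value` property and the 30 second lease are illustrative choices rather than the sample's own code. The blob service accepts finite lease durations of 15 to 60 seconds.

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;
using Microsoft.WindowsAzure.Storage.Table;

public static class LeasedUpdate
{
    // Hypothetical helper: increments an integer "Value" property on the
    // entity while holding a lease on the blob that shares its key.
    public static void IncrementUnderLease(
        CloudBlobContainer leaseContainer, CloudTable table, string key)
    {
        CloudBlockBlob leaseBlob = leaseContainer.GetBlockBlobReference(key);

        // Finite lease: if this process dies the lock frees itself.
        // Throws a StorageException (409) if someone else holds the lease.
        string leaseId = leaseBlob.AcquireLease(TimeSpan.FromSeconds(30), null);
        try
        {
            TableResult result = table.Execute(
                TableOperation.Retrieve<DynamicTableEntity>(key, string.Empty));
            var entity = result.Result as DynamicTableEntity;
            if (entity == null)
            {
                throw new InvalidOperationException("Entity not found: " + key);
            }

            EntityProperty existing;
            int current = entity.Properties.TryGetValue("Value", out existing)
                ? existing.Int32Value.GetValueOrDefault()
                : 0;
            entity.Properties["Value"] = new EntityProperty(current + 1);

            table.Execute(TableOperation.Replace(entity));
        }
        finally
        {
            // Release promptly so other callers are not kept waiting
            leaseBlob.ReleaseLease(AccessCondition.GenerateLeaseCondition(leaseId));
        }
    }
}
```

Note that the lease only protects the entity against other code that follows the same convention. Table storage itself knows nothing about the blob, so any writer that skips the lease can still update the entity.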

It’s important to consider how long you want the lease for – thinking hard about retry policies and how long a series of operations could theoretically take. This is an extremely simple example, but let’s assume you were updating two tables – how long could that take? Well, normally milliseconds, assuming you’ve keyed your tables well. But let’s assume both operations require a significant retry period. The default retry policy for the storage client (on version 2 through to 3.0.3.0) has a maximum duration of 120 seconds. So if all your operations (read table 1, read table 2, write table 1, write table 2) succeed but are at the upper range of this threshold then you are looking at around 480 seconds for it to fully complete. In my experience this is unlikely – but it does happen.
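The worst-case figure is simply the number of operations multiplied by the longest a single operation can take. As a small sketch (the `LeaseBudget` helper is hypothetical, and it assumes the operations run one after another):

```csharp
using System;

public static class LeaseBudget
{
    // Worst case: every operation runs right up to the retry policy's
    // maximum duration before it finally succeeds.
    public static TimeSpan WorstCase(int operationCount, TimeSpan maxPerOperation)
    {
        if (operationCount < 0) throw new ArgumentOutOfRangeException("operationCount");
        return TimeSpan.FromTicks(maxPerOperation.Ticks * operationCount);
    }
}
```

`LeaseBudget.WorstCase(4, TimeSpan.FromSeconds(120))` gives 480 seconds for the four operations described above.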

So to cover this let’s say you set your lease’s timespan to 490 seconds. It will cover the total possible operation time, but if there is an issue and your lease doesn’t get released due to an application crash (or Azure issue) then the entity you are attempting to lock cannot be updated again until the full 490 seconds have passed. You can mitigate this for an application error with a finally block, as in this sample code, but that won’t help you if your process dies. Bear in mind too that the blob service only accepts finite lease durations of between 15 and 60 seconds (or an infinite lease), so in practice a window this long has to be built from a shorter lease that you renew.

Another option open to you is to renew the lease between operations. There is a method on the blob called RenewLease that does exactly what it says on the tin. This can be an effective, if messy looking, solution, but it does come with a performance penalty: just as acquiring a lease takes time, so does renewing one. In most cases it is extremely quick, but you should be prepared for variance.
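One way the renewal approach might look, sketched against the `Microsoft.WindowsAzure.Storage` client. The `RenewingLease.RunSteps` helper and the idea of passing each operation in as a delegate are assumptions made to keep the example short:

```csharp
using System;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class RenewingLease
{
    // Hypothetical helper: runs each step under the lease and renews the
    // lease before every step after the first.
    public static void RunSteps(CloudBlockBlob leaseBlob, params Action[] steps)
    {
        string leaseId = leaseBlob.AcquireLease(TimeSpan.FromSeconds(60), null);
        AccessCondition lease = AccessCondition.GenerateLeaseCondition(leaseId);
        try
        {
            for (int i = 0; i < steps.Length; i++)
            {
                if (i > 0)
                {
                    // Resets the lease clock; costs one extra round trip
                    leaseBlob.RenewLease(lease);
                }
                steps[i]();
            }
        }
        finally
        {
            leaseBlob.ReleaseLease(lease);
        }
    }
}
```

This only helps if each individual step finishes within the lease period. A single step that outlives the lease is unprotected for the remainder of its run.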

There’s no magic answer and, as ever, it’s a series of trade-offs from which you need to pick the best fit for your use case. It’s so use case specific that it’s difficult to give general advice. However, it is reasonable to apply common sense: cater for the common case and treat exceptional cases as just that, exceptional. As long as you know the fault has happened you can do something about it later – just don’t put your head in the sand and ignore it.

With that aside, back to our example. Run the application and have it call the SimpleExample method shown below:

[Code listing image: lease3]

At the end of this you should see the expected output in the storage emulator – an entry in table storage in the entities table and a blob with a name that matches the partition key in the leaseObjects blob container.

Now let’s add a method that introduces a delay into the update process so we can force a collision and see what happens:

[Code listing image: lease4]

And finally let’s use that to run two updates concurrently with the task library:

[Code listing image: lease5]
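A minimal sketch of the delayed update and the concurrent run, assuming the `Microsoft.WindowsAzure.Storage` client and .NET 4.5 for `Task.Run`. The `CollisionDemo` class is hypothetical, and a `Thread.Sleep` stands in for the table reads and writes so that the example stays focused on the lease:

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class CollisionDemo
{
    // Holds the lease while sleeping, standing in for a slow table update
    public static void UpdateWithDelay(CloudBlobContainer container, string key, TimeSpan delay)
    {
        // Each caller uses its own blob reference
        CloudBlockBlob leaseBlob = container.GetBlockBlobReference(key);
        string leaseId = leaseBlob.AcquireLease(TimeSpan.FromSeconds(30), null);
        try
        {
            Thread.Sleep(delay); // table reads and writes would happen here
        }
        finally
        {
            leaseBlob.ReleaseLease(AccessCondition.GenerateLeaseCondition(leaseId));
        }
    }

    // Starts two updates against the same key at (nearly) the same time
    public static void RunTwoUpdates(CloudBlobContainer container, string key)
    {
        TimeSpan delay = TimeSpan.FromSeconds(5);
        Task first = Task.Run(() => UpdateWithDelay(container, key, delay));
        Task second = Task.Run(() => UpdateWithDelay(container, key, delay));
        try
        {
            Task.WaitAll(first, second);
        }
        catch (AggregateException ex)
        {
            // Report whatever the tasks threw, including the HTTP status
            // code when the failure came from the storage service.
            foreach (Exception inner in ex.Flatten().InnerExceptions)
            {
                var storageEx = inner as StorageException;
                if (storageEx != null)
                {
                    Console.WriteLine("Storage call failed with HTTP status "
                        + storageEx.RequestInformation.HttpStatusCode);
                }
                else
                {
                    Console.WriteLine("Update failed: " + inner.Message);
                }
            }
        }
    }
}
```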

You should find that a storage exception is raised on the AcquireLease line with a status code of 409 (Conflict). The first update already holds the lease, so the second attempt to acquire it fails. Depending on your use case you may choose to fail the operation entirely or catch the exception and use a backoff policy to retry later.
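If you go down the retry route, it might look something like the sketch below. It assumes the `Microsoft.WindowsAzure.Storage` client. The `LeaseRetry` helper, the 250 millisecond starting delay and the doubling backoff are illustrative choices, not a recommendation:

```csharp
using System;
using System.Threading;
using Microsoft.WindowsAzure.Storage;
using Microsoft.WindowsAzure.Storage.Blob;

public static class LeaseRetry
{
    // Hypothetical helper: tries to acquire the lease, waiting longer after
    // each 409 Conflict, and gives up after maxAttempts tries.
    public static string AcquireWithBackoff(
        CloudBlockBlob leaseBlob, TimeSpan leaseTime, int maxAttempts)
    {
        TimeSpan delay = TimeSpan.FromMilliseconds(250);
        for (int attempt = 1; ; attempt++)
        {
            try
            {
                return leaseBlob.AcquireLease(leaseTime, null);
            }
            catch (StorageException ex)
            {
                bool conflict = ex.RequestInformation != null
                    && ex.RequestInformation.HttpStatusCode == 409;

                // Anything other than "already leased" is a real failure,
                // as is running out of attempts.
                if (!conflict || attempt >= maxAttempts)
                {
                    throw;
                }
            }

            Thread.Sleep(delay);
            delay = TimeSpan.FromTicks(delay.Ticks * 2); // simple exponential backoff
        }
    }
}
```

The returned lease ID is what you later pass to `AccessCondition.GenerateLeaseCondition` when renewing or releasing the lease.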

Obviously the example here is somewhat simplistic and artificial but hopefully it illustrates how you can use this technique in more complex scenarios. And you can of course use the blob lease pattern in other concurrency scenarios.

Finally – the AccidentalFish.ApplicationSupport library on GitHub contains a dependency injectable lease manager you can use to simplify your code.

Saving Images to a Blob Container with Azure SDK 2.2 and VS2013

On the face of it the below is quite an obscure post but since this has bitten me and image upload is so common I figure it will bite others too. And to be fair I have called this blog Azure From The Trenches so inevitably it’s going to get a bit grubby every now and then!

I’ve been quite happily working on my companion application for this blog over the last few evenings and tonight ported it all over to VS2013, Azure SDK 2.2 and the various NuGet updates that have been released alongside all that and hit an odd problem – a simple image upload that worked previously no longer does.

Part of my application uploads images into blob storage via Web API and I’d been using code that looks somewhat like the below:

Image myImage = new Image(); // paraphrasing here
using(CloudBlobStream uploadStream = blobRef.OpenWrite())
{
    myImage.Save(uploadStream, ImageFormat.Png);
}

This has been working with no issues at all (on my laptop, desktop and in Azure) until I ran the app after the upgrade this evening and then I began to get a NotSupportedException with the following root cause:

Microsoft.WindowsAzure.Storage.Blob.BlobWriteStreamBase.get_Length() is not supported

Cracking out dotPeek to take a look at the storage client code in the latest NuGet package (2.1.0.3) reveals that yes, getting the length of a stream on a block blob (the kind I am using) is unsupported:

public override long Length
{
  get
  {
    if (this.pageBlob != null)
      return this.pageBlobSize;
    else
      throw new NotSupportedException();
  }
}

However this is the exact same implementation on the previous version of the client and so it’s not that – and it makes sense right? You can’t really expect to get the length of a stream that is writing to a remote server.

This led me off into System.Drawing. I’ll spare you the grisly details, but if your Image class doesn’t have raw data (and chances are it doesn’t) and you call Save without an EncoderParameters parameter (and even sometimes when you do) then a native method called GdipSaveImageToStream is called, and this expects a COM stream. Your .NET stream is converted to a COM stream using a class called ComStreamFromDataStream, and unfortunately this class has a nasty habit of calling get_Length() on your .NET stream.

And this is what causes the crash.

It’s easy to fix: grab yourself a byte array or memory stream first and upload that. I’ve wrapped this into my application framework with an UploadImage method but essentially this does the following:

Image myImage = new Image(); // paraphrasing here
using (MemoryStream ms = new MemoryStream())
{
    // Save to an in-memory stream, which supports Length
    myImage.Save(ms, ImageFormat.Png);
    // Rewind so the upload starts from the first byte
    ms.Position = 0;
    blobRef.UploadFromStream(ms);
}
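The same buffering idea can be shown without System.Drawing or Azure at all. The sketch below is a self-contained illustration, not the UploadImage method from the framework. `LengthlessWriteStream` is a hypothetical stand-in for a write-only remote stream that cannot report its length, and `BufferedSave.ViaMemory` is a hypothetical helper that lets any save routine run against a MemoryStream before the bytes are copied to the real target.

```csharp
using System;
using System.IO;

// Hypothetical stand-in for a write-only remote stream: like the block blob
// write stream described above, it throws if asked for its length.
public class LengthlessWriteStream : Stream
{
    private readonly MemoryStream _received = new MemoryStream();

    public override bool CanRead { get { return false; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return true; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }

    public override void Flush() { }
    public override int Read(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }

    public override void Write(byte[] buffer, int offset, int count)
    {
        _received.Write(buffer, offset, count);
    }

    // Exposes what was written so the example can be checked
    public byte[] ReceivedBytes() { return _received.ToArray(); }
}

public static class BufferedSave
{
    // Runs 'save' against a MemoryStream (which supports Length and seeking),
    // then copies the finished bytes to the real target in one forward pass.
    public static void ViaMemory(Action<Stream> save, Stream target)
    {
        using (var buffer = new MemoryStream())
        {
            save(buffer);
            buffer.Position = 0;
            buffer.CopyTo(target);
        }
    }
}
```

With an image this would be called along the lines of `BufferedSave.ViaMemory(s => myImage.Save(s, ImageFormat.Png), blobRef.OpenWrite())`, remembering to dispose of the blob stream afterwards. The cost is that the whole encoded image is held in memory before it is uploaded.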

I’m still none the wiser as to why this has become a problem now but I guess with the move to VS2013 there is a lot of change and the layers of abstraction are now so deep that without spending hours reading Microsoft code I’m never going to know.