Azure Functions vs AWS Lambda – Scaling Face Off

Since I published this piece Microsoft have made significant improvements to HTTP scaling on Azure Functions and the below is out of date. Please see this post for a revised comparison.

If you’ve been following my blog recently you’ll know I’ve been spending a lot of time with Azure Functions – Microsoft’s implementation of a serverless platform. The idea behind serverless appeals to me massively and seems like the natural next evolution of compute on the cloud, with scaling and pricing being, so the premise goes, fully dynamic and consumption based.

The use of App Service Plans (more later) as a host mechanism for Azure Functions gave me some concern about how “serverless” Azure Functions might actually be, and so to verify suitability for my use cases I’ve been running a range of different tests around response time and latency, culminating in the “real” application I described in my last blog post and the performance tests I ran along the way. I quickly learned that the hosting implementation is not particularly dynamic and so wanted to run comparable tests on AWS Lambda.

To do this I’ve ported the serverless blog over to AWS Lambda, S3 and DynamoDB (the, rather scruffy, code is in a branch on GitHub – I will tidy this up but the aim was to get the tests running) and then I’ve run a number of user volume scenarios against a single test case: loading the homepage. The operations involved in this are:

  1. A GET request to a serverless HTTP endpoint that:
    1. Loads 3 resources from storage (Blob Storage on Azure, S3 on AWS) in an asynchronous batch.
    2. Combines them together using a Handlebars template
    3. Returns the response as a string of type text/html.

On Azure I’m using .NET 4.6 on the v1 runtime while on AWS I’m using the same code running under .NET Core 1.0. It’s worth noting that latency on blob access remained minimal throughout all these tests (6ms on average across all loads) and when removing blob access from the tests it made little difference to the patterns.

Although the .NET 4.6 and Core runtimes are different (and, I accept, may exhibit different behaviours) these are the current general availability options for implementing serverless on the two platforms using .NET and both vendors claim full support for them. In Microsoft’s case some of the languages supported on the v1 Azure Functions runtime, the one tested here (v2 is in preview and has serious performance issues with .NET Core), are experimental and documented as having scale problems but C# (which runs under full framework .NET) is not one of them. Both vendors have .NET Core 2.0 support on the way and in preview but given the issues I’m waiting until they reach general availability before comparing them.

The results are, frankly, pretty damning when it comes to Azure Functions’ ability to scale dynamically and so let’s get into the data and then look at why.

A quick note on the graphs: I’ve pulled these from VSTS and it’s quite hard (or at least I don’t know how!) to equalise the scales, so please do look at the numbers carefully – the difference is quite startling.

Add 2 Users per Second

In this test scenario I’ve started with a single user and then added 2 users per second over a 5 minute run time up to a maximum of 500 users:

We can see from this test that AWS matches the growth in user load almost exactly: it has no issue dealing with the growing demand and page request times hover around the 100ms mark. Contrast this with Azure which always lags a little behind the demand, is spikier, and has a much higher response time hovering around the 700ms mark.

This is backed up by the average stats from the run:

It’s interesting to note just how many more requests AWS dealt with as a result of its better performance: 215271 as opposed to Azure’s 84419. Well over twice as many.

Constant Load of 400 Concurrent Users

This test hits the application with 400 concurrent users from a standing start and runs over a 10 minute period simulating a sudden spike or influx of traffic and looking at how quickly each serverless environment is able to deal with the load. Neither environment was completely cold as I’d been refreshing the view in the browser but neither had had any significant traffic for some time. The contrast is significant to say the least:

Let’s cover AWS first as it’s so simple: it quickly absorbs the load and hits a steady response time of around 80ms again in under a minute.

Azure, on the other hand, is more complex. Average response time doesn’t fall under a second until the test has been running for 7 minutes and it’s only around then that the system is able to get near the throughput AWS put out in a minute. Pretty disappointing and backed up by the overall stats for the run:

Again it’s striking just how much better the AWS stats are than the Azure figures.

Constant Load of 1000 Concurrent Users

Same scenario as the last test but this time 1000 users. Let’s get into the data:

Again we can see a similar pattern with Azure slow to scale up to meet the demand while with AWS it is business as usual in under a minute. Interestingly at this level of concurrency AWS also errored heavily during the early scaling:

It should be noted that AWS specifically instructs you to implement retry and backoff handlers on the client, which in the load test I am not doing; additionally at this point I am seeing throttle events in the logging for the AWS function – this is something I will look to come back to in the future. However it’s interesting to note the contrasting approaches of the two systems: Azure inflates its response time while AWS prefers to throw errors.

The average stats for the run:

Azure Functions

I don’t think there’s much point dancing around the issue: the above numbers are disappointing. Azure is slow to scale its HTTP triggered functions and once we get beyond the 100 concurrent users point the response times are never great and the experience is generally uneven. For customer facing API / web serving where low latency and response time are critical to a smooth user experience this really rules it out as an option. And it’s not just the .NET 4.6 variant that is poor, as can be seen from my previous posts where I stripped test cases down to the most basic scenarios and used a variety of frameworks. The best case for Azure scaling I’ve found is using a CSX approach to return a string but even that lags behind AWS doing real work as the test cases in this post do:

using System.Net;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    log.Info("C# HTTP trigger function processed a request.");

    var response = req.CreateResponse();
    response.StatusCode = HttpStatusCode.OK;
    response.Content = new StringContent("<html><head><title>Blog</title></head><body>Hello world</body></html>", System.Text.Encoding.UTF8, "text/html");

    return response;
}

With 1000 concurrent users over 5 minutes:

And with the add 2 users per second scenario:

Even in this final case, and remember this Azure Function is only returning a string, we can see the response time creeping up as the user load increases and the total number of requests served is only 77514 to AWS’s 215271 over the same period with a much lower number of requests per second.

In an additional attempt to validate my conclusion that the Azure Functions system is poor at scaling I pointed the AWS Lambda installation at Azure Blob Storage instead of S3. In this test, other than the function entry point semantics, the code running on AWS is now taking exactly the same branches as the Azure tests and using the same underlying storage mechanism, albeit with a hop across the Internet to access the storage. I ran this test using the 400 concurrent user scenario:

We can see from this that, other than a slightly increased response time due to the storage being hosted in another data centre, AWS continues to perform well and scales up almost immediately while response time remains steady and low. We can also see there is no issue with Azure Blob Storage – if there was an issue there we’d expect to see it impact these results.

With these additional validation tests (an empty workload and AWS running against Blob Storage) that pretty much isolates the issue to the Azure Function runtime.

And it’s a shame as the developer experience is great, there is solid documentation and plenty of samples, and the development team on Twitter are ludicrously responsive – to the point that I feel bad saying what I need to say here. I will reach out to them for feedback.

Why is this the case? Well I’d suggest the root of the issue is how the system has been built on top of App Service Plans. It’s not all that, well, serverless and you still find yourself worrying about, well, servers.

On Azure an App Service Plan is essentially a collection of rented servers / reserved compute power of a given spec (CPU, memory) and capabilities. Microsoft have layered what they call a Consumption Plan over this for Azure Functions which provides for automatic scaling and consumption based pricing. Unfortunately, if you track what is going on, your Functions are running on a limited number of these servers – which you can evidence by tracking the instance ID and by sharing state between your functions (to be clear: this is not good!).
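As a rough illustration of what I mean by evidencing this (the probe function below is mine, not part of the original tests), a v1 C# script function holding static state will report the same machine name and an incrementing counter for invocations that land on the same underlying instance:

using System;
using System.Net;
using System.Net.Http;
using System.Threading;

// Hypothetical probe - static state should not be observable across invocations
// in a "pure" serverless model, but on Azure Functions it is.
static int _invocationCount;

public static HttpResponseMessage Run(HttpRequestMessage req, TraceWriter log)
{
    int count = Interlocked.Increment(ref _invocationCount);
    var response = req.CreateResponse();
    response.StatusCode = HttpStatusCode.OK;
    response.Content = new StringContent(
        $"Instance: {Environment.MachineName}, invocations seen by this instance: {count}",
        System.Text.Encoding.UTF8, "text/plain");
    return response;
}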

Essentially the level of granularity for scaling your functions remains, as in a traditional hosting model, at the server level and as your system scales up instances are slowly being added – but this is throttled tightly presumably to prevent Microsoft’s costs from spiralling out of control.

Now because they run on Application Service Plans you can switch hosting away from the Consumption plan onto a standard plan (which allows additional Azure features to be used) but this, to me, completely defeats the point of serverless. I’m paying for reserved compute again and managing server instance counts. I may as well not have bothered in the first place!

It’s hard to escape the feeling that Microsoft had to play catch up with AWS Lambda (it launched as a preview in late 2014 and went into general release in April 2015 whereas Azure Functions launched as a preview in March 2016 ) and built something they could market as serverless computing as quickly as they could by reusing existing compute and scaling systems on Azure.

Would I still use Azure Functions? Yes sure – in back end scenarios where latency isn’t all that important they’re a great fit. Anything that impacts user experience? No. Definitely not at this point.

It will be interesting to see if Microsoft revise the hosting model, I suspect if they do it’s some time off as currently they seem focused on the v2 runtime which isn’t a hosting change (as far as I can see) but rather giving Functions the ability to support more languages and .NET Core.

AWS Lambda

I’ll preface this by saying I am absolutely not an AWS expert so it’s harder for me to speculate about the underlying architecture of Lambda however… the numbers don’t lie: AWS manages to respond to changes in demand very quickly and, until I started to hit throttle limits (which I would need to speak to AWS Support to have lifted), is very consistent in response times.

I’ve not tried any state sharing but I would expect it to fail: it looks like Amazon have containerised at the Function level, rather than the host server, and this is what allows them to operate as you’d expect a serverless environment to. Both scaling and billing can then be at the function level.

Would I use AWS Lambda? Yes. But as most of my development work is on Azure I’m really hoping Microsoft bridge the capability gap.

Wrap Up and Next Steps

If you’ve followed this far – thanks! I’m a big fan of the serverless model but the Azure implementation of serverless looks like something of a compromised offering at this point and I’d be cautious of recommending it without understanding in detail the usage requirements as you will quickly hit choppy water.

I am planning on repeating similar experiments with the queue processing I began some time ago and if I get any information from Microsoft around this topic I will make any corrections as appropriate. This is one of those times I’d love to have got things wrong.

Serverless Blog – Christmas 2017 Project

Happy New Year everyone – I hope everyone had a great break and has a fantastic 2018.

Much like last year I’d set some time aside over the Christmas break to tinker with something fairly left-field and somewhat experimental (algorithmic art) but unfortunately spent a lot of the break ill. This left me with a lot less time on my hands than I’d planned for and had based my project around – I’d hoped to spend 4 to 5 days on it and an additional day for writing this blog post but had been left with only around 12 hours available for the implementation.

That being the case I scrabbled around for something smaller but still interesting and useful to me that I thought would fit into the reduced amount of time I had available. I decided I’d attempt to put together a Minimum Viable Product for replacing my WordPress based blog with something that looks and feels the same to the reader but is entirely serverless in its architecture. My aim was to get, in no particular order, something that:

  • Renders using a similar look and feel to my current blog
  • Supports the same URL patterns for posts so that I could port my content, do a DNS change, and wouldn’t cause Google or linking sites a problem
  • Has super-cheap running costs
  • Has high uptime
  • Uses Markdown as its post authoring format
  • Has fast response times (< 100ms for the main payload)
  • Is capable of scaling up to high volumes of concurrent users
  • Supports https for all content as my current blog does
  • Was deployed and running on an endpoint at the end of my allotted time (you can try it out here)

Knowing I only had 12 or so hours to spend on this I didn’t expect to be flicking the switch at the end of the second day and migrating my blog to this serverless system but I did want to have it running on my domain name, fairly sound, and be able to prove the points above with a working Minimum Viable Product. From a code quality point of view I wanted it to be testable and reasonably structured but I wasn’t aiming for perfection and expected low to zero automated test coverage.

The challenge here was covering enough ground in 12 hours to demonstrate an MVP worked and was in a sufficiently developed state that it was clear how the quality could be raised to a high degree with a fairly small amount of additional work.

If you’re interested in seeing the code it can be found on GitHub. If you use this as a basis for your own projects please bear in mind this was put together very rapidly in just over 12 hours – it needs more work (see next steps at the bottom of this post).

https://github.com/JamesRandall/AzureFromTheTrenches.ServerlessBlog

Planning

Normally when I undertake a project like this I’ve had the chance to roll it around in my head for a few days and can hit the ground running. With the late change of direction I didn’t really get the chance to do that and so I really came into this pretty cold.

As I wanted to replace my current WordPress blog with a serverless approach a good place to start seemed to be looking at its design and my workflows around it. The layout of my blog is pretty simple and every page has the same structure: a title bar, a content panel, and a sidebar:

In addition there are only really 4 types of page: a homepage made up of the most recent posts, posts, category pages, and archives. The category and archive pages simply list the posts within a category and month respectively. The only thing that causes the site to change is the addition or editing of a post, which can require all of those pages to be updated.

I do most of my writing on the train and use the markdown format which I subsequently import into WordPress for publishing. This means I don’t really use the editing capabilities of WordPress (other than to deal with markdown to WordPress conversion issues!) and so was comfortable simply uploading the Markdown to a blob container for this serverless blog. This left the question of how to get any metadata into posts (for example categories) and I decided on a simple convention based approach where an optional block of JSON could be included at the start of a post. That way that too could be maintained in a text editor.

Given all that my general approach (at this point best categorised as a harebrained scheme) was to render the components of the site as static HTML snippets using a blob triggered Azure Function and assemble them into the overarching layout when a user visits a given page, with page requests being handled by HTTP triggered Azure Functions – one per page type. I toyed with the idea of going full static and re-rendering the whole website on each update but felt this “mostly” static approach revolving around the site components might provide a bit more flexibility without much performance impact, as all I’m really doing to compose a page is stitching together some strings, and if I were to actually start using this I’d like to add a couple of dynamic components.

In any case having settled on that approach I mapped the architecture out onto Azure services as shown below:

In addition to using Azure Functions as my compute platform for building out the components I picked a toolset I’m either working with day to day or have used in the past:

  • C# and .NET Core
  • Visual Studio 2017 and Visual Studio Code
  • Handlebars for page templating
  • Blob and Table storage

I briefly considered using CosmosDB as a datastore but my query needs were limited and it would bump up the running cost and add complexity for no real gain, so I quickly discounted it.

Implementation

With the rough planning complete it was time to knuckle down with the laptop, a quiet room, a large quantity of coffee, and get started on some implementation. Bliss!

In order to make this readable I’ve organised my approach into a linear series of steps but like most development work there was some to-ing and fro-ing and things were iterated on and fleshed out as I moved through the process.

My general approach on a project like this is to prioritise the building out of a vertical slice and so here that meant starting with a markdown file, generating enough of the static assets that I could compose web pages, and then a couple of entry points so I could try it out in the Azure environment.

Step 1 – Replicating the Styling of my Existing Blog

As this project is really about markdown in and HTML out I wanted to start by ensuring I had a clearly defined view of that final output and so I began by creating an HTML file and CSS file that mimicked the layout of my existing blog. Design is always easier for a non-designer when you have a reference and so I quite literally opened up my current blog in one tab, my candidate HTML file in another tab and iterated over the content of it and the CSS until I had something that was a reasonable approximation.

While I’m not going to pretend that the CSS is a stunning piece of artistry this didn’t take long and I was sufficiently in the ballpark after just an hour.

Time taken: 1 hour

Step 2 – Creating a Solution and Code Skeleton

Next up was creating a solution skeleton in Visual Studio establishing the basic coding practices along with the models I expected to use throughout. My previous work with Azure Functions has been for small and quite isolatable parts of a wider system rather than being the main compute resource for the system and so I’d not really had to think too hard about how to organise the code.

Something I knew I wanted to carry over as a pattern from my previous work was the concept of “thin” functions. The function methods themselves are, to me, much like actions on an ASP.Net Core / Web API controller – entry points that accept input and return output and should be kept small and focused, handing off to more appropriate implementers that are not aware of the technicalities of the specific host technology (via services, commands etc.). Not doing that is a mix of concerns and tightly ties your implementation to the Functions runtime.

While I wanted to separate my concerns out I also didn’t want this simple solution to inflate into an overly complex system and so I settled on a fairly traditional layered approach comprised, from an implementation point of view, of 4 assemblies communicating over public C# interfaces but with fully private implementations all written to .NET Standard 2.0:

  • Models – a small set of classes to communicate basic information up and down, but not out of (by which I mean they are not persisted in a data store directly nor are they returned to the end user), the stack
  • Data Access – simple implementations on top of table storage and blob storage
  • Runtime – the handful of classes that do the actual work
  • Functions – the entry point assembly

Mapped out this ultimately gave me a solution structure like this:

The remaining decision I needed to make was how to handle dependency injection. An equivalent system written with, say, ASP.Net Core would use an IoC container and register the configuration during startup but that’s state that persists for the lifetime of the server and functions are ideally stateless. Spinning up and configuring an IoC container for each execution of a function seemed needlessly expensive so I made the decision to use a “poor man’s” approach to dependency injection, with the Runtime and Data Access assemblies each exposing a static factory class responsible for essentially implementing the “Resolve” method for each of my instantiable types and exposing public create methods for the public interfaces of each layer.
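To make that concrete, a heavily cut-down sketch of the pattern is below – the real Factory in the repository wires up the actual repositories, renderer and configuration options, all of which are elided here:

// Heavily simplified sketch of the "poor man's DI" factory - illustrative only.
public interface IStaticAssetManager { /* AddOrUpdatePost etc. */ }

internal class StaticAssetManager : IStaticAssetManager { }

public class Factory
{
    public static Factory Instance { get; private set; }

    // Called at the top of each function invocation with the configuration options
    public static Factory Create(object configurationOptions)
    {
        Instance = new Factory();
        return Instance;
    }

    // One "Resolve"-style create method per public interface exposed by the layer
    public IStaticAssetManager GetRenderer() => new StaticAssetManager();
}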

For the limited number of classes I have this approach worked pretty well and allowed me to write testable code in the same way as if I was using a fully fledged container.

Time taken: 1 hour

Step 3 – Creating the Layout, Posts and the Homepage

The first step in turning my earlier HTML and CSS work into something that could be used to create a real blog from real posts was to write a Handlebars template for the overall layout that could stitch together the main content and sidebar into a full HTML document. Based on my earlier work this was pretty simple and looked like this:

<html>
    <head>
        <title>{{pageTitle}}</title>
        <link href="{{stylesheetUrl}}" rel="stylesheet" />
        <link rel='stylesheet' href='https://fonts.googleapis.com/css?family=Roboto:regular' type='text/css' media='all' />
        <link href="{{faviconUrl}}" rel="shortcut icon" type="image/x-icon" />
    </head>
    <body>
        <div class="title-panel">
            <div class="container">
                <a class="primary-title" href="/">{{blogName}}</a>
            </div>
        </div>
        <div class="container">
            <div class="content">            
                <div class="reading">                            
                    {{{readingContent}}}
                </div>
                <div class="sidebar">
                    {{{sidebar}}}
                </div>
            </div>
        </div>
        <div class="footer-panel">
            <div class="container">
                Copyright &copy; {{defaultAuthor}}
            </div>
        </div>
    </body>
</html>

Along with this I created a pair of methods in my composition class to bring the components of the site together:

public async Task<string> GetHomepage()
{
    return await GetWrappedContent(() => _outputRepository.GetHomepageContent());
}

private async Task<string> GetWrappedContent(Func<Task<string>> contentFunc)
{
    Task<string> templateTask = _templateRepository.GetLayoutTemplate();
    Task<string> sidebarTask = _outputRepository.GetSidebar();
    Task<string> contentTask = contentFunc();

    await Task.WhenAll(templateTask, sidebarTask, contentTask);

    string template = templateTask.Result;
    string content = contentTask.Result;
    string sidebar = sidebarTask.Result;

    TemplatePayload payload = new TemplatePayload
    {
        BlogName = _blogName,
        DefaultAuthor = _defaultAuthor,
        PageTitle = _blogName,
        ReadingContent = content,
        Sidebar = sidebar,
        StylesheetUrl = _stylesheetUrl,
        FavIconUrl = _favIconUrl
    };
    Func<object, string> compiledTemplate = Handlebars.Compile(template);

    string html = compiledTemplate(payload);
    return html;
}

To generate posts I needed to read a post from an IO stream and then convert the Markdown into un-styled HTML and for that I used the excellent CommonMark.NET which I hid behind an injected helper to facilitate later testing. After conversion the post is saved to the output blob store:

Post post = await _postRepository.Get(postStream);
string html = _markdownToHtmlConverter.FromMarkdown(post.Markdown, post.UrlName, post.Author, post.PostedAtUtc);
await _outputRepository.SavePost(post.UrlName, html);

Actually deserializing the post took a little more effort as I needed to also parse out the metadata and this can be seen in the PostParser.cs implementation.
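The markdown conversion itself sits behind that injected helper so it can be swapped out or mocked in tests; a minimal sketch of the wrapper (the interface shape is inferred from the snippet above and the body is illustrative – the real implementation does a little more with the metadata) looks something like this:

using System;
using CommonMark;

public interface IMarkdownToHtmlConverter
{
    string FromMarkdown(string markdown, string urlName, string author, DateTime postedAtUtc);
}

// Thin wrapper around CommonMark.NET - hiding it behind an interface keeps the
// runtime assembly testable without a hard dependency on the converter.
internal class MarkdownToHtmlConverter : IMarkdownToHtmlConverter
{
    public string FromMarkdown(string markdown, string urlName, string author, DateTime postedAtUtc)
    {
        // CommonMark.NET does the heavy lifting of markdown to un-styled HTML
        return CommonMarkConverter.Convert(markdown);
    }
}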

The homepage on my blog is basically the most recent n posts compiled together and so to do this I used another Handlebars template:

{{#each this}}
    {{#if @index}}
        <div class="post-spacer"></div>
    {{/if}}
    {{{this}}}
{{/each}}

To order the posts on the homepage (and later the sidebar) I need to track the “posted at” dates of each post. I can’t rely on the LastModified property of the blob as that won’t deal with updates correctly, and to migrate my content over I need to be able to set the dates as part of that process. To do this I persisted some basic data to an Azure Storage table.
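The table entity behind this is nothing exotic; a minimal sketch (property names are illustrative – the actual class lives in the Data Access assembly) is below:

using System;
using Microsoft.WindowsAzure.Storage.Table;

// Minimal sketch of the entity used to track publication dates for ordering -
// the RowKey doubles up as the post's UrlName.
internal class PostRecord : TableEntity
{
    public string UrlName => RowKey;

    public string Title { get; set; }

    public DateTime PostedAtUtc { get; set; }
}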

And finally I created a Handlebars template for generating a hard coded sidebar based on my sample.

Time taken: 3 hours

Step 4 – Blob Triggered Post Processing Function

At this point I had a bunch of code written for processing markdown and generating web pages but no way to call it and so the next step was to implement a function that would listen for new and updated blobs and generate the appropriate assets:

public static class ProcessPost
{
    [FunctionName("ProcessPost")]
    public static async Task Run([BlobTrigger("posts/{name}", Connection = "BlogStorage")]Stream myBlob, string name, TraceWriter log)
    {
        log.Info($"ProcessPost triggered\n Blob Name:{name} \n Size: {myBlob.Length} Bytes");

        Factory.Create(ConfigurationOptionsFactory.Create());

        IStaticAssetManager staticAssetManager = Factory.Instance.GetRenderer();
        await staticAssetManager.AddOrUpdatePost(myBlob);
    }
}

This function demonstrates the use of some of the principles and practices I thought about during the first step of this process:

  • The Azure Function is small and restricts its actions to that domain: it takes an input, sets up the subsequent environment and hands off.
  • The poor man’s dependency injection approach is used to resolve an instance of IStaticAssetManager.

I tested this first locally using the Azure Functions Core Tools and other than some minor fiddling around with the local tooling it just worked which I verified by checking the output blob repository and eyeballing the contents. No great genius on my part: I’m using things I’ve used before and am familiar with to solve a new problem.

Time taken: 1 hour

Step 5 – Homepage and Post Functions

Next up was to try and render my homepage and for this I wrote a new function following the same principles as before:

[FunctionName("GetHomepage")]
public static async Task<ContentResult> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "home")]HttpRequest req, TraceWriter log)
{
    log.Info("C# getContent HTTP trigger function processing a request.");
            
    Factory.Create(ConfigurationOptionsFactory.Create());

    IResponseRenderer responseRenderer = Manager.Factory.Instance.GetResponseRenderer();
    string content = await responseRenderer.GetHomepage();

    return new ContentResult
    {
        Content = content,
        ContentType = "text/html",
        StatusCode = 200
    };            
}

This worked but I encountered my first challenge of the day: the function was on a path of https://blog.azurewebsites.net/api/home which is not going to allow it to function as the root page for my website. In fact if I went to the root I would instead see the Azure Functions welcome page:

While this is a perfectly fine page it’s not really going to help my readers view my content. Fortunately Azure Functions also include a capability called Proxies which allow you to take any incoming request, reshape it, and call an alternate backend. I had no idea if this would work on a root path but wrote the simple pass through proxy shown below:

{
  "$schema": "http://json.schemastore.org/proxies",
  "proxies": {
    "HomePageProxy": {
      "matchCondition": {
        "route": "/",
        "methods": [
          "GET"
        ]
      },
      "backendUri": "https://%BlogDomain%/home"
    }
  }
}

That matches on a GET request to the root and sends it on to my home page handler. This works absolutely fine when run on Azure but doesn’t work locally in the Core Tools – they seem to use the root path for something else. I need to do more investigation here but for now, given it works in the target environment and I only have 12 hours, I settled on this and moved on.

To remove the api component of the URI on my future functions I also modified the host.json file used by Azure Functions, setting the HTTP routePrefix option to blank:

{
  "http": {
    "routePrefix": ""
  }
}

Writing this I’m wondering if this is what’s causing my issues with the root proxy on the local tools. Hmm. Something to try later as I can accomplish the same with another proxy.

Time taken: 2 hours

Step 6 – Load Testing

With my homepage compositor function written and a working system deployed to the cloud with this first fully representative vertical slice I wanted to get a quick handle on how it would cope with a reasonable amount of load.

Visual Studio Team Services is great for quickly throwing lots of concurrent virtual users against a public endpoint. I set up a test with a fairly rapid step up in the number of users going from 0 to 400 concurrent users in around 5 minutes and then staying at that level for another 15 minutes.

I knew from my casual browser testing that the response from the homepage function for a single user page load on a quiet system took between 60 and 100ms which I was fairly pleased about. I expected some divergence from that as the system scaled up but for things essentially to work.

Much to my surprise and horror that was not the case. As the user count increased the response time started running at around 3 to 4 seconds per request and generated an awful lot of errors along the way. The system never scaled up to a point where the load could really be acceptably dealt with as can be seen below:

I blogged about this extensively in my last post and so won’t cover it again here but the short version is that the Azure Functions v2 .NET Core runtime (that is still in preview) was the culprit. To resolve things I migrated my functions over to .NET 4.6.2 and after doing so and running a similar test again I got a much more acceptable result:

Response time over the run averages 700ms and the system scaled out pretty nicely to deal with the additional users (and I pushed this up to 600 on this test). The anecdotal experience (me using the browser with the cache disabled as the test ran) was also excellent and felt consistently snappy throughout, with timings of between 90ms and 900ms and the majority that I saw taking around 300ms (it’s worth noting I’m geographically closer than the test agents to the Azure data centre the blog is running in – VSTS doesn’t run managed agents from UK South currently).

As part of moving to .NET 4.6 I had to make some changes to my functions, an example of this is below:

[FunctionName("GetHomepage")]
public static async Task<HttpResponseMessage> Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = "home")]HttpRequestMessage req, TraceWriter log)
{
    log.Info("GetHomepage triggered");
    Factory.Create(ConfigurationOptionsFactory.Create());

    IWebPageComposer webPageComposer = Factory.Instance.GetResponseRenderer();
    string content = await webPageComposer.GetHomepage();

    HttpResponseMessage response = req.CreateResponse(HttpStatusCode.OK);
    response.Content = new StringContent(content, Encoding.UTF8, "text/html");            

    return response;            
}

Time Taken: 3 hours

Step 7 – Sidebar Content

To maintain a sidebar I needed to maintain some additional metadata – which posts belong in which categories – which I’m pulling from the (optional) JSON annotation of the Markdown files I outlined earlier. An example of that can be seen below:

{
    createdAtUtc: '2017-12-29 10:01:00',
    categories: [
        'C#',
        'Code'
    ],
    urlName: 'aUrlNameForAPost',
    author: 'James Randall'
}

The categories get parsed into a very simple table storage class:

internal class CategoryItem : TableEntity
{
    public string UrlName => PartitionKey;

    public string PostUrlName => RowKey;

    public string DisplayName { get; set; }

    public string PostTitle { get; set; }

    public DateTime PostedAtUtc { get; set; }

    public static string GetPartitionKey(string categoryUrlName)
    {
        return categoryUrlName;
    }

    public static string GetRowKey(string postUrlName)
    {
        return postUrlName;
    }
}

The UrlName values referenced above are just (by default) camelcase alphabetic strings used to identify posts as part of a URI and as such are unique (within the context of a blog). Because all this activity takes place on the backend and away from user requests I’ve not bothered with any more complex indexing strategies or further storage tables to store the unique set of categories – instead when I need to organise the categories into a hierarchical structure or get the category names I simply load them all from table store and run some simple LINQ:

internal class CategoryListBuilder : ICategoryListBuilder
{
    public IReadOnlyCollection<Category> FromCategoryItems(IEnumerable<CategoryItem> items)
    {
        var result = items.GroupBy(x => x.UrlName, (k, g) => new Category
        {
            UrlName = k,
            DisplayName = g.First().DisplayName,
            Posts = g.OrderByDescending(x => x.PostedAtUtc).Select(x => new PostSummary
            {
                PostedAtUtc = x.PostedAtUtc,
                Title = x.PostTitle,
                UrlName = x.PostUrlName
            }).ToArray()
        }).OrderBy(x => x.DisplayName).ToArray();

        return result;
    }
}

This is something that might need revisiting at some point but this isn’t some uber-content management system, it’s designed to handle simple blogs like mine, and, hey, I only have 12 hours!

I take a similar approach to generating the list of months for the archives section of the sidebar and then create it as a static asset with a Handlebars template:

<h2>Recent Posts</h2>
<ul>
    {{#each recentPosts}}
        <li><a href="/{{urlName}}">{{title}}</a></li>        
    {{/each}}    
</ul>
<h2>Archives</h2>
<ul>
    {{#each archives}}
        <li><a href="/archive/{{year}}/{{month}}">{{displayName}}</a></li>
    {{/each}}    
</ul>
<h2>Categories</h2>
<ul>
    {{#each categories}}
        <li><a href="/category/{{urlName}}">{{displayName}}</a></li>
    {{/each}}    
</ul>

Time taken: 2 hours

Step 8 – Wrap Up

With most of the system working and problems solved all that was left was to fill in a couple of the empty pages: post lists for categories and archives. The only new code I needed for this was something to summarise a post; for the moment I’ve taken a quick and dirty approach based on how my content is structured: I look for the title and the end of the first paragraph in the HTML output, as sketched below.
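The summariser is little more than a string search over the rendered HTML; a rough sketch (the exact handling in the repository differs slightly) looks like this:

using System;

// Quick and dirty post summary: keep everything up to the end of the first
// paragraph of the rendered HTML (which also captures the title heading).
internal static class PostSummariser
{
    public static string Summarise(string postHtml)
    {
        const string paragraphClose = "</p>";
        int endOfFirstParagraph = postHtml.IndexOf(paragraphClose, StringComparison.OrdinalIgnoreCase);
        return endOfFirstParagraph == -1
            ? postHtml
            : postHtml.Substring(0, endOfFirstParagraph + paragraphClose.Length);
    }
}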

And having got that, again I simply use another Handlebars template to generate the output and a couple more functions to return the content to a user.

Time taken: 2 hours

Conclusions and Next Steps

Did I succeed? Well I have my MVP, it works, and it ticks off what I wanted! However I took 14 hours to put this together rather than the 12 I’d allowed. Most of the overrun was due to the performance issue with .NET Core and the Azure Functions v2 runtime; it took a little while to pin down the cause of the issue as the starting point for my investigation was based on the (generally reasonable!) assumption that I’d done something stupid.

Given that and as it’s New Year I’m going to give myself a pass and class this as a resounding success! A few takeaways for me:

  • Azure Functions are very flexible and serverless can be a great model but there are some definite limitations in the Azure Functions implementation, some of which stem from the underlying hosting model – I’m going to come back to this in a future blog post and, time allowing, contrast them with AWS Lambda.
  • Implementing this using Azure Functions was not really any harder than using ASP.Net Core or Web API.
  • Never underestimate the need to test with some load against your code.
  • It’s always worth spending some time identifying the main challenges in a project and focusing your efforts against them. In this case it was covering enough ground quickly enough to validate the design without making things a nightmare to move on and into a more professional codebase.
  • If you really focus it’s amazing how much you can get done quickly with modern tools and technologies.
  • Development is fun! I had a great time building this small project.
  • Blogging takes even longer than development. The real overrun on this project was the blog post – I think it’s taken me the best part of 2 days.

If I continue with this project the next steps, in a rough priority order, will be to:

  1. Add unit tests
  2. Introduce fault tolerance strategies and logging
  3. Add a proper deployment script so others can get up and running with it
  4. Test it with more content (extracted from my blog)
  5. Improve code syntax highlighting
  6. Ensure images work

Finally the code that goes along with this blog can be found over on GitHub:

https://github.com/JamesRandall/AzureFromTheTrenches.ServerlessBlog

Azure Functions v2 Preview Performance Issues (.NET Core / Standard)

I’ve been spending a little time building out a serverless web application as a small holiday project and as this is just a side project I’d taken the opportunity to try out the new .NET Core based v2 runtime for Azure Functions and the new tooling and support in Visual Studio 2017.

As soon as I had an end to end vertical slice I wanted to run some load tests to ensure it would scale up reliably – the short version is that it didn’t. The .NET Core v2 runtime is still in preview (and you are warned not to use this environment for production workloads due to potential breaking changes) so you would hope that this will get fixed by general release but right now there seem to be some serious shortcomings in the scalability and performance of this environment rendering it fairly unusable.

I used the VSTS load testing system to hit a single URL initially with a high volume of users for a few minutes. In isolation (i.e. if I run it from a browser with no activity) this function runs in less than 100ms and normally around the 70ms mark; however as the number of users increases performance quickly takes a serious nosedive, with requests taking seconds to return as can be seen below:

After things settled down a little (hitting a system like this from cold with a high concurrency is going to cause some chop while things scale out) average request time began to range between 3 and 9 seconds and the anecdotal experience (me running it in a browser / PostMan while the test was going on) gave me highly variable performance. Some requests would take just a few hundred milliseconds while others would take over 20 seconds.

Worryingly no matter how long the test was run this never improved.

I began by looking at my code assuming I’d made a silly mistake but I couldn’t see anything and so boiled things down to a really simple test case, essentially the one that is created for you by the Visual Studio template:

[FunctionName("GetString")]
public static IActionResult Run([HttpTrigger(AuthorizationLevel.Anonymous, "get", Route = null)]HttpRequest req, TraceWriter log)
{
    log.Info("C# HTTP trigger function processed a request.");

    var result = new OkObjectResult("hello world");

    return (IActionResult)result;
}

I expected this to scale and perform much better as it’s as simple as it gets: return a hard coded string. However to my surprise this exhibited very similar issues:

The response time, to return a string!, hovered around the 7 second mark and the system never really scaled sufficiently, resulting in a small percentage of failures due to the volume.

Having run a fair few tests and racking up a lot of billable virtual user minutes on my credit card I tweaked the test slightly at this point moving to a 5 minute test length with step up concurrent user growth. Running this on the same simple test gave me, again, poor results with average response times of between 1.5 and 2 seconds for 100 concurrent users and a function that is as close to doing nothing as it gets (the response time is hidden by the page time in the performance chart below, it tracks almost exactly). The step up of users to a fairly low volume eliminates the errors, as you’d expect.

What these graphs don’t show is the variance around this average response time, which still ranged from a few hundred milliseconds up to around 15 seconds.

At this point I was beginning to suspect the Functions 2.0 preview runtime might be the issue and so created myself a standard Functions 1.0 runtime and deployed this simple function as a CSX script:

using System.Net;

public static async Task<HttpResponseMessage> Run(HttpRequestMessage req, TraceWriter log)
{
    var response = req.CreateResponse();
    response.StatusCode = HttpStatusCode.OK;
    response.Content = new StringContent("hello world", System.Text.Encoding.UTF8, "text/plain");

    return response;
}

Running the same ramp up test as above shows that this function behaves much more as you’d expect with average response times in the 300ms to 400ms range when running at 100 concurrent users:

Intrigued, I did run a short 5 minute 400 concurrent user test with no ramp up and again the csx based function behaved much more in line with what I think are reasonable expectations: it took a short time to scale up to deal with the sudden demand but did so without generating errors and eventually settled down to a response time similar to the test above:

Finally I deployed a .NET 4.6 based function into a new 1.0 runtime Function app. I made a slight mistake when setting up this test and ramped it up to 200 users rather than 100 but it scales much more as you’d expect and holds a fairly steady response time of around 150ms. Interestingly this gives longer response times than .NET Core for single requests run in isolation: around 170ms for .NET 4.6 vs. 70ms for .NET Core.

At this point I felt fairly confident that the issue I was seeing in my application was due to the v2 Function runtime and so made a quick change to target .NET 4.6 instead and spun up a new v1 runtime and ran my initial 400 concurrent user test again:

As the system scales up, giving no errors, this test eventually settles at around the 500ms average request time mark which is something I can move ahead with. I’d like to get it closer to 150ms and it will be interesting to see what I can tweak while staying on the consumption plan, as I think I’m starting to bump up against some of the other limits with Functions (ironically resolving that involves taking advantage of what is actually going on with the Functions runtime implementation and accepting that it’s a somewhat flawed serverless implementation as it stands today).

As a more general conclusion the only real takeaway I have from the above (beyond the general point that it’s always worth doing some basic load testing even on what you assume to be simple code) is that the Azure Function 2.0 runtime has some way to go before it comes out of Preview. What’s running in Azure currently is suitable only for the most trivial of workloads – I wouldn’t feel able to run this even in a beta system today.

Something else I’d like to see from Azure Functions is a more aggressive approach to scaling up/out, for spiky workloads where low latency is important there is a significant drag factor at the moment. While you can run on an App Service Plan and handle the scaling yourself this kind of flies in the face of the core value proposition of serverless computing – I’m back to renting servers. A reserved throughput or Premium Consumption offering might make more sense.

I do plan on running these tests again once the runtime moves out of preview – I’m confident the issue will be fixed, after all to be usable as a service it basically has to be.

C# Cloud Application Architecture – Commanding via a Mediator (Part 3)

In the previous post we simplified our controllers by having them accept commands directly and configured ASP.Net Core so that consumers of our API couldn’t insert data into sensitive properties. However we’ve not yet set up any validation so, for example, it is possible for a user to add a product to a cart with a negative quantity. We’ll address that in this post and take a look at an alternative to the attribute model of validation that comes out the box with ASP.Net and ASP.Net Core.

The source code for this post can be found on GitHub:

https://github.com/JamesRandall/CommandMessagePatternTutorial/tree/master/Part3

The built in ASP.Net Core approach to validation, like previous versions, relies on a model being annotated with attributes but there are some significant drawbacks with this approach:

  1. You need access to the source code to add the attributes – sometimes this isn’t possible and so you end up maintaining multiple sets of models just to express validation rules.
  2. It requires you to tightly couple your model to a specific validation system and set of validation rules and include additional dependencies that may not be appropriate everywhere you want to use the model.
  3. It’s extensible but not particularly elegant, and complicated rules are hard to implement.

Depending on your application architecture maintaining multiple sets of models can be a good solution to some of these issues, and might be the right thing to do in any case (you certainly don’t want Entity Framework models bleeding out through an API, for example) but in our scenario there’s little to be gained by maintaining multiple representations of commands, at least in the general sense.

An excellent alternative framework for validation is Jeremy Skinner’s Fluent Validation package set. This allows validation rules to be expressed separately from models, hooks into the ASP.Net Core pipeline to support the standard ModelState approach, and allows validations to be run easily from anywhere.

We’re going to continue to follow our application encapsulation model for adding validations and so we will add a validator package to each domain, for example ShoppingCart.Validation. Each of these assemblies will consist of private classes for the validators that are registered in the IoC container using an installer. Below is an example of the validator for our AddToCartCommand command:

internal class AddToCartCommandValidator : AbstractValidator<AddToCartCommand>
{
    public AddToCartCommandValidator()
    {
        RuleFor(c => c.ProductId).NotEqual(Guid.Empty);
        RuleFor(c => c.Quantity).GreaterThan(0);
    }
}

This validator makes sure that we have a non-empty product ID and a quantity of 1 or more. We register this in our installer as follows:

public static class IServiceCollectionExtensions
{
    public static IServiceCollection RegisterValidators(this IServiceCollection serviceCollection)
    {
        serviceCollection.AddTransient<IValidator<AddToCartCommand>, AddToCartCommandValidator>();
        return serviceCollection;
    }
}

The ShoppingCart.Validation assembly is referenced only from the ShoppingCart.Application installer as shown below:

public static class IServiceCollectionExtensions
{
    public static IServiceCollection UseShoppingCart(this IServiceCollection serviceCollection,
        ICommandRegistry commandRegistry)
    {
        serviceCollection.AddSingleton<IShoppingCartRepository, ShoppingCartRepository>();

        commandRegistry.Register<GetCartQueryHandler>();
        commandRegistry.Register<AddToCartCommandHandler>();

        serviceCollection.RegisterValidators();

        return serviceCollection;
    }
}

And finally in our ASP.Net Core project, OnlineStore.Api, in the Startup class we register the Fluent Validation framework – however it doesn’t need to know anything about the specific validators:

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc(c =>
    {
        c.Filters.Add<AssignAuthenticatedUserIdActionFilter>();
        c.AddAuthenticatedUserIdAwareBodyModelBinderProvider();
    }).AddFluentValidation();

    services.AddSwaggerGen(c =>
    {
        c.SwaggerDoc("v1", new Info { Title = "Online Store API", Version = "v1" });
        c.SchemaFilter<SwaggerAuthenticatedUserIdFilter>();
        c.OperationFilter<SwaggerAuthenticatedUserIdOperationFilter>();
    });

    CommandingDependencyResolver = new MicrosoftDependencyInjectionCommandingResolver(services);
    ICommandRegistry registry = CommandingDependencyResolver.UseCommanding();

    services
        .UseShoppingCart(registry)
        .UseStore(registry)
        .UseCheckout(registry);
    services.Replace(new ServiceDescriptor(typeof(ICommandDispatcher), typeof(LoggingCommandDispatcher),
        ServiceLifetime.Transient));
}

We can roll that approach out over the rest of our commands and with the validator configured in ASP.Net can add model state checks into our AbstractCommandController, for example:

public abstract class AbstractCommandController : Controller
{
    protected AbstractCommandController(ICommandDispatcher dispatcher)
    {
        Dispatcher = dispatcher;
    }

    protected ICommandDispatcher Dispatcher { get; }

    protected async Task<IActionResult> ExecuteCommand<TResult>(ICommand<TResult> command)
    {
        if (!ModelState.IsValid)
        {
            return BadRequest(ModelState);
        }
        TResult response = await Dispatcher.DispatchAsync(command);
        return Ok(response);
    }
        
    // ...
}

At this point we can enforce simple entry point validation but what if we need to make validation decisions based on state we only know about in a handler? For example in our AddToCartCommand case we want to make sure the product really does exist before we add it and perhaps return an error if it does not (or, for example, if it’s not in stock).

To achieve this we’ll make some small changes to our CommandResponse class so that instead of simply returning a string it returns a collection of keys (property names) and values (error messages). This might sound like it should be a dictionary but the ModelState design of ASP.Net supports multiple error messages per key and so a dictionary won’t work; instead we’ll use an IReadOnlyCollection backed by an array. Here are the changes made to the result-free implementation (the changes are the same for the typed variant):

public class CommandResponse
{
    protected CommandResponse()
    {
        Errors = new CommandError[0];
    }

    public bool IsSuccess => Errors.Count==0;

    public IReadOnlyCollection<CommandError> Errors { get; set; }

    public static CommandResponse Ok() {  return new CommandResponse();}

    public static CommandResponse WithError(string error)
    {
        return new CommandResponse
        {
            Errors = new []{new CommandError(error)}
        };
    }

    public static CommandResponse WithError(string key, string error)
    {
        return new CommandResponse
        {
            Errors = new[] { new CommandError(key, error) }
        };
    }

    public static CommandResponse WithErrors(IReadOnlyCollection<CommandError> errors)
    {
        return new CommandResponse
        {
            Errors = errors
        };
    }
}
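The CommandError type used above isn’t shown in this post; it’s essentially just a key/message pair along these lines (a sketch – see the repository for the actual class):

public class CommandError
{
    public CommandError(string message) : this(null, message)
    {
    }

    public CommandError(string key, string message)
    {
        Key = key;
        Message = message;
    }

    // The property name the error relates to; null or empty maps to a model level error
    public string Key { get; }

    public string Message { get; }
}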

We’ll add the errors from our command response into model state through an extension method in the API assembly – that way the syntax will stay clean but we won’t need to add a reference to ASP.Net in our shared assembly. We want to keep that as dependency free as possible. The extension method is pretty simple and looks like this:

public static class CommandResponseExtensions
{
    public static void AddToModelState(
        this CommandResponse commandResponse,
        ModelStateDictionary modelStateDictionary)
    {
        foreach (CommandError error in commandResponse.Errors)
        {
            modelStateDictionary.AddModelError(error.Key ?? "", error.Message);
        }
    }
}

And finally another update to the methods within our AbstractCommandController to wire up the response to the model state:

protected async Task<IActionResult> ExecuteCommand<TCommand, TResult>() where TCommand : class, ICommand<CommandResponse<TResult>>, new()
{
    if (!ModelState.IsValid)
    {
        return BadRequest(ModelState);
    }
    TCommand command = CreateCommand<TCommand>();
    CommandResponse<TResult> response = await Dispatcher.DispatchAsync(command);
    if (response.IsSuccess)
    {
        return Ok(response.Result);
    }
    response.AddToModelState(ModelState);
    return BadRequest(ModelState);
}

With that we can now return simple model level validation errors to clients and more complex validations from our business domains. An example from our AddToCartHandler that makes use of this looks as follows:

public async Task<CommandResponse> ExecuteAsync(AddToCartCommand command, CommandResponse previousResult)
{
    Model.ShoppingCart cart = await _repository.GetActualOrDefaultAsync(command.AuthenticatedUserId);

    StoreProduct product = (await _dispatcher.DispatchAsync(new GetStoreProductQuery{ProductId = command.ProductId})).Result;

    if (product == null)
    {
        _logger.LogWarning("Product {0} can not be added to cart for user {1} as it does not exist", command.ProductId, command.AuthenticatedUserId);
        return CommandResponse.WithError($"Product {command.ProductId} does not exist");
    }
    List<ShoppingCartItem> cartItems = new List<ShoppingCartItem>(cart.Items);
    cartItems.Add(new ShoppingCartItem
    {
        Product = product,
        Quantity = command.Quantity
    });
    cart.Items = cartItems;
    await _repository.UpdateAsync(cart);
    _logger.LogInformation("Updated basket for user {0}", command.AuthenticatedUserId);
    return CommandResponse.Ok();
}

Hopefully that’s a good concrete example of how this loosely coupled, state and convention based approach to an architecture can reap benefits. We’ve added validation to all of our controllers and actions without having to revisit each one to add boilerplate, and we don’t have to worry about unit testing each and every one to ensure validation is enforced – to be sure this worked we simply updated the unit tests for our AbstractCommandController. Our validations are clearly separated from our models and are testable. And we’ve done all this through simple POCO (plain old C# object) type models that are easily serializable – something we’ll make good use of later.

In the next part we’ll move on to our handlers and clean them up so that they become purely focused on business domain concerns and at the same time improve the consistency of the logging we’ve got scattered around the handlers at the moment. We’ll also add some basic telemetry to the system using Application Insights so that we can measure how often and for how long each of our handlers take to run.

Other Parts in the Series

Part 5
Part 4
Part 2
Part 1

Azure Functions – Expect Significant Clock Skew

While running the experiment I posted about on Sunday I annotated the message I sent on to the event hub with the Azure Functions view of the current time (basically I set a property to DateTime.UtcNow) and, out of curiosity, grouped my results by second long tumbling windows based on that date. This gave me results that were observably different than when doing the same with the enqueue date and time logged by the Event Hub (as an aside there is some interesting information about Event Hubs and clock skew here). My experiments didn’t need massively accurate time tracking as I was really just looking for trends over a long, relative to the clock skew, period of time; however I looked at some of the underlying numbers and became suspicious that there was a significant degree of clock skew across my functions.

I reached out to the @AzureFunctions team on Twitter asking how they handled clock sync on the platform and one of the engineers, Chris Anderson, replied confirming what I suspected: there are no guarantees about clock sync on the Azure Functions platform and, further, you should expect the time skew to be large.

That means you can’t really obtain a consistent view of “now” from within an Azure Function. You could go and get it from an external source but that in and of itself is going to introduce other inaccuracies. Essentially you can’t handle dynamic time reliably inside a function with any precision and you’re limited to working with reference points obtained upstream and passed in.
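
As an illustration, if the function is triggered from a queue then the broker’s enqueue time is one such upstream reference point. A minimal sketch – assuming a Service Bus triggered function on the v1 runtime bound to a BrokeredMessage rather than a string – might look like this:

[FunctionName("ProcessWithUpstreamTime")]
public static void Run([ServiceBusTrigger("testqueue", Connection = "SbConnectionString")]BrokeredMessage message, TraceWriter log)
{
    // Use the time Service Bus assigned when the message was enqueued rather than DateTime.UtcNow -
    // the local clock on whichever host the function lands on may be skewed significantly
    DateTime referenceTimeUtc = message.EnqueuedTimeUtc;
    log.Info($"Message enqueued at {referenceTimeUtc:O}");
}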

Definitely something to be aware of when designing systems that make use of Azure Functions, as it would be easy to apply timestamps within a function and expect them to preserve some sense of temporal sequence across instances.

This isn’t, of course, a new problem – dealing with precise and accurate time in a distributed system is always a challenge and requires careful consideration – but it does underline the importance of understanding your cloud vendor’s various runtime environments.

Azure Functions – Queue Trigger Scaling (Part 1)

I’m a big fan of the serverless compute model (known as Azure Functions on the Azure platform), but in some ways its greatest strength is also its weakness: the serverless model essentially asks you to run small units of code inside a fully managed environment on a pay for what you need basis that, in theory, will scale infinitely in response to demand. With this increased granularity it is the next evolution in cloud elasticity – there is no more need to buy and reserve CPUs which sit partially idle until the next scaling point is reached. However, as a result, you lose control over the levers you might be used to pulling in a more traditional cloud compute environment – it’s very much a black box. Using typical queue processing patterns as an example, this includes the number of “threads” or actors looking at a queue and the length of the back off timings.

Most of the systems I’ve transitioned onto Azure Functions to date have been more focused on cost than scale and have had no particular latency requirements, so I’ve been happy to reduce my costs without a particularly close examination. However I’m starting to look at moving spikier, higher volume queue systems onto Azure Functions and so I’ve been looking to understand the opaque aspects more fully by running a series of experiments.

Before continuing it’s worth noting that Microsoft continue to evolve the runtime host for Azure Functions and so the results are only really valid at the time they are run – run them again in 6 months and you’re likely to see, hopefully subtle and improved, changes in behaviour.

Most of my higher volume requirements are light in terms of compute power but heavy on volume and so I’ve created a simple function that pulls a message from a Service Bus queue and writes it, along with a timestamp, onto an event hub:

[FunctionName("DequeAndForwardOn")]
[return: EventHub("results", Connection = "EhConnectionString")]
public static string Run([ServiceBusTrigger("testqueue", Connection = "SbConnectionString")]string myQueueItem, TraceWriter log)
{
    log.Info($"C# ServiceBus queue trigger function processed message: {myQueueItem}");
    EventHubOutput message = JsonConvert.DeserializeObject<EventHubOutput>(myQueueItem);
    message.ProcessedAtUtc = DateTime.UtcNow;
    string json = JsonConvert.SerializeObject(message);
    return json;
}

Once the items are on the Event Hub I’m using a Streaming Analytics job to count the number of dequeues per second with a tumbling window and output them to table storage:

SELECT
    System.TimeStamp AS TbPartitionKey,
    '' as TbRowKey,
    SUBSTRING(CAST(System.TimeStamp as nvarchar(max)), 12, 8) as Time,
    COUNT(1) AS totalProcessed
INTO
    resultsbysecond
FROM
    [results]
TIMESTAMP BY EventEnqueuedUtcTime
GROUP BY TUMBLINGWINDOW(second,1)
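
To drive the test the queue needs to be filled with messages up front. The loader is nothing special – a rough sketch of a console app that does this (my own assumption: the Microsoft.Azure.ServiceBus package, the same testqueue used by the function, and a placeholder payload; a real loader would want some parallelism to speed things up) looks something like this:

using System;
using System.Collections.Generic;
using System.Text;
using System.Threading.Tasks;
using Microsoft.Azure.ServiceBus;
using Newtonsoft.Json;

class QueueLoader
{
    static async Task Main()
    {
        QueueClient client = new QueueClient(Environment.GetEnvironmentVariable("SbConnectionString"), "testqueue");
        const int totalMessages = 1000000;
        const int batchSize = 100; // keep each batch comfortably under the queue's maximum message size

        for (int sent = 0; sent < totalMessages; sent += batchSize)
        {
            List<Message> batch = new List<Message>();
            for (int index = 0; index < batchSize; index++)
            {
                // placeholder payload - stand in for whatever the EventHubOutput model expects
                string json = JsonConvert.SerializeObject(new { MessageNumber = sent + index });
                batch.Add(new Message(Encoding.UTF8.GetBytes(json)));
            }
            await client.SendAsync(batch); // SendAsync accepts a list of messages and sends them as a single batch
        }
        await client.CloseAsync();
    }
}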

For this initial experiment I’m simply going to pre-load the Service Bus queue with 1,000,000 messages and analyse the dequeue rate. Taking all the above gives us a workflow that looks like this:

Executing all this gave some interesting results as can be seen from the graph below:

From a cold start it took just under 13 minutes to dequeue all 1,000,000 messages with a fairly linear, if spiky, approach to scaling up the dequeue rate: from a low at the beginning of 23 dequeues per second to a peak of over 3000, increasing at a very rough rate of 3.2 messages per second. It seems entirely likely that this will go on until we start to hit IO limits around the Service Bus. We’d need to do more experiments to be certain but it looks like the runtime is allocating more queue consumers while all existing consumers continue to find there are items on the queue to process.

In the next part we’re going to run a few more experiments to help us understand the scaling rate better and how it is impacted by quiet periods.

C# Cloud Application Architecture – Commanding via a Mediator (Part 2)

In the last post we’d converted our traditional layered architecture into one that instead organised itself around business domains and made use of command and mediator patterns to promote a more loosely coupled approach. However things were definitely still very much a work in progress with our controllers still fairly scruffy and our command handlers largely just a copy and paste of our services from the layered system.

In this post I’m going to concentrate on simplifying the controllers and undertake some minor rework on our commands to help with this. This part is going to be a bit more code heavy than the last and will involve diving into parts of ASP.Net Core. The code for the result of all these changes can be found on GitHub here:

https://github.com/JamesRandall/CommandMessagePatternTutorial/tree/master/Part2

As a reminder, our aim is to move from controllers with actions that look like this:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put(Guid productId, int quantity)
{
    CommandResponse response = await _dispatcher.DispatchAsync(new AddToCartCommand
    {
        UserId = this.GetUserId(),
        ProductId = productId,
        Quantity = quantity
    });
    if (response.IsSuccess)
    {
        return Ok();
    }
    return BadRequest(response.ErrorMessage);
}

To controllers with actions that look like this:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put([FromRoute] AddToCartCommand command) => await ExecuteCommand(command);

And we want to do this in a way that makes adding future controllers and commands super simple and reliable.

We can do this fairly straightforwardly as we now only have a single “service”, our dispatcher, and our commands are simple state. That being the case we can use some of ASP.Net Core’s model binding capabilities to take care of populating our commands, along with some conventions and a base class to do the heavy lifting for each of our controllers. This focuses the complexity in a single, testable, place and means that our controllers all become simple and thin.

If we’re going to use model binding to construct our commands the first thing we need to be wary of is security: many of our commands have a UserId property that, as can be seen in the code snippet above, is set within the action from a claim. Covering claims based security and the options available in ASP.Net Core is a topic in and of itself and not massively important to the application architecture and code structure we’re focusing on, so for the sake of simplicity I’m going to use a hard coded GUID. Hopefully it’s clear from the code below where this ID would come from in a fully fledged solution:

public static class ControllerExtensions
{
    public static Guid GetUserId(this Controller controller)
    {
        // in reality this would pull the user ID from the claims e.g.
        //     return Guid.Parse(controller.User.FindFirst("userId").Value);
        return Guid.Parse("A9F7EE3A-CB0D-4056-9DB5-AD1CB07D3093");
    }
}

If we’re going to use model binding we need to be extremely careful that this ID cannot be set by a consumer as this would almost certainly lead to a security incident. In fact if we make an interim change to our controller action we can see immediately that we’ve got a problem:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put([FromRoute]AddToCartCommand command)
{
    CommandResponse response = await _dispatcher.DispatchAsync(command);
    if (response.IsSuccess)
    {
        return Ok();
    }
    return BadRequest(response.ErrorMessage);
}

The user ID has bled through into the Swagger definition but, because of how we’ve configured the routing with no userId parameter on the route definition, the binder will ignore any value we try and supply: there is nowhere to specify it as a route parameter and a query parameter or request body will be ignored. Still – it’s not pleasant: it’s misleading to the consumer of the API and if, for example, we were reading the command from the request body the model binder would pick it up.

We’re going to take a three pronged approach to this so that we can robustly prevent this happening in all scenarios. Firstly, to support some of these changes, we’re going to rename the UserId property on our commands to AuthenticatedUserId and introduce an interface called IUserContextCommand as shown below that all our commands that require this information will implement:

public interface IUserContextCommand
{
    Guid AuthenticatedUserId { get; set; }
}

With that change made our AddToCartCommand now looks like this:

public class AddToCartCommand : ICommand<CommandResponse>, IUserContextCommand
{
    public Guid AuthenticatedUserId { get; set; }

    public Guid ProductId { get; set; }

    public int Quantity { get; set; }
}

I’m using the Swashbuckle.AspNetCore package to provide a Swagger definition and user interface and fortunately it’s quite a configurable package that will allow us to customise how it interprets actions (operations in its parlance) and schema through a filter system. We’re going to create and register both an operation and a schema filter that ensure any reference to AuthenticatedUserId, either inbound or outbound, is removed. The first code snippet below will remove the property from any schema (request or response bodies) and the second snippet will remove it from any operation parameters – if you use models with the [FromRoute] attribute as we have done then Web API will correctly only bind to the parameters specified in the route but Swashbuckle will still include all the properties of the model in its definition.

public class SwaggerAuthenticatedUserIdFilter : ISchemaFilter
{
    private const string AuthenticatedUserIdProperty = "authenticatedUserId";
    private static readonly Type UserContextCommandType = typeof(IUserContextCommand);

    public void Apply(Schema model, SchemaFilterContext context)
    {
        if (UserContextCommandType.IsAssignableFrom(context.SystemType))
        {
            if (model.Properties.ContainsKey(AuthenticatedUserIdProperty))
            {
                model.Properties.Remove(AuthenticatedUserIdProperty);
            }
        }
    }
}

public class SwaggerAuthenticatedUserIdOperationFilter : IOperationFilter
{
    private const string AuthenticatedUserIdProperty = "authenticateduserid";

    public void Apply(Operation operation, OperationFilterContext context)
    {
        IParameter authenticatedUserIdParameter = operation.Parameters?.SingleOrDefault(x => x.Name.ToLower() == AuthenticatedUserIdProperty);
        if (authenticatedUserIdParameter != null)
        {
            operation.Parameters.Remove(authenticatedUserIdParameter);
        }
    }
}

These are registered in our Startup.cs file alongside Swagger itself:

public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc();
    services.AddSwaggerGen(c =>
    {
        c.SwaggerDoc("v1", new Info { Title = "Online Store API", Version = "v1" });
        c.SchemaFilter<SwaggerAuthenticatedUserIdFilter>();
        c.OperationFilter<SwaggerAuthenticatedUserIdOperationFilter>();
    });

    CommandingDependencyResolver = new MicrosoftDependencyInjectionCommandingResolver(services);
    ICommandRegistry registry = CommandingDependencyResolver.UseCommanding();

    services
        .UseShoppingCart(registry)
        .UseStore(registry)
        .UseCheckout(registry);
}

If we try this now we can see that our action looks as we expect:

With our Swagger work we’ve at least obfuscated our AuthenticatedUserId property and now we want to make sure ASP.Net ignores it during binding operations. A common means of doing this is to use the [BindNever] attribute on models, however we want our commands to remain free of the concerns of the host environment, we don’t want to have to remember to stamp [BindNever] on all of our models, and we definitely don’t want the assemblies that contain them to need to reference ASP.Net packages – after all one of the aims of taking this approach is to promote a very loosely coupled design.

This being the case we need to find another way, and other than the binding attributes possibly the easiest way to prevent user IDs being set from a request body is to decorate the default body binder with some additional functionality that forces an empty GUID. Binders consist of two parts, the binder itself and a binder provider, so to add our binder we need two new classes (and we also include a helper class with an extension method). Firstly our custom binder:

public class AuthenticatedUserIdAwareBodyModelBinder : IModelBinder
{
    private readonly IModelBinder _decoratedBinder;

    public AuthenticatedUserIdAwareBodyModelBinder(IModelBinder decoratedBinder)
    {
        _decoratedBinder = decoratedBinder;
    }

    public async Task BindModelAsync(ModelBindingContext bindingContext)
    {
        await _decoratedBinder.BindModelAsync(bindingContext);
        if (bindingContext.Result.Model is IUserContextCommand command)
        {
            command.AuthenticatedUserId = Guid.Empty;
        }
    }
}

You can see from the code above that we delegate all the actual binding down to the decorated binder and then simply check to see if we are dealing with a command that implements our IUserContextCommand interface, blanking out the GUID if necessary.

We then need a corresponding provider to supply this model binder:

internal static class AuthenticatedUserIdAwareBodyModelBinderProviderInstaller
{
    public static void AddAuthenticatedUserIdAwareBodyModelBinderProvider(this MvcOptions options)
    {
        IModelBinderProvider bodyModelBinderProvider = options.ModelBinderProviders.Single(x => x is BodyModelBinderProvider);
        int index = options.ModelBinderProviders.IndexOf(bodyModelBinderProvider);
        options.ModelBinderProviders.Remove(bodyModelBinderProvider);
        options.ModelBinderProviders.Insert(index, new AuthenticatedUserIdAwareBodyModelBinderProvider(bodyModelBinderProvider));
    }
}

internal class AuthenticatedUserIdAwareBodyModelBinderProvider : IModelBinderProvider
{
    private readonly IModelBinderProvider _decoratedProvider;

    public AuthenticatedUserIdAwareBodyModelBinderProvider(IModelBinderProvider decoratedProvider)
    {
        _decoratedProvider = decoratedProvider;
    }

    public IModelBinder GetBinder(ModelBinderProviderContext context)
    {
        IModelBinder modelBinder = _decoratedProvider.GetBinder(context);
        return modelBinder == null ? null : new AuthenticatedUserIdAwareBodyModelBinder(modelBinder);
    }
}

Our installation extension method looks for the default BodyModelBinderProvider class, extracts it, constructs our decorator around it, and replaces it in the set of binders.

Again, like with our Swagger filters, we configure this in the ConfigureServices method of our Startup class:

// This method gets called by the runtime. Use this method to add services to the container.
public void ConfigureServices(IServiceCollection services)
{
    services.AddMvc(c =>
    {
        c.AddAuthenticatedUserIdAwareBodyModelBinderProvider();
    });
    services.AddSwaggerGen(c =>
    {
        c.SwaggerDoc("v1", new Info { Title = "Online Store API", Version = "v1" });
        c.SchemaFilter<SwaggerAuthenticatedUserIdFilter>();
        c.OperationFilter<SwaggerAuthenticatedUserIdOperationFilter>();
    });

    CommandingDependencyResolver = new MicrosoftDependencyInjectionCommandingResolver(services);
    ICommandRegistry registry = CommandingDependencyResolver.UseCommanding();

    services
        .UseShoppingCart(registry)
        .UseStore(registry)
        .UseCheckout(registry);
}

At this point we’ve configured ASP.Net so that consumers of the API cannot manipulate sensitive properties on our commands and so for our next step we want to ensure that the AuthenticatedUserId property on our commands is populated with the legitimate ID of the logged in user.

While we could set this in our model binder (where currently we ensure it is blank) I’d suggest that’s a mixing of concerns, and in any case it will only be triggered on an action with a request body – it won’t work for a command bound from route or query parameters, for example. All that being the case I’m going to implement this through a simple action filter as below:

public class AssignAuthenticatedUserIdActionFilter : IActionFilter
{
    public void OnActionExecuting(ActionExecutingContext context)
    {
        foreach (object parameter in context.ActionArguments.Values)
        {
            if (parameter is IUserContextCommand userContextCommand)
            {
                userContextCommand.AuthenticatedUserId = ((Controller) context.Controller).GetUserId();
            }
        }
    }

    public void OnActionExecuted(ActionExecutedContext context)
    {
            
    }
}

Action filters run after model binding, authentication and authorization have occurred and so at this point we can be sure we can access the claims for a validated user.
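
The filter also needs registering so that it runs for every action – one way of doing that (a sketch; I’m assuming a global filter here rather than per-controller registration) is to add it to the filter collection alongside our binder provider in ConfigureServices:

services.AddMvc(c =>
{
    c.AddAuthenticatedUserIdAwareBodyModelBinderProvider();
    // run for every action so any IUserContextCommand parameter gets the authenticated user ID
    c.Filters.Add(new AssignAuthenticatedUserIdActionFilter());
});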

Before we move on it’s worth reminding ourselves what our updated controller action now looks like:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put([FromRoute]AddToCartCommand command)
{
    CommandResponse response = await _dispatcher.DispatchAsync(command);
    if (response.IsSuccess)
    {
        return Ok();
    }
    return BadRequest(response.ErrorMessage);
}

To remove this final bit of boilerplate we’re going to introduce a base class that handles the dispatch and HTTP response interpretation:

public abstract class AbstractCommandController : Controller
{
    protected AbstractCommandController(ICommandDispatcher dispatcher)
    {
        Dispatcher = dispatcher;
    }

    protected ICommandDispatcher Dispatcher { get; }

    protected async Task<IActionResult> ExecuteCommand(ICommand<CommandResponse> command)
    {
        CommandResponse response = await Dispatcher.DispatchAsync(command);
        if (response.IsSuccess)
        {
            return Ok();
        }
        return BadRequest(response.ErrorMessage);
    }
}

With that in place we can finally get our controller to our target form:

[Route("api/[controller]")]
public class ShoppingCartController : AbstractCommandController
{
    public ShoppingCartController(ICommandDispatcher dispatcher) : base(dispatcher)
    {
            
    }

    [HttpPut("{productId}/{quantity}")]
    public async Task<IActionResult> Put([FromRoute] AddToCartCommand command) => await ExecuteCommand(command);

    // ... other verbs
}

To roll this out easily over the rest of our commands we’ll need to get them into a more consistent shape – at the moment they have a mix of response types which isn’t ideal for translating the results into HTTP responses. We’ll unify them so they all return a CommandResponse object that can optionally also contain a result object:

public class CommandResponse
{
    protected CommandResponse()
    {
            
    }

    public bool IsSuccess { get; set; }

    public string ErrorMessage { get; set; }

    public static CommandResponse Ok() {  return new CommandResponse { IsSuccess = true};}

    public static CommandResponse WithError(string error) {  return new CommandResponse { IsSuccess = false, ErrorMessage = error};}
}

public class CommandResponse<T> : CommandResponse
{
    public T Result { get; set; }

    public static CommandResponse<T> Ok(T result) { return new CommandResponse<T> { IsSuccess = true, Result = result}; }

    public new static CommandResponse<T> WithError(string error) { return new CommandResponse<T> { IsSuccess = false, ErrorMessage = error }; }

    public static implicit operator T(CommandResponse<T> from)
    {
        return from.Result;
    }
}

With all that done we can complete our AbstractCommandController class and convert all of our controllers to the simpler syntax. By making these changes we’ve removed all the repetitive boilerplate code from our controllers which means fewer tests to write and maintain, less scope for us to make mistakes and less scope for us to introduce inconsistencies. Instead we’ve leveraged the battle hardened model binding capabilities of ASP.Net and concentrated all of our response handling in a single class that we can test extensively. And if we want to add new capabilities all we need to do is add commands, handlers and controllers that follow our convention and all the wiring is taken care of for us.
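
For commands that return data as well as a success or failure indication the base class also needs an overload that unwraps the result. A minimal sketch, consistent with the CommandResponse<T> type above (the validation-aware version comes in a later part), might look like this:

protected async Task<IActionResult> ExecuteCommand<TResult>(ICommand<CommandResponse<TResult>> command)
{
    CommandResponse<TResult> response = await Dispatcher.DispatchAsync(command);
    if (response.IsSuccess)
    {
        return Ok(response.Result);
    }
    return BadRequest(response.ErrorMessage);
}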

In the next part we’ll explore how we can add validation to our commands in a way that is loosely coupled and reusable in domains other than ASP.Net and then we’ll revisit our handlers to separate out our infrastructure code (logging for example) from our business logic.

Other Parts in the Series

Part 5
Part 4
Part 3
Part 1

C# Cloud Application Architecture – Commanding via a Mediator (Part 1)

When designing cloud based systems we often think about the benefits of loosely coupling systems, for example through queues and pub/sub mechanisms, but fall back on more traditional onion or layered patterns for the application architecture when alternative approaches might provide additional benefits.

My aim over a series of posts is to show how combining the Command pattern with the Mediator pattern can yield a number of benefits including:

  • A better separation of concerns with a clear demarcation of domains
  • An internally decomposed application that is simple to work with early in its lifecycle yet can easily evolve into a microservice approach
  • Reliable support for non-transactional or distributed data storage operations
  • Less repetitive infrastructure boilerplate to write, such as logging and telemetry

To illustrate this we’re going to refactor an example application, over a number of stages, from a layered onion ring style to a decoupled in-process application and finally on through to a distributed application. An up front warning: to get the most out of these posts a fair amount of code reading is encouraged. It doesn’t make sense for me to walk through each change, and doing so would make the posts ridiculously long, but hopefully the sample code provided is clear and concise. It really is designed to go along with the posts.

The application is a RESTful HTTP API supporting some basic, simplified, functionality for an online store. The core API operations are:

  • Shopping cart management: get the cart, add products to the cart, clear the cart
  • Checkout: convert a cart into an order, mark the order as paid
  • Product: get a product

To support the above there is also a sense of a store product catalog; it’s not exposed through the API but it is used internally.

I’m going to be making use of my own command mediation framework AzureFromTheTrenches.Commanding as it has some features that will be helpful to us as we progress towards a distributed application, but alternatives such as the popular Mediatr are available and if this approach appeals to you I’d encourage you to look at those too.

Our Starting Point

The code for the onion ring application, our starting point, can be found on GitHub:

https://github.com/JamesRandall/CommandMessagePatternTutorial/tree/master/Part1a

Firstly, let’s define what I mean by an onion ring or layered architecture. In the context of our example application it’s a series of layers built one on top of the other with calls up and down the chain never skipping a layer: the Web API talks to the application services, the application services to the data access layer, and the data access layer to the storage. Each layer drives the one below. To support this each layer is broken out into its own assembly and a mixture of encapsulation (at the assembly and class level) and careful reference management ensures this remains the case.

In the case of our application this looks like this:

Concrete class coupling is avoided through the use of interfaces and these are constructor injected to promote a testable approach. Normally in such an architecture each layer has its own set of models (data access models, application models etc.) with mapping taking place between them; however for the sake of brevity, and as I’m just using mock data stores, I’ve ignored this and used a single set of models that can be found in the OnlineStore.Model assembly.

As this is such a common pattern – it’s certainly the pattern I’ve come across most often over the last few years (where application architecture has been considered at all; sadly the spaghetti / make it up as you go along approach is still prevalent) – it’s worth looking at what’s good about it:

  1. Perhaps most importantly – it’s simple. You can draw out the architecture on a whiteboard with a set of circles or boxes in 30 seconds and it’s quite easy to explain what is going on and why
  2. It’s testable – obviously things get more complicated in the real world, but above the data access layer testing is really easy: everything is mockable and the major concrete external dependencies have been isolated in our data access layer.
  3. A good example of the pattern is open for extension but closed for modification: because there are clear contractual interfaces, decorators can be added to, for example, add telemetry without modifying business logic.
  4. It’s very well supported in tooling – if you’re using Visual Studio its built-in refactoring tools or Resharper will have no problem crunching over this solution to support changes and help keep things consistent.
  5. It can be used in a variety of different operating environments, I’ve used a Web API for this example but you can find onion / layer type architectures in all manner of solutions.
  6. It’s what’s typically been done before – most engineers are familiar with the pattern and it’s become almost a default choice.

I’ve built many successful systems myself using this pattern and, to a point, it’s served me well. However I started looking for better approaches as I experienced some of the weaknesses:

  1. It’s not as loosely coupled and malleable as it seems. This might seem an odd statement to make given our example makes strong use of interfaces and composition via dependency injection, an approach often seen to promote loose coupling and flexibility. And it does to a point. However we still tend to think of operations in such an architecture in quite a tightly coupled way: this controller calls this method on this interface which is implemented by this service class and on down through the layers. It’s a very explicit invocation process.
  2. Interfaces and classes in the service layer often become “fat” – they tend to end up coupled around a specific part of the application domain (in our case, for example, the shopping basket) and used to group together functionality.
  3. As an onion ring application gets larger it’s quite common for the Application Layer to begin to resemble spaghetti – repositories are referenced from across the code base, services refer to services, and helpers to other helpers. You can mitigate this by decomposing the application layer into further assemblies, and making use of various design patterns, but it only takes you so far. These are all sticking plaster over a fundamental problem with the onion ring approach: it’s designed to segregate at the layer level, not at the domain or bounded context level (we’ll talk about DDD later). It’s also common to see concerns start to bleed across what were once discrete layer boundaries.
  4. In the event of fundamental change the architecture can be quite fragile and resistant to modification. Let’s consider a situation where our online store has become wildly successful and to keep it serving the growing userbase we need to break it apart to handle the increased scale. This will probably involve breaking our API into multiple components and using patterns such as queues to asynchronously run and throttle some actions. With the onion ring architecture we’re going to have to analyse what is probably a large codebase looking for how we can safely break it apart. Almost inevitably this will be more difficult than it first seems – and it will probably seem quite difficult to begin with (I may have been here myself!) – and we’ll uncover all manner of implicit tight coupling points.

There are of course solutions to many of these problems, particularly the code organisation issues, but at some point it’s worth considering if the architecture is still helping you or beginning to hinder you and if perhaps there are alternative approaches.

With that in mind, and as I said earlier, I’d like to introduce an alternative. It’s by no means the alternative but it’s one I’ve found tremendously helpful when building applications for the cloud.

The Command Pattern with a Mediator

The classic command pattern aims to encapsulate all the information needed to perform an action typically within a class that contains properties (the data needed to perform the action) and an execute method (that undertakes the action). For example:

public class GetProductCommand : ICommand<Product>
{
    public Guid ProductId { get; set; }

    public Task<Product> Execute()
    {
         /// ... logic
    }
}

This can be a useful way of structuring an application but it tightly couples state and execution, and that can prove limiting if, for example, it would be helpful for the command to participate in a pub/sub approach, take part in a command chain (step 1 – store state for undo, step 2 – execute the command, step 3 – log that the command was executed), or operate across a process boundary.

A powerful improvement can be made to the approach by decoupling the execution from the command state via the mediator pattern. In this pattern all commands, each of which consist of simple serializable state, are dispatched to the mediator and the mediator determines which handlers will execute the command:

Because the command is decoupled from execution it is possible to associate multiple handlers with a command, and moving a handler to a different process space (for example splitting a Web API into multiple Web APIs) is simply a matter of changing the configuration of the mediator. And because the command is simple state we can easily serialize it into, for example, an event store (the example below illustrates a mediator that is able to serialize commands to stores directly, but this can also be accomplished through additional handlers):

It’s perhaps worth also briefly touching on the CQRS pattern – while the Command Message pattern may facilitate a CQRS approach it neither requires it nor has any particular opinion on it.

With that out the way lets take a look at how this pattern impacts our applications architecture. Firstly the code for our refactored solution can be found here:

https://github.com/JamesRandall/CommandMessagePatternTutorial/tree/master/Part1b

And from our layered approach we now end up with something that looks like this:

It’s important to realise when looking at this code that we’re going to do more work to take advantage of the capabilities being introduced here. This is really just a “least work” refactor to demonstrate some high level differences – hopefully over the next few posts you’ll see how we can really use these capabilities to simplify our solution. However even at this early stage there are some clear points to note about how this affects the solution structure and code:

  1. Rather than structure our application architecture around technology we’ve restructured it around domains. Where previously we had a data access layer, an application layer and an API layer we now have a shopping cart application, a checkout application, a product application and a Web API application. Rather than the focus being on technology boundaries it’s more about business domain boundaries.
  2. Across boundaries (previously layers, now applications) we’ve shifted our invocation semantics from interfaces, methods and parameters to simple state that is serializable and persistable and communicated through a “black box” mediator or dispatcher. The instigator of an operation no longer has any knowledge about what will handle the operation and as we’ll see later that can even include the handler executing in a different process space such as another web application.
  3. I’ve deliberately used the term application for these business domain implementations as even though they are all operating in the same process space it really is best to think of them as self contained units. Other than a simple wiring class to allow them to be configured in the host application (via a dependency injector) they interoperate with their host (the Web API) and between themselves entirely through the dispatch of commands.
  4. The intent is that each application is fairly small in and of itself and that it can take the best approach required to solve its particular problem. In reality it’s quite typical that many of the applications follow the same conventions and patterns, as they do here, and when this is the case it’s best to establish some code organisation conventions. In fact it’s not uncommon to see a lightweight onion ring architecture used inside each of these applications (so for lovers of the onion ring – you don’t need to abandon it completely!).

Let’s start by comparing the controllers. In our traditional layered architecture our product controller looked like this:

[Route("api/[controller]")]
public class ProductController : Controller
{
    private readonly IProductService _productService;

    public ProductController(IProductService productService)
    {
        _productService = productService;
    }

    [HttpGet("{id}")]
    public async Task<StoreProduct> Get(Guid id)
    {
        return await _productService.GetAsync(id);
    }
}

As expected this accepts an instance of a service, in this case an IProductService, and its Get method simply passes the ID on down to it. Although the controller is abstracted away from the implementation of the IProductService it is clearly and directly linked to one type of service and a method on that service.

Now let’s look at its replacement:

[Route("api/[controller]")]
public class ProductController : Controller
{
    private readonly ICommandDispatcher _dispatcher;

    public ProductController(ICommandDispatcher dispatcher)
    {
        _dispatcher = dispatcher;
    }

    [HttpGet("{id}")]
    public async Task<StoreProduct> Get(Guid id)
    {
        return await _dispatcher.DispatchAsync(new GetStoreProductQuery
        {
            ProductId = id
        });
    }
}

In this implementation the controller accepts an instance of ICommandDispatcher and then, rather than invoking a method on a service, it calls DispatchAsync on that dispatcher supplying it a command model. The controller no longer has any knowledge of what is going to handle this request and all our commands are executed by discrete handlers. In this case the GetStoreProductQuery command is handled by the GetStoreProductQueryHandler in the Store.Application assembly:

internal class GetStoreProductQueryHandler : ICommandHandler<GetStoreProductQuery, StoreProduct>
{
    private readonly IStoreProductRepository _repository;

    public GetStoreProductQueryHandler(IStoreProductRepository repository)
    {
        _repository = repository;
    }

    public Task<StoreProduct> ExecuteAsync(GetStoreProductQuery command, StoreProduct previousResult)
    {
        return _repository.GetAsync(command.ProductId);
    }
}

It really does no more than the implementation of the ProductService in our initial application but importantly implements the ICommandHandler generic interface, supplying as generic types the type of the command it handles and the result type. Our command mediation framework takes care of routing the command to this handler and will ultimately call ExecuteAsync on it. The framework we are using here allows commands to be chained and so, as well as being given the command state, the handler is also given any previous result.
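
For completeness the command itself is nothing more than serializable state – based on the handler’s signature and the controller above it looks something like this:

public class GetStoreProductQuery : ICommand<StoreProduct>
{
    public Guid ProductId { get; set; }
}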

Handlers are registered with the command registry as follows (this can be seen in the IServiceCollectionExtenions.cs file of Store.Application):

commandRegistry.Register<GetStoreProductQueryHandler>();

The command mediation framework has all it needs from the definition of GetStoreProductQueryHandler to map the handler to the command.

I think it’s fair to say that’s all pretty loosely coupled! In fact it’s so loose that if we weren’t going to go on to make more of this pattern we might conclude that the loss of immediate traceability is not worth it.

In the next part we’re going to visit some of our more complicated controllers and commands, such as the Put verb on the ShoppingCartController, to look at how we can massively simplify the code below:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put(Guid productId, int quantity)
{
    CommandResponse response = await _dispatcher.DispatchAsync(new AddToCartCommand
    {
        UserId = this.GetUserId(),
        ProductId = productId,
        Quantity = quantity
    });
    if (response.IsSuccess)
    {
        return Ok();
    }
    return BadRequest(response.ErrorMessage);
}

By way of a teaser, the aim is to end up with really thin controller actions that, across the piece, look like this:

[HttpPut("{productId}/{quantity}")]
public async Task<IActionResult> Put([FromRoute] AddToCartCommand command) => await ExecuteCommand(command);

We’ll then roll that approach out over the rest of the API. Then get into some really cool stuff!

Other Parts in the Series

Part 5
Part 4
Part 3
Part 2

Cosmos DB x-ms-partitionkey Error

A small tip but one that might save you some time – if you’re trying to run Cosmos DB queries via the REST API you might encounter the following Bad Request error:

The partition key supplied in x-ms-partitionkey header has fewer components than defined in the the collection.

What you actually need to specify is the partition key for your collection using the x-ms-documentdb-partitionkey header.
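
The header value is a JSON array containing the partition key value. A rough sketch of setting it with HttpClient (queryUri is a placeholder here, and the other headers a query requires – authorization, API version and the query content type – are omitted) looks like this:

HttpRequestMessage request = new HttpRequestMessage(HttpMethod.Post, queryUri);
// x-ms-documentdb-partitionkey expects a JSON array containing the partition key value
request.Headers.Add("x-ms-documentdb-partitionkey", "[\"your-partition-key-value\"]");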

An annoyingly misleading error message!

Upgrading an ASP.Net Core 1.1 project to ASP.Net Core 2.0 and run on Azure

Following the instructions on the ASP.Net Core 2.0 announcement blog post I was quickly able to get a website updated to ASP.Net Core 2.0 and run it locally.

The website is hosted on Azure in the App Service model and unfortunately after publishing my project I encountered IIS 502.5 errors. Luckily this was in a deployment slot so my public website was unaffected.

After a bit of head scratching I found two extra steps were required to upgrade an existing web site.

Firstly, in your .csproj file you should have an ItemGroup like the one below:

<ItemGroup>
    <DotNetCliToolReference Include="Microsoft.VisualStudio.Web.CodeGeneration.Tools" Version="1.0.1" />
</ItemGroup>

You need to change the version to 2.0.0.
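
So the ItemGroup ends up looking like this:

<ItemGroup>
    <DotNetCliToolReference Include="Microsoft.VisualStudio.Web.CodeGeneration.Tools" Version="2.0.0" />
</ItemGroup>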

Secondly you need to clear out all the old files from your Azure hosted instance. The easiest way to do this is to open Kudu (Advanced Tools in the portal), navigate to the site/wwwroot folder and delete them.

After that if you publish your website again it should work fine.

Hope that saves someone a few hours of head-scratching.
