A Close Look At .NET Core 3.1 on AWS Lambda
The day we’ve been waiting for came yesterday when support for .NET Core 3.1 on AWS Lambda was announced. If you recall, AWS’ policy is to only support LTS versions of .NET Core, so it was only a matter of time. Almost four months since 3.1 was announced, actually.
To my knowledge, .NET Core 3.1 doesn’t contain any changes from 3.0 that would affect its performance on AWS Lambda, so my recommendations shouldn’t need to change. In fact, AWS made one of the same recommendations when they advised turning on ReadyToRun for better cold start performance. In my post, I also recommended turning off TieredCompilation due to the performance impact it had for 15–20 seconds after a cold start. We’ll see if that is still necessary.
Related post: .NET Core 3.0 AWS Lambda Benchmarks and Recommendations
I’ve been looking forward to 3.1 support since I first benchmarked 3.0-preview4 back in April 2019. At that time, the only way to run 3.0 was using a custom runtime, which negatively impacts performance. Even so, it was clear that 3.1 was going to be a game changer when natively supported.
In this post, I’m going to cover two things. First, we’ll look under the hood at the differences between AWS Lambda’s .NET Core 2.1 and 3.1 support. Then we’ll get to the graphs and see just how much better 3.1 is, as well as double-check that my recommendations are better than the defaults.
Under The Hood
Ever wondered how your handler method gets called and how its class gets instantiated? Well, without diving too deep:
Lambda containers run a native program which starts .NET running an assembly named Bootstrap. Bootstrap loads your assembly and instantiates your class (or triggers your static constructor). Assuming everything goes well, it then enters the invoke loop which uses P/Invoke and shared memory to receive and respond to invocations.
The most interesting part is the Bootstrap assembly. There weren’t many changes made to produce the .NET Core 3.1 version, but a few of them are interesting enough to talk about.
Until recently, the go-to .NET library for JSON has been Json.NET. Almost as ubiquitous as the library itself is resolving the version conflicts caused by dependencies using different versions from the one you’re using. To avoid this problem, AWS adopted LitJSON, a small, embedded JSON library.
Now we have System.Text.Json, a high-performance, secure, and standards-compliant JSON library from Microsoft. I haven’t seen a performance test between LitJSON and System.Text.Json, but I guess the Lambda team has.
System.Text.Json is now being used instead of LitJSON to deserialize properties of the CognitoClientContext that appear in ILambdaContext when using Cognito credentials.
Move aside Json.NET
Unless both your handler’s input and output are of type Stream, you must configure a serializer for Lambda to use. You would have seen this line before:
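A sketch of how that line typically appears, assuming the LambdaSerializer assembly attribute from Amazon.Lambda.Core and a reference to the Amazon.Lambda.Serialization.Json package:

```csharp
using Amazon.Lambda.Core;

// Registers the Json.NET-based serializer for every handler in this assembly.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.Json.JsonSerializer))]
```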
This is telling Bootstrap to deserialize input and serialize output using the Amazon.Lambda.Serialization.Json.JsonSerializer class, which is found in the Amazon.Lambda.Serialization.Json NuGet package.
In keeping with the theme, AWS have created a new NuGet package named Amazon.Lambda.Serialization.SystemTextJson which contains a new implementation (LambdaJsonSerializer) based on System.Text.Json.
With this change, it’s System.Text.Json all the way down and we can expect to see performance gains as a result.
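Opting in to the new serializer should only require pointing the same assembly attribute at the new class. A minimal sketch, assuming the class lives in a namespace matching the package name:

```csharp
using Amazon.Lambda.Core;

// Assumption: LambdaJsonSerializer sits in the Amazon.Lambda.Serialization.SystemTextJson namespace.
[assembly: LambdaSerializer(typeof(Amazon.Lambda.Serialization.SystemTextJson.LambdaJsonSerializer))]
```

Only one serializer attribute should be present per assembly, so this line replaces the Json.NET one rather than sitting beside it.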
You can still use Json.NET with .NET Core 3.1 Lambda functions, though. This may be necessary for compatibility or functionality reasons.
In one of my first blog posts, I wrote about how Lambda functions that call async code and return a task (such as Task<T>) throw an AggregateException that wraps the real exception.
The solution to unwrapping that AggregateException was to set the UNWRAP_AGGREGATE_EXCEPTIONS environment variable. Doing so changed the way Bootstrap awaits the result of your function.
This is still the case for .NET Core 2.1 functions. However, for .NET Core 3.1 functions, the Lambda team have taken the opportunity to make a potentially breaking change. They’ve removed the environment variable and changed the default behaviour to always avoid the AggregateException by calling GetAwaiter().GetResult(). There is no way to change this behaviour back, but that’s probably for the best.
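The difference between the two ways of blocking on a task can be reproduced outside Lambda. The sketch below is not Bootstrap's code: FailingHandlerAsync is a hypothetical stand-in for a handler, and Task.Result is used as one example of a blocking call that wraps exceptions.

```csharp
using System;
using System.Threading.Tasks;

public static class UnwrapDemo
{
    // Hypothetical stand-in for an async handler that fails.
    public static async Task<string> FailingHandlerAsync()
    {
        await Task.Yield();
        throw new InvalidOperationException("real error");
    }

    // Blocking on Task.Result surfaces the failure wrapped in an AggregateException.
    public static string CaughtViaResult()
    {
        try { return FailingHandlerAsync().Result; }
        catch (Exception e) { return e.GetType().Name; }
    }

    // GetAwaiter().GetResult() rethrows the original exception instead.
    public static string CaughtViaGetResult()
    {
        try { return FailingHandlerAsync().GetAwaiter().GetResult(); }
        catch (Exception e) { return e.GetType().Name; }
    }

    public static void Main()
    {
        Console.WriteLine(CaughtViaResult());    // AggregateException
        Console.WriteLine(CaughtViaGetResult()); // InvalidOperationException
    }
}
```

Error handling that matches on AggregateException, or on the shape of the reported error, is the kind of code that could break when moving a function from 2.1 to 3.1.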
If you’ve turned on active tracing and checked out X-Ray (or just read any of my previous blog posts), you’ll know that Lambda reports different trace subsegments to X-Ray.
In the past, we’ve seen Initialization, Invocation, and Overhead. Initialization refers to the cold start time before your handler method is first executed, invocation is the time your handler is running, and overhead was previously only seen when using a custom runtime.
Previously, overhead seemed to refer to the time after your custom runtime reported the result to the runtime API. Now, however, .NET Core 3.1 functions are always reporting Overhead.
It’s unclear why, but it might have something to do with the fact 3.1 is running on Amazon Linux 2, while 2.1 remains on Amazon Linux 1.
There were a couple of other small changes.
One is that Bootstrap is now explicitly reporting initialization errors. These happen when the configured assembly, class, or method doesn’t exist, your constructor throws an exception, or something else goes wrong. I suspect this is to better support provisioned concurrency and debugging.
Another even more minor change is in a piece of code where Bootstrap was previously calling string.Join on an object array in which three of the four items were of type string and one was an int. The int is now converted to a string first, which avoids boxing and unboxing by using the string overload. I wouldn’t expect any noticeable performance impact from this, but I just wanted to draw attention to the attention to detail!
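To illustrate the kind of change described (this is not the actual Bootstrap code; the separator and values are made up), compare the two overloads:

```csharp
using System;

public static class JoinDemo
{
    // Mixed array: the int is boxed and the object overload of string.Join is chosen.
    public static string JoinObjects(string a, string b, string c, int n) =>
        string.Join(":", new object[] { a, b, c, n });

    // All strings: the int is converted up front, so the string overload is used.
    public static string JoinStrings(string a, string b, string c, int n) =>
        string.Join(":", new string[] { a, b, c, n.ToString() });

    public static void Main()
    {
        Console.WriteLine(JoinObjects("a", "b", "c", 42)); // a:b:c:42
        Console.WriteLine(JoinStrings("a", "b", "c", 42)); // a:b:c:42
    }
}
```

The output is identical either way; only the overload chosen by the compiler differs.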
Show Me The Numbers
Code targeting .NET Core 3.1 is more performant than .NET Core 2.1 without any change. Switching from Json.NET to System.Text.Json should help too. I’ve kept this in mind while designing these benchmarks.
Meet the competitors
There are three configurations under test here.
- .NET Core 2.1
- .NET Core 3.1 with default settings
- .NET Core 3.1 with ReadyToRun:on and TieredCompilation:off
Default settings are ReadyToRun:off and TieredCompilation:on. The third configuration is the one I recommended previously. Add the following to your project file’s <PropertyGroup> to change these settings:
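As a minimal sketch, the third configuration corresponds to these two standard MSBuild properties:

```xml
<PropertyGroup>
  <!-- Pre-compile assemblies to native code at publish time -->
  <PublishReadyToRun>true</PublishReadyToRun>
  <!-- Turn off tiered JIT compilation -->
  <TieredCompilation>false</TieredCompilation>
</PropertyGroup>
```

ReadyToRun output is platform-specific, so the publish step typically also needs a Linux runtime identifier.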
Simple vs Complex
I’ve created two projects which will be compiled targeting each of the above configurations.
The simple project is about as simple as it gets. It takes in a Stream and returns a Stream. There’s no JSON serialization happening and it’s not doing anything other than returning the stream it was given.
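A handler of that shape might look like the sketch below. The class and method names are hypothetical, and the Main method exists only so the snippet can run outside Lambda.

```csharp
using System;
using System.IO;
using System.Text;

public class SimpleFunction
{
    // Returns the input stream untouched; no serializer is involved for Stream in and out.
    public Stream Handler(Stream input) => input;

    public static void Main()
    {
        var payload = new MemoryStream(Encoding.UTF8.GetBytes("{\"hello\":\"world\"}"));
        var result = new SimpleFunction().Handler(payload);
        Console.WriteLine(new StreamReader(result).ReadToEnd());
    }
}
```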
The complex project does a lot more. Its goal is to simulate an API Gateway HTTP API integration. It takes an APIGatewayHttpApiV2ProxyRequest (available in the recently released 2.0.0 version of Amazon.Lambda.APIGatewayEvents) and returns the corresponding response type.
The handler uses Amazon Rekognition to detect labels for an image stored in S3 and tags the S3 object with the labels. Lastly, it removes the tags it added. At each step, it’s starting and stopping X-Ray subsegments. All this brings in a handful of NuGet packages and a realistic amount of code to JIT and run.
The JSON input used for all functions will be the example payload for the HTTP API Lambda integration. The request body is a serialized S3 event referring to the aforementioned S3 object. The S3 event will be deserialized by the complex project using Json.NET in 2.1 and System.Text.Json in 3.1.
Cold vs Warm
Cold starts are still very relevant for .NET. The more code you have, the slower your cold starts are going to be. On the other hand, warm starts are generally only affected by compiler options and how much your function does.
I force cold starts by changing the function configuration between each invocation. I also exclude the two fastest and two slowest results because Lambda can be weird sometimes.
For warm starts, I exclude the first two invocations because the first may be a cold start and the second is sometimes unrepresentative.
I invoke the function 30–50 times to get good averages for cold starts. For warm starts, I invoke it repeatedly for about three minutes.
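The cold start averaging described above amounts to a trimmed mean. A minimal sketch, with hypothetical names and made-up sample durations:

```csharp
using System;
using System.Linq;

public static class BenchmarkStats
{
    // Average after dropping the `trim` fastest and `trim` slowest samples.
    public static double TrimmedMean(double[] durationsMs, int trim = 2)
    {
        if (durationsMs.Length <= 2 * trim)
            throw new ArgumentException("Not enough samples to trim.");

        return durationsMs
            .OrderBy(d => d)
            .Skip(trim)
            .Take(durationsMs.Length - 2 * trim)
            .Average();
    }

    public static void Main()
    {
        // Made-up durations in milliseconds, for illustration only.
        var samples = new double[] { 900, 100, 510, 500, 490, 505, 495, 2000 };
        Console.WriteLine(TrimmedMean(samples));
    }
}
```

Dropping both tails keeps a single unusually slow or fast container from skewing the average when there are only 30 to 50 samples.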
As shown below, .NET Core 3.1 easily beats 2.1 in total duration for the complex function. When further optimised by enabling ReadyToRun, it manages to finish in half the time with 128 MB of memory.
As usual, the gaps start to close when more memory (and therefore more CPU power) is available. However, even with 512 MB, the difference is still close to double!
Drilling down into the initialization time tells a different story. .NET Core 2.1 is actually faster here. The only reason I can think of is that maybe 3.1 has more overhead in loading assemblies or performing reflection.
The results of the simple function are unexpected. They show .NET Core 2.1 as being significantly faster than 3.1 and scaling better with higher memory.
Again, could there be more initial overhead in .NET Core 3.1? Or something to do with the Amazon Linux 2 instances it runs on?
This is interesting! The graph below shows that for .NET Core 2.1, about half the total runtime is initialization, whereas for .NET Core 3.1 the proportion is much higher.
This supports the hypothesis that .NET Core 3.1 has significant overhead somewhere around loading your assembly, instantiating your class, and calling your constructor (all of which is done via reflection).
Below is the result of repeatedly invoking the complex function with 512 MB of memory. It’s very difficult to say any of them are particularly faster or more stable than the others.
The most important thing to notice here is that the 15–20 seconds of warm start problems with Tiered Compilation appear to be gone! This could be a change in 3.1 or Amazon Linux 2. I’m not sure!
In fact, Tiered Compilation doesn’t seem to be making any difference at all. That’s strange. I’ve tried explicitly turning it off and on, but have seen no difference.
Other than that, .NET Core 3.1 didn’t do much for warm starts, at least not for my complex sample. Your mileage may vary if your function does a lot of JSON operations, or something else that was improved in 3.0.
The behaviour at other memory amounts is more of the same, so I won’t bother adding those graphs. There’s also no point including the simple warm starts as the numbers are so small and identical that the only takeaway is that nothing has changed for the better or worse.
You’re making a new Lambda function. Should you choose .NET Core 2.1 or .NET Core 3.1? You should choose .NET Core 3.1.
You have a .NET Core 2.1 function. Should you migrate it to .NET Core 3.1? Yes, you will most likely see a performance boost. Just be careful of breaking changes and test well.
When using .NET Core 3.1 on Lambda, you should continue to compile with ReadyToRun turned on (if possible) by adding <PublishReadyToRun>true</PublishReadyToRun> to your project file’s <PropertyGroup>.
Tiered Compilation doesn’t appear to make a difference one way or the other anymore. Just keep it in mind if you see varying performance during the first ~20 seconds of your Lambda container’s life.
Lastly, remember that UNWRAP_AGGREGATE_EXCEPTIONS doesn’t do anything in .NET Core 3.1 functions and exceptions are unwrapped by default. This may be a breaking change in your error handling.