This is a follow-up to my recent post, How to Trigger a Lambda Function Every 5–10 Seconds. I recommend reading that too, but it’s not required.
As is often the case in software, this problem has more than one possible solution. One awesome thing about serverless is how we can glue services together in creative ways to solve the same problem differently. Jeremy Thomerson did just that when he described an alternate solution.
Jeremy proposed using a Step Function to provide the second-level precision instead of the Lambda/SQS combination used in my previous post. The Step Function itself is similarly triggered by an EventBridge rule.
Like last time, I’ve set this up and analysed the results.
I’ll go into more detail next, but the big changes are that everyminute is now a Step Function (instead of a Lambda function) and the SQS queue is gone.
The solution is cleaner overall (it has a few more lines of YAML, but the code in everyminute.js is gone).
Step Function: everyminute
Let me give you something better to look at instead of YAML. This is a screenshot from AWS’ recently released “Step Functions Workflow Studio”.
I’ve built this to be as comparable as possible to the Lambda/SQS solution, but you could do it other ways (see Enhancements).
Importantly, I’ve set the maximum concurrency to 1 on the Map state; otherwise it would run multiple iterations at once and invoke the Lambda function in parallel.
The Map state iterator waits 10 seconds using a Wait state, then invokes the interval Lambda function asynchronously (it doesn’t wait for it to complete before moving on to the next loop).
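To make the shape of that state machine concrete, here is a rough sketch that builds an Amazon States Language definition as a plain JavaScript object. It is not the definition from my repository: the state names, the Pass state that produces the six-item array, and the placeholder function ARN are all assumptions made for illustration.

```javascript
// Sketch only: builds an ASL definition with a Map state limited to one
// iteration at a time. Each iteration waits, then invokes the Lambda
// function asynchronously. State names and the ARN below are hypothetical.
function buildDefinition(functionArn, intervalSeconds = 10) {
  const iterations = Math.floor(60 / intervalSeconds);
  return {
    Comment: "Invoke a Lambda function every few seconds for one minute",
    StartAt: "BuildIterations",
    States: {
      // Assumption: a Pass state supplies the array the Map state loops over.
      BuildIterations: {
        Type: "Pass",
        Result: {
          iterations: Array.from({ length: iterations }, (_, i) => i + 1),
        },
        Next: "Loop",
      },
      Loop: {
        Type: "Map",
        ItemsPath: "$.iterations",
        MaxConcurrency: 1, // run iterations one after another, not in parallel
        Iterator: {
          StartAt: "Wait",
          States: {
            Wait: { Type: "Wait", Seconds: intervalSeconds, Next: "InvokeInterval" },
            InvokeInterval: {
              Type: "Task",
              Resource: "arn:aws:states:::lambda:invoke",
              Parameters: {
                FunctionName: functionArn,
                InvocationType: "Event", // asynchronous: don't wait for the function
                "Payload.$": "$",
              },
              End: true,
            },
          },
        },
        End: true,
      },
    },
  };
}

// Placeholder ARN, for illustration only.
const definition = buildDefinition(
  "arn:aws:lambda:us-east-1:123456789012:function:interval"
);
console.log(JSON.stringify(definition, null, 2));
```

The printed JSON could be pasted into Workflow Studio’s code view or referenced from a template, but treat it as a starting point rather than a drop-in definition.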
On the interval Lambda function, I’ve removed the SQS event trigger and the reserved concurrency, but the code is unchanged; it still simply logs its input and the current time.
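For readers who haven’t seen the previous post, a function that “logs its input and the current time” can be as small as the sketch below. This is an illustrative stand-in, not a copy of the real function; the helper name buildLogRecord and the field names are my own assumptions.

```javascript
// Builds the record that gets logged. Split out so it is easy to test.
// The field names are hypothetical.
function buildLogRecord(event, now = new Date()) {
  return { input: event, invokedAt: now.toISOString() };
}

// Lambda handler: logs its input and the current time, nothing else.
const handler = async (event) => {
  console.log(JSON.stringify(buildLogRecord(event)));
};

// In a real Lambda module you would export it, e.g.:
// exports.handler = handler;
```

Logging the time as an ISO 8601 string keeps the millisecond component, which is what makes the accuracy analysis possible later.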
How accurate is it?
The results in my previous post show that EventBridge rules appear to trigger on the same second of every minute, give or take ~500 milliseconds.
The Wait state offers second-level accuracy and the AWS console shows it waits for exactly 10,000 milliseconds. State transitions take ~10ms.
The following graph shows how far away from “every 10 seconds” the invocations were over an hour. I’d call this within a second, and usually within 500ms.
Most invocations seem to have happened ~50ms late. If they weren’t late, they were 250–500ms early (with a couple of larger spikes).
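If you want to run the same kind of analysis on your own logs, the sketch below shows one way to measure how far each invocation lands from the nearest 10-second mark. It assumes the expected invocation times fall on exact multiples of the interval, and the sample timestamps are made up for illustration, not taken from my results.

```javascript
// Signed distance (in ms) from the nearest multiple of the interval.
// Positive means late, negative means early.
function driftMs(timestampMs, intervalMs = 10000) {
  const remainder = timestampMs % intervalMs;
  return remainder <= intervalMs / 2 ? remainder : remainder - intervalMs;
}

// Summarises a list of invocation timestamps (epoch milliseconds).
function summarise(timestampsMs, intervalMs = 10000) {
  const drifts = timestampsMs.map((t) => driftMs(t, intervalMs));
  const worstAbsMs = drifts.reduce((max, d) => Math.max(max, Math.abs(d)), 0);
  return { drifts, worstAbsMs };
}

// Made-up sample: 50ms late, 300ms early, 45ms late.
const sample = [
  Date.UTC(2021, 0, 1, 0, 0, 10, 50),
  Date.UTC(2021, 0, 1, 0, 0, 19, 700),
  Date.UTC(2021, 0, 1, 0, 0, 30, 45),
];
console.log(summarise(sample));
```

Keeping the sign makes it easy to separate the “slightly late” cluster from the “early” spikes when plotting.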
The graph below compares these results to the previous ones. Very similar!
Lambda & SQS
I didn’t actually publish the cost of the other solution, so let’s look at it now.
The cost is made up of:
- everyminute invocation time +
- everyminute sending messages to SQS +
- Lambda polling SQS for interval
I’m omitting the tiny SQS storage cost and assuming a 30-day month.
The everyminute function had an average billed duration of 25ms and I was running it at 256MB of memory. 43,200 invocations a month, priced at $0.0000000042 per millisecond, means $0.004536/month for everyminute invocation time.
everyminute sends 6 messages to SQS each time it’s invoked (259,200 messages a month). If we ignore the free tier and use $0.40 per million requests, everyminute sending messages to SQS costs $0.10368/month.
Lambda long polls with five processes. Let’s say four of them never pick up messages, so they each make 3 requests per minute. One process picks up a message every 10 seconds, so it makes 6 requests per minute. That’s 18 requests per minute, or 777,600 a month. Polling requests are priced the same as sending messages, so Lambda polling SQS for interval costs $0.31104 per month.
In total, that’s $0.419256/month. I’ll call that 42 cents.
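The arithmetic above can be reproduced with a few lines of code. This is a sketch under the same assumptions as the text: a 30-day month, no free tier, $0.40 per million SQS requests, and a Lambda rate of $0.0000000042 per millisecond at 256MB (the rate that reproduces the $0.004536 figure). The function and parameter names are my own.

```javascript
const MINUTES_PER_MONTH = 30 * 24 * 60; // 43,200
const LAMBDA_PRICE_PER_MS_256MB = 0.0000000042; // assumed rate, see lead-in
const SQS_PRICE_PER_REQUEST = 0.4 / 1e6;

// Monthly cost of the Lambda/SQS solution, broken into its three parts.
function lambdaSqsMonthlyCost({
  billedMs = 25,
  messagesPerMinute = 6,
  idlePollers = 4,
  idlePollsPerMinute = 3,
  busyPollsPerMinute = 6,
} = {}) {
  const invocation = MINUTES_PER_MONTH * billedMs * LAMBDA_PRICE_PER_MS_256MB;
  const sending = MINUTES_PER_MONTH * messagesPerMinute * SQS_PRICE_PER_REQUEST;
  const pollsPerMinute = idlePollers * idlePollsPerMinute + busyPollsPerMinute;
  const polling = MINUTES_PER_MONTH * pollsPerMinute * SQS_PRICE_PER_REQUEST;
  return { invocation, sending, polling, total: invocation + sending + polling };
}

const cost = lambdaSqsMonthlyCost();
console.log(cost.total.toFixed(2)); // rounds to about 42 cents
```

Polling dominates the bill here, so the number of idle pollers is the assumption worth revisiting if your numbers differ.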
For the Step Function solution, I have the Express Step Function being executed every minute (43,200 executions a month). At $1.00 per million requests, that costs $0.0432/month.
Small Step Functions like this one use 64MB of memory, so $0.000001042 per second. Since the Step Function is effectively running constantly, that means it costs $2.700864/month.
In total, that’s $2.744064/month. I’ll call that $2.75.
For comparison, a Standard Step Function could cost ~$13/month.
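Here is the same calculation as a sketch. The Express numbers come straight from the text above. For the Standard comparison I’m assuming $0.025 per 1,000 state transitions and roughly 12 transitions per execution (six loops of a Wait and a Task); the transition count is a guess for illustration, so treat the Standard figure as a ballpark.

```javascript
const EXECUTIONS_PER_MONTH = 30 * 24 * 60; // one per minute
const SECONDS_PER_MONTH = 30 * 24 * 60 * 60;
const EXPRESS_PRICE_PER_REQUEST = 1.0 / 1e6;
const EXPRESS_PRICE_PER_SECOND_64MB = 0.000001042;
const STANDARD_PRICE_PER_TRANSITION = 0.025 / 1000; // assumed rate, see lead-in

// Express: pay per request plus duration, assuming it runs constantly.
function expressMonthlyCost() {
  const requests = EXECUTIONS_PER_MONTH * EXPRESS_PRICE_PER_REQUEST;
  const duration = SECONDS_PER_MONTH * EXPRESS_PRICE_PER_SECOND_64MB;
  return { requests, duration, total: requests + duration };
}

// Standard: pay per state transition. The transition count is hypothetical.
function standardMonthlyCost(transitionsPerExecution = 12) {
  return EXECUTIONS_PER_MONTH * transitionsPerExecution * STANDARD_PRICE_PER_TRANSITION;
}

console.log(expressMonthlyCost().total.toFixed(2));
console.log(standardMonthlyCost().toFixed(2));
```

Because Standard pricing scales with transitions rather than time, adding states to the loop raises its cost while leaving the Express cost almost unchanged.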
Enhancements
The most interesting enhancement is one Jeremy already pointed out: Express Step Functions can run for up to 5 minutes. That means you could change the EventBridge rule to trigger every 5 minutes instead of every minute. The Step Function would then invoke the Lambda function 30 times instead of 6 times.
This would save 3 cents a month on per-execution costs. In my opinion, this isn’t worth it for 3 reasons (one per cent?):
- If an execution fails for unexpected reasons, I’d prefer to retry it the next minute instead of having up to 5 minutes without invoking my function.
- Running the Step Function right up to the 5 minute limit seems risky. It is hard to reason about the Wait states, transition times, and Lambda service response times to make sure it all fits.
- Due to transition times and Lambda service response times, I think that the longer the Step Function runs, the more the invocation times would move around. I like the reset given by starting a new execution.
Maybe you could do 3 or 4 minutes if you really wanted, but I still don’t think it’s worth it.
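To put a number on that saving, here is a small sketch that compares per-execution (request) costs for different execution lengths. It assumes $1.00 per million Express requests and a 30-day month; duration charges are left out because the Step Function runs constantly either way. The function names are my own.

```javascript
const EXPRESS_REQUEST_PRICE = 1.0 / 1e6;
const MINUTES_IN_MONTH = 30 * 24 * 60;

// Monthly request cost when each execution covers the given number of minutes.
function requestCostPerMonth(minutesPerExecution) {
  const executions = MINUTES_IN_MONTH / minutesPerExecution;
  return executions * EXPRESS_REQUEST_PRICE;
}

// How many times one execution invokes the Lambda function.
function invocationsPerExecution(minutesPerExecution, intervalSeconds = 10) {
  return (minutesPerExecution * 60) / intervalSeconds;
}

const saving = requestCostPerMonth(1) - requestCostPerMonth(5);
console.log(invocationsPerExecution(5)); // 30
console.log(saving.toFixed(4));
```

Trying 3 or 4 minutes shows the saving shrinks only slightly, which is why the choice comes down to failure recovery and timing drift rather than cost.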
Similar to the previous solution, if you wanted to fan this out to invoke multiple functions, you could make the Step Function directly publish an SNS message instead of invoking just one function directly. Be aware that this will add some latency.
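As a sketch of what that change might look like, the function below builds a Task state that publishes to SNS through the Step Functions service integration, which would take the place of the Lambda invoke state inside the Map iterator. I haven’t deployed this variant; the topic ARN is a placeholder and the message format is an assumption.

```javascript
// Sketch only: a Task state that publishes to an SNS topic instead of
// invoking a Lambda function directly. The topic ARN is hypothetical.
function buildPublishState(topicArn) {
  return {
    Type: "Task",
    Resource: "arn:aws:states:::sns:publish",
    Parameters: {
      TopicArn: topicArn,
      // Assumption: send the iteration input as a JSON string.
      "Message.$": "States.JsonToString($)",
    },
    End: true,
  };
}

// Placeholder ARN, for illustration only.
const publishState = buildPublishState(
  "arn:aws:sns:us-east-1:123456789012:every-ten-seconds"
);
console.log(JSON.stringify(publishState, null, 2));
```

Each function you want to fan out to would then subscribe to the topic, and the Step Function’s role would need permission to publish to it.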
This solution is great! Sure, it costs 6.5 times more, but it’s on a small enough scale that I suspect the extra $2.33 isn’t going to break the bank. If you wanted to invoke your function even more often, the price gap would close very slightly.
The accuracy is slightly poorer, but it’s not far off. There are fewer moving parts and no custom Lambda code for bugs to hide in. Finally, I think this solution is a little easier to understand.
Which one should you use? Either, really. If cost (or accuracy?) is very important to you then go with the previous solution. Otherwise, use this Step Function solution.
All of the code shown here is available on GitHub.