SQS Lambda Concurrency
How do SQS and Lambda work together for concurrency?
When you work with SQS directly, you poll for messages, process them, and then delete them from the queue. If you forget to delete a message, it reappears after the configured VisibilityTimeout: SQS assumes that processing failed and makes the message available for consumption again, so you never lose any messages. None of this applies when you use SQS as an event source for Lambda, because you never touch the SQS side yourself!
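For comparison, this is what the manual receive/process/delete loop looks like. This is only a minimal sketch: `sqs` is assumed to be a boto3 SQS client (any object with the same two methods works), and the queue URL is a placeholder. Note the explicit `delete_message` call, which is exactly the step the Lambda event source mapping handles for you:

```python
def drain_queue(sqs, queue_url, handle, max_batches=10):
    """Receive, process, and delete messages from an SQS queue.

    sqs is expected to expose receive_message/delete_message like a
    boto3 SQS client. A message that is received but never deleted
    reappears after the queue's VisibilityTimeout.
    """
    processed = 0
    for _ in range(max_batches):
        resp = sqs.receive_message(
            QueueUrl=queue_url,
            MaxNumberOfMessages=10,
            WaitTimeSeconds=5,  # long polling
        )
        messages = resp.get("Messages", [])
        if not messages:
            break  # queue drained
        for msg in messages:
            handle(msg["Body"])
            # Deleting acknowledges success; skip this call and SQS
            # redelivers the message after the visibility timeout.
            sqs.delete_message(
                QueueUrl=queue_url,
                ReceiptHandle=msg["ReceiptHandle"],
            )
            processed += 1
    return processed
```

With a real boto3 client you would call it as `drain_queue(boto3.client("sqs"), queue_url, my_handler)`; with SQS as a Lambda event source, this whole loop disappears from your code.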
The function-level concurrency limit will also constrain you: by default, all functions in a region share an unreserved pool of 1000 concurrent executions. You can lower this per function by setting its reserved concurrent executions parameter to a subset of your account limit. However, that reservation is subtracted from the shared pool, so it affects your other functions! In addition, if your Lambda is VPC-enabled, Amazon EC2 limits apply as well.
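The pool arithmetic is worth spelling out. A small sketch, assuming the default 1000-per-region account limit (the constant and function name are mine, not an AWS API):

```python
ACCOUNT_LIMIT = 1000  # default per-region concurrent execution pool

def unreserved_pool(reservations):
    """Concurrency left over for functions with no reservation.

    reservations maps function name -> reserved concurrent executions.
    Every reservation is carved out of the shared pool, which is why
    reserving capacity for one function squeezes all the others.
    """
    return ACCOUNT_LIMIT - sum(reservations.values())

# Reserving 200 for 'ingest' and 300 for 'reports' leaves only 500
# concurrent executions for every other function in the region.
remaining = unreserved_pool({"ingest": 200, "reports": 300})
```

On a real account you would set the reservation itself with the Lambda `PutFunctionConcurrency` API (e.g. boto3's `put_function_concurrency`).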
If you push your SQS queue harder, you will notice the number of in-flight messages starting to climb. That is your Lambda slowly scaling out in response to the queue depth, until it eventually hits the concurrency limit. Messages are still pulled from the queue, but the synchronous invocations fail with an exception. This is where the SQS retry policy is most useful.
What do you think happens when you add one queue as the event source for two distinct functions? Yep, it effectively acts as a load balancer.
Does this actually work?
Some tests were run under the following assumptions:
- Every message was already in the queue before the Lambda trigger was enabled
- SQS visibility timeout: one hour
- Each test case ran in a separate environment and at a different time
- The Lambda does nothing but sleep for a fixed period of time
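The test function itself can be this simple. A sketch of the sleeping handler described above; the sleep length (here taken from a `SLEEP_SECONDS` environment variable of my own naming, defaulting to 3) is what varied between the 3-second and 240-second test cases:

```python
import os
import time

SLEEP_SECONDS = float(os.environ.get("SLEEP_SECONDS", "3"))

def handler(event, context):
    # Do no real work: just hold a concurrency slot for a while,
    # like the 3s / 240s sleeps in the test cases.
    time.sleep(SLEEP_SECONDS)
    # An SQS-triggered invocation receives its batch under "Records".
    return {"processed": len(event.get("Records", []))}
```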
These were the results:
– Normal Use Cases
A total of 1000 messages, with a three-second sleep:
> Nothing interesting happens
> It works exactly as expected
> Messages are consumed quickly
> CloudWatch does not register any scaling
– Normal Use Cases + Heavy Loads
The same three-second sleep, but 10000 messages this time:
> The load exceeds the concurrency limit
> The scale-out takes longer than the execution of the first Lambdas
> No throttling
> Consuming all the messages takes a bit longer
– Long-running Lambdas
Back to 1000 messages, now with a 240-second sleep:
> AWS handles the scale-out of its internal workers
> Roughly 550 Lambdas run concurrently
– Hitting the concurrency limit
Again a 240-second sleep, but pushed to 10000 messages, with the concurrency limit set to 1000:
> AWS reacts to the number of messages available in SQS and keeps scaling its internal workers until the concurrency limit is reached
> It cannot predict how many Lambdas it is allowed to run, so it eventually starts to throttle
> Throttled Lambdas return exceptions to the workers, signaling them to stop, but polling continues in the hope that the global limit has not been hit and other functions are merely occupying the pool
> AWS does not retry the function execution
> The messages return to the queue after the defined VisibilityTimeout
> Some invocations were still observed after 23:30
A similar scenario plays out when you set your own reserved concurrency pool. Running the same test with a maximum of 50 concurrent executions, throughput ends up extremely low because of the throttling.
– Multiple Lambda workers
You can subscribe multiple functions to a single queue!
10000 messages were sent to a queue configured as the event source for four distinct functions:
> Each Lambda executed roughly 2500 times
> The setup behaves like a load balancer, although you cannot subscribe Lambdas from different regions or build a global load balancer this way
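Wiring several functions to one queue is just one event source mapping per function. A sketch, assuming `lambda_client` is a boto3 Lambda client (the real call is `CreateEventSourceMapping`; any object with the same method works, and the queue ARN and function names are placeholders):

```python
def subscribe_functions(lambda_client, queue_arn, function_names, batch_size=10):
    """Create one SQS event source mapping per function.

    All mappings poll the same queue, so messages end up split
    between the functions roughly evenly, like a load balancer.
    Returns the mapping UUIDs.
    """
    uuids = []
    for name in function_names:
        resp = lambda_client.create_event_source_mapping(
            EventSourceArn=queue_arn,
            FunctionName=name,
            BatchSize=batch_size,
        )
        uuids.append(resp["UUID"])
    return uuids
```

With four functions subscribed, each would consume about a quarter of the traffic, matching the ~2500 invocations per function seen above.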
SQS as an event source for Lambda lets you process messages in a simple way, with no need for containers or EC2 instances. Keep in mind, though, that the workers are Lambdas: this solution is not suited to heavy processing, given the 5-minute timeout, the memory constraints, and the concurrency limit.