AWS Lambda SQS Batch Size

This article provides a detailed overview of batch size and error handling for the AWS Lambda SQS integration.


Batch Size and Error Handling with the SQS Integration

When you set up your SQS event integration, you can configure a “batchSize” property, which specifies the maximum number of SQS messages sent to your Lambda function in a single invocation.

  • This is an important property, but take care to tune it correctly for your needs.
  • Sending batches of messages in a single invocation can reduce costs and speed up message processing.
  • If a Lambda function has to perform a costly operation each time it spins up, such as initializing a database connection or downloading a dataset used to enrich messages, you can save time by processing several messages in a single batch.
  • You efficiently amortize that costly operation across multiple messages, instead of paying its cost for every message that passes through (see the sketch below).
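As a minimal sketch of this pattern, assuming a Python function where create_db_connection and process are hypothetical placeholders for your own setup and per-message logic, the expensive setup can live outside the handler so one invocation pays for it once per batch:

import os

# Initialized once per execution environment, not once per message.
_connection = None

def _get_connection():
    global _connection
    if _connection is None:
        # create_db_connection is a placeholder for your own costly setup.
        _connection = create_db_connection(os.environ["DB_HOST"])
    return _connection

def handler(event, context):
    conn = _get_connection()  # cost paid once, shared by every record below
    for record in event["Records"]:
        process(conn, record["body"])  # placeholder per-message work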

However, you need to understand how batches behave with the SQS integration.

SQS: A traditional messaging system in which messages are placed into a queue for processing. A worker reads one message from the queue and processes it. If the work succeeds, the worker deletes the message from the queue and retrieves another one to process.
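A minimal sketch of that traditional worker loop in Python with boto3 (the queue URL and the handle function are illustrative placeholders, not part of this article):

import boto3

sqs = boto3.client("sqs")
QUEUE_URL = "https://sqs.us-east-2.amazonaws.com/123456789012/my-queue"  # placeholder

while True:
    response = sqs.receive_message(
        QueueUrl=QUEUE_URL,
        MaxNumberOfMessages=1,
        WaitTimeSeconds=20,  # long polling
    )
    for message in response.get("Messages", []):
        handle(message["Body"])  # placeholder for the worker's processing logic
        # Delete only after the work succeeds; otherwise the message becomes
        # visible again after the visibility timeout and is retried.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=message["ReceiptHandle"])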

 

SQS / Lambda integration: A batch of messages either succeeds or fails together. This is a significant point. For example, suppose your batchSize is set to 10 messages, which is the default:

  • If the function is invoked with 10 messages and returns an error while processing the 8th message, all 10 messages stay in the queue to be processed by another Lambda invocation.
  • AWS only deletes the messages from the queue if the function returns successfully, with no errors; the sketch below illustrates this all-or-nothing behavior.
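A minimal Python sketch of this behavior (process_message is a placeholder for your own logic):

def handler(event, context):
    for record in event["Records"]:
        # If this raises on, say, the 8th record, Lambda reports the whole
        # invocation as failed and all 10 messages return to the queue,
        # including the 7 that were already processed successfully.
        process_message(record["body"])
    # Only a clean return lets AWS delete the entire batch from the queue.
    return {"status": "ok"}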

If it is possible for one of your messages to fail while the others succeed, you need to plan for resiliency in your architecture. This can be handled in a few different ways, including:

  • Using a batchSize of “1”, so that messages succeed or fail on their own.
  • Making your processing idempotent, so that reprocessing a message does no harm, beyond the additional processing cost.
  • Handling errors within your function code, perhaps by catching them and sending the message to a dead letter queue for further processing.
  • Calling the DeleteMessage API manually within your function once a message has been successfully processed (see the sketch after this list).

Which approach you choose will depend on the requirements of your architecture.
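As a sketch combining two of the options above, you can catch errors per message and delete each successfully handled message manually, so a late failure does not cause earlier successes to be reprocessed. The queue URLs and process_message are placeholders, not values from this article:

import boto3

sqs = boto3.client("sqs")
# Placeholder URLs; substitute your own queues.
QUEUE_URL = "https://sqs.us-east-2.amazonaws.com/123456789012/my-queue"
DLQ_URL = "https://sqs.us-east-2.amazonaws.com/123456789012/my-dlq"

def handler(event, context):
    for record in event["Records"]:
        try:
            process_message(record["body"])  # placeholder for your logic
        except Exception:
            # Send the failing message to a dead letter queue for review.
            sqs.send_message(QueueUrl=DLQ_URL, MessageBody=record["body"])
        # Delete the message manually so a later failure in this batch
        # cannot cause this one to be reprocessed.
        sqs.delete_message(QueueUrl=QUEUE_URL, ReceiptHandle=record["receiptHandle"])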


Event Source Mapping and Batch Size
  • When you set up your event source mapping, there is not much to configure.
  • One property relevant to retry behavior is the batch size, named BatchSize.
  • The SQS API can retrieve multiple messages in a single request, and AWS then invokes your Lambda with a batch of 1 to 10 messages, according to the configured batch size.
  • The example below uses the AWS CLI to map a function named my-function to a DynamoDB stream, specified by its ARN, with a batch size of 500.
$ aws lambda create-event-source-mapping --function-name my-function --batch-size 500 --starting-position LATEST \
   --event-source-arn arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525
{
    "UUID": "14e0db71-5d35-4eb5-b481-8945cf9d10c2",
    "BatchSize": 500,
    "MaximumBatchingWindowInSeconds": 0,
    "ParallelizationFactor": 1,
    "EventSourceArn": "arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525",
    "FunctionArn": "arn:aws:lambda:us-east-2:123456789012:function:my-function",
    "LastModified": 1560209851.963,
    "LastProcessingResult": "No records processed",
    "State": "Creating",
    "StateTransitionReason": "User action",
    "DestinationConfig": {},
    "MaximumRecordAgeInSeconds": 604800,
    "BisectBatchOnFunctionError": false,
    "MaximumRetryAttempts": 10000
}
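If you prefer to create the mapping programmatically, a rough boto3 equivalent of the CLI call above might look like the following sketch (the function name and stream ARN are taken from the example; error handling is omitted):

import boto3

lambda_client = boto3.client("lambda")

# Same parameters as the CLI example above.
response = lambda_client.create_event_source_mapping(
    FunctionName="my-function",
    BatchSize=500,
    StartingPosition="LATEST",
    EventSourceArn="arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525",
)
print(response["UUID"], response["State"])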

Event source mappings:

  • Read items from a stream or queue in batches.
  • Combine multiple items into a single event that is received by your function.

The size of the batch an event source mapping sends to your function can be configured up to a maximum that varies by service. The number of items in the event may be fewer than the batch size if not enough items are available, or if the batch is too large to send in one event and must be split up.

If a batch of events fails on every processing attempt, the event source mapping can send details about that batch to an SQS queue.
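As a sketch, an on-failure destination can be set when creating the mapping; here the retry count and queue ARN are illustrative placeholders, not values from this article:

import boto3

lambda_client = boto3.client("lambda")

# Sketch: after MaximumRetryAttempts failed attempts, details about the
# failed batch are sent to the SQS queue named in DestinationConfig.
lambda_client.create_event_source_mapping(
    FunctionName="my-function",
    BatchSize=100,
    StartingPosition="LATEST",
    EventSourceArn="arn:aws:dynamodb:us-east-2:123456789012:table/my-table/stream/2019-06-10T19:26:16.525",
    MaximumRetryAttempts=2,
    DestinationConfig={
        "OnFailure": {
            "Destination": "arn:aws:sqs:us-east-2:123456789012:my-failure-queue"
        }
    },
)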


Event batch:

  • The event that Lambda sends to the function.
  • A batch of records or messages compiled from the items that the event source mapping reads from a stream or queue.
  • Batch size and other settings apply only to the event batch.

  • For streams, an event source mapping creates an iterator for each shard in the stream and processes the items in each shard in order, one at a time.
  • The event source mapping can be configured to read only new items that appear in the stream, or to begin with older items.

Here are a few helpful resources on AWS Services:

  • CloudySave is an all-round one-stop shop for your organization & teams to reduce your AWS Cloud Costs by more than 55%.
  • CloudySave’s goal is to provide clear visibility into spending and usage patterns for your Engineers and Ops teams.
  • Have a quick look at CloudySave’s Cost Calculator to estimate real-time AWS costs.
  • Sign up now and uncover instant savings opportunities.

 
