When working on a serverless-first application, it’s really common to use SQS, SNS, and EventBridge. The EventBridge and SNS integrations with Lambda generally behave how you’d expect.
You can configure which Lambda to run. You can even apply filters on the content of the event (for EventBridge) and on the message attributes (for SNS). These two integrations also have fairly understandable backoff and retry behavior should the number of invocations exceed your Lambda’s capacity. SNS will retry the message delivery 100,015 times over 23 days, and EventBridge will retry the event 185 times over 24 hours. You can configure both of these retry patterns.
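To give a rough idea, here is what those filters (and EventBridge’s retry policy) can look like in serverless.yml. This is only a sketch: the function name, event source, topic, and attribute values are made up, and the exact keys can vary by Serverless Framework version.

functions:
  handleOrderEvents:
    handler: src/handleOrderEvents.handler
    events:
      # EventBridge: filter on the content (the detail) of the event
      - eventBridge:
          pattern:
            source:
              - custom.orders
            detail:
              status:
                - SHIPPED
          # The retry behavior for this target is configurable
          retryPolicy:
            maximumEventAge: 3600
            maximumRetryAttempts: 10
      # SNS: filter on message attributes with a filter policy
      - sns:
          topicName: order-updates
          filterPolicy:
            status:
              - SHIPPED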
However, the SQS integration with Lambda has several weird quirks to it. You can read a detailed overview of them here.
To summarize, the Lambda service uses 5 parallel long-polling connections to poll for messages from SQS. This means Lambda’s polling essentially takes precedence over any polling configuration you’ve set on your SQS queue. It also means that if you set your Lambda’s reserved concurrency to anything under 5, you will see very odd behavior with your invocations.
For example, if you set the concurrency limit on your Lambda to 1 and there are 5 messages in your SQS queue, Lambda will receive those 5 messages as anywhere from 1 event containing all 5 messages to 5 events with 1 message each.
Each of these potential events will be delivered to your Lambda at the exact same time. Yet, with a concurrency of 1, only one of those events will actually result in an invocation. The other payloads will be throttled and discarded rather than processed. And because of SQS’s visibility timeout, those discarded messages will not be available to your Lambda again until the timeout expires.
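For reference, a setup like the following reproduces the behavior described above. Again, this is only a sketch with made-up names; the interesting pieces are the reserved concurrency under 5 and the queue’s visibility timeout.

functions:
  processQueue:
    handler: src/processQueue.handler
    reservedConcurrency: 1   # anything under 5 can surface this behavior
    events:
      - sqs:
          arn: !GetAtt ProcessQueue.Arn

resources:
  Resources:
    ProcessQueue:
      Type: AWS::SQS::Queue
      Properties:
        QueueName: process-queue
        # Discarded messages stay invisible for this long before they can be polled again
        VisibilityTimeout: 900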
As you can see, this can be a problem when trying to use an SQS queue this way. Deliberately slowing your system down like this is generally an antipattern, and you should most likely look to increase or remove the concurrency limit on your Lambda. But there are times where it makes sense to limit it.
If you do need to limit your Lambda’s concurrency, one solution I’ve found is to run a scheduled (cron) Lambda every 1-2 minutes and, inside that function, poll the SQS queue manually.
Example serverless.yml
processMessages:
  handler: src/processSQSMessages.handler
  description: "Query SQS for messages and process them."
  events:
    - schedule: rate(2 minutes)
Example Handler.ts
import { SQS } from 'aws-sdk';

// The queue URL is assumed to be provided via an environment variable
const QueueUrl = process.env.QUEUE_URL!;

export async function handler() {
  // The region is picked up from the Lambda environment
  const sqsClient = new SQS({ region: process.env.AWS_REGION });
  console.log('Querying queue');
  const response = await sqsClient
    .receiveMessage({ QueueUrl, MaxNumberOfMessages: 1, VisibilityTimeout: 900, WaitTimeSeconds: 0 })
    .promise();
  const [message] = response?.Messages || [];
  if (!message || !message.ReceiptHandle) {
    // Nothing to process this run; the next scheduled invocation will check again
    return;
  }
  // process message here
  console.log('Deleting message');
  await sqsClient.deleteMessage({ QueueUrl, ReceiptHandle: message.ReceiptHandle }).promise();
  console.log('Finished processing messages');
}
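One thing to note: the scheduled function needs permission to receive and delete messages, and it needs to know the queue URL. A minimal sketch of that wiring in serverless.yml, reusing the hypothetical ProcessQueue resource from the earlier sketch and a made-up QUEUE_URL environment variable, might look like this:

provider:
  environment:
    # The handler above reads this as process.env.QUEUE_URL; Ref on an SQS queue returns its URL
    QUEUE_URL: !Ref ProcessQueue
  iamRoleStatements:
    - Effect: Allow
      Action:
        - sqs:ReceiveMessage
        - sqs:DeleteMessage
        - sqs:GetQueueAttributes
      Resource: !GetAtt ProcessQueue.Arn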
All of this being said, the real solution is usually not to limit your Lambda or SQS processing capacity at all.
Here is a quote from the AWS Well-Architected Framework whitepaper (https://d1.awsstatic.com/whitepapers/architecture/AWSWell-ArchitectedFramework.pdf):
If you make a poor capacity decision when deploying a workload, you might end up sitting on expensive idle resources or dealing with the performance implications of limited capacity. With cloud computing, these problems can go away. You can use as much or as little capacity as you need, and scale up and down automatically.