
And once the maximum is reached, the message will end up in the DLQ. It looks like we can find a proper value here.

But once the quota is reached, SQS will still retry sending messages. The value could be calculated from the account's maximum allowed concurrency in conjunction with the number of parallel tasks of the step-functions. However, this leaves us with the exact same issues described in (3).

Assuming we have an SQS queue in front of the SFN, the SQS consumer can be configured with a fixed provisioned concurrency. The SQS consumer is not an indicator of the workload, though, as it only triggers the step-functions.

Due to the uneven workload this is not optimal, as it would be better to have the concurrency distributed by actual workload rather than by chance.

By using a Kinesis data stream with predefined shards and batch sizes we can implement the rate-limiting logic. Although this solution works quite well for their use case, I am not convinced it would be a good approach for ours. Summary: by using virtually grouped items we can define the number of items that are being processed. Something along the lines of what this blog post explains. Without any more logic this would not solve the concurrency issues. However, a simple queue on top does not provide rate-limiting.

As SQS can't be configured to execute an SFN directly, a Lambda in between would be required, which then triggers the SFN by code. This will be one of our choices nevertheless, as it is always good to have a queue in front of such a number of events. Besides, it will have an impact on the clients, as they will receive a 4xx response.
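To make the "Lambda in between" idea more concrete, here is a minimal sketch of an SQS-triggered handler that starts one step-function execution per message. It is an illustration under assumptions, not our actual implementation: `make_handler` and the `STATE_MACHINE_ARN` variable are hypothetical names. The partial-batch response only takes effect if `ReportBatchItemFailures` is enabled on the event source mapping. The client is injected so the logic can be tried without AWS access.

```python
import json


def make_handler(sfn_client, state_machine_arn):
    """Build an SQS-triggered Lambda handler (hypothetical helper).

    sfn_client is expected to offer start_execution(stateMachineArn=..., input=...),
    as the boto3 "stepfunctions" client does.
    """

    def handler(event, context=None):
        failures = []
        for record in event.get("Records", []):
            try:
                # Reject bodies that are not valid JSON before calling Step Functions.
                payload = json.loads(record["body"])
                sfn_client.start_execution(
                    stateMachineArn=state_machine_arn,
                    input=json.dumps(payload),
                )
            except Exception:
                # Report only this message as failed; SQS redelivers it and
                # moves it to the DLQ once maxReceiveCount is exceeded.
                failures.append({"itemIdentifier": record["messageId"]})
        return {"batchItemFailures": failures}

    return handler


# Wiring inside a deployed Lambda (assumes boto3 is present in the runtime and
# STATE_MACHINE_ARN is an environment variable we would define ourselves):
#   import os, boto3
#   handler = make_handler(boto3.client("stepfunctions"), os.environ["STATE_MACHINE_ARN"])
```

On its own this handler only forwards messages. The rate at which executions start is still governed by how many consumer instances run in parallel and by the batch size, which is why a plain queue on top does not give us rate-limiting by itself.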

This would only address one component, as the API is not the only trigger of the step-functions. It doesn't resolve the root cause, nor does it give us any flexibility or room for custom rate-limiting. This would probably be the easiest solution, but it would increase the potential workload considerably.

SQS QUEUE LAMBDA FULL
We want to have full control of the rate-limiting. The issue is that we run into throttling because of the maximum number of concurrent Lambda executions (1K per account).
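A rough back-of-the-envelope sketch shows how quickly the 1K limit is used up. It assumes that every parallel task of a step-function execution holds exactly one Lambda slot at a time, which is a simplification. The function name and the example figures (20 parallel tasks, 200 slots kept free for other Lambdas) are made up for illustration.

```python
def max_inflight_executions(account_limit, parallel_tasks, reserved_for_others=0):
    """Estimate how many step-function executions can run at once without
    exceeding the account's concurrent Lambda executions.

    Assumption: each parallel task occupies one Lambda slot at a time.
    """
    if parallel_tasks < 1:
        raise ValueError("parallel_tasks must be at least 1")
    available = account_limit - reserved_for_others
    # Integer division: a partially fitting execution would already throttle.
    return max(available // parallel_tasks, 0)


# Hypothetical numbers: 1000 account limit, 20 parallel tasks per execution,
# 200 slots left for everything else in the account.
print(max_inflight_executions(1000, 20, reserved_for_others=200))  # 40
```

Under these assumptions, anything above 40 executions in flight would start to throttle, so a number of this kind is the upper bound that any rate-limiting in front of the step-functions would have to enforce.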
