Simon Obetko, Dev @ Stacktape
February 21, 2024
In the ever-evolving landscape of cloud computing, deploying applications using container services like Amazon ECS (Elastic Container Service) has become a go-to strategy for many organizations seeking flexibility, scalability, and efficiency. However, even the most seasoned developers can encounter stumbling blocks when it comes to deployment issues, particularly with the often cryptic and unhelpful error messages that can arise.
Prime examples of such perplexity are the "Circuit Breaker" and "Deployment timed out" errors, messages that can leave many developers scratching their heads, wondering where things went wrong in their ECS deployment process.
This technical blog post aims to demystify these errors, offering insights and strategies to not only decipher these messages but also implement more informative and actionable error handling in your ECS deployment processes, exclusively using AWS tools and SDKs.
At Stacktape, our mission is to simplify application deployment on AWS, making it accessible and manageable for developers of all skill levels. We understand that wrestling with cryptic error messages, such as the dreaded "Circuit Breaker" or "Deployment timed out" errors, can be a major roadblock to this goal. It's not just about deploying faster; it's about deploying smarter, with clarity and confidence, and ensuring that every developer, regardless of their experience with ECS, can deploy their applications without getting lost.
Deploying containers on Amazon ECS offers a range of options, including infrastructure-as-code (IaC) solutions such as AWS CloudFormation or Terraform, the AWS Management Console and many others. No matter which option you choose, chances are you will encounter "Circuit Breaker" or similarly confusing errors.
Under the hood, Stacktape uses CloudFormation to deploy user environments into their AWS accounts. In the past, when a user got one of the dreaded, vague and confusing errors mentioned above, they would usually end up contacting our support team, where the conversation would go like this:
Client:
Hey, during deployment I am getting the following error: "Error occurred during operation 'ECS Deployment Circuit Breaker was triggered'." (HandlerErrorCode: GeneralServiceException). Can you help?
Support:
Yes. This can have multiple root causes. Here is the process to debug the root cause:
If the message is "Essential container in task exited", continue with step 7. If the message is "Cannot pull container error", continue with step 8.
Even an incomplete answer already makes my head spin, and I do consider myself a seasoned ECS user. Going through this process must be a gruesome experience for new AWS users, and figuring out what to do without any assistance must be close to impossible.
The more often this conversation arose, the more we realised that there must be a better way to do this. Hence we decided to find a solution that gives users the answer faster and without the need to contact support.
Error messages such as "Circuit Breaker" and "Deployment timed out" serve merely as indicators that a deployment has encountered significant issues. Importantly, they are symptoms rather than the root cause of the problem. These errors can signal that the deployment process has failed to complete successfully, whether due to:
And while a seasoned ECS developer might know where to look to uncover the actual root cause, for others, this might turn into a rabbit hole.
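To make "where to look" a bit more concrete: for a failed ECS deployment, the underlying cause usually sits on the stopped tasks rather than on the deployment error itself. The following is a minimal sketch, not Stacktape's actual code, of a helper that turns one task, shaped like an entry of the `tasks` list returned by the ECS `DescribeTasks` API, into readable lines. The function name and the sample data are hypothetical; the sketch assumes the response carries the `stoppedReason`, `containers`, `name`, `exitCode` and `reason` fields.

```python
from typing import List


def explain_stopped_task(task: dict) -> List[str]:
    """Collect human-readable failure details from one stopped ECS task."""
    details: List[str] = []

    # Task-level reason, e.g. "Essential container in task exited".
    stopped_reason = task.get("stoppedReason")
    if stopped_reason:
        details.append(f"Task stopped: {stopped_reason}")

    for container in task.get("containers", []):
        name = container.get("name", "<unnamed>")

        # A non-zero exit code means the container process itself failed.
        exit_code = container.get("exitCode")
        if exit_code not in (None, 0):
            details.append(f"Container '{name}' exited with code {exit_code}")

        # Container-level reason, e.g. an image pull failure message.
        reason = container.get("reason")
        if reason:
            details.append(f"Container '{name}': {reason}")

    return details


# Hypothetical sample data, for illustration only.
sample_task = {
    "stoppedReason": "Essential container in task exited",
    "containers": [{"name": "api", "exitCode": 1}],
}

for line in explain_stopped_task(sample_task):
    print(line)
```

In a real setup the task dictionaries would come from the AWS SDK, and the container logs would typically be the next place to look once the failing container is known.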
Thankfully, AWS provides a strong API and SDKs. This means that anything you can see in the AWS console, you can also get programmatically. We decided to implement an automated flow which would detect if something goes wrong during the ECS service deployment and inform the user with details straight away. The flow looks like this:
During a CloudFormation stack update/create, we detect a new ECS Service deployment by monitoring CloudFormation stack events. Once we observe a stack event indicating that an ECS Service is being created or updated, we start the flow.
Before we implemented our new flow, this was the error that a user would get:
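The detection step described above could be sketched roughly as follows. This is an illustrative assumption about how such a check might look, not the actual Stacktape implementation: it takes events shaped like the `StackEvents` entries returned by the CloudFormation `DescribeStackEvents` API and returns the logical IDs of ECS services whose create or update has started. The function name, the constants and the sample events are hypothetical, and polling the API is left out.

```python
from typing import Iterable, List

ECS_SERVICE_TYPE = "AWS::ECS::Service"
IN_PROGRESS_STATUSES = {"CREATE_IN_PROGRESS", "UPDATE_IN_PROGRESS"}


def ecs_services_being_deployed(stack_events: Iterable[dict]) -> List[str]:
    """Return logical IDs of ECS services that started a create or update."""
    logical_ids: List[str] = []
    for event in stack_events:
        if event.get("ResourceType") != ECS_SERVICE_TYPE:
            continue
        if event.get("ResourceStatus") not in IN_PROGRESS_STATUSES:
            continue
        logical_id = event.get("LogicalResourceId")
        # CloudFormation can emit several in-progress events per resource,
        # so each service is reported only once.
        if logical_id and logical_id not in logical_ids:
            logical_ids.append(logical_id)
    return logical_ids


# Hypothetical sample events, for illustration only.
sample_events = [
    {"LogicalResourceId": "ApiTaskDefinition",
     "ResourceType": "AWS::ECS::TaskDefinition",
     "ResourceStatus": "CREATE_COMPLETE"},
    {"LogicalResourceId": "ApiService",
     "ResourceType": "AWS::ECS::Service",
     "ResourceStatus": "UPDATE_IN_PROGRESS"},
    {"LogicalResourceId": "ApiService",
     "ResourceType": "AWS::ECS::Service",
     "ResourceStatus": "UPDATE_IN_PROGRESS"},
]

print(ecs_services_being_deployed(sample_events))
```

Once a service shows up in this list, a monitoring loop could start inspecting that service's deployment and tasks through the ECS API.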
After implementing the flow, users now receive invaluable details that can help them start addressing the failure reasons immediately. Moreover, we can inform users about the problem as soon as we detect it, eliminating the need to wait until the entire deployment fails after multiple attempts. This shortens the feedback loop, providing users with answers more quickly.
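One way to get such an early signal, sketched here under assumptions rather than taken from the Stacktape codebase, is to look at the primary deployment of the service as returned by the ECS `DescribeServices` API and react as soon as it reports failed tasks, instead of waiting for the rollout to be marked as failed. The helper name and sample data are hypothetical; the sketch assumes the deployment entries expose `status`, `rolloutState`, `rolloutStateReason` and `failedTasks`.

```python
from typing import Optional


def early_failure_signal(service: dict) -> Optional[str]:
    """Return a short message if the primary deployment shows signs of failure."""
    for deployment in service.get("deployments", []):
        # Only the deployment currently being rolled out is of interest.
        if deployment.get("status") != "PRIMARY":
            continue

        if deployment.get("rolloutState") == "FAILED":
            return deployment.get("rolloutStateReason") or "Deployment rollout failed"

        failed_tasks = deployment.get("failedTasks", 0)
        if failed_tasks > 0:
            return f"{failed_tasks} task(s) already failed in the current deployment"

    return None


# Hypothetical sample data, for illustration only.
sample_service = {
    "deployments": [
        {"status": "PRIMARY", "rolloutState": "IN_PROGRESS", "failedTasks": 2},
        {"status": "ACTIVE", "rolloutState": "COMPLETED", "failedTasks": 0},
    ]
}

print(early_failure_signal(sample_service))
```

A non-empty result would be the trigger to fetch the stopped tasks and show their reasons to the user while the deployment is still retrying.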
By going a bit deeper into AWS and ECS, we have found a way to provide users with useful information, enhancing their cloud development experience. Importantly, our solution is simple and exclusively uses libraries provided directly by AWS, avoiding any third-party dependencies, which makes it easy to maintain and build on.