Improving ECS deployment

Simon Obetko, Dev @ Stacktape

February 21, 2024

Enhanced ECS deployment monitoring and improved error messages

In the ever-evolving landscape of cloud computing, deploying applications using container services like Amazon ECS (Elastic Container Service) has become a go-to strategy for many organizations seeking flexibility, scalability, and efficiency. However, even the most seasoned developers can encounter stumbling blocks when it comes to deployment issues, particularly with the often cryptic and unhelpful error messages that can arise.

Prime examples are the "Circuit Breaker" and "Deployment timed out" errors, messages that can leave many developers scratching their heads, wondering where things went wrong in their ECS deployment process.

This technical blog post aims to demystify these errors, offering insights and strategies to not only decipher them but also implement more informative and actionable error handling in your ECS deployment processes, exclusively using AWS tools and SDKs.

Motivation

At Stacktape, our mission is to simplify application deployment on AWS, making it accessible and manageable for developers of all skill levels. We understand that wrestling with cryptic error messages, such as the dreaded "Circuit Breaker" or "Deployment timed out" errors, can be a major roadblock to this goal. It's not just about deploying faster; it's about deploying smarter, with clarity and confidence, and ensuring that every developer, regardless of their experience with ECS, can deploy their applications without getting lost.

I got a "Circuit Breaker" error, please help

Deploying containers on Amazon ECS offers a range of options, including IaC solutions such as AWS CloudFormation or Terraform, the AWS Management Console, and many others. No matter which option you choose, chances are you will encounter "Circuit Breaker" or similarly confusing errors.

Under the hood, Stacktape uses CloudFormation to deploy user environments into their AWS accounts. In the past, when a user got one of the dreaded, vague, and confusing errors mentioned above, they would usually end up contacting our support team, where the conversation would go like this:

Client:

Hey, while deploying I am getting the following error: "Error occurred during operation 'ECS Deployment Circuit Breaker was triggered'." (HandlerErrorCode: GeneralServiceException). Can you help?

Support:

Yes. This error can have multiple root causes. Here is the process to debug it:

  1. Go to your AWS ECS console and find your cluster "XXXXXX". In the cluster, go to your service "XXXXXX".
  2. If you are unable to find it, your stack has probably deleted the cluster and the service during rollback. Re-deploy and continue with step 3.
  3. Once you locate your service, go to the "Deployments" tab. Check the events for a possible root cause of the problem. If you do not find one, continue with step 4.
  4. On your service page, go to the "Tasks" tab, then filter tasks with "Stopped" status.
  5. Click on one of the recent tasks (from the recent deployment) to see the task detail.
  6. At the top of the task detail page, there should be an informative message explaining why the task stopped. If the message is Essential container in task exited, continue with step 7. If the message is Cannot pull container error, continue with step 8.
  7. Check the "Logs" tab of your task to see the logs of your containers. If there are no logs...
  8. ...

Even this incomplete answer makes my head spin, and I consider myself a seasoned ECS user. Going through this process must be a gruelling experience for new AWS users, and figuring out what to do without any assistance must be nearly impossible.

The more often this conversation arose, the more we realised that there must be a better way. Hence we decided to find a solution that gives users the answer faster, without the need to contact support.

Deciphering errors

Error messages such as "Circuit Breaker" and "Deployment timed out" serve merely as indicators that a deployment has encountered significant issues. Importantly, they are symptoms rather than the root cause of the problem. These errors signal that the deployment process has failed to complete successfully, whether due to:

  • tasks not passing health checks,
  • resource allocation issues,
  • startup process failing,
  • ... many other underlying problems.

And while a seasoned ECS developer might know where to look to uncover the actual root cause, for others, this might turn into a rabbit hole.
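To make the symptom-versus-root-cause distinction concrete, here is a minimal sketch of how a stopped task could be mapped to a likely cause. It assumes a task dictionary shaped like an entry in the ECS `DescribeTasks` response (`stoppedReason`, `containers[].name`, `containers[].exitCode`). The `Diagnosis` type, the `RULES` table, the category names, and the hints are all hypothetical and are not Stacktape's actual rules. The two pull and exit messages come from the support conversation above; the health-check rule is an assumption about wording and should be checked against real task output.

```python
from dataclasses import dataclass, field
from typing import Any, Dict

# (normalized substring of stoppedReason, category, hint). Illustrative only.
RULES = (
    ("cannotpullcontainer", "image-pull",
     "Check the image name, tag, and registry permissions."),
    ("healthcheck", "health-check",
     "Check the health check path, port, and timing."),
    ("essentialcontainerintaskexited", "container-exit",
     "Check the container logs and exit codes."),
)


@dataclass
class Diagnosis:
    category: str
    hint: str
    exit_codes: Dict[str, int] = field(default_factory=dict)


def diagnose_stopped_task(task: Dict[str, Any]) -> Diagnosis:
    """Map a stopped task to a likely root-cause category."""
    reason = task.get("stoppedReason", "")
    # Ignore case and spaces so "Cannot pull container error" and
    # "CannotPullContainerError" match the same rule.
    normalized = reason.lower().replace(" ", "")
    category, hint = "unknown", "Inspect the service events and task details."
    for needle, rule_category, rule_hint in RULES:
        if needle in normalized:
            category, hint = rule_category, rule_hint
            break
    # Collect exit codes only for containers that actually reported one.
    exit_codes = {
        container["name"]: container["exitCode"]
        for container in task.get("containers", [])
        if "exitCode" in container
    }
    return Diagnosis(category, hint, exit_codes)
```

A table of rules keeps the mapping easy to extend as new failure messages show up in support conversations.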

Implementation

Thankfully, AWS provides a strong API and SDKs. This means that anything you can see in the AWS console, you can also get programmatically. We decided to implement an automated flow that detects when something goes wrong during the ECS service deployment and informs the user with details straight away. The flow looks like this:
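As an illustration of "anything in the console is also available programmatically", the sketch below reproduces the console steps of opening the "Tasks" tab, filtering by "Stopped", and opening the task detail. It is written in Python for brevity and is not Stacktape's code. The `ecs` argument is assumed to be a boto3 ECS client (`boto3.client("ecs")`), whose `list_tasks` and `describe_tasks` calls are used here; `fetch_stopped_tasks` and `StubEcs` are hypothetical names, and the stub exists only so the example runs without AWS credentials. Pagination is ignored, so only the first page of stopped tasks is inspected.

```python
from typing import Any, Dict, List


def fetch_stopped_tasks(ecs: Any, cluster: str, service: str) -> List[Dict[str, Any]]:
    """Return details of the service's stopped tasks (first page only)."""
    listed = ecs.list_tasks(
        cluster=cluster, serviceName=service, desiredStatus="STOPPED"
    )
    task_arns = listed.get("taskArns", [])
    if not task_arns:
        # describe_tasks rejects an empty task list, so stop early.
        return []
    described = ecs.describe_tasks(cluster=cluster, tasks=task_arns)
    return described.get("tasks", [])


class StubEcs:
    """Hypothetical stand-in for boto3.client("ecs"), returning canned data."""

    def list_tasks(self, cluster, serviceName, desiredStatus):
        return {"taskArns": ["arn:aws:ecs:eu-west-1:123456789012:task/demo/abc"]}

    def describe_tasks(self, cluster, tasks):
        return {
            "tasks": [
                {"taskArn": arn, "stoppedReason": "Essential container in task exited"}
                for arn in tasks
            ]
        }


stopped = fetch_stopped_tasks(StubEcs(), cluster="demo", service="api")
print(stopped[0]["stoppedReason"])
```

Passing the client in as a parameter keeps the function easy to test with a stub like the one above.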

Flow

During a CloudFormation stack create or update, we detect a new ECS Service deployment by monitoring CloudFormation stack events. Once we observe a stack event indicating that an ECS Service is being created or updated, we start the flow.
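The detection step could look roughly like the sketch below. It assumes events shaped like the `StackEvents` entries returned by CloudFormation's `DescribeStackEvents` (`EventId`, `ResourceType`, `ResourceStatus`, `LogicalResourceId`) and treats `CREATE_IN_PROGRESS` and `UPDATE_IN_PROGRESS` on an `AWS::ECS::Service` resource as the trigger. The function name and the `seen_event_ids` bookkeeping are hypothetical; the post does not say how Stacktape tracks events it has already handled.

```python
from typing import Any, Dict, Iterable, List, Set

# Statuses taken to mean "an ECS service deployment has started".
DEPLOYMENT_STARTED = {"CREATE_IN_PROGRESS", "UPDATE_IN_PROGRESS"}


def new_ecs_service_deployments(
    events: Iterable[Dict[str, Any]], seen_event_ids: Set[str]
) -> List[str]:
    """Return logical IDs of ECS services whose deployment started in unseen events.

    `seen_event_ids` is updated in place, so polling the same events
    again does not start the monitoring flow twice.
    """
    started: List[str] = []
    for event in events:
        event_id = event["EventId"]
        if event_id in seen_event_ids:
            continue
        seen_event_ids.add(event_id)
        is_service = event.get("ResourceType") == "AWS::ECS::Service"
        if is_service and event.get("ResourceStatus") in DEPLOYMENT_STARTED:
            logical_id = event["LogicalResourceId"]
            # A resource can report the same in-progress status more than once.
            if logical_id not in started:
                started.append(logical_id)
    return started
```

In a real poller, the function would be called on each batch of stack events, and every returned logical ID would start monitoring of the corresponding service.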

Results

Before we implemented our new flow, this was the error that a user would get:

Before implementation error

After implementing the flow, users now receive invaluable details that can help them start addressing the failure reasons immediately. Moreover, we can inform users about the problem as soon as we detect it, eliminating the need to wait until the entire deployment fails after multiple attempts. This shortens the feedback loop, providing users with answers more quickly.

After implementation error
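One possible way to shorten the feedback loop is to look at the deployments of a service while the stack operation is still running, instead of waiting for the final failure. The sketch below assumes a service dictionary shaped like an entry in the ECS `DescribeServices` response, whose `deployments` carry `id`, `failedTasks`, `rolloutState`, and `rolloutStateReason`. The function name and the `max_failed_tasks` threshold are hypothetical; the post does not state which signal Stacktape actually uses.

```python
from typing import Any, Dict, List


def failing_deployments(
    service: Dict[str, Any], max_failed_tasks: int = 0
) -> List[Dict[str, Any]]:
    """Return deployments that have failed or already have too many failed tasks."""
    failing: List[Dict[str, Any]] = []
    for deployment in service.get("deployments", []):
        failed = deployment.get("failedTasks", 0)
        rolled_out_badly = deployment.get("rolloutState") == "FAILED"
        if rolled_out_badly or failed > max_failed_tasks:
            failing.append(
                {
                    "id": deployment.get("id"),
                    "failedTasks": failed,
                    "reason": deployment.get("rolloutStateReason", ""),
                }
            )
    return failing
```

With the default threshold of zero, the first failed task is enough to report a problem, which is what allows a message to reach the user before the deployment gives up.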

Conclusion

By going a bit deeper into AWS and ECS, we have found a way to provide users with useful information, enhancing their cloud development experience. Importantly, our solution is simple and exclusively uses libraries provided directly by AWS, avoiding any third-party dependencies, which makes it easy to maintain and build on.
