This is the first in our “living in the clouds” series of posts. Over the past months we have moved from a traditional infrastructure, hosted on our own physical servers, to a cloud-based infrastructure. This is our story.
In this post I’ll talk about what has been the most challenging (and possibly the most rewarding) part: running our development environments, the supporting development infrastructure and the production environment for our own software in the cloud. As we worked through the problems, we came across some great tools and techniques for managing our environments.
Recreating our entire infrastructure – on demand (and every night)
Making the decision to decommission our servers and go to the cloud gave us the opportunity to re-architect our infrastructure from scratch. We wanted to ensure that Software Configuration Management (SCM) was central to our solution. How could we be sure we had achieved this? By deleting and recreating our entire infrastructure from version control every day.
So every night we delete all our Amazon EC2 instances. Our servers (currently 11 EC2 instances) are then created from the ground up. That is, starting from a base Amazon AMI, we bootstrap each server using Puppet (a fantastic data centre automation tool). Puppet then takes over and sets up the servers based on the version-controlled “recipes” (manifests, in Puppet’s terminology), which define the desired state of each piece of infrastructure. The picture below shows the basic steps in this process:
|Nightly Provisioning Flow|
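The nightly cycle can also be sketched in a few lines of Python. This is an illustrative sketch rather than the provisioning application itself: `ServerSpec`, `Cloud`, `FakeCloud` and `rebuild` are hypothetical names, and the in-memory `FakeCloud` stands in for real calls to the Amazon API and for the Puppet run, so the example can be executed without credentials.

```python
"""Sketch of a nightly teardown-and-rebuild loop (all names are hypothetical)."""
from dataclasses import dataclass
from typing import Dict, List, Protocol


@dataclass(frozen=True)
class ServerSpec:
    """Desired state of one server, as it would be read from version control."""
    name: str
    role: str        # which configuration the server should converge to
    base_image: str  # identifier of the base machine image to boot from


class Cloud(Protocol):
    """The minimal operations the rebuild needs from a provider API."""
    def list_instances(self) -> List[str]: ...
    def terminate(self, instance_id: str) -> None: ...
    def launch(self, spec: ServerSpec) -> str: ...
    def bootstrap(self, instance_id: str, role: str) -> None: ...


def rebuild(cloud: Cloud, specs: List[ServerSpec]) -> Dict[str, str]:
    """Delete every instance, then recreate each server from its spec.

    Returns a mapping of server name to the new instance id.
    """
    # Step 1: tear everything down so no hand-made change can survive.
    for instance_id in cloud.list_instances():
        cloud.terminate(instance_id)

    # Step 2: boot each server from its base image, then hand over to
    # configuration management to bring it to the desired state.
    created: Dict[str, str] = {}
    for spec in specs:
        instance_id = cloud.launch(spec)
        cloud.bootstrap(instance_id, spec.role)
        created[spec.name] = instance_id
    return created


class FakeCloud:
    """In-memory stand-in for a real provider, for illustration only."""
    def __init__(self) -> None:
        # instance id -> role ("" until the instance has been bootstrapped)
        self.instances: Dict[str, str] = {}
        self._counter = 0

    def list_instances(self) -> List[str]:
        return list(self.instances)

    def terminate(self, instance_id: str) -> None:
        del self.instances[instance_id]

    def launch(self, spec: ServerSpec) -> str:
        self._counter += 1
        instance_id = f"i-{self._counter:04d}"
        self.instances[instance_id] = ""
        return instance_id

    def bootstrap(self, instance_id: str, role: str) -> None:
        self.instances[instance_id] = role
```

Calling `rebuild(FakeCloud(), specs)` twice with the same list of specs leaves the same set of roles running each time, whatever was changed by hand in between. That repeatability is the property the nightly rebuild is meant to guarantee.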
Why bother? Here are 5 good reasons…
Getting to this stage has involved a fair amount of effort. We have had to learn new tools, develop new processes and techniques, and write an application that provisions EC2 instances via the Amazon API. Here is why we think it is well worth the hassle:
- No more debugging configuration issues – ever. “Manual” configuration is, in our experience, the cause of a huge amount of application downtime and debugging. Making lasting changes outside of the version-controlled process has become an impossibility, because anything done by hand is wiped out by the next nightly rebuild. We can, however, still test and roll out changes in a matter of minutes in a controlled and reversible manner.
- A side benefit, but a good one – we don’t pay for our EC2 instances overnight when we don’t need them.
- The marginal cost of expansion is never a new server – unlike when we managed our own physical servers. We never hit the situation where, running at near capacity, increasing capacity by a couple of percent means going out shopping for a new server.
- Lots of practice means less excitement – creating new environments is a process that is, in effect, practised every day. No more spending a week hand-crafting a new environment based on what you think production is like – probably. We can have a new environment built from the ground up in under 10 minutes.
- Change of focus – we no longer think in terms of servers or hardware, but in terms of services and components. This is a big shift in mindset that leads to flexible and innovative solutions to problems.