Introduction
The rapid adoption of virtualization over the
last decade solved many of the problems associated with application and server provisioning. Gone are the days when deploying a new application
meant specifying a new server to run it on, getting sign-off for the purchase order, waiting a few weeks for the server to be delivered, installing the server, installing the application, then releasing it to Production for use, not to mention providing Dev, Test, and DR copies of the new production server. Given all that, it typically took months from go-ahead to actual use of a new server and the application running on it.
Virtualization changed everything. By eliminating the need to provide dedicated servers for each application and service, IT departments could host many applications on the same physical hardware, while also preventing configuration conflicts. The long
lead times for provisioning a new server to host an application were also eliminated: in this brave new world, virtual servers could be provisioned in minutes. Suddenly infrastructure was dynamic and could be changed quickly as required.
Challenges
Many IT departments soon found that this dynamic infrastructure that allowed easy creation of virtual servers led to a whole new set of management issues, such as:
- Server sprawl – Virtualization, and today public cloud services, make it easy to provision new servers from pools of resources. This can quickly lead to a situation in which the number of servers grows faster than the IT department's ability
to manage and update them.
- Configuration drift – Even if all virtual servers are configured from a standard set of templates so they start out with identical configurations, they can change over time. As administrators make changes to one server that aren't made to another, or a new service is installed on some servers but not others, the server configurations can drift.
- Snowflake servers – Most IT administrators have encountered a server that runs a mission-critical piece of software that will only run on one specific configuration of operating system and application server. These servers can't be upgraded, so over time they get further and further removed from the standard configurations and take more and more IT resources to manage.
- Fragile infrastructure – The only thing worse than having a snowflake server is having lots of them. When that is the case, the whole infrastructure can be said to be fragile, with many different servers that require special configurations and management. Even in a virtualized environment where snapshots and server clones can be used, this type of infrastructure is easily disrupted and hard to fix.
- Fear of automation – Even though templates and standard server builds can be used to provision new servers in a virtualized environment, most IT departments are reluctant to fully automate the management and updating of existing servers in case a single change damages a large number of virtual servers. As a result, the infrastructure is still managed largely in traditional ways using manual tools.
Virtualization should simplify the management of server infrastructure. However, in many deployments it hasn't delivered on this promise for the reasons outlined above. Many IT departments have a large investment in their current virtualized infrastructure, and taking a step back to start again and apply new management processes is difficult.
With new ways of working come new opportunities. At the present time we are starting to see adoption of public and private cloud infrastructure as a way to provision servers and provide IT services. This paradigm shift provides an opportunity to reboot
the management processes used for server infrastructure.
Infrastructure as Code
Over the last few years many software development and IT departments have adopted the Agile and DevOps approach to IT service provision. DevOps has four components that provide a unified development and operations framework for service delivery. These components are Culture, Automation, Measurement, and Sharing, or
CAMS for short. Infrastructure administrators have taken several development practices, such as the use of version control systems, automated testing, and deployment orchestration, and adapted them to help manage infrastructure. This approach is known as Infrastructure as Code.
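To make these borrowed practices a little more concrete, the following minimal sketch (in Python) shows one way automated testing of infrastructure definitions might look. It assumes hypothetical JSON definition files kept in a definitions directory under version control; the directory name and the required keys are illustrative assumptions, not the format of any particular tool.

```python
import json
import unittest
from pathlib import Path

# Hypothetical location of the text-based server definition files that the
# automation tooling reads; adjust to wherever your repository keeps them.
DEFINITIONS_DIR = Path("definitions")
REQUIRED_KEYS = {"role", "packages", "services"}  # assumed schema, for illustration


class DefinitionFileTests(unittest.TestCase):
    """Checks that could run in CI before any definition is deployed."""

    def test_definitions_parse_and_have_required_keys(self):
        for path in DEFINITIONS_DIR.glob("*.json"):
            with self.subTest(definition=path.name):
                data = json.loads(path.read_text())  # fails if the file is malformed
                self.assertTrue(REQUIRED_KEYS.issubset(data))


if __name__ == "__main__":
    unittest.main()
```

Tests like this treat infrastructure definitions the way application code is treated: a change is only merged and deployed once the automated checks pass.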
Infrastructure as Code targets the Automation component of CAMS. The goals of Infrastructure as Code are:
- Enable rapid infrastructure change so that it becomes a business enabler rather than being a constraint.
- Make changes to infrastructure a routine operation and not a major project that has to be planned well in advance.
- Reduce the drama and stress on IT staff who have to manage a dynamic infrastructure platform.
- Enable improvements to be deployed continuously rather than in periodic large rollouts.
- Enable IT staff to spend their time on meaningful, valuable work rather than the micro-management of servers.
- Enable business departments and users to provision their own resources as required without needing to raise requests with the IT department.
- Enable rapid recovery from failures rather than over-engineering the infrastructure to try to prevent all failures.
Enabling this is not a matter of picking a particular tool and deploying it. There are lots of tools that can help deliver an Infrastructure as Code management setup, such as Chef, Puppet, PowerShell scripting tools, and others. The tools in use are not the most important factor. The most important thing is getting to a position where infrastructure deployment and management can be fully automated. Everything that can be scripted should be scripted, so that each time a task is performed it is done in an agreed and consistent way. Any task that can't be scripted should be examined to see if it can be done in a different way using a tool that can be scripted. Standard text-based definition files should be created for all the server and application setups for the automation tools to read and use when deploying and updating services.
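As an illustration only, here is a minimal sketch of the idea: a text-based definition file that an automation script reads and applies. The definition keys and the Debian-style package and service commands are assumptions for the example, not the format or behavior of Chef, Puppet, or any other specific tool.

```python
import json
import subprocess

# Hypothetical definition for a "web" server role. In practice this would be a
# separate text file stored in version control alongside the automation scripts.
DEFINITION = json.loads("""
{
    "role": "web",
    "packages": ["nginx"],
    "services": ["nginx"]
}
""")


def apply_definition(definition):
    """Install the packages and enable the services a definition calls for.

    The commands assume a Debian-style Linux host; substitute the equivalents
    for your platform or configuration management tool.
    """
    for package in definition["packages"]:
        subprocess.run(["apt-get", "install", "-y", package], check=True)
    for service in definition["services"]:
        subprocess.run(["systemctl", "enable", "--now", service], check=True)


if __name__ == "__main__":
    apply_definition(DEFINITION)
```

Because the script does exactly what the definition file says and nothing more, every server built from that definition starts out the same, and changing the definition becomes the only supported way to change the servers.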
Adopting Infrastructure as Code practices provides many benefits:
- Systems can be easily reproduced – it becomes possible to quickly and reliably rebuild all parts of the infrastructure. Configuration settings for each component will already be in the definition files that are read by the automation tools when building servers and components. This ability to quickly rebuild any part of the infrastructure removes a barrier to rapid change. There is no fear or risk in change, as the recovery from any issues is rapid and complete.
- Systems are disposable – any component of the infrastructure can be removed and replaced if required. Again, this makes it easier to handle rapid change, as any system can be replicated and worked on, and existing systems disposed of rapidly and safely.
- Configurations are consistent – using standardized scripts and definition files ensures that servers and components are deployed in a consistent way. There can still be different definitions for different types of servers, but servers of the
same type can be consistently configured.
- Processes are repeatable – changes and updates to the infrastructure are standardized and repeatable. This ensures that issues such as configuration drift do not occur. If a change is needed on a server, the scripted process makes sure it is done in the agreed and supported way every time; there is no room for personally favored setups or differing configurations from individual system administrators (see the sketch after this list).
- Changing requirements are delivered – fully automated processes allow changes to be made centrally when new requirements or best practices need to be applied.
- Version control can be used – using text-based scripts and definition files allows them to be stored in a version control system and managed just like code. This ensures that infrastructure configuration is treated in the same way as software changes in a DevOps environment. Changes to scripts can be rolled back to previous versions if required, just as application code can be.
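The sketch below illustrates the repeatability point under the same illustrative assumptions as the earlier examples: a small script converges a configuration file to a desired state kept in version control, and running it repeatedly is safe because it changes nothing once the server already matches. The configuration path and keys are hypothetical.

```python
import json
from pathlib import Path

# Hypothetical desired state, kept in version control with the other definitions.
DESIRED = {"max_clients": 200, "log_level": "info"}
CONFIG_PATH = Path("/etc/example-app/config.json")  # hypothetical path


def apply_config(desired, path=CONFIG_PATH):
    """Converge the on-disk configuration to the desired state.

    Running this repeatedly is safe: nothing is written when the file already
    matches, so servers of the same type stay identical and any configuration
    drift is corrected the next time the script runs.
    """
    current = json.loads(path.read_text()) if path.exists() else {}
    if current == desired:
        return False  # already converged, nothing to do
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(desired, indent=2) + "\n")
    return True


if __name__ == "__main__":
    changed = apply_config(DESIRED)
    print("configuration updated" if changed else "no drift detected")
```

Rolling back is then just a matter of reverting the desired-state file in version control and letting the script run again.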
Conclusion
As mentioned previously, adoption of full automation may be a daunting task for IT departments with a mature virtual infrastructure. However, the move to use more and more cloud-based services, many in hybrid deployments, provides an opportunity to start applying Infrastructure as Code management methods to these new cloud-based servers. Over time, as more cloud-based servers are deployed and on-premises infrastructure is refreshed, a consistent and fully automated infrastructure management framework can be built. It may take a few years, and a complete server refresh cycle, to get to a fully automated system, but it is a goal worth aiming for.