The case for continuous delivery

Building functionality that really delivers the expected customer value

By now, many of us are aware of the wide adoption of continuous delivery within companies that treat software development as a strategic capability that provides competitive advantage. Amazon is on record as making changes to production every 11.6 seconds on average in May of 2011. Facebook releases to production twice a day. Many Google services see releases multiple times a week, and almost everything in Google is developed on mainline. Still, many managers and executives remain unconvinced as to the benefits, and would like to know more about the economic drivers behind CD.

First, let’s define continuous delivery. Martin Fowler provides a comprehensive definition on his website, but here’s my one-sentence version: Continuous delivery is a set of principles and practices to reduce the cost, time, and risk of delivering incremental changes to users.


5 Unsung Tools of DevOps

A Free Velocity Report

When I first started as a sysadmin many years ago, I quickly realized what a daunting task was before me. Like any good engineer, I took to finding the right tools to keep at hand to make light work out of the most difficult situations. This in itself was quite an endeavor, as over the years there has been a proliferation of tools and scripts. Many are of the artisanal, organic, hand-crafted variety, forged out of bash pipelines by our forefathers.

Much of that has changed now as the DevOps movement strengthens. With closer interaction between developers and operations, or even operations teams composed of developers, the tools have significantly improved. Treating infrastructure as code with automated configuration management and provisioning tools has freed many from the menial tasks of creating snowflake systems, and we’ve turned our attention to the more important matters of scaling and optimizing our systems.

In my Velocity report, 5 Unsung Tools of DevOps, I highlight a few of the tools that have gone unnoticed—or at least unrecognized—for some time. These are but a few of the tools that recognized needs early on and that successfully solve real-world problems that you’re likely to encounter today. Here is a brief synopsis:


Velocity: Toward the Real-Time Business

Velocity 2013 Speaker Series

I want to start by thanking John and Steve for the warm welcome. They’ve created something very amazing with Velocity, and I’m excited to be a part of it.

It might seem a bit odd to talk about What’s Next at the beginning of a conference, but I figure the best time to go to the bank and ask for a loan is when you actually have some money.

What we’ve been talking about at Velocity, especially the DevOps side of things, is only the tip of the iceberg when it comes to how businesses are changing. And that shift is from the sequential to the concurrent. It used to be that we threw things over a series of walls, from Product Management to Design, to Development, to QA, to Production, to Customer Service and so on. That was an old world of software and one-year development cycles.


Building for Failure Is a Recipe for Success

How you handle failure can mean the difference between "just another incident" and a revenue-stealing accident.

I was ready to get home. I’d been dozing throughout the flight from JFK to SFO, listening to the background chatter of Channel 9 as a lullaby. Somewhere over Sacramento, the rhythmic flow of controller-issued clearances and pilot confirmations was broken up by a call from our plane:

“NorCal Approach, United three-eighty-nine.”
“United three-eighty-nine, NorCal, go.”
“NorCal, United three-eighty-nine, we’d like to go ahead and…”

My headphones went silent, Channel 9 shut off.

I didn’t think too much of it as we continued our descent, flight attendants walking calmly through the cabin, getting us ready for landing. I had noticed our arrival path was one I was unfamiliar with, but nothing else seemed out of the ordinary… until we turned onto the final approach. In the turn, I noticed the unmistakable glint of firetrucks’ rotating red lights, lined up alongside the runway.


Starting Small with Great Expectations

Explicit expectations are key to operating at scale

Our lives are rife with expectations.

When we flip the light switch, we expect electrons to flow; when we issue CPU instructions, we expect to get the correct answer; when we look at commit logs in the source repository, we (hopefully) expect tests to accompany them and that our colleagues have run them, pre-checkin. But we’ve all probably been burned by these types of assumptions at some point.

In an operational environment, like the large-scale websites and build farms we’re responsible for, these sorts of expectations can be a costly cause of errors, and are one of the prime sources of miscommunication. Many a postmortem has uncovered that some expectation the ops team had of the development team was actually an assumption… and we all know that old saying about assumptions and donkeys.


Efficient, Effective Communication Still Often Elusive

In the operational environment, miscommunication can be costly; but there are some easy ways to improve it.

Editor’s note: This is part two in a four-part series on the “-ations” of aviation that can provide further insight into DevOps best practices and achieving them. Part one, on how standardization helps organizations scale and is actually a part of healthy DevOps culture, can be read here.

Communication is an enigmatic topic when it comes to engineering. Parts of our jobs—blueprints, chemical formulae, and source code—require extremely precise forms of communication (even if it doesn’t end up communicating to the steel, molecules, or silicon what we intended). But when it comes to email threads sifting through requirements, meetings about implementation styles and risk assessment, and software design documentation, we often fumble.

Let’s face it: there’s a reason the “engineer equals bad communicator” stereotype exists. But there are some simple things that can be done, both individually and technologically, to begin challenging that stereotype.

Dual Navigation Receivers Required

There are obviously many forms of communication. In an operational context, it’s useful to distinguish between static and active communication.


Process Is Not a Four-Letter Word

Standardization done right can save your sanity and improve your culture

Capital-P “Process” ™ is something many software developers, operations engineers, system administrators, and even managers love to hate.

It is often considered a productivity-killing, innovation-stifling beast whose only useful domain is within the walls of some huge, hulking enterprise or sitting in a wiki nobody ever reads.

I have always found distaste for process fascinating, and even more so now that configuration management and version control have become such core tenets of the DevOps movement. The main purpose of those tools is to provide structure for software development and operations to increase reproducibility, reliability, and standardization of those activities.


The Rise of Infrastructure as Data

Simplifying IT automation

IT infrastructure should be simpler to automate. A new method of describing IT configurations and policy as data formats can help us get there. To understand this conclusion, it helps to understand how the existing tool chains of automation software came to be.

In the early days of IT infrastructure, administrators seeking to avoid redundant typing wrote scripts to help them manage their growing computer hordes. The development of these in-house automation systems was not without cost; each organization built its own redundant tools. As scripting gurus left an organization, these scripts were often very difficult for new employees to maintain.

As we all know from the huge number of books written on the topic, doing software development right can require a large investment of time. Systems management software is especially complex, due to all the possible variables and corner cases to be managed. These in-house scripting systems often grew to be fragile.


Go Programming Language for System Administration

OSCON 2013 Speaker Series

Go is the first major systems language to emerge in over a decade, even though computing continues to change at a rapid pace: computers are smaller, faster, and can execute operations in parallel via multicore processors. Languages like Python and Ruby have grown in popularity in recent years among system administrators, operations, and DevOps personnel. Yet, as a relatively new kid on the block, Go is a versatile and robust language that has plenty to offer.

Let’s go through the list:

Open: It’s Open Source—Go has been open source software since November 2009, reaching Version 1 in March of 2012. It includes a language specification, standard libraries, and custom tools. Being open, Go has long-term stability.

Concurrency: Go provides support for concurrent execution and communication. There is no need to learn multiple ways of dealing with threads. Go greatly simplifies threading by providing goroutines and channels.

Fast compilation: Go compiles at a break-neck speed. It has robust dependency analysis and a rigid dependency specification to avoid wasting time with unused dependencies.

One binary to rule them all: Have you ever had to distribute your script or binary to multiple systems, then worry about libraries and dependencies in general? With Go, you simply don’t have to worry about dependencies. Gc, Go’s default compiler, statically links its binaries. You can use go build to compile your code and then distribute it to multiple machines with minimal effort.

Feature-rich standard library: The language has a great standard library and many third-party Go packages maintained at Bitbucket, GitHub, Launchpad, or Google Project Hosting.

Readability: Go ends the debate about the best style of programming by providing a code formatter tool (gofmt) and enforcing it in its standard library. The code you write today will be much easier to read and maintain in a few months or even years by simply sticking to gofmt.
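To make the concurrency point above a little more concrete, here is a minimal sketch of the kind of fan-out a sysadmin tool might do: run a check against several hosts at once, one goroutine per host, and collect the answers over a channel. The names result and checkAll, the host names, and the stand-in check function are all hypothetical and chosen only for illustration; a real tool would likely open a network connection or run a command where the placeholder check sits.

```go
package main

import (
	"fmt"
	"sort"
)

// result pairs a host name with the outcome of its check.
// (Hypothetical type, for illustration only.)
type result struct {
	host string
	ok   bool
}

// checkAll runs check against every host concurrently, one goroutine
// per host, and gathers the outcomes over a buffered channel.
func checkAll(hosts []string, check func(string) bool) []result {
	ch := make(chan result, len(hosts))
	for _, h := range hosts {
		// Pass the host as an argument so each goroutine gets its own copy.
		go func(host string) {
			ch <- result{host: host, ok: check(host)}
		}(h)
	}

	// Receive exactly one result per host; arrival order is not guaranteed.
	out := make([]result, 0, len(hosts))
	for range hosts {
		out = append(out, <-ch)
	}

	// Sort by host name so the report is stable from run to run.
	sort.Slice(out, func(i, j int) bool { return out[i].host < out[j].host })
	return out
}

func main() {
	hosts := []string{"web1", "web2", "db1"} // assumed host names
	// Stand-in check: a real tool might dial a TCP port here instead.
	check := func(host string) bool { return host != "db1" }

	for _, r := range checkAll(hosts, check) {
		fmt.Printf("%s ok=%t\n", r.host, r.ok)
	}
}
```

Compiled with go build, a program like this becomes a single binary that can be copied to other machines of the same platform, which ties in with the "one binary" point above.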

And Go is a language that grows with you. Take the tour.


Zero Downtime Application Updates with Ansible

OSCON 2013 Speaker Series

Automating the configuration management of your operating systems and the rollout of your applications is one of the most important things an administrator or developer can do to avoid surprises when updating services, scaling up, or recovering from failures. However, it’s often not enough. Some of the most common operations that happen in your datacenter (or cloud environment) involve large numbers of machines working together and humans to mediate those processes. While we have been able to remove a lot of human effort from configuration, there has been a lack of software able to handle these higher-level operations.

I used to work for a hosted web application company where the IT process for executing an application update involved locking six people in a room for sometimes 3-4 hours, each person pressing the right buttons at the right time. This process almost always had a glitch somewhere: someone forgot to run the right command, or something wasn’t well tested beforehand. While some technical solutions were applied to handle configuration automation, nothing that could perform configuration could also accomplish that high-level choreography on top. This is why I wrote Ansible.

Ansible is a configuration management, application deployment, and IT orchestration system. One of Ansible’s strong points is its very simple, human-readable language, which gives users fine, precise control over what happens on which machines and when.

Getting started

To get started, create an inventory file (for instance, ~/ansible_hosts) that defines which machines you are managing. Machines are frequently organized into groups. Ansible can also pull inventory from multiple cloud sources, but an inventory file is a quick way to get started.

Now that you have defined what machines you are managing, you have to define what you are going to do on the remote machines.

Ansible calls this description of processes a “playbook,” and you don’t have to have just one; you could have different playbooks for different kinds of tasks.

Let’s look at an example for describing a rolling update process. This example is somewhat involved because it’s using haproxy, but haproxy is freely available. Ansible also includes modules for dealing with Netscalers and F5 load balancers, so this is just an example — ordinarily you would start more simply and work up to an example like this: