Making Systems Operable

Velocity 2013 Speaker Series

There’s an old joke about the aviation cockpit of the future that it will contain just a pilot and a dog. The pilot will be there to watch the automation. The dog will be there to bite the pilot if he tries to touch anything.

Although they will all deny it, the majority of modern IT developers have exactly this view of automation: the system is designed to be self-regulating, and operators are there to watch it, not to operate it. The result is that current systems are often inoperable, i.e. they cannot be effectively operated because their functions and capacities are hidden or inaccessible.

The conceit in the pilot-and-the-dog joke is that modern systems do not require operation, that they are autonomous. Whenever these systems are exhibited, our attention is drawn to their autonomous features. But there are no systems that actually function without operators. Even when we claim they are “unmanned”, all important systems have operators who are intimately involved in their function: UAVs are piloted, the Mars rover is driven, the satellites are managed, surgical robots are manipulated, insulin pumps are programmed. We do not see these activities–many are performed by workers who remain anonymous–but we depend on them.

Operators have never had the status that hardware and software developers claim. They are, in sociological terms, blue-collar workers. Often their job history has little in common with that of the white-collar workers who design and develop the systems they operate. Their work is typically shift-driven and there are long periods where they seem to do little except monitor the system. They tend to be clannish and distant; they have their own lingo, their own hierarchy, and their own prejudices. Perhaps most importantly, they have knowledge and expertise that is not shared or appreciated outside of their community. In this they are like other front-line operator communities: police, fire, nurses, doctors, and deep-shaft miners all have the same qualities.

What distinguishes these communities is that their members all work at the sharp-end of practice. This means that their work puts them in direct contact with the minute-by-minute function of the system itself and their decisions and actions are both immediate and potentially crucial. They deal with what is happening directly and purposefully. Although others may talk about the future or how things are supposed to work, operators deal with the here-and-now and how things work in practice.

Especially for complex, multi-purpose systems, the gap between how things are supposed to work and how they actually work can be quite large. (Ask any police sergeant about the difference between policing in theory and policing in practice!) A primary function of operators is to bridge this gap in ways that result in better rather than worse outcomes. The capacity of systems to be operated is what allows operators to perform this valuable function, sometimes called technical work.

It is relatively easy to design systems that are inoperable. By imagining that the operator is the passive recipient of information about the system, the designer can easily manufacture a system without affordances. We see such systems all too often. Design for operability requires more than imagination. It requires the explicit acknowledgement that the designer does not and cannot know the situations and conditions that operators will confront. This, in turn, requires that designers provide information about the system’s capabilities and functions and allow operators direct access to those capabilities and functions.
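One way to picture this requirement is a system that publishes its capabilities and lets an operator invoke them directly. The sketch below is only an illustration under stated assumptions: the names `Capability`, `OperableSystem`, `WorkerPool`, and `resize_pool` are hypothetical and do not come from any particular product or library.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class Capability:
    """One function of the system that an operator may inspect and invoke."""
    description: str
    action: Callable[..., str]


@dataclass
class OperableSystem:
    """Hypothetical system that publishes its capabilities instead of hiding them."""
    capabilities: Dict[str, Capability] = field(default_factory=dict)

    def expose(self, name: str, description: str, action: Callable[..., str]) -> None:
        """Designer side: register a capability so operators can find it."""
        self.capabilities[name] = Capability(description, action)

    def describe(self) -> Dict[str, str]:
        """Operator side: information about what the system can do."""
        return {name: cap.description for name, cap in self.capabilities.items()}

    def invoke(self, name: str, **kwargs) -> str:
        """Operator side: direct access to a capability."""
        if name not in self.capabilities:
            raise KeyError(f"unknown capability: {name}")
        return self.capabilities[name].action(**kwargs)


class WorkerPool:
    """Hypothetical internal component whose size is normally set by automation."""

    def __init__(self, size: int) -> None:
        self.size = size

    def resize(self, size: int) -> str:
        old, self.size = self.size, size
        return f"pool resized from {old} to {size}"


pool = WorkerPool(size=4)
system = OperableSystem()
system.expose("resize_pool", "Change the number of workers", pool.resize)

print(system.describe())
print(system.invoke("resize_pool", size=8))
```

The design choice being illustrated is that the designer registers capabilities without deciding in advance when they will be needed; the operator decides that at the moment of use.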

This is a radical proposal. Experience with difficulties in operations (especially in maintenance) has led to “operator-protected” systems. These systems are deliberately designed to keep operators from influencing critical configuration, production, and response functions. The goal of many designers is a “lights off” environment that has no operators at all. Yet the repeated experience with failing and failed systems is that operators will be called on to do more than monitor the blinking lights. It is the responsibility of system designers and owners to give operators the tools and experience needed to act in ways that we cannot presently foresee.

How can we learn how to design such systems? The answer lies in the study of the operator communities themselves. Such research is underway in a number of communities outside of IT, notably in nursing. Understanding how operators bridge the gap between the world as imagined and the world as experienced is a good starting place. Understanding the messy details of technical work is one way to get inside the operators’ world. Discovering what makes that work hard is another.

But the first step is to overcome the pilot-and-a-dog model of future systems and recognize that operators are essential to the safe and successful functioning of all important systems–present and future.

This is one of a series of posts related to the upcoming Velocity conference in New York City (Oct 14-16). For a more in-depth look at this topic, be sure to attend Dr. Cook’s keynote address on Tuesday, October 15.


  • williamlouth

    There is nothing radical about

    “designers provide information about the system’s capabilities and functions and allow operators direct access to those capabilities and functions.”

    This has been pretty much the approach for a very long time, though admittedly the degree to which it has been achieved is pretty dismal.

    A more radical and viable approach is for machines to manage the machines through new mechanisms such as mirroring & simulation, adaptive signals & boundaries, and self regulation via adaptive control & QoS. How many applications and systems above the network have these capabilities?

    It is foolish to think that a human operator with greater observation is going to make much difference in managing the complexity we must deal with today. Here is an article on what this entails. Pretty much NONE.

    Now, for a truly radical and inspiring approach to this problem, take this article: an approach which nature has already adopted through our own evolution.

  • williamlouth

    I would also recommend that you distinguish between the function and the operation of the system. Most operators have a minuscule understanding of the function and its internal workings. We can’t give them access only to have them drop a spanner in the works and disrupt the service. What operators do need is the ability to influence the operational behavior in planning, scheduling and executing functions. Control still needs to reside with the machine because we simply cannot respond effectively at the same resolution and across such a vast space. Operators need a means to train, taint and influence the adaptive control mechanism embedded within the system along various lines: risk, performance, cost, and so on.