Web Perf/Ops Posts
Phil Dibowitz explains the challenges and the results they got with Chef
At OSCON, Phil Dibowitz reminded me how little I understand about large systems – as he puts it, really large systems, systems of systems, with some similarities but with different people controlling parts. His work at Facebook explores the challenges (and opportunities) of creating tools that work across a company’s many networks and computers.
If you deal with such challenges, he’s worth listening to as a model. If not, he’s worth listening to for a sense of just how different work at this scale can be, though much of what he accomplishes can be worthwhile at scales much smaller than the 17,000 servers he describes for Facebook at 26:42 in the session.
I talked with him in an interview:
and we’ve posted his OSCON session:
A Free Velocity Report
When I first started as a sysadmin many years ago, I quickly realized what a daunting task was before me. Like any good engineer, I took to finding the right tools to keep at hand to make light work out of the most difficult situations. This in itself was quite an endeavor, as over the years there has been a proliferation of tools and scripts. Many are of the artisanal, organic, hand-crafted variety, forged out of bash pipelines by our forefathers.
Much of that has changed now as the DevOps movement strengthens. With closer interaction between developers and operations, or even operations teams composed of developers, the tools have significantly improved. Treating infrastructure as code with automated configuration management and provisioning tools have freed many from the menial tasks of creating snowflake systems, and we’ve turned our attention to the more important matters of scaling and optimizing our systems.
In my Velocity report, 5 Unsung Tools of DevOps, I highlight a few of the tools that have gone unnoticed—or at least unrecognized—for some time. These are but a few of the tools that recognized needs early on and that successfully solve real-world problems that you’re likely to encounter today. Here is a brief synopsis:
Maybe it was the hustle and bustle of Times Square just within earshot. Maybe it was the smell of that legendary pizza on every corner. Or maybe it was having a front row seat in the financial capital of the world while the drama of a possible government default played itself out like a Broadway show. Whatever it was, Velocity New York felt markedly different than its west coast cousin.
But discussions on scaling the operations of the squishy side of the organization–developing healthy team interactions, talking through the realities of the ever-elusive DevOps culture, and what it all means to scaling not just web sites but full-fledged businesses–were more present than ever.
Hadoop, Sqoop, and ZooKeeper
Kathleen Ting (@kate_ting), Technical Account Manager at Cloudera, and our own Andy Oram (@praxagora) sat down to discuss how to work with structured and unstructured data as well as how to keep a system up and running that is crunching that data.
Key highlights include:
- Misconfigurations consist of almost half of the support issues that the team at Cloudera is seeing [Discussed at 0:22]
- ZooKeeper, the canary in the Hadoop coal mine [Discussed at 1:10]
- Leaky clients are often a problem ZooKeeper detects [Discussed at 2:10]
- Sqoop is a bulk data transfer tool [Discussed at 2:47]
- Sqoop helps to bring together structured and unstructured data [Discussed at 3:50]
- ZooKeep is not for storage, but coordination, reliability, availability [Discussed at 4:44]
You can view the full interview here:
Maintaining a desired behavior
In two previous posts (Part 1 and Part 2) we introduced the idea of feedback control. The basic idea is that we can keep a system (any system!) on track, by constantly monitoring its actual behavior, so that we can apply corrective actions to the system’s input, to “nudge” it back on target, if it ever begins to go astray.
This begs the question: Why should we, as programmers, software engineers, and system administrator care? What’s in it for us?
Velocity 2013 Speaker Series
I want to start by thanking John and Steve for the warm welcome. They’ve created something very amazing with Velocity, and I’m excited to be a part of it.
It might seem a bit odd to talk about What’s Next at the beginning of a conference, but I figure the best time to go to the bank and ask for a loan is when you actually have some money.
What we’ve been talking about at Velocity, especially the DevOps side of things, is only the tip of the iceberg when it comes to how businesses are changing. And that shift is from the sequential to the concurrent. It used to be that we threw things over a series of walls, from Product Management to Design, to Development, to QA, to Production, to Customer Service and so on. That was an old world of software and one-year development cycles.
How you handle failure can mean the difference between "just another incident" and a revenue-stealing accident.
I was ready to get home. I’d been dozing throughout the flight from JFK to SFO, listening to the background chatter of Channel 9 as a lullaby. Somewhere over Sacramento, the rhythmic flow of controller-issued clearances and pilot confirmations was broken up by a call from our plane:
“NorCal Approach, United three-eighty-nine.”
“United three-eighty-nine, NorCal, go.”
“NorCal, United three-eighty-nine, we’d like to go ahead and…”
My headphones went silent, Channel 9 shut off.
I didn’t think too much of it as we continued our descent, flight attendants walking calmly through the cabin, getting us ready for landing. I had noticed our arrival path was one I was unfamiliar with, but nothing else seemed out of the ordinary… until we turned onto the final approach. In the turn, I noticed the unmistakable glint of firetrucks’ rotating red lights, lined up alongside the runway.
Velocity 2013 Speaker Series: Focus on Web Apps, Not Web Pages
We’re not making web pages anymore; we’re building web applications. Gone are the days of a few script tags in the <
head>. Apps today are a complex web of asynchronously-loaded content and functionality. In the past decade, we’ve progressed from statically-loaded HTML to AJAX-ifying all the things. However, the way we’ve been measuring real user performance of our apps hasn’t changed to reflect our new state of art.
At what point during page load do users consider an app to be “ready enough” to start using? If we use standard performance metrics, we have to choose one of the following:
1) When the HTML document has been completely loaded and parsed, but before stylesheets, images, and subframes have finished loading (
2) When all synchronous scripts, stylesheets, images, and subframes have finished loading (
If we pick
DOMContentLoaded, it quickly becomes clear that there’s no inherent correlation between the app state at that point and what a user would consider “ready.”
Velocity 2013 Speaker Series: A Sneak Peek With WebPagetest and Appurify
Now that more companies have basic mobile strategies in place, they are turning their attention to the issue of performance.
Mobile developers are thinking about how fast their apps and mobile webpages load and—more importantly—what they can do to make them faster. Consumers have little patience for slow loading apps and their expectations are only going to get more stringent. This expectation likely contributed to Apple making changes so that apps on iOS 7 load 11% faster than on iOS 6.
The challenge is that measuring performance for mobile is not as easy as it is for web. Many of us have used tools like WebPagetest to assess website performance across different browsers/locations and pinpoint areas for improvement but fully functional, equivalent tools don’t exist yet for the mobile space.
This has left mobile developers ill equipped to create the highest-performing mobile apps and websites.
Velocity 2013 Speaker Series
In 2002, US Secretary of State Donald Rumsfeld told a reporter that not only don’t we know everything important, but sometimes we don’t even know what knowledge we lack:
There are known knowns; there are things we know we know. We also know there are known unknowns; that is to say we know there are some things we do not know. But there are also unknown unknowns – the ones we don’t know we don’t know.
One of the purposes of monitoring is to build early-warning systems to alert of problems before they become serious. But how can we recognize a failure in its early stages? It’s a thorny question.