Current tools make collection and visualization easier but don't reduce work
New tools are raining down on system administrators these days, attacking the “monitoring sucks” theme that was pervasive just a year ago. The new tools, both open source and commercial, may be more flexible and lightweight than earlier ones, as well as better suited to the kaleidoscopic churn of servers in the cloud, making it easier to log events and visualize them. But I look for more: a new level of data integration. What if the monitoring tools for different components could send messages to each other and take over from the administrator the job of tracing the causes of events?
Brace yourself: Address exhaustion is coming
IPv6 is the global warming of the computer industry: an impending disaster that most folks don’t seem to be taking as seriously as they should. We’re well into the exhaustion phase of the IPv4 address space, but most ISPs are still dragging their heels on delivering the newer protocol to end users.
What Do Tim O’Reilly, Lady Gaga, and Marissa Mayer All Have In Common?
Let’s examine the followers of some popular Twitter users by asking the (Freakonomics-inspired) question, What do Tim O’Reilly, Lady Gaga, and Marissa Mayer all have in common? Although it may initially seem like an obnoxious question to ask, some of the answers may intrigue you once you begin to take a closer look at the data. (Although dashingly good looks might be one thing that they all have in common, we’ll let the data do the talking and stick with Twitter followers as the basis of computing similarity.)
The initial idea behind this entire series on Twitter influence is that it would be an interesting and educational experiment in data science to put Tim O’Reilly’s ~1.7 million followers under the microscope and explore the correlation between popularity (based upon number of followers) and Twitter influence.
In order to draw some meaningful comparisons, however, we’ll need to consider at least one other account. Marissa Mayer seems like a fine selection for comparison since her Twitter account is similar to, yet different from, Tim’s. For example, she’s also a “tech celebrity” and business executive. However, her particular expertise is not quite the same, and she has only about one-fourth as many followers. (Or so it would initially appear…)
Just to make this interesting, let’s further mix things up a bit by introducing a wildcard. Lady Gaga seems as good a choice as any to introduce a bit of unexpected fun into the situation. She is one of the ten most popular Twitter users based upon number of followers, an accomplished entrepreneur, and surely draws interest from a broad cross-section of the population. The introduction of a third account also provides the opportunity to draw some additional comparisons, so let’s compute the Jaccard index for the various combinations of these three accounts and see what turns up. The Jaccard index measures similarity between sample sets, and is defined as the size of the intersection divided by the size of the union of the sample sets, or, more plainly, the amount of overlap between the sets divided by the total size of the combined set. This is a simple way to measure and compare the overlap in followers.
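As a quick illustration of the calculation (using made-up follower sets here, not the real Twitter data analyzed in the notebook), the Jaccard index takes just a few lines of Python:

```python
def jaccard_index(a, b):
    """Jaccard index of two sets: |intersection| / |union|."""
    a, b = set(a), set(b)
    if not (a | b):  # both sets empty -> define similarity as 0
        return 0.0
    return len(a & b) / len(a | b)

# Hypothetical follower sets -- illustrative only.
followers_x = {"alice", "bob", "carol", "dave"}
followers_y = {"carol", "dave", "erin"}

print(jaccard_index(followers_x, followers_y))  # 2 shared / 5 total = 0.4
```

A score of 0 means the two audiences are completely disjoint; a score of 1 means they are identical.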
The full results (example code, notes, and the results from executing each cell) are available as an IPython Notebook, and you are encouraged to review it in depth. For convenience, a summary of the key results that you’ll see computed in the notebook follows:
An Interview with Neal Ford
I recently interviewed O’Reilly author Neal Ford (Functional Thinking, The Productive Programmer) on the subject of polyglot programming. In 2006, Neal wrote a blog post which resurrected the term, suggesting that as modern applications become more complex, it is important for developers to leverage knowledge of multiple languages and use the right tool for the job. In the interview, we discuss the benefits and challenges of polyglot programming, how it has evolved in recent years, and the impact it’s had on software development.
Some key highlights in our conversation include:
- What is polyglot programming? [Discussed at 0:15]
- What are some of the benefits? [Discussed at 1:39]
- How polyglot programming has affected software development in recent years [Discussed at 4:25]
- Downsides to polyglot programming? What are the trade-offs? [Discussed at 6:22]
- Best practices when starting out in polyglot programming [Discussed at 8:58]
- Resources for keeping up on trends and new technologies [Discussed at 12:48]
The power of a technology now taken for granted
A friend wanted to show me a great new thing in 1993, this crazy HTML browser called Cello. He knew I was working on hypertext and this seemed like just the thing for it! Sadly, my time in HyperCard and an unfortunate encounter with the HyTime specifications meant that I bounced off of it, because markup couldn’t possibly work.
I was, of course, very very wrong.
Markup with some brilliantly minimal hypertext options was about to launch the World Wide Web. The toolset was approachable, easy to apply to many kinds of information, and laid the foundation for greater things to come.
Unlocking Scientific Data with Python
Most people working on complex software systems have had That Moment, when you throw up your hands and say “If only we could start from scratch!” Generally, it’s not possible. But every now and then, the chance comes along to build a really exciting project from the ground up.
In 2011, I had the chance to participate in just such a project: the acquisition, archiving, and database systems that power a brand-new hypervelocity dust accelerator at the University of Colorado.
Getting apps into the store is a non-deterministic process
One of the major topics of my Enterprise iOS book is how to plan release schedules around Apple’s peril-filled submission process. I don’t think you can count yourself a truly bloodied iOS dev until you’ve gotten your first rejection notice from iTunes Connect, especially under deadline pressure.
Traditionally, the major reasons that applications would bounce were that the developer had been a Bad Person: they had grossly abused the Human Interface standards, shipped a flaky app that crashed when the tester fired it up, or used undocumented internal system calls. In most cases, the rejection could have been anticipated if the developer had done their homework. There were occasional apps that got rejected for bizarre reasons, such as perceived adult content, or because of some secret Apple agenda, but they were the rare exception. If you followed the rules, your app would get into the store.
Phil Dibowitz explains the challenges and the results they got with Chef
At OSCON, Phil Dibowitz reminded me how little I understand about large systems: as he puts it, really large systems, systems of systems, with some similarities but with different people controlling the parts. His work at Facebook explores the challenges (and opportunities) of creating tools that work across a company’s many networks and computers.
If you deal with such challenges, he’s worth listening to as a model. If not, he’s worth listening to for a sense of just how different work at this scale can be, though much of what he accomplishes can be worthwhile at scales much smaller than the 17,000 servers he describes for Facebook at 26:42 in the session.
I talked with him in an interview:
and we’ve posted his OSCON session:
A conversation with Chris Anderson, Nick Pinkston, and Jie Qi
Manufacturing is hard, but it’s getting easier. At every stage of the manufacturing process (prototyping, small runs, large runs, marketing, fulfillment), cheap tools and service models have become available, dramatically decreasing the amount of capital required to start building something and the expense of revising and improving a product once it’s in production.
In this episode of the Radar podcast, we speak with Chris Anderson, CEO and co-founder of 3D Robotics; Nick Pinkston, a manufacturing expert who’s working to make building things easy for anyone; and Jie Qi, a student at the MIT Media Lab whose recent research has focused on the factories of Shenzhen.
Along the way we talk about the differences between Tesla’s auto plant and its previous incarnation as the NUMMI plant; the differences between on-shoring, re-shoring and near-shoring; and how the innovative energy of Kickstarter and the Maker movement can be brought to underprivileged populations.
Many of these topics will come up at Solid, O’Reilly’s new conference about the intersection of software and the physical world. Solid’s call for proposals is open through December 9. We’re planning a series of Solid meet-ups, plant tours, and books about the collision of real and virtual; if you’ve got an idea for something the series should explore, please reach out!
A Free Velocity Report
When I first started as a sysadmin many years ago, I quickly realized what a daunting task was before me. Like any good engineer, I took to finding the right tools to keep at hand to make light work of the most difficult situations. This in itself was quite an endeavor, as over the years there has been a proliferation of tools and scripts. Many are of the artisanal, organic, hand-crafted variety, forged out of bash pipelines by our forefathers.
Much of that has changed now as the DevOps movement strengthens. With closer interaction between developers and operations, or even operations teams composed of developers, the tools have significantly improved. Treating infrastructure as code, with automated configuration management and provisioning tools, has freed many of us from the menial work of hand-building snowflake systems, and we’ve turned our attention to the more important matters of scaling and optimizing our systems.
In my Velocity report, 5 Unsung Tools of DevOps, I highlight a few of the tools that have gone unnoticed, or at least unrecognized, for some time. These are but a few of the tools that met real needs early on and still solve problems you’re likely to encounter today. Here is a brief synopsis: