Programming Posts

Ohai, New Ohai Plugins!

Extending Chef

When you start to use the Chef configuration management system, you will quickly encounter a tool it ships with called Ohai which collects information about the underlying system to expose to the Chef Client as attributes during its run. These attributes allows you to easily incorporate system-specific behavior into your Chef cookbooks, for example allowing you to create a single “package” resource in your cookbook which automatically detects whether to install packages from Yum when running on a Redhat based Linux distribution or from Apt when running on a Debian based distribution.

All of Ohai’s attribute collection behavior is defined in a collection of plugins, which are shipped with Ohai out of the box. These plugins allow it to collect information including:

  • The name and version of the operating system being used
  • The hostname of the server
  • The installed version of various programming languages

The flexibility of Ohai however lies in the fact that it also allows us to define our own plugins to extend the rich collection of attributes we collect about our systems. For example at Etsy, we use an Ohai plugin to collect information on the datacenter a particular server is situated in to allow our Chef cookbooks to configure the correct DNS and LDAP server for it to use.

To our cookbooks, this data is visible as the node[:datacenter] attribute, while under the hood the Ohai plugin queries our asset management system to locate and store this information – it allows us to easily capture this information to use in our recipes, without having to worry about how to collect it once we had created the plugin.

These plugins are written using a subset of Ruby known as a Domain Specific Language, or DSL, in which Ohai defines the necessary syntax and behavior of its plugins. Since early 2011, Ohai has shipped with version 6 of its plugin DSL – here’s an example of a plugin written using v6:

As we see here, our plugin class we see here is a single monolithic Ruby file which contains no method definitions, which adds a new attribute Mash called “awesome” and creates two new keys called “sauce” and “level”. The v6 DSL we see in the above plugin is simple, works perfectly well, and has served Ohai well for over 3 years now – however it also has a number of weaknesses:
Read more…

Comment |

What’s New in Java 8: Lambdas

A hands-on introduction to Java 8's most exciting new feature

Java 8 is here, and, with it, come lambdas. Although long overdue, lambdas are a remarkable new feature that could make us rethink our programming styles and strategies. In particular, they offer exciting new possibilities for functional programming.

While lambdas are the most prominent addition to Java 8, there are many other new features, such as functional interfaces, virtual methods, class and method references, new time and date API, JavaScript support, and so on. In this post, I will focus mostly on lambdas and their associated features, as understanding this feature is a must for any Java programmer on-boarding to Java 8.

All of the code examples mentioned in this post can be found in this github repo.

What are lambdas?

Lambdas are succinctly expressed single method classes that represent behavior. They can either be assigned to a variable or passed around to other methods just like we pass data as arguments.

You’d think we’d need a new function type to represent this sort of expression. Instead, Java designers cleverly used existing interfaces with one single abstract method as the lambda’s type.

Before we go into detail, let’s look at a few examples.

Example Lambda Expressions

Here are a few examples of lambda expressions:

Have a look at them once again until you familiarize yourself with the syntax. It may seem a bit strange at first. We will discuss the syntax in the next section.

You might wonder what the type is for these expressions. The type of any lambda is a functional interface, which we discuss below.
Read more…

Comment |

Simplifying Django

Lightweight Django by example

The following comes to you from Julia Elman and Mark Lavin. Julia is a a hybrid designer/developer who has been working her brand of web skills since 2002; and Mark is the Development Director at Caktus Consulting Group in Carrboro, NC where he builds scalable web applications with Django. Together, they are working on Lightweight Django, a book due out later this year that explores bringing Django into modern web practices.

Despite Django’s popularity and maturity, some developers believe that it is an outdated web framework made primarily for “content-heavy” applications. Since the majority of modern web applications and services tend not to be rich in their content, this reputation leaves Django seeming like a less than optimal choice as a web framework.

Let’s take a moment to look at Django from the ground up and get a better idea of where the framework stands in today’s web development practices.

Plain and Simple Django

A web framework’s primary purpose is to help to generate the core architecture for an application and reuse it on other projects. Django was built on this foundation to rapidly create web applications. At its core, Django is primarily a Web Server Gateway Interface (WSGI) application framework that provides HTTP request utilities for extracting and returning meaningful HTTP responses. It handles various services with these utilities by generating things like URL routing, cookie handling, parsing form data and file uploads.

Also, when it comes to building those responses Django provides a dynamic template engine. Right out of the box, you are provided with a long list of filters and tags to create dynamic and extensible templates for a rich web application building experience.

By only using these specific pieces, you easily see how you can build a plain and simple micro-framework application inside a Django project.

We do know that there are some readers who may enjoy creating or adding their own utilities and libraries. We are not trying to take away from this experience, but show that using something like Django allows for fewer distractions. For example, instead of having to decide between Jinja2, Mako, Genshi, Cheetah, etc, you can simply use the existing template language while you focus on building out other parts. Fewer decisions up front make for a more enjoyable application building process.

Read more…

Comments: 12 |

Health IT is a growth area for programmers

New report covers areas of innovation and their difficulties

infofixO’Reilly recently released a report I wrote called The Information Technology Fix for Health: Barriers and Pathways to the Use of Information Technology for Better Health Care. Along with our book Hacking Healthcare, I hope this report helps programmers who are curious about Health IT see what they need to learn and what they in turn can contribute to the field.

Computers in health are a potentially lucrative domain, to be sure, given a health care system through which $2.8 trillion, or $8.915 per person, passes through each year in the US alone. Interest by venture capitalists ebbs and flows, but the impetus to creative technological hacking is strong, as shown by the large number of challenges run by governments, pharmaceutical companies, insurers, and others.

Some things you should consider doing include:

Join open source projects 

Numerous projects to collect and process health data are being conducted as free software; find one that raises your heartbeat and contribute. For instance, the most respected health care system in the country, VistA from the Department of Veterans Affairs, has new leadership in OSEHRA, which is trying to create a community of vendors and volunteers. You don’t need to understand the oddities of the MUMPS language on which VistA is based to contribute, although I believe some knowledge of the underlying database would be useful. But there are plenty of other projects too, such as the OpenMRS electronic record system and the projects that cooperate under the aegis of Open Health Tools

Read more…

Comment |

Building an Activity Feed System with Storm

One of many wonderfully functional recipes from the Clojure Cookbook

clj_cookbookEditor’s Note: The Clojure Cookbook is a recently published book by experienced Clojurists Luke VanderHart and Ryan Neufeld. It seeks to be a practical collection of tasks for intermediate Clojure programmers. In addition to providing their own recipes, Ryan and Luke accepted contributions from a number of people in the community. One of those contributors was Travis Vachon–in this excerpt from the Cookbook, Travis gives you a tried and true recipe for working with Clojure and Storm.


You want to build an activity stream processing system to filter and aggregate the raw event data generated by the users of your application.


Streams are a dominant metaphor for presenting information to users of the modern Internet. Used on sites like Facebook and Twitter and mobile apps like Instagram and Tinder, streams are an elegant tool for giving users a window into the deluge of information generated by the applications they use every day.

As a developer of these applications, you want tools to process the firehose of raw event data generated by user actions. They must offer powerful capabilities for filtering and aggregating data and must be arbitrarily scalable to serve ever-growing user bases. Ideally they should provide high-level abstractions that help you organize and grow the complexity of your stream-processing logic to accommodate new features and a complex world.

Clojure offers just such a tool in Storm, a distributed real-time computation system that aims to be for real-time computation what Hadoop is for batch computation. In this section, you’ll build a simple activity stream processing system that can be easily extended to solve real-world problems.

First, create a new Storm project using its Leiningen template:

In the project directory, run the default Storm topology (which the lein template has generated for you):

This generated example topology just babbles example messages incoherently, which probably isn’t what you want, so begin by modifying the “spout” to produce realistic events.

Read more…

Comment |

Facebook’s Hack, HHVM, and the Future of PHP

What is Hack and what does it mean for the future of PHP?

Photo: thebusybrain
Facebook recently released Hack, a new programming language that looks and acts like PHP. Underneath the hood, however, are a ton of features like static typing, generics, native collections, and many more features for which PHP developers have long been asking. Syntax aside, Hack is not PHP. Hack runs only on Facebook’s HipHop virtual machine (HHVM), a competitor to the traditional PHP Zend Engine.

Why did Facebook build Hack?

Much of Facebook’s internal code is first written with PHP. Facebook can onboard new developers quickly with PHP because the language is notoriously easy to learn and use. Granted, much of Facebook’s PHP code is likely converted to a C derivative before being pushed into production. The point is Facebook depends strongly on the PHP language to attract new talent and increase developer efficiency.

Strict Typing

Unfortunately, PHP may not perform as well as possible at Facebook’s scale. PHP is a loosely typed language and type-related errors may not be recognized until runtime. This means Facebook must write more tests early to enforce type checking, or spend more time refactoring runtime errors after launch. To solve this problem, Facebook added strict typing and runtime-enforcement of return types to Hack. Strict typing nullifies the need for a lot of type-related unit tests and encourages developers to catch type-related errors sooner in the development process.

Instantaneous Type Checking

To make the development process and error-catching process even easier, Facebook includes a type-checking server with its HHVM engine. This server runs locally and monitors Hack code as it is written. Developers’ code editors and IDEs can use this type-checking server to immediately report syntax or type-related errors during code development.
Read more…

Comments: 3 |

Formulating Elixir

Simon St. Laurent and Jose Valim explore a new functional programming language

I was delighted to sit down with Jose Valim, the creator of Elixir, earlier this month at Erlang Factory. He and Dave Thomas had just given a brave keynote exploring the barriers that keep people from taking advantage of Erlang’s many superpowers, challenging the audience with reminders that a programming environment must have reach as well as power to change the world.

Elixir itself is a bold effort to bring Erlang’s strengths to a broader group of developers, adding new strengths, notably metaprogramming, along the way.

Watching Elixir grow has been a unique experience for me. I’ve seen other languages (JavaScript and XSLT) emerge, but Elixir combines solid foundations on prior (Erlang) work with a remarkably open conversation about how to structure the language. Jose tries things and asks for feedback, takes suggestions well, and values questions about how best to make the language accessible. Even without a standards organization, the process has remained open, stable, and productive.

Whether you’re interested in Elixir itself or just in the challenges of creating a new combination in a world filled with past experiments, it’s well worth listening to Jose Valim.

  • We’ve had functional programming since 1959 – why the burst of interest now? [2:10]
  • Moving from Ruby to Erlang “making Rails thread-safe, that was my personal pain-point” [3:13]
  • “Every time I got to study more about the VM, the tooling and everything it provides, my mind gets blown.” [6:12]
  • Why Elixir started, and how it’s changed as Jose learned more. [10:08]
  • Integrating new Erlang features (R17 maps) into Elixir. [15:43]
  • When can you use Elixir in production? [18:07]

I’m looking forward to seeing a lot more Elixir, even as I need to catch up on updating Introducing Elixir. I’m not sure it will conquer the world immediately, but it will certainly leave its mark.

Comment |

The Case for Test-Driven Development

An interview with O'Reilly author Harry Percival

Harry Percival, author of Test-Driven Web Development with Python, discusses how he got into TDD, why you should too, and shares some tips. In the podcast above, listen to Harry talk candidly about the types of tests that make sense, what and what not to test, and at what point a program becomes complex enough to warrant testing. Below is a mostly matching text version of the same interview. Let us know your thoughts on TDD in the comments—your own war stories and what convinced you (or didn’t!).

Why write tests? How do you know it’s not a waste of time?
The theory is that it’s an investment—the time you spend writing tests will get paid back in time you don’t have to spend debugging. Also, the theory goes that tests should help you to write code that’s easier to work with, as well as code with less defects. Because having tests encourages you to refactor, and to think about design, your code should end up cleaner and better architected, and so it should be easier to work with, and your investment pays off because you’re more productive in future as well.

So that’s the theory. But the problem is that there’s delayed gratification—it’s hard to really believe this when the reward is so far off and the time required is now. So in practice, what was it that convinced me?

I first learned about testing from a book called “Dive Into Python”—it’s a popular book, maybe some of the people listening will have read it too? They may remember that Mark Pilgrim introduces testing, in fact he introduces TDD, in chapter 10. He uses the classic TDD example, which is a Roman Numeral calculator, and he shows how, by writing the tests before we even start writing the code, we can really get some help in how we implement our calculator. So he writes his tests, I should be 1 and II should be 2 and IV should be 4, and so on, and he shows how it helps us to build a really neat implementation of a Roman numeral calculator.

Read more…

Comment |

Just Enough Arel

Replacing hand-coded SQL with object oriented programming

If you were a web developer prior to ActiveRecord, you probably remember rolling your own SQL and being specific about which fields you retrieved, writing multiple queries to handle “upserts” (update or insert), and getting frustrated with how difficult it was to generate SQL dynamically.

Then Ruby on Rails came along with its ActiveRecord API, and all of that pain seemed to melt away into distant memory. Now we no longer have to worry about which fields to retrieve, “upserts” can be handled with a single method, and scopes allow us to be as dynamic as we want.

Unfortunately, ActiveRecord introduced new pains: the “WHERE” clause (i.e. the predicate) only allows connecting conditions with “AND”s, and the only comparisons allowed are “=” and “in”. To get around this, we often concede ActiveRecord’s shortcomings and revert back to raw SQL. But in Rails 3.0, we were introduced to another way.


The Arel relational algebra library is an abstract syntax tree (AST) manager and has the goals of “1) Simplif[ying] the generation of complex SQL queries; and 2) Adapt to various RDBMSes”. It’s the “engine” ActiveRecord uses to change method calls into actual SQL. To be clear, Arel only generates SQL, it doesn’t access the database and retrieve data.

With Arel, we are now capable of fully utilizing the power of SQL, without the mess of hand-coding it. ActiveRecord uses Arel to transform queries like this:

Into SQL like this:

Read more…

Comment |

Six Degrees of Kevin Bacon in six languages

20 years of efficiently computing Bacon numbers


The Oracle at Delphi spoke just one language, a cryptic one that priests “compiled” into ancient Greek. The Oracle of Bacon—the website that plays the Six Degrees of Kevin Bacon game for you—has, in its 20-year existence, been written in six languages. Read on for the history and the reasons why.

1995-1999: C

The original version of the Oracle of Bacon, written by Brett Tjaden in 1995, was all C. The current version, my stewardship of it, and my revision control history only go back to 1999, so that’s where I’ll start. In 1999, I rewrote the Oracle… still entirely in C. Expensive shortest-path and spell-check algorithms? Definitely in C. String processing to build the database? Also C! Presentation layer to parse CGI parameters and generate HTML? C here, too!

Why C? The rationale for the algorithmic component was straightforward: the Oracle of Bacon ran on a slow, shared Unix machine that other people were using to get actual work done. Minimizing CPU and memory resource requirements was the polite thing to do. I needed a compiled language that let me optimize time and space extensively. The loops all counted down, not up, because comparing against zero was fractionally faster on SPARC. It had to be C.

But why were the offline string processing and the CGIs in C? Mostly, I think, to reuse code from the other parts of the code base and from previous projects I’d written when C was the only language I knew.

2004-2005: Perl

As the site added features, I got tired of writing code to generate HTML in C. I wrote new CGIs, then rewrote existing CGIs, in Perl. Simply put, writing the CGIs in an interpreted language made me more productive. I had hash tables and vectors built into the language and CGI support a simple “use” statement away. I didn’t have to compile on one server and then deploy to another—I could edit the CGIs right there on the web server. Good deployment practices it wasn’t, but it made me more productive as a programmer, and the performance of the CGIs didn’t matter all that much.

Read more…

Comments: 3 |