"http" entries

How we got to the HTTP/2 and HPACK RFCs

A brief history of SPDY and HTTP/2.

Download a free copy of HTTP/2. This post is an excerpt by Ilya Grigorik from High Performance Browser Networking, the essential guide to networking and web performance for web developers.

SPDY was an experimental protocol, developed at Google and announced in mid-2009, whose primary goal was to try to reduce the load latency of web pages by addressing some of the well-known performance limitations of HTTP/1.1. Specifically, the outlined project goals were set as follows:

  • Target a 50% reduction in page load time (PLT).
  • Avoid the need for any changes to content by website authors.
  • Minimize deployment complexity, avoid changes in network infrastructure.
  • Develop this new protocol in partnership with the open-source community.
  • Gather real performance data to (in)validate the experimental protocol.

To achieve the 50% PLT improvement, SPDY aimed to make more efficient use of the underlying TCP connection by introducing a new binary framing layer to enable request and response multiplexing, prioritization, and header compression.


Not long after the initial announcement, Mike Belshe and Roberto Peon, both software engineers at Google, shared their first results, documentation, and source code for the experimental implementation of the new SPDY protocol:

So far we have only tested SPDY in lab conditions. The initial results are very encouraging: when we download the top 25 websites over simulated home network connections, we see a significant improvement in performance—pages loaded up to 55% faster.

— A 2x Faster Web, Chromium Blog

Fast-forward to 2012 and the new experimental protocol was supported in Chrome, Firefox, and Opera, and a rapidly growing number of sites, both large (e.g. Google, Twitter, Facebook) and small, were deploying SPDY within their infrastructure. In effect, SPDY was on track to become a de facto standard through growing industry adoption.

Read more…

What every Java developer needs to know about Java 9

Introducing updated HTTP client support and JSON API integration.

Java 8 may only have been released a few months ago, but Oracle has already announced the first set of features targeted for Java 9. On August 11th, Mark Reinhold, Chief Architect of the Java Platform Group at Oracle, made an initial feature set available to subscribers on the jdk9-dev mailing list.

This crop of features is being run under a relatively new process known as JDK Enhancement Proposals (JEPs). The JEP process allows new language and VM features to be prototyped and explored without the full weight of the normal Java standardization process, although the expectation is that suitable, successful JEPs will go on to formal standardization. There will, of course, be many other new features in Java 9, but in this post we will focus on two major enhancements and examine how they relate to features added in Java 7 and 8.

Read more…

Go node without code

Command line tools improve development workflow

At Fluent 2013, O’Reilly’s Web Platform, JavaScript and HTML5 conference, Adobe Community Manager Brian Rinaldi demonstrated how Node makes possible a new world of command-line utilities, giving JavaScript developers a toolkit they will want to integrate into their workflows.

In his talk, Go Node Without Code, Brian talked about the many ways Node and the npm ecosystem of JavaScript packages can help front-end developers, even if they don’t want to leap to server-side programming.

Tools Brian explored include:

  • UglifyJS, the “JavaScript parser / mangler / compressor / beautifier toolkit” [at 02:00]
  • Less, “for building CSS, but offers a lot of features you don’t get in straight CSS” [at 06:15]
  • Grunt, addressing the “just adding steps to my workflow” problem [at 09:09]
  • GruntIcon, “A mystical CSS icon solution” [at 18:18]
  • Automaton, a “task automation tool” [at 25:00]
  • HTTPster, a dead-simple “local static development server” [at 30:05]
  • The slightly more configurable Node-static [at 32:00]

You will need to be familiar with the command line for a lot of it – if you’re not, it’s a great time to learn!

If the Web Platform, JavaScript, and HTML5 interest you, consider checking out our growing collection of top-rated talks from Fluent 2013.

Web application development is different (and better)

On both front and back end, the Web challenges conventional wisdom

The Web became the most ubiquitous distributed application system because it didn’t have to think of itself as a programming environment. Almost every day I see comments or complaints from programmers (even brilliant programmers) muttering about how many strange and inferior parts they have to deal with, how they’d like to fix a historical accident by ripping out HTML completely and replacing it with Canvas, and how separation of concerns is an inconvenience. Everything should be JavaScript.

(Apologies to Tom Dale, who tweeted a perfect series of counterpoints just as I was writing. He has visions of rebuilding the rendering stack in JavaScript, but those tweets are not unusual opinions.)

The Web is different, and I can see why programmers might have little tolerance for the paths it chose, but this time the programmers are wrong. It’s not that the Web is perfect – it certainly has glitches. It’s not that success means something is better. Many terrible things have found broad audiences, and there are infinite levels to the Worse is Better conversations. And of course, the Web doesn’t solve every programming need. Many problems just don’t fit, and that’s fine.

So why is the Web better?

Read more…

Caching Strategies for Improved Web Performance

OSCON 2013 Speaker Series

Caching is the method that most improves response time in web applications (as Steve Souders shows in Cache is King), but in order to make use of it, every layer of your application must be configured for that purpose.

Most applications are initially developed with little or no use of caching and then must be refactored to fulfill performance goals. However, this approach incurs extra development costs that could be saved if response time is taken into consideration in the early stages of the development process.

The methodology that can save your life while you are still developing your application is pretty straightforward: keep caching in mind whenever you handle data in your system. Both web APIs and internal backend data flows need to ask one simple question:

Can I survive if the data seen by the user is not the latest?

Sometimes the answer to this question is ‘no.’ For example, I would be fired very quickly if I built a bank system that showed more money than a customer’s account really holds. On the other hand, if the system interacts with general data services like social networks, news, weather, car traffic, etc., there is less need to ensure the latest piece of information is immediately shown to the user.

Of course, the latest data needs to eventually reach the user. Data cannot be too old or you risk confusing the user, but configuring a short expiration time (let’s say 5-10 minutes or less) for dynamic data that can support it can significantly improve response times. That is called temporal consistency, and it is crucial to a successful caching strategy.
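
To make that concrete, here is a minimal sketch of declaring such an expiration window on a dynamic response. It uses plain WSGI so no framework is assumed; the endpoint and payload are hypothetical:

    # A minimal sketch: mark a dynamic response as reusable for five minutes.
    def app(environ, start_response):
        start_response("200 OK", [
            ("Content-Type", "application/json"),
            # Temporal consistency window: browsers and intermediary caches
            # may serve this response without revisiting us until it expires.
            ("Cache-Control", "public, max-age=300"),  # 5 minutes
        ])
        return [b'{"weather": "sunny", "city": "Barcelona"}']

    if __name__ == "__main__":
        from wsgiref.simple_server import make_server
        make_server("", 8000, app).serve_forever()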

Nowadays, web applications are built by mashing up several web services coming from different sources. The best way to tackle their differing response times and data designs is to temporally cache those elements across all system layers. The same applies to data coming from your own system when the information needs to travel from one part of the world to another in several hops. If information is not critical, consider caching it at any intermediate stage and reusing it when needed. Caching in the backend can avoid half of the trip. Better still, caching at the target device or on a CDN can eliminate the full data trip, or reduce it to only the last mile, as an easy way to enhance performance.
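
The same principle applies inside the backend. Below is a minimal sketch of a TTL cache decorator; the five-minute window and the fetch_weather function are hypothetical stand-ins, and a production system would more likely reach for memcached, Redis, or a CDN:

    import functools
    import time

    def ttl_cache(seconds):
        # A hypothetical temporal cache: serve the stored answer until it
        # is older than `seconds`, then fetch a fresh one.
        def decorator(fn):
            store = {}
            @functools.wraps(fn)
            def wrapper(*args):
                hit = store.get(args)
                if hit is not None and time.time() - hit[0] < seconds:
                    return hit[1]  # fresh enough: skip the expensive trip
                value = fn(*args)
                store[args] = (time.time(), value)
                return value
            return wrapper
        return decorator

    @ttl_cache(seconds=300)  # five minutes of temporal consistency
    def fetch_weather(city):
        # Stand-in for an expensive call to a remote weather service.
        return {"city": city, "conditions": "sunny"}
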
Read more…

The Power of a Private HTTP Archive Instance: Finding a Representative Performance Baseline

Velocity 2013 Speaker Series

Be honest: have you ever wanted to play Steve Souders for a day and pull some revealing stats or trends about web sites of your choice? Or maybe dig around in the HTTP Archive? You can do that and more by setting up your own private HTTP Archive instance.

httparchive.org is a fantastic tool for tracking, monitoring, and reviewing how the web is built. You can dig into trends around page size, page load time, content delivery network (CDN) usage, distribution of different mime types, and many other stats. With its WebPagetest integration, it’s great for synthetic testing as well.

You can download an HTTP Archive MySQL dump (warning: it’s quite large) and the source code from the download page and dissect a snapshot of the data yourself. Once you’ve set up the database, you can easily query anything you want.
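
As a taste of what those queries look like, here is a hedged sketch that pulls the mime type breakdown across all captured requests. The requests table and mimeType column follow the public HTTP Archive schema, but verify the names against your snapshot; any MySQL client will do, pymysql is used only for illustration:

    import pymysql

    conn = pymysql.connect(host="localhost", user="root",
                           password="secret", database="httparchive")
    try:
        with conn.cursor() as cur:
            # Which mime types dominate the captured requests?
            cur.execute("""
                SELECT mimeType, COUNT(*) AS hits
                FROM requests
                GROUP BY mimeType
                ORDER BY hits DESC
                LIMIT 10
            """)
            for mime, hits in cur.fetchall():
                print(mime, hits)
    finally:
        conn.close()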

Setup

You need MySQL, PHP, and your own webserver running. As I mentioned above, HTTP Archive relies on WebPagetest—if you choose to run your own private instance of WebPagetest, you won’t have to request an API key. I decided to ask Patrick Meenan for an API key with limited query access. That was sufficient for me at the time. If I ever wanted to use more than 200 page loads per day, I would probably want to set up a private instance of WebPagetest.

To find more details on how to set up an HTTP Archive instance yourself and any further advice, please check out my blog post.

Benefits

Going back to the scenario I described above: the real motivation is that often you don’t want to throw your website(s) into a pile of other websites (e.g. ones unrelated to your business) to compare or define trends. Our digital properties at the Canadian Broadcasting Corporation (CBC) span dozens of URLs that have different purposes and audiences. For example, CBC Radio covers most of the Canadian radio landscape, CBC News offers the latest breaking news, CBC Hockey Night in Canada offers great insights on anything related to hockey, and CBC Video is the home for any video available on CBC. It’s valuable for us not only to compare cbc.ca to the top 100K Alexa sites but also to verify stats and data against our own pool of web sites.

In this case, we want to use a set of predefined URLs that we can collect HTTP Archive stats for. Hence a private instance can come in handy—we can run tests every day, or every week, or just every month to gather information about the performance of the sites we’ve selected. From there, it’s easy to not only compare trends from httparchive.org to our own instance as a performance baseline, but also have a great amount of data in our local database to run queries against and to do proper performance monitoring and investigation.

Visualizing Data

The beautiful thing about having your own instance is that you can be your own master of data visualization: you can now create more charts in addition to the ones that came out of the box with the default HTTP Archive setup. And if you don’t like Google chart tools, you may even want to check out D3.js or Highcharts instead.

The image below shows all mime types used by CBC web properties that are captured in our HTTP archive database, using D3.js bubble charts for visualization.

Mime types distribution for CBC web properties using D3.js bubble visualization. The data were taken from the requests table of our private HTTP Archive database.

Read more…

Easily invoke common protocols with Twisted

Spin up Python-friendly services with 0 lines of code

Twisted is a framework for writing, testing, and deploying event-driven clients and servers in Python. In my previous Twisted blog post, we explored an architectural overview of Twisted and examples of simple TCP, UDP, SSL, and HTTP echo servers.

While Twisted makes it easy to build servers in just a few lines of Python, you can actually use Twisted to spin up servers with 0 lines of code!

We can accomplish this with twistd (pronounced twist-dee), a command line utility that ships with Twisted for deploying your Twisted applications. In addition to providing a standardized deployment interface for common production features like daemonization, logging, and authentication, twistd can use Twisted’s plugin architecture to run flexible servers for a variety of protocols. Here are some examples:

twistd web --port 8000 --path .

Run an HTTP server on port 8000, serving both static and dynamic content out of the current working directory. Visit http://localhost:8000 to see the directory listing.
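
For a sense of what that one command buys you, here is roughly the equivalent Twisted program for the static-file case, minus twistd’s daemonization and logging. This is a sketch under those assumptions, not how twistd itself is implemented:

    from twisted.internet import reactor
    from twisted.web.server import Site
    from twisted.web.static import File

    # Serve the current working directory over HTTP on port 8000.
    reactor.listenTCP(8000, Site(File(".")))
    reactor.run()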

Read more…

Four short links: 3 June 2013

Hacking HTTP Host Headers, Collaborative Coding, Glitch is the Overloaded Essence, and Crazy Culture

  1. Practical HTTP Host Header Attacks — lots of cleverness, like: “So, to persuade a cache to serve our poisoned response to someone else we need to create a disconnect between the host header the cache sees, and the host header the application sees. In the case of the popular caching solution Varnish, this can be achieved using duplicate Host headers. Varnish uses the first host header it sees to identify the request, but Apache concatenates all host headers present and Nginx uses the last host header.” (A sketch of the duplicate-Host trick follows this list.)
  2. Madeye — collaborative code editing inside a Google Hangout. (via Andy Baio)
  3. Too Momentous for the Medium — “Whatever you now find weird, ugly, uncomfortable and nasty about a new medium will surely become its signature. CD distortion, the jitteriness of digital video, the crap sound of 8-bit – all of these will be cherished and emulated as soon as they can be avoided. It’s the sound of failure: so much modern art is the sound of things going out of control, of a medium pushing to its limits and breaking apart. The distorted guitar sound is the sound of something too loud for the medium supposed to carry it. The blues singer with the cracked voice is the sound of an emotional cry too powerful for the throat that releases it. The excitement of grainy film, of bleached-out black and white, is the excitement of witnessing events too momentous for the medium assigned to record them.” (Brian Eno’s words)
  4. Where the Happy Talk about Corporate Culture is All Wrong (NY Times) — “I think there are two types of happiness in a work culture: Human Resources Happy and High Performance Happy. Fast-growth success has everything to do with the latter and nothing to do with the former.” Lazy false opposition, and he describes an asshole-rich workplace that would only please a proctologist. (via Sara Winge)
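
Below is a hypothetical sketch of the duplicate-Host request described in item 1, built by hand over a raw socket. The hostnames are placeholders, and you should only point this at servers you own:

    import socket

    request = (
        b"GET / HTTP/1.1\r\n"
        b"Host: cache-sees.example.com\r\n"  # Varnish keys on the first Host
        b"Host: app-sees.example.com\r\n"    # Nginx routes on the last Host
        b"Connection: close\r\n"
        b"\r\n"
    )

    sock = socket.create_connection(("cache-sees.example.com", 80))
    sock.sendall(request)
    print(sock.recv(4096).decode("latin-1"))
    sock.close()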

Exploring Hypermedia with Mike Amundsen

The Web Can Teach the Enterprise

The Web’s flexibility has helped it to survive and thrive, pushing well beyond the browser-based universe where it first showed its promise. While I’ve spent most of my time working with the HTML/CSS/JavaScript side, the HTTP side of the original HTML+HTTP combination that built the Web is perhaps even more flexible.

I enjoyed talking with Mike Amundsen, Principal API Architect at Layer 7 Technologies, who has spent much of his recent career encouraging enterprise customers to shift toward web architectures. While REST has emerged over the past decade to eclipse SOAP-based “web services”, Amundsen has eagerly promoted the next step beyond the simple CRUD-based model of early REST work: hypermedia.

Our conversation ranged from the history and foundations of REST through the many ways to integrate that work with existing enterprise practice to a glimpse at what the future might hold for frameworks, design, and architecture.

  • REST as enterprise architectures principles applied to hypermedia (at 1:57)
  • Transitioning from RPC-based models to hypermedia, by including additional information in responses. (at 3:00)
  • The value of opinionated message formats and eventual integration into opinionated frameworks. (at 5:51)
  • Shifting from shared understandings of object models to messages. (at 8:50)
  • “Enough coupling, but not too much” to allow mixing of technologies. (at 11:15)
  • Human negotiation, HTTP negotiation, and responsive web design (at 14:20)

Read more…

A Matter of Semantics

Structuring client-server communications with hypermedia messages

Messages on the Web carry three levels of information: Structure Semantics, Protocol Semantics, and Application Semantics. No matter the implementation style, all three are needed for any successful communication between client and server. This trio (S-P-A) forms the essentials of communication over distributed networks.

Most of the time, though, these levels are obscured or muddled at implementation time. For example, both Protocol Semantics (how we create valid network requests) and Application Semantics (domain terms like users, customers, orders, etc.) are often mixed together in conversation ("You POST new users to this URL"), and both are usually defined only in human-readable documentation and implemented in the source code of the client application itself. In other words, the protocol-level and application-level semantics are tightly coupled. An easy way to discover this is to see whether you can take the same message format and implement your API using a protocol other than HTTP (e.g. WebSockets or FTP). I illustrated this "protocol-agnostic" design pattern back in 2010 ("A RESTful Hypermedia API in Three Easy Steps").
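
As a sketch of the decoupled alternative, consider a message that carries its own protocol semantics as hypermedia controls. The format and field names below are hypothetical; the point is that the client follows whatever the message says instead of hard-coding "You POST new users to this URL":

    import json
    import requests  # any HTTP client would do

    # The method and target travel inside the message, not in client code.
    message = json.loads("""
    {
      "action": {"method": "POST", "href": "http://api.example.org/users"},
      "items": []
    }
    """)

    def execute(action, data):
        # Swapping the URL, or even the method, requires no client changes.
        return requests.request(action["method"], action["href"], json=data)

    execute(message["action"], {"name": "Ada", "email": "ada@example.org"})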

But there is a way to keep these separate from each other and view each of these aspects in their own light. In doing so, you’ll strengthen the quality and value of your message design while increasing flexibility and choices.

Structure Semantics

Structure Semantics provides the set of rules for creating a well-formed message. XML has rather simple structure semantics. JSON's rules for well-formedness are a bit vaguer, but tractable, since JSON.parse(…) turns out to be the ultimate arbiter of such things. Determining well-formedness of other, more complex media types (HTML, Atom, HAL, Collection+JSON) is tougher, but doable even when external validators are not available.
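
In Python, the json module plays the same arbiter role that JSON.parse plays in the browser; here is a quick sketch of a purely structural check on a hypothetical payload:

    import json

    # Structure semantics only: does the message parse at all? The trailing
    # comma breaks well-formedness regardless of what the fields mean.
    try:
        json.loads('{"users": ["ada", "alan"],}')
    except ValueError as err:
        print("not well-formed:", err)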

Building a successful web implementation that does not contain structure semantics is difficult—and that's a good thing.

Read more…