"monitoring" entries

Four short links: 7 October 2015

Time for Change, Face Recognition, Correct Monitoring, and Surveillance Infrastructure

  1. The Uncertain Future of Emotion Analytics: A year before the launch of the first mass-produced personal computer, British academic David Collingridge wrote in his book “The Social Control of Technology” that “when change is easy, the need for it cannot be foreseen; when the need for change is apparent, change has become expensive, difficult, and time consuming.”
  2. Automatic Face Recognition (Bruce Schneier) — Without meaningful regulation, we’re moving into a world where governments and corporations will be able to identify people both in real time and backwards in time, remotely and in secret, without consent or recourse.
  3. Really Monitoring Your Systems: If you are not measuring and showing the maximum value, then you are hiding something. The number one indicator you should never get rid of is the maximum value. That’s not noise — it’s the signal; the rest is noise.
  4. Haunted by Data (Maciej Ceglowski) — You can’t just set up an elaborate surveillance infrastructure and then decide to ignore it. These data pipelines take on an institutional life of their own, and it doesn’t help that people speak of the “data-driven organization” with the same religious fervor as a “Christ-centered life.”
Four short links: 14 July 2015

Future of Work, Metrics and Events, High-functioning Dev, and Concept Calendars

  1. What’s the Future of Work (Tim O’Reilly) — Tim’s been exploring how technology is changing what work is and how we build our society around it. New conference coming!
  2. Monitoring 101: Collecting Data — the world-view behind instrumenting modern software is just as interesting as the tools to make it possible.
  3. Building a High-Performance Team: It’s Not Just About Structure — move beyond copying Spotify’s structure and work on your company’s Habits, Values & Culture, and Leaders & Management.
  4. Google Calendar Concept Art: In the future … your content will be available directly within your calendar.

7 takeaways from Velocity Europe

Taking a look at the current issues affecting the Web operations and performance space.

Editor’s note: The European edition of our Velocity conference wrapped up a few weeks ago, and now that the jet lag has passed I’ve had a chance to reflect on the talks and excellent hallway conversations I had throughout. And while I thoroughly enjoyed all the sessions I introduced, one of the downsides to being a chair is that I can’t attend all the other sessions at the same time. As such, I always look around for excellent dissections of the conference from other people; this summary by Peter Arijs from CoScale closely reflects some of the themes I saw, including a few of the standout talks.

November in Barcelona was full of action for web and big data practitioners, with the Velocity and Strata-Hadoop conferences and side events such as WebPerfDays and Papis.io. As a startup in the web application monitoring and analytics space, it was the perfect time to get a pulse on the state of the art, and talk to some of our clients and prospects. Below is a summary of personal take-away points from selected Velocity sessions and personal interactions.

The Future of Monitoring Data is In the Cloud

Scale and complexity call for leaving it to specialists

As applications move from on-premise to SaaS, the scale of deployments increases by orders of magnitude (to “web-scale”). At the same time, application development and operation become tightly integrated, and continuous deployment shortens the interval between updates from months to days or even hours.

The larger scale makes the health of SaaS applications mission-critical and even existential to their providers, while the frequent updates increase the risk of failures. Therefore, monitoring and root cause analysis also become mission-critical functions, and more instrumentation is needed to ensure the application’s quality of service. At the company I co-founded, we see customers using extensive and often tailored instrumentation that generates massive amounts of data (think hundreds of thousands of data streams and billions of data points per day).

The new stage of system monitoring is better integrated

Current tools make collection and visualization easier but don't reduce work

New tools are raining down on system administrators these days, attacking the “monitoring sucks” theme that was pervasive just a year ago. The new tools, both open source and commercial, may be more flexible and lightweight than earlier ones, as well as better suited to the kaleidoscopic churn of servers in the cloud, making it easier to log events and visualize them. But I look for more: a new level of data integration. What if the monitoring tools for different components could send messages to each other and take over from the administrator the job of tracing the causes of events?

OpenStack release offers more flexibility and aids to performance

The Havana release features metering and orchestration

I talked this week to Jonathan Bryce and Mark Collier of OpenStack to look at the motivations behind the enhancements in the Havana release announced today. We focused on the main event (official support for the Ceilometer metering/monitoring project and the Heat orchestration project) but covered a few small bullet items as well.

Building an Alerting System That Really Works

Velocity 2013 Speaker Series

Building a high-quality alerting system often feels like a dark art. It is often hard to set proper thresholds, and harder still to define when an alert should be triggered. This results in alerts being raised too early or too late, and in your colleagues losing faith in the system. Once you use a structured approach to build an alerting system, you will find it much easier, and the alerts more predictable and precise.

Measure Selection

First you have to select proper measures to alert on. This selection is key as all other steps depend on using meaningful measures. While there seems to be an infinite number of different measures, you can categorize them into three main categories:

  • Saturation measures indicate how much of a resource is used. Examples are CPU usage or resource pool consumption.
  • Occurrence measures indicate whether a condition was met or not. A good example is errors. These measures are often presented as a rate, such as failed transactions per second.
  • Continuous measures do not have a single value at any given point in time, but instead a large number of different values. A typical example is response times. Irrespective of how small you make the sample, you will always have a large number of values and never just one single representative value.
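
To make the three categories concrete, here is a minimal Python sketch of how each kind of measure might be turned into a number you can alert on. It is an illustration only, not the approach from the talk: the function names, the choice of median/95th percentile/maximum as the summary of a continuous measure, and the sample figures are all assumptions.

```python
from statistics import quantiles


def saturation(used, capacity):
    """Saturation measure: fraction of a resource in use (0.0 to 1.0)."""
    return used / capacity


def occurrence_rate(event_count, window_seconds):
    """Occurrence measure as a rate, e.g. failed transactions per second."""
    return event_count / window_seconds


def summarize_continuous(values):
    """Continuous measure: describe the distribution, not a single value.

    The percentiles chosen here (median, p95) plus the maximum are an
    assumption for illustration; pick whatever fits your service.
    """
    # quantiles(..., n=100) returns 99 cut points; index 49 is the 50th
    # percentile and index 94 is the 95th percentile.
    cuts = quantiles(values, n=100, method="inclusive")
    return {"median": cuts[49], "p95": cuts[94], "max": max(values)}


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    print(saturation(used=6, capacity=8))
    print(occurrence_rate(event_count=30, window_seconds=60))
    print(summarize_continuous([120.0, 135.0, 140.0, 150.0, 900.0]))
```

Note that the continuous measure returns several numbers rather than one; an alert on response time would typically pick one of them, such as a high percentile, rather than an average that hides the outliers.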

Google Analytics for the Real World: A Conversation with Sharon Biggar of Path Intelligence

In preparation for the upcoming Web 2.0 Summit I am posting a few conversations with attendees that embody the Web Squared Theme. Path Intelligence uses sensor technology to understand shopping behavior in retail spaces by detecting and tracking the RF signals from mobile phones. As Sharon Biggar, co-founder, succinctly puts it – “we are like Google Analytics for the real world” giving offline retailers the same visibility on shopping behavior that online retail has enjoyed for years.

Understanding Web Operations Culture – the Graph & Data Obsession

“We’re quite addicted to data pr0n here at Flickr. We’ve got graphs for pretty much everything, and add graphs all of the time.”
- John Allspaw, Operations Engineering Manager at Flickr & author of The Art of Capacity Planning

One of the most interesting parts of running a large website is watching the effects of unrelated events affecting user traffic…

Hyperic CloudStatus service dashboard launches at Velocity!

Javier Soltero just launched CloudStatus during his Hyperic sponsor session today at Velocity. CloudStatus is a public health dashboard for web services like Amazon's EC2/S3, and Google's App Engine. Javier called to tell me about this last week after I declared that "Service Monitoring Dashboards are mandatory". This comes right after Amazon and Google had visible outages, and couldn't have…