"outages" entries

AT&T Fiber cuts remind us: Location is a Basket too!

The fiber cuts affecting much of the San Francisco Bay Area this week are similar to the outages in the Middle East last year (radar post), although far more limited in scope and impact.   What I said last year still holds true and is repeated below: From an operations perspective these kinds of outages are nothing new, and underscore why…

Hyperic CloudStatus service dashboard launches at Velocity!

Javier Soltero just launched CloudStatus during his Hyperic sponsor session today at Velocity. CloudStatus is a public health dashboard for web services like Amazon's EC2/S3, and Google's App Engine. Javier called to tell me about this last week after I declared that "Service Monitoring Dashboards are mandatory". This comes right after Amazon and Google had visible outages, and couldn't have…

Service Monitoring Dashboards are mandatory for production services!

Google App Engine went down earlier today. GAE is still a developer preview release, and currently lacks a public monitoring dashboard. Unfortunately this means that many people either found out from their app and/or admin consoles being unavailable or from Mike Arrington's post on TechCrunch. Google has a strong Web Operations culture, and there are numerous internal monitoring tools in…