CIO Today

Friday, April 18th 
Google Blames Outage on Software Bug

January 27, 2014 1:30PM

Isolated events like this are not a problem and users will forgive the Google outage. It becomes a problem when a pattern develops. If it were to happen multiple times it could become a problem for Google. Gmail has become a very strategic product and it's unlikely that Google will experience many more of these outages, said analyst Greg Sterling.


If you are a hardcore Google user, you may have been tempted to pull out a few hairs last Friday as several of the company’s key services experienced a painful hiccup. Now, Google is shedding some light on the incident.

Specifically, users of logged-in Google services such as Gmail, Google+, Calendar and Documents were unable to access them for about 25 minutes, according to Ben Treynor, Google’s vice president of engineering.

“For about 10 percent of users, the problem persisted for as much as 30 minutes longer,” he said on Friday. “Whether the effect was brief or lasted the better part of an hour, please accept our apologies -- we strive to make all of Google’s services available and fast for you, all the time, and we missed the mark today.”

What Really Happened?

Treynor reports that the issue has been resolved, and the company is now focused on correcting the bug that caused the outage, as well as putting more checks and monitors in place to ensure that this kind of problem doesn’t happen again. He then offered a technical explanation for what occurred and how it was fixed.

At 10:55 a.m. PST on Friday, Treynor explained, an internal system that generates configurations -- essentially, information that tells other systems how to behave -- encountered a software bug and generated an incorrect configuration. That configuration was sent to live services over the next 15 minutes and caused users’ requests for their data to be ignored; those services, in turn, generated errors.

“Users began seeing these errors on affected services at 11:02 a.m., and at that time our internal monitoring alerted Google’s Site Reliability Team. Engineers were still debugging 12 minutes later when the same system, having automatically cleared the original error, generated a new correct configuration at 11:14 a.m. and began sending it; errors subsided rapidly starting at this time,” Treynor said. “By 11:30 a.m. the correct configuration was live everywhere and almost all users’ service was restored.”
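The intervals in Treynor's account can be checked with a little arithmetic. The short Python sketch below uses only the four clock times he reported; the dictionary keys and function names are labels chosen for this illustration, not Google terminology.

```python
from datetime import datetime

# Clock times as reported by Treynor (a.m. PST on the day of the outage).
# The key names are hypothetical labels for this sketch.
TIMELINE = {
    "bad_config_generated": "10:55",
    "errors_visible": "11:02",
    "good_config_generated": "11:14",
    "good_config_live": "11:30",
}


def minutes_between(start: str, end: str) -> int:
    """Return whole minutes from start to end, both given as HH:MM."""
    fmt = "%H:%M"
    delta = datetime.strptime(end, fmt) - datetime.strptime(start, fmt)
    return int(delta.total_seconds() // 60)


def summarize(timeline: dict) -> dict:
    """Compute the intervals between the reported events."""
    return {
        "bug_to_visible_errors": minutes_between(
            timeline["bad_config_generated"], timeline["errors_visible"]
        ),
        "errors_to_good_config": minutes_between(
            timeline["errors_visible"], timeline["good_config_generated"]
        ),
        "good_config_rollout": minutes_between(
            timeline["good_config_generated"], timeline["good_config_live"]
        ),
        "visible_errors_to_full_recovery": minutes_between(
            timeline["errors_visible"], timeline["good_config_live"]
        ),
    }


if __name__ == "__main__":
    for name, minutes in summarize(TIMELINE).items():
        print(f"{name}: {minutes} min")
```

By these figures, 28 minutes passed between the first user-visible errors and the point at which the correct configuration was live everywhere, in line with Treynor's description of an outage of about 25 minutes for most users.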

Will Google See User Backlash?

With services once again working normally, Treynor said work is now focused on removing the source of failure that caused Friday’s outage, and on speeding up recovery when a problem does occur. To that end, he outlined three steps:

(1) Correcting the bug in the configuration generator to prevent recurrence, and auditing all other critical configuration generation systems to ensure they do not contain a similar bug; (2) adding additional input validation checks for configurations, so that a bad configuration generated in the future will not result in service disruption; and (3) adding additional targeted monitoring to more quickly detect and diagnose the cause of service failure.
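Google has not published how its validation checks work, so the following Python sketch is only an illustration of the general idea behind step (2): check a generated configuration before it reaches live services, and keep the last known good one if the check fails. The `ServiceConfig` fields, the specific checks, and the function names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class ServiceConfig:
    """Hypothetical configuration shape; not Google's actual format."""
    version: int
    backends: Tuple[str, ...]
    accept_requests: bool


def validate(config: ServiceConfig) -> List[str]:
    """Return a list of problems; an empty list means the config passed."""
    problems = []
    if config.version <= 0:
        problems.append("version must be a positive integer")
    if not config.backends:
        problems.append("no backends listed")
    if not config.accept_requests:
        problems.append("config would cause user requests to be ignored")
    return problems


def choose_config(candidate: ServiceConfig,
                  last_known_good: ServiceConfig) -> ServiceConfig:
    """Use the candidate only if it validates; otherwise keep the old one."""
    return candidate if not validate(candidate) else last_known_good


if __name__ == "__main__":
    good = ServiceConfig(version=41, backends=("b1", "b2"), accept_requests=True)
    bad = ServiceConfig(version=42, backends=(), accept_requests=False)
    print(validate(bad))
    print(choose_config(bad, good).version)
```

The design choice in this sketch is that a failed check leaves the running configuration in place, which is one way a bad configuration could be kept from causing a service disruption.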

We caught up with Greg Sterling, principal analyst at Sterling Market Intelligence, to get his take on the outage -- and its resolution. He told us that because Google has such a strong reputation as an engineering-driven company, an incident like this surprises many people.

“Again, isolated events like this are not a problem and users will forgive the outage,” Sterling said. “It becomes a problem when a pattern develops. If this were to happen multiple times it would start to become a problem for Google. Gmail has become a very strategic product for the company and it's unlikely that Google will experience many more of these outages.”

© Copyright 2000-2014 CIO Today. All rights reserved.