A VMware mistake may shut down thousands of virtual infrastructures

Posted by virtualization.info Staff   |   Tuesday, August 12th, 2008

This morning, VMware customers who upgraded their virtual data centers to the new Infrastructure 3.5 Update 2 (build 103908) had an awful surprise: any virtual machine that is turned off cannot be powered on again, and any attempt to execute a VMotion (the live migration of a VM from one host to another) fails.

The reason behind this huge and unprecedented issue is an error in the license expiration time.

The only way to work around the problem at the moment is to disable the Network Time Protocol (NTP) client and set the date back to August 10, as promptly suggested by a customer here.

Of course, this countermeasure affects log consistency and any tool that analyzes VirtualCenter events for other purposes (performance monitoring, trend analysis, capacity planning calculations, etc.).
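Why rolling the clock back helps can be illustrated with a minimal sketch. It assumes the license check simply compares the host clock with a fixed expiration date; the date used here is inferred from this post's timeline, and the function names are hypothetical, not VMware's code.

```python
from datetime import date

# Hypothetical expiration date, inferred from the article's timeline
# (failures began on August 12, 2008); not taken from VMware's code.
ASSUMED_EXPIRY = date(2008, 8, 12)


def license_valid(today: date, expiry: date = ASSUMED_EXPIRY) -> bool:
    """Return True while the host clock is still before the expiry date."""
    return today < expiry


def can_power_on(today: date) -> bool:
    """Assumed behavior: power-on is refused once the license has expired."""
    return license_valid(today)


if __name__ == "__main__":
    # With the clock set back to August 10 the check passes again.
    for day in (date(2008, 8, 10), date(2008, 8, 12)):
        print(day.isoformat(), "power-on allowed:", can_power_on(day))
```

Because a check of this kind depends only on the host clock, a running NTP client would presumably restore the real date and make the failure return, which is why the workaround disables NTP first.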

Beyond that, the issue obviously affects the availability of those infrastructures whose IT administrators are on vacation (and there are many on August 12) and cannot perform any recovery.

Users from all around the world are reporting failures of part of their systems and, in some cases, even complete outages.

VMware has over 200,000 enterprise customers (100% of the Fortune 100 and 95% of the Fortune 500), and it has claimed that 59% of them use VMotion in production.
The company hasn't provided any statistics about how many have already deployed Update 2, but the license fault could have affected thousands of them.

VMware is aware of the issue but hasn't been able to provide an immediate solution.
At the moment it seems that the entire VMware Knowledge Base has collapsed.
Customers calling the support line only hear a brief message saying that the problem will be solved within 36 hours.
Additionally, VMware has removed the ability to download any affected product.

The existence of such an issue is more than enough to undermine the credibility of the company (which has already made some mistakes in the past) at a complex moment in its successful history.
A 36-hour timeframe to provide a solution is simply an unacceptable answer for all those enterprises that deploy virtualization in production.

The whole thing may severely damage the company's stock performance today.


Update: The license of VMware ESXi 3.5 U2 (build 103909) is reported as affected by the same problem.

Second update: To further aggravate the situation, today is the so-called Microsoft Patch Tuesday, so a number of guest operating systems are being rebooted automatically (or manually, by those unaware of the issue).

As if this were not enough, any customer running a VDI environment almost certainly allows its end users to reboot their virtual desktops any time they want.

Third update: As the VMware Knowledge Base is still unavailable, probably due to overload, virtualization.info is publishing the original KB article about this issue.

VMwareKB

Fourth update: The issue also impacts ESX 3.5 Update 1 with certain patches. 
The full details are available in the comment section of this post, thanks to the effort of a virtualization.info reader.

Suddenly the problem is no longer just a matter of early adoption.


Fifth update:
As promptly reported in the comments section, VMware's new CEO, Paul Maritz, has published an apology on the official blog, announcing that a patch has been released:

…I am sure you’re wondering how this could happen.  We failed in two areas:

  • Not disabling the code in the final release of Update 2
  • Not catching it in our quality assurance process 

We are doing everything in our power to make sure this doesn’t happen again.  VMware prides itself on the quality and reliability of our products, and this incident has prompted a thorough self-examination of how we create and deliver products to our customers.  We have kicked off a comprehensive, in-depth review of our QA and release processes, and will quickly make the needed changes…

Maritz couldn't have wished for a worse start to his new role in the company. Nonetheless, this is a great opportunity: the co-founder and former VMware CEO, Diane Greene, was often accused of being unable to grow her company into a big enterprise capable of competing against Microsoft.

In handling this incident, Maritz has his first chance to demonstrate that he's the right person to do better than Greene.

Sixth update: VMware is still unable to republish the ESX 3.5 and ESXi 3.5 Update 2 images for fresh installations.
Their availability is expected by August 13, 2008 at 6pm PST.

Seventh update: VMware has just informed its customers that it cannot deliver a new, patched image of the product by the planned deadline.

The images are now planned for release on August 14, between 2am and 8am PDT.

Eighth update: A number of enterprise customers may be unable to apply the first patch released (see Fifth update above) for a number of reasons:

  • Unable to schedule a maintenance window
  • Internal change control procedures
  • No available server to VMotion running VMs onto

VMware is aware of these constraints and has informed its customers that it is developing a second procedure to apply the patch, called U2 Alternative Install Process (U2 AIP), which will be available on demand by calling Support.
At the moment (August 15, 2008) there is no release date for this new patch installation procedure.

Meanwhile, the fully patched images are finally available online and all the download links have been reactivated.
The new build numbers are:

  • ESX 3.5 Update 2 – 110268
  • ESXi 3.5 Installable Update 2 – 110271

