vSphere: Freezing VMs after deleting a volume from the SAN

2 12 2009

This article has moved to my new site:





5 responses

11 12 2009
Stephen Vogel

I’m having the same issues, but they are not related to a deleted volume or LUN. In my case, I have two ESX 4 hosts attached to a LeftHand SAN (single node). Two volumes/datastores are shared by both ESX hosts. VMs on ESX1 are fine, but any VMs on ESX2 suffer from poor performance (30-40 second freezes) and tons of Disk and symmpi errors in the System log of the Event Viewer.

I have tried rescanning the storage network, but the problem has not improved. Do you see any Disk errors in your VMs’ Windows event logs? I have had these problems on and off, but they have always affected only one of my two ESX cluster servers.

12 12 2009
Kenneth van Ditmarsch

Hi Stephen,

No, I didn’t see errors in the Windows event log, since my drops were below 60 seconds (and the Windows disk timeout defaults to 60 seconds).
What does the vmkernel log of the troubled ESX host tell you? To me this sounds like a misconfiguration at the ESX level.

Have you tried moving one of the troubled VMs to the good ESX host to isolate the problem?
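To make the timeout reasoning above concrete, here is a minimal Python sketch. It assumes the simplified model from this thread: a guest only logs a disk error when an I/O stall lasts longer than its disk timeout. The function name and the sample stall durations are hypothetical, and the 60-second value is the default mentioned above (on Windows it is normally controlled by the `TimeOutValue` registry value under `HKLM\SYSTEM\CurrentControlSet\Services\Disk`, so a given guest may be set differently).

```python
# Assumption: the guest's disk timeout in seconds, as quoted in this thread.
DEFAULT_DISK_TIMEOUT_S = 60


def stall_exceeds_timeout(stall_seconds, timeout_seconds=DEFAULT_DISK_TIMEOUT_S):
    """Return True if an I/O stall outlasts the guest's disk timeout.

    Simplified model (an assumption, not VMware or Windows behaviour
    verified here): shorter stalls are retried silently, longer ones
    surface as disk errors in the guest's event log.
    """
    return stall_seconds > timeout_seconds


if __name__ == "__main__":
    # Hypothetical stall durations, in seconds.
    for stall in (35, 59, 75):
        print(stall, stall_exceeds_timeout(stall))
```

Under this model, 30-40 second freezes that still produce disk errors would suggest either a lower timeout configured in those guests or a cause other than a plain timeout.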


7 01 2010
Virtualization Short Take #33 - blog.scottlowe.org - The weblog of an IT pro specializing in virtualization, storage, and servers

[…] article by Kenneth Van Ditmarsch, backed up by this post by Chad Sakac, underscore the need for proper […]

21 01 2010

The issue, along with a workaround, is described in:
Virtual machines might stop responding when any LUN on the ESX/ESXi host is in an all-paths-down condition – http://kb.vmware.com/kb/1016626

Unpresenting a LUN containing a datastore from ESX 4.x and ESXi 4.x – http://kb.vmware.com/kb/1015084

3 02 2010

We experienced the same issue, and since we have over 30 hosts, it ended up bringing down the EVA8001. It turns out this is a known, reported issue with HP SANs and vSphere. When you unpresent a LUN, the hosts go into a panic state and bombard the SAN to the point that it can no longer respond. The easiest workaround is to turn the machine off when you unpresent a LUN, or you can mask the individual LUNs, which is a pain if you have a lot of hosts. VMware released a patch on 5 January.

The KB article is 1016291, released with reference to ESX 4.0 Patch 03. This KB states that the fix is delivered in patch ESX400-200912401-BG. PR 467188, found in the “PRs Fixed” section, is the PR for the APD (all-paths-down) situation.

KB Link: http://kb.vmware.com/kb/1016291
