vpxd with 100% CPU, vCenter Server unresponsive…

Recently our vCenter Server ran amok. It is vCenter 5.5 running on Windows Server 2008 R2, using the bundled MS SQL Server 2008 R2 Express database.

Symptoms:

  • Both the .NET client and the Web Client responded sluggishly.
  • No connection to the Update Manager was possible.
  • The vpxd service showed high CPU usage, at times a sustained 100%, making the whole server unresponsive.
  • The vpxd service sometimes crashed and needed a manual restart.

After a restart, the cycle would begin again after roughly 30 minutes of operation.

When I analyzed the vpxd.log files, I saw many database-related messages like this:

Could not allocate space for object XYZ
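To get a feeling for how often that message shows up, something like the following rough sketch can help. It assumes the vpxd log files have been copied to a machine with a POSIX shell and grep (our vCenter runs on Windows, so this is not something I ran on the server itself); the function name count_alloc_errors and the directory argument are made up for illustration.

```shell
# Hypothetical helper: count "Could not allocate space" lines across all
# vpxd*.log files in a directory (assumes the logs were copied there first).
# Usage: count_alloc_errors <log-dir>
count_alloc_errors() {
    log_dir="$1"
    # Concatenate all matching logs and let grep print the total match count.
    cat "$log_dir"/vpxd*.log 2>/dev/null | grep -c 'Could not allocate space'
}

# Example (the path is a placeholder):
# count_alloc_errors /tmp/vcenter-logs
```

A count that keeps climbing between restarts is a good hint that the database, not vpxd itself, is the thing to look at.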

So I checked the SQL database and saw that it had only 2 KB of space available. After all, SQL Server 2008 R2 Express limits each database to 10 GB.

Searching for a way to purge data, I came across this KB: Purging old data from the database used by vCenter Server (1025914)

It describes similar issues, and helped me to free up about 6 GB of space for the database.

Immediately after I started the procedure described in the KB article (it took about 40 to 50 minutes to finish), the vpxd service settled and became usable again, solving our problem.

What remains to be figured out is why the Events and Tasks tables had grown so rapidly over the last 90 days that they filled a 10 GB database. This environment has only 16 hosts and about 250 VMs.

 

udev and cloning a Linux VM: Network not working…

Have you ever stumbled upon a cloned Linux system, in my case CentOS 6.5, where eth0 does not exist and eth1 isn’t started automatically?

When VMware clones a VM, it gives the network card a new MAC address, ensuring that you don’t end up with several VMs sharing the same MAC. If your distro uses udev, it still has a persistent naming rule that ties eth0 to the MAC address of the parent VM. When udev discovers the NIC with the new MAC, it cannot match it to that rule, treats it as a new device, and creates eth1. This might break all sorts of scripts or configs.

Here is how to fix it:

  • First we need to remove the persistent NIC naming rules that udev has recorded:

rm -f /etc/udev/rules.d/70-persistent-net.rules

  • Secondly we need to edit the networking script for eth0:

vi /etc/sysconfig/networking/devices/ifcfg-eth0

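Here you should change the old MAC address to the new one the VM got after cloning. If the file does not exist at that path on your system, look for /etc/sysconfig/network-scripts/ifcfg-eth0 instead.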

  • Reboot.

That’s it. eth0 should work as it used to on the parent VM.
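If you clone often, the MAC edit can be scripted. The following is only a sketch: set_hwaddr is a helper name I made up, it assumes the MAC address is stored on an HWADDR= line in the ifcfg file, and the MAC in the example is a placeholder. It keeps a .bak copy of the file before changing it.

```shell
# Hypothetical helper: point an ifcfg file at the MAC address of the cloned NIC.
# Assumes the MAC is stored on an HWADDR= line.
# Usage: set_hwaddr <ifcfg-file> <new-mac>
set_hwaddr() {
    cfg="$1"
    mac="$2"
    if grep -q '^HWADDR=' "$cfg"; then
        # Replace the existing HWADDR line, keeping a backup as <file>.bak.
        sed -i.bak "s/^HWADDR=.*/HWADDR=$mac/" "$cfg"
    else
        # No HWADDR line yet: append one.
        printf 'HWADDR=%s\n' "$mac" >> "$cfg"
    fi
}

# Example, run as root on the clone before the reboot.
# While the NIC still shows up as eth1, its MAC can be read with:
#   cat /sys/class/net/eth1/address
# set_hwaddr /etc/sysconfig/networking/devices/ifcfg-eth0 00:50:56:aa:bb:cc
```

Check the resulting file by eye before rebooting; a wrong MAC in there leaves you without a network again.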

 

Thanks to William: http://www.envision-systems.com.au/blog/2012/09/21/fix-eth0-network-interface-when-cloning-redhat-centos-or-scientific-virtual-machines-using-oracle-virtualbox-or-vmware/

vSphere 5.5 and ESXi 5.5

Hi all,

Today I am not writing because of a certain problem or thing I stumbled upon. The “news” I want to share is somewhat “old” (26 August 2013), too: VMware announced vSphere 5.5 and ESXi 5.5!

Why am I posting this? Besides some cool new features in Hardware Version 10, on the VDP side, and on the hypervisor side, a major change that will affect how we use vCenter in our company is full Mac OS X client integration (including the plugin for the vCenter Web Client).

Now, isn’t that great news? 😉

Here’s a short sheet about what’s new: http://blogs.vmware.com/vsphere/files/2013/09/vSphere-5.5-Quick-Reference-0.5.pdf

And here’s the long story: http://www.vmware.com/files/pdf/vsphere/VMware-vSphere-Platform-Whats-New.pdf

All the best,

maybeageek

Execution error: E10056: Restore failed due to existing snapshot. Job Id: (Full Client Path:)

After a while of backing up VMs via vSphere Data Protection (VDP), the backup jobs for four VMs failed. The message said they needed consolidation.

After the consolidation, backups worked again for three of the VMs, but not for the fourth. Now I was getting the following error:

Execution error: E10056: Restore failed due to existing snapshot. Job Id: <job-id> (Full Client Path:)

The GUI said nothing about a needed consolidation, no snapshots were created either, and if you look into the VM’s config you see that the hard disk points to the base vmdk, not to a 000001.vmdk snapshot file. So, everything seemed to be in order, right?

After reading some articles I found a VMware KB article: VDP Backup fails

The solution given there: old 000001.vmdk files were lying around unused and not referenced anywhere. Simply deleting them helps (though moving them to another location first is recommended, just to be on the safe side).

So with this everything is up and running again! Thanks, VMware!
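The "move instead of delete" approach could look roughly like the sketch below. It is not taken from the KB: stash_stale_snapshots is a name I made up, and the glob assumes the leftover files follow the usual name-000001.vmdk pattern (including companions such as name-000001-delta.vmdk). As a safety net it refuses to touch anything if a .vmx file in the folder still references a snapshot disk. Note that a VM whose base disk name happens to end in a dash and six digits would match too, so look at the candidates first.

```shell
# Hypothetical helper: move leftover snapshot-style disk files out of a VM
# folder into a holding directory instead of deleting them.
# Usage: stash_stale_snapshots <vm-dir> <holding-dir>
stash_stale_snapshots() {
    vm_dir="$1"
    holding_dir="$2"
    # Safety check: stop if any .vmx in the folder still references a snapshot disk.
    if grep -qE -- '-[0-9]{6}\.vmdk' "$vm_dir"/*.vmx 2>/dev/null; then
        echo "a .vmx still references a snapshot disk, not touching anything" >&2
        return 1
    fi
    mkdir -p "$holding_dir"
    for f in "$vm_dir"/*-[0-9][0-9][0-9][0-9][0-9][0-9]*.vmdk; do
        [ -e "$f" ] || continue   # the glob matched nothing
        mv "$f" "$holding_dir"/
    done
    return 0
}

# Example (both paths are placeholders):
# stash_stale_snapshots /vmfs/volumes/datastore1/myvm /vmfs/volumes/datastore1/stale-myvm
```

Once the backup job has run cleanly a few times, the holding directory can be removed.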

vSphere Data Protection 5.1: Backup fails for Windows Server 2008 R2 VMs

Today I got to the bottom of another interesting case concerning backups with vSphere Data Protection.

After deploying the virtual appliance, registering it with the vCenter Server, and creating backup jobs, something interesting happened: Linux VMs got backed up, whereas Server 2008 R2 VMs failed with errors.

To make a long story short: it has to do with the UUIDs of the virtual hard disks and Windows VSS, and the fix is quite easy, as can be seen in this KB from VMware:

http://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2035736

esxtop not working in OS X terminal

I did some troubleshooting lately and came across this issue, so here is how to resolve the display problem with esxtop in the OS X Terminal:

Simply change the terminal emulation setting from xterm-256color to plain xterm. Voilà, it works!
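If you would rather not change the Terminal preferences, the terminal type can also be overridden for a single command. This is just a sketch: with_xterm is a wrapper name I made up, and the host name in the example is a placeholder. It relies on ssh passing the local TERM value on to the remote session.

```shell
# Hypothetical wrapper: run one command with TERM forced to plain xterm,
# leaving the TERM of the current shell untouched.
# Usage: with_xterm <command> [args...]
with_xterm() {
    TERM=xterm "$@"
}

# Example: open the SSH session to the host with TERM=xterm, then start esxtop there.
# with_xterm ssh root@esxi-host.example
```

Inside an existing session, prefixing the command directly (TERM=xterm esxtop) should have the same effect.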

Thanks to Punching Clouds: http://www.punchingclouds.com/2013/01/30/esxtop-data-display-issues-with-osx-terminal-application/

Registering vSphere Data Protection to vCenter does not work…

It seems that when you install vSphere Data Protection and want to register it with a distinct user that is not Administrator or root, you need to grant that user rights at the vCenter level directly (in this installation the user was called datarecovery, carried over from the old version). Just putting the user into an Active Directory group will not suffice; registration with vCenter will then fail with an error.
