After our SAN reset itself and lost its connection to our VMware ESX 4.1 hosts, several servers were understandably unhappy. vCenter, however, did not appear to be one of them. In fact, we used it extensively while fixing the various problems caused by the SAN outage, and it never gave us any trouble. So it was a little surprising today when the vCenter Server service stopped. When I restarted the service, the SQL Server 2005 Express instance shot to 99% CPU usage and the tempdb system database started growing; it would not stop until it had exhausted all available storage space, and the instance would then crash. We found that restarting SQL Server freed up the space, but the cycle would then start again. We also found that stopping the vCenter Server service dropped the SQL Server CPU utilization back to 0 and stopped the tempdb file growth.
Naturally, I engaged VMware technical support on this, and we worked the problem for four hours without resolution. We tried truncating the temp tables and history tables in the vCenter database, resetting the credentials for the SQL services, and combing through extensive logs. No luck.
Finally, I tried an Instant Recovery of the vCenter VM from our Veeam backup system, and it worked beautifully: CPU utilization stayed down, and storage stayed stable. I immediately did a standard restore of the VM and was back up and running.
All vCenter stores is historical information, like performance statistics, and your higher-level configuration, like HA. If you're a smaller shop and you keep your configuration basic, then vCenter isn't storing much of anything, and rolling it back a couple of days won't harm anything. In fact, you could do a completely new installation of vCenter and reconfigure it manually pretty quickly.