Tuesday, 17 November 2009

VMware ESX Troubleshooting

Just recently I've had cause to analyse some VMware ESX logs in a bid to understand what was going on with a problematic ESX server and just what could be done to remedy the situation.

Up until now I hadn't given ESX logging much thought, mainly easing my conscience by thinking "Yeah, I'm sure ESX keeps a log of everything somewhere. It's Linux after all!"

Well, as I've discovered, it's time to stop burying my head in the sand and start looking at these logs.  After all, they are there to help (I think), and with a bit of prior knowledge you too can start to understand "ESX log speak".

The first thing to realise with ESX logs is that there aren't just three (like Windows); there are in fact twelve!

Obtaining ESX Log Bundles
Rather than go delving around on a live ESX server, let's download copies of the logs locally so that we can analyse them at our leisure.

Using the Virtual Infrastructure (VI) Client this is quite a simple thing to do:
  1. Log on to your VirtualCenter server or directly to your ESX host
  2. Choose File - Export - Export Diagnostic Data
  3. Select your ESX host, whether you wish to include VirtualCenter Server and VI Client logs, and where you wish to download the logs to
Now just sit back and wait - it can take a while for the logs to be generated and downloaded.

Viewing the ESX Log Bundles
Now I'm sure that there are other ways to do this, but the method I find easiest is to use WinRAR.  Two reasons:
  1. WinRAR can open .tgz compressed files
  2. WinRAR's view function displays the log files in a readable format. The only other way to view the log files in Windows is to open them with WordPad
I don't normally bother extracting the .tgz bundle file; I just use the view function in WinRAR to show me the contents of the log files.
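
If you would rather script this than click through WinRAR, the same "look inside without extracting" idea can be sketched with Python's standard tarfile module. This is only a rough sketch: it assumes the bundle is an ordinary gzip-compressed tar file, and the function names are my own, not anything VMware provides.

```python
import tarfile


def list_bundle_members(bundle_path):
    """Return the names of the regular files inside a .tgz bundle."""
    with tarfile.open(bundle_path, mode="r:gz") as bundle:
        return [member.name for member in bundle.getmembers() if member.isfile()]


def read_bundle_member(bundle_path, member_name, encoding="utf-8"):
    """Return the text of one file in the bundle without unpacking it to disk."""
    with tarfile.open(bundle_path, mode="r:gz") as bundle:
        handle = bundle.extractfile(member_name)
        if handle is None:
            # extractfile() returns None for directories and other non-file entries
            raise ValueError(f"{member_name} is not a regular file")
        with handle:
            # Undecodable bytes are replaced rather than raising an error
            return handle.read().decode(encoding, errors="replace")
```

Call list_bundle_members() with the path to your downloaded bundle to see what is in it, then pass one of the returned names to read_bundle_member() to read that log.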

Which Log is Which?
OK, on to the 'meat and potatoes' of this post.  Here is a breakdown of what is logged in each ESX log file.

Log File | Name | Details
/var/log/vmkernel | VMkernel | Records activities related to the virtual machines and ESX host [1]
/var/log/vmkwarning | VMkernel Warnings | A copy of everything marked as a warning or higher severity from the vmkernel log. Easier to look through than the vmkernel log [1]
/var/log/vmksummary | VMkernel Summary | Used for availability and uptime statistics. Human-readable summary in vmksummary.txt
/var/log/vmware/hostd.log | Host Agent Log | Contains information on the agent that manages and configures the ESX host and its virtual machines
/var/log/vmware/vpx | VirtualCenter Agent | Contains information on the agent that communicates with VirtualCenter
/var/log/messages | Service Console | Log from the Linux kernel. Useful for underlying Linux issues. The kernel has no awareness of VMs running on the VMkernel [2]
/var/log/vmware/esxcfg-boot.log | ESX Boot Log | Logs all ESX boot events [2]
/var/log/vmware/webAccess | Web Access | Records information on Web-based access to ESX Server
/var/log/secure | Authentication Log | Contains records of connections that require authentication, such as VMware daemons and actions initiated by the xinetd daemon
/var/log/vmware/esxcfg-firewall.log | ESX Firewall Log | Contains all firewall rule events [1]
/var/log/vmware/aam | High Availability Log | Contains information related to the High Availability (HA) service
/var/log/vmware/esxupdate.log | ESX Update Log | Logs all updates completed using the esxupdate tool

NOTES:
[1] Logs are rotated by logrotate; see KB3402740. Rotated copies get a numeric extension: the current log has no extension and the next newest one has a .1 extension.
[2] Log is symbolically linked to the current real file. Run 'ls -l logname.log' to see the link.
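
The rotation scheme in note [1] matters when you want to read a log's history in order, because a plain alphabetical sort puts .10 before .2. Here is a minimal Python sketch of sorting the copies newest first. It assumes the rotated copies are named exactly base.N as described in the note (compressed copies with an extra suffix are not handled), and the helper names are hypothetical.

```python
import re


def rotation_index(filename, base):
    """Return 0 for the current log, N for 'base.N', or None for unrelated files."""
    if filename == base:
        return 0
    match = re.fullmatch(re.escape(base) + r"\.(\d+)", filename)
    return int(match.group(1)) if match else None


def newest_first(filenames, base):
    """Sort the rotated copies of one log from newest (no extension) to oldest."""
    indexed = [(rotation_index(name, base), name) for name in filenames]
    # Drop files that do not belong to this log, then sort by rotation number
    return [name for _, name in sorted(pair for pair in indexed if pair[0] is not None)]
```

For example, newest_first(os.listdir("var/log"), "vmkernel") on an extracted bundle would give the vmkernel copies in reading order, ignoring vmkwarning and vmksummary.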

What to Look For
This really depends on the error you are trying to troubleshoot!

A good starter for ten is to search for the text "error" in any of the logs ;o)
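
That search is easy to script as well. The sketch below is a small Python helper that does a case-insensitive search for a piece of text and reports the line numbers. The function name and the default search word are just my choices for illustration; it works on any iterable of lines, such as an open log file.

```python
import re
import sys


def find_matches(lines, pattern="error"):
    """Return (line_number, text) pairs for lines containing pattern, ignoring case."""
    matcher = re.compile(re.escape(pattern), re.IGNORECASE)
    return [
        (number, line.rstrip("\n"))
        for number, line in enumerate(lines, start=1)
        if matcher.search(line)
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage: python find_errors.py path/to/logfile
    with open(sys.argv[1], errors="replace") as log_file:
        for number, text in find_matches(log_file):
            print(f"{number}: {text}")
```

Searching vmkwarning first is usually a shorter read than searching the full vmkernel log, since it only holds entries of warning severity or higher.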

Further Information
Additional Reading:
  • VI3 Advanced Log Analysis - Powerpoint
  • Troubleshooting VMware ESX Server 3 and VMware VirtualCenter 2 - PDF
  • Tips for Troubleshooting VMware ESX Server Faults - PDF
  • ESX Server 3 Log Map - link
  • Which ESX Log File - link
VMware Lab Presentation Videos:
  • Tips for Troubleshooting ESX Server 3.x Faults - Presentation (free VMworld login required to view)
  • Troubleshooting VI3 - Presentation (free VMworld login required to view)

Conclusion
It's not possible to cover every eventuality in just one blog post. Hopefully the information provided here will at the very least set you on the right road to resolving ESX issues for yourself.

One final thought - All of the ESX issues I've come up against have a logical cause and hence a totally logical solution.  There are no smoke and mirrors here.  With that in mind, have fun.

- Chris
