HOWTO – 64-bit Kernel 2.6.31 and VMware Server 2.0.1…

Assuming you have already installed the 2.6.31 kernel, this link has a patch and script to modify the modules VMware compiles when you run the configuration script.  The patch script is meant for a particular kernel version and later, and works fine for 2.6.31.

  1. Run the installer that came with VMware Server 2.0.1, but DO NOT run the configuration script at the end.
  2. Get the patch script and make it executable.
  3. Get the patch – vmware-server.2.0.1_x64-modules-
  4. Make a directory, say, /usr/src/vmware-patches and cd to it.
  5. Copy the patch, the script and the four module sources (/usr/lib/vmware/modules/source/*.tar) to the patch directory you are now in.
  6. Run the patch script – it should build for 64-bit systems.  I do not know about 32-bit systems…
  7. Run the configuration script, and install as normal.
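
Steps 4 and 5 could be scripted roughly like this. It is only a sketch: stage_patch_dir is a helper name I made up, and the patch and script file names are whatever you downloaded, passed in as arguments.

```shell
#!/bin/sh
# Sketch of steps 4 and 5: create the patch directory and copy everything in.
# stage_patch_dir is a made-up helper name, not part of VMware or the patch.
# Usage: stage_patch_dir PATCH_DIR MODULE_SOURCE_DIR FILE...
stage_patch_dir() {
    patch_dir=$1
    source_dir=$2
    shift 2
    mkdir -p "$patch_dir" || return 1
    # Copy the files named on the command line (the patch and its script),
    # then the module source tarballs from the VMware install.
    for f in "$@" "$source_dir"/*.tar; do
        if [ ! -f "$f" ]; then
            echo "missing: $f" >&2
            return 1
        fi
        cp "$f" "$patch_dir"/ || return 1
    done
    # Print how many tarballs are now in the patch directory.
    ls "$patch_dir"/*.tar | wc -l | tr -d ' '
}
```

Called as stage_patch_dir /usr/src/vmware-patches /usr/lib/vmware/modules/source followed by the paths of your downloaded patch and script, it should print 4 if all four module tarballs were copied.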

There have been reports of minor script errors, so you may need to make some slight edits.  Or you may not – I had no trouble.  If you need to reinstall, make sure you stop the vmware services, rmmod the vmware modules, and delete everything in the /usr/lib/vmware/modules directory before re-running the installer-patch-config steps above.  You will also need to delete the modules from your system – running the installer should generate a failure message telling you what files to delete from where.  Successfully running the installer will put everything you need back in the /usr/lib/vmware/modules directory.
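
For the reinstall case, here is a dry-run sketch of the cleanup order. It only prints commands so you can read them before running anything. The service script path and the four module names are my assumptions, not something the patch authors specified, so check lsmod on your own box first.

```shell
#!/bin/sh
# Dry-run sketch of the cleanup before a reinstall: it only PRINTS commands.
# Assumptions (not from the patch authors): the service script is
# /etc/init.d/vmware and the modules are vsock, vmci, vmnet and vmmon.
# print_cleanup_commands is a made-up helper name.
print_cleanup_commands() {
    modules_dir=${1:-/usr/lib/vmware/modules}
    # Stop the services first so the modules are no longer in use.
    echo "/etc/init.d/vmware stop"
    # vsock is listed before vmci on the assumption that it depends on it.
    for m in vsock vmci vmnet vmmon; do
        echo "rmmod $m"
    done
    # Empty the modules directory; the installer repopulates it.
    echo "rm -rf $modules_dir/*"
}
```

This does not cover the module files installed elsewhere on the system; the installer's failure message is still the place to find those.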

Big thanks to meubeukeu and michelemase for their work in making these patches!

VMware Server 2.0.1 and Kernel…

I finally decided to get VMware Server running on my new kernel.  Whenever the kernel is updated, there are some things you can count on having to reinstall, such as NVidia video drivers and VMware installations.   I expected problems, so my methodology was to attempt a normal install, expect failure, and search on the resulting errors.  This did not pan out, so I tried the VMware Community Forums, and I found this little gem on how to patch the VMware modules:

This apparently works on 32-bit systems as well, though that may not have been confirmed.

I downloaded the patch and shell script, ran the script, and followed the directions of the output:

  • Move original files that could cause issues with VMware – “mv /usr/lib/vmware/modules/binary /usr/lib/vmware/modules/binary-orig”
  • Run the config again, without the -d option (otherwise, root would be the only user allowed to log into the web interface).

Essentially, there were no problems getting everything running.  Now I have to figure out what my password was to log into my Windows XP VM.  I have to complete some online training that can only be done in Windows (thanks a ton).  I would hate to have to crack my way into my own VM…

Huge thanks out to both michelmase and Krellan for the patches and scripts!

Some Tips on VMware ESX…

Well, it has been slow posting recently.  For a while.  OK, a long time now.  But I have been working in a lab, building a virtualization environment using VMware ESX Servers and Virtual Center, and lemme tell ya, there are a LOT of moving parts.  I thought it would be useful to jot down some of the tips I have picked up along the way.  This applies to ESX 3.5.0 update 2 and Virtual Center 2.5.0 update 1.

So here goes.  From memory, so there may be *minor* inaccuracies.

  1. Hardware:  Sure, you want as many CPU cores as you can get (VMware licensing counts a physical CPU with up to six cores as one processor).  Sure you want as much RAM as the machine will hold.  Of course you want terabytes of disk space (well, as much as you can get anyway).  Guess what?  You should also make sure you have plenty of network cards handy.  Whatever space isn’t taken up with fiber channel HBAs, iSCSI initiators, etc., throw a NIC in there.  A Gigabit Ethernet NIC, fiber or copper.  10 gig if you can use it.  Just make sure the cards are supported by VMware, or you may be swapping cards a lot learning the hard way…
  2. Network:  Cisco is good – CDP (Cisco Discovery Protocol) and EtherChannel are both great complements to ESX networking.
  3. Storage:  NFS instead of iSCSI/Fiber Channel.  Huh?  Are you nuts?  Seriously, my mind was blown away at VMworld 2008 at the sessions covering NFS access to shared storage.  On a NAS.  NetApp appears to be a natural choice, but any NFS server will do in a pinch.  The VMware ESX kernel currently supports version 3 of NFS.  Some apps are better fits for FC/iSCSI SANs.  But most should work just great on a NAS over NFS, and it is *way* cheaper, easier to manage, and more flexible.  There are tradeoffs to everything, of course, so investigate closely.
  4. Which NIC is which?  Two ways to find out – being in the ESX command line is useful now.  Assuming you are plugged into a Cisco, you can use CDP.
    • ESX – Set CDP to both listen and advertise on your virtual switch(es) – the default is listen – with this command: “esxcfg-vswitch -B both vSwitch0”. Replace vSwitch0 with your vswitch name. Check with the same command, using -b instead of -B.
    • Cisco – Turn on CDP.
    • Cisco – “show cdp neighbor” will show you vmnic0, vmnic1, etc. and the Cisco port connecting them.

    Or you can do it all from ESX: plug one NIC into the network and type “esxcfg-nics -l” at the ESX command line.  Plug in the NICs one at a time and rerun that command each time.  The vmnic whose link status changes is the one you just connected.  Be sure to document everything.

  5. Routing:  Can’t ping a service console NIC?  Can’t get to a vmkernel NIC?  Virtual machines not talking to the rest of the network?  Make sure your default routes are set properly with “esxcfg-route” (for vmknics), and “netstat -r” for your vswif (service console) NICs.  The “/etc/sysconfig/network” file also has the service console default route in it for startup – make sure it is correct, change as needed.
  6. VLANs, portgroups, and vmnics:  This is tricky, and something I had to learn on my own.  The “esxcfg-vswitch” command lets you create and delete virtual switches, set CDP, add and remove vmnics (the physical network cards ESX detects), and add and remove portgroups (VLANs and their tags, or IDs).  The -L option links the vmnic to the vswitch, on all portgroups.  The -U option unlinks it.  But then there is also a -M option, which adds a vmnic to a portgroup on the switch, while -N removes vmnics from portgroups.  That is the tricky part – suppose you wanna add a vmnic to one portgroup only (say your switch has three portgroups) – first, you need to be in the command line, because the Virtual Center GUI does not seem to provide this granularity of configuration.  If you add it to that portgroup with the -M option alone, it does not fail, and looks right, but the vmnic does not talk to anything.  You must link it (-L) to the switch first.  THEN add and remove from portgroups using the -M and -N options, one vmnic/portgroup at a time, after which your vmnic will work as you expected.  This is not documented well, and the man page does not clearly explain this, so be aware.
  7. NFS server on ESX:  Running an NFS server on the ESX host itself is not recommended, so do not do it.  Now that you have chosen to ignore my advice, you will need to recompile the kernel with the _NFS_TCPD option set to y – you need this link.  You will need to modprobe two modules, nfs and nfsd.  You will need to start the portmap and NFS services.  You will need to edit the /etc/exports file (using the no_root_squash option) and export it.  Verify with “showmount -e” and “rpcinfo -p”.  Be really careful – I have done this several times in a closed lab environment, just to learn.  But you can seriously gank up your OS trying this – do not miss a step.  I used the /usr/src/linux-2.4.xx source, and did not need to modify the makefile.  One more tip – if you do this, and have separated your traffic properly (service consoles, vmotion, NFS, virtual machine nets all on separate IP networks and VLANs), you will need to add in a service console vswif that other machines can access NFS on to keep the traffic away from the service console networks – so if your NFS network traffic flows on the 10.10.11.x network and your service consoles are on the 10.10.10.x network, add a vswif to the .11 NFS network and point other servers to it.  NFS won’t see a vmkernel NIC – it needs to be something that shows up in ifconfig – a vswif.  This allows you to add it to the NFS portgroup (if you are separating the traffic via portgroups/tagged VLANs on a single vswitch at layer 2 instead of using multiple vswitches – I do both).  I have had no problems doing this.  YMMV.  Not for production use – get a NetApp (with the NFS license) or build one (FreeNAS, etc.).
  8. Clusters, resource pools, and virtual machines:  So you made a cluster, added your hosts, and created some resource pools.  Ready to import a VM?  READY, SET, FAIL!  I found that importing a Virtual Server 1.x VM from a local disk copy (trying to make it as simple as possible) failed with typically helpful Virtual Center log entries.  You know, the “unknown error” type.  Off to Google.  Turns out that DRS is getting in the way of the import.  Right click on the cluster, edit the properties, set DRS to manual, and then import it directly to the ESX host in the cluster that you want it to go to.  Should work fine after that (knock, knock).  Then you can set DRS back to what it was, and drag the VM to the resource pool desired.  Some posts say to further remove the ESX host from the cluster, but setting DRS to manual was all I needed to do.
  9. ESX host settings:  When you first set up a host (the ESX server itself) on Virtual Center, make sure to set the time properly – use an NTP server source if you can.  You may also want to increase your service console memory – I max mine out at 800 MB.  This requires a reboot of the ESX host to take effect.  Also, when making partitions during the ESX install (if you do that kind of thing – I always do it manually), make sure you set the vmkcore partition to be larger than 100 MB.  It needs a minimum of 100 MB, so set it to 104 MB to be sure, as 100 MB may actually format to less than 100 MB, causing your install to fail.
  10. fdisk, vmkcore, and vmfs3:  If you need this, you are really in the deep water.  So, you are not happy and decided to take it out on your partition table.  Using fdisk.  At the ESX command line.  ALLRIGHTY THEN.  I assume you know exactly what the hell you are doing, cuz if you don’t, you sure will.  The partition type for vmfs3 is fb.  The partition type for vmkcore is fc.  You do not need to (and cannot) format the vmkcore partition, but will need to format the vmfs3 partition after rebooting.  You may very well be booted into a maintenance shell (not safe mode, not even that far).  If you change partitions, you change those UUIDs referenced in “/etc/fstab” – I got around this by mounting by /dev/ device name instead of UUID within /etc/fstab.  (Hope you like vi.)  Here is a link for the lost and desperate (foolhardy and just-plain-nuts in my case).
  11. Using service consoles creatively:  This can be done, as I mentioned above with NFS on ESX.  I have a situation where I need to get to a time server on our LAN, but the ESX interfaces I want to use for NTP are on private nets – which our LAN administrator absolutely refuses to route on the LAN (good for him).  So, I added a tagged VLAN and interface on my Cisco for an unused production network, adjusted the routing on the uplink switch, and created that VLAN portgroup on all my ESX hosts’ service console vswitches.  I then added a vswif interface on that IP network to the new (NTP) portgroup, and added the service console vmnics (using the -M option) to the new portgroup.  I also had to set the Cisco up as an NTP server, using the upstream NTP server as a peer, and voila!  Accurate time on all my ESX hosts is now a reality.  May not be recommended (I don’t actually know), but you *can* use vswif interfaces for special purpose traffic needs, and still hold true to best practice guidelines (I try to keep all the backend ESX traffic on non-routing private nets).
  12. Troubleshooting extras:  This may cover a few gotchas…
    • Disable the firewall (see the examples below).  It might be in the way, so get rid of it as a variable.  Don’t forget to turn it back on and configure it later!
    • Before removing a portgroup from a switch at the ESX command line, make sure you have removed all vswif, vmknic, and vmnic interfaces from the portgroup first. It has to be empty before you remove it.
    • Not sure which NIC has a driver for it?  I loaded up lots of scavenged gigabit NICs and couldn’t tell which was loading (I do things the hard way).  Match the PCI IDs from the “esxcfg-nics -l” command with the output of “lspci” to be sure.
    • Wanna change a portgroup VLAN ID? Just reissue the esxcfg-vswitch command with the new VLAN ID, like this: “esxcfg-vswitch -p "Current Portgroup With Wrong ID" -v 74 vSwitch2”. Now the VLAN tag is changed from whatever it was to 74. No need to remove the portgroup and do it all over.  You can also move a vswif to another portgroup in a similar manner, so you do not need to delete it and recreate it (see below for an example).
  13. Document and plan, plan and document:  This all starts with a plan.  The goal is as robust and flexible a virtual environment as you can afford to make.  You do not want to build this on a poor foundation – you do not want to rip everything up later and do it a better way.  PLAN IT OUT.  Mine took over a month of me researching and dry-running, and I am still not sure I got it all right, but it is far more sophisticated and robust than it was when I started with my original design.  Document all phases of your work.  There are a LOT of moving parts here – you just cannot do this without ruthlessly precise documentation.  No time to be lazy or cut corners – your environment did not come cheap, and it may very well become a critical part of your network.  It has to be your best effort.  Use the best practices available from VMware, Cisco, NetApp, etc.
  14. ESX command line examples:  Here are a few of the really useful ones I have gotten comfortable with… See the man pages for more.
    • Add a vswitch – “esxcfg-vswitch -a vSwitch4”
    • Add a portgroup – “esxcfg-vswitch -A "My New Portgroup" vSwitch4”
    • Add a VLAN tag – “esxcfg-vswitch -p "My New Portgroup" -v 33 vSwitch4”
    • Add a NIC to a vswitch – “esxcfg-vswitch -L vmnic2 vSwitch4”
    • Add a NIC to a portgroup – “esxcfg-vswitch -M vmnic2 -p "My New Portgroup" vSwitch4”
    • Remove a NIC from a portgroup – “esxcfg-vswitch -N vmnic2 -p "My Old Portgroup" vSwitch4”
    • Remove a portgroup – “esxcfg-vswitch -D "My Old Portgroup" vSwitch4”
    • List your vswitches – “esxcfg-vswitch -l”
    • List your vswif NICs – “esxcfg-vswif -l”
    • List your vmkernel NICs – “esxcfg-vmknic -l”
    • List your physical NICs – “esxcfg-nics -l”
    • Add a vswif to a portgroup – “esxcfg-vswif -a vswif7 -i <IP address> -n <netmask> -p "My New Portgroup"”
    • Move a vswif to a different portgroup – “esxcfg-vswif -p "new portgroup" vswif3”
    • Add a vmkernel NIC – “esxcfg-vmknic -a -i <IP address> -n <netmask> "My Other New Portgroup"”
    • Temporarily disable the firewall for troubleshooting purposes – “esxcfg-firewall --allowIncoming --allowOutgoing”
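
The link-first ordering from tip 6 is easy to get backwards, so here is a small sketch that prints the commands in the order that worked for me: -L first, then -N for each portgroup the vmnic should stay off. The function name and argument layout are made up for illustration, and it only prints the esxcfg-vswitch commands instead of running them.

```shell
#!/bin/sh
# Sketch of the ordering from tip 6. It only PRINTS esxcfg-vswitch commands.
# link_vmnic_then_prune is a made-up helper name, not an ESX tool.
# Usage: link_vmnic_then_prune VMNIC VSWITCH UNWANTED_PORTGROUP...
link_vmnic_then_prune() {
    vmnic=$1
    vswitch=$2
    shift 2
    # Link the vmnic to the vswitch first; this puts it on all portgroups.
    echo "esxcfg-vswitch -L $vmnic $vswitch"
    # Then take it off each portgroup where it is not wanted, one at a time.
    for pg in "$@"; do
        echo "esxcfg-vswitch -N $vmnic -p \"$pg\" $vswitch"
    done
}
```

For a switch with three portgroups where vmnic2 should end up on only one of them, pass the other two names as the unwanted portgroups, read the printed commands, and run them by hand on the service console once they look right.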

Well, this ends a pretty long post.  More to come as I progress through this.

VMworld2008 – Some personal thoughts…

I just want to add a little more about my impressions of the VMworld2008 conference in Las Vegas.  I have to say, it has been a long time since I have felt real intellectual intimidation and pressure to keep up, but I felt both in refreshingly uncomfortable quantities at this conference.  So much brain-power was actually a little unsettling, and I wondered if I would even fit in.

Luckily, I subscribe to the belief that it is better to be suspected to be a fool than to open one’s mouth and remove all doubt.  I hope my disguise worked…      ^______^

I had some serious job-envy going on, too, I hafta tell ya.  While I do have a comfortable and secure career, the notion of working at a place that not just allows you to fire on all 12 cylinders, but actually demands and encourages you to spark it up to 16 – well, that definitely tickles something way down deep in the lizard part of my brain.  Salary didn’t even come into consideration…

For the time being, I’ll just have to settle for the eight or so I get to use now – I have had it worse (think two, that’s right, 2!), and I know how to be thankful.  My plan is to know a lot more about virtualization in time for next year, so maybe keeping up will just be a slightly smaller effort on my part.

It could happen…

VMworld 2008…

This was my first real convention (actually, HIMSS in 2004, but I’d rather not count it). First day was horrible. I had not registered, had not a clue what to do, and discovered that after registering, I still needed to register for each session. I didn’t even know where to do that. To add to it, I had forgotten my username and password to the vmworld site – I had tried registering about a month earlier, but the process bombed out and I never went back for a second attempt.

I was ready to go home. Sure glad I gutted it out…

That afternoon, I finally found out my username and password, then went into the Schedule Builder and pieced it together. After that, it was easy, although I had to adjust my schedule several times. I didn’t make any sessions on day one, and barely got into my first the next day, but after that I pretty much got everything I wanted. I was very impressed with most of the sessions I went to, especially the breakout sessions on advanced troubleshooting of one thing or another.

The technical highlight for me was the NAS-NFS session presented by NetApp and Virginia Credit Union. Wow. I am definitely going to give that approach a shot. The morning presentations were just as amazing, though, with the demos of some of the products VMware has in the pipeline.

The real highlight seems to have been the party Wednesday night at the Las Vegas Speedway, however. Free food and booze, fast cars, loud music, and tons of silly games bound to appeal to hordes of drunken geeks – I can’t imagine anyone not having a good time there. I know I did, and I got to meet some very interesting people as well.

The only bummer is the fact that I completely suck at slots. Naturally, this knowledge only came after extensive (expensive) and painful research…

I will definitely try to make this an annual event – it’s just too good a learning session to pass up.

Mata ne!