Last week I installed VMware vSphere 5.5 on my test host, and today it was time to get the NetApp Virtual Storage Console 5.0 going so I could take advantage of Rapid Cloning and the other good stuff that VSC 5.0 includes.
Installation was straightforward (recommended read – Virtual Storage Console 5.0 for VMware® vSphere® – Installation and Administration Guide) and the next logical step was to add my Storage Systems so I could provision datastores etc. From within the VSC section in the vSphere Web Client I tried to add a new Storage System, only to be presented with the following:
“Unable to add storage systems due to insufficient privileges. You do not have sufficient permission to perform this action on: the root object. Contact your administrator to add the following missing privileges: Add, Modify, and Skip storage systems”
So here we are, a lovely Thursday morning at work, and a requirement for a new VM comes up – I’m thinking not a big deal since I have deployed thousands of VMs before, but there is a catch this time (there always is!). All of my Windows Server templates are virtual machine HW version 8 and I need to deploy one server to an ESXi 4.1 host – great! ESXi 4.1 supports HW version 7 at most, so HW version 8 will not work – if you attempt to add a HW version 8 VM to the inventory on an ESXi 4.1 host you will be met by the following outcome:
The VM adds fine and without any errors, but it’s grayed out and has an invalid status. There is not much you can do here apart from removing it from the inventory.
Following on from my last post, How to update mpt2sas driver on ESXi 5?, today we are going to look at updating network drivers for Broadcom and Intel NICs on a VMware ESXi host. The procedure documented below will work with any version of ESXi 4.x and 5.x.
Let’s start by listing all network interfaces in the “Up” state:
esxcfg-nics -l | grep Up
As you can see there are 10 network adapters in the “Up” state, which happens to be all of them on this host – 4 Broadcom 5709s and 6 Intel 82576s. The portion of the screenshot that we’re particularly interested in is just before the word “Up”, i.e. bnx2 and igb – these are the names of the drivers that ESXi is using for our network cards. Now that we have this established, let’s look at the versions of said drivers:
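As a side note, the driver names can be pulled out of that listing without reading a screenshot. What follows is only a minimal sketch: the function name up_nic_drivers and the sample text are made up for illustration (the PCI addresses and MACs are not from a real host), and it assumes the column order described above, with the driver in the third column and the link state in the fourth.

```shell
# Print the unique driver names of all NICs whose link is "Up".
# Reads `esxcfg-nics -l` style text on stdin: column 3 is assumed to be
# the driver and column 4 the link state, as described in the post.
up_nic_drivers() {
    awk '$4 == "Up" { print $3 }' | sort -u
}

# Illustrative sample only (made-up addresses, not real host output).
sample_nics='Name    PCI           Driver Link Speed    Duplex MAC Address       MTU  Description
vmnic0  0000:01:00.00 bnx2   Up   1000Mbps Full   00:00:00:00:00:01 1500 Broadcom NetXtreme II BCM5709
vmnic1  0000:01:00.01 bnx2   Up   1000Mbps Full   00:00:00:00:00:02 1500 Broadcom NetXtreme II BCM5709
vmnic4  0000:05:00.00 igb    Up   1000Mbps Full   00:00:00:00:00:03 1500 Intel 82576 Gigabit
vmnic5  0000:05:00.01 igb    Down 0Mbps    Half   00:00:00:00:00:04 1500 Intel 82576 Gigabit'

# Try it against the sample text.
printf '%s\n' "$sample_nics" | up_nic_drivers

# On a host the same filter would be fed from the real command:
#   esxcfg-nics -l | up_nic_drivers
```

The header row and any “Down” adapters fall through the filter because their fourth column is not “Up”, so only the drivers that are actually in use get listed.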
I had been getting “An error occurred installing the package. Windows Installer returned ‘1601’” while installing/updating VMware Tools on 1 VM (out of about 70) for some strange reason. Fast forward some time and it’s all apparent now:
Someone changed the startup type for the Windows Installer service to Disabled…
It’s one of those annoying things you really could do without on a Friday evening!!
Anyways, that’s that sorted – time to pack up and leave for the day.
Let’s dive straight into this error message, which happens to be affecting two of my VMs:
As you can see I’m unable to get the console window working! The exact message displayed across the top sheds some light on why this is happening – “Unable to connect to the MKS: Console access to the virtual machine cannot be granted since the connection limit of 0 has been reached.”
I know of at least three different ways to make this error message go away and get to the console window again.
I have been running an HP ProLiant MicroServer N36L for nearly a year now. It’s a great machine for the money and at around £120 (after the cash back) it was an absolute bargain at the time! The box itself has been upgraded to 8GB of DDR3 RAM (the maximum the motherboard can take) and it’s running ESXi 4.1 U1 absolutely fine. Disk space wise, there is only a 30GB Vertex SSD for a few VMs (and .vswp files); the rest of the storage is provided by a QNAP TS-509 NAS (by means of NFS and iSCSI). This setup has been absolutely flawless so far, but there is simply not enough RAM and CPU power for my needs (or rather my VMs’). CPU Ready goes through the roof quite often due to the AMD Athlon II Neo 1.30GHz processor, which is only a slightly better performer than the Intel Atom range. 8GB of RAM is tight and ESXi was paging the VMs like mad to .vswp files, which is why I put them on SSD, but that helped only to a certain extent. In the end I’d kinda had enough and decided to build a custom server which would address all of the issues above.
Here is what I came up with:
Processor: Intel Xeon X3450 2.66GHz with HT/VT-x and VT-d
Processor Cooling: Corsair CWCH100 Hydro Series H100 Cooler
Processor Cooling Fans: 2 x Noctua NF-P12
Motherboard: Supermicro X8SIL-F-O Server Board
RAM: Kingston 4 x 8GB [KVR1333D3Q8R9S/8G]
Case: Lian-Li PC-V350B
Case Noise Dampening: AcoustiPack LITE (APL) Multi-Layered Soundproof Material
Case Backplate: Custom backplate to incorporate moving PSU to the right and adding 120mm exhaust fan
5.25″ Drive Bay Cooling: Evercool Armour ATX HDD Cool Box HD-AR
5.25″ Drive Bay Cooling Fan: Noctua NF-R8-1800
Case Exhaust Fan [Back]: 1 x Noctua NF-P12
Case Exhaust Fan Guard: 120mm Standard Wire Case Fan Guard Grill [Black]
Power Supply: Be Quiet! BN180 L8 430W Modular PSU
Storage 1: Samsung 830 256GB SSD (main datastore)
Storage 2: Seagate Barracuda 2TB [ST2000DM001] (second datastore)
Storage 3: Vertex 1 30GB SSD (.vswp datastore)
Storage 4: Patriot Extreme Performance Xporter XT Rage 8GB (local storage for ESXi)
Storage Adapter Bracket: SilverStone SST-FP55B (allows 1 x 5.25″ and 2 x 2.5″ in one 5.25″ slot!)
Network 1: Onboard Dual Intel 82574L Gigabit Ethernet Controllers
Network 2: HP NC360T PCI Express Dual Port Gigabit Server Adapter [which effectively is Intel PRO/1000 PT Dual Port NIC]
Network 3: Intel Ethernet Converged Network Adapter X520-DA2, 10GbE, Dual Port
RAID Controller: IBM ServeRAID M1015 [which kinda is OEM version of LSI 9220-8i]
Project Update #1
ODD: Toshiba/Samsung TS-H653 20x DVD±RW DL SATA Drive
I will be updating this post as work on the server progresses! Stay tuned.
Project Update #2
Project Update #3
Project Update #4
Project Update #5
Project Update #6
Project Update #7
Closer look at what’s happening (or not happening as a matter of fact):
“Perf Charts service experienced an internal error. Message: Report application initialization is not completed successfully. Retry in 60 seconds.”
Now, this error has been around for as long as I can remember. There are many causes of it, but I will try to cover the one I have experienced (and solved).
Let’s get to it.
In vCenter 4.x this was never an issue; the charts stopped working after I upgraded my vCenter to version 5.0 Update 2. Generally you look at the vCenter log files (stats.log is what we’re after) to determine the root cause. The location of stats.log depends on the version of Windows and it’s as follows:
An interesting question, and even more interesting is why VMware would use such an archaic version of the mpt2sas driver in their fairly recent builds of ESXi. Quick background on why I’m writing about this.
I bought my IBM M1015 RAID controller from eBay for about £65, and since the M1015 is not supported by ESXi natively, cross-flashing was the only way to get it working without too much of a hassle (if you could call cross-flashing RAID controllers not too much hassle!). I went for IT mode as opposed to IR for simplicity and ease of adding drives without mucking about with virtual disks etc. I will write a separate post about how to cross-flash to IT/IR mode later on this week (if time permits).
Going back to my issue, here is what my IBM card looks like right now, cross-flashed to LSI 9211-8i in IT mode:
As you can see it’s running the latest available firmware (P15) and it’s in IT mode, meaning it’s simply doing straight pass-through for any connected hard drives. Once we’re booted into ESXi we can quickly list all HBAs and their driver names by issuing this command:
Here is a rather uninteresting-looking error message that pops up when you don’t have Syslog configured properly on ESXi 5.x. I have seen a few variations of this error but only have one screenshot at hand!
In a nutshell, the cryptic message says that you have logs configured on non-persistent storage and they will not survive a reboot of the host. If we look closely at the exact location, they are indeed configured to point at the ESXi scratch partition, i.e. /scratch/log:
There are at least three ways to get us out of trouble in this situation:
Use 3rd party Syslog server,
Use Syslog server that’s bundled with vCenter 5,
Use persistent storage to store your logs.
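For the third option, here is a rough sketch of what the change could look like from the ESXi 5.x shell. Treat it as an illustration rather than the exact procedure: the helper name syslog_logdir_cmds and the datastore path are hypothetical, and the helper only prints the esxcli commands so they can be reviewed before anything is run on a host. It refuses /scratch paths purely because that is the non-persistent location in the situation described above.

```shell
# Print (not run) the esxcli commands that would point host logging at a
# persistent directory. The directory is supplied by the caller.
syslog_logdir_cmds() {
    logdir="$1"
    case "$logdir" in
        /scratch|/scratch/*)
            # In the scenario from this post, /scratch is the location
            # that does not survive a reboot, so reject it.
            echo "refusing: $logdir is on the scratch partition" >&2
            return 1
            ;;
    esac
    echo "esxcli system syslog config set --logdir=$logdir"
    echo "esxcli system syslog reload"
    echo "esxcli system syslog config get"
}

# Hypothetical datastore path, used here only for illustration.
syslog_logdir_cmds /vmfs/volumes/datastore1/logs
```

The last printed command simply reads the configuration back, which is a cheap way to confirm that the new log directory has taken effect after the reload.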
Today was meant to be just another ordinary day in the office, and for the most part it was. Around 1PM I noticed that the 2AM NetApp VSC backup job was still running… Bit odd, I thought, as this had never happened before – it’s normally done in 30 minutes tops. The vSphere client was showing 1 VM in recent tasks as being in progress. Hmm, so what’s up with that VM then? It looked completely stuck: I couldn’t edit settings, power off, reset etc. Nothing worked. The Tasks and Events tab explained the situation a bit better:
So basically the backup started and got stuck while taking snapshots because it was unable to quiesce the file system. Beyond this point the vSphere client is pretty much useless, so it was time to hit the command line via SSH to get me out of trouble. First you need to know the name of your stuck VM; it doesn’t have to be letter for letter as you can simply search for it using grep in the list of active processes on the ESXi host. My VM had ‘STD’ in its name (aka the Standard flavor of Windows Server 2012) and to find the actual PID I had to run the following command:
To kill the process (the one that runs your VM), it’s simply the kill command followed by the PID, in my case:
Now it should be gone. A quick check for the PID that we killed shows there is no such PID anymore – good.
At this point my VSC job simply timed out and moved on to back up the other VMs in the datastore.
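For what it’s worth, on ESXi 5.x there is also an esxcli route to the same result, which is a different technique from the grep and kill used above. The sketch below only shows the lookup half: the helper name vm_world_ids and the sample text are made up for illustration, and it assumes the `esxcli vm process list` layout as I remember it, with a “World ID:” line appearing before the “Display Name:” line in each VM’s block.

```shell
# Print the World ID of every VM whose display name contains the pattern.
# Reads `esxcli vm process list` style text on stdin; assumes "World ID:"
# comes before "Display Name:" within each VM's block.
vm_world_ids() {
    awk -v pat="$1" '
        /^ *World ID:/     { wid = $NF }
        /^ *Display Name:/ {
            name = $0
            sub(/^ *Display Name: */, "", name)
            if (index(name, pat) > 0) print wid
        }
    '
}

# Illustrative sample only (made-up names and IDs, fields trimmed).
sample_vms='WIN2012-STD-01
   World ID: 1234567
   Process ID: 0
   VMX Cartel ID: 1234566
   Display Name: WIN2012-STD-01
   Config File: /vmfs/volumes/datastore1/WIN2012-STD-01/WIN2012-STD-01.vmx

LINUX-01
   World ID: 7654321
   Process ID: 0
   VMX Cartel ID: 7654320
   Display Name: LINUX-01
   Config File: /vmfs/volumes/datastore1/LINUX-01/LINUX-01.vmx'

# Try it against the sample text.
printf '%s\n' "$sample_vms" | vm_world_ids STD

# On a host this might look like:
#   esxcli vm process list | vm_world_ids STD
#   esxcli vm process kill --type=soft --world-id=<id printed above>
```

Starting with the soft kill type gives the VM process a chance to shut down cleanly; the hard and force types are there for when that does not work.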