Sunday, September 27, 2015

ESXi 5.5 Upgrade to 6: Invalid argument when creating vfat file system

Today, when we tried to upgrade one of our ESXi hosts, we hit a strange error.

There was an issue with one of the partitions, and the ESXi installer was unable to reformat it for the upgrade, possibly because of a previous misconfiguration of the partitions.

VMware KB says: "The upgrade attempts to reformat the scratch partition (partition #2) with a VFAT filesystem. This most likely fails, as the maximum size allowed for VFAT is 4GB, and most VMFS datastores are larger than 4GB. A 4GB datastore may potentially be erased."

Note: You can find more details on this issue in VMware KB 2015828.

So we need to investigate what type of partition this is and correct the problem before re-running the upgrade.

This task can be done during the upgrade (press Alt+F1 to drop to the console), or we can cancel the upgrade and reboot the server to boot normally.

Unfortunately, in this case that was not possible: the upgrade restarted, ESXi entered a boot loop and never reached a normal boot. So the options were to fix the problem during the upgrade, or to roll back the upgrade and boot ESXi normally.

I decided to cancel and roll back the upgrade. To do this, I needed to start ESXi in recovery mode.

While ESXi is booting, press Shift+R (check the image).

After the recovery-mode rollback and restart, ESXi will boot normally and we can start troubleshooting.

For details on recovery mode, check VMware KB 1033604.

After connecting to the ESXi console (over SSH), I started checking the devices and partitions.

Since this was an upgrade from ESXi 5.5 to 6.0, I used the esxcfg-scsidevs command to identify the device behind the partition reported in the error (mpx.vmhba32:C0:T0:L0:2).
# esxcfg-scsidevs -c
Device UID                            Device Type      Console Device                                            Size      Multipath Plugin  Display Name
mpx.vmhba32:C0:T0:L0                  Direct-Access    /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0                  7600MB    NMP     Local USB Direct-Access (mpx.vmhba32:C0:T0:L0)
naa.600508b1001c4118e0e0f8b71eb9b654  Direct-Access    /vmfs/devices/disks/naa.600508b1001c4118e0e0f8b71eb9b654  1144609MB NMP     HP Serial Attached SCSI Disk (naa.600508b1001c4118e0e0f8b71eb9b654)
naa.60a980002d676739503f426f7675504d  Direct-Access    /vmfs/devices/disks/naa.60a980002d676739503f426f7675504d  768062MB  NMP     NETAPP iSCSI Disk (naa.60a980002d676739503f426f7675504d)
As we can see in the error image, the device preventing the upgrade is mpx.vmhba32:C0:T0:L0, so we need to check partition #2 on this device.

We can use partedUtil to check this:
# partedUtil getptbl /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
968 255 63 15564800
1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128
5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
2 15357952 15562751 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0 
The problem is with partition #2, which is identified as a vmkDiagnostic partition. The vmkDiagnostic partition is the coredump partition, so that is what we need to fix.

To check the coredump partitions, run esxcli system coredump partition list:
# esxcli system coredump partition list

Name                    Path                                        Active  Configured
----------------------  ------------------------------------------  ------  ----------
mpx.vmhba32:C0:T0:L0:2  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:2    true        true
mpx.vmhba32:C0:T0:L0:7  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:7   false       false
There are two coredump partitions on this ESXi host: one active and one inactive.

A previous ESXi installation had left the partitions badly configured.
So there is nothing to repair here; we just need to delete the partition that is blocking the upgrade and then run the upgrade again.

To delete the partition, we use partedUtil again.

# partedUtil delete /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0 2
Error: Read-only file system during write on /dev/disks/mpx.vmhba32:C0:T0:L0
Unable to delete partition 2 from device /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
Since this is the active coredump partition, it is read-only and cannot be deleted until we disable the coredump configuration:
# esxcli system coredump partition set --enable false
Let's check the coredump partitions again:
# esxcli system coredump partition list
Name                    Path                                        Active  Configured
----------------------  ------------------------------------------  ------  ----------
mpx.vmhba32:C0:T0:L0:2  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:2   false        true
mpx.vmhba32:C0:T0:L0:7  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:7   false       false
Now that the coredump partition is disabled, we can delete it:
# partedUtil delete /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0 2

Checking the partitions again, we can confirm that partition #2 was deleted.
# esxcli system coredump partition list

Since partition #7 was also configured as a coredump partition on this ESXi host, I deleted it as well; after the upgrade we can create a new one, or the upgrade itself will create one.
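If you want to recreate a coredump partition after the upgrade, it can also be done from the command line. A minimal sketch, assuming a host where the esxcli system coredump namespace is available (ESXi 5.x/6.x); the --smart option lets ESXi pick the best accessible partition by itself:

```shell
# Let ESXi choose and activate a suitable coredump partition automatically
esxcli system coredump partition set --enable true --smart

# Verify that a partition is now active and configured
esxcli system coredump partition list
```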

Now we can re-run the upgrade, and it will finish without any issues.

I hope this article helps you fix this issue if you encounter it during your ESXi upgrade.

Monday, September 21, 2015

vCenter 6.0: Quick stats on hostname is not up-to-date

Today we encountered some issues on our upgraded ESXi 6.0 hosts.

Some of our hosts showed the error "Quick stats on hostname is not up-to-date".

An example:

This is a known issue in 5.0 and 5.5 that was fixed in updates, but it is still present in vCenter 6.0.

The workaround for this issue is to add these quickStats parameters to the Advanced Settings of vCenter Server:

  • vpxd.quickStats.HostStatsCheck
  • vpxd.quickStats.ConfigIssues
Note: Adding these parameters to vCenter Server does not affect future upgrades.

So let's add these parameters to the vCenter Advanced Settings.

Connect to your vCenter using the vSphere Client and go to Administration > vCenter Server Settings.

Add the first key (vpxd.quickStats.HostStatsCheck) with the value "False".

Click "Add", then add the second parameter (vpxd.quickStats.ConfigIssues) the same way, also with the value "False".

After adding both parameters, restart the vCenter Server services.

Then all the messages should go away and the problem is fixed.
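How you restart the vCenter services depends on your deployment. A hedged sketch for the vCenter Server Appliance 6.0, using its standard service-control tool (on a Windows vCenter, restart the "VMware VirtualCenter Server" service from the Services console instead):

```shell
# On the VCSA 6.0 shell: restart the vCenter Server service (vpxd)
service-control --stop vmware-vpxd
service-control --start vmware-vpxd
```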

Note: For more detailed information about this issue, please read VMware KB 2061008.

I hope this article helps you fix this issue.

Thursday, September 3, 2015

ESXi: How to upgrade to ESXi 6.x using VMware Update Manager

After upgrading our vCenter from 5.0 to 6.x, we now need to upgrade our ESXi hosts. We can boot the host directly from an ESXi ISO image and go through the upgrade manually, or we can use VMware Update Manager to do it automatically. We will explain both processes in this article.

Note: Before we start, again I need to mention that for security reasons I erased some of the information in some of the images, since they were taken in a production environment.

1- Upgrade to ESXi using VMware Update Manager(VUM):
First we need to upload our ESXi 6.0 ISO file to Update Manager so that VUM can use this image to upgrade our ESXi hosts. Connect to your vCenter using the vSphere Client and go to VUM (Home - Solutions and Applications - Update Manager).

Note: To use VMware Update Manager to upgrade or patch ESXi hosts, we need to use the vSphere Client, not the Web Client (the Web Client doesn't have the Remediate button). Choose the "ESXi Images" tab and click "Import ESXi Image".

Just choose your ISO image and upload it to VUM. In our case, since these are HP hosts, I will add an HP-customized ESXi image.

Now the upload process will start. Wait until it finishes successfully.

After the import finishes, you need to create a Baseline (if you don't already have one in ESXi Images) to attach to the hosts that will use this upgrade.

After uploading the image and creating the upgrade Baseline, we now need to attach this new Baseline to our servers or cluster.

Go back to Hosts and Clusters and choose the Update Manager tab. Select your cluster, or the hosts you want to attach to this Baseline (in this case the hosts we will upgrade), then click Attach (upper right corner). In this case we selected the cluster (and consequently all the hosts inside it will be attached).

Then select the Baseline you want to attach. In this case it is the one we created while importing the ISO image, named ESXi 6.0. Click the Attach button.

After the cluster/hosts are attached to the Baseline, we can start the upgrade.
Select the same cluster again (or the host, if you attached by host) and click the Remediate button in the lower right corner.

Next, select Upgrade Baselines, select the proper Baseline (in this case we only have the one we created, ESXi 6.0, and all hosts will be displayed), and click Next.

Next, just check all the information and change what you need for your environment (like moving VMs before the upgrade, etc. In this case all VMs were powered down and the hosts were already in Maintenance Mode).

In our case we had this warning: since this cluster has some HP G7, G8 and G9 hosts, the upgrade is recommending that we enable Enhanced vMotion Compatibility (EVC) in this cluster.

If you are not so familiar with VMware EVC, you should take a look HERE

Then just click Finish to start the upgrade process.

If you look at the tasks (at the bottom of your vSphere Client), you should see the upgrade task running.

After this, VUM will upgrade the hosts one by one.

In this case, out of 8 hosts, 2 did not finish properly, so I decided to upgrade those manually.

2 - Upgrade to ESXi 6.0 using Boot ISO image:

In our case we have an HP G7 to upgrade, so we will use iLO to do this.

Choose and add the ISO image to the Virtual Drives of your iLO console.

Choose the ISO image.

Reboot or power on the server, then press F11 to open the Boot Menu.

Choose option 1 to boot from the CD-ROM.

Then start the ESXi install/upgrade.

The installer will start by scanning for any previous installations and disks.

In this case this HP G7 has one 4GB SD Card (where ESXi 5.5 is installed) and 2 local RAID volumes of 300GB and 1TB. Since we have some VMs on the local volumes, we need to leave them untouched so that we don't lose any VMs or data.

We will select the SD Card.

But before pressing Enter, we press F1 to check the details of the SD Card. This shows information about the ESXi installed on this volume. If we are in doubt about where ESXi is installed, we can press F1 on each volume to see its details.

After confirming that ESXi 5.5 is on this volume, we select it, press Enter and continue with the upgrade.

When the installer recognizes a previous ESXi installation on that volume, we have the option to do a fresh install (which will format the volume and do a clean install) or to select the Upgrade option. In our case we are upgrading, so let's select Upgrade and continue.

Confirm by pressing F11 and continue.

The upgrade process will then run to completion.

After it finishes, just remove the ISO image from the iLO Virtual Drive; otherwise the host will boot from the ISO again into the installer.

And now ESXi is upgraded to ESXi 6.0.

Note: Whether you use VUM or a manual ISO upgrade, sometimes after the upgrade the host cannot connect to vCenter: it stays grayed out and shows as disconnected in the vCenter host list. In this case, just right-click the host (wait around 5 minutes after the host is powered up) and select Connect (this forces the host to reconnect to vCenter).
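If reconnecting from vCenter is not enough, restarting the management agents on the host usually resolves the disconnected state. A minimal sketch over SSH, assuming the standard ESXi init scripts:

```shell
# Restart the host agent and the vCenter agent on the ESXi host
/etc/init.d/hostd restart
/etc/init.d/vpxa restart
```

After the agents come back up, try the Connect option in vCenter again.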

I hope this article helps you upgrade from ESXi 5.5 (or 5.0) to ESXi 6.0 using either option.