Friday, December 4, 2015

ESXi 6.0 - CBT bug and fix (fix VMs and reset CBT) Part II

Continuing with the ESXi 6.0 CBT (Changed Block Tracking) issue: for backups and restores to work properly, CBT needs to be reset. This is a mandatory task for every VM that is part of a backup job, regardless of the backup tool you use for your VMware environment.

Again, in our case we use Veeam, and Veeam has a script that resets CBT on all VMs that are powered on and have no snapshots (powered-off VMs and VMs with snapshots are excluded from the list of VMs that get the CBT reset).

What does this CBT fix do in the VMs (regardless of the guest OS)? It resets CBT so that backups can start from zero and the incremental backups are consistent again.

Tasks performed by this script:

•    Disable CBT in the VM
•    Create a VM snapshot
•    Remove VM Snapshot
•    Enable CBT in the VM

We can download the Veeam script HERE

But in our case there was an issue with one line of code in that script. The script could not identify the VMs with CBT enabled; the VM list was null.

After creating a small and simple script to test whether CBT is enabled or disabled in a VM, I found the problem in the Veeam script.

This is my script:
# List every VM in the cluster and report whether CBT is enabled
$VMCBT = Get-Cluster "Cluster Test" | Get-VM

foreach ($VMCBTs in $VMCBT) {
    if ($VMCBTs.ExtensionData.Config.ChangeTrackingEnabled -eq $true) {
        Write-Host $VMCBTs.Name ' CBT is enabled'
    } else {
        Write-Host $VMCBTs.Name ' CBT is disabled'
    }
}
In the Veeam script I found the issue in this line:
$ivms = get-vm | ?{$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true};
With this line, $ivms always came back null in our environment; the filter did not pick up whether a VM had CBT enabled or not.

Rewriting the line with an explicit where condition fixed the issue, and all VMs with CBT enabled were listed.
$ivms = Get-Cluster "Cluster Test" | Get-VM | where {$_.ExtensionData.Config.ChangeTrackingEnabled -eq $true};
Since I run the script per cluster and not against the whole vCenter, I also added the Get-Cluster option.
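
To make the selection rule explicit, here is a minimal Python sketch of which VMs the reset applies to, as described in this post: powered on, no snapshots, CBT enabled. The Vm class, its field names and the inventory are hypothetical stand-ins for what PowerCLI returns; this is not the Veeam script itself.

```python
from dataclasses import dataclass


@dataclass
class Vm:
    """Hypothetical stand-in for the VM properties the filter looks at."""
    name: str
    powered_on: bool
    snapshot_count: int
    cbt_enabled: bool


def vms_to_reset(vms):
    """Return the names of VMs eligible for a CBT reset.

    Eligible means: powered on, no snapshots, and CBT currently enabled.
    """
    return [
        vm.name
        for vm in vms
        if vm.powered_on and vm.snapshot_count == 0 and vm.cbt_enabled
    ]


# Illustrative inventory (made-up names).
inventory = [
    Vm("vm-app01", powered_on=True, snapshot_count=0, cbt_enabled=True),
    Vm("vm-db01", powered_on=True, snapshot_count=2, cbt_enabled=True),
    Vm("vm-old01", powered_on=False, snapshot_count=0, cbt_enabled=True),
    Vm("vm-test01", powered_on=True, snapshot_count=0, cbt_enabled=False),
]

print(vms_to_reset(inventory))
```

Only the first VM passes all three checks. An empty result like the one we hit means no VM matched the CBT condition, which is worth verifying with a simple per-VM test before trusting the list.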

Hope this helps you fix CBT on your VMs as well.

Wednesday, November 25, 2015

ESXi 6.0 - CBT bug and fix

In the last weeks another bug was found in the CBT feature in ESXi 6.0 and ESXi 6.0 Update 1.

The bug affects all backups in VMware virtual environments: any tool that uses CBT (used for incremental backups) in its backups is affected.

VMware note:

"When running incremental virtual machine backups, backup applications typically rely on the vSphere API call QueryDiskChangedAreas() to determine the changed sectors.

This issue occurs due to a problem with CBT in the disklib area, which results in the change tracking information of I/Os that occur during snapshot consolidation to be lost. The main backup payload data is never lost and it is always written to the backend device. However, the corresponding change tracking information entries which occur during the consolidation task are missed. Subsequent QueryDiskChangedAreas() calls do not include these missed blocks and, therefore a backup based on this CBT data is inconsistent." 
To work around this issue, we need to disable CBT in the VMs, or just deselect the use of CBT in the backup jobs.
  • How to disable CBT in Veeam Backup & Replication:
             Edit the job, go to the Storage menu, choose Advanced and, on the vSphere tab, disable the CBT option.

The other option was to downgrade to ESXi 5.5 and VM hardware version 10. Of course, this is not a solution for us.

In our environment this has a huge impact. As we all know, CBT is mainly used for incremental backups; without it the changes are not tracked and every backup runs as if it were a full backup.

With CBT, a backup only copies the changes that happened in the VMs since the last backup.
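
To illustrate why lost change-tracking entries make an incremental backup inconsistent, here is a small Python sketch. It is a toy model only: the "disk" is a list of blocks and the tracking data is a set of block numbers, both assumptions made for illustration and not how ESXi stores CBT data.

```python
def incremental_backup(disk, backup, changed_blocks):
    """Copy only the blocks that change tracking reports as changed."""
    for block in changed_blocks:
        backup[block] = disk[block]


# A tiny "disk" of 6 blocks and a full backup taken at the start.
disk = ["a", "b", "c", "d", "e", "f"]
backup = list(disk)

# Healthy case: every write is recorded in the tracking set.
tracked = set()
for block, data in [(1, "B"), (4, "E")]:
    disk[block] = data
    tracked.add(block)
incremental_backup(disk, backup, tracked)
healthy_ok = backup == disk

# Faulty case: a write lands on disk, but its tracking entry is lost
# (modelling an I/O that happens during snapshot consolidation).
tracked = set()
disk[2] = "C"    # data is written, but nothing is recorded
disk[5] = "F"
tracked.add(5)   # only this write is tracked
incremental_backup(disk, backup, tracked)
faulty_ok = backup == disk

print(healthy_ok, faulty_ok)
```

In the faulty case the data on disk is fine, but the backup silently keeps the old content of block 2, which matches VMware's description: the payload is written, only the tracking entry is missed.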

In our case we use Veeam Backup & Replication, and jobs that normally take 1-2 hours to finish are now taking more than 3-4 hours. Others with more VMs, which take around 6-7 hours, are now taking more than 18 hours. With this, our daily backups cannot finish within a 24-hour cycle, which has a huge impact on our environment.

Plus, the size of each backup increases. When you already have around 20 TB of data in the backup repository, this matters: in the last days I needed to add 2 TB more to our backup repository because of this problem.

Of course this is a huge problem for many environments. Not only are the backup cycles not running properly (with all the issues for any restores that may be needed), but the size of the backups also increases.

UPDATE 27-11-2015: Finally, in the last days VMware has released a fix for this bug.
We will implement it this coming weekend and hope that it fixes the problem.

More information about the patch in VMware HERE

Hope this article helps you understand this bug and the fix.

Sunday, September 27, 2015

ESXi 5.5 Upgrade to 6: Invalid argument when creating vfat file system

Today, when we tried to upgrade one of our ESXi hosts, we got a strange error.

There was an issue with one of the partitions, and somehow the ESXi installer was not able to reformat the partition for the upgrade, maybe because of a previous wrong partition configuration.

VMware KB says: "The upgrade attempts to reformat the scratch partition (partition #2) with a VFAT filesystem. This mostly likely fails, as the maximum size allowed for VFAT is 4GB, and most VMFS datastores are larger than 4GB. A 4GB datastore may potentially be erased."

Note: You can check this issue in the VMware KB: KB2015828

So we need to investigate which type of partition this is and correct the problem before re-running the upgrade.

This can be done during the upgrade (with Alt+F1 we can get to the console), or we can cancel the upgrade and reboot the server to start normally.

But unfortunately in this case a normal reboot was not possible: the upgrade restarted, ESXi went into a loop and never got to a normal boot. So the options were to correct the problem during the upgrade, or to roll back the upgrade and start ESXi with a normal boot.

I decided to cancel and roll back the upgrade. To do this, I needed to start ESXi in recovery mode.

While ESXi was starting, I pressed SHIFT+R (check image).

After the recovery mode restart, ESXi starts normally and we can begin troubleshooting.

Check VMware KB for recovery mode: KB1033604

Connected to the ESXi console (with SSH), I started to check the devices/partitions.

Since this was an upgrade from ESXi 5.5 to 6.0, I used the esxcfg-scsidevs command to check the device and partition reported in the error (mpx.vmhba32:C0:T0:L0:2).
# esxcfg-scsidevs -c
Device UID                            Device Type      Console Device                                            Size      Multipath Plugin  Display Name
mpx.vmhba32:C0:T0:L0                  Direct-Access    /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0                  7600MB    NMP     Local USB Direct-Access (mpx.vmhba32:C0:T0:L0)
naa.600508b1001c4118e0e0f8b71eb9b654  Direct-Access    /vmfs/devices/disks/naa.600508b1001c4118e0e0f8b71eb9b654  1144609MB NMP     HP Serial Attached SCSI Disk (naa.600508b1001c4118e0e0f8b71eb9b654)
naa.60a980002d676739503f426f7675504d  Direct-Access    /vmfs/devices/disks/naa.60a980002d676739503f426f7675504d  768062MB  NMP     NETAPP iSCSI Disk (naa.60a980002d676739503f426f7675504d)
As we can see in the error image, the device that is preventing the upgrade is mpx.vmhba32:C0:T0:L0. So I need to check partition #2 on this device.

So we need to use partedUtil to check this.
# partedUtil getptbl /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
968 255 63 15564800
1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128
5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
2 15357952 15562751 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0 
The problem is in partition #2, which is identified as a vmkDiagnostic partition. The vmkDiagnostic partition is the coredump partition, so we need to fix the coredump partition.
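
To read that partedUtil table more easily, here is a minimal Python sketch that parses the output shown above and lists the vmkDiagnostic partitions with their sizes. It assumes each partition row has six fields (number, start sector, end sector, type GUID, type name, attribute) and 512-byte sectors; the sector size is an assumption, not something the output states.

```python
SECTOR_BYTES = 512  # assumption: 512-byte sectors

PTBL = """\
968 255 63 15564800
1 64 8191 C12A7328F81F11D2BA4B00A0C93EC93B systemPartition 128
5 8224 520191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
6 520224 1032191 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
7 1032224 1257471 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
8 1257504 1843199 EBD0A0A2B9E5443387C068B6B72699C7 linuxNative 0
2 15357952 15562751 9D27538040AD11DBBF97000C2911D1B8 vmkDiagnostic 0
"""


def parse_ptbl(text):
    """Parse partition rows into dicts with number, type and size in MiB."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6:
            continue  # skips the geometry line (4 fields) and blank lines
        number, start, end, _guid, ptype, _attr = fields
        sectors = int(end) - int(start) + 1
        size_mib = sectors * SECTOR_BYTES // (1024 * 1024)
        rows.append({"number": int(number), "type": ptype, "size_mib": size_mib})
    return rows


partitions = parse_ptbl(PTBL)
diagnostic = [p for p in partitions if p["type"] == "vmkDiagnostic"]
for p in diagnostic:
    print(p["number"], p["size_mib"], "MiB")
```

Under those assumptions, partitions 7 and 2 are both reported as vmkDiagnostic, and partition 2 comes out at 100 MiB at the very end of the device.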

To check the coredump partitions, run esxcli system coredump partition list:
# esxcli system coredump partition list

Name                    Path                                        Active  Configured
----------------------  ------------------------------------------  ------  ----------
mpx.vmhba32:C0:T0:L0:2  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:2    true        true
mpx.vmhba32:C0:T0:L0:7  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:7   false       false
I get 2 coredump partitions on this ESXi host: one active and the other not active.
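
If you want to pick out the active partition from that listing in a script, a minimal Python sketch could look like the following. It works on a saved copy of the output above and assumes the four-column layout shown there (Name, Path, Active, Configured); the function name is made up for this example.

```python
COREDUMP_LIST = """\
Name                    Path                                        Active  Configured
----------------------  ------------------------------------------  ------  ----------
mpx.vmhba32:C0:T0:L0:2  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:2    true        true
mpx.vmhba32:C0:T0:L0:7  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:7   false       false
"""


def active_coredump_partitions(text):
    """Return the names of partitions whose Active column is 'true'."""
    active = []
    for line in text.splitlines()[2:]:  # skip the header and separator rows
        fields = line.split()
        if len(fields) == 4 and fields[2] == "true":
            active.append(fields[0])
    return active


print(active_coredump_partitions(COREDUMP_LIST))
```

The active entry is the one to watch: an active coredump partition is in use by the host, so it has to be deactivated before it can be removed.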

A previous ESXi installation was not configured very well.
So there is nothing to fix here: just delete the partition that is blocking the upgrade and re-run the upgrade.

To delete the partition we need to use the partedUtil again.

# partedUtil delete /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0 2
Error: Read-only file system during write on /dev/disks/mpx.vmhba32:C0:T0:L0
Unable to delete partition 2 from device /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0
Since this is the active coredump partition and therefore read-only, we cannot delete it before we disable the coredump.
# esxcli system coredump partition set --enable false
Let's check the coredump partitions again.
# esxcli system coredump partition list
Name                    Path                                        Active  Configured
----------------------  ------------------------------------------  ------  ----------
mpx.vmhba32:C0:T0:L0:2  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:2   false        true
mpx.vmhba32:C0:T0:L0:7  /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0:7   false       false
Now the coredump partition is disabled, so we can delete the partition.
# partedUtil delete /vmfs/devices/disks/mpx.vmhba32:C0:T0:L0 2

Checking the partitions again, we can confirm that #2 was deleted.
# esxcli system coredump partition list

Since partition #7 is also a vmkDiagnostic partition on this host, I deleted it as well; after the upgrade we can create a new one, or the upgrade itself will create a new one.

Now we can re-run the upgrade, and it will finish without any issues.

Hope this article helps you fix this issue if you encounter it in your ESXi upgrade.

Monday, September 21, 2015

vCenter 6.0: Quick stats on hostname is not up-to-date

Today we encountered some issues on our upgraded ESXi 6.0 hosts.

Some of our hosts showed the error "Quick stats on hostname is not up-to-date".

An example:

This is a known issue in 5.0 and 5.5 that was fixed in their updates, but it is still present in vCenter 6.0.

The workaround is to add these quickStats parameters to the Advanced Settings of vCenter Server:

  • vpxd.quickStats.HostStatsCheck
  • vpxd.quickStats.ConfigIssues
Note: Adding these parameters to vCenter Server does not affect future upgrades.

So let's add these parameters to the vCenter Advanced Settings.

Connect to your vCenter using the vSphere Client and go to Administration > vCenter Server Settings.

Add the first key (vpxd.quickStats.HostStatsCheck) with the value "False".

Click "Add" and then add the second parameter (vpxd.quickStats.ConfigIssues).

After adding both parameters, restart your vCenter Server services.

Then all the messages should go away and the problem is fixed.

Note: For more detailed information about this issue, please read KB-2061008 from VMware.

Hope this article helps you fix this issue.

Thursday, September 3, 2015

ESXi: How to upgrade to ESXi 6.x using VMware Update Manager

After upgrading our vCenter from 5.0 to 6.x, we now need to upgrade our ESXi hosts. We can boot the host directly from an ESXi ISO image and go through the upgrade manually, or we can use VMware Update Manager to do this automatically. We will explain both processes in this article.

Note: Before we start, I again need to mention that, for security reasons, I had to erase some of the information in some images, since I took these images in the production environment.

1 - Upgrade to ESXi 6.0 using VMware Update Manager (VUM):
First we need to upload our ESXi 6.0 ISO file to Update Manager so that VUM can use this image to upgrade our ESXi hosts. Connect to your vCenter using the vSphere Client and go to VUM (Home - Solutions and Applications - Update Manager).

Note: To use VMware Update Manager to upgrade or patch ESXi hosts, we need to use the vSphere Client and not the Web Client (the Web Client doesn't have the Remediate button).

Choose the "ESXi Images" tab and click "Import ESXi Image".

Just choose your ISO image and upload it to VUM. In our case, since these are HP hosts, I will add an HP ESXi image.

Now the upload process will start. Wait until it finishes successfully.

After the import is finished you need to create a Baseline (if you don't have any in ESXi Images) that you then attach to the hosts that will use this upgrade process.

After we upload the image and create the upgrade Baseline, we now need to attach this new Baseline to our servers or cluster.

We go back to Hosts and Clusters and choose the Update Manager tab. Select the cluster or hosts that you want to attach to this Baseline (in this case the hosts that we will upgrade), then click Attach (upper right corner). In this case we chose to select the cluster (and consequently all the hosts inside will be attached).

Then select the Baseline you want to attach. In this case it is the one we created while importing the ISO image, named ESXi 6.0. Click the Attach button.

After the cluster/hosts are attached to the Baseline we can start the upgrade.
Select the same cluster again (or the host, if you attached by host) and click the Remediate button in the lower right corner.

Next select Upgrade Baselines and then select the proper Baseline (in this case we only have the one we created, ESXi 6.0, and all hosts will be displayed) and then click Next.

Next, just check all the information and change what you need for your environment (like moving VMs before the upgrade, etc. In this case all VMs were powered down and the hosts were already in Maintenance Mode).

In our case we got this warning. Since this cluster has a mix of HP G7, G8 and G9 servers, the upgrade informs us that we should enable Enhanced vMotion Compatibility (EVC) in this cluster.

If you are not familiar with VMware EVC, you should take a look HERE.

Then just click Finish to start the upgrade process.

If you look at the tasks (at the bottom of your vSphere Client), you should see the upgrade task running.

After this, VUM will upgrade the hosts one by one.

In this case, out of 8 hosts, 2 did not finish properly. So I decided to upgrade them manually.

2 - Upgrade to ESXi 6.0 using Boot ISO image:

In our case we have an HP G7 to upgrade, so we will use iLO to do this.

Choose and add the ISO image to the Virtual Drives of your iLO console.

Choose the ISO image.

Reboot or power on the server and then press F11 to open the Boot Menu.

Choose 1 to boot from the CD-ROM

Then start to install/upgrade ESXi.

The installer will start scanning for any previous installations and disks.

In this case this HP G7 has one SD card with 4 GB (where ESXi 5.5 is installed) and 2 local RAID volumes with 300 GB and 1 TB. Since we have some VMs on the local volumes, we need to leave these untouched so that we don't lose any VMs or data.

We will select the SD card.

But before we press Enter, we will press F1 to check the details of the SD card. This gives us the information about the ESXi that is installed on this volume. If we have doubts about where ESXi is installed, we can press F1 on each volume to see its details.

After confirming that ESXi 5.5 is on this volume, we select the volume, press Enter and continue with the upgrade.

After the installer recognizes that there is a previous ESXi installation on that volume, we have the option to do a fresh install (which will format and do a clean install) or to select the Upgrade option. In our case we are upgrading, so let's select Upgrade and continue.

Confirm by pressing F11 and continue.

After this, the upgrade process will start and finish.

After it is finished, just remove the ISO image from the iLO Virtual Drive; if not, the host will boot again from the ISO into the installation process.

And now ESXi is upgraded to ESXi 6.0.

Note: Whether you use VUM or a manual ISO image to upgrade your hosts, sometimes after an upgrade the host cannot connect to vCenter: it stays grayed out and shows as disconnected in the vCenter host list. In this case, wait around 5 minutes after the host is powered up, then right-click the host and select Connect (this forces the host to reconnect to vCenter).

Hope this article helps you upgrade from ESXi 5.5 (or 5.0) to ESXi 6.0 with either option.

Wednesday, August 26, 2015

vCenter: How to Upgrade Windows vCenter 5.0 to vCenter 6.x

We had a vCenter 5.0 (Windows 2008 R2) that needed to be upgraded to 6.0, hosts included. This was a small vCenter, and since there was no point in creating a new vCenter 6.0 from scratch, we decided to upgrade directly from v5.0 to v6.0.

Since this vCenter used SQL Express, the upgrade will also migrate our vCenter SQL DB to vPostgreSQL (the new integrated vCenter DB used in vCenter 6.0).

First we should always take a backup of our vCenter (in our case a full backup of the VM with Veeam Backup). If you don't have a backup infrastructure, you should at least take a snapshot of the VM before starting any upgrade.

In vCenter 6.0 there are two installation (and upgrade) options:
  • Embedded Deployment Model (all services in the same VM with an integrated DB, in this case vPostgreSQL):
    1. vCenter Server with an embedded Platform Services Controller
  • External Deployment Model (vCenter in one VM and the Platform Services Controller, with services such as SSO, in a second VM):
    1. External Platform Services Controller
    2. vCenter Server with external Platform Services Controller

Since this is a small vCenter, we decided to install everything in the same VM, as we had before the upgrade.

Note: For a normal vCenter, we always go for the External Deployment Model, where we put vCenter in one VM, the vCenter services in another VM, and the DB in a different VM (this can be a SQL Server hosting several vCenter DBs or just one; we have both cases).

Before you start your upgrade you should check the requirements for vCenter 6.0. Your environment can meet the requirements for vCenter 5.0, but not for 6.0. Check the requirements HERE.

Upgrade Process:

Start by running autorun.exe in your VM.

  • Choose the first option, "vCenter Server for Windows", to start the upgrade process.

  • As we can check in the next image, we have two options to install/upgrade our vCenter. In our case, as I stated in the beginning of this article, we will use the "Embedded Deployment".

  • Just click "Next"

    • In the next image we need to provide admin credentials for your current vCenter 5.0. Since this installation/upgrade will make changes in the SQL DB, this administrator should also have admin (sysadmin) permissions on the existing vCenter DB (SQL).

        • In the next image the upgrade process shows an error when it tried to connect to the DB.

          Troubleshooting the issue, we noticed that the ODBC DSN used to connect to the vCenter 5.0 DB was using a SQL Native Client driver older than version 10. vCenter 5.1, 5.5 and 6.0 need at least SQL Native Client version 10.

          So we need to remove this old ODBC DSN and create a new ODBC DSN (with the proper driver) for the connection to the vCenter 5.0 DB. Only after these changes can we continue with the upgrade process. We downloaded SQL Native Client 11 from HERE and used this version.

          If we create a new ODBC DSN with the same name and the same user/password, there is no need to change anything else. Just remove the old one, create the new one and test the connection to the vCenter DB. After a successful connection we are good to continue the upgrade process.

          But unfortunately in our case this was an old implementation and we didn't have the user/password that was used before. So we needed to create a new DSN with a new user and password (we now have a user for any VMware Windows service that needs one, and we will use that one).

          NOTE: Before making any change in the ODBC DSN and the registry, stop the vCenter Windows service.

          When using a different name and/or a new user in the ODBC DSN, the Windows registry still has the old one (and so do the vCenter settings), so the vCenter services (and the DB connection) will not run until you fix this.

          Note: If you don't know how, check HERE how to create an ODBC DSN.

          After we create the new ODBC DSN (don't forget that vCenter uses a 64-bit ODBC DSN, while Update Manager uses a 32-bit one), we need to open regedit and change the settings.

          Note: To see these keys in a 32-bit version of the Registry Editor in a 64-bit operating system, click Start > Run, type %systemroot%\syswow64\regedit, and click OK.

          Go to “HKEY_LOCAL_MACHINE\SOFTWARE\VMware, Inc.\VMware VirtualCenter\DB” (in our case; different vCenter versions may use a different location).

          You should see the values for the old ODBC DSN in regedit:

          1. "ODBC name" (need to change)
          2. "user" (need to change)
          3. "password" (no need to change - this will change automatically in the next task)
          4. "SQL Native driver name" (no need to change)

          Now change entries 1 and 2 (in our case; if you kept the same ODBC DSN name, change only entry 2, the user) by right-clicking and choosing "Modify".

          After the changes, close the regedit.

          This is an example:

          After all these changes, we just need one last step: change the vCenter settings so that it uses the new password for the new user (value no. 3 in regedit).

          To change the password we need to run the vpxd tool. This tool is located in C:\Program Files\VMware\Infrastructure\VirtualCenter Server\ (if you chose a different location during the vCenter installation, use that location instead).

          You need to open a command prompt (elevated, with "Run as administrator"), change to the folder where the tool is located and run "vpxd.exe -p" (if the command prompt was not opened with "Run as administrator", the tool will not work).

          The tool will ask you for the new DB password; give the password of the user that you used (or will use) in the ODBC DSN.

          An example:

          Start your vCenter Windows Service again and test the vCenter.

          Note: HERE you can find more information about changing the ODBC DSN, the registry and the vCenter DB settings.

          After these changes, we can continue with our upgrade process.

          If you cancelled your upgrade, you can start it again. In our case, I left it running while I made the changes,
          and then retried the DB connection with the new user.

          After the changes, the upgrade process connected to the DB and continued.

          • Next you get the information that your SQL Express DB (in our case) will be migrated to the vPostgres DB (since this is the integrated DB used in 6.x).

          • Next you get the information about the certificates (if you use the default certificates, there is no problem here; if you use your own, you should back them up and re-use them afterwards).

          • Next we need to create a password for our administrator (in the vCenter SSO domain vsphere.local). This is a very important administrator password in SSO, so please remember it. Leave the rest of the values at their defaults.

          Note: "Join a vCenter Single Sign-On domain" is only for when you already have another vCenter instance (in this case an SSO) that you want to connect this vCenter to. That is not the case here.

          • Next we have the vCenter ports. Since we don't need to change anything, we will leave all the default values.

          • Next image: just double-check all the information and, if everything is OK, click "Next".

          • Next image is the location of our installation; no changes here, so click "Next" and start the upgrade process.

          • Next image: the vCenter upgrade is running. This process may take a while; in our case it took 45 minutes.

          • After the upgrade is finished you can click "Finish".
          Run the Web Client and check your newly upgraded vCenter 6.0.

          After the vCenter upgrade, you should also install/upgrade your vSphere Client to 6.0 (not mandatory on the vCenter server itself).

          Since this vCenter also had VMware Update Manager, we upgraded that as well. But I will explain this in a new article.

          Hope this helps you upgrade your vCenter 5.0 directly to 6.0.