Tuesday, 27 December 2016

VDP Upgrade To 6.1.3 Fails: Installation Of The Package Stalled

vSphere Data Protection 6.1.3 is a pretty stable release in the 6.x chain. However, there was an issue getting to this version from a prior release. This issue is specifically seen while upgrading VDP from 6.1 to 6.1.3. The upgrade fails at approximately 60 percent with the error "Installation of the package stalled", and it prompts you to revert to the snapshot taken prior to the upgrade.
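
If you want to follow the upgrade from the appliance itself, you can tail the installer log over SSH. The path below is the usual avinstaller log location on the appliance; adjust it if your build keeps the log elsewhere:

# tail -f /usr/local/avamar/var/avi/server_log/avinstaller.log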

The avinstaller.log has the following:

INFO: Working on task: Starting MCS (74 of 124) id: 10312
Dec 12, 2016 1:07:32 PM com.avamar.avinstaller.monitor.PollMessageSender sendMessage
INFO: PollMessageSender sent: <UpdateMessage><Progress>60</Progress><TaskName>Starting MCS (74 of 124)</TaskName><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><TaskId>10312</TaskId><Timestamp>2016/12/12-13:07:32.00489</Timestamp></UpdateMessage>
INFO: Package Location in process: /data01/avamar/repo/temp/vSphereDataProtection-6.1.3.avp_1481542766965
<UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:07:38.00626</Timestamp><Content>Starting MCS ...</Content></UpdateMessage>
Dec 12, 2016 1:07:38 PM com.avamar.avinstaller.process.TaskTimeManager resetTimer
INFO: startTime: 1481544458653
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener receiving message
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener forwarding message: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<UpdateMessage><Content><![CDATA["mcserver.sh --start", exit status=1 (error)]]></Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.process.MessageReceiver receiveMessage
INFO: MessageReceiver receive message: <UpdateMessage><Timestamp>2016/12/12-13:08:38.00890</Timestamp><Content>"mcserver.sh --start", exit status=1 (error)</Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.PollMessageSender sendMessage
INFO: PollMessageSender sent: <UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:08:38.00890</Timestamp><Content>"mcserver.sh --start", exit status=1 (error)</Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.process.TaskTimeManager resetTimer
INFO: startTime: 1481544518935
Dec 12, 2016 1:08:43 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener forwarding message: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<UpdateMessage><Content><![CDATA[----->start MCS failed - operation failed]]></Content></UpdateMessage>
<UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:08:43.00689</Timestamp><Content>-----&gt;start MCS failed - operation failed</Content></UpdateMessage>

Dec 12, 2016 1:08:44 PM com.avamar.avinstaller.process.handler.ServerScriptHandler$TaskProcessor run
WARNING: From err out: 12/12 13:08:38 error: "mcserver.sh --start", exit status=1 (error)
12/12 13:08:43 error: ----->start MCS failed - operation failed

Dec 12, 2016 1:50:04 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit stderr:
Dec 12, 2016 2:50:00 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush command: /usr/local/avamar/bin/avinstaller.pl --flush
Dec 12, 2016 2:50:03 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit code: 1
Dec 12, 2016 2:50:03 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit stdout: Flushing Avinstaller...
INFO: Flushing AvInstaller.
INFO: AvInstaller flushed.
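
Before reverting to the pre-upgrade snapshot, it is worth confirming over SSH that MCS really does fail to start on its own. A minimal check, using the same script the installer calls (the log path below is the usual MCS location, so adjust it if needed):

# dpnctl status mcs
# mcserver.sh --start

If mcserver.sh exits non-zero here as well, collect the MCS logs (usually under /usr/local/avamar/var/mc/server_log/) before raising the support ticket mentioned below.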

Unfortunately, there is no fix available for this issue. This is a known issue while upgrading from 6.1 to 6.1.3. See the release notes here.

The workaround is to upgrade from 6.1 to 6.1.2 and then upgrade from 6.1.2 to 6.1.3.
If this workaround fails, please raise a ticket with VMware Support. I cannot provide the resolution here due to confidentiality.

Hope this helps.

Tuesday, 20 December 2016

Part 1: Deploying And Configuring vSphere Replication 6.x

If you are using Site Recovery Manager, you will be using a replication technique to replicate VM data from the protected site to the recovery site. There are two ways to get this done: the first is Host Based Replication and the second is Array Based Replication.

vSphere Replication is a Host Based Replication solution that performs VM replication via the VMkernel. In this case, I will have two vCenters, one at the protected site and the other at the recovery site. From version 6.x onward, the deployment is done completely from the web client.

Right click any one of the hosts and select Deploy OVF Template.


Browse the local file location to access the .ovf file of this appliance. Select this file and click Open.
Once the file is loaded, click Next.


You will be presented with the OVF details regarding the vendor and the version of the replication appliance being deployed. Click Next.


Accept the EULA and click Next.


Provide a name for the replication appliance and a destination folder for the deployment, then click Next.


You can choose between 2 and 4 vCPUs for this appliance. I will stick to the default. Click Next.


Select a datastore to deploy this appliance on and choose an appropriate disk provisioning type. Click Next.


Select the network where this appliance should reside, along with the IP protocol and allocation type. If the allocation type is Static - Manual, enter the Gateway, DNS and Subnet details.


Enter the root password for your vR appliance and provide the NTP server details. Expand Networking Properties and enter the IP address for this appliance.


Review the changes and then complete the OVF deployment.


After the appliance is deployed, you will have to register it with the vCenter Server. To do this, go to the management URL of the vR appliance:

https://vR-IP:5480


Log in with the root credentials that were set up during the deployment. Here, there are a few details to be filled in:

For the LookupService Address: if your vCenter is an embedded PSC deployment, enter the FQDN of the vCenter Server. If the vCenter uses an external PSC, enter the FQDN of the PSC appliance. Then enter the SSO user name and the password for the same.
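
As an example, the values would look something like the below (the FQDNs here are purely illustrative placeholders):

LookupService Address: vcenter01.domain.local     (or psc01.domain.local when using an external PSC)
SSO user name: administrator@vsphere.local
Password: <SSO password>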

Click Save and Restart Service. If you are asked to accept certificate information from the vCenter / PSC, accept it. Once this completes, the replication appliance will be configured for your vCenter.


You should now see this in the vR plugin in the web client. I have two appliances because one is for the protected site and the other is for the recovery site.


With this, the deployment and configuration of vR to vCenter is completed.

Part 2: Configuring Replication Sites And Configuring Replication For A VM.

Monday, 19 December 2016

vSphere Client Console Does Not Display Full Screen

So, while opening a console for any virtual machine from one particular workstation, the console display is not sized correctly. Below is a screenshot of the display.


This was seen on only one machine, a Windows 10 machine; it affected all user sessions, and a reinstall of the vSphere Client did not resolve it.

To resolve this, you will have to disable Display Scaling.

1. Right click the vSphere Client icon and select Properties
2. Click the Compatibility tab
3. Check the box Disable display scaling on high DPI settings
4. Click Apply
5. Reload the vSphere Client session.

Now, the console session should display in full screen.

Hope this helps.

Sunday, 11 December 2016

ISO Package Not Available During VDP Upgrade From 6.1

When you try upgrading the vSphere Data Protection appliance from 6.1 to 6.1.1 / 6.1.2 / 6.1.3, the ISO package might not be detected. Once you mount the ISO and go to the vdp-configure page, the package is not seen and you will see the below message:

To upgrade your VDP appliance please connect a valid upgrade ISO image to the appliance

There is a known issue with ISO detection while upgrading from 6.1 to the current latest release. To fix this, try the steps in the below order:

1. Log in to the VDP appliance via SSH and run the below command to check whether the ISO is mounted successfully:
# df -h
You should see the following mount:

/dev/sr0 /mnt/auto/cdrom
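
A quick way to look for just this line, using the device and mount point shown above:

# df -h | grep cdrom
# mount | grep sr0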

If this is not seen, run the below command to manually mount the ISO:
# mount /dev/sr0 /mnt/auto/cdrom
Run the df -h command again to verify that the mount point is now seen. Once this is true, log back into the vdp-configure page and go to the Upgrade tab to verify whether the ISO package is now detected.

If not, then proceed to Step 2

2. Patch the VDP appliance with the ISO detection patch. 

Download this patch by clicking the link here.

Once this is downloaded, perform the below steps to patch the appliance.

>> Using WinSCP, copy the .gz file to the /tmp directory on the VDP appliance
>> Run the below command to extract the file:
# tar -zxvf VDP61_Iso_Hotfix.tar.gz
>> Provide execute permissions to the shell script:
# chmod a+x VDP61_Iso_Hotfix.sh
>> Run the patch script using the below command:
# ./VDP61_Iso_Hotfix.sh
The script checks whether the appliance is running 6.1 and applies the patch only if it is. If the appliance is not on 6.1, the script exits without making any changes.

After this, log back into the vdp-configure page; the package should be detected and you will be prompted to initiate the upgrade.

Thursday, 8 December 2016

Unable To Expand VDP Storage: There Are Incorrect Number Of Disks Associated With VDP Appliance

There have been a few cases I have worked on lately where there were issues expanding the data storage on the VDP appliance. In the vdp-configure page, if you select Expand Storage in the Storage section, you receive the error:

VDP: There are incorrect number of disks associated with VDP appliance.



This is due to the numberOfDisk parameter not being updated correctly in the vdr-configuration.xml file. This file is updated whenever changes are made to the VDP from the configuration page, which includes the deployment and expansion tasks.

In my case, I had a 512 GB deployment of VDP, which creates 3 data0? partitions of 256 GB each. This means the numberOfDisk parameter should be 3. However, in the vdr-configuration.xml file, the value was 2. Below is the snippet:

<numberOfDisk>2</numberOfDisk>

To fix this, edit the vdr-configuration.xml file located in /usr/local/vdr/etc and change this parameter to the number of data drives (do not include the OS drive) present on your VDP appliance. In my case, the total number of data drives was 3.
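
A quick way to check the current value and then edit it over SSH, using the file path and parameter quoted above:

# grep numberOfDisk /usr/local/vdr/etc/vdr-configuration.xml
<numberOfDisk>2</numberOfDisk>
# vi /usr/local/vdr/etc/vdr-configuration.xml     (change the value to 3)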

Save the file after editing, then re-run the expand storage task; the error should no longer be presented.

Hope this helps.

Wednesday, 30 November 2016

VDP 6.1: Unable To Expand Storage

So, there have been a few tricky issues going on with expanding VDP storage drives. This section talks specifically about the OS kernel not picking up the partition extents.

A brief intro about what's going on here. As you know, from vSphere Data Protection 6.x onward the dedup storage drives can be expanded. If your backup data drives are running out of space and you do not wish to delete restore points, this feature allows you to extend your data partitions. To do this, we log in to the https://vdp-ip:8543/vdp-configure page, go to the Storage tab and select the Expand Storage option. The wizard successfully expands the existing partitions. After this, if you run df -h from the SSH of the VDP, it should pick up the expanded sizes. In this issue, however, either none of the partitions are expanded or a few of them report inconsistent information.

So, in my case, I had a 512 GB VDP deployment, which by default deploys 3 drives of ~256 GB each.

I then expanded the storage to 1 TB, which would ideally give 3 drives of ~512 GB each. In my case, the expansion wizard completed successfully; however, the data drives were inconsistent when viewed from the command line. In the GUI, under Edit Settings of the VM, the correct information was displayed.



When I ran df -h, the below was seen:

root@vdp58:~/#: df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        32G  5.8G   25G  20% /
udev            1.9G  152K  1.9G   1% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
/dev/sda1       128M   37M   85M  31% /boot
/dev/sda7       1.5G  167M  1.3G  12% /var
/dev/sda9       138G  7.2G  124G   6% /space
/dev/sdb1       256G  2.4G  254G   1% /data01
/dev/sdc1       512G  334M  512G   1% /data02
/dev/sdd1       512G  286M  512G   1% /data03

The sdb1 partition was not expanded to 512 GB, whereas the data partitions sdc1 and sdd1 were successfully extended.

If I run fdisk -l, I see the partitions have been extended successfully for all 3 data0? mounts with the updated size.

**If you run the fdisk -l command and do not see the partitions updated, then raise a case with VMware**

Disk /dev/sdb: 549.8 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders, total 1073741824 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdb1               1  1073736404   536868202   83  Linux

Disk /dev/sdc: 549.8 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders, total 1073741824 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdc1               1  1073736404   536868202   83  Linux

Disk /dev/sdd: 549.8 GB, 549755813888 bytes
255 heads, 63 sectors/track, 66837 cylinders, total 1073741824 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device Boot      Start         End      Blocks   Id  System
/dev/sdd1               1  1073736404   536868202   83  Linux

If this is the case, run the partprobe command. This makes the SUSE kernel aware of the partition table changes. After this, run df -h to verify whether the data drives are now updated with the correct size. If yes, stop here. If not, proceed further.
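
For example (partprobe with no arguments re-reads all partition tables; you can also point it at just the affected device, /dev/sdb in this case):

# partprobe
# df -h | grep data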

**Make sure you do this with the help of a VMware engineer if this is a production environment**

If partprobe does not work, then we will have to grow the XFS file system. To do this:

1. Power down the VDP appliance gracefully
2. Change the data drives from Independent Persistent to Dependent
3. Take a snapshot of the VDP appliance
4. Power On the VDP appliance
5. Once the appliance is booted successfully, stop all the services using the command:
# dpnctl stop
6. Grow the mount point using the command:
# xfs_growfs <mount point>
In my case:
# xfs_growfs /dev/sdb1
If successful, you will see output similar to the below:

root@vdp58:~/#: xfs_growfs /dev/sdb1
meta-data=/dev/sdb1              isize=256    agcount=4, agsize=16776881 blks
         =                       sectsz=512   attr=2
data     =                       bsize=4096   blocks=67107521, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0
log      =internal               bsize=4096   blocks=32767, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
data blocks changed from 67107521 to 134217050

Run df -h and verify if the partitions are now updated.

root@vdp58:~/#: df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda2        32G  5.8G   25G  20% /
udev            1.9G  148K  1.9G   1% /dev
tmpfs           1.9G     0  1.9G   0% /dev/shm
/dev/sda1       128M   37M   85M  31% /boot
/dev/sda7       1.5G  167M  1.3G  12% /var
/dev/sda9       138G  7.1G  124G   6% /space
/dev/sdb1       512G  2.4G  510G   1% /data01
/dev/sdc1       512G  334M  512G   1% /data02
/dev/sdd1       512G  286M  512G   1% /data03

If yes, then stop here.
If not, then raise a support request with VMware, as this would need an engineering fix.

Hope this helps.