Tuesday, 10 January 2017

Avamar Virtual Edition 7.1: Failed To Communicate To vCenter During Client Registration

Once you set up your Avamar server, the next step is to add the VMware vCenter as a client to it. You provide the vCenter IP/FQDN along with the administrator user credentials and the default HTTPS port 443. However, it errors out stating:

Failed to communicate to vCenter. Unable to find valid certification path to the vCenter. 


The error occurs because Avamar is unable to acknowledge the VMware vCenter certificate warning. To work around this, we have to force Avamar to accept the vCenter certificate.

Login to SSH of the Avamar Server and edit the mcserver.xml file:
# vi /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml

Locate the vCenter certificate ignore parameter by searching for /cert in the vi editor. You will notice the below line.


Change this value from false to true and save the file.
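If you prefer to script this edit instead of using vi, a minimal sketch follows. The helper function name is hypothetical, and the entry key name ignore_vc_cert is an assumption — verify the exact key in your own mcserver.xml before relying on it:

```shell
# Sketch: flip the assumed ignore_vc_cert entry from false to true in an
# mcserver.xml. Keeps a .bak copy of the file before editing.
enable_ignore_vc_cert() {
    local xml="$1"
    cp "$xml" "$xml.bak"   # backup before editing
    sed -i 's/\(key="ignore_vc_cert" value="\)false"/\1true"/' "$xml"
}
# On the Avamar server you would run:
# enable_ignore_vc_cert /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml
```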

Switch to the admin user using sudo su - admin and run the below script to restart the MCS.
# mcserver.sh --restart

After this, you should be able to add the vCenter as a client to Avamar successfully.


That should be it.

Sunday, 8 January 2017

Part 5: Creating Recovery Plans In SRM 6.1

Part 4: Creating protection groups for virtual machines in SRM 6.1

Once you create a protection group, it's time to create a recovery plan. When you want to perform an actual DR failover or a test recovery, it is the recovery plan that you execute. A recovery plan runs a set of steps in a particular order to fail the VMs over, or to test the failover, to the recovery site. You cannot change the workflow of the recovery plan; however, you can customize it by adding your required checks and tasks in between.

Select the production site in SRM inventory and under Summary Tab select Create a recovery plan.


Provide a name for the recovery plan and an optional description and click Next.


Select the recovery site where you want the VMs to failover to and click Next.


The Group type will be VM protection groups; select the required protection groups to be added to this recovery plan. Only the VMs in the protection groups added to the recovery plan will be failed over in the event of a disaster. Click Next.


Next, we have something called Test Recovery. A test recovery does a test failover of the protected VMs to the recovery site without impacting the production VMs' operation or network identity. A test network, or bubble network (a network with no uplinks), will be created on the recovery site; the VMs will be placed there and brought up to verify that the recovery plan is working correctly. Keep the default auto-create settings and click Next.


Review your recovery plan settings and click Finish to complete the create recovery plan wizard.


If you select the protected site, then Related Objects and Recovery Plans, you can see this recovery plan listed.


If you select Recovery Plans in the Site Recovery inventory, you will see the status of the plans and their related details.


Before you test your recovery, you will have to configure this recovery plan. Browse to Recovery Plans, Related Objects, Virtual Machines. The VMs available under this recovery plan will be listed. Right click the virtual machine and select Configure Recovery.


There are two options here, Recovery properties and IP customization.

The recovery properties cover the order of VM startup, VM dependencies, and additional steps that have to be carried out during and after power on.

Since I have just one virtual machine in this recovery plan, the priority and the dependencies do not really matter. Set these options as per your requirement.


In the IP Customization option, you will provide the network details for the virtual machine in the Primary and the Recovery Site.


Select Configure Protection and you will be asked to configure the IP settings of the VM in the protected site. If you have VMware Tools running on this machine (recommended), click Retrieve and it will auto-populate the IP settings. Click the DNS option and enter the DNS IP and the domain name manually. Click OK to complete. The same steps have to be performed on the Recovery Site too, under Configure Recovery; however, all the IP details have to be entered manually (if DHCP is not used), since there is no powered-on VM with VMware Tools running on the recovery site.


Once both are complete, you should see the below information in the IP Customization section. Click OK to finish configuring VM recovery.


Once this is performed for all the virtual machines in the recovery plan, the plan customization is complete and ready to be tested. You can also use the DR IP Customizer tool to configure VM recovery settings.

In the next article, we will have a look at testing a recovery plan.

Part 4: Creating Virtual Machine Protection Groups In SRM 6.1

Part 3: Configuring Inventory Mappings in SRM 6.1

Once you configure inventory mappings, you will then have to create protection groups. In a protection group, you add the virtual machines that SRM should fail over in case of a disaster event.

Select the protected site and in the Summary tab under Guide to configuring SRM click Create a protection group.


Specify a name for the protection group you will be creating and, optionally, a description.


You will get to choose the direction for the protection group and the protection group type. In this, vcenter-prod is my production site and vcenter-dr is my recovery site.

I am using the vSphere Replication appliance (a host-based replication appliance), hence the protection group type will be vSphere Replication, so that we can choose the VMs being replicated by it.
Note that you cannot mix vR and array-based replication VMs in the same protection group.


I have one virtual machine being replicated by vR. Check the required virtual machine and click Next.


Review the settings and click Finish to complete creating the protection group.


Under the production site, Related Objects, Protection Groups, you will be able to see the protection group listed.


Under the Protection Group option in the SRM inventory, you will see the same protection group listed.


And now in the recovery site, under the Recovered resource pool (selected during Inventory Mapping) you will be able to see the placeholder virtual machine.


With this, we have successfully created a protection group for virtual machines.

Part 5: Creating Recovery Plan in SRM 6.1

Thursday, 5 January 2017

Part 3: Configuring Inventory Mappings In SRM 6.1


Once the SRM sites are paired, you will then have to configure inventory mappings on the production site. Inventory mappings essentially define which resources will be available to the virtual machines on the protected site, and which corresponding resources the same VMs will use on the recovery site when failed over. Only when inventory mappings are established successfully will you be able to create placeholder virtual machines. If inventory mappings are not performed, each VM has to be configured individually for resource availability after a failover.

In the below screenshot you can see that inventory mappings can be created for resource pools, folders, networks and datastores. There is only a Green check for Site Pairing as this was done in the previous article.


First we will configure resource mappings. Click Create Resource Mappings. In the production vCenter section, select the Production resource pool. The virtual machine Router, which we replicated earlier, resides in this resource pool. Select the appropriate resource pool on the recovery site as well. Click Add Mappings and you should now see the direction of the mapping in the bottom section. Click Next.


If you would like to establish reprotection, that is, to fail back from the recovery site to the protected site, check the Prepare reverse mappings option. This is optional; right now I do not want reverse mappings, so I will leave it unchecked. Click Finish.


Now, back in the Summary tab, the resource mapping will be green-checked, and next we will configure folder mappings. In the Summary tab section, click Create Folder Mappings. I will configure this manually, so select Prepare mappings manually. Click Next.


Just like the resource mapping, select the folder on the protected site where the protected virtual machines are, select the appropriate folder on the recovery site, and click Add Mappings. Once the direction is displayed, click Next.


I will not be configuring any reverse mappings, so I will leave this option unchecked and click Finish.


Once the Folder mapping is green checked, click Create Network Mapping and again this will be a manual configuration. Click Next.


In the same way, select the network on the Protected site where the protected VMs will be and an appropriate network on the recovery site and click Add Mappings. Then click Next.


Test networks will be created by default in case you would like to test your recovery plans. The production VMs continue running as they are, and a failed-over instance of these VMs will be brought up on an isolated test network on the recovery site. Click Next.


Again, no reverse mappings here. I will simply click Finish.


With this, the inventory mappings for resources, networks, and folders are complete.

Next we will be configuring the Placeholder Datastore.

For every virtual machine on the protected site, SRM creates a placeholder VM on the recovery site. This placeholder VM resides on a datastore, and that datastore is called the placeholder datastore. Once you specify the placeholder datastore, SRM creates VM files on it at the recovery site and uses them to register the placeholder VMs in the recovery site inventory.

Two key requirements for the placeholder datastore:
1. If there is a cluster on the recovery site, the placeholder datastore must be visible to all the hosts in that cluster.
2. You cannot use a replicated datastore as a placeholder datastore.

Select the Recovery Site in the SRM inventory and click Configure Placeholder Datastore. From the list below, select the required datastore to be used for placeholder and click OK.


If you would like to establish re-protection and fail-back, then you will have to select the placeholder datastore on the production site as well. You will have to click the protected site in the SRM inventory and configure placeholder datastore again.

With this, the inventory mapping will be completed.

Part 4: Creating virtual machine protection groups in SRM 6.1

Wednesday, 4 January 2017

VDP 6.1.3 - ESXi 5.1 Compatibility Issues

The VMware interoperability matrix says VDP 6.1.3 is compatible with ESXi 5.1. However, if you back up a VM while VDP is running on an ESXi 5.1 host, the backup will fail. I tried this with the following setup.

I deployed a 6.1.3 VDP appliance on an ESXi 5.1 host (build 1065491) and created a virtual machine on the same host. Then, I ran an on-demand backup for this VM and it failed immediately. The backup job log had the following entries:

2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:SSL Error: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: connect failed (1)
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:CnxAuthdConnect: Returning false because SSL_ConnectAndVerify failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:CnxConnectAuthd: Returning false because CnxAuthdConnect failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:Cnx_Connect: Returning false because CnxConnectAuthd failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:Cnx_Connect: Error message:
2017-01-04T21:13:36.724-05:-30 avvcbimage Warning <16041>: VDDK:[NFC ERROR] NfcNewAuthdConnectionEx: Failed to connect to peer. Error:
2017-01-04T21:13:36.742-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.705-05:-30 avvcbimage Info <16041>: VDDK:NBD_ClientOpen: attempting to create connection to vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:SSL Error: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: connect failed (1)
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-DSCPTR: : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : Failed to open NBD extent.
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-LINK  : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : failed to open (NBD_ERR_NETWORK_CONNECT).
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-CHAIN : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : failed to open (NBD_ERR_NETWORK_CONNECT).
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-LIB   : Failed to open 'vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902' with flags 0x1e NBD_ERR_NETWORK_CONNECT (2338).
2017-01-04T21:13:36.780-05:-30 avvcbimage Info <16041>: VDDK:NBD_ClientOpen: attempting to create connection to vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902

The VDDK disk release on 6.1.3 VDP is:
2017-01-04T21:13:27.244-05:-30 avvcbimage Info <16041>: VDDK:VMware VixDiskLib (6.5) Release build-4241604

Cause:
Refer to the section on backward compatibility of TLS in the VDDK release notes.

Recommended Fix:
Upgrade the ESXi host to 5.5 U3e or later.

Workaround:
On the VDP appliance, edit the below file:
# vi /etc/vmware/config
Add the below line:
tls.protocols=tls1.0,tls1.1,tls1.2

Save the file. There is no need to restart any services. Re-run the backup job and the backups should now complete successfully. 
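The workaround above can also be scripted so that re-running it never duplicates the line. A minimal sketch; the helper function name is hypothetical:

```shell
# Hypothetical helper: append the TLS protocol override to a config file only
# if no tls.protocols entry is present yet, so the edit is idempotent.
add_tls_override() {
    local cfg="$1"
    grep -q '^tls.protocols=' "$cfg" 2>/dev/null || \
        echo 'tls.protocols=tls1.0,tls1.1,tls1.2' >> "$cfg"
}
# On the VDP appliance you would run: add_tls_override /etc/vmware/config
```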

Part 2: Pairing Sites in Site Recovery Manager 6.1

Part 1: Installing Site Recovery Manager

In this article we will see how to pair the two SRM sites we installed previously. I have two SRM sites, a production site and a DR site. Right after a fresh install, these sites are not paired, and you will see the message "Site is not paired" for both the production and the recovery site. For failover to take place, the sites need to be paired. Click the Pair Site option in the center screen.


Enter the Platform Services Controller of the DR site and click Next.


The SRM plugin extension will be com.vmware.vcDr unless you chose to create a custom plugin ID during the install. Once the PSC details are given in the previous step, the corresponding vCenter will be populated automatically. Select the vCenter, provide the SSO user credentials for authentication, and click Finish.


If you are presented with a certificate warning, click Yes to proceed. The pairing should now be completed.


Now, in the Summary tab, you can see that the sites have been paired, and the paired site details are populated in the "Paired Site" section.


That's it.

Part 3: Configuring Inventory Mappings in SRM 6.1

Tuesday, 3 January 2017

Part 3: Recover A VM Using vSphere Replication

Part 2: Pairing vR Sites and configuring replication for a virtual machine

In this article, we will be performing a recovery of a replicated virtual machine using vSphere Replication. To perform a recovery, select the target vCenter (vCenter-DR in my case), then go to Monitor and Incoming Replications.


You will see the below screen at this point, with a big red button carrying a play symbol; this is the recovery option. Select this icon.


You will then be presented with the types of recovery you can perform:


Recover with recent changes: This first option requires the source VM to be powered down. Before initiating the recovery process, it syncs the recent changes from the source VM, so the recovered VM will be up to date.

Use latest available data: If you do not want to power down the source, or if the source is unavailable or corrupted, choose this option. It uses the most recently replicated data to recover the virtual machine.

We will be using the second option to recover the virtual machine. In this wizard you will have to choose a destination folder to restore this virtual machine to.


You will then have to select the ESXi host and (if available) a resource pool to recover this virtual machine to.


You will have an option to keep the recovered VM powered on or off. Select this depending on your requirement, and click Finish to begin the recovery process.


Once the recovery is complete, the virtual machine will now be available on the target site, and all the VM files that were named hbr.UUID.vmdk (the replicated VM files) will be renamed to the actual virtual machine file names.

The status of the replication will now switch to Recovered and there will be no more active replication for this virtual machine.



Resuming Replication After Recovery: Reprotect and Failback.

In most scenarios, once a VM is recovered you will want to re-establish replication in the other direction, to ensure there is a fresh replicated instance in case this recovered virtual machine fails at some point. This is called reverse replication, or reprotection.

Initially, the replication ran from vcenter-prod to vcenter-dr, with the virtual machine residing on vcenter-prod. After the recovery, the virtual machine is running on vcenter-dr, so the replication direction changes: it now runs from vcenter-dr to vcenter-prod.

You will first have to stop the currently configured replication for the virtual machine. On the target site, under Incoming Replications (above screenshot), right click the VM whose status is Recovered and select Stop. Then, the old virtual machine has to be unregistered (Remove from Inventory) on the source side. Once the replication is stopped and the source (old) virtual machine is unregistered, you can reconfigure the replication. The process is the same as discussed in Part 2 of this series.

The only difference is that when you select a destination datastore for the replication data, you will receive the following message. Select Use Existing. This option tells vSphere Replication that a set of disks already exists on the target site and that they should be used as replication seeds. An initial full sync will still occur, but instead of copying the data it just compares checksums to validate the seeds. Once this is done, only the changed data is replicated going forward, according to your configured RPO.


Once the replication status goes to OK, you will have a valid replicated instance of the virtual machine at the new target site ready to be recovered.

Performing A Manual Recovery.

Until now, vSphere Replication took care of all the recovery operations. But suppose vCenter is down and you need to recover a critical virtual machine: if vCenter is down, you cannot manage vSphere Replication. In this case, we will perform a manual recovery.

From the SSH of the ESXi host, you can see the VM files that are replicated:
# cd /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Router

-rw-------    1 root     root        8.5K Jan  3 08:24 hbrcfg.GID-c3732b6f-de63-4c55-a830-a4437d91a143.4.nvram.8
-rw-------    1 root     root        3.1K Jan  3 08:24 hbrcfg.GID-c3732b6f-de63-4c55-a830-a4437d91a143.4.vmx.7
-rw-------    1 root     root       84.0K Jan  3 08:24 hbrdisk.RDID-297047a6-c7d0-4322-b290-bb610582daf1.5.59562057314158-delta.vmdk
-rw-------    1 root     root         368 Jan  3 08:24 hbrdisk.RDID-297047a6-c7d0-4322-b290-bb610582daf1.5.59562057314158.vmdk
You will have to rename these replica files back to their regular vmdk, flat.vmdk, vmx, and nvram extensions. First, create a new folder under the datastore directory.
# cd /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/
# mkdir Rec
Pause the replication and copy/clone the vmdk to the new location using vmkfstools -i:
# cd /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Router
# vmkfstools -i hbrdisk.RDID-297047a6-c7d0-4322-b290-bb610582daf1.5.59562057314158.vmdk -d thin /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Rec/Rec.vmdk
You will see the following output:
Destination disk format: VMFS thin-provisioned
Cloning disk 'hbrdisk.RDID-297047a6-c7d0-4322-b290-bb610582daf1.5.59562057314158.vmdk'...
Clone: 100% done.

Copy / Rename the vmx and nvram files using the below command:
# cp -a hbrcfg.GID-c3732b6f-de63-4c55-a830-a4437d91a143.4.vmx.7 /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Rec/Rec.vmx
# cp -a hbrcfg.GID-c3732b6f-de63-4c55-a830-a4437d91a143.4.nvram.8 /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Rec/Rec.nvram
Finally, register the VM from the command line using:
# vim-cmd solo/registervm /vmfs/volumes/54ed030d-cd8f4a16-9fef-ac162d7a2fa0/Rec/Rec.vmx
If the registration was successful there will be a VM ID allocated as the output and you can verify the same in the vSphere client.
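If you have several replica files to handle, the extension mapping used in the steps above can be sketched as a small helper. This is a hypothetical function, assuming the hbrcfg/hbrdisk naming layout shown earlier:

```shell
# Hypothetical helper: derive the real extension from an hbr-prefixed replica
# file name, so cloned/copied files can be given proper names (e.g. Rec.vmx).
hbr_ext() {
    case "$1" in
        hbrcfg.*)
            # e.g. hbrcfg.GID-<uuid>.4.vmx.7 -> vmx (strip the trailing sequence number)
            echo "$1" | sed -e 's/\.[0-9]*$//' -e 's/.*\.//'
            ;;
        hbrdisk.*.vmdk)
            # disk descriptors and deltas already end in .vmdk
            echo vmdk
            ;;
    esac
}
```

For example, running it against the hbrcfg nvram file listed earlier returns nvram, telling you that file becomes Rec.nvram in the new folder.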

That's pretty much it.