Monday, 16 January 2017

Slow GUI Response On VDP 6.1.3

I recently ran into an issue where the Backup Job tab was extremely slow to load its jobs; it seemed to take forever. The other tabs in the Web Client VDP GUI behaved the same way.

In the axis2.log under /usr/local/avamar/var/mc/server_log the following was logged:

2017-01-16 12:57:35,105 [1690894503@qtp-1786872722-26] ERROR org.apache.axis2.transport.http.AxisServlet  - Java heap space
java.lang.OutOfMemoryError: Java heap space
        at java.util.Arrays.copyOf(Unknown Source)
        at java.lang.AbstractStringBuilder.expandCapacity(Unknown Source)
        at java.lang.AbstractStringBuilder.ensureCapacityInternal(Unknown Source)
        at java.lang.AbstractStringBuilder.append(Unknown Source)
        at java.lang.StringBuffer.append(Unknown Source)
        at java.io.StringWriter.write(Unknown Source)
        at com.ctc.wstx.sw.BufferingXmlWriter.flushBuffer(BufferingXmlWriter.java:1103)
        at com.ctc.wstx.sw.BufferingXmlWriter.fastWriteRaw(BufferingXmlWriter.java:1114)
        at com.ctc.wstx.sw.BufferingXmlWriter.writeStartTagEnd(BufferingXmlWriter.java:743)
        at com.ctc.wstx.sw.BaseNsStreamWriter.closeStartElement(BaseNsStreamWriter.java:388)
        at com.ctc.wstx.sw.BaseStreamWriter.writeCharacters(BaseStreamWriter.java:446)
        at org.apache.axiom.util.stax.wrapper.XMLStreamWriterWrapper.writeCharacters(XMLStreamWriterWrapper.java:100)
        at org.apache.axiom.om.impl.MTOMXMLStreamWriter.writeCharacters(MTOMXMLStreamWriter.java:289)
        at org.apache.axis2.databinding.utils.writer.MTOMAwareXMLSerializer.writeCharacters(MTOMAwareXMLSerializer.java:139)
        at com.avamar.mc.sdk10.EventMoref.serialize(EventMoref.java:155)
        at com.avamar.mc.sdk10.EventMoref.serialize(EventMoref.java:78)
        at com.avamar.mc.sdk10.ActivityInfo.serialize(ActivityInfo.java:889)
        at com.avamar.mc.sdk10.ActivityInfo.serialize(ActivityInfo.java:198)
        at com.avamar.mc.sdk10.ArrayOfActivityInfo.serialize(ArrayOfActivityInfo.java:216)
        at com.avamar.mc.sdk10.TaskInfo.serialize(TaskInfo.java:630)
        at com.avamar.mc.sdk10.DynamicValue.serialize(DynamicValue.java:244)
        at com.avamar.mc.sdk10.DynamicValue.serialize(DynamicValue.java:152)
        at com.avamar.mc.sdk10.ArrayOfDynamicValues.serialize(ArrayOfDynamicValues.java:216)
        at com.avamar.mc.sdk10.ArrayOfDynamicValues.serialize(ArrayOfDynamicValues.java:160)
        at com.avamar.mc.sdk10.GetDynamicPropertyResponse.serialize(GetDynamicPropertyResponse.java:165)
        at com.avamar.mc.sdk10.GetDynamicPropertyResponse.serialize(GetDynamicPropertyResponse.java:109)
        at com.avamar.mc.sdk10.GetDynamicPropertyResponse$1.serialize(GetDynamicPropertyResponse.java:97)
        at org.apache.axis2.databinding.ADBDataSource.serialize(ADBDataSource.java:93)
        at org.apache.axiom.om.impl.llom.OMSourcedElementImpl.internalSerialize(OMSourcedElementImpl.java:692)
        at org.apache.axiom.om.impl.util.OMSerializerUtil.serializeChildren(OMSerializerUtil.java:556)
        at org.apache.axiom.om.impl.llom.OMElementImpl.internalSerialize(OMElementImpl.java:874)
        at org.apache.axiom.soap.impl.llom.SOAPEnvelopeImpl.internalSerialize(SOAPEnvelopeImpl.java:230)
2017-01-16 12:58:20,038 [1690894503@qtp-1786872722-26] ERROR org.apache.axis2.transport.http.AxisServlet  - Java heap space
java.lang.OutOfMemoryError: Java heap space
2017-01-16 12:59:07,070 [486089829@qtp-1786872722-25] ERROR org.apache.axis2.transport.http.AxisServlet  - Java heap space
java.lang.OutOfMemoryError: Java heap space

Because of this, the backup jobs never ran and the mccli commands took a very long time to return any output.
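To get a feel for how often the service is running out of heap, the matching lines in axis2.log can be counted. This is only a minimal sketch: count_heap_errors is a name made up for illustration, and the path in the example is the one quoted earlier in this post.

```shell
# count_heap_errors FILE
# Prints the number of "OutOfMemoryError: Java heap space" lines in FILE.
# Illustrative helper only; it is not part of VDP or Avamar.
count_heap_errors() {
    # -F: match the text literally, -c: print the count of matching lines
    grep -cF 'java.lang.OutOfMemoryError: Java heap space' "$1"
}

# Example (path taken from this post):
# count_heap_errors /usr/local/avamar/var/mc/server_log/axis2.log
```

Running it again after the fix gives a quick way to confirm that no new errors are being logged.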
The solution was to increase the maximum Java heap size. Back up each file before editing it. The steps are:

1. Browse to the following location:
# cd /usr/local/avamar/var/mc/server_data/prefs
2. Make a backup of the mcserver.xml file, then open it in the vi editor:
# vi mcserver.xml
3. Locate the maxJavaHeap entry and change its value from -Xmx1G to -Xmx2G:

Before Edit:
<entry key="maxJavaHeap" value="-Xmx1G" />
After Edit:
<entry key="maxJavaHeap" value="-Xmx2G" />

4. Save this file

5. Go to the following location
# cd /usr/local/avamar/lib/
6. Make a backup copy of the mcserver.xml file in this location, open it in the vi editor and edit the same parameter in this file too:

Before Edit:
<entry key="maxJavaHeap" value="-Xmx1G" merge="newvalue" />
After Edit:
<entry key="maxJavaHeap" value="-Xmx2G" merge="newvalue" />

7. Save the file

8. Switch to the admin user on the VDP appliance with sudo su - admin and restart MCS using:
# mcserver.sh --restart

After this, the GUI should load noticeably faster and you should no longer see the Java heap space error in axis2.log.
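The two file edits in the steps above can also be scripted. Treat the following as a rough sketch rather than a supported procedure: set_max_heap is a hypothetical helper, the .bak suffix is my own choice, and it assumes GNU sed (for -i) and a maxJavaHeap entry written exactly in the form shown above.

```shell
# set_max_heap FILE SIZE
# Copies FILE to FILE.bak, then rewrites the -Xmx value of the maxJavaHeap
# entry in place (for example, SIZE=2G turns -Xmx1G into -Xmx2G).
# Hypothetical helper for illustration; assumes GNU sed.
set_max_heap() {
    file="$1"
    size="$2"
    # Keep a backup before touching the file; stop if the copy fails
    cp -p "$file" "$file.bak" || return 1
    # Only the maxJavaHeap entry is matched; other attributes are left alone
    sed -i "s/\(key=\"maxJavaHeap\" value=\"-Xmx\)[0-9][0-9]*[A-Za-z]*\"/\1${size}\"/" "$file"
}

# Example, run as root (paths taken from the steps above):
# set_max_heap /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml 2G
# set_max_heap /usr/local/avamar/lib/mcserver.xml 2G
```

MCS still has to be restarted afterwards, as in step 8.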

Hope this helps.

Tuesday, 10 January 2017

Avamar Virtual Edition 7.1: Failed To Communicate To vCenter During Client Registration

Once you set up your Avamar Server, the next step is to add the VMware vCenter to it as a client. You provide the vCenter IP/FQDN along with the administrator user credentials and the default HTTPS port 443. However, the registration fails with:

Failed to communicate to vCenter. Unable to find valid certification path to the vCenter. 


The error occurs because Avamar cannot validate the VMware vCenter certificate. To get past this, we have to force Avamar to accept the vCenter certificate.

Log in to the Avamar Server over SSH and edit the mcserver.xml file:
# vi /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml

Locate the vCenter certificate ignore parameter by searching for /cert in the vi editor. Change its value from false to true and save the file.
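The /cert search in vi has a non-interactive equivalent, which can be handy for checking the current value before and after the edit. A small sketch, assuming only that the parameter's line contains the string cert, as the search above implies; find_cert_entries is an illustrative name.

```shell
# find_cert_entries FILE
# Prints every line of FILE that mentions "cert" (case-insensitive),
# prefixed with its line number. Illustrative helper only.
find_cert_entries() {
    grep -n -i 'cert' "$1"
}

# Example (path taken from this post):
# find_cert_entries /usr/local/avamar/var/mc/server_data/prefs/mcserver.xml
```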

Switch to the admin user with sudo su - admin and run the following script to restart MCS:
# mcserver.sh --restart

After this, you should be able to add the vCenter as a client to Avamar successfully.


That should be it.

Sunday, 8 January 2017

Part 5: Creating Recovery Plans In SRM 6.1

Part 4: Creating protection groups for virtual machines in SRM 6.1

Once you create a protection group, it's time to create a recovery plan. When you want to perform a DR test or a test recovery, it is the recovery plan that you will execute. A recovery plan runs a set of steps in a particular order to fail over the VMs, or to test the failover, to the recovery site. You cannot change the workflow of the recovery plan; however, you can customize it by adding your own checks and tasks in between.

Select the production site in the SRM inventory and, under the Summary tab, select Create a recovery plan.


Provide a name for the recovery plan and an optional description and click Next.


Select the recovery site where you want the VMs to fail over to and click Next.


Set the Group type to VM protection groups, then select the protection groups to add to this recovery plan. Only the VMs in the protection groups added to the recovery plan will be failed over in the event of a disaster. Click Next.


SRM also offers a Test Recovery. A test recovery performs a test failover of the protected VMs to the recovery site without affecting the running production VMs or their network identity. A test network, or bubble network (a network with no uplinks), is created on the recovery site, and the VMs are placed there and brought up to verify that the recovery plan works. Keep the default auto create settings and click Next.


Review your recovery plan settings and click Finish to complete the create recovery plan wizard.


If you select the protected site, then Related Objects, then Recovery Plans, you will see this recovery plan listed.


If you select Recovery Plans in the Site Recovery inventory, you will see the status of the plan and its related details.


Before you test your recovery, you will have to configure this recovery plan. Browse to Recovery Plans, Related Objects, Virtual Machines. The VMs available under this recovery plan will be listed. Right-click the virtual machine and select Configure Recovery.


There are two options here, Recovery properties and IP customization.

The recovery properties cover the order of VM startup, VM dependencies and any additional steps that have to be carried out during and after power-on.

Since I have just one virtual machine in this recovery plan, the priority and the dependencies do not really matter. Set these options according to your requirements.


In the IP Customization option, you will provide the network details for the virtual machine in the Primary and the Recovery Site.


Select Configure Protection and you will be asked to configure the IP settings of the VM in the protected site. If VMware Tools is running on this machine (recommended), click Retrieve and the IP settings will be populated automatically. Click the DNS option and enter the DNS IP and the domain name manually. Click OK to complete. The same steps have to be performed for the recovery site under Configure Recovery; however, all the IP details have to be entered manually (if DHCP is not used), since there is no powered-on VM with VMware Tools on the recovery site.


Once both are complete, you should see the below information in the IP Customization section. Click OK to finish configuring VM recovery.


Once this is done for all the virtual machines in the recovery plan, the plan customization is complete and the plan is ready to be tested. You can also use the DR IP Customization tool to configure VM recovery settings.

In the next article, we will have a look at testing a recovery plan.

Part 4: Creating Virtual Machine Protection Groups In SRM 6.1

Part 3: Configuring Inventory Mappings in SRM 6.1

Once you configure inventory mappings, you will then have to configure protection groups. A protection group contains the virtual machines that SRM will fail over in the event of a disaster.

Select the protected site and in the Summary tab under Guide to configuring SRM click Create a protection group.


Specify a name for the protection group and, optionally, a description.


Next, choose the direction for the protection group and the protection group type. In this setup, vcenter-prod is my production site and vcenter-dr is my recovery site.

I am using a vSphere Replication (vR) appliance, which provides host-based replication, so the protection group type is vSphere Replication; this lets us choose the VMs being replicated by vR.
You cannot have vR and array-based replication VMs in the same protection group.


I have one virtual machine being replicated by vR. Check the required virtual machine and click Next.


Review the settings and click Finish to complete creating the protection group.


Under the production site, Related Objects, Protection Groups, you will be able to see the protection group listed.


Under the Protection Group option in the SRM inventory, you will see the same protection group listed.


And now in the recovery site, under the Recovered resource pool (selected during Inventory Mapping) you will be able to see the placeholder virtual machine.


With this, we have successfully created a protection group for virtual machines.

Part 5: Creating Recovery Plan in SRM 6.1

Thursday, 5 January 2017

Part 3: Configuring Inventory Mappings In SRM 6.1


Once the SRM sites are paired, you will then have to configure inventory mappings on the production site. Inventory mappings tell SRM which resources the virtual machines use in the protected site and which corresponding resources the same VMs should use in the recovery site after a failover. Placeholder virtual machines can be created only once inventory mappings are established successfully. If inventory mappings are not configured, each VM has to be configured individually for resource availability after a failover.

In the below screenshot you can see that inventory mappings can be created for resource pools, folders, networks and datastores. Only Site Pairing has a green check, as this was done in the previous article.


First we will configure resource mappings. Click Create Resource Mappings. In the Production vCenter section, select the Production resource pool. The virtual machine Router, which we replicated earlier, resides in this resource pool. Select the appropriate resource pool on the recovery site as well. Click Add Mappings and you should now see the direction of the mapping in the bottom section. Click Next.


If you would like to establish re-protection, that is, to fail back from the recovery site to the protected site, check the option under Prepare reverse mappings. This is optional; right now I do not want reverse mappings, so I will leave it unchecked. Click Finish.


Back in the Summary tab, the resource mapping now has a green check. Next we will configure folder mappings. In the Summary tab, click Create Folder Mappings. I will configure this manually, so select Prepare mappings manually. Click Next.


As with the resource mapping, select the folder on the protected site that holds the protected virtual machines, select the appropriate folder on the recovery site and click Add Mappings. Once the direction is displayed, click Next.


I will not be configuring any reverse mappings, so will leave this option unchecked and click Finish.


Once the Folder mapping is green checked, click Create Network Mapping and again this will be a manual configuration. Click Next.


In the same way, select the network on the Protected site where the protected VMs will be and an appropriate network on the recovery site and click Add Mappings. Then click Next.


Test networks are created by default for testing your recovery plans. The production VMs keep running as they are, and a failed-over instance of these VMs is brought up on an isolated test network on the recovery site. Click Next.


Again, no reverse mappings here. I will simply click Finish.


With this the inventory mapping for resource, network and folder should be completed.

Next we will be configuring the Placeholder Datastore.

For every virtual machine in the protected site, SRM creates a placeholder VM on the recovery site. This placeholder VM resides on a datastore, and that datastore is called the placeholder datastore. Once you specify the placeholder datastore, SRM creates VM files on that datastore in the recovery site and uses them to register placeholder VMs in the recovery site inventory.

Two key requirements for placeholder datastore:
1. If there is a cluster in recovery site, the placeholder datastore must be visible to all the hosts in that cluster
2. You cannot use a replicated datastore as a placeholder datastore.

Select the Recovery Site in the SRM inventory and click Configure Placeholder Datastore. From the list below, select the required datastore to be used for placeholder and click OK.


If you would like to establish re-protection and fail-back, then you will have to select the placeholder datastore on the production site as well. You will have to click the protected site in the SRM inventory and configure placeholder datastore again.

With this, the inventory mapping will be completed.

Part 4: Creating virtual machine protection groups in SRM 6.1

Wednesday, 4 January 2017

VDP 6.1.3 - ESXi 5.1 Compatibility Issues

The VMware interoperability matrix says VDP 6.1.3 is compatible with ESXi 5.1. However, if you back up a VM while VDP is running on an ESXi 5.1 host, the backup fails. I tested this with the following setup.

I deployed a VDP 6.1.3 appliance on an ESXi 5.1 host (Build 1065491) and created a virtual machine on the same host. Then I ran an on-demand backup for this VM and it failed immediately. The backup job log had the following entries:

2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:SSL Error: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: connect failed (1)
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:CnxAuthdConnect: Returning false because SSL_ConnectAndVerify failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:CnxConnectAuthd: Returning false because CnxAuthdConnect failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:Cnx_Connect: Returning false because CnxConnectAuthd failed
2017-01-04T21:13:36.724-05:-30 avvcbimage Info <16041>: VDDK:Cnx_Connect: Error message:
2017-01-04T21:13:36.724-05:-30 avvcbimage Warning <16041>: VDDK:[NFC ERROR] NfcNewAuthdConnectionEx: Failed to connect to peer. Error:
2017-01-04T21:13:36.742-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.705-05:-30 avvcbimage Info <16041>: VDDK:NBD_ClientOpen: attempting to create connection to vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: Unknown SSL Error
2017-01-04T21:13:36.723-05:-30 avvcbimage Info <16041>: VDDK:SSL Error: error:14077102:SSL routines:SSL23_GET_SERVER_HELLO:unsupported protocol
2017-01-04T21:13:36.723-05:-30 avvcbimage Warning <16041>: VDDK:SSL: connect failed (1)
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-DSCPTR: : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : Failed to open NBD extent.
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-LINK  : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : failed to open (NBD_ERR_NETWORK_CONNECT).
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-CHAIN : "vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902" : failed to open (NBD_ERR_NETWORK_CONNECT).
2017-01-04T21:13:36.743-05:-30 avvcbimage Info <16041>: VDDK:DISKLIB-LIB   : Failed to open 'vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902' with flags 0x1e NBD_ERR_NETWORK_CONNECT (2338).
2017-01-04T21:13:36.780-05:-30 avvcbimage Info <16041>: VDDK:NBD_ClientOpen: attempting to create connection to vpxa-nfcssl://[datastore1 (2)] Thick/Thick.vmdk@10.109.10.171:902
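When triaging a failed job, the tell-tale entry in the excerpt above is the unsupported protocol error. The sketch below simply checks a log file for that string; has_tls_mismatch is an illustrative name, and the log path in the example is a placeholder because the job log location is not given in this post.

```shell
# has_tls_mismatch FILE
# Succeeds (exit status 0) if FILE contains the SSL handshake error
# seen in the excerpt above. Illustrative helper only.
has_tls_mismatch() {
    # -F: literal match, -q: no output, only the exit status
    grep -qF 'SSL23_GET_SERVER_HELLO:unsupported protocol' "$1"
}

# Example (placeholder path):
# if has_tls_mismatch /path/to/backup-job.log; then
#     echo "SSL/TLS protocol mismatch between VDDK and the ESXi host"
# fi
```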

The VDDK (VixDiskLib) release on VDP 6.1.3 is:
2017-01-04T21:13:27.244-05:-30 avvcbimage Info <16041>: VDDK:VMware VixDiskLib (6.5) Release build-4241604

Cause:
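Refer to "Backward compatibility of TLS" in the VDDK release article. The unsupported protocol error in the log shows that the VDDK library on VDP and the ESXi 5.1 host could not agree on an SSL/TLS protocol version.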

Recommended Fix:
Upgrade ESXi to 5.5 U3e or later.

Workaround:
On the VDP appliance, edit the below file:
# vi /etc/vmware/config
Add the below line:
tls.protocols=tls1.0,tls1.1,tls1.2

Save the file. There is no need to restart any services. Re-run the backup job and the backups should now complete successfully. 
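If the workaround has to be applied on several appliances, the edit can be made idempotent so that running it twice does not duplicate the line. A minimal sketch under stated assumptions: enable_legacy_tls is a made-up helper name, and it assumes the config file ends with a newline, as text files normally do.

```shell
# enable_legacy_tls FILE
# Appends the tls.protocols line from the workaround to FILE, unless a
# tls.protocols setting is already present. Illustrative helper only.
enable_legacy_tls() {
    file="$1"
    line='tls.protocols=tls1.0,tls1.1,tls1.2'
    if grep -q '^tls\.protocols=' "$file" 2>/dev/null; then
        return 0    # already configured; leave the existing value alone
    fi
    printf '%s\n' "$line" >> "$file"
}

# Example, run as root on the VDP appliance (path taken from this post):
# enable_legacy_tls /etc/vmware/config
```

Note that the helper leaves an existing tls.protocols line untouched even if its value differs, so a differing value still needs a manual check.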

Part 2: Pairing Sites in Site Recovery Manager 6.1

Part 1: Installing Site Recovery Manager

In this article we will see how to pair the two SRM sites we installed previously. I have two SRM sites, a Production site and a DR site. After a fresh install the sites are not paired, and you will see the message "Site is not paired" for both the Production and the Recovery site. The sites need to be paired before a failover can take place. Click the Pair Site option in the center of the screen.


Enter the Platform Services Controller of the DR site and click Next.


The SRM plugin extension will be com.vmware.vcDr unless you chose to create a custom Plugin ID during install. Once the PSC details are provided in the previous step, the corresponding vCenter is populated automatically. Select the vCenter, provide the SSO user credentials for authentication and click Finish.


If you are presented with a certificate warning, click Yes to proceed. The pairing should now be completed.


You can now see from the Summary tab that the sites have been paired. The paired site details are also populated in the "Paired Site" section.


That's it.

Part 3: Configuring Inventory Mappings in SRM 6.1