Friday, 30 December 2016

Part 2: Pair Replication Sites And Configuring Replication

In the earlier article, we saw how to deploy and configure the vSphere Replication appliance.

In this article, we will see how to pair the replication sites and configure replication for a virtual machine. vSphere Replication 6.1 is managed via the vSphere Web Client, so you will have to log in to the web client as an administrator user. Initially, the production vCenter site shows no target sites for replication.


Click the "Connect to target site" option. Since my vCenters are in linked mode, I will choose "Connect to a local site" and select the DR site vCenter. If your vCenters are not in linked mode, choose the "Connect to a remote site" option and provide the PSC details of the remote site.


Once the vCenter site is selected and the configuration is done, the Target Sites section is populated with the DR vCenter details.


Similarly in the DR site, you will see the production site vCenter in the Target Sites section.


Once this is done, we can proceed to configure replication for a VM. In my case, I will be choosing a VM called Router, which has almost no data on its VMDK because it boots off a floppy. This makes it a quick test replication to verify connectivity.

Right click the VM > All vSphere Replication Actions > Configure Replication.


We will be replicating from the Production vCenter to the DR vCenter, hence I will choose Replicate to a vCenter Server.


Select the DR site vCenter as the Target Site to send the replication data to.


Here we do not have additional replication servers deployed. Additional replication servers can be deployed to handle a large replication load. If there are none, the replication appliance itself handles this traffic, so I will keep the default, Auto-assign Replication Server.


Select the datastore where the replicated data should reside in the Target Location section.


I will not check Quiescing or Network Compression. Enable them if your requirements call for them.


In this section, you specify the RPO and Point In Time Copy settings for the replication of that specific VM. The lower the RPO, the more frequently replication is initiated, which keeps the replica closer to the source but also puts a heavier load on the network.

Point in time instances determine how many replicated instances are retained. If you keep 5 instances for 1 day, then when the VM is recovered from the replica there will be 5 snapshots available, and you can revert to whichever one you need. After the 1 day mark those 5 instances are removed and the next 5 new instances are kept.

The more instances you retain, the more space is used on the destination datastore.
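As a rough illustration of the trade-off described above, the following Python sketch works out how many replication cycles per day a given RPO implies and how many point in time instances a retention setting keeps. The function names and the average per-instance size are hypothetical and not part of vSphere Replication; real network load and space usage depend on the VM's change rate.

```python
def syncs_per_day(rpo_minutes: int) -> int:
    """Approximate number of replication cycles per day needed to meet an RPO."""
    if rpo_minutes <= 0:
        raise ValueError("RPO must be a positive number of minutes")
    minutes_per_day = 24 * 60
    # Ceiling division: a partial interval still needs one more cycle.
    return -(-minutes_per_day // rpo_minutes)


def retained_instances(instances_per_day: int, days: int) -> int:
    """Number of point in time instances kept for a retention setting."""
    return instances_per_day * days


def estimated_extra_space_gb(instances_per_day: int, days: int,
                             avg_instance_gb: float) -> float:
    """Very rough extra datastore space, assuming a fixed size per instance.

    avg_instance_gb is an assumed figure you supply yourself.
    """
    return retained_instances(instances_per_day, days) * avg_instance_gb
```

For example, `syncs_per_day(15)` gives 96 cycles for a 15 minute RPO, and `retained_instances(5, 1)` gives the 5 instances from the example above.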


Once the replication is configured, the Initial Full Sync will start.
Full Sync transfers all the VM data to the DR site. This can take some time depending on how big the source VM is. After the full sync, replication is incremental, and the changes are tracked in the persistent state file (.psf).


Once the Full Sync completes, the Status will be OK and the replication details will be populated.


And now, if you browse the datastore which was configured to retain the replicated data, you will see the below files for the replicated VM.


Part 3: Recover a virtual machine using vSphere Replication. 

Connecting VDP To Web Client Causes The Screen To Gray Out Indefinitely

Quite a while back there was a known issue in version 6.1 of VDP when the appliance resided on a distributed switch. Clicking the Connect button for VDP in the Web Client caused the screen to gray out completely until a manual refresh was done. The resolution to this can be found here

This is a similar issue, but it is seen when VDP is not residing on a distributed switch. The deployment was a simple one: one vCenter, a few ESXi hosts, a handful of VMs and a vSphere Data Protection appliance. Connecting to VDP in the web client caused the screen to gray out indefinitely. However, we were able to log in to the vdp-configure page and SSH into the appliance without issues.

In the vdr-server.log the following was noticed when the connect operation was in progress.

2016-12-29 14:00:30,302 ERROR [Thread-4]-vi.ViJavaServiceInstanceProviderImpl: Failed To Create ViJava ServiceInstance
com.vmware.vim25.InvalidLogin
        at sun.reflect.GeneratedConstructorAccessor195.newInstance(Unknown Source)
        at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(Unknown Source)
        at java.lang.reflect.Constructor.newInstance(Unknown Source)
        at java.lang.Class.newInstance(Unknown Source)
        at com.vmware.vim25.ws.XmlGen.fromXml(XmlGen.java:205)
        at com.vmware.vim25.ws.XmlGen.parseSoapFault(XmlGen.java:82)
        at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:170)
        at com.vmware.vim25.ws.VimStub.login(VimStub.java:1530)
        at com.vmware.vim25.mo.SessionManager.login(SessionManager.java:164)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:143)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:95)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:252)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:150)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:92)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.getViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:70)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.waitForViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:166)
        at com.emc.vdp2.server.VDRServletLifeCycleListener$1.run(VDRServletLifeCycleListener.java:73)
        at java.lang.Thread.run(Unknown Source)

And in the mcserver.out log, the following was noticed:

Exception running : VMWare
Caught Fault -
Type : com.vmware.vim25.InvalidLogin
Actor : null
Code : null
Reason : Cannot complete login due to an incorrect user name or password.
Fault String : Cannot complete login due to an incorrect user name or password.

This can happen if there is a registration issue between VDP and vCenter, or if the password of the user that was used to register VDP with vCenter has been changed.
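If you would rather not scroll through the logs by hand, a small Python sketch like the one below can flag the lines that carry the InvalidLogin fault. It assumes you have already copied the log text off the appliance; the function name and the sample text (taken from the excerpt above) are only for illustration.

```python
from typing import List

FAULT = "com.vmware.vim25.InvalidLogin"

# Shortened sample based on the vdr-server.log excerpt shown above.
SAMPLE = (
    "2016-12-29 14:00:30,302 ERROR [Thread-4]-vi.ViJavaServiceInstanceProviderImpl: "
    "Failed To Create ViJava ServiceInstance\n"
    "com.vmware.vim25.InvalidLogin\n"
    "        at com.vmware.vim25.ws.VimStub.login(VimStub.java:1530)\n"
)


def find_invalid_login(log_text: str) -> List[int]:
    """Return the 1-based line numbers that mention the InvalidLogin fault."""
    return [
        number
        for number, line in enumerate(log_text.splitlines(), start=1)
        if FAULT in line
    ]
```

An empty list means the fault does not appear in the text you passed in, in which case the gray screen probably has a different cause.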

To resolve this:

1. Verify which user is being used to register VDP with vCenter:
# less /usr/local/vdr/etc/vcenterinfo.cfg

2. Log in to the vdp-configure page and re-register VDP with vCenter. During the re-registration, use the same user name and the current password for that user. If you are unsure of the password, try logging in to vCenter with those credentials. If the login works, the password is valid.

Click here for the steps to re-register VDP to vCenter

3. Once the registration is complete, restart the tomcat service on the VDP:
# emwebapp.sh --restart

After this, refresh the web client. You should now be able to connect to the data protection appliance.

Thursday, 29 December 2016

Part 1: Installing Site Recovery Manager 6.1

Site Recovery Manager is a DR solution from VMware that ensures business continuity in the event of a site failure. The VMs configured for protection are failed over to the recovery site to keep downtime to a minimum.

For a list of Site Recovery Manager prerequisites, see this link here. In this article we will see how to install SRM in the production site. We will not cover the installation steps for the recovery site, as they are the same as for the production site.

Download the required version of SRM from the MyVMware downloads page. Ensure that the same version of SRM is used in the production and recovery sites.

Run the exe file and select the language for the Install Wizard to proceed.



The installation process begins, and the files are extracted and prepared for installation.



You will be presented with the first page of the installation wizard where you can confirm the version of SRM being installed in the bottom left corner. Click Next to begin the installation.


You will be presented with the Copyright page. Simply go ahead and click Next.


Read through the EULA, accept it and click Next.


Select the directory in which you would like to install SRM. By default it is on the C drive. Click Next.


SRM can be installed whether or not the vCenter sites are in Enhanced Linked Mode (ELM). If the 2 vCenter sites are not in ELM, each can be either an embedded deployment or an external PSC deployment. If the 2 vCenter sites are in ELM, both sites will be external PSC deployments.

In this case, I have two vCenter sites in ELM, hence with external Platform Services Controllers. In the Address section, enter the FQDN of the PSC node. Provide the SSO user for the Username and its password. Click Next.


You will be presented with the PSC certificate. The recommendation here is to deploy the vCenter and PSC nodes with FQDNs and register SRM to them via FQDN only. This way, if you later need to change the IP address of the vCenter or SRM, you can do so without breaking any certificates.
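To make the FQDN recommendation concrete, here is a minimal Python sketch that checks whether an address you are about to type into the installer is an IP literal or looks like a fully qualified name. The helper names and the example host names are hypothetical, and the check is only a sanity test: it does not verify that the name resolves in DNS.

```python
import ipaddress


def is_ip_literal(address: str) -> bool:
    """True if the string parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(address)
        return True
    except ValueError:
        return False


def looks_like_fqdn(address: str) -> bool:
    """True for a dotted host name that is not an IP address.

    A short name without a domain part (for example "psc01") is rejected too.
    """
    name = address.strip().rstrip(".")
    return bool(name) and "." in name and not is_ip_literal(name)
```

For example, `looks_like_fqdn("psc01.lab.local")` is true, while `looks_like_fqdn("10.0.0.5")` and `looks_like_fqdn("psc01")` are false.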

Accept the PSC certificate.


You will be provided with the respective vCenter Server for the previously entered PSC address. Verify that we are registering the SRM to the correct vCenter server and click Next.


You will now be presented with the vCenter Server Certificate in the same way you were presented with the PSC earlier. Accept the vCenter certificate to proceed further. 


The Local Site Name will be populated by default. Enter the administrator email address for notifications. The Local Host IP will be the IP of the Windows Server hosting this SRM node. Click Next.


You will be presented with the SRM Plugin ID page. Keep the default plugin option. A custom SRM plugin ID is only worth using if you are setting up shared recovery.


If the vCenter is using a default certificate, use a default certificate for your SRM node as well. Choose the Automatically generate a certificate option and click Next.


Provide the Org and OU details for the self-signed SRM certificate and click Next.


This is a new installation of SRM, so I will be using the embedded Postgres database. If you are using SQL, choose a custom database server and provide the DSN that was created on the SRM box. Click Next.


Provide the DSN information for the embedded Postgres database and click Next.


Select the service account under which the SRM service should run and click Next. After this, click Finish to begin the installation.


Once the installation completes for the primary site, perform the same steps again for the recovery (DR) site.

Now, when you log in to the web client of either the primary or the DR vCenter, you will be able to see both SRM sites (this is because both of my vCenters are in ELM). If your vCenters are standalone, you will see only the SRM instance registered with that vCenter node.



Part 2: Pairing sites in Site Recovery Manager.

Wednesday, 28 December 2016

vSphere 6.0 Web Client Does Not Load After An IP Change

Recently, I was tasked with changing the IP addresses of my lab environment from a 192.x range to a 10.x range. The vCenter deployment was an appliance with an external Platform Services Controller.
All the deployment and configuration was done with FQDNs, so changing the IP addresses was not expected to be a problem.

I changed the IP of the vCenter and PSC nodes from the Web Management Interface and restarted both nodes. However, after the restart the web client did not load and I saw the below message:


When I checked service --status-all on the vCenter node, I saw that a number of services were not running. I started these services manually from the CLI using this KB here

However, that did not help. Checking the hosts file on both the PSC and vCenter nodes showed that it was still using the old IP address.



All we have to do here is edit this file and change the IP from the old address to the new one. Then save the file and restart both the PSC and vCenter nodes.
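The edit itself is a one-line change, but if you have several nodes to fix, the same substitution can be sketched in Python. This is only an illustration that works on the text of a hosts file. The function name, addresses and host names below are made up, and you should back up the real file before changing it.

```python
def replace_host_ip(hosts_text: str, old_ip: str, new_ip: str) -> str:
    """Swap old_ip for new_ip on hosts-file lines that start with old_ip.

    Comment lines and entries for other addresses are left untouched.
    """
    updated = []
    for line in hosts_text.splitlines():
        fields = line.split()
        # Match on the whole first field so 192.168.1.10 does not
        # accidentally match 192.168.1.100.
        if fields and fields[0] == old_ip:
            line = line.replace(old_ip, new_ip, 1)
        updated.append(line)
    trailing_newline = "\n" if hosts_text.endswith("\n") else ""
    return "\n".join(updated) + trailing_newline


# Hypothetical lab entries, before the change.
BEFORE = (
    "127.0.0.1 localhost\n"
    "192.168.1.10 vc.lab.local vc\n"
    "# 192.168.1.10 was the original address\n"
)
AFTER = replace_host_ip(BEFORE, "192.168.1.10", "10.0.0.10")
```

Here `AFTER` keeps the localhost and comment lines as they were and only rewrites the vCenter entry.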

Now, all the services in vCenter will start automatically and the web client will load without issues.

Hope this helps.

Tuesday, 27 December 2016

VDP Upgrade To 6.1.3 Fails: Installation Of The Package Stalled

vSphere Data Protection 6.1.3 is a pretty stable release in the 6.x chain. However, there is an issue getting to this version from a prior release, specifically when upgrading VDP from 6.1 to 6.1.3. The upgrade fails at around 60 percent with the error "Installation of the package stalled" and prompts you to revert to the snapshot taken prior to the upgrade.

The avinstaller.log has the following:

INFO: Working on task: Starting MCS (74 of 124) id: 10312
Dec 12, 2016 1:07:32 PM com.avamar.avinstaller.monitor.PollMessageSender sendMessage
INFO: PollMessageSender sent: <UpdateMessage><Progress>60</Progress><TaskName>Starting MCS (74 of 124)</TaskName><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><TaskId>10312</TaskId><Timestamp>2016/12/12-13:07:32.00489</Timestamp></UpdateMessage>
INFO: Package Location in process: /data01/avamar/repo/temp/vSphereDataProtection-6.1.3.avp_1481542766965
<UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:07:38.00626</Timestamp><Content>Starting MCS ...</Content></UpdateMessage>
Dec 12, 2016 1:07:38 PM com.avamar.avinstaller.process.TaskTimeManager resetTimer
INFO: startTime: 1481544458653
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener receiving message
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener forwarding message: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<UpdateMessage><Content><![CDATA["mcserver.sh --start", exit status=1 (error)]]></Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.process.MessageReceiver receiveMessage
INFO: MessageReceiver receive message: <UpdateMessage><Timestamp>2016/12/12-13:08:38.00890</Timestamp><Content>"mcserver.sh --start", exit status=1 (error)</Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.monitor.PollMessageSender sendMessage
INFO: PollMessageSender sent: <UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:08:38.00890</Timestamp><Content>"mcserver.sh --start", exit status=1 (error)</Content></UpdateMessage>
Dec 12, 2016 1:08:38 PM com.avamar.avinstaller.process.TaskTimeManager resetTimer
INFO: startTime: 1481544518935
Dec 12, 2016 1:08:43 PM com.avamar.avinstaller.monitor.MessageListener doPost
INFO: MessageListener forwarding message: <?xml version="1.0" encoding="UTF-8" standalone="yes" ?>
<UpdateMessage><Content><![CDATA[----->start MCS failed - operation failed]]></Content></UpdateMessage>
<UpdateMessage><ProcessKey>VSphereDataProtection20161020Oct101476964022</ProcessKey><ProcessInstanceId>VSphereDataProtection20161020Oct101476964022.10007</ProcessInstanceId><Timestamp>2016/12/12-13:08:43.00689</Timestamp><Content>-----&gt;start MCS failed - operation failed</Content></UpdateMessage>

Dec 12, 2016 1:08:44 PM com.avamar.avinstaller.process.handler.ServerScriptHandler$TaskProcessor run
WARNING: From err out: 12/12 13:08:38 error: "mcserver.sh --start", exit status=1 (error)
12/12 13:08:43 error: ----->start MCS failed - operation failed

Dec 12, 2016 1:50:04 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit stderr:
Dec 12, 2016 2:50:00 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush command: /usr/local/avamar/bin/avinstaller.pl --flush
Dec 12, 2016 2:50:03 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit code: 1
Dec 12, 2016 2:50:03 PM com.avamar.avinstaller.util.FlushTimer run
INFO: Flush exit stdout: Flushing Avinstaller...
INFO: Flushing AvInstaller.
INFO: AvInstaller flushed.
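The log above is noisy, so as a rough aid, the Python sketch below pulls out the commands that the installer reported with a non-zero exit status. It relies only on the `"<command>", exit status=N` wording visible in the excerpt; the function name and sample text are illustrative, and other VDP versions may word these lines differently.

```python
import re
from typing import List, Tuple

# Matches fragments such as: "mcserver.sh --start", exit status=1
EXIT_STATUS = re.compile(r'"([^"]+)", exit status=(\d+)')

# Shortened sample based on the avinstaller.log excerpt shown above.
SAMPLE = (
    "INFO: Working on task: Starting MCS (74 of 124) id: 10312\n"
    '<UpdateMessage><Content><![CDATA["mcserver.sh --start", '
    "exit status=1 (error)]]></Content></UpdateMessage>\n"
    'WARNING: From err out: 12/12 13:08:38 error: "mcserver.sh --start", '
    "exit status=1 (error)\n"
)


def failed_commands(log_text: str) -> List[Tuple[str, int]]:
    """Return unique (command, exit status) pairs with a non-zero status."""
    failures: List[Tuple[str, int]] = []
    for command, status in EXIT_STATUS.findall(log_text):
        entry = (command, int(status))
        if entry[1] != 0 and entry not in failures:
            failures.append(entry)
    return failures
```

Run against the sample, this reports the single failing step, the MCS start, which is the point where the upgrade stalls.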

Unfortunately there is no fix available for this issue. This is a known issue while upgrading from 6.1 to 6.1.3. See the release notes here

The workaround is to upgrade from 6.1 to 6.1.2 and then upgrade from 6.1.2 to 6.1.3.
If this workaround fails, please raise a ticket with VMware Support. I cannot provide the resolution here due to confidentiality.

Hope this helps.

Tuesday, 20 December 2016

Part 1: Deploying And Configuring vSphere Replication 6.x

If you are using Site Recovery Manager, you will be using some form of replication to copy the VM data from the protected site to the recovery site. There are two ways to get this done: host-based replication and array-based replication.

vSphere Replication is a host-based replication solution that replicates VMs via the VMkernel. In this case, I have two vCenters, one in the protected site and the other in the recovery site. From the 6.x version onward, the replication appliance is deployed entirely from the web client.

Right click any one of the hosts and select Deploy OVF Template.


Browse the local file location to access the .ovf file of this appliance. Select this file and click Open.
Once this file is loaded, click Next


You will be presented with the OVF details regarding the Vendor and the version of replication being deployed. Click Next


Accept EULA and click Next


Provide a name for the replication appliance and a destination folder for deployment, then click Next.


You can switch between 2 or 4 vCPU for this appliance. I will stick to the default. Click Next.


Select a Datastore to deploy this appliance on and choose an appropriate disk provisioning. Click Next.


Select the network where this appliance should reside, along with the IP protocol and allocation type. If the type is Static - Manual, then enter the Gateway, DNS and Subnet details.


Enter the root password for your vR and provide the NTP server details. Expand Networking Properties and enter the IP for this appliance.


Review the changes and then complete the OVF deployment.


After deploying the appliance, you will have to register it with the vCenter. To do this, go to the management URL of the vR appliance.

https://vR-IP:5480


Log in with the root credentials that were set up during the deployment. Here, there are a few details to be filled in:

For the LookupService Address: If your vCenter is embedded PSC deployment, then enter the FQDN of the vCenter Server. If the vCenter is an external PSC deployment, then enter the FQDN of the PSC appliance. Enter the SSO user name and the password for the same.
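The rule for the LookupService address is simple enough to write down as code. The Python sketch below is only a restatement of the paragraph above plus the management URL pattern shown earlier. The function names and host names are hypothetical.

```python
from typing import Optional


def lookup_service_host(vcenter_fqdn: str, psc_fqdn: Optional[str] = None) -> str:
    """Pick the host to enter as the LookupService address.

    Embedded PSC deployment: pass only the vCenter FQDN.
    External PSC deployment: pass the PSC FQDN as well, and it wins.
    """
    return psc_fqdn if psc_fqdn else vcenter_fqdn


def vr_management_url(vr_host: str) -> str:
    """Build the appliance management URL on port 5480."""
    return "https://{}:5480".format(vr_host)
```

For an external PSC deployment, `lookup_service_host("vc01.lab.local", "psc01.lab.local")` returns the PSC name. With no PSC argument it falls back to the vCenter name.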

Click Save and Restart Service. If you are asked to accept certificate information from vCenter / PSC, accept it. After this, the replication appliance will be configured for your vCenter.


You should now see the appliance in the vR plugin in the web client. I have two appliances because one is for the protected site and the other is for the recovery site.


With this, the deployment and configuration of vR to vCenter is completed.

Part 2: Configure replication sites and configuring replication for a VM.

Monday, 19 December 2016

vSphere Client Console Does Not Display Full Screen

While opening a console for any virtual machine from one particular workstation, the console display was not sized correctly. Below is the screenshot of the display.


This was seen on only one machine, a Windows 10 workstation, for any user session, and a reinstall of the vSphere Client did not resolve it.

To resolve this, you will have to disable Display Scaling.

1. Right click the vSphere Client icon and select Properties
2. Click Compatibility
3. Check the box "Disable display scaling on high DPI settings".
4. Apply Settings
5. Reload the vSphere Client session.

Now, the console session should fill the full screen.

Hope this helps.