Tuesday, 15 November 2016

vSphere 6.5: What is vCenter High Availability

In 6.0 we had the option to provide high availability for the Platform Services Controller by deploying redundant PSC nodes in the same SSO domain and using either a manual repoint command or a load balancer to switch to a new PSC if the current one went down. For vCenter nodes, however, there was no such option, and the only way to provide HA for the vCenter node was to either configure Fault Tolerance or place the vCenter virtual machine in an HA-enabled cluster.

Now, with the release of vSphere 6.5, a much-awaited feature has been added to provide redundancy and high availability for the vCenter node as well: VCHA, the vCenter High Availability feature.

The design of VCHA is somewhat similar to a regular clustering mechanism. Before we get to how it works, here are a few prerequisites for VCHA:

1. Applicable to the vCenter Server Appliance only. Embedded VCSA is currently not supported.
2. Three unique ESXi hosts, one for each node (Active, Passive, and Witness).
3. Three unique datastores, one for each of these nodes.
4. The same Single Sign-On domain for the Active and Passive nodes.
5. One public IP to access and use vCenter.
6. Three private IPs in a subnet different from that of the public IP. These are used for internal communication to check node state (see the example after this list).
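As an example of point (6), here is a purely hypothetical addressing plan (names and addresses are illustrative only, not from any real deployment):

Public (management) IP  : 10.10.10.50/24      - used by clients to reach vCenter
Active node private IP  : 192.168.100.11/24   - VCHA network
Passive node private IP : 192.168.100.12/24   - VCHA network
Witness node private IP : 192.168.100.13/24   - VCHA network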

vCenter High Availability (VCHA) Deployment:
Once vCenter is configured for high availability, three nodes are deployed: the Active node, the Passive node, and the Witness (Quorum) node. The Active node is the one with the public-IP vNIC in the up state. This public IP is used to access and connect to the vSphere Web Client for management.

The second node is the Passive node, an exact clone of the Active node with the same memory, CPU, and disk configuration. On this node the public-IP vNIC is down and the private-IP vNIC is up. The private network between the Active and Passive nodes is used for cluster operations: the Active node's database and configuration files are updated regularly, and these changes are synced to the Passive node over the private network.
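A quick way to see which role a node currently holds is to check which vNIC carries the public IP from the appliance shell. This is a generic Linux check; I am assuming here that eth0 is the public/management NIC and eth1 the private HA NIC, so adjust for your layout:

# ip addr show eth0
# ip addr show eth1

On the Active node the public IP shows up on the management NIC, while on the Passive node only the private address is up.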

The third node, also called the Quorum node, acts as a witness. It exists to avoid the split-brain scenario that can arise from a network partition: we cannot have two Active nodes running at the same time, so the Quorum node decides which node is Active and which must remain Passive.

vPostgres replication is used for database replication between the Active and Passive nodes, and this replication is synchronous. The vCenter files are replicated using native Linux rsync, which is asynchronous.
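Purely to illustrate what asynchronous file replication with rsync looks like in general (this is not the exact invocation VCHA runs internally; the path and host name below are hypothetical):

# rsync -az --delete /storage/updates/ passive-node:/storage/updates/

The database, by contrast, is kept in sync synchronously by vPostgres replication, meaning a committed change exists on both nodes before it is acknowledged.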

What happens during a failover?

When the Active node goes down, the Passive node becomes Active and assumes the public IP address. The VCHA cluster enters a degraded state since one of the nodes is down. The failover is not transparent, and there is an RTO of roughly 5 minutes.

The cluster can also enter a degraded state when the Active node is healthy but either the Passive or the Witness node is down. In short, if any one node in the cluster is down, VCHA is in a degraded state. More about VCHA states and deployment will follow in a later article.

Hope this was helpful.

Friday, 11 November 2016

VDP / EMC Avamar Console: Login Incorrect Before Entering Password

Today, while deliberately simulating failed root login attempts, I ran into an issue. I had a VDP 6.1.2 appliance installed and opened a console to it. When prompted for the "root" credentials, I went ahead and entered a wrong password multiple times. At one point, as soon as I entered "root" at the VDP login prompt, it complained "Login incorrect".


The login was failing even before the password was entered. If I SSHed into the appliance (as root), I was able to access the VDP. Logging in as admin, either via console or SSH, also worked, and I could sudo to root from there.

In messages.log, the following was recorded during the failed root login from the VM console:

Nov  6 09:58:40 vdp58 login[26056]: FAILED LOGIN 1 FROM /dev/tty2 FOR root, Authentication failure

Checking which terminal the SSH session was on with the # tty command showed that the terminal in use was /dev/pts/0.
When I logged into the VDP console as admin, sudoed to root, and checked the terminal name, it was, as expected, /dev/tty2.
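For reference, here is what the two checks looked like side by side:

# tty          (from the SSH session)
/dev/pts/0
# tty          (from the VM console, logged in as admin and sudoed to root)
/dev/tty2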

The Fix:

1. Change your directory to:
# cd /etc
2. Edit the securetty file using vi
# vi securetty
The contents of this file were "console" and "tty1".
Add the terminal from which you are trying to log in as root, in my case tty2, and save the file.
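After the edit, my /etc/securetty looked like this:

console
tty1
tty2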

After this, I was able to log in as root from the VDP console.

This also applies to EMC Avamar Virtual Edition.


Thursday, 3 November 2016

VDP Status Error: Data Domain Storage Status Is Not Available

Today, when I logged into my lab to do a bit of testing on my vSphere Data Protection 6.1 appliance, I noticed the following in the VDP plugin in the Web Client:

It was stating, "The Data Domain storage status is not available. Unable to get system information."
So, I logged into the vdp-configure page, switched to the Storage tab, and noticed a similar error message.

My first instinct was to edit the Data Domain settings and try re-adding it. But that failed too, with the message below.


When a VDP appliance is configured with a Data Domain, these configuration errors are logged in the ddrmaintlogs, located under /usr/local/avamar/var/ddrmaintlogs.
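To quickly scan these logs for errors, a simple grep across the directory works (file names under this directory can vary, hence the wildcard):

# grep -i "error" /usr/local/avamar/var/ddrmaintlogs/*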

Here, I noticed the following:

Oct 29 01:54:27 vdp58 ddrmaint.bin[30277]: Error: get-system-info::body - DDR_Open failed: 192.168.1.200, DDR result code: 5040, desc: calling system(), returns nonzero
Oct 29 01:54:27 vdp58 ddrmaint.bin[30277]: Error: <4780>Datadomain get system info failed.
Oct 29 01:54:27 vdp58 ddrmaint.bin[30277]: Info: ============================= get-system-info finished in 62 seconds
Oct 29 01:54:27 vdp58 ddrmaint.bin[30277]: Info: ============================= get-system-info cmd finished =============================
Oct 29 01:55:06 vdp58 ddrmaint.bin[30570]: Warning: Calling DDR_OPEN returned result code:5040 message:calling system(), returns nonzero

Well, this tells part of the story: the appliance is unable to fetch the Data Domain system information. The next thing is to check the vdr-configure logs, located under /usr/local/avamar/var/vdr/server_logs.
All operations performed on the vdp-configure page are logged in the vdr-configure logs.

Here, the following was seen:

2016-10-29 01:54:28,564 INFO  [http-nio-8543-exec-9]-services.DdrService: Error Code='E30973' Message='Description: The file system is disabled on the Data Domain system. The file system must be enabled in order to perform backups and restores. Data: null Remedy: Enable the file system by running the 'filesys enable' command on the Data Domain system. Domain:  Publish Time: 0 Severity: PROCESS Source Software: MCS:DD Summary: The file system is disabled. Type: ERROR'

This is a much more detailed error. So, I logged into my Data Domain system with the "sysadmin" credentials and ran the command below to check the status of the file system:
# filesys status

The output was:
The filesystem is enabled and running.

The Data Domain was reporting that the file system was already up and running. Perhaps it was in a non-responding / stale state. So, I re-enabled the file system using:
# filesys enable
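It is worth re-running the status check afterwards to confirm the state:

# filesys status
The filesystem is enabled and running.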

After this, the Data Domain automatically reconnected to the VDP appliance and the correct status was displayed.

Tuesday, 1 November 2016

Avigui.html Shows Err_Connection_Refused in Avamar Virtual Edition 7.1

Recently I started deploying and testing EMC Avamar Virtual Edition, and one of the first issues I ran into was with the configuration. The deployment of the appliance is pretty simple: Avamar Virtual Edition 7.1 ships as a 7-Zip archive which, when extracted, provides the OVF file. Using the Deploy OVF Template option, I was able to get the appliance deployed. After this, as per the AVE (Avamar Virtual Edition) installation guide, I added the data drives, configured the networking for the appliance, and rebooted after a successful configuration.

However, when trying to access https://avamar-IP:8543/avi/avigui.html, I received the Err_Connection_Refused message. No matter what I tried, I was unable to get into the actual configuration GUI to initialize the services.

It turns out there were a couple of steps I had to run. There is a component called AvInstaller that is responsible for package installations, and it had to be installed first. To do this, SSH into the Avamar appliance as root (password: changeme) and browse to the directory below:
# cd /usr/local/avamar/src/
Run the AvInstaller bootstrap with the command below:
# ./avinstaller-bootstrap-version.sles11_64.x86_64.run
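Before heading back to the browser, you can optionally confirm that something is now listening on the port used by the avigui URL (8543). This is a generic check, assuming netstat is available on the appliance:

# netstat -tlnp | grep 8543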

Once this runs, go back to the same avigui.html URL and you should see the login screen.
That's pretty much it.

Saturday, 29 October 2016

Migrating VDP From 5.8 and 6.0 To 6.1.x With Data Domain

You cannot upgrade a vSphere Data Protection appliance from 5.8.x or 6.0.x to 6.1.x due to the difference in the underlying SUSE Linux version: the earlier versions of vSphere Data Protection used SLES 11 SP1, while 6.1.x uses SLES 11 SP3. Instead, we perform a migration.

This article only discusses migrating a VDP appliance from 5.8.x or 6.0.x with a Data Domain attached. For a VDP appliance without a Data Domain, you would choose the "Migrate" option in the vdp-configure wizard during the setup of the new 6.1.x appliance. That is not the path we follow when the destination storage is an EMC Data Domain: a VDP appliance with a Data Domain is migrated through a process called checkpoint restore. Let's walk through the steps below.

For this walkthrough, let's consider the following setup:
1. A vSphere Data Protection 5.8 appliance
2. A virtual edition of the EMC Data Domain appliance (the process is the same for a physical Data Domain)
3. The 5.8 VDP was deployed as a 512 GB deployment.
4. The IP address of the VDP appliance is 192.168.1.203
5. The IP address of the Data Domain appliance is 192.168.1.200

Pre-requisites:
1. In point (3) above you saw that the 5.8 VDP appliance was set up with 512 GB of local drives. The first question that comes up is: why have local drives at all when the backups reside on the Data Domain?
A vSphere Data Protection appliance with a Data Domain still has local VMDKs to store the metadata of the client backups. The actual client data is deduplicated and stored on the DD appliance, while the metadata of each backup is stored under the /data0?/cur directories on the VDP appliance. So, if your source appliance was a 512 GB deployment, the destination has to be equal to or greater than the source deployment (see the quick check after this list).

2. The IP address, DNS name, domain, and all other networking configuration of the destination appliance should be the same as the source.

3. It is best to keep the same password on the destination appliance during the initial setup process.

4. On the source appliance, make sure Checkpoint Copy is enabled. To verify this, go to the https://vdp-ip:8543/vdp-configure page, select the Storage tab, click the gear icon, and click Edit Data Domain; the first page displays this option. If it is not checked, the checkpoints on the source appliance will not be copied over to the Data Domain, and you will not be able to perform a checkpoint restore.
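As the quick check referenced in point (1), you can look at the size of the metadata partitions on the source appliance over SSH (assuming the usual three data partitions):

# df -h /data01 /data02 /data03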

The migration process:
1. SSH to the source VDP appliance and run the command below to get the checkpoint list:
# cplist

The output would be similar to:
cp.20161011033032 Tue Oct 11 09:00:32 2016   valid rol ---  nodes   1/1 stripes     25
cp.20161011033312 Tue Oct 11 09:03:12 2016   valid --- ---  nodes   1/1 stripes     25

Make a note of this output.

2. Run the below command to obtain the Avamar System ID:
# avmaint config --ava | grep -i "system"
The output would be similar to:
  systemname="vdp58.vcloud.local"
  systemcreatetime="1476126720"
  systemcreateaddr="00:50:56:B9:3E:6D"

Make a note of this output as well. Here, 1476126720 (the systemcreatetime value) is the Avamar System ID. This is used to determine which mTree this VDP appliance corresponds to on the Data Domain.

3. Run the command below to obtain the hashed Avamar root password. This is used to test the GSAN login if the migration fails. It is only needed by VMware Support, so you can skip this step.
# grep ap /usr/local/avamar/etc/usersettings.cfg
The output would be similar to:
password=6cbd70a95847fc58beb381e72600a4cb33d322cc3d9a262fdc17acdbeee80860a285534ab1427048

4. Power off the source appliance

5. Deploy the VDP 6.1.x appliance via the OVF template, providing the same networking details during the OVA deployment, and power on the 6.1.x appliance once the deployment completes successfully.

6. Go to the https://vdp-ip:8543/vdp-configure page and complete the configuration process for the new appliance. As mentioned above, in the "Create Storage" section of the wizard, specify local storage equal to or greater than that of the source VDP appliance. Once the appliance configuration completes, the new 6.1.x system will reboot.

7. Once the reboot completes, open an SSH session to the 6.1.x appliance and run the command below to list the available checkpoints on the Data Domain:
# ddrmaint cp-backup-list --full --ddr-server=<data-domain-IP> --ddr-user=<ddboost-user-name> --ddr-password=<ddboost-password>

Sample command from my lab:
# ddrmaint cp-backup-list --full --ddr-server=192.168.1.200 --ddr-user=ddboost-user --ddr-password=VMware123!
The output would be similar to:
================== Checkpoint ==================
 Avamar Server Name           : vdp58.vcloud.local
 Avamar Server MTree/LSU      : avamar-1476126720
 Data Domain System Name      : 192.168.1.200
 Avamar Client Path           : /MC_SYSTEM/avamar-1476126720
 Avamar Client ID             : 200e7808ddcde518fe08b6778567fa4f397e97fc
 Checkpoint Name              : cp.20161011033032
 Checkpoint Backup Date       : 2016-10-11 09:02:07
 Data Partitions              : 3
 Attached Data Domain systems : 192.168.1.200

The parts we need here are the MTree name and the checkpoint name. avamar-1476126720 is the Avamar mTree on the Data Domain; we obtained this system ID earlier in this article. The checkpoint cp.20161011033032 is the checkpoint from the source VDP appliance that was copied over to the Data Domain.

8. Now, we will perform a cprestore to this checkpoint. The command to perform the cprestore is:
# /usr/local/avamar/bin/cprestore --hfscreatetime=<avamar-ID> --ddr-server=<data-domain-IP> --ddr-user=<ddboost-user-name> --cptag=<checkpoint-name>

Sample command from my lab:
# /usr/local/avamar/bin/cprestore --hfscreatetime=1476126720 --ddr-server=192.168.1.200 --ddr-user=ddboost-user --cptag=cp.20161011033032
Here, 1476126720 is the Avamar System ID and cp.20161011033032 is a valid checkpoint. Do not roll back if the checkpoint is not valid. If the checkpoint is not validated, you will have to run an integrity check on the source VDP appliance to generate a valid checkpoint and copy it over to the Data Domain system.

The output would be:
Version: 1.11.1
Current working directory: /space/avamar/var
Log file: cprestore-cp.20161011033032.log
Checking node type.
Node type: single-node server
Create DD NFS Export: data/col1/avamar-1476126720/GSAN
ssh ddboost-user@192.168.1.200 nfs add /data/col1/avamar-1476126720/GSAN 192.168.1.203 "(ro,no_root_squash,no_all_squash,secure)"
Execute: ssh ddboost-user@192.168.1.200 nfs add /data/col1/avamar-1476126720/GSAN 192.168.1.203 "(ro,no_root_squash,no_all_squash,secure)"
Warning: Permanently added '192.168.1.200' (RSA) to the list of known hosts.
Data Domain OS
Password:

Enter the Data Domain password when prompted. Once the password is authenticated, the cprestore will start; it copies the metadata of the backups for the displayed checkpoint onto the 6.1.x appliance.

The output would be similar to:
[Thu Oct  6 08:24:44 2016] (22497) 'ddnfs_gsan/cp.20161011033032/data01/0000000000000015.chd' -> '/data01/cp.20161011033032/0000000000000015.chd'
[Thu Oct  6 08:24:44 2016] (22498) 'ddnfs_gsan/cp.20161011033032/data02/0000000000000019.wlg' -> '/data02/cp.20161011033032/0000000000000019.wlg'
[Thu Oct  6 08:24:44 2016] (22497) 'ddnfs_gsan/cp.20161011033032/data01/0000000000000015.wlg' -> '/data01/cp.20161011033032/0000000000000015.wlg'
[Thu Oct  6 08:24:44 2016] (22499) 'ddnfs_gsan/cp.20161011033032/data03/0000000000000014.wlg' -> '/data03/cp.20161011033032/0000000000000014.wlg'
[Thu Oct  6 08:24:44 2016] (22498) 'ddnfs_gsan/cp.20161011033032/data02/checkpoint-complete' -> '/data02/cp.20161011033032/checkpoint-complete'
[Thu Oct  6 08:24:44 2016] (22499) 'ddnfs_gsan/cp.20161011033032/data03/0000000000000016.chd' -> '/data03/cp.20161011033032/0000000000000016.chd'

This continues until all the metadata is copied over; the length of the cprestore process depends on the amount of backup data. Once the process is complete, you will see the messages below.

Restore data01 finished.
Cleanup restore for data01
Changing owner/group and permissions: /data01/cp.20161011033032
PID 22497 returned with exit code 0
Restore data03 finished.
Cleanup restore for data03
Changing owner/group and permissions: /data03/cp.20161011033032
PID 22499 returned with exit code 0
Finished restoring files in 00:00:04.
Restoring ddr_info.
Copy: 'ddnfs_gsan/cp.20161011033032/ddr_info' -> '/usr/local/avamar/var/ddr_info'
Unmount NFS path 'ddnfs_gsan' in 3 seconds
Execute: sudo umount "ddnfs_gsan"
Remove DD NFS Export: data/col1/avamar-1476126720/GSAN
ssh ddboost-user@192.168.1.200 nfs del /data/col1/avamar-1476126720/GSAN 192.168.1.203
Execute: ssh ddboost-user@192.168.1.200 nfs del /data/col1/avamar-1476126720/GSAN 192.168.1.203
Data Domain OS
Password:
kthxbye

Once the Data Domain password is entered, the cprestore process completes with a "kthxbye" message.

9. Run the # cplist command on the 6.1.x appliance and you should notice that the checkpoint displayed in the cp-backup-list output is now listed among the 6.1.x checkpoints:

cp.20161006013247 Thu Oct  6 07:02:47 2016   valid hfs ---  nodes   1/1 stripes     25
cp.20161011033032 Tue Oct 11 09:00:32 2016   valid rol ---  nodes   1/1 stripes     25

cp.20161006013247 is the 6.1.x appliance's own local checkpoint, and cp.20161011033032 is the source appliance's checkpoint copied over from the Data Domain during the cprestore.

10. Once the restore is complete, we need to perform a rollback to this checkpoint. So first, you will have to stop all core services on the 6.1.x appliance using the below command:
# dpnctl stop
11. Initiate the force rollback using the below command:
# dpnctl start --force_rollback

You will see the following output:
Identity added: /home/dpn/.ssh/dpnid (/home/dpn/.ssh/dpnid)
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
Action: starting all
Have you contacted Avamar Technical Support to ensure that this
  is the right thing to do?
Answering y(es) proceeds with starting all;
          n(o) or q(uit) exits
y(es), n(o), q(uit/exit):

Select yes (y) to initiate the rollback. The next set of output you will see is:

dpnctl: INFO: Checking that gsan was shut down cleanly...
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
Here is the most recent available checkpoint:
  Tue Oct 11 03:30:32 2016 UTC Validated(type=rolling)
A rollback was requested.
The gsan was shut down cleanly.

The choices are as follows:
  1   roll back to the most recent checkpoint, whether or not validated
  2   roll back to the most recent validated checkpoint
  3   select a specific checkpoint to which to roll back
  4   restart, but do not roll back
  5   do not restart
  q   quit/exit

Choose option 3 and the next set of output you will see is:

Here is the list of available checkpoints:

     2   Thu Oct  6 01:32:47 2016 UTC Validated(type=full)
     1   Tue Oct 11 03:30:32 2016 UTC Validated(type=rolling)

Please select the number of a checkpoint to which to roll back.

Alternatively:
     q   return to previous menu without selecting a checkpoint
(Entering an empty (blank) line twice quits/exits.)

In the earlier cplist output you will notice that cp.20161011033032 has a timestamp of Oct 11, so choose option (1). The next output you will see is:
-  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -  -
You have selected this checkpoint:
  name:       cp.20161011033032
  date:       Tue Oct 11 03:30:32 2016 UTC
  validated:  yes
  age:        -7229 minutes

Roll back to this checkpoint?
Answering y(es)  accepts this checkpoint and initiates rollback
          n(o)   rejects this checkpoint and returns to the main menu
          q(uit) exits

Verify that this is indeed the right checkpoint and answer yes (y) to confirm. The GSAN and MCS rollback begins, and you will see this in the console:

dpnctl: INFO: rolling back to checkpoint "cp.20161011033032" and restarting the gsan succeeded.
dpnctl: INFO: gsan started.
dpnctl: INFO: Restoring MCS data...
dpnctl: INFO: MCS data restored.
dpnctl: INFO: Starting MCS...
dpnctl: INFO: To monitor progress, run in another window: tail -f /tmp/dpnctl-mcs-start-output-24536
dpnctl: WARNING: 1 warning seen in output of "[ -r /etc/profile ] && . /etc/profile ; /usr/local/avamar/bin/mcserver.sh --start"
dpnctl: INFO: MCS started.

**If this process fails, open a ticket with VMware Support. I cannot provide the troubleshooting steps here as they are confidential. If needed, add a note in your support ticket asking the assigned engineer to run the case past me.**

If the rollback goes through successfully, you might be presented with an option to restore the Tomcat database.

Do you wish to do a restore of the local EMS data?

Answering y(es) will restore the local EMS data
          n(o) will leave the existing EMS data alone
          q(uit) exits with no further action.

Please consult with Avamar Technical Support before answering y(es).

Answer n(o) here unless you have a special need to restore
  the EMS data, e.g., you are restoring this node from scratch,
  or you know for a fact that you are having EMS database problems
  that require restoring the database.

y(es), n(o), q(uit/exit):

I would choose no (n), since my database is not causing issues in my environment. After this, the remaining services are started. The output:

dpnctl: INFO: EM Tomcat started.
dpnctl: INFO: Resuming backup scheduler...
dpnctl: INFO: Backup scheduler resumed.
dpnctl: INFO: AvInstaller is already running.
dpnctl: INFO: [see log file "/usr/local/avamar/var/log/dpnctl.log"]

That should be pretty much it. When you log in to the https://vdp-ip:8543/vdp-configure page, you should see the Data Domain automatically listed in the Storage tab. If not, open a support ticket with VMware.

There are a couple of post-migration steps:
1. If you are using the internal proxy, unregister it and re-register it from the vdp-configure page.
2. External proxies (if used) will be orphaned, so you will have to delete the external proxies, change the VDP root password, and re-add the external proxies.
3. If you are using guest-level backups, the agents for SQL, Exchange, and SharePoint have to be re-installed.
4. If this appliance is replicating to another VDP appliance, the replication agent needs to be re-registered. Run the following four commands in this order:
# service avagent-replicate stop
# service avagent-replicate unregister 127.0.0.1 /MC_SYSTEM
# service avagent-replicate register 127.0.0.1 /MC_SYSTEM
# service avagent-replicate start

And that should be it...

Friday, 28 October 2016

VDP Stuck In A Configuration Loop

There have been a few cases logged with VMware where the newly deployed VDP appliance gets stuck in a configuration loop. Not to worry, there is now a fix for this. 

A little insight into what this is: we deploy a VDP appliance (6.1.2 in my case) from the OVA template. The deployment goes through successfully, we power on the VDP appliance, and that too completes successfully. Then we go to the https://vdp-ip:8543/vdp-configure page and run through the configuration wizard. Everything goes well here too; the wizard completes and asks you to reboot the appliance. After the reboot, the appliance makes certain changes, configures alarms, and initializes core services, and a task called "VDP: Configure Appliance" is started. This task gets stuck somewhere around 45 to 70 percent. The appliance boots up completely, but when you go back to the vdp-configure page, you will notice that it takes you through the configuration wizard again. You can get as far as the Create Storage section, at which point you receive an error because the appliance is already configured with storage. No matter which browser you use or how many times you access the vdp-configure page, you are taken back to the configuration wizard, and this becomes an infinite loop.

This issue is almost exclusively seen with the vCenter 5.5 U3e release. VDP uses the JSAFE/BSAFE Java security libraries, and these do not get along with the vCenter SSL ciphers in 5.5 U3e. To fix this, we switch from JSAFE to the Java JCE libraries on the VDP appliance.

Before we get to the fix, you can check the vdr-server.log from the time of the issue (under /usr/local/avamar/var/vdr/server_logs) to verify the following:

2016-10-29 01:15:40,676 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: vcenter-ignore-cert ? true
2016-10-29 01:15:40,714 WARN  [Thread-7]-vi.VCenterServiceImpl: No VCenter found in MC root domain
2016-10-29 01:15:40,714 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: visdkUrl = https:/sdk
2016-10-29 01:15:40,715 ERROR [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: Failed To Create ViJava ServiceInstance owing to Remote VCenter connection error
java.rmi.RemoteException: VI SDK invoke exception:java.lang.IllegalArgumentException: protocol = https host = null; nested exception is:
        java.lang.IllegalArgumentException: protocol = https host = null
        at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:139)
        at com.vmware.vim25.ws.VimStub.retrieveServiceContent(VimStub.java:2114)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:117)
        at com.vmware.vim25.mo.ServiceInstance.<init>(ServiceInstance.java:95)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:297)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:159)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:104)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.createViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:96)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.getViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:74)
        at com.emc.vdp2.common.vi.ViJavaServiceInstanceProviderImpl.waitForViJavaServiceInstance(ViJavaServiceInstanceProviderImpl.java:212)
        at com.emc.vdp2.server.VDRServletLifeCycleListener$1.run(VDRServletLifeCycleListener.java:71)
        at java.lang.Thread.run(Unknown Source)

Caused by: java.lang.IllegalArgumentException: protocol = https host = null
        at sun.net.spi.DefaultProxySelector.select(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect0(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.plainConnect(Unknown Source)
        at sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream0(Unknown Source)
        at sun.net.www.protocol.http.HttpURLConnection.getOutputStream(Unknown Source)
        at sun.net.www.protocol.https.HttpsURLConnectionImpl.getOutputStream(Unknown Source)
        at com.vmware.vim25.ws.WSClient.post(WSClient.java:216)
        at com.vmware.vim25.ws.WSClient.invoke(WSClient.java:133)
        ... 11 more

2016-10-29 01:15:40,715 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: Retry ViJava ServiceInstance Acquisition In 5 Seconds...
2016-10-29 01:15:45,716 INFO  [Thread-7]-vi.ViJavaServiceInstanceProviderImpl: vcenter-ignore-cert ? true
2016-10-29 01:15:45,819 WARN  [Thread-7]-vi.VCenterServiceImpl: No VCenter found in MC root domain

The mcserver.out log file should show the following:

Caught Exception : Exception : org.apache.axis.AxisFault Message : ; nested exception is:
javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7 StackTrace : AxisFault
faultCode: {http://schemas.xmlsoap.org/soap/envelope/}Server.userException faultSubcode:
faultString: javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7 faultActor:
faultNode:
faultDetail:
{http://xml.apache.org/axis/}stackTrace:javax.net.ssl.SSLHandshakeException: Unsupported curve: 1.2.840.10045.3.1.7
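For reference, mcserver.out normally lives under /usr/local/avamar/var/mc/server_log/ on the appliance (path hedged; adjust if your build differs), so a quick way to confirm the handshake failure is:

# grep -i "unsupported curve" /usr/local/avamar/var/mc/server_log/mcserver.out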

To fix this:

1. Discard the newly deployed appliance completely. 
2. Deploy the VDP appliance again: go through the OVA deployment and power on the appliance. Stop here; do not go to the vdp-configure page yet.

3. To enable the Java JCE library, we need to add a particular line to the mcsutils.pm file, under the $prefs variable. The line is exactly as below:

. "-Dsecurity.provider.rsa.JsafeJCE.position=last "

4. Edit the following file with vi (a sed alternative is shown after these steps):
# vi  /usr/local/avamar/lib/mcsutils.pm
The original content would look like:

my $rmidef = "-Djava.rmi.server.hostname=$rmihost ";
   my $prefs = "-Djava.util.logging.config.file=$mcsvar::lib_dir/mcserver_logging.properties "
             . "-Djava.security.egd=file:/dev/./urandom "
             . "-Djava.io.tmpdir=$mcsvar::tmp_dir "
             . "-Djava.util.prefs.PreferencesFactory=com.avamar.mc.util.MCServerPreferencesFactory "
             . "-Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl "
             . "-Djavax.net.ssl.keyStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Djavax.net.ssl.trustStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Dfile.encoding=UTF-8 "
             . "-Dlog4j.configuration=file://$mcsvar::lib_dir/log4j.properties ";  # vmware/axis

After editing it would look like:

 my $rmidef = "-Djava.rmi.server.hostname=$rmihost ";
   my $prefs = "-Djava.util.logging.config.file=$mcsvar::lib_dir/mcserver_logging.properties "
             . "-Djava.security.egd=file:/dev/./urandom "
             . "-Djava.io.tmpdir=$mcsvar::tmp_dir "
             . "-Djava.util.prefs.PreferencesFactory=com.avamar.mc.util.MCServerPreferencesFactory "
             . "-Djavax.xml.parsers.DocumentBuilderFactory=org.apache.xerces.jaxp.DocumentBuilderFactoryImpl "
             . "-Djavax.net.ssl.keyStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Djavax.net.ssl.trustStore=" . MCServer::get( "rmi_ssl_keystore" ) ." "
             . "-Dfile.encoding=UTF-8 "
             . "-Dsecurity.provider.rsa.JsafeJCE.position=last "
             . "-Dlog4j.configuration=file://$mcsvar::lib_dir/log4j.properties ";  # vmware/axis

5. Save the file
6. There is no point in restarting MCS with mcserver.sh --restart, as the VDP appliance is not yet configured and the core services are therefore not yet initialized.
7. Reboot the appliance.
8. Once the appliance has booted up, go to the vdp-configure page and begin the configuration; this should avoid the configuration loop issue.
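For step 4, if you prefer a non-interactive edit over vi, something along these lines should work. This is only a sketch: it assumes GNU sed and that the -Dfile.encoding line occurs exactly once in the file, and the indentation of the inserted line is lost (which Perl does not care about), so verify the result with the grep afterwards:

# sed -i '/-Dfile.encoding=UTF-8/a . "-Dsecurity.provider.rsa.JsafeJCE.position=last "' /usr/local/avamar/lib/mcsutils.pm
# grep JsafeJCE /usr/local/avamar/lib/mcsutils.pm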

If the VDP was already deployed and the vCenter was upgraded later, you can follow the same steps up to step 6. Instead of rebooting the VDP in this case, it should be enough to restart the MCS using the mcserver.sh --restart --verbose command.

That's it. A permanent fix is being discussed with engineering for a future VDP release.

Update:
A permanent fix is in 6.1.3 version of VDP.