Tuesday, 9 January 2018

Unable To Connect VDP To Web Client

In a few cases, on either a fresh VDP deployment or an existing one, connecting to the VDP appliance from the web client via the plugin might fail.

The error is generic:

Unable to connect to the requested VDP Appliance. Would you like to be directed to the VDP Configuration utility to troubleshoot the issue? 



The vdp-configure page does not tell you much about this error, and all the services appear to be running fine. You can try restarting the tomcat service on VDP using emwebapp.sh --restart, but that might not help.

For this error, the first logs to look at are the web client logs on the vCenter. In this case, the following was logged:

[2018-01-09T19:42:06.634Z] [INFO ] http-bio-9443-exec-27         com.emc.vdp2.api.impl.BaseApi                                     Connecting to VDP at: [https://x.x.x.x:8543/vdr-server/auth/login]
[2018-01-09T19:42:06.646Z] [INFO ] http-bio-9443-exec-27         com.emc.vdp2.api.impl.BaseApi                                     Setting the session ID to: null
[2018-01-09T19:42:06.656Z] [WARN ] http-bio-9443-exec-27         org.springframework.flex.core.DefaultExceptionLogger              The following exception occurred during request processing by the BlazeDS MessageBroker and will be serialized back to the client:  flex.messaging.MessageException: java.lang.NullPointerException : null

[2018-01-09T19:42:06.889Z] [WARN ] http-bio-9443-exec-26 org.springframework.flex.core.DefaultExceptionLogger The following exception occurred during request processing by the BlazeDS MessageBroker and will be serialized back to the client: flex.messaging.MessageException: org.eclipse.gemini.blueprint.service.ServiceUnavailableException : service matching filter=[(objectClass=com.emc.vdp2.api.ActionApiIf)] unavailable
at flex.messaging.services.remoting.adapters.JavaAdapter.invoke(JavaAdapter.java:444)
at com.vmware.vise.messaging.remoting.JavaAdapterEx.invoke(JavaAdapterEx.java:50)
at flex.messaging.services.RemotingService.serviceMessage(RemotingService.java:183)
at flex.messaging.MessageBroker.routeMessageToService(MessageBroker.java:1400)
at flex.messaging.endpoints.AbstractEndpoint.serviceMessage(AbstractEndpoint.java:1011)
at flex.messaging.endpoints.AbstractEndpoint$$FastClassByCGLIB$$1a3ef066.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
at org.springframework.aop.framework.Cglib2AopProxy$CglibMethodInvocation.invokeJoinpoint(Cglib2AopProxy.java:689)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:150)
at org.springframework.flex.core.MessageInterceptionAdvice.invoke(MessageInterceptionAdvice.java:66)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.adapter.ThrowsAdviceInterceptor.invoke(ThrowsAdviceInterceptor.java:124)
at org.springframework.aop.framework.ReflectiveMethodInvocation.proceed(ReflectiveMethodInvocation.java:172)
at org.springframework.aop.framework.Cglib2AopProxy$FixedChainStaticTargetInterceptor.intercept(Cglib2AopProxy.java:573)
at flex.messaging.endpoints.AMFEndpoint$$EnhancerByCGLIB$$72c7df65.serviceMessage(<generated>)
at flex.messaging.endpoints.amf.MessageBrokerFilter.invoke(MessageBrokerFilter.java:103)
at flex.messaging.endpoints.amf.LegacyFilter.invoke(LegacyFilter.java:158)
at flex.messaging.endpoints.amf.SessionFilter.invoke(SessionFilter.java:44)
at flex.messaging.endpoints.amf.BatchProcessFilter.invoke(BatchProcessFilter.java:67)
at flex.messaging.endpoints.amf.SerializationFilter.invoke(SerializationFilter.java:166)
at flex.messaging.endpoints.BaseHTTPEndpoint.service(BaseHTTPEndpoint.java:291)
at flex.messaging.endpoints.AMFEndpoint$$EnhancerByCGLIB$$72c7df65.service(<generated>)
at org.springframework.flex.servlet.MessageBrokerHandlerAdapter.handle(MessageBrokerHandlerAdapter.java:109)
at org.springframework.web.servlet.DispatcherServlet.doDispatch(DispatcherServlet.java:923)
at org.springframework.web.servlet.DispatcherServlet.doService(DispatcherServlet.java:852)

When you see this backtrace with com.emc.vdp2.api.ActionApiIf reported as unavailable, the API service needed for the connection is not available when the login request is sent, and so the connection fails.
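
To check quickly whether a web client log contains this signature, a minimal sketch along these lines could be used. The function name has_vdp_api_unavailable is hypothetical, and the log path in the usage comment is an assumption for a vCenter Server Appliance, so adjust it for your deployment.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the given log file contains the
# "ActionApiIf ... unavailable" signature from the backtrace above.
has_vdp_api_unavailable() {
    grep -qF 'objectClass=com.emc.vdp2.api.ActionApiIf' "$1"
}

# Usage (the log path is an assumption, adjust it for your vCenter):
# has_vdp_api_unavailable /var/log/vmware/vsphere-client/logs/vsphere_client_virgo.log \
#     && echo "VDP API service unavailable, consider clearing the SerenityDB"
```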

To resolve this, clear the SerenityDB on the vCenter Server. Follow this Knowledge Base article for the procedure.

After this, log back in to the web client and connect to the VDP appliance; it should go through successfully.

Hope this helps.

Wednesday, 6 December 2017

SRM Service Crashes After A Failed Recovery With "abrRecoveryEngine" Backtrace

In some instances, when you are running Array Based Replication with SRM, a failed planned migration might cause the SRM service to crash. In vmware-dr.log on the SRM machine, you will see the following backtrace:

2017-12-06T09:55:38.620-05:00 panic vmware-dr[06076] [Originator@6876 sub=Default] 
--> 
--> Panic: Assert Failed: "ok (Dr::Providers::Abr::AbrRecoveryEngine::AbrRecoveryEngineImpl::LoadFromDb: Unable to insert post failover info object 212337205 for group vm-protection-group-121101624 array pair array-pair-7065)" @ d:/build/ob/bora-6014840/srm/src/providers/abr/common/abrRecoveryEngine/abrRecoveryEngine.cpp:244
--> Backtrace:
--> [backtrace begin] product: VMware vCenter Site Recovery Manager, version: 6.5.1, build: build-6014840, tag: vmware-dr, cpu: x86_64, os: windows, buildType: release
--> backtrace[00] vmacore.dll[0x001F29FA]
--> backtrace[01] vmacore.dll[0x00067D60]
--> backtrace[02] vmacore.dll[0x0006A20E]
--> backtrace[03] vmacore.dll[0x002245A7]
--> backtrace[04] vmacore.dll[0x00224771]
--> backtrace[05] vmacore.dll[0x00059C0D]
--> backtrace[06] dr-abr-recoveryEngine.dll[0x00028A91]
--> backtrace[07] dr-abr-recoveryEngine.dll[0x00015199]
--> backtrace[08] dr-abr-recoveryEngine.dll[0x002DB368]
--> backtrace[09] dr-abr-recoveryEngine.dll[0x002DB913]
--> backtrace[10] vmacore.dll[0x001D6ACC]
--> backtrace[11] vmacore.dll[0x001865AB]
--> backtrace[12] vmacore.dll[0x0018759C]
--> backtrace[13] vmacore.dll[0x002202E9]
--> backtrace[14] MSVCR120.dll[0x00024F7F]
--> backtrace[15] MSVCR120.dll[0x00025126]
--> backtrace[16] KERNEL32.DLL[0x000013D2]
--> backtrace[17] ntdll.dll[0x000154E4]
--> [backtrace end]

This is seen when there are issues unmounting the source datastore or demoting the source datastore. 

Disclaimer: Modifying database tables is normally done by VMware support. Do this at your own risk.

The fix is:

1. Make sure the SRM service is stopped on both sites.
2. Back up the SRM databases on both sites.
3. Log in to the database using either pgAdmin or SQL Server Management Studio, depending on the type of database used.
4. Open the table "pda_grouppostfailoverinfo".
5. Remove the row whose db_id matches the object ID from the backtrace. In my case it is: 212337205
6. Once this is done, start the SRM service. If it crashes again, the new backtrace usually lists another object ID; repeat the process for that ID.
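
If the service crashes repeatedly, it can help to pull every failing object ID out of the log in one pass. The following is a minimal sketch, assuming you have a copy of vmware-dr.log on a machine with a POSIX shell; the function name failed_postfailover_ids is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print each distinct "post failover info" object ID
# reported by the LoadFromDb assert in a copy of vmware-dr.log.
failed_postfailover_ids() {
    sed -n 's/.*Unable to insert post failover info object \([0-9][0-9]*\) for group.*/\1/p' "$1" | sort -u
}

# Usage:
# failed_postfailover_ids vmware-dr.log
```

Each ID printed corresponds to a db_id to look for in pda_grouppostfailoverinfo.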

And that should be it.

Thursday, 30 November 2017

Unable To Protect a VM In SRM: "Object not found"

In rare instances, you will be unable to protect a VM, and the error thrown is:
Internal error: class Vmacore::NotFoundException "Object not found"

Under Protection Groups > Related Objects > Virtual Machines, you will see the VM coming up as Not Configured.


When you right-click the VM and select Configure protection, the Device Status comes up as Non-replicated.



And if you browse the recovery location and provide the path of the replicated VMDK, you will run into this error.

In the web client logs, you will see:

[2017-11-28T09:27:50.156-06:00] [ERROR] srm-client-thread-1253 70015389 101315 201173 com.vmware.srm.client.infraservice.tasks.FakeTaskImpl [DrVmodlFakeTask:srm-fake-task-11:fake-server-guid]: com.vmware.vim.binding.dr.fault.DrRuntimeFault: Task Failed
at com.vmware.srm.client.infraservice.util.ExceptionUtil.newRuntimeFault(ExceptionUtil.java:92)
at com.vmware.srm.client.infraservice.util.ExceptionUtil.newRuntimeFault(ExceptionUtil.java:68)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl.getSingleError(MultiTaskProgressUpdaterImpl.java:89)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl.updateProgress(MultiTaskProgressUpdaterImpl.java:222)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl$3.run(MultiTaskProgressUpdaterImpl.java:431)
at $java.lang.Runnable$$FastClassByCGLIB$$36fc6471.invoke(<generated>)
at net.sf.cglib.proxy.MethodProxy.invoke(MethodProxy.java:149)
at com.vmware.srm.client.topology.impl.osgi.aop.HttpRequestContextAdvice$CallInterceptor.intercept(HttpRequestContextAdvice.java:53)
at com.vmware.srm.client.topology.impl.osgi.aop.HttpRequestContextAdvice$Base$$EnhancerByCGLIB$$b6ab80b4.run(<generated>)
at com.vmware.srm.client.infraservice.tasks.MultiTaskProgressUpdaterImpl$4.run(MultiTaskProgressUpdaterImpl.java:442)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
at java.lang.Thread.run(Thread.java:748)
Caused by: com.vmware.vim.binding.dr.fault.InternalError: Internal error: class Vmacore::NotFoundException "Object not found"
[context]zKq7AVMEAQAAAHjHWwAUdm13YXJlLWRyAACoLwpkci1yZXBsaWNhdGlvbi5kbGwAAGEbCgASaT8AAy5BAOv/QACT9EABuSMCY29ubmVjdGlvbi1iYXNlLmRsbAABx3QCAccrAgGg8AABPUMBAccrAgGSLgMBdwgDARb3AgHHKwIBuSMCAXcIAwEW9wIBxysC[/context].
at sun.reflect.GeneratedConstructorAccessor614.newInstance(Unknown Source)
at sun.reflect.DelegatingConstructorAccessorImpl.newInstance(DelegatingConstructorAccessorImpl.java:45)
at java.lang.reflect.Constructor.newInstance(Constructor.java:423)
at java.lang.Class.newInstance(Class.java:442)



One of the possible causes is that the source VMX file has corrupt or incorrect entries.
So let's have a look at the VM's vmx file.

I am looking for lines in this file that have a datastore path reference, like:

vmx.log.filename = "/vmfs/volumes/58780b1d-045e1100-0efa-0025b5e01a45/Test-1/vmware.log"
sched.swap.derivedName = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/Test-1-932448b9.vswp"

I have two UUIDs here, 58780b1d-045e1100-0efa-0025b5e01a45 and 59a30e4d-647fd9f2-2e66-000c295e9f61

But, when I run:

[root@Wendy:/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1] esxcfg-scsidevs -m
mpx.vmhba1:C0:T0:L0:3                                            /vmfs/devices/disks/mpx.vmhba1:C0:T0:L0:3 599ffcb3-d9ece508-7576-000c295e9f61  0  Wendy-Local
mpx.vmhba1:C0:T1:L0:1                                            /vmfs/devices/disks/mpx.vmhba1:C0:T1:L0:1 59a30e4d-647fd9f2-2e66-000c295e9f61  0  VDP-Storage

Only these two UUIDs are mounted on the host, and 58780b1d-045e1100-0efa-0025b5e01a45 from the VMX file is not one of them. Incorrect references like this cause the device status to show as non-replicated, which in turn causes issues with VM protection.
You might have one or more such entries in the VMX file.
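
The comparison above can be sketched as a small script. This is only an illustration: the function names are hypothetical, and it assumes you have saved the output of esxcfg-scsidevs -m to a file first. It only looks at VMFS-style UUIDs (8-8-4-12 hex digits) in /vmfs/volumes/ paths.

```shell
#!/bin/sh
# Print the unique VMFS-style datastore UUIDs referenced in a VMX file ($1).
vmx_uuids() {
    grep -o '/vmfs/volumes/[0-9a-f]\{8\}-[0-9a-f]\{8\}-[0-9a-f]\{4\}-[0-9a-f]\{12\}' "$1" \
        | sed 's#^/vmfs/volumes/##' | sort -u
}

# Print the UUIDs from the VMX file ($1) that do not appear in the saved
# output of "esxcfg-scsidevs -m" ($2).
stale_vmx_uuids() {
    for uuid in $(vmx_uuids "$1"); do
        grep -qF "$uuid" "$2" || echo "$uuid"
    done
}

# Usage on the ESXi host (file names are examples):
# esxcfg-scsidevs -m > /tmp/mounted.txt
# stale_vmx_uuids Test-1.vmx /tmp/mounted.txt
```

Any UUID it prints is referenced by the VMX file but not mounted on the host.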

Power off the virtual machine on the source site, back up the VMX file, and then edit it to use the UUID of the datastore where the VM resides, or the appropriate UUID where the respective files should reside. In my case the Test-1 VM runs on VDP-Storage, which is 59a30e4d-647fd9f2-2e66-000c295e9f61.

So the new VMX entries look like this:

vmx.log.filename = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/vmware.log"
sched.swap.derivedName = "/vmfs/volumes/59a30e4d-647fd9f2-2e66-000c295e9f61/Test-1/Test-1-932448b9.vswp"
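
If you prefer not to edit the file by hand, the backup and edit can be sketched as below. The function name repoint_vmx_uuid is hypothetical, and the sketch assumes the VM is already powered off and that every reference to the old UUID should point to the new one.

```shell
#!/bin/sh
# Hypothetical helper: back up a VMX file ($1) to <file>.bak, then rewrite
# every /vmfs/volumes/<old-uuid>/ reference ($2) to use <new-uuid> ($3).
repoint_vmx_uuid() {
    vmx=$1; old=$2; new=$3
    cp "$vmx" "$vmx.bak" || return 1
    sed "s#/vmfs/volumes/$old/#/vmfs/volumes/$new/#g" "$vmx.bak" > "$vmx"
}

# Usage with the UUIDs from this example:
# repoint_vmx_uuid Test-1.vmx 58780b1d-045e1100-0efa-0025b5e01a45 59a30e4d-647fd9f2-2e66-000c295e9f61
```

The untouched copy stays next to the VMX file as Test-1.vmx.bak in case you need to revert.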

Reload the VMX using:

# vim-cmd vmsvc/reload <vm-id>

The vm-id can be obtained from

# vim-cmd vmsvc/getallvms
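
The two commands can be combined so that the ID is looked up by VM name. This is a rough sketch with a hypothetical function name; it assumes the VM name is the second column of the getallvms output and does not handle VM names that contain spaces.

```shell
#!/bin/sh
# Hypothetical helper: read "vim-cmd vmsvc/getallvms" output on stdin and
# print the Vmid (first column) of the VM whose name (second column) is $1.
# The header line is skipped.
vmid_for_name() {
    awk -v name="$1" 'NR > 1 && $2 == name { print $1 }'
}

# Usage on the ESXi host:
# vim-cmd vmsvc/reload "$(vim-cmd vmsvc/getallvms | vmid_for_name Test-1)"
```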

Then power on the VM, right-click the VM in the protection group and select Configure protection. This time the hard drive status will be displayed as replicated.


And that's pretty much it. This is usually seen when the vmware.log file is configured to be written to a different datastore and that datastore is no longer available.

Hope this helps.

Wednesday, 8 November 2017

VDP Expired Certificate

There have been a lot of issues with VDP deployment due to an expired certificate on the OVF template.

Basically, if you are running vCenter 6.5, the web client is the only option to deploy OVA files, and you cannot move past the step where it displays the certificate as expired. If you are using a pre-6.5 vCenter, you can deploy through the Windows C# client: even though it says "Invalid" certificate, you can still click Next and proceed.

If you are on 6.5, then the workaround is this:
1. Download the required version of VDP Server. All of them have certificates that expired around September.
2. Use a utility such as 7-Zip to extract the OVA template. This gives you 4 files: the VMDK, OVF, MF and CER.
3. In the web client, when you deploy the OVA, you can multi-select files. Select the 3 files (vmdk, ovf and mf), excluding the .cer file.
4. The deployment then displays No Certificate and lets you proceed further.
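
An OVA is a tar archive, so on a machine with a shell the extraction in step 2 can be done with tar instead of 7-Zip. The following is a minimal sketch; the function name and the file names in the usage comment are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: unpack an OVA ($1) into a directory ($2), drop the
# certificate file, and list the files left to select in the web client.
extract_ova_without_cert() {
    mkdir -p "$2" || return 1
    tar -xf "$1" -C "$2" || return 1
    rm -f "$2"/*.cer
    ls "$2"
}

# Usage (file and directory names are examples):
# extract_ova_without_cert vSphereDataProtection.ova ./vdp-extracted
```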

This certificate is signed just for the OVA template and not for any particular port / service for the VDP itself.

EMC is currently working to update the certificate information for these templates. Hope this helps!

Monday, 28 August 2017

Bash Script To Extract vSphere Replication Job Information

Below is a bash script that extracts replication information for the configured VMs. It displays the name of the virtual machine, whether Network Compression and Quiesce Guest OS are enabled, the RPO, and the datastore MoRef ID. The RPO is shown in minutes because "bc" is not available on the vR SUSE appliance to do the floating-point conversion to hours.
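
Since the script already relies on awk, the conversion to hours could be done with awk rather than bc. This is only a sketch of that idea, not part of the script; the function name minutes_to_hours is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: convert an RPO given in minutes ($1) to hours with
# two decimal places, using awk for the floating-point division.
minutes_to_hours() {
    awk -v m="$1" 'BEGIN { printf "%.2f\n", m / 60 }'
}

# Usage:
# minutes_to_hours 90
```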

The complete updated script can be accessed from my GitHub Repo:
https://github.com/happycow92/shellscripts/blob/master/vR-jobs.sh

The script at the link will be updated as and when I add more information or reformat the output.

#!/bin/bash
# Tabulate vSphere Replication job settings from the VRMS database.
clear
echo -e " -----------------------------------------------------------------------------------------------------------"
echo -e "| Virtual Machine | Network Compression | Quiesce | RPO | Datastore MoRef ID |"
echo -e " -----------------------------------------------------------------------------------------------------------"
cd /opt/vmware/vpostgres/9.3/bin
# Run each query and write all result sets to /tmp/info.txt
./psql -U vrmsdb << EOF
\o /tmp/info.txt
select name from groupentity;
select networkcompressionenabled from groupentity;
select rpo from groupentity;
select quiesceguestenabled from groupentity;
select configfilesdatastoremoid from virtualmachineentity;
EOF
cd /tmp
# Print the data rows of the result set whose column header is $1:
# start at the header, skip the dashed separator, stop at "(N rows)".
# Values containing spaces are split into separate array entries below.
column_values() {
awk -v col="$1" '$1==col{i=1;next} /^\([0-9]+ rows?\)/{i=0} {if (i==1){i++;next}} i' info.txt
}
name_array=($(column_values name))
compression_array=($(column_values networkcompressionenabled))
quiesce_array=($(column_values quiesceguestenabled))
rpo_array=($(column_values rpo))
datastore_array=($(column_values configfilesdatastoremoid))
length=${#name_array[@]}
for ((i=0;i<length;i++))
do
printf "| %-32s | %-23s | %-10s | %-10s| %-20s|\n" "${name_array[$i]}" "${compression_array[$i]}" "${quiesce_array[$i]}" "${rpo_array[$i]}" "${datastore_array[$i]}"
done
rm -f info.txt
echo && echo

For any questions, do let me know. Hope this helps. Thanks.

Wednesday, 9 August 2017

Bash Script To Export VDP Backup Job Details

You can use this script to export your current backup and replication job configurations to a text file and save it to your local desktop. If you run into a redeployment situation and do not remember the backup configuration, you can refer to the exported text file.

The script exports the job name, state of the job, clients in the job, schedule, retention and type.
It currently does not export agent-level backup jobs such as SQL, Exchange and SharePoint.

The script needs the MCS service to be up, as it relies on it. I am planning to export the details from psql instead, so that it can be used even when MCS is down.

This is what I have for right now. The script can be accessed from the below link:
https://github.com/happycow92/shellscripts/blob/master/backup-job-detail.sh

Suggestions and bugs are always welcome. Drop a comment for anything.

Hope this helps!

Sunday, 30 July 2017

Bash Script To Determine Retired Clients.

While VDP has a built-in feature for listing unprotected VMs (that is, VMs not added to a VDP backup job), you might need a script to determine whether VMs have gone missing from a backup job.

The script has a simple algorithm:
> The first time it runs, it creates a file containing the list of all protected clients.
> The next time it runs, it checks what is missing compared with that protected client list.
> Newly added VMs are not reported as missing; however, on the next iteration of the script it will check whether the new clients are missing.
> If you remove the generated protected client list after the second execution, the third iteration will have nothing to compare against, as it will generate a new protected client list.
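
The comparison step of this algorithm can be sketched on its own with comm. This is an illustration rather than the script itself; the function name missing_clients is hypothetical, and it sorts both lists first because comm expects sorted input.

```shell
#!/bin/sh
# Hypothetical helper: print the clients that are in the baseline list ($1)
# but absent from the current list ($2). Clients that only appear in the
# current list (newly added VMs) are not printed.
missing_clients() {
    old_sorted=$(mktemp) || return 1
    new_sorted=$(mktemp) || return 1
    sort "$1" > "$old_sorted"
    sort "$2" > "$new_sorted"
    # -2 drops lines unique to the current list, -3 drops lines common to both
    comm -23 "$old_sorted" "$new_sorted"
    rm -f "$old_sorted" "$new_sorted"
}

# Usage:
# missing_clients /tmp/protected_client.txt /tmp/new_list.txt
```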

The script has an email feature to send the output to a mailing address. If you want to exclude this, discard line 21 to line 32. If you want to run the script as a cron job, you can add it with crontab -e, but the script cannot then prompt for the email address interactively. You will have to define a constant for your email address and use it in the heredoc (the EOF block).
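
For the cron case, a minimal sketch of the change could look like this. The address, the function name build_message and the script path in the crontab comment are all hypothetical; the idea is only that a constant replaces the interactive prompt.

```shell
#!/bin/sh
# Assumption: a fixed recipient replaces the interactive "read -p" prompt.
TO="backup-admin@example.com"   # hypothetical address

# Print the mail headers, a blank line, and then the body file ($2)
# for the recipient ($1).
build_message() {
    printf 'Subject: Missing VMs from Jobs\nTo: %s\n\n' "$1"
    cat "$2"
}

# In the script, the send step would then become:
# build_message "$TO" /tmp/email_list.txt | /usr/sbin/sendmail -f "admin@$(hostname)" -t
#
# Example crontab entry (script path is hypothetical), running daily at 06:00:
# 0 6 * * * /root/missing-client.sh
```

With -t, sendmail takes the recipient from the To: header, so the address only needs to be set in one place.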

The script can be accessed from my repository here:
https://github.com/happycow92/shellscripts/blob/master/missing-client.sh

The code:

#!/bin/bash
IFS=$(echo -en "\n\b")   # split on newlines, not spaces
FILE=/tmp/protected_client.txt
if [ ! -f "$FILE" ]   # first run: record the baseline list of protected clients
then
client_list=$(mccli client show --recursive=true | grep -i "/$(grep vcenter-hostname /usr/local/vdr/etc/vcenterinfo.cfg | cut -d '=' -f 2)/VirtualMachines" | awk -F/ '{print $(NF-2)}')
echo "$client_list" > /tmp/protected_client.txt
sort /tmp/protected_client.txt -o /tmp/protected_client.txt
else   # later runs: compare the current client list against the baseline
new_list=$(mccli client show --recursive=true | grep -i "/$(grep vcenter-hostname /usr/local/vdr/etc/vcenterinfo.cfg | cut -d '=' -f 2)/VirtualMachines" | awk -F/ '{print $(NF-2)}')
echo "$new_list" > /tmp/new_list.txt
sort /tmp/new_list.txt -o /tmp/new_list.txt
missing=$(comm -23 /tmp/protected_client.txt /tmp/new_list.txt)   # clients only in the baseline
if [ -z "$missing" ]
then
printf "\nNo clients missing\n"
else
printf "\nMissing Client is:\n" | tee -a /tmp/email_list.txt
printf "%s\n\n" "$missing" | tee -a /tmp/email_list.txt
printf "Emailing the list\n"
FILE=/tmp/email_list.txt
read -p "Enter Your Email: " TO
FROM=admin@$(hostname)
(cat - "$FILE")<< EOF | /usr/sbin/sendmail -f "$FROM" -t "$TO"
Subject: Missing VMs from Jobs
To: $TO
EOF
sleep 2s
printf "\nEmail Sent. Exiting Script\n\n"
fi
rm /tmp/new_list.txt
rm -f /tmp/email_list.txt
fi

Feel free to reply for any issues. Hope this helps!