Saturday, 22 October 2016

VDP Reports Incorrect Information About Protected Clients

When you connect to vSphere Data Protection in the web client, switch to the Reports tab, and select Unprotected Clients, you will see a list of VMs that are not protected by VDP. By "not protected by VDP", I mean that they are not added to any backup job on that particular appliance.

In some cases, the virtual machine is still listed under the Unprotected Clients section even though it is already added to a backup job. This mostly occurs after a rename operation on the virtual machine: the backup job picks up the new name, but the Unprotected Clients list under the Reports tab does not.

Here is the result of a small test.

1. I have a backup job called "Windows" with a VM called "Windows_With_OS" added under it.


2. In the Unprotected Clients section, you can see that this "Windows_With_OS" VM is not listed, as it is already protected.


3. Now, I will rename this virtual machine in my vSphere Client to "Windows_New".


4. Back in vSphere Data Protection, you can see the name is updated in the backup job list, but not in the Reports tab.


You can see that Windows_New now comes up under Unprotected Clients even though it is already protected. (Ignore the vmx file name, as it was renamed for other purposes.)


This report is incorrect; the VDP appliance should sync these changes automatically with vCenter naming changes. You can restart the services, the proxy, or even the entire appliance, and it will not fix this reporting.

This can also be confirmed from the virtual machine names recorded in MCS and GSAN. To check this:

1. Open an SSH session (PuTTY) to the VDP appliance. Log in as admin and elevate to root.
2. Run the below command:
# mccli client show --recursive=true


As you can observe here, MCS still picks up the old virtual machine name (mccli only reports MCS-related information).

3. To check what the GSAN shows, run the below command:
# avmgr getl --path=/vCenter-IP/Virtual-Machine-Domain
The vCenter IP and VM domain can be found from the above mccli command, which in my case is /192.168.1.1/VirtualMachines. The output is:


The avmgr command only reports GSAN-related information, and it too shows that the client ID belongs to the VM with the older name.

So the naming on your VDP server is out of sync with what MCS and GSAN have recorded.
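Before moving to the fix, here is a quick way to confirm the stale entries from the VDP shell. This is just a sketch reusing the two commands above, with the names and vCenter IP from my lab; substitute your own:

# mccli client show --recursive=true | grep -i -E "Windows_With_OS|Windows_New"
# avmgr getl --path=/192.168.1.1/VirtualMachines | grep -i windows

If the old name still shows up in either output, MCS and GSAN are carrying the stale entry.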

The solution:

You will have to force a sync of the naming changes between the Avamar server and the vCenter Server. To do this, you will need the proxycp.jar file, which can be downloaded from here.

A brief note about proxycp.jar: it is a Java archive containing a set of built-in commands that can be used to automate or run specific tasks from the command line. Some of these tasks would otherwise require changes in multiple locations and numerous files; proxycp.jar does them for you by running the required commands.

1. Once you download the proxycp.jar file, open a WinSCP session to the VDP appliance and copy the file into /root or, preferably, /tmp.

2. Then SSH into the VDP appliance, change directory to where the proxycp.jar file is, and run the following command:
# java -jar proxycp.jar --syncvmnames
The output:


The In Sync column was false for the renamed virtual machine, and the "syncvmnames" switch updated this value.

3. Now, if I go back to the Unprotected Clients list, this VM is no longer listed, and the mccli and avmgr commands mentioned earlier show the updated name.
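For reference, here is the whole workflow condensed into a handful of commands. Treat it as a sketch: the appliance IP 192.168.1.50 and the /tmp path are placeholders from my lab, and scp is simply a command-line alternative to WinSCP.

From the machine where you downloaded the jar:
# scp proxycp.jar admin@192.168.1.50:/tmp
On the VDP appliance, as root:
# cd /tmp
# java -jar proxycp.jar --syncvmnames
# mccli client show --recursive=true

The last command is only there to verify that MCS now reports the new virtual machine name.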

If something is a bit off for you in this case, feel free to comment.

Wednesday, 12 October 2016

vSphere Data Protection /data0? Partitions Are 100 Percent Full.

VDP can be connected to a Data Domain or use a local deduplication store to hold all the backup data. This article specifically discusses the case where VDP is connected to a Data Domain. As far as the deployment process goes, a VDP appliance with a Data Domain attached to it still has local data partitions as well. The sda mount is for your OS partitions, and sdb, sdc, and so on are for your data partitions (Hard disk 1, 2, 3, and so on).

These partitions, data01, data02, and so on (grouped as data0?), contain the metadata of the backups that are stored on the Data Domain. So, if you cd to /data01/cur and do a list ("ls"), you will see the metadata stripes.

0000000000000000.tab 
0000000000000008.wlg 
0000000000000012.cdt 
0000000000000017.chd

Before we get into the cause of this issue, let's have a quick look at what a retention policy is. When you create a backup job for a client or a group of clients, you define a retention policy for the restore points. The retention policy determines how long your restore points are kept after a backup. The default is 60 days and can be adjusted as needed.

Once the retention policy is reached, the restore point that has hit its expiration date is deleted. Then, during the maintenance window, Garbage Collection (GC) is executed, which performs the space reclamation. If you run status.dpn, you will notice a "Last GC" entry and the amount of space that was reclaimed.

Space reclamation by GC is done only on the data0? partitions. So, if your data0? partitions are at 100 percent, there are a few possible explanations:

1. Your retention period for all backups is set to "Never Expire", which is not recommended.
2. The GC was not executed at all during the maintenance window. 

If your backups are set to never expire, go ahead and set an expiration date for them; otherwise your data0? partitions will frequently hit 100 percent space usage.

To check whether your GC was executed successfully, run the below command:
# status.dpn
In the output, look at the "Last GC" field. You will either see an error here, such as DDR_ERROR, or see that the last GC ran weeks ago.
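If you want to check the GC status and the data0? usage in one go from the VDP shell, something like this works (a simple sketch; grep and df are standard, and status.dpn is the same command as above):

# status.dpn | grep -i "last gc"
# df -h | grep -i data0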

Also, if you log in to the vdp-configure page, you should notice that your maintenance services are not running. If that is the case, the space reclamation task will not run, and if the space reclamation task does not run, the metadata for expired backups is never cleared.

To understand why this happens, let's have a basic look at how MCS talks to the Data Domain. MCS runs on your VDP appliance. If there is a Data Domain attached to the appliance, MCS queries the Data Domain via the DD SSH keys.

This means there is a private-public key pair set up between the VDP appliance and the Data Domain system. With that key pair in place, there is no need for password authentication when MCS connects to the Data Domain: the VDP appliance authenticates with its private key, and the Data Domain validates it against the VDP public key stored on its end.

You can do a simple test to see if this is working by performing the below steps:

1. On the VDP appliance, start an ssh-agent and add the private key:
# ssh-agent bash 
# ssh-add ~admin/.ssh/ddr_key
2. Once the key is added, you should be able to log in to the Data Domain directly from the VDP shell without a password. This is how MCS connects too.
# ssh sysadmin@192.168.1.200
Two outcomes here: 

1. If there is no prompt to enter a password, it will directly connect you to the Data Domain console, and we are good to go. 

2. You are prompted for a passphrase and/or a password to log in to the Data Domain. If you run into this, it means the VDP public key is not loaded or is unavailable on the Data Domain end.

For this issue, we will most likely be running into outcome (2).
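If you prefer a single non-interactive test, ssh has a BatchMode option that fails immediately instead of prompting for a password. A hedged example (the DD IP is from my lab; if your ddr_key itself has a passphrase, stick with the ssh-agent method above):

# ssh -o BatchMode=yes -i ~admin/.ssh/ddr_key sysadmin@192.168.1.200 "filesys show space"

A "Permission denied" here, instead of the filesystem output, points to the missing public key on the Data Domain end.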

How to verify public key availability on the Data Domain end:

1. On the VDP appliance run the following command to list the public key:
# cat ~admin/.ssh/ddr_key.pub
The output would be similar to:

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAw7XWjEK0jVPrT0z6JDmdKUDLfvvoizdzTpWPoCWNhJ/LerUs9L4UkNr0Q0mTK6U1tnlzlQlqeezIsWvhYJHTcU8rh
yufw1/YZLoGeA0tsHl6ruFAeCIYuf5+mmLXluPhYrjGMdsDa6czjIAtoA4RMY9WjAtSOPX3L2B73Wf3BScigzC/D83aX8GnaldwQU88qkfmhN+dpy2IdxiFm4
hnK+2m4XMtveBTq/8/7medeBTMXYYe7j7DVffViU4DizeEpGj2TBxHIe2dGe0epFDDc9wpa8W5a/XPOeiz4WelHfKtqS1hYUpFEQWXUOngwjDPpqG+6k1t
1HoOp/+OVC3lGw== admin@vmrel-ts2014-vdp

2. On the Data Domain end, run the following command:
# adminaccess show ssh-keys user <ddboost-user>
Use your custom ddboost user, or sysadmin if that account itself was promoted to the ddboost user.

In our case, the above public key will not show up in the list.
The DD has its own private key and VDP has its own private key, but the public key of the VDP is not available on the Data Domain end, which leads to a password request when connecting over SSH from VDP to DD. Because of this, GC will not run, as MCS ends up waiting for a manual password entry.

To fix this:

1. Copy the public key of the VDP appliance obtained from the "cat" command mentioned earlier. Copy the entire string, starting from and including "ssh-rsa" through to the end, including the trailing user@hostname (ending in -vdp in my case).
Make sure no extra spaces or line breaks are copied, or this will not work.
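To avoid accidental line breaks when copying from a terminal, you can print the key on a single line first (a small sketch using standard tools):

# tr -d '\n' < ~admin/.ssh/ddr_key.pub ; echo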

2. Log in to the DD as sysadmin and run the following command:
# adminaccess add ssh-keys user <ddboost-user>
You will see a prompt like below:

Enter the key and then press Control-D, or press Control-C to cancel.

Then paste the copied key and press Ctrl+D (you will see the "SSH key accepted" message):

ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAw7XWjEK0jVPrT0z6JDmdKUDLfvvoizdzTpWPoCWNhJ/LerUs9L4UkNr0Q0mTK6U1tnlzlQlqeezIsWvhYJHTcU8rh
yufw1/YZLoGeA0tsHl6ruFAeCIYuf5+mmLXluPhYrjGMdsDa6czjIAtoA4RMY9WjAtSOPX3L2B73Wf3BScigzC/D83aX8GnaldwQU88qkfmhN+dpy2IdxiFm4
hnK+2m4XMtveBTq/8/7medeBTMXYYe7j7DVffViU4DizeEpGj2TBxHIe2dGe0epFDDc9wpa8W5a/XPOeiz4WelHfKtqS1hYUpFEQWXUOngwjDPpqG+6k1t
1HoOp/+OVC3lGw== admin@vmrel-ts2014-vdp
SSH key accepted.

3. Now test the login from VDP to DD using ssh sysadmin@192.168.1.200, and you should be connected directly to the Data Domain without a password prompt.

Even though we have re-established MCS connectivity to the DD, we now have to manually run a garbage collection to clear the expired metadata.

You first have to stop the backup scheduler and the maintenance service, or you will receive the below error when trying to run GC:
ERROR: avmaint: garbagecollect: server_exception(MSG_ERR_SCHEDULER_RUNNING)

To stop the backup scheduler and maintenance service:
# dpnctl stop maint
# dpnctl stop sched
Then, run the below command to force start a GC:
# avmaint garbagecollect --timeout=<how many seconds should GC run> --ava
4. Run df -h again; the space usage should drop considerably, provided all the backups have a sensible retention policy set.
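Putting the stop, GC, and verification together, the whole sequence looks like this. Treat it as a sketch: the timeout value is only an example, and the dpnctl start commands are the assumed counterparts for bringing the scheduler and maintenance back once GC completes.

# dpnctl stop maint
# dpnctl stop sched
# avmaint garbagecollect --timeout=7200 --ava
# dpnctl start sched
# dpnctl start maint
# df -h | grep -i data0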


**If you are unsure about this process, open a ticket with VMware to drive this further**

Wednesday, 5 October 2016

Understanding VDP Backups In A Data Domain Mtree

This is going to be a high-level view of how to find out which backups on the Data Domain relate to which clients on the VDP. Previously, we saw how to deploy and configure a Data Domain and connect it to a vSphere Data Protection 5.8 appliance. Here I am going to discuss what an Mtree is and some details around it, as this is needed for the next article, which will be the migration of VDP from 5.8/6.0 to 6.1.

In the Data Domain file system, an Mtree is created under the avamar ID node of the respective appliance to store the VDP files, checkpoint data, and Data Domain snapshots.

Now, on the data domain appliance, the below command needs to be executed to display the mtree list. 
# mtree list
The output is seen as:

Name                           Pre-Comp (GiB)   Status
----------------------------   --------------   ------
/data/col1/avamar-1475309625             32.0   RW
/data/col1/backup                         0.0   RW
----------------------------   --------------   ------

So the avamar node ID is 1475309625. To confirm this is the same mtree node created for your VDP appliance, run the following on the VDP appliance:
# avmaint config --ava | grep -i "system"
The output is:
  systemname="vdp58.vcloud.local"
  systemcreatetime="1475309625"
  systemcreateaddr="00:50:56:B9:54:56"

The system create time is nothing but the avamar ID. Using these two commands you can confirm which VDP appliance corresponds to which mtree on the data domain. 
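A quick way to line the two up (a sketch reusing the same commands, with grep to trim the VDP output):

On the Data Domain:
# mtree list
On the VDP appliance:
# avmaint config --ava | grep -i systemcreatetime

If the number after avamar- in the first output matches the systemcreatetime value in the second, you are looking at the mtree that belongs to that VDP appliance.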

Now, I mentioned earlier that the mtree is where your backup data, VDP checkpoints, and other related files reside. So, the next question is how to see the directories under this mtree. To check them, run the following command on the Data Domain appliance:
 # ddboost storage-unit show avamar-<ID>
The output is:

Name                Pre-Comp (GiB)   Status   User
-----------------   --------------   ------   ------------
avamar-1475309625             32.0   RW       ddboost-user
-----------------   --------------   ------   ------------
cur
ddrid
VALIDATED
GSAN
STAGING
  • The cur directory, also called the "current" directory, is where all your backup data is stored.
  • The VALIDATED folder is where the validated checkpoints reside.
  • The GSAN folder contains the checkpoints that were copied over from the VDP appliance. Remember, in the previous article we checked the option "Enable Checkpoint Copy"; that is what copies the daily generated checkpoints on VDP over to the data domain. Why is this required? We will look into that in much more detail during the migrate operation.
  • The STAGING folder is where your in-progress backup jobs are saved. Once a backup job completes successfully, its data is moved to the cur directory. If the backup job fails, the data remains in the STAGING folder and is cleared out during the next GC on the data domain.
Now, as mentioned before, DD OS does not have the complete set of commands available in Linux, which is why you will have to enter SE (System Engineering) mode and enable the bash shell to obtain the superset of commands needed to browse and modify directories.

Please note: This is meant to be handled by an EMC technician only. All the information I am displaying here is purely from my lab. Try this at your own risk. If you are uncomfortable, stop now and involve EMC support.

To enable the bash shell, we first have to enter SE mode. To do this, we need the SE password, which is your system serial number. This can be obtained from the below command:
# system show serialno
The output is similar to:
Serial number: XXXXXXXXXXX

Enable SE mode using the below command and enter the serial number as the password when prompted:
 # priv set se
Once the password is provided, you can see that the user sysadmin@data-domain has changed to SE@data-domain.

Now, we need to enable the bash shell for your data domain. Run these commands in the same order:

1. Display the OS information using:
# uname
You will see:
Data Domain OS 5.5.0.4-430231

2. Check that the file system is enabled and running using:
# fi st
You will see:
The filesystem is enabled and running.

3. Run the below command to show the filesystem space:
 # filesys show space
You will see

Active Tier:
Resource           Size GiB   Used GiB   Avail GiB   Use%   Cleanable GiB*
----------------   --------   --------   ---------   ----   --------------
/data: pre-comp           -       32.0           -      -                -
/data: post-comp      404.8        2.0       402.8     0%              0.0
/ddvar                 49.2        2.3        44.4     5%                -
----------------   --------   --------   ---------   ----   --------------
 * Estimated based on last cleaning of 2016/10/04 06:00:58.

4. Press "Ctrl+C" three times and then type shell-escape
This enters you to the bash shell and you will see the following screen.

*************************************************************************
****                            WARNING                              ****
*************************************************************************
****   Unlocking 'shell-escape' may compromise your data integrity   ****
****                and void your support contract.                  ****
*************************************************************************
!!!! datadomain YOUR DATA IS IN DANGER !!!! #

Again, proceed at your own risk and 100^10 percent, involve EMC when you do this. 

You saw the mtree was located at the path /data/col1/avamar-ID. The data partition is not mounted by default and needs to be mounted and unmounted manually. 

To mount the data partition run the below command:
# mount localhost:/data /data
This returns to the next line without showing any output. Once the partition has been mounted successfully, you can use your regular Linux commands to browse the mtree.
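When you are done browsing, unmount the data partition again before leaving the shell; a plain umount should be enough here (if it complains that the target is busy, cd out of /data first):

# umount /data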

So, a cd to /data/col1/avamar-ID followed by an "ls" shows the following:

drwxrwxrwx  3 ddboost-user users 167 Oct  1 01:46 GSAN
drwxrwxrwx  3 ddboost-user users 190 Oct  2 02:40 STAGING
drwxrwxrwx  9 ddboost-user users 563 Oct  3 20:33 VALIDATED
drwxrwxrwx  4 ddboost-user users 279 Oct  2 02:40 cur
-rw-rw-rw-  1 ddboost-user users  40 Oct  1 01:43 ddrid

As mentioned, before the "cur" directory has all your successfully backed up data. If you change your directory to cur and do a "ls" you will find the following:

drwxrwxrwx  4 ddboost-user users 229 Oct  2 07:30 5890c0677a03211b49a9cf08bf1dcebd2d7cd77d

Now, this is the Client ID of the client (VM) that was successfully backed up by VDP.
To find which client on VDP corresponds to which CID on the data domain, we have 2 simple commands. 

To understand this, I presume you have a fair idea of what MCS and GSAN are on vSphere Data Protection. Your GSAN node is responsible for storing all the actual backup data when you use local VMDK storage. If your VDP is connected to a Data Domain, then GSAN holds only the metadata of the backups and not the actual backup data (as that is on the Data Domain).
MCS, in brief, is what waits for the work order and calls in avagent and avtar to perform the backup. If MCS knows there is a Data Domain connected, it uses the DD public-private key pair (also called the SSH keys) to talk to the DD and perform the regular maintenance tasks.

So, first, we will run the avmgr command (avmgr is only for GSAN and will not work if GSAN is not running) to display the client ID on the GSAN node. The command would be:
# avmgr getl --path=/VC-IP/VirtualMachines
The output is:

1  Request succeeded
1  RHEL_UDlVr74uB7JdXN8jgjRLlQ  location: 5890c0677a03211b49a9cf08bf1dcebd2d7cd77d      pswd: 0d0d7c6b09f2a2234c108e4f0647c277e8bf2562

The location value, 5890c0677a03211b49a9cf08bf1dcebd2d7cd77d, is nothing but the client ID on the GSAN for the client RHEL (a virtual machine).

Then, we will run the mccli command (mccli is only for MCS and needs MCS to be up and running) to display the client ID on the MCS server. The command would be:
# mccli client show --domain=/VC-IP/VirtualMachines --name="Client_name"
For example:
# mccli client show --domain=/192.168.1.1/VirtualMachines --name="RHEL_UDlVr74uB7JdXN8jgjRLlQ"
The output is a pretty detailed one; what we are interested in is this particular line:
CID                      5890c0677a03211b49a9cf08bf1dcebd2d7cd77d

So, we see the client ID on data domain = client ID on the GSAN = client ID on the MCS

Here, if your client ID on GSAN does not match the client ID on MCS, then your full VM restores and file-level restores will not work. In case of a mismatch, the CID has to be corrected to get the restores working.
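A quick side-by-side check from the VDP shell (a sketch reusing the two commands above; the vCenter IP, domain, and client name are from my lab):

# avmgr getl --path=/192.168.1.1/VirtualMachines | grep -i RHEL
# mccli client show --domain=/192.168.1.1/VirtualMachines --name="RHEL_UDlVr74uB7JdXN8jgjRLlQ" | grep -i CID

The location value from the first command should match the CID value from the second.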

Now, back to the data domain end: we were under the cur directory, right? Next, I will change directory into the CID:

# cd 5890c0677a03211b49a9cf08bf1dcebd2d7cd77d

I will then do another "ls" to list the sub directories under it, and you may or may not notice the following:

drwxrwxrwx  2 ddboost-user users 1.2K Oct  2 02:55 1D21C9327C2E4C6
drwxrwxrwx  2 ddboost-user users 1.4K Oct  2 07:30 1D21CB99431214C

If you have one folder, which is a sub client ID, it means only one backup has been executed and completed successfully for the virtual machine. If you see multiple folders, then multiple backups have completed for this VM.

To find out which backup was done first and which were the subsequent backups, we have to query the GSAN since, as you know, the GSAN holds the metadata of the backups.

Hence, on the VDP appliance, run the below command:
# avmgr getb --path=/VC-IP/VirtualMachines/Client-Name --format=xml
For example:
# avmgr getb --path=/192.168.1.1/VirtualMachines/RHEL_UDlVr74uB7JdXN8jgjRLlQ --format=xml
The output will be:

<backuplist version="3.0">

  <backuplistrec flags="32768001" labelnum="2" label="RHEL-DD-Job-RHEL-DD-Job-1475418600010" created="1475418652" roothash="505f1aba07f19d64df74670afa59ed39a3ece85d" totalbytes="17180938240.00" ispresentbytes="0.00" pidnum="1016" percentnew="0" expires="1476282600" created_prectime="0x1d21cb99431214c" partial="0" retentiontype="daily,weekly,monthly,yearly" backuptype="Full" ddrindex="1" locked="1"/>
  
  <backuplistrec flags="16777217" labelnum="1" label="RHEL-DD-Job-1475401181065" created="1475402150" roothash="22dc0dddea797d909a2587291e0e33916c35d7a2" totalbytes="17180938240.00" ispresentbytes="0.00" pidnum="1016" percentnew="0" expires="1476265181" created_prectime="0x1d21c9327c2e4c6" partial="0" retentiontype="none" backuptype="Full" ddrindex="1" locked="0"/>
</backuplist>

Looks confusing? Maybe, so let's look at the specific fields:

The labelnum field shows the order of the backups:
labelnum=1 means the first backup, 2 the second, and so on.

roothash is the hash value of the backup job. The next time you run an incremental backup, it checks the existing hashes, and DD Boost backs up only the new ones. The atomic hashes are then combined to form one unique root hash, so the root hash for each backup is unique.

created_prectime is the main field we need; this is what we called the sub client ID.
For labelnum=1, we see the sub CID is 0x1d21c9327c2e4c6
For labelnum=2, we see the sub CID is 0x1d21cb99431214c
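The directories under the client CID are simply these created_prectime values with the leading 0x dropped and the letters upper-cased, so a tiny shell conversion maps one to the other (a convenience sketch; the value is taken from the output above):

# echo 0x1d21cb99431214c | sed 's/^0x//' | tr '[:lower:]' '[:upper:]'
1D21CB99431214C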

Now, let's go further into the CID. For example, if I cd into the 1D21C9327C2E4C6 directory (the sub CID 0x1d21c9327c2e4c6) and perform an "ls", I will see the following:

-rw-rw-rw-  1 ddboost-user users  485 Oct  2 02:40 1188BE924964359A5C8F5EAEF552E523FBA83566
-rw-rw-rw-  1 ddboost-user users 1.1K Oct  2 02:40 140A189746A6EC3C49D24EA43A7811205345F1F4
-rw-rw-rw-  1 ddboost-user users 3.8K Oct  2 02:40 2CE724F2760C46CB67F679B76657C23606C06869
-rw-rw-rw-  1 ddboost-user users 2.5K Oct  2 02:40 400206DF07A942C066971D84F0CF063D2DE50F08
-rw-rw-rw-  1 ddboost-user users 1.0M Oct  2 02:55 4F50E1E506477801D0A566DEE50E5364B0F04BF0
-rw-rw-rw-  1 ddboost-user users  451 Oct  2 02:55 79DDA236EEEF192EED66CF605CD710B720A41E1F
-rw-rw-rw-  1 ddboost-user users 1.1K Oct  2 02:55 AFB6C8621EB6FA86DD8590841F80C7C78AC7BEEC
-rw-rw-rw-  1 ddboost-user users 1.9K Oct  2 02:40 B17DD9B7E8B2B6EE68294248D8FA42A955539C4C
-rw-rw-rw-  1 ddboost-user users  16G Oct  2 02:55 B212DB46684FFD5AFA41B87FD71A44469B04A38C
-rw-rw-rw-  1 ddboost-user users   15 Oct  2 02:40 D2CFFD87930DAEABB63EAEAA3C8C2AA9554286B5
-rw-rw-rw-  1 ddboost-user users 9.4K Oct  2 02:40 E2FF0829A0F02C1C6FA4A38324A5D9C23B07719B
-rw-rw-rw-  1 ddboost-user users 3.6K Oct  2 02:55 ddr_files.xml

There is a main record file called ddr_files.xml. This file has the information about what each of the other files in this directory is for.

So if I take the first hex name and grep for it in ddr_files.xml, I see the following:
# grep -i 1188BE924964359A5C8F5EAEF552E523FBA83566 ddr_files.xml
The output of interest is:
clientfile="virtdisk-descriptor.vmdk"

So this is a vmdk file that was backed up.

Similarly,
# grep -i 400206DF07A942C066971D84F0CF063D2DE50F08 ddr_files.xml
The output of interest is:
clientfile="vm.nvram"

And one more example:
# grep -i 4F50E1E506477801D0A566DEE50E5364B0F04BF0 ddr_files.xml
The output of interest is:
clientfile="virtdisk-flat.vmdk"

So if the VM file IDs are not populated correctly in ddr_files.xml, then again, your restores will not work. Engage EMC to get this corrected because, I am stressing again, do not fiddle with this in your production environment.

That's pretty much it for this. If you have questions feel free to comment or in-mail. The next article is going to be about Migrating VDP 5.8/6.0 to 6.1 with a data domain.

Saturday, 1 October 2016

Configuring Data Domain Virtual Edition And Connecting It To A VDP Appliance

Recently, while browsing through my VMware repository, I came across EMC Data Domain Virtual Edition. Before this, the only times I worked on EMC DD cases were over WebEx sessions that involved integration or backup issues between VDP and Data Domain. Currently, my expertise on Data Domain is limited, and with this appliance edition you can expect more updates on VDP-DD issues.

First things first, this article is going to be about deploying the virtual edition of Data Domain and connecting it to VDP. You can say this is the initial setup. I am going to skip certain parts, such as the OVF deployment, as I assume these are already known to you.

1. The Data Domain OVA can be acquired from the EMC download site. It contains an OVF file, a VMDK file, and an .mf (manifest) file.
A prerequisite is to have forward and reverse DNS configured for your DD appliance. Once this is in place, use either the Web Client or the vSphere Client to complete your deployment. Do not power on the appliance post deployment.

2. Once the deployment is complete, if you go to Edit Settings on the Data Domain appliance, you can see that there are two hard drives for the system, 120 GB and 10 GB. These are your system drives, and you will need to add an additional drive (500 GB) for storing the backup data. At this point, go ahead and perform an "Add Device" operation and add a new VMDK of 500 GB in size. After the drive is added successfully, power on the Data Domain appliance.

3. The default credentials are username: sysadmin, password: abc123. At this point, a script is executed automatically to configure the network. I have not included this in the screenshots, as it is a fairly simple task. The options are: configure IPv4 network, IP address, DNS server address, hostname, domain name, subnet mask, and default gateway. IPv6 is optional and can be ignored if not required. This prompt will also ask whether you would like to change the password for sysadmin.

4. The next step is to add the 500 GB storage that was created for this appliance. Run the following command in an SSH session to the Data Domain appliance to add the 500 GB disk:
# storage add dev3
This would be seen as:


5. Once the additional disk is added to the active tier, you then need to create a file system on it. Run the following command to create a file system:
# filesys create
This would be seen as:


This will provision the storage and create the FS. The progress will be displayed once the prompt is acknowledged with "yes" as above.


6. Once the file system is created, the next part is to enable the file system, which again is done by a pretty simple command:
# filesys enable

Note that this is not a full Linux OS, so most Linux commands will not work on a Data Domain. The Data Domain has its own OS, DD OS, which supports only its own set of commands.

The output for enabling file system would be seen as:


Perform a restart of the data domain appliance before proceeding with the next steps. 

7. Once the reboot is completed, open a browser and go to the following link: 
https://data-domain-ip/ddem
If you see an "unable to connect" message in the browser, go back to the SSH session and run the following command:
# admin show
The output we need is:

Service   Enabled   Allowed Hosts
-------   -------   -------------
ssh       yes       -
scp       yes       (same as ssh)
telnet    no        -
ftp       no        -
ftps      no        -
http      no        -
https     no        -
-------   -------   -------------

Web options:
Option            Value
---------------   -----
http-port         80
https-port        443
session-timeout   10800

So the ports are already set; however, the services are not enabled. We need to enable the http and https services here. The commands are:
sysadmin@datadomain# adminaccess enable http

HTTP Access:    enabled
sysadmin@datadomain# adminaccess enable https

HTTPS Access:   enabled
This should do the trick, and now you should be able to log in to the Data Domain manager from the browser.


The login username is sysadmin along with the password for this user (either the default, if you chose not to change it during the SSH setup, or the newly configured password).

8. Now, VDP connects to the Data Domain via a ddboost user. You need to create a ddboost user with admin rights and enable DD Boost. This can be done from the GUI or from the command line; the command line is much easier.

The first step would be to "add" a ddboost user:
# user add ddboost-user password VMware123!
The next step would be to "set" the created ddboost user:
# ddboost set user-name ddboost-user
The last step would be to enable ddboost:
# ddboost enable 
You can also check ddboost status with:
# ddboost status

This will automatically be reflected in the GUI. The location for DD Boost in the Data Domain web manager is Data Management > DD Boost > Settings. So, if you choose not to add the user from the command line, you can do the same from here.


9. Now, you need to grant this ddboost user admin rights, or you will get the following error when connecting the Data Domain to the VDP appliance:


To grant the ddboost user admin rights, navigate in the web GUI to System Settings > Access Management > Local Users. Check the ddboost user and select the Edit option, then from the management role drop-down select admin and click OK.


10. After this, there is one more small configuration required before connecting the VDP appliance: the NTP settings. From the Data Domain web GUI, navigate to System Settings > General Configuration > Time and Date Settings and add your NTP server.

11. With this completed, we will now go ahead and connect this Data Domain to the VDP appliance.
Log in to the vdp-configure page at:
https://vdp-ip:8543/vdp-configure
12. Click Storage > Gear Icon > Add  Data Domain


13. Provide the primary IP of the Data Domain, the ddboost user and password, and select Enable Checkpoint Copy.

14. Enter the community string for the SNMP configuration.


Once the data domain addition completes successfully, you should be able to see the data domain storage details under the "Storage" section:


If you SSH into the VDP appliance and navigate to /usr/local/avamar/var, you will see a file called ddr_info, which holds the information about the attached Data Domain.
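A simple sanity check from the appliance shell (just a sketch; cat is all it takes):

# cat /usr/local/avamar/var/ddr_info

If the Data Domain was added successfully, its details will show up in this file.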

This is the basic setup for connecting the virtual edition of Data Domain to a VDP appliance.

Friday, 16 September 2016

Expired Certificate On VDP Port 29000

So recently I was working on a case where the certificate had expired on port 29000 for VDP 6.1.2. Port 29000 is used for replication data, as can be seen on page 148 of the admin guide here. Now, the same admin guide, on page 46, talks about replacing the certificate for the appliance; however, that applies to the web management page on port 8543, which is your vdp-configure page.

If you then go to "https://vdp-IP:29000", you will still see the certificate as expired, as below:


How do we get this certificate replaced? Well, here are the steps.
We are using self-signed certificates here, and the replacement is also done with a new self-signed certificate.

1. Login to the VDP appliance as admin
2. Type the below command:
# cd ~
3. Use the below command to generate a new self signed certificate:
# openssl req -x509 -newkey rsa:3072 -keyform PEM -keyout avamar-1key.pem -nodes -outform PEM -out avamar-1cert.pem -subj "/C=US/ST=California/L=Irvine/O=Dell EMC, Inc./OU=Avamar/CN=vdp-hostname.customersite.com" -days 3654
Replace vdp-hostname.customersite.com with your VDP appliance hostname.
The "days" parameter can be altered to set the certificate validity as per your requirement.

The above command generates a SHA1 certificate. If you would like to generate a SHA256 certificate, then the command would be:
# openssl req -x509 -sha256 -newkey rsa:3072 -keyform PEM -keyout avamar-1key.pem -nodes -outform PEM -out avamar-1cert.pem -subj "/C=US/ST=California/L=Irvine/O=Dell EMC, Inc./OU=Avamar/CN=vdp-hostname.customersite.com" -days 3654
4. The key and certificate will be written to the /home/admin directory and called "avamar-1key.pem" and "avamar-1cert.pem", respectively.

5. Follow page 38 of this Avamar security guide to perform the replacement of the certificates.

6. After restarting the gsan service as per the security guide, you can now go to "https://vdp-IP:29000" and verify that the certificate has been renewed.
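You can also confirm this from the command line instead of the browser: openssl can read the new certificate file directly, and it can also pull the certificate being served on port 29000 and print its validity dates (a hedged sketch; replace vdp-hostname with your appliance's hostname):

# openssl x509 -in /home/admin/avamar-1cert.pem -noout -subject -dates
# openssl s_client -connect vdp-hostname:29000 </dev/null 2>/dev/null | openssl x509 -noout -subject -dates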

That's pretty much it. Should be helpful if you receive warnings during security scans against VDP.

Monday, 22 August 2016

Installation Of SQL Agent For VDP Fails.


When trying to install the SQL agent for vSphere Data Protection the agent installation fails with the error:

"VMware VDP for SQL Server Setup Wizard ended prematurely"


During the installation, a directory called avp is created under the installation drive in \Program Files. This folder holds the logs that contain the details regarding the failure. However, when the agent installation fails and rolls back, this folder is removed automatically, so the logs are lost. To gather the logs, we have to redirect the MSI installer log to a different location.

To do this, run the installer with the below command:
msiexec /i "C:\MyPackage\Example.msi" /L*V "C:\log\example.log"

The /i parameter launches the MSI package, and /L*V writes a verbose log to the specified file. After the installation attempt finishes, the log is complete.

When the installation fails this time, the above log file will contain the cause of the failure.
In my case, the following was noticed:

Property(S): UITEXT_SUPPORT_DROPPED = This release of the VMware VDP client software does not support the operating system version on this computer. Installing the client on this computer is not recommended.

The "EMC Avamar Compatibility and Interoperability Matrix" on EMC Online Support at https://support.EMC.com provides client and operating system compatibility information.

So the failure here is due to an unsupported guest OS. The SQL Server version was 2014, which is very well supported. This SQL Server was running on a Windows Server 2012 R2 box.

From the VDP admin guide (page 162), we see that Windows Server 2012 R2 is an unsupported guest OS for this client:

From the EMC Avamar compatibility matrix (page 38), we see again that this is not supported.

So, if you see such issues, please verify that the agent being installed is compatible with the guest OS release.

That's pretty much it!

Thursday, 18 August 2016

VDP Backup Fails: Failed To Download VM Metadata

A virtual machine backup was continuously failing with the error: VDP: Miscellaneous error.

When you look at the backup job log at the location:
# cd /usr/local/avamarclient/var

2016-08-18T08:24:59.099+07:00 avvcbimage Warning <40722>: The VM 'Win2012R2/' could not be located on the datastore:[FreeNAS-iSCSI] Win2012R2/Win2012R2.vmx
2016-08-18T08:24:59.099+07:00 avvcbimage Warning <40649>: DataStore/File Metadata download failed.
2016-08-18T08:24:59.099+07:00 avvcbimage Warning <0000>: [IMG0016] The datastore from VMX '[FreeNAS-iSCSI] Win2012R2/Win2012R2.vmx' could not be fully inspected.
2016-08-18T08:24:59.128+07:00 avvcbimage Info <0000>: initial attempt at downloading VM file failed, now trying by re-creating the HREF, (CURL) /folder/Win2012R2%2fWin2012R2.vmx?dcPath=%252fDatacenter&dsName=FreeNAS-iSCSI, (LEGACY) /folder/Win2012R2%2FWin2012R2.vmx?dcPath=%252FDatacenter&dsName=FreeNAS%252DiSCSI.

2016-08-18T08:25:29.460+07:00 avvcbimage Info <40756>: Attempting to Download file:/folder/Win2012R2%2FWin2012R2.vmx?dcPath=%252FDatacenter&dsName=FreeNAS%252DiSCSI
2016-08-18T08:25:29.515+07:00 avvcbimage Warning <15995>: HTTP fault detected, Problem with returning page from HTTP, Msg:'SOAP 1.1 fault: SOAP-ENV:Server [no subcode]
"HTTP Error"
Detail: HTTP/1.1 404 Not Found

2016-08-18T08:25:59.645+07:00 avvcbimage Warning <40650>: Download VM .vmx file failed.
2016-08-18T08:25:59.645+07:00 avvcbimage Error <17821>: failed to download vm metadata, try later
2016-08-18T08:25:59.645+07:00 avvcbimage Info <9772>: Starting graceful (staged) termination, failed to download vm metadata (wrap-up stage)
2016-08-18T08:25:59.645+07:00 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation  or pre/post snapshot script failed
2016-08-18T08:25:59.645+07:00 avvcbimage Error <0000>: [IMG0009] createSnapshot: snapshot creation/pre-script/post-script failed
2016-08-18T08:25:59.645+07:00 avvcbimage Info <40654>: isExitOK()=157

Here we see that the backup is failing because it is unable to gather the .vmx file details of the client it is trying to back up, and it cannot gather the .vmx details because it is unable to locate the client's files on the datastore.

This issue occurs when the virtual machine files, or the folder itself, are manually moved to a different folder on the datastore.

If you look at the below screenshot, you can see my virtual machine is under a folder called "do not power on".


Now, if you have a look at the actual .vmx path registered for this Windows2012R2 machine, you can see it is different.


From this, my virtual machine was initially under FreeNAS-iSCSI / VM folder / VM files and was then moved to the "do not power on" folder. If you just move the virtual machine files, the registered .vmx path is not updated with the new location. During the backup, VDP looks at this registered path of the virtual machine and fails because the .vmx file is no longer in that location.

To fix this, you will have to remove the virtual machine from the inventory and re-add it. Doing this updates the registered .vmx location, as seen below. Once the correct location is populated, you can run backups for this virtual machine.


To register the VM back into the inventory, you can follow the KB article here.
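Alternatively, if you have SSH access to the ESXi host, vim-cmd gives a quick way to check the registered .vmx path and to unregister and re-register the VM from the command line. This is only a sketch: the datastore path is taken from this example, the VM should be powered off, and you should adjust the names to your environment.

# vim-cmd vmsvc/getallvms | grep -i Win2012R2
# vim-cmd vmsvc/unregister <vmid-from-previous-output>
# vim-cmd solo/registervm "/vmfs/volumes/FreeNAS-iSCSI/do not power on/Win2012R2/Win2012R2.vmx"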

That's pretty much it!