vCloud Director Appliance stops working – out of disk space

Team evoila
24 March 2013
Reading time: 7 min

Playing around with the vCloud Director Appliance in my lab, I downloaded vApps to a CIFS share to make backups of my vApp templates. The first download worked perfectly – no complaints. The second download failed with a very vague message like “download failed” or something similar. Sorry for not knowing the exact wording, but I didn’t expect this to become a blog post 😉

After trying a few more times, I decided to reboot the appliance. Nothing. vCloud Director didn’t even come back up! A port scan showed that ports 80 and 443 were not listening:

mathias@x220:~$ nmap 10.173.10.22

Starting Nmap 6.00 ( http://nmap.org ) at 2013-03-24 14:31 CET
Nmap scan report for 10.173.10.22
Host is up (0.089s latency).
Not shown: 997 closed ports
PORT     STATE SERVICE
22/tcp   open  ssh
111/tcp  open  rpcbind
3389/tcp open  ms-wbt-server

Nmap done: 1 IP address (1 host up) scanned in 2.53 seconds
mathias@x220:~$

So, once more it was time to take a look at the log files:

mathias@x220:~$ ssh root@10.173.10.22
# This is a dummy banner. Replace this with your own banner, appropriate for
# the VA
root@10.173.10.22's password:
Last login: Sun Mar 24 09:22:57 2013
vcd:~ # cd /opt/vmware/vcloud-director/logs/
vcd:/opt/vmware/vcloud-director/logs #

In cell.log I found some good pieces of information:

java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(Unknown Source)
        at sun.nio.cs.StreamEncoder.writeBytes(Unknown Source)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(Unknown Source)
        at sun.nio.cs.StreamEncoder.implFlush(Unknown Source)
        at sun.nio.cs.StreamEncoder.flush(Unknown Source)
        at java.io.OutputStreamWriter.flush(Unknown Source)
        ...
log4j:ERROR Failed to flush writer,
java.io.IOException: No space left on device
        at java.io.FileOutputStream.writeBytes(Native Method)
        at java.io.FileOutputStream.write(Unknown Source)
        at sun.nio.cs.StreamEncoder.writeBytes(Unknown Source)
        at sun.nio.cs.StreamEncoder.implFlushBuffer(Unknown Source)
        at sun.nio.cs.StreamEncoder.implFlush(Unknown Source)
        at sun.nio.cs.StreamEncoder.flush(Unknown Source)
        at java.io.OutputStreamWriter.flush(Unknown Source)
 ...

I trimmed the rest of the Java stack trace for you, as the interesting part is in the very first line: “No space left on device”. So, obviously, for some reason the disk filled up, leaving no space for the files vCloud Director needs to create in order to start completely (it could even be the PID file that cannot be written to disk).
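In hindsight, a tiny check like the following could have warned me before the root filesystem filled up completely. This is just a sketch; the 5 GB threshold is my own assumption, not anything vCloud Director mandates:

```shell
#!/bin/sh
# Warn when free space on / drops below a threshold (the value is an assumption).
THRESHOLD_KB=$((5 * 1024 * 1024))   # 5 GB in KB
# df -Pk gives portable, KB-based output; field 4 is "Available".
avail_kb=$(df -Pk / | awk 'NR==2 {print $4}')
if [ "$avail_kb" -lt "$THRESHOLD_KB" ]; then
    echo "WARNING: only ${avail_kb} KB free on / -- vCloud Director may fail to start"
fi
```

Dropped into cron, this would have flagged the shrinking root volume long before the cell died.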

vcd:~ # df -h
Filesystem      Size  Used Avail Use% Mounted on
/dev/sda3        28G   27G  1.3M 100% /
udev            1.3G  108K  1.3G   1% /dev
tmpfs           1.3G  663M  594M  53% /dev/shm
/dev/sda1       128M   21M  101M  17% /boot
vcd:~ #

The df command shows that the root volume has filled up entirely! Now, let’s find out why: I used the du command to find the heaviest folders:

vcd:/ # du -h --max-depth=1
0    ./proc
3.2G    ./u01
20K    ./srv
44M    ./var
4.0K    ./media
8.0K    ./mnt
565M    ./usr
133M    ./lib
15M    ./boot
4.0K    ./selinux
663M    ./dev
11M    ./lib64
22G    ./opt
264K    ./tmp
7.3M    ./etc
40K    ./home
152K    ./root
7.5M    ./bin
16K    ./lost+found
0    ./sys
8.7M    ./sbin
27G    .
vcd:/ #

As you can see, there is one extraordinarily big folder: /opt. So let’s drill down to see where this comes from.

vcd:/ # cd /opt/
vcd:/opt # du -h --max-depth=1
8.0K    ./keystore
22G    ./vmware
22G    .
vcd:/opt # cd vmware/
vcd:/opt/vmware # du -h --max-depth=1
22G    ./vcloud-director
22G    .
vcd:/opt/vmware # cd vcloud-director/
vcd:/opt/vmware/vcloud-director # du -h --max-depth=1
...
22G    ./data
22G    .
vcd:/opt/vmware/vcloud-director # cd data/
vcd:/opt/vmware/vcloud-director/data # du -h --max-depth=1
...
22G    ./transfer
22G    .
vcd:/opt/vmware/vcloud-director/data #
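Instead of drilling down directory by directory, a single du piped into sort ranks the biggest subtrees in one go. A sketch; -x keeps du on one filesystem, and -k prints plain KB values so a numeric sort works even on older coreutils:

```shell
# Rank the ten largest subtrees under /opt, staying on one filesystem.
du -xk --max-depth=3 /opt 2>/dev/null | sort -rn | head -n 10
```

With that one-liner, the transfer folder would have jumped out immediately.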

The 22 GB come from the /opt/vmware/vcloud-director/data/transfer folder. The transfer folder is used for activities like uploading and downloading VM media to and from vCloud Director, and it filled up when I exported my vApps earlier:

vcd:/opt/vmware/vcloud-director/data/transfer # du -h --max-depth=1
...
22G    ./cb3f993c-ebee-4317-8067-d31ff2bf4bdb
22G    .
vcd:/opt/vmware/vcloud-director/data/transfer # cd cb3f993c-ebee-4317-8067-d31ff2bf4bdb/
vcd:/opt/vmware/vcloud-director/data/transfer/cb3f993c-ebee-4317-8067-d31ff2bf4bdb # ls -hl
total 22G
-rw------- 1 vcloud vcloud 8.1G Mar 24 05:41 vm-44998f9b-6ae0-4338-9f25-3f7026b4db75-disk-1.vmdk
-rw------- 1 vcloud vcloud 296M Mar 24 05:25 vm-5beaf927-2c35-4f6b-b60b-73112f8980fc-disk-0.vmdk
-rw------- 1 vcloud vcloud  11G Mar 24 05:25 vm-6d96a394-ab53-4ccd-bd5a-bce6ba33ee51-disk-0.vmdk
-rw------- 1 vcloud vcloud 1.3G Mar 24 05:04 vm-b0a03be7-8c51-4288-9340-60ce585c5783-disk-1.vmdk
-rw------- 1 vcloud vcloud 1.5M Mar 24 05:04 vm-b0a03be7-8c51-4288-9340-60ce585c5783-disk-2.vmdk
-rw------- 1 vcloud vcloud  92K Mar 24 05:04 vm-b0a03be7-8c51-4288-9340-60ce585c5783-disk-3.vmdk
-rw------- 1 vcloud vcloud 576M Mar 24 05:05 vm-b0a03be7-8c51-4288-9340-60ce585c5783-disk-4.vmdk
vcd:/opt/vmware/vcloud-director/data/transfer/cb3f993c-ebee-4317-8067-d31ff2bf4bdb #

In a subfolder I found a lot of VMDK files belonging to my exported virtual machines.

So what can we do? First, I decided to remove all files from the transfer folder:

vcd:~ # rm -fR /opt/vmware/vcloud-director/data/transfer/*
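If you only want to purge stale leftovers instead of wiping everything, a find-based variant is gentler. This is a sketch: the 2-day age is an assumption, and no uploads or downloads should be in progress while you run it:

```shell
#!/bin/sh
# Delete transfer items that have not been touched for more than 2 days.
# The age threshold is an assumption -- make sure no transfer is running.
TRANSFER=/opt/vmware/vcloud-director/data/transfer
if [ -d "$TRANSFER" ]; then
    find "$TRANSFER" -mindepth 1 -mtime +2 -delete
fi
```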

Next, we add a second hard disk of an appropriate size. After attaching the disk through the vSphere (Web) Client, we need to tell Linux to rescan for new disks:

vcd:~ # fdisk -l

Disk /dev/sda: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders, total 62914560 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000d874

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048      272383      135168   83  Linux
/dev/sda2          272384     4483071     2105344   82  Linux swap / Solaris
/dev/sda3   *     4483072    62914559    29215744   83  Linux
vcd:~ #
vcd:~ # echo "- - -" > /sys/class/scsi_host/host0/scan
vcd:~ # echo "- - -" > /sys/class/scsi_host/host
host0/ host1/ host2/
vcd:~ # echo "- - -" > /sys/class/scsi_host/host1/scan
vcd:~ # echo "- - -" > /sys/class/scsi_host/host2/scan
vcd:~ #
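The three echo commands can be rolled into one loop over all SCSI hosts. A sketch; it needs root, and the writability check is my own guard so the loop skips cleanly when run without privileges:

```shell
#!/bin/sh
# Ask every SCSI host adapter to rescan its bus for new devices (run as root).
for scan in /sys/class/scsi_host/host*/scan; do
    if [ -w "$scan" ]; then
        echo "- - -" > "$scan"
    fi
done
```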
vcd:~ # fdisk -l

Disk /dev/sda: 32.2 GB, 32212254720 bytes
255 heads, 63 sectors/track, 3916 cylinders, total 62914560 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x0000d874

   Device Boot      Start         End      Blocks   Id  System
/dev/sda1            2048      272383      135168   83  Linux
/dev/sda2          272384     4483071     2105344   82  Linux swap / Solaris
/dev/sda3   *     4483072    62914559    29215744   83  Linux

Disk /dev/sdb: 214.7 GB, 214748364800 bytes
255 heads, 63 sectors/track, 26108 cylinders, total 419430400 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Disk /dev/sdb doesn't contain a valid partition table
vcd:~ #

Now, let’s put a file system on the disk. I created the file system directly on the disk without creating any partition. That works, and it means I won’t need to resize a partition should I ever decide to increase the disk’s capacity.

vcd:~ # mkfs.ext3 /dev/sdb

Finally, we need to mount the file system on the transfer folder. Open /etc/fstab with the vim text editor

vcd:~ # vim /etc/fstab

and add the following line:

/dev/sdb        /opt/vmware/vcloud-director/data/transfer/      ext3    defaults        0       0
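As a side note, device names like /dev/sdb can change if disks are later added or reordered. A more robust fstab entry references the filesystem UUID instead; blkid prints it. A sketch, with a placeholder UUID:

```shell
# Print the filesystem UUID of the new disk:
blkid /dev/sdb
# Then use it in /etc/fstab instead of the device name (placeholder UUID):
# UUID=<your-uuid>  /opt/vmware/vcloud-director/data/transfer  ext3  defaults  0  0
```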

Now mount the file system:

vcd:~ # mount -a
vcd:~ # mount
...
/dev/sdb on /opt/vmware/vcloud-director/data/transfer type ext3 (rw)
vcd:~ #

The last line shows that the file system was mounted successfully. Before we start vCloud Director, we have to make sure the user “vcloud” has access to the file system. Otherwise, we get

Error starting application (com.vmware.vcloud.vdc.impl.TransferServerSpoolAreaVerifier@4f2938f): Transfer spooling area is not writable: /opt/vmware/vcloud-director/data/transfer

in cell.log instead of

Successfully verified transfer spooling area: /opt/vmware/vcloud-director/data/transfer

So let’s fix the ownership:

vcd:~ # chown -R vcloud:vcloud /opt/vmware/vcloud-director/data/transfer
vcd:~ #
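A quick way to confirm the fix before starting the cell is to attempt a write as the vcloud user. A sketch; the /bin/sh shell passed to su is an assumption for this appliance:

```shell
#!/bin/sh
# Check that the vcloud user can write to the transfer area (run as root).
TRANSFER=/opt/vmware/vcloud-director/data/transfer
if su vcloud -s /bin/sh -c "touch $TRANSFER/.write-test && rm $TRANSFER/.write-test"; then
    echo "transfer area is writable by vcloud"
else
    echo "transfer area is NOT writable by vcloud -- check ownership and permissions"
fi
```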

So let’s try to start the vCloud Director service:

vcd:~ # /etc/init.d/vmware-vcd start
Starting vmware-vcd-watchdog:                                                                                                             done
Starting vmware-vcd-cell                                                                                                                  done
vcd:~ # /etc/init.d/vmware-vcd status
vmware-vcd-watchdog is running
vmware-vcd-cell is running
vcd:~ #

The last two lines in cell.log look a lot better now

Successfully bound network port: 80 on host address: 10.173.10.22
Successfully bound network port: 443 on host address: 10.173.10.22

and vCloud Director is back online!

Hope that helped!