Connecting to a Warden Container in Cloudfoundry

Johannes Hiemer
31 March 2015
Reading time: 6 min

There are often cases where you might want to connect into a Warden container – for debugging or log insights, for example. Since Warden containers run inside DEAs (Droplet Execution Agents) in Cloudfoundry, getting into the Warden container itself is a bit more involved. You need the following tools and steps:

  • cf cli (the Cloudfoundry command line client for pushing apps and managing your organisations/spaces/users)
  • bosh (the tool used to deploy and operate Cloudfoundry)
  • Some fundamental Unix skills (required anyway)

Let’s start by gathering the relevant information about the application you are tracing. The prerequisite is that you have set your CF (Cloudfoundry) API endpoint, logged in, chosen your organisation and the respective space, and that you are able to see a list of applications when doing a `cf apps`.
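If you have not done that yet, the sequence looks roughly like this (the endpoint, credentials, organisation and space names are placeholders, not values from this environment; the --skip-ssl-validation flag is only needed for self-signed certificates):

cf api https://api.your-cf-domain.example --skip-ssl-validation
cf login -u yourUser -p yourPassword
cf target -o yourOrg -s yourSpace
cf apps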

The first step is to get the UUID of your application. To retrieve it, run:

CF_TRACE=true cf app yourAppName

This will display a relatively large JSON output containing multiple REQUEST/RESPONSE sections. Try to locate the section containing something like:

REQUEST: [2015-03-31T08:04:18+02:00]
GET /v2/apps/50be0e27-79ae-439d-b0a9-16f173ce8907/summary HTTP/1.1

Both the request and the response contain the UUID we are looking for: `50be0e27-79ae-439d-b0a9-16f173ce8907`. Copy that UUID into a separate editor; we will use it again in the next step. What we have extracted here is the internal ID of the application in CF. While application names (NOT hosts, just the name) may occur multiple times, the corresponding ID is always unique.
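If you prefer not to dig through the trace output, the Cloud Controller can also be queried by application name directly (a sketch; yourAppName is a placeholder, and newer cf CLI versions additionally offer `cf app yourAppName --guid` as a shortcut):

cf curl "/v2/apps?q=name:yourAppName"
# the UUID is the "guid" field inside the "metadata" block of the matching resource;
# note that the same name may match apps in several spaces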
For the next step it is important to know that the `cf` CLI includes a curl-like command, `cf curl`, which already has some options predefined when a request is made (e.g. the header “Content-Type: application/json”).

cf curl /v2/apps/50be0e27-79ae-439d-b0a9-16f173ce8907/stats

What you will notice is that if you are running more than one instance of your application, all currently deployed instances are displayed. In this case we have four running instances:

{
   "2": {
      "state": "RUNNING",
      "stats": {
         "name": "web",
         "uris": [
            "web.192.168.204.16.xip.io"
         ],
         "host": "192.168.204.72",
         "port": 61004,
         "uptime": 399076,
         "mem_quota": 1073741824,
         "disk_quota": 1073741824,
         "fds_quota": 16384,
         "usage": {
            "time": "2015-03-31 06:14:31 +0000",
            "cpu": 0.0008331241332607463,
            "mem": 446562304,
            "disk": 184909824
         }
      }
   },
   "1": {
      "state": "RUNNING",
      "stats": {
         "name": "web",
         "uris": [
            "web.192.168.204.16.xip.io"
         ],
         "host": "192.168.204.73",
         "port": 61002,
         "uptime": 399076,
         "mem_quota": 1073741824,
         "disk_quota": 1073741824,
         "fds_quota": 16384,
         "usage": {
            "time": "2015-03-31 06:14:31 +0000",
            "cpu": 0.0009227747013077307,
            "mem": 556937216,
            "disk": 184905728
         }
      }
   },
   "0": {
      "state": "RUNNING",
      "stats": {
         "name": "web",
         "uris": [
            "web.192.168.204.16.xip.io"
         ],
         "host": "192.168.204.74",
         "port": 61002,
         "uptime": 399203,
         "mem_quota": 1073741824,
         "disk_quota": 1073741824,
         "fds_quota": 16384,
         "usage": {
            "time": "2015-03-31 06:14:31 +0000",
            "cpu": 0.0009330781593627726,
            "mem": 492703744,
            "disk": 184913920
         }
      }
   },
   "3": {
      "state": "RUNNING",
      "stats": {
         "name": "web",
         "uris": [
            "web.192.168.204.16.xip.io"
         ],
         "host": "192.168.204.74",
         "port": 61003,
         "uptime": 399076,
         "mem_quota": 1073741824,
         "disk_quota": 1073741824,
         "fds_quota": 16384,
         "usage": {
            "time": "2015-03-31 06:14:31 +0000",
            "cpu": 0.0009419625213388409,
            "mem": 429043712,
            "disk": 184913920
         }
      }
   }
}

The relevant part of this output, in the context of finding the Warden container, is the `host` property of each instance. Having multiple instances of an application running does not make debugging easier, so if you want to trace exceptions/logs without having to connect to each instance, reduce the number of instances with `cf scale appName -i 1` beforehand.
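If you only need the host and port of every instance, a plain grep over the stats output is a quick way to pull them out (a small sketch; a JSON tool such as jq would be more robust, if it is installed on your workstation):

cf curl /v2/apps/50be0e27-79ae-439d-b0a9-16f173ce8907/stats | grep -E '"(host|port)"'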

In the next step we need to connect to our environment via bosh. So ssh into your jumphost connected to the director (microbosh), and then run a:

bosh vms 
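Should this command complain about a missing target or login, point the bosh CLI at your director first and rerun `bosh vms` (the director address is a placeholder; 25555 is merely the default director port):

bosh target https://your-director-ip:25555
bosh login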

Normally, if your environment is running fine, you should see a list comparable to the one below.

bosh vms
Deployment `cf-ops'

Director task 80

Task 80 done

+------------------------------------+---------+---------------+-----------------+
| Job/index                          | State   | Resource Pool | IPs             |
+------------------------------------+---------+---------------+-----------------+
| api_worker_z1/0                    | running | small_z1      | 192.168.204.62  |
| api_z1/0                           | running | small_z1      | 192.168.204.52  |
| clock_global/0                     | running | small_z1      | 192.168.204.57  |
| etcd_z1/0                          | running | small_z1      | 192.168.204.27  |
| etcd_z1/1                          | running | small_z1      | 192.168.204.28  |
| ha_proxy_z1/0                      | running | small_z1      | 192.168.204.16  |
| hm9000_z1/0                        | running | small_z1      | 192.168.204.67  |
| loggregator_trafficcontroller_z1/0 | running | small_z1      | 192.168.204.107 |
| loggregator_z1/0                   | running | small_z1      | 192.168.204.93  |
| loggregator_z1/1                   | running | small_z1      | 192.168.204.94  |
| login_z1/0                         | running | small_z1      | 192.168.204.49  |
| mongodb_z1/0                       | running | small_svc_z1  | 192.168.204.124 |
| nats_z1/0                          | running | small_z1      | 192.168.204.21  |
| nfs_z1/0                           | running | small_z1      | 192.168.204.37  |
| postgres_z1/0                      | running | small_z1      | 192.168.204.42  |
| postgresql_z1/0                    | running | small_svc_z1  | 192.168.204.123 |
| rabbit_z1/0                        | running | small_svc_z1  | 192.168.204.121 |
| redis_z1/0                         | running | small_svc_z1  | 192.168.204.122 |
| router_z1/0                        | running | router_z1     | 192.168.204.112 |
| runner_z1/0                        | running | runner_z1     | 192.168.204.72  |
| runner_z1/1                        | running | runner_z1     | 192.168.204.73  |
| runner_z1/2                        | running | runner_z1     | 192.168.204.74  |
| stats_z1/0                         | running | small_z1      | 192.168.204.32  |
| uaa_z1/0                           | running | small_z1      | 192.168.204.45  |
+------------------------------------+---------+---------------+-----------------+
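Since only the DEAs (the runner_z1 jobs in this deployment) are of interest here, you can narrow the table down with a simple grep:

bosh vms | grep runner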

Now connect to the instance whose IP matches the `host` you identified previously via `cf curl` against /stats. In our case that was e.g. 192.168.204.74. Find that IP in the list, and then do a

bosh ssh runner_z1 2

based on the mapping between IP and Job/index. After you have successfully connected, it is important to do a `sudo su` to gain the rights necessary for opening the DEA's instance database.

vi /var/vcap/data/dea_next/db/instances.json

Search in vi via /50be0e27-79ae-439d-b0a9-16f173ce8907 for the UUID we identified in the first step. The output should look like this:

      "application_id": "50be0e27-79ae-439d-b0a9-16f173ce8907",
      "application_version": "cb6fb289-47b3-4ac7-9eee-98343392e9cf",
      "application_name": "web",
      "application_uris": [
        "web.192.168.204.16.xip.io"
      ],
      "droplet_sha1": "270084af0e5ada372804e8b240b69e0cc20276b4",
      "droplet_uri": null,
      "start_command": null,
      "state": "RUNNING",
      "warden_job_id": 20,
      "warden_container_path": "/var/vcap/data/warden/depot/18i6a19afbd",
      "warden_host_ip": "10.254.0.13",
      "warden_container_ip": "10.254.0.14",
      "instance_host_port": 61002,
      "instance_container_port": 61002,

Now you can see the first trace of Warden in the `warden_container_path`. Copy that to your editor as we will use it in the next step.
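If you prefer to grab the path non-interactively instead of searching in vi, a grep along these lines works as well (a sketch; the -A 15 context window is an assumption about the file layout and may need adjusting, and several entries will match if more than one instance of the app runs on this DEA):

grep -A 15 '"application_id": "50be0e27-79ae-439d-b0a9-16f173ce8907"' \
  /var/vcap/data/dea_next/db/instances.json | grep warden_container_path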

Now that we have identified the container, there is a handy command for connecting into a Warden container, using the information gained in the previous step:

/var/vcap/packages/warden/warden/src/wsh/wsh --socket /var/vcap/data/warden/depot/18i6a19afbd/run/wshd.sock --user vcap
vcap@18i6a19afbd:~$ 

And that’s it. Now you are in the Warden container.
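Once inside, the droplet is typically unpacked under /home/vcap (the exact layout depends on your stack and buildpack, so treat the paths below as an assumption rather than a guarantee), which makes it a good starting point for log insights:

cd /home/vcap
ls
tail -f logs/*.log   # application and staging logs, if present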