Datadog Network Log Collection in Kubernetes

I recently needed to enable collecting logs over TCP/UDP using Datadog in a Kubernetes cluster. This is a bit different from the typical scenario since, out of the box, the Datadog agent does a great job of collecting anything sent to the stdout and stderr streams of each container running in the cluster. Nine times out of ten this is what you want anyway. It’s a very low barrier to entry for apps to get their logs streamed up to an aggregator, and it works in pretty much any environment. However, sometimes you need to alter your approach if you have existing software that uses something like Serilog to write messages to sinks other than the console.

I won’t go into laborious detail about the initial setup of Datadog in a Kubernetes cluster. They already have great blog entries and documentation for that.

I will say that this article assumes you are using the Datadog Helm Chart to run the agent on your cluster and that it’s already working in some fashion (e.g. any of your autodiscoverable integrations are showing data in Datadog).
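If you want a quick sanity check that the chart is installed and the agent pods are running before going any further, something like the following should do it (this assumes a Helm 3 release named datadog; adjust the release name and namespace to match your installation):

# assumes the Helm release is named "datadog"
helm status datadog

# the grep is just a convenience; the pod names depend on your release name
kubectl get pods | grep datadog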

Network Log Collection in General

You can read more about all of the possible ways in which the Datadog agent can collect logs in the official docs. We are specifically interested in having the agent listen on a TCP/UDP port and then push those logs up to Datadog for us.

As the documentation explains, to set this up in a standard environment (i.e. one in which the Datadog agent is running on a server that you have direct shell access to) you can simply create a file like:

<agent config directory>/<APP_NAME>.d/conf.yaml

On a Linux box, for an application named my-app, that would mean:

/etc/datadog-agent/conf.d/my-app.d/conf.yaml

In that file you’d have something like this:

logs:
  - type: tcp
    port: 10518
    service: "my-app"
    source: "my-app"

After getting this set up, you’d restart the Datadog agent and you should be golden. The agent will take any data sent to port 10518 and forward it up to Datadog.
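For example, on a systemd-based Linux host the restart plus a quick smoke test might look something like this (the netcat invocation is just an illustration; any TCP client will do):

# restart the agent so it picks up the new conf.yaml (systemd hosts)
sudo systemctl restart datadog-agent

# send a throwaway message to the TCP listener
echo "hello from my-app" | nc localhost 10518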

Network Log Collection in Kubernetes

The complication when working with Kubernetes is that you don’t typically have direct shell access to the containers, and even if you did, many of your changes would be ephemeral. If the container were recreated, they would simply disappear. Further, some things just don’t work the way you’d expect. For example, how would you restart the Datadog agent in the container?

Hint: you can’t, because the normal agent restart commands won’t actually kill the agent process running in the foreground. Even if they did, the container would stop, because the container’s lifecycle is bound to that foreground process.

We need to solve the following problem: How do we get the custom configuration file (i.e. my-app.d/conf.yaml) “into” the container before the agent process starts?

It isn’t very well documented, but if you are using the Datadog Helm Chart then you may have noticed the following values:

agents.volumes
agents.volumeMounts

You would need to read through the source code of the Chart’s templates to see it, but to save you the time: these values are simply interpreted as YAML and packed into the agent pod template (i.e. as its volumes and volumeMounts, respectively).
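Conceptually, the chart consumes these values with something like the sketch below. This is a simplified illustration of the mechanism, not the actual template source, but it shows why anything you put under those keys ends up verbatim in the agent pod spec:

# simplified sketch of the agent pod template, not the real chart source
volumeMounts:
{{ toYaml .Values.agents.volumeMounts | indent 2 }}
volumes:
{{ toYaml .Values.agents.volumes | indent 2 }}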

So, using what we know about Kubernetes generally, we can create a configmap with the correct configuration, and then feed that into the agents.volumes and agents.volumeMounts values.

Step-by-Step

First, create a file on your local system named conf.yaml with the following contents:

logs:
  - type: tcp
    port: 10518
    service: "my-app"
    source: "my-app"

Next, create a configmap in your cluster:

kubectl create configmap my-app-network-log-config --from-file conf.yaml
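
If you want to double-check what actually ended up in the configmap (and that it landed in the same namespace the Datadog agent runs in), a quick way is:

kubectl get configmap my-app-network-log-config -o yaml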

Next, make sure you have a values.yaml file that you can pass to your helm install command that looks similar to:

datadog:
  apiKey: ""
  targetSystem: linux

  ## @param logs - object - required
  ## Enable logs agent and provide custom configs
  #
  logs:
    ## @param enabled - boolean - optional - default: false
    ## Enable this to activate Datadog Agent log collection.
    #
    enabled: true

    ## @param containerCollectAll - boolean - optional - default: false
    ## Enable this to allow log collection for all containers.
    #
    containerCollectAll: true

agents:
  volumeMounts:
    - name: my-app-network-log-config-volume
      mountPath: /etc/datadog-agent/conf.d/my-app.d/conf.yaml
      subPath: conf.yaml
      readOnly: true
  volumes:
    - name: my-app-network-log-config-volume
      configMap:
        name: my-app-network-log-config

Make sure you pass this file when you run the helm install or helm upgrade command.
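
Assuming the agent was installed as a Helm release named datadog from the official datadog/datadog chart, that would look something like this (adjust the release name, namespace, and chart reference to match your setup):

helm upgrade --install datadog datadog/datadog -f values.yaml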

Note: The above values are for demo purposes. Chances are you will need to alter them to account for your namespace or other installation details. Also, if you are using something like Terraform, Harness.io, etc. to install the chart, then you’ll likely have a different method for creating the configmap and passing the values to the chart.

If you want to test this, you should be able to spin up a Pod on the cluster and use a simple tool like telnet or netcat to connect to port 10518 on the Datadog agent pod. Send it a simple message, and within a minute or two you should see it show up in the Datadog Logs UI.
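
One way to do that, sketched below, is to grab the agent pod’s IP and then use netcat from a throwaway busybox pod. The grep and the placeholder IP are just conveniences here; the agent pod labels and names vary by chart version and release name:

# find the agent pod and note its IP
kubectl get pods -o wide | grep datadog

# start a throwaway pod with a shell
kubectl run net-test --rm -it --image=busybox -- sh

# then, from inside the net-test pod, replace <agent pod IP> with the IP you noted
echo "hello from the cluster" | nc <agent pod IP> 10518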

Summary

Although in this post I focused on setting up Network Log Collection, the above technique can be used for any advanced configuration that you’d normally do by creating/editing files on the Datadog Agent server/host.

All you should need to do is create a configmap to contain the configuration, and then make sure it gets mounted into the agent container.