25 Minute ELK Stack With Docker - Part 4

I'm going to take a slightly different route with this article. Previously (Part 1 | Part 2 | Part 3), we set up an ELK stack almost anyone could use, providing they set up the right grok filters and figure out how to send data to it. This time I'm going to get into that "figure out how to send data to it" part, by connecting up an application I already have to my stack. In this case, it's a .NET application using Log4Net to output its log files. If you're using a different platform, the specifics here might not be so applicable, although the general ideas should apply.

In the beginning...

As my starting point, I've got a .NET application which does all its logging via Log4Net. The relevant section in the web.config is this:

  <log4net>
    <appender name="RollingFileAppender" type="log4net.Appender.RollingFileAppender">
      <lockingModel type="log4net.Appender.FileAppender+MinimalLock" />
      <file value="App_Data\logging" />
      <appendToFile value="true" />
      <staticLogFileName value="false" />
      <rollingStyle value="Composite" />
      <datePattern value="-yyyy-MM-dd.\tx\t" />
      <maxSizeRollBackups value="10" />
      <maximumFileSize value="100MB" />
      <dateTimeStrategy type="log4net.Appender.RollingFileAppender+UniversalDateTime" />
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%utcdate %-5level: [%logger] %message %exception %newline" />
      </layout>
    </appender>
    <root>
      <level value="ALL" />
      <appender-ref ref="RollingFileAppender" />
    </root>
  </log4net>

This just logs to a file - on this development/test config, straight into the App_Data directory. (Don't do this in production!)

We can add Logstash support by adding a new appender which connects to our stack by sending a UDP packet to it for each log event - the UdpAppender.

Adding the UdpAppender

I'm going to add the following appender to my Log4Net config:

    <appender name="UdpAppender" type="log4net.Appender.UdpAppender">
      <remoteAddress value="52.31.235.252" />
      <remotePort value="5000" />
      <layout type="log4net.Layout.PatternLayout">
        <conversionPattern value="%utcdate %-5level: [%logger] %message %exception %newline" />
      </layout>
    </appender>

52.31.235.252 is the IP of the AWS box I'm using to prototype my stack on, and 5000 is the port we already use to send data to Logstash. I've used a security group to lock down access to only my application server; you'd want to do the same with your hosting environment to avoid the world being able to send you log data. Also be careful if your route involves a hop across the Internet - this will send the data in plain text! You might want to look at other options or secure tunnelling if your services aren't all in the same data centre or VPC.

I keep the same pattern layout, as this is what matches my Grok filter, but if you wanted to send different data to your dashboard vs. your log files you could set that here. For example, you might want to only send summary information if you're building a wallboard, to save filling ElasticSearch with stack traces that never get inspected.
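To make the shape of that layout concrete, here is a rough Python sketch of what a matching filter has to pull out of each line. It is a stand-in for illustration, not the Grok filter from the earlier parts. It assumes %utcdate emits log4net's default ISO 8601 style timestamp (yyyy-MM-dd HH:mm:ss,fff), and the logger name and message in the sample are made up.

```python
import re

# Assumed shape of one line produced by the conversionPattern above:
#   "<utc timestamp> <level padded to 5 chars>: [<logger>] <message> <exception> <newline>"
# The timestamp format is an assumption (log4net's default ISO 8601 style).
LINE_RE = re.compile(
    r"^(?P<timestamp>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) "
    r"(?P<level>[A-Z]+) *: "          # %-5level pads short levels, e.g. "INFO :"
    r"\[(?P<logger>[^\]]+)\] "
    r"(?P<message>.*?)\s*$",          # message plus any exception text, trailing space trimmed
    re.DOTALL,
)


def parse_line(line):
    """Return the named fields as a dict, or None if the line does not match."""
    match = LINE_RE.match(line)
    return match.groupdict() if match else None


# Hypothetical sample line, with the trailing spaces and newline the pattern emits.
sample = "2015-11-20 10:15:30,123 INFO : [MyApp.Web.HomeController] Request handled  \r\n"
fields = parse_line(sample)
```

If you change the conversionPattern for the UDP appender, whatever parses the events on the Logstash side has to change with it, which is the main reason I keep the two layouts identical.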

Then to get Log4Net to actually use it, I add this to the list of appenders under my root:

    <root>
      <level value="ALL" />
      <appender-ref ref="RollingFileAppender" />
      <appender-ref ref="UdpAppender" />
    </root>

All I'm doing here is telling Log4Net to also use the appender with name "UdpAppender" whenever it outputs a log of any type. You can remove the file appender if you're willing to trust everything to Logstash and ElasticSearch - but it might be an idea to keep the log files around in case the stack falls over or you lose some events to UDP packets disappearing, especially if you're in a PCI DSS or similar environment with strict log retention requirements.

Because we're using a UDP appender, I need to tell Logstash to input from UDP, rather than the TCP configuration we've previously been using. I change the input part of logstash/logstash.conf to the following:

input {  
        udp {
                port => 5000
        }
}

Logstash is waiting for some UDP packets, but Docker (which handles the NAT from the host to the container) still thinks we're talking over TCP. So we also need to update the docker-compose.yml file and change the logstash container to reflect this:

logstash:  
  image: logstash:2.0.0-1
  command: logstash -f /etc/logstash/conf.d/logstash.conf
  volumes:
    - ./logstash:/etc/logstash/conf.d
  ports:
    - "5000:5000/udp"
  links:
    - elasticsearch

Now I start the cluster - docker-compose up - and let my .NET application generate some logs. Provided those UDP packets hit the box I'm running my stack on, Logstash will push the data into ElasticSearch and it'll be ready and waiting for analysis in Kibana.
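If you'd like to check the UDP input before wiring up a real application, a throwaway sender is enough. The following is a minimal Python sketch, not part of the stack itself: format_line and send_log_line are hypothetical helper names, and the line format simply mimics the conversionPattern above, assuming the default ISO 8601 style timestamp for %utcdate.

```python
import socket
from datetime import datetime, timezone


def format_line(level, logger, message, when=None):
    """Build a line shaped like the Log4Net pattern layout (assumed timestamp format)."""
    if when is None:
        when = datetime.now(timezone.utc)
    stamp = when.strftime("%Y-%m-%d %H:%M:%S,") + f"{when.microsecond // 1000:03d}"
    # Two spaces before the newline: one after %message, one after the empty %exception.
    return f"{stamp} {level:<5}: [{logger}] {message}  \r\n"


def send_log_line(host, port, line):
    """Send one log line as a single UDP datagram; returns the number of bytes sent."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(line.encode("utf-8"), (host, port))


# Example (not run here): replace the host with the address of your own stack.
# send_log_line("127.0.0.1", 5000, format_line("INFO", "SmokeTest", "hello from Python"))
```

A successful send only means the datagram left your machine. Whether it arrived is something you confirm by finding the event in Kibana.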

UDP vs TCP vs Other

Previously, we used a TCP input for Logstash. This is fine for just netcatting a few files to it, but in production we might want to think about the overhead of sending our logs like this. There's a significant overhead involved in negotiating and handshaking with TCP, and in ensuring packet order. If all you're creating is a firehose of log events you want to analyse en masse, and can tolerate the odd missed or duplicated event, do you really want your application server tied up negotiating packet delivery with your logging server?

(This is a genuine question, not a rhetorical one. There are situations, especially around compliance, where TCP's guarantees of reliable, in-order delivery will be more important to you than raw performance. If you're in a situation where you absolutely need guaranteed delivery, you may even end up pushing your logs to a message bus with strict delivery guarantees and having your ELK stack pull from that queue to minimise the chance of an event being lost in transit.)
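The trade-off is easy to see from the sender's side. This is a small Python sketch under the assumption that nothing is listening on the target port: the UDP send returns straight away as if all were well, while the TCP send has to complete a handshake first and fails loudly when nobody accepts the connection. The function names are made up for illustration.

```python
import socket


def udp_send(host, port, payload):
    """Fire-and-forget: hands the datagram to the OS whether or not anyone is listening."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))


def tcp_send(host, port, payload, timeout=2.0):
    """Connects (handshake) before sending; raises OSError if the connection fails."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(payload)
        return len(payload)
```

With UDP your application never waits on the logging server, but it also never finds out that an event went missing. With TCP it finds out, and then has to decide what to do about it.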

If all you're doing is creating a dashboard for your team area, or looking to analyse general trends, then sending your logs over UDP is probably going to be Good Enough™. This may sound cavalier, but I've seen more projects to improve logging fail through analysis paralysis (worrying about ultra-reliably getting every last event from every last system) than I've ever seen fail by starting out with something basic and improving as needed. If you're getting 4000 errors a day you need that on a wallboard, not stalled in planning because you're worried you might only see 3998 of them on a graph!

Where next?

We've got an ELK stack that we can repeatably set up anywhere that has Docker and Docker Compose installed, and we've seen that connecting applications which use Log4Net is trivial. A similar setup can be used for Log4J, and there are numerous options for Bunyan in the JavaScript world.

There are a few further things you may want to consider beyond this, though:

  • Configuring Logstash to accept logs from multiple applications/sources
  • Adding a container for scheduled ElasticSearch Curator tasks, e.g. cleaning up old indices
  • Setting up the nginx proxy to log its own events into the ELK stack
  • Adding in metrics from your build environment to the dashboard

Whatever you choose to do, enjoy - and if you found this series useful or did something interesting with the stack, get in touch and let me know about it!