Elasticsearch at Mailgun

by Rackspace

Ralph Meijer (@ralphm)

Mailgun

  • API for email
  • Optimized Delivery
  • Receiving, Parsing & Storage
  • Tracking
  • Events

New customer logs

  • Full text search.
  • Filtering on selected properties (API).
  • Storing events for 30+ days, limited retention.
  • Extendable by adding nodes.
  • Resilient to failing nodes.

Index Design

  • One big index
  • Per day
  • Per customer
  • TTLs
  • Dropping indices
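
A minimal sketch of the per-day option, where retention is handled by dropping whole indices rather than expiring documents one by one. The `logs-` prefix, the date format and the helper names are assumptions for illustration, not the naming used at Mailgun.

```python
from datetime import date, timedelta

INDEX_PREFIX = "logs-"   # hypothetical index name prefix
RETENTION_DAYS = 30      # the 30-day retention from the requirements


def index_for(day):
    """Return the name of the daily index holding events for `day`."""
    return INDEX_PREFIX + day.strftime("%Y.%m.%d")


def expired_indices(existing, today, retention_days=RETENTION_DAYS):
    """Return daily indices that are not among the most recent
    `retention_days` days, counting `today` as the first day."""
    keep = set(index_for(today - timedelta(days=n))
               for n in range(retention_days))
    return sorted(name for name in existing
                  if name.startswith(INDEX_PREFIX) and name not in keep)


today = date(2014, 3, 1)
# Pretend the cluster currently holds 32 daily indices.
existing = [index_for(today - timedelta(days=n)) for n in range(32)]
print(expired_indices(existing, today))
```

Each name returned would then be deleted as a whole index, which avoids the per-document deletes that TTLs imply.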

Mappings

  • Field types
  • Templates
  • index.mapping.ignore_malformed
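
To make the three bullets concrete, here is a sketch of an index template body that sets field types and enables `index.mapping.ignore_malformed` for every new daily index. It assumes the legacy template format of the Elasticsearch 1.x era; the `logs-*` pattern, the `event` type and the field names are hypothetical.

```python
import json

# Hypothetical template, applied to each index whose name matches the pattern.
template = {
    "template": "logs-*",  # assumed index name pattern
    "settings": {
        # Skip values that do not fit the mapped field type instead of
        # rejecting the whole document.
        "index.mapping.ignore_malformed": True,
    },
    "mappings": {
        "event": {  # assumed document type
            "properties": {
                "timestamp": {"type": "date"},
                # Exact-match field for filtering, not analyzed.
                "recipient": {"type": "string", "index": "not_analyzed"},
                # Analyzed field for full text search.
                "message": {"type": "string"},
            }
        }
    },
}

body = json.dumps(template, indent=2, sort_keys=True)
print(body)
```

With a template in place, each day's index picks up the same mappings when it is created, so nothing has to be configured per index.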

Analysis

Shipping to Elasticsearch

Monitor all the things

Shipping to Elasticsearch

API authorization

Filtering proxy vulcan

Mailgun stats

  • 60-80 million events per day
  • 30 days log retention
  • 30 indices
  • 5 shards, 1 replica
  • 2x 90-110 GB per index
  • 6TB total
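
The figures above can be cross-checked with a little arithmetic. This sketch assumes that "2x 90-110 GB" means one primary copy plus one replica copy of each daily index, each 90-110 GB in size.

```python
SECONDS_PER_DAY = 24 * 60 * 60

indices = 30
shards_per_index = 5
copies = 2                       # assumed: 1 primary + 1 replica
index_gb_low, index_gb_high = 90, 110

# Total number of shards the cluster has to place.
total_shards = indices * shards_per_index * copies

# Total disk footprint across all copies, in TB (1 TB = 1000 GB).
total_tb_low = indices * copies * index_gb_low / 1000.0
total_tb_high = indices * copies * index_gb_high / 1000.0

# Size of a single shard, in GB.
shard_gb_low = index_gb_low / float(shards_per_index)
shard_gb_high = index_gb_high / float(shards_per_index)

# Average indexing rate implied by 60-80 million events per day.
rate_low = 60e6 / SECONDS_PER_DAY
rate_high = 80e6 / SECONDS_PER_DAY

print("shards: %d" % total_shards)
print("disk: %.1f-%.1f TB" % (total_tb_low, total_tb_high))
print("shard size: %.0f-%.0f GB" % (shard_gb_low, shard_gb_high))
print("events/s: %.0f-%.0f" % (rate_low, rate_high))
```

Under that reading the total comes to 5.4-6.6 TB, which matches the quoted 6TB at the midpoint, spread over 300 shards of roughly 18-22 GB each.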

Previous setup

  • Data: 8x 30GB RAM, 8vCPUs, 1TB disk
  • Logstash: 3x 8GB RAM, 4vCPUs
  • Vulcan/API: 2x 4GB RAM, 2vCPUs

Current setup

  • Data+Logstash: 4x 64GB RAM, 24 cores, 2TB disk RAID 0
  • Vulcan/API: 2x 4GB RAM, 2vCPUs

ES config

  • Cluster name
  • Discovery
  • Lock 32GB JVM heap
  • indices.fielddata.cache.size: 40%
  • gateway.expected_nodes
  • discovery.zen.minimum_master_nodes
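
Two of these values follow from simple rules of thumb. The sketch below assumes that all four nodes of the current setup are master-eligible, and uses the usual quorum rule (a strict majority) for `discovery.zen.minimum_master_nodes`.

```python
def minimum_master_nodes(master_eligible):
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible // 2 + 1


nodes = 4      # assumed: every Data+Logstash node is master-eligible
heap_gb = 32   # locked JVM heap per node

quorum = minimum_master_nodes(nodes)

# indices.fielddata.cache.size: 40% is relative to the JVM heap.
fielddata_cache_gb = heap_gb * 40 / 100.0

print("minimum_master_nodes: %d" % quorum)
print("fielddata cache: %.1f GB per node" % fielddata_cache_gb)
```

Requiring a majority means that, after a network partition, at most one side can elect a master, which guards against split brain.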

ES cluster settings

  • cluster.routing.allocation.node_concurrent_recoveries (2)
  • cluster.routing.allocation.cluster_concurrent_rebalance (2)
  • indices.recovery.max_bytes_per_sec (20M)
  • indices.recovery.concurrent_streams (3)
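
These are dynamic settings, so they can be changed at runtime through the cluster settings API. The sketch below only builds the request body, with the allocation settings spelled out under their full `cluster.routing.allocation.` prefix. Whether they were applied as `persistent` or `transient` is not stated on the slide, so `persistent` is an assumption, as is writing 20M as `20mb`.

```python
import json

# Body for a PUT to /_cluster/settings.
cluster_settings = {
    "persistent": {  # assumed; "transient" would not survive a full restart
        "cluster.routing.allocation.node_concurrent_recoveries": 2,
        "cluster.routing.allocation.cluster_concurrent_rebalance": 2,
        "indices.recovery.max_bytes_per_sec": "20mb",
        "indices.recovery.concurrent_streams": 3,
    }
}

payload = json.dumps(cluster_settings, indent=2, sort_keys=True)
print(payload)
```

All four values throttle how aggressively shards are moved and recovered, trading recovery speed for headroom to keep indexing and searching.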

Routing

Connectivity failure

Future

Application logging

  • ssh
  • Find the log file
  • tail -f
  • sudo tail -F

udplog

Python logging facility


import logging
import socket
import warnings
from udplog.udplog import UDPLogger, UDPLogHandler

logger = logging.getLogger(__name__)
logging.basicConfig(level=logging.DEBUG)
logging.captureWarnings(True)

udplogger = UDPLogger(defaultFields={
                          'appname': 'example',
                          'hostname': socket.gethostname(),
                          })
root = logging.getLogger()
root.setLevel(logging.DEBUG)
root.addHandler(UDPLogHandler(udplogger, category="python_logging"))
          

Python logging facility


logger.debug("Starting!")
logger.info("This is a simple message")
logger.info("This is a message with %(what)s", {'what': 'variables'})

extra_logger = logging.LoggerAdapter(logger, {'bonus': 'extra data'})
extra_logger.info("Bonus ahead!")

a = {}
try:
    print a['something']
except KeyError:
    logger.exception("Oops!")

warnings.warn("Don't do foo, do bar instead!", stacklevel=2)
          

Python logging facility


{
  "appname": "example",
  "category": "python_logging",
  "excText": "Traceback (most recent call last):\nFile \"doc/examples/python_logging.py\", line 39, in main\nprint a['something']\nKeyError: 'something'",
  "excType": "exceptions.KeyError",
  "excValue": "'something'",
  "filename": "doc/examples/python_logging.py",
  "funcName": "main",
  "hostname": "localhost",
  "lineno": 41,
  "logLevel": "ERROR",
  "logName": "__main__",
  "message": "Oops!",
  "timestamp": 1379508311.437895
}
          

Thoughts

  • Reputation scoring
  • Elasticsearch for time series
  • Aggregations

The End

Ralph Meijer

mailgun.com, ralphm.net, @ralphm