Logs are where you look for answers when something goes wrong. When you work at enterprise scale, you need a centralized logging mechanism. (You can't jump from one server to another and tail every stream.)
For central log management, you need something like Graylog, Logstash, ELK… the list goes on.
Setting up Graylog Server
Life was tough before Docker. Thankfully, we can set up a Graylog server through Docker; just follow the instructions on its Docker page.
I prefer to use docker-compose, and something like this should work:
# MongoDB: https://hub.docker.com/_/mongo/
# Elasticsearch: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/docker.html
# Disable X-Pack security: https://www.elastic.co/guide/en/elasticsearch/reference/5.6/security-settings.html#general-security-settings
- "ES_JAVA_OPTS=-Xms512m -Xmx512m"
# Graylog: https://hub.docker.com/r/graylog/graylog/
# CHANGE ME!
# Password: admin
# Graylog web interface and REST API
# Syslog TCP
# Syslog UDP
# GELF TCP
# GELF UDP
In most cases, it just works out of the box.
Protip: Graylog uses MongoDB for settings, configuration, etc., and it stores the log data in Elasticsearch. So be careful when setting the RAM usage for the JVM (ES_JAVA_OPTS in the compose file).
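To make the heap setting concrete, here is a minimal sketch that derives an ES_JAVA_OPTS value from the memory you give the Elasticsearch container. The 50% fraction and the roughly 31 GB ceiling follow Elasticsearch's general heap sizing guidance, not anything measured on this setup; the function name and parameters are hypothetical.

```python
def es_java_opts(container_ram_mb, heap_fraction=0.5, max_heap_mb=31 * 1024):
    """Return an ES_JAVA_OPTS string with equal minimum and maximum heap.

    Assumptions (rules of thumb, not values from this setup):
    - the heap gets about half of the container's RAM, leaving the rest
      for the operating system's file cache;
    - the heap stays below ~32 GB so the JVM can keep using compressed
      object pointers.
    """
    heap_mb = min(int(container_ram_mb * heap_fraction), max_heap_mb)
    # Xms and Xmx are set to the same value so the heap is never resized.
    return f"ES_JAVA_OPTS=-Xms{heap_mb}m -Xmx{heap_mb}m"
```

For a container with 1 GB of RAM this yields the same 512m/512m pair that appears in the compose file.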
Sending the logs
There are many ways to send logs. We use Monolog for sending custom logs and WordPress-level logs.
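Monolog is a PHP library, so the following is not our setup. It is a minimal Python sketch of what a GELF message looks like on the wire when sent to a GELF UDP input. The function names, the sample field names, the server address and the port 12201 (the conventional GELF default) are assumptions; adjust them to match your input. It sends a single uncompressed, unchunked datagram, so it only suits small messages.

```python
import json
import socket
import time


def build_gelf_message(short_message, host, level=6, **extra):
    """Build a GELF 1.1 payload as UTF-8 encoded JSON bytes."""
    payload = {
        "version": "1.1",
        "host": host,
        "short_message": short_message,
        "timestamp": time.time(),
        "level": level,  # syslog severity, 6 = informational
    }
    # GELF expects custom fields to be prefixed with an underscore.
    for key, value in extra.items():
        payload["_" + key] = value
    return json.dumps(payload).encode("utf-8")


def send_gelf_udp(message, server="127.0.0.1", port=12201):
    """Send one GELF datagram (no compression, no chunking)."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(message, (server, port))
```

Calling `send_gelf_udp(build_gelf_message("Order failed", host="web-01", level=3, site="shop"))` should make a message with a `site` field show up in the Graylog search, provided a GELF UDP input is running on that port.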
For syslog (rsyslog is probably pre-installed):
- Create a new file under the "/etc/rsyslog.d" directory, e.g. 90-graylog2.conf
- Add this line to it (a single @ forwards over UDP, @@ forwards over TCP): *.* @SERVER_IP_ADDRESS:PORT;RSYSLOG_SyslogProtocol23Format
- Restart the service: "service rsyslog restart"
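Before relying on rsyslog, it can help to check that the Graylog syslog input is reachable at all. The sketch below hand-builds one line in the RFC 5424 layout, which is the layout the RSYSLOG_SyslogProtocol23Format template produces, and sends it over UDP. The function names, hostname and app name are hypothetical, and the server and port are whatever you used for SERVER_IP_ADDRESS and PORT.

```python
import socket
from datetime import datetime, timezone


def format_rfc5424(message, hostname, app_name, facility=1, severity=6, when=None):
    """Format one syslog line in the RFC 5424 layout.

    `when` must be a timezone-aware datetime; it defaults to now in UTC.
    """
    when = when or datetime.now(timezone.utc)
    pri = facility * 8 + severity  # PRI encodes facility and severity
    timestamp = when.isoformat()
    # "-" marks the unused PROCID, MSGID and STRUCTURED-DATA fields.
    return f"<{pri}>1 {timestamp} {hostname} {app_name} - - - {message}"


def send_syslog_udp(line, server, port):
    """Send a single syslog line as one UDP datagram."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(line.encode("utf-8"), (server, port))
```

If a line sent this way appears in Graylog but rsyslog traffic does not, the problem is more likely in the rsyslog configuration than in the network path or the input.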
We are holding more than 1.5 billion log messages on a single machine, which is about 1.1 TB of data. The logs come from WordPress, custom logs, syslog, and HAProxy across ~20 servers.
Protip: If the logs are not that critical and you don't have a high I/O rate, you can avoid SSDs. That reduces the cost, and you can hold a lot of indices on the same machine.
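As a rough capacity-planning aid, those two figures give the average stored size per message. This is a back-of-the-envelope sketch that assumes decimal terabytes and treats the 1.5 billion figure as exact, so the result is an upper bound on the real average; the function name is hypothetical.

```python
def average_bytes_per_message(total_bytes, message_count):
    """Average on-disk size per stored message, index overhead included."""
    return total_bytes / message_count


# Figures from the text: about 1.1 TB (taken as 1.1e12 bytes) and 1.5 billion messages.
avg = average_bytes_per_message(1.1e12, 1.5e9)
print(f"about {avg:.0f} bytes per message")
```

That works out to roughly 730 bytes per message, which you can multiply by your own expected message volume to estimate disk needs.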