

EFK: Deploying and Installing Fluentd

1. Environment

CentOS 6.5, 64-bit

2. Installation


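Fluentd is written in Ruby for flexibility and extensibility, with performance-critical parts written in C. Installing and operating a Ruby daemon can still be difficult for ordinary users.
To lower that barrier, Fluentd is also published as a stable distribution package called td-agent. The figure below compares td-agent with Fluentd. Newcomers are advised to use td-agent.

[Figure: differences between td-agent and Fluentd]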

Step 0: Pre-installation

I. Increase the file descriptor limit

Check the current limit by running ulimit -n:

# ulimit -n
>    1024

If the output is 1024, the limit is too low. Add the following to /etc/security/limits.conf:

root soft nofile 65536
root hard nofile 65536
* soft nofile 65536
* hard nofile 65536

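The limits.conf change applies to new login sessions. To check the limit again and raise it in the current shell as well:

# ulimit  -n
# ulimit -SHn 65535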
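
The same check can be scripted, for example as part of a provisioning step. This is a minimal Python sketch for a Unix-like host; the function name is illustrative and the 65535 threshold is simply taken from the ulimit command above.

```python
import resource

# Threshold taken from the "ulimit -SHn 65535" command above.
RECOMMENDED_NOFILE = 65535


def nofile_is_sufficient(soft_limit: int, recommended: int = RECOMMENDED_NOFILE) -> bool:
    """Return True when the soft open-file limit meets the recommended value."""
    if soft_limit == resource.RLIM_INFINITY:
        return True
    return soft_limit >= recommended


if __name__ == "__main__":
    # Read the limits of the current process (what "ulimit -n" reports).
    soft, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    print(f"soft={soft} hard={hard} sufficient={nofile_is_sufficient(soft)}")
```

Run it as the user that will own the td-agent process, since limits are applied per user session.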

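II. Tune network kernel parameters

Edit /etc/sysctl.conf and run sysctl -p to apply the changes. If your environment has no problems with connections in the TCP TIME_WAIT state, the following settings are not needed.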

net.ipv4.tcp_tw_recycle = 1
net.ipv4.tcp_tw_reuse = 1
net.ipv4.ip_local_port_range = 10240    65535

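Step 1: Install from the rpm repository

CentOS and RHEL 5, 6, and 7 are currently supported.
Download and run install-redhat-td-agent2.sh. The script does two things:

  • Registers the yum repository in /etc/yum.repos.d/td.repo

  • Installs the td-agent rpm package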

curl -L https://toolbelt.treasuredata.com/sh/install-redhat-td-agent2.sh | sh

Step 2: Start the td-agent service

The /etc/init.d/td-agent script supports start, stop, restart, and status.

# /etc/init.d/td-agent start
Starting td-agent:                                         [  OK  ]
# /etc/init.d/td-agent status
td-agent is running
# ps aux | grep td-agent
td-agent  3318  0.0  0.5 223132 20324 ?        Sl   11:39   0:00 /opt/td-agent/embedded/bin/ruby /usr/sbin/td-agent --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid
td-agent  3321  2.5  1.2 269684 47288 ?        Sl   11:39   0:00 /opt/td-agent/embedded/bin/ruby /usr/sbin/td-agent --log /var/log/td-agent/td-agent.log --use-v1-config --group td-agent --daemon /var/run/td-agent/td-agent.pid


Step 3: Verify the installation (over HTTP)

By default, /etc/td-agent/td-agent.conf is set up to accept logs over HTTP and write them to /var/log/td-agent/td-agent.log. You can run a quick test with the curl command.
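
The check can also be scripted. The Python sketch below builds the kind of POST request the HTTP input expects: the URL path is the tag and the form field json carries the record. Port 8888 and the tag debug.test are assumptions about the stock td-agent.conf, so check the source and match sections of your own file before relying on them; the function name is illustrative.

```python
import json
import urllib.parse
import urllib.request


def build_test_request(record: dict, tag: str = "debug.test",
                       host: str = "localhost", port: int = 8888) -> urllib.request.Request:
    """Build a POST request carrying one event; the URL path is the tag.

    The default port and tag are assumptions about the stock config.
    """
    body = urllib.parse.urlencode({"json": json.dumps(record)}).encode("utf-8")
    return urllib.request.Request(f"http://{host}:{port}/{tag}", data=body, method="POST")


if __name__ == "__main__":
    request = build_test_request({"json": "message"})
    # Requires a running td-agent with an HTTP source on the chosen port.
    with urllib.request.urlopen(request, timeout=5) as response:
        print(response.status)
```

If the request is accepted, the event should show up in /var/log/td-agent/td-agent.log, provided a match section routes that tag there.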

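3. Configuring td-agent

Once you are ready to collect logs, the sections below cover what you need to know.


If Fluentd was installed via td-agent, the configuration file is /etc/td-agent/td-agent.conf by default. Run the following command to apply configuration changes: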

/etc/init.d/td-agent reload


The configuration file consists of the following directives:

  1. source directives determine the input sources.

  2. match directives determine the output destinations.

  3. filter directives determine the event processing pipelines.

  4. system directives set system wide configuration.

  5. label directives group the output and filter for internal routing

  6. @include directives include other files.

(1) “source”: where all the data come from
Fluentd’s input sources are enabled by selecting and configuring the desired input plugins using source directives. Fluentd’s standard input plugins include http and forward. http turns fluentd into an HTTP endpoint to accept incoming HTTP messages whereas forward turns fluentd into a TCP endpoint to accept TCP packets. Of course, it can be both at the same time (You can add as many sources as you wish):

# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
 @type forward
 port 24224
</source>

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
 @type http
 port 9880
</source>

Each source directive must include a @type parameter. The @type parameter specifies which input plugin to use.

Interlude: Routing

The source submits events into Fluentd’s routing engine. An event consists of three entities: tag, time and record. The tag is a string separated by ‘.’s (e.g. myapp.access), and is used as the directions for Fluentd’s internal routing engine. The time field is specified by input plugins, and it must be in the Unix time format. The record is a JSON object.

Fluentd accepts all non-period characters as part of a tag. However, since the tag is sometimes used in a different context by output destinations (e.g., table name, database name, key name, etc.), it is strongly recommended that you stick to lowercase letters, digits and underscores, e.g., ^[a-z0-9_]+$.

In the example above, the HTTP input plugin submits the following event:

# generated by http://this.host:9880/myapp.access?json={"event":"data"}
tag: myapp.access
time: (current time)
record: {"event":"data"}
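
To make the three entities concrete, here is a small Python model of an event together with a check for the recommended tag format. It is only an illustration: the Event class and tag_is_recommended are hypothetical names, not part of Fluentd.

```python
import re
from dataclasses import dataclass, field
from time import time as unix_now

# Recommended character set for each '.'-separated tag part.
TAG_PART = re.compile(r"[a-z0-9_]+")


@dataclass
class Event:
    tag: str      # routing key, e.g. "myapp.access"
    record: dict  # the JSON object carried by the event
    time: int = field(default_factory=lambda: int(unix_now()))  # Unix time


def tag_is_recommended(tag: str) -> bool:
    """True when every '.'-separated part matches ^[a-z0-9_]+$."""
    return all(TAG_PART.fullmatch(part) for part in tag.split("."))


if __name__ == "__main__":
    event = Event(tag="myapp.access", record={"event": "data"})
    print(event, tag_is_recommended(event.tag))
```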

(2) “match”: Tell fluentd what to do!
The “match” directive looks for events with matching tags and processes them. The most common use of the match directive is to output events to other systems (for this reason, the plugins that correspond to the match directive are called “output plugins”). Fluentd’s standard output plugins include file and forward. Let’s add those to our configuration file.

# Receive events from 24224/tcp
# This is used by log forwarding and the fluent-cat command
<source>
 @type forward
 port 24224
</source>

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
 @type http
 port 9880
</source>

# Match events tagged with "myapp.access" and
# store them to /var/log/fluent/access.%Y-%m-%d
# Of course, you can control how you partition your data
# with the time_slice_format option.
<match myapp.access>
 @type file
 path /var/log/fluent/access
</match>

Each match directive must include a match pattern and a @type parameter. Only events with a tag matching the pattern will be sent to the output destination (in the above example, only the events with the tag “myapp.access” are matched). The @type parameter specifies the output plugin to use.

Just like input sources, you can add new output destinations by writing your own plugins. For further information regarding Fluentd’s output destinations, please refer to the Output Plugin Overview article.

Match Pattern: how you control the event flow inside fluentd

The following match patterns can be used for the <match> tag.

  • * matches a single tag part.

    • For example, the pattern a.* matches a.b, but does not match a or a.b.c

  • ** matches zero or more tag parts.

    • For example, the pattern a.** matches a, a.b and a.b.c

  • {X,Y,Z} matches X, Y, or Z, where X, Y, and Z are match patterns.

    • For example, the pattern {a,b} matches a and b, but does not match c

    • This can be used in combination with the * or ** patterns. Examples include a.{b,c}.* and a.{b,c.**}

  • When multiple patterns are listed inside one <match> tag (delimited by one or more whitespaces), it matches any of the listed patterns. For example:

    • The patterns <match a b> match a and b.

    • The patterns <match a.** b.*> match a, a.b, a.b.c (from the first pattern) and b.d (from the second pattern).
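
The rules above can be sketched by translating each pattern into a regular expression. The Python below is an illustration of the rules as listed, not Fluentd's own implementation, and all function names are hypothetical.

```python
import re


def _split_alternatives(body: str) -> list:
    """Split 'x,y,{p,q}' on top-level commas only."""
    parts, current, depth = [], [], 0
    for ch in body:
        if ch == "," and depth == 0:
            parts.append("".join(current))
            current = []
            continue
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        current.append(ch)
    parts.append("".join(current))
    return parts


def _closing_brace(pattern: str, start: int) -> int:
    """Return the index of the '}' that closes the '{' at position start."""
    depth = 0
    for j in range(start, len(pattern)):
        if pattern[j] == "{":
            depth += 1
        elif pattern[j] == "}":
            depth -= 1
            if depth == 0:
                return j
    raise ValueError(f"unbalanced braces in pattern: {pattern!r}")


def pattern_to_regex(pattern: str) -> str:
    """Translate one match pattern into a regular expression string."""
    out, i = [], 0
    while i < len(pattern):
        if pattern.startswith("**", i):
            if out and out[-1] == r"\.":
                out[-1] = r"(?:\..*)?"  # 'a.**' also matches plain 'a'
            else:
                out.append(r".*")
            i += 2
        elif pattern[i] == "*":
            out.append(r"[^.]+")  # exactly one tag part
            i += 1
        elif pattern[i] == "{":
            end = _closing_brace(pattern, i)
            alternatives = _split_alternatives(pattern[i + 1:end])
            out.append("(?:" + "|".join(pattern_to_regex(a) for a in alternatives) + ")")
            i = end + 1
        else:
            out.append(re.escape(pattern[i]))
            i += 1
    return "".join(out)


def tag_matches(patterns: str, tag: str) -> bool:
    """True when the tag matches any of the whitespace-separated patterns."""
    return any(re.fullmatch(pattern_to_regex(p), tag) for p in patterns.split())


if __name__ == "__main__":
    for patterns, tag in [("a.*", "a.b"), ("a.*", "a.b.c"), ("a.** b.*", "b.d")]:
        print(patterns, tag, tag_matches(patterns, tag))
```

Using fullmatch rather than search matters here: a pattern has to cover the whole tag, otherwise a.* would also accept a.b.c.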

Match Order

Fluentd tries to match tags in the order that they appear in the config file. So if you have the following configuration:

# ** matches all tags. Bad :(
<match **>
 @type blackhole_plugin
</match>

<match myapp.access>
 @type file
 path /var/log/fluent/access
</match>

then myapp.access is never matched. Wider match patterns should be defined after tight match patterns.

<match myapp.access>
 @type file
 path /var/log/fluent/access
</match>

# Capture all unmatched tags. Good :)
<match **>
 @type blackhole_plugin
</match>
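
The effect of the ordering can be seen in a toy router that, like the description above, walks the rules top to bottom and stops at the first hit. This is a simplified Python sketch: it only understands exact tags and the catch-all **, and the names are illustrative.

```python
def first_match(rules: list, tag: str):
    """Return the output of the first rule whose pattern matches, top to bottom."""
    for pattern, output in rules:
        # Simplification: only exact tags and the catch-all '**' are supported.
        if pattern == "**" or pattern == tag:
            return output
    return None


# Same two blocks as in the configs above, in both orders.
bad_order = [("**", "blackhole_plugin"), ("myapp.access", "file")]
good_order = [("myapp.access", "file"), ("**", "blackhole_plugin")]

if __name__ == "__main__":
    print(first_match(bad_order, "myapp.access"))
    print(first_match(good_order, "myapp.access"))
```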

If you want to send events to multiple outputs, consider the out_copy plugin.

<match myevent.file_and_mongo>
 @type copy
 <store>
   @type file
   path /var/log/fluent/myapp
   time_slice_format %Y%m%d
   time_slice_wait 10m
   time_format %Y%m%dT%H%M%S%z
   compress gzip
 </store>
 <store>
   @type mongo
   host fluentd
   port 27017
   database fluentd
   collection test
 </store>
</match>

(3) “filter”: Event processing pipeline
The “filter” directive has the same syntax as “match”, but filters can be chained into a processing pipeline. With filters, the event flow looks like this:

Input -> filter 1 -> ... -> filter N -> Output

Let’s add the standard record_transformer filter to the “match” example.

# http://this.host:9880/myapp.access?json={"event":"data"}
<source>
 @type http
 port 9880
</source>

<filter myapp.access>
 @type record_transformer
 <record>
   host_param "#{Socket.gethostname}"
 </record>
</filter>

<match myapp.access>
 @type file
 path /var/log/fluent/access
</match>

The received event, {"event":"data"}, goes to the record_transformer filter first. record_transformer adds a “host_param” field to the event, and the filtered event, {"event":"data","host_param":"webserver1"}, then goes to the file output.
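
The flow just described can be mimicked in a few lines of Python. This is a conceptual sketch only: record_transformer here is a plain function standing in for the plugin, and run_pipeline is a hypothetical helper.

```python
import socket


def record_transformer(record: dict) -> dict:
    """Stand-in for the filter above: add host_param with this machine's hostname."""
    return {**record, "host_param": socket.gethostname()}


def run_pipeline(record: dict, filters: list) -> dict:
    """Pass the record through each filter in order: Input -> filter 1 -> ... -> filter N."""
    for apply_filter in filters:
        record = apply_filter(record)
    return record


if __name__ == "__main__":
    # The result is what the output plugin would receive.
    print(run_pipeline({"event": "data"}, [record_transformer]))
```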

You can also add new filters by writing your own plugins. For further information regarding Fluentd’s filter plugins, please refer to the Filter Plugin Overview article.

Filter’s match order is the same as Output’s, and <filter> should be placed before <match>.

(4) Set system wide configuration: the “system” directive

The following settings are controlled by the system directive. The same settings can also be passed as fluentd command-line options:

  • log_level

  • suppress_repeated_stacktrace

  • emit_error_log_interval

  • suppress_config_dump

  • without_source

Here is an example:

<system>
 # equal to -qq option
 log_level error
 # equal to --without-source option
 without_source
 # ...
</system>

(5) Group filter and output: the “label” directive

The “label” directive groups filters and outputs for internal routing. “label” reduces the complexity of tag handling.

Here is a configuration example. “label” is a built-in plugin parameter, so the @ prefix is needed.

<source>
 @type forward
</source>

<source>
 @type tail
 @label @SYSTEM
</source>

<filter access.**>
 @type record_transformer
 <record>
   # ...
 </record>
</filter>
<match **>
 @type elasticsearch
 # ...
</match>

<label @SYSTEM>
 <filter var.log.middleware.**>
   @type grep
   # ...
 </filter>
 <match **>
   @type s3
   # ...
 </match>
</label>

In this configuration, forward events are routed to record_transformer filter / elasticsearch output and in_tail events are routed to grep filter / s3 output inside @SYSTEM label.

“label” is useful for event flow separation without tag prefix.

@ERROR label

The @ERROR label is a built-in label used for error records emitted by a plugin’s emit_error_event API.

If you set <label @ERROR> in the configuration, events are routed to this label when a related error is emitted, e.g. when the buffer is full or the record is invalid.

(6) Re-use your config: the “@include” directive
Directives in separate configuration files can be imported using the @include directive:

# Include config files in the ./config.d directory
@include config.d/*.conf

The @include directive supports regular file path, glob pattern, and http URL conventions:

# absolute path
@include /path/to/config.conf

# if using a relative path, the directive will use
# the dirname of this config file to expand the path
@include extra.conf

# glob match pattern
@include config.d/*.conf

# http
@include http://example.com/fluent.conf

Note that for glob patterns, files are expanded in alphabetical order. If you have a.conf and b.conf, fluentd parses a.conf first. However, you should not write configuration that depends on this order, because it is error-prone. Use separate @include directives for safety.

# If you have a.conf,b.conf,...,z.conf and a.conf / z.conf are important...

# This is bad
@include *.conf

# This is good
@include a.conf
@include config.d/*.conf
@include z.conf
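
To preview the order in which a glob would be expanded, the alphabetical rule can be imitated with a sorted glob. The Python sketch below illustrates that ordering only; it is not how Fluentd itself resolves includes, and expand_include is a hypothetical helper.

```python
import glob
import os
import tempfile


def expand_include(pattern: str, base_dir: str) -> list:
    """Resolve a relative pattern against base_dir and return matches sorted by name."""
    return sorted(glob.glob(os.path.join(base_dir, pattern)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as conf_dir:
        # Create the files in a deliberately shuffled order.
        for name in ("z.conf", "a.conf", "b.conf"):
            open(os.path.join(conf_dir, name), "w").close()
        print([os.path.basename(p) for p in expand_include("*.conf", conf_dir)])
```

Adding or renaming a file changes where it falls in this list, which is why the explicit @include lines above are the safer choice for files that must load first or last.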

Supported Data Types for Values

Each parameter’s type should be documented. If not, please let the plugin author know.

Common plugin parameters

  • @type: Specify plugin type

  • @id: Specify plugin id. in_monitor_agent uses this value for plugin_id field

  • @label: Specify label symbol. See label section

  • @log_level: Specify per plugin log level. See Per Plugin Log section

Format tips
This section describes useful features of the configuration format.

Multi line support for array and hash values

Array and hash values can be written across multiple lines.

array_param [
 "a", "b"
]
hash_param {
 "k":"v",
 "k1":10
}

Fluentd assumes that [ or { starts an array or hash. If you want to set a non-JSON value that begins with [ or {, wrap it in ' or ".
Example1: mail plugin:

<match **>
 @type mail
 subject "[CRITICAL] foo's alert system"
</match>

Example2: map plugin:

<match tag>
 @type map
 map '[["code." + tag, time, { "code" => record["code"].to_i}], ["time." + tag, time, { "time" => record["time"].to_i}]]'
 multi true
</match>

This restriction will be removed once the configuration parser is improved.

"foo" is interpreted as foo, not "foo"

" is the quote character for string values. This causes a difference in behaviour between v0.12 and the old format in v0.10.

str_param "foo"
  • In v0.12, str_param is foo

  • In v0.10 without --use-v1-config, str_param is "foo"

Embedded Ruby code

You can evaluate Ruby code with #{} inside a "-quoted string. This is useful for setting machine information such as the hostname.

host_param "#{Socket.gethostname}" # host_param is actual hostname like `webserver1`.

config-xxx mixins use “${}”, not “#{}”. These embedded configurations are two different things.

In double quoted string literal, \ is escape character

\ is interpreted as escape character. You need \ for setting ", \r, \n, \t, \ or several characters in double-quoted string literal.

str_param "foo\nbar" # \n is interpreted as actual LF character


