linuxea: Logstash 6.3.2 with Redis + Filebeat example (part 3)

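An earlier post mentioned the idea of using Redis as a forwarding layer.

The previous two posts covered installing ELK. This one covers collecting and processing logs with Filebeat on 6.3.2, using nginx as the example. There may or may not be follow-up posts.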

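Filebeat installation and configuration

Filebeat sends the logs to Redis. A few configuration tips come up along the way, and they are explained next to the configuration file.

Download and install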
[root@linuxea-VM_Node-113 ~]# wget https://artifacts.elastic.co/downloads/beats/filebeat/filebeat-6.3.2-x86_64.rpm -O $PWD/filebeat-6.3.2-x86_64.rpm
[root@linuxea-VM_Node-113 ~]# yum localinstall $PWD/filebeat-6.3.2-x86_64.rpm -y

Start

[root@linuxea-VM_Node-113 /etc/filebeat/modules.d]# systemctl start filebeat.service 

View the log

[root@linuxea-VM_Node-113 /etc/filebeat/modules.d]# tail -f /var/log/filebeat/filebeat 
2018-08-03T03:13:32.716-0400    INFO    pipeline/module.go:81   Beat name: linuxea-VM-Node43_241_158_113.cluster.com
2018-08-03T03:13:32.717-0400    INFO    instance/beat.go:315    filebeat start running.
2018-08-03T03:13:32.717-0400    INFO    [monitoring]    log/log.go:97   Starting metrics logging every 30s
2018-08-03T03:13:32.717-0400    INFO    registrar/registrar.go:80   No registry file found under: /var/lib/filebeat/registry. Creating a new registry file.
2018-08-03T03:13:32.745-0400    INFO    registrar/registrar.go:117  Loading registrar data from /var/lib/filebeat/registry
2018-08-03T03:13:32.745-0400    INFO    registrar/registrar.go:124  States Loaded from registrar: 0
2018-08-03T03:13:32.745-0400    INFO    crawler/crawler.go:48   Loading Inputs: 1
2018-08-03T03:13:32.745-0400    INFO    crawler/crawler.go:82   Loading and starting Inputs completed. Enabled inputs: 0
2018-08-03T03:13:32.746-0400    INFO    cfgfile/reload.go:122   Config reloader started
2018-08-03T03:13:32.746-0400    INFO    cfgfile/reload.go:214   Loading of config files completed.
2018-08-03T03:14:02.719-0400    INFO    [monitoring]    log/log.go:124  Non-zero metrics in the last 30s

Configuration file

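In this configuration, the entries under paths are the log files to read. Wildcards are allowed, but with a wildcard every log in that directory is tagged with the same fields id. That id is passed to Redis, then to Logstash, and finally shows up in Kibana as a single id.
For testing, two separate inputs are used here, as follows: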

filebeat.prospectors:
- type: log
  enabled: true
  paths:
  - /data/wwwlogs/1015.log
  fields:
    list_id: 113_1015_nginx_access
- type: log
  enabled: true
  paths:
    - /data/wwwlogs/1023.log
  fields:
    list_id: 113_1023_nginx_access
filebeat.config.modules:
  path: ${path.config}/modules.d/*.yml
  reload.enabled: false
setup.template.settings:
  index.number_of_shards: 3
output.redis:
  hosts: ["IP:PORT"]
  password: "OTdmOWI4ZTM4NTY1M2M4OTZh"
  db: 2
  timeout: 5
  key: "%{[fields.list_id]:unknow}"

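In the output, key: "%{[fields.list_id]:unknow}" means: if [fields.list_id] has a value, that value is used as the key; otherwise the key falls back to unknow. The result is the name of the Redis list the event is pushed to.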
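
The fallback behaviour described above can be sketched in a few lines of Python. This is only an illustration of the rule, not Filebeat's implementation; resolve_key and lookup are hypothetical helper names, and the two events are made up.

```python
import re

# Matches one "%{[field.path]:default}" reference; the ":default" part is optional.
_REF = re.compile(r"%\{\[([^\]]+)\](?::([^}]*))?\}")

def lookup(event, dotted):
    """Walk a dotted path such as 'fields.list_id' through nested dicts."""
    node = event
    for part in dotted.split("."):
        if not isinstance(node, dict) or part not in node:
            return None
        node = node[part]
    return node

def resolve_key(template, event):
    """Hypothetical helper: substitute field references, using the default when a field is missing."""
    def substitute(match):
        value = lookup(event, match.group(1))
        if value is None:
            default = match.group(2)
            if default is None:
                raise KeyError(match.group(1))
            return default
        return str(value)
    return _REF.sub(substitute, template)

template = "%{[fields.list_id]:unknow}"
tagged = {"fields": {"list_id": "113_1015_nginx_access"}}
untagged = {"message": "an event without fields.list_id"}

print(resolve_key(template, tagged))    # 113_1015_nginx_access
print(resolve_key(template, untagged))  # unknow
```

Any input that lacks a list_id would therefore end up in one shared list named unknow, which is a quick way to spot inputs that were not tagged.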

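Redis installation

In the setup I have in mind, Redis is used to relay the data. It can be a cluster or a single node, depending on the data volume.
As usual for me, Redis of course runs in Docker. Run the following command to install it: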

curl -Lks4 https://raw.githubusercontent.com/LinuxEA-Mark/docker-alpine-Redis/master/Sentinel/install_redis.sh|bash

After installation there is a docker-compose.yaml file under /data/rds, as follows:

[root@iZ /data/rds]# cat docker-compose.yaml 
version: '2'
services:
  redis:
    build:
      context:  https://raw.githubusercontent.com/LinuxEA-Mark/docker-alpine-Redis/master/Sentinel/Dockerfile
    container_name: redis
    restart: always
    network_mode: "host"
    privileged: true
    environment:
    - REQUIREPASSWD=OTdmOWI4ZTM4NTY1M2M4OTZh
    - MASTERAUTHPAD=OTdmOWI4ZTM4NTY1M2M4OTZh
    volumes:
    - /etc/localtime:/etc/localtime:ro
    - /data/redis-data:/data/redis:Z
    - /data/logs:/data/logs

Check what has been written to Redis
[root@iZ /etc/logstash/conf.d]# redis-cli -h 127.0.0.1 -a OTdmOWI4ZTM4NTY1M2M4OTZh 
127.0.0.1:6379> select 2
OK
127.0.0.1:6379[2]> keys *
1) "113_1015_nginx_access"
2) "113_1023_nginx_access"
127.0.0.1:6379[2]> lrange 113_1023_nginx_access  0 -1
  1) "{\"@timestamp\":\"2018-08-04T04:36:26.075Z\",\"@metadata\":{\"beat\":\"\",\"type\":\"doc\",\"version\":\"6.3.2\"},\"beat\":{\"name\":\"linuxea-VM-Node43_13.cluster.com\",\"hostname\":\"linuxea-VM-Node43_23.cluster.com\",\"version\":\"6.3.2\"},\"host\":{\"name\":\"linuxea-VM-Node43_23.cluster.com\"},\"offset\":863464,\"message\":\"IP - [\xe\xe9\x9797\xb4:0.005 [200] [200] \xe5\x9b4:[0.005] \\\"IP:51023\\\"\",\"source\":\"/data/wwwlogs/1023.log\",\"fields\":{\"list_id\":\"113_1023_nginx_access\"}}"
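
Each list element is one JSON document, and the fields.list_id inside it is what the Logstash conditions test later on. A minimal sketch of reading one element, assuming a shortened, hypothetical event in place of the real output above (summarize is an illustrative helper name):

```python
import json

# Hypothetical, shortened stand-in for one element returned by LRANGE.
raw = (
    '{"@timestamp":"2018-08-04T04:36:26.075Z",'
    '"offset":863464,'
    '"message":"example access line",'
    '"source":"/data/wwwlogs/1023.log",'
    '"fields":{"list_id":"113_1023_nginx_access"}}'
)

def summarize(element):
    """Decode one list element and keep the parts that matter downstream."""
    event = json.loads(element)
    return {
        "list_id": event.get("fields", {}).get("list_id"),
        "source": event.get("source"),
        "message": event.get("message"),
    }

print(summarize(raw))
```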

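Logstash installation and configuration

Logstash is installed and configured on the internal network. It pulls the data from the public-facing Redis, sends it to ES locally, and from there it can be viewed in Kibana.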
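
The relay role of Redis here is essentially a queue per key. The sketch below models that with an in-memory stand-in, assuming Filebeat appends to the tail of the list (RPUSH) and Logstash takes events from the head; FakeRedis is a hypothetical class for illustration and does not talk to a real server.

```python
from collections import deque

class FakeRedis:
    """Hypothetical in-memory stand-in for one Redis database holding lists."""

    def __init__(self):
        self.lists = {}

    def rpush(self, key, value):
        # Producer side: append to the tail of the list.
        self.lists.setdefault(key, deque()).append(value)

    def lpop(self, key):
        # Consumer side: take from the head of the list.
        items = self.lists.get(key)
        if not items:
            return None
        value = items.popleft()
        if not items:
            del self.lists[key]  # an empty list no longer exists as a key
        return value

    def keys(self):
        return sorted(self.lists)

db = FakeRedis()
db.rpush("113_1023_nginx_access", "event-1")
db.rpush("113_1023_nginx_access", "event-2")
print(db.keys())                         # ['113_1023_nginx_access']
print(db.lpop("113_1023_nginx_access"))  # event-1
print(db.lpop("113_1023_nginx_access"))  # event-2
print(db.keys())                         # []
```

One practical consequence: once Logstash is consuming, keys * may show nothing even though Filebeat is shipping, because a drained list disappears.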

[root@linuxea-VM-Node117 ~]# curl -Lk https://artifacts.elastic.co/downloads/logstash/logstash-6.3.2.tar.gz|tar xz -C /usr/local && useradd elk && cd /usr/local/ && ln -s logstash-6.3.2 logstash && mkdir /data/logstash/{db,logs} -p && chown -R elk.elk /data/logstash/ /usr/local/logstash-6.3.2 && cd logstash/config/ && mv logstash.yml logstash.yml.bak 
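
Configuration file

Before writing this configuration file, download the IP database. It is used for the map visualization and is referenced in the configuration file shortly.

  • Preparation

Install GeoLite2-City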

[root@linuxea-VM-Node117 ~]# curl -Lk http://geolite.maxmind.com/download/geoip/database/GeoLite2-City.tar.gz|tar xz -C /usr/local/logstash-6.3.2/config/

I already did the nginx log formatting back on version 5.5, so that is reused directly here.

grok

Prepare the nginx log_format

log_format upstream2  '$proxy_add_x_forwarded_for $remote_user [$time_local] "$request" $http_host'
        '[$body_bytes_sent] $request_body "$http_referer" "$http_user_agent" [$ssl_protocol] [$ssl_cipher]'
        '[$request_time] [$status] [$upstream_status] [$upstream_response_time] [$upstream_addr]';

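Prepare the nginx patterns. The log line and the patterns can be checked in the Kibana grok debugger, or tried in grokdebug; note that with 6.3.2 the two do not give the same result.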

[root@linuxea-VM-Node117 /usr/local/logstash-6.3.2/config]# cat patterns.d/nginx 
NGUSERNAME [a-zA-Z\.\@\-\+_%]+
NGUSER %{NGUSERNAME}
NGINXACCESS %{IP:clent_ip} (?:-|%{USER:ident}) \[%{HTTPDATE:log_date}\] \"%{WORD:http_verb} (?:%{PATH:baseurl}\?%{NOTSPACE:params}(?: HTTP/%{NUMBER:http_version})?|%{DATA:raw_http_request})\" (%{IPORHOST:url_domain}|%{URIHOST:ur_domain}|-)\[(%{BASE16FLOAT:request_time}|-)\] %{NOTSPACE:request_body} %{QS:referrer_rul} %{GREEDYDATA:User_Agent} \[%{GREEDYDATA:ssl_protocol}\] \[(?:%{GREEDYDATA:ssl_cipher}|-)\]\[%{NUMBER:time_duration}\] \[%{NUMBER:http_status_code}\] \[(%{BASE10NUM:upstream_status}|-)\] \[(%{NUMBER:upstream_response_time}|-)\] \[(%{URIHOST:upstream_addr}|-)\]
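
To make the shape of the pattern easier to follow, here is a rough Python sketch that pulls the leading fields and the five trailing bracketed fields out of one line. It is a simplified stand-in written with plain regular expressions, not grok itself, and the sample line is hypothetical (built by hand from the upstream2 log_format above).

```python
import re

# Simplified stand-ins for the start and the end of NGINXACCESS (not grok itself).
HEAD = re.compile(
    r'(?P<clent_ip>\d{1,3}(?:\.\d{1,3}){3}) '
    r'(?P<ident>\S+) '
    r'\[(?P<log_date>[^\]]+)\] '
    r'"(?P<http_verb>[A-Z]+) (?P<request>[^"]*)" '
    # $http_host is followed directly by [$body_bytes_sent], with no space;
    # the grok pattern names this bracketed slot request_time.
    r'(?P<url_domain>[^\[\s]+)\[(?P<body_bytes_sent>[^\]]*)\]'
)
TAIL = re.compile(
    r'\[(?P<time_duration>[^\]]*)\] '
    r'\[(?P<http_status_code>[^\]]*)\] '
    r'\[(?P<upstream_status>[^\]]*)\] '
    r'\[(?P<upstream_response_time>[^\]]*)\] '
    r'\[(?P<upstream_addr>[^\]]*)\]$'
)

def parse_line(line):
    """Return the captured fields, or None when the line does not fit."""
    head = HEAD.match(line)
    tail = TAIL.search(line)
    if head is None or tail is None:
        return None
    fields = head.groupdict()
    fields.update(tail.groupdict())
    return fields

# Hypothetical access line shaped like the upstream2 log_format.
sample = (
    '203.0.113.7 - [04/Aug/2018:00:36:26 -0400] "GET /index.html?a=1 HTTP/1.1" '
    'example.com[612] - "-" "curl/7.61.0" [TLSv1.2] [ECDHE-RSA-AES128-GCM-SHA256]'
    '[0.005] [200] [200] [0.005] [10.0.0.1:8080]'
)

parsed = parse_line(sample)
print(parsed["clent_ip"], parsed["http_status_code"], parsed["upstream_addr"])
```

The field names follow the grok pattern where possible, so the same names can be looked up in Kibana afterwards.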

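The configuration file is as follows.
The key in input is the key (the list name) in Redis.
In Filebeat the key is "%{[fields.list_id]:unknow}", so the events read here carry [fields.list_id]. In the config this appears as if [fields][list_id] == "113_1023_nginx_access"; when the condition matches, the event is processed. The list read by input and the value tested in the condition therefore have to agree.
The grok section uses the nginx patterns.
geoip needs its database specified explicitly, with source set to clent_ip.
The useragent is processed as well.
In output, a user and password are required to connect to ES when X-Pack security is enabled. Authentication is a licensed X-Pack feature, so without a license that includes it, leave these two settings out.

input {
    redis {
        host => "47"
        port => "6379"
        key => "113_1023_nginx_access"
        data_type => "list"
        password => "I4ZTM4NTY1M2M4OTZh"
        threads => "5"
        db => "2"
    }
}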
filter {
    if [fields][list_id] == "113_1023_nginx_access" {
        grok {
            patterns_dir => [ "/usr/local/logstash-6.3.2/config/patterns.d/" ]
            match => { "message" => "%{NGINXACCESS}" }
            overwrite => [ "message" ]
        }
        geoip {
            source => "clent_ip"
            target => "geoip"
            database => "/usr/local/logstash-6.3.2/config/GeoLite2-City.mmdb"
        }
        useragent {
            source => "User_Agent"
            target => "userAgent"
        }
        urldecode {
            all_fields => true
        }
        mutate {
            # strip the double quotes from User_Agent
            gsub => [ "User_Agent", "[\"]", "" ]
            convert => {
                "response" => "integer"
                "body_bytes_sent" => "integer"
                "bytes_sent" => "integer"
                "upstream_response_time" => "float"
                "upstream_status" => "integer"
                "request_time" => "float"
                "port" => "integer"
            }
        }
        date {
            # log_date is the field NGINXACCESS captures from [$time_local]
            match => [ "log_date", "dd/MMM/yyyy:HH:mm:ss Z" ]
        }
    }
}
output {
    if [fields][list_id] == "113_1023_nginx_access" {
    elasticsearch {
        hosts => ["10.10.240.113:9200","10.10.240.114:9200"]
        index => "logstash-113_1023_nginx_access-%{+YYYY.MM.dd}"
        user => "elastic"
        password => "linuxea"
    }
    }
    stdout {codec => rubydebug} 
}
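
The date filter in the config above turns the nginx $time_local string into the event timestamp. As a rough equivalent outside Logstash, the same layout can be parsed in Python; parse_log_date is a hypothetical helper and the input value is made up.

```python
from datetime import datetime, timezone

def parse_log_date(value):
    """Parse an nginx $time_local value such as 04/Aug/2018:00:36:26 -0400."""
    return datetime.strptime(value, "%d/%b/%Y:%H:%M:%S %z")

local = parse_log_date("04/Aug/2018:00:36:26 -0400")
print(local.astimezone(timezone.utc).isoformat())  # 2018-08-04T04:36:26+00:00
```

The offset at the end of the string matters: dropping it from the pattern would shift every event by the server's UTC offset.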

json

That is still not very elegant, so this time a JSON log format is added as well, like this:

log_format json '{"@timestamp":"$time_iso8601",'
                    '"clent_ip":"$proxy_add_x_forwarded_for",'
                    '"user-agent":"$http_user_agent",'
                    '"host":"$server_name",'
                    '"status":"$status",'
                    '"method":"$request_method",'
                    '"domain":"$host",'
                    '"domain2":"$http_host",'
                    '"url":"$request_uri",'
                    '"url2":"$uri",'
                    '"args":"$args",'
                    '"referer":"$http_referer",'
                    '"ssl-type":"$ssl_protocol",'
                    '"ssl-key":"$ssl_cipher",'
                    '"body_bytes_sent":"$body_bytes_sent",'
                    '"request_length":"$request_length",'
                    '"request_body":"$request_body",'
                    '"responsetime":"$request_time",'
                    '"upstreamname":"$upstream_http_name",'
                    '"upstreamaddr":"$upstream_addr",'
                    '"upstreamresptime":"$upstream_response_time",'
                    '"upstreamstatus":"$upstream_status"}';

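After adding this to nginx.conf, update the server block to use it. The trade-off is that the logs become less readable for humans, but Logstash performance improves, because Logstash no longer runs grok and simply forwards the collected logs to ES.
Note that I did not actually use JSON, because I could not get the useragent processed properly with it. I have not found a workable way; if you know one, please tell me.
You can still work like this, though: for example, use *.log to ship everything into Redis and on to Kibana, then group the display in Kibana.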
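
A minimal sketch of what consuming such a JSON line looks like, and of one way it can go wrong. The two sample lines are hypothetical and parse_access_line is an illustrative helper; the second line assumes nginx's default escaping, which writes a double quote inside a variable as \x22, something a strict JSON parser rejects.

```python
import json

def parse_access_line(line):
    """Decode one JSON access-log line; return None when it is not valid JSON."""
    try:
        event = json.loads(line)
    except json.JSONDecodeError:
        return None
    # The log_format quotes every value, so numbers arrive as strings.
    for name in ("status", "body_bytes_sent", "request_length"):
        if str(event.get(name, "")).isdigit():
            event[name] = int(event[name])
    try:
        event["responsetime"] = float(event["responsetime"])
    except (KeyError, ValueError):
        pass
    return event

good = r'{"clent_ip":"203.0.113.7","user-agent":"curl/7.61.0","status":"200","body_bytes_sent":"612","responsetime":"0.005"}'
bad = r'{"user-agent":"agent \x22quoted\x22","status":"200"}'

print(parse_access_line(good)["status"])  # 200
print(parse_access_line(bad))             # None
```

If lines like the second one appear in practice, nginx 1.11.8 and later accept an escape=json parameter on log_format, which may be worth trying; whether it also resolves the useragent problem mentioned above is not something this sketch verifies.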

Start:

nohup  sudo -u elk /usr/local/logstash-6.3.2/bin/logstash -f ./conf.d/*.yml  >./nohup.out 2>&1 &

If all goes well, Kibana will show indices named after the pattern logstash-113_1023_nginx_access-%{+YYYY.MM.dd}.

Date: 2018-08-08 Category: ELK Stack

Tags: elk
