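Logstash _grokparsefailure when parsing Nginx logs

I am trying to parse nginx logs with Logstash. Everything works well, except that lines containing a real nginx $remote_user get the _grokparsefailure tag. When $remote_user is "-" (the default value when no user is supplied), Logstash parses the line correctly, but with a real $remote_user such as [email protected] it fails and adds the _grokparsefailure tag: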

127.0.0.1 - - [17/Feb/2017:23:14:08 +0100] "GET /favicon.ico HTTP/1.1" 302 169 "http://training-hub.tn/trainer/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"

=====> works fine

127.0.0.1 - [email protected] [17/Feb/2017:23:14:07 +0100] "GET /trainer/templates/home.tmpl.html HTTP/1.1" 304 0 "http://training-hub.tn/trainer/" "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/56.0.2924.87 Safari/537.36"

=====> gets the _grokparsefailure tag and the log line is not parsed

I am using this configuration file:

input {  
    file {  
     path => "/home/dev/node/training-hub/logs/access_log"  
     start_position => "beginning"  
     sincedb_path => "/dev/null" 
     ignore_older => 0 
     type => "logs" 
    } 
} 

filter {  
    if[type] == "logs" {   
     mutate {    
      gsub => ["message", "::ffff:", ""]   
     }  
     grok {   
      match=> [ 
       "message" , "%{COMBINEDAPACHELOG}+%{GREEDYDATA:extra_fields}", 
       "message" , "%{COMMONAPACHELOG}+%{GREEDYDATA:extra_fields}" 
      ] 
      overwrite=> [ "message" ] 
     } 

     mutate { 
      convert=> ["response", "integer"] 
      convert=> ["bytes", "integer"] 
      convert=> ["responsetime", "float"] 
     } 
     geoip { 
      source => "clientip" 
      target => "geoip" 
      database => "/etc/logstash/GeoLite2-City.mmdb" 
      add_field => [ "[geoip][coordinates]", "%{[geoip][longitude]}" ] 
      add_field => [ "[geoip][coordinates]", "%{[geoip][latitude]}" ] 
     } 
     mutate { 
      convert => [ "[geoip][coordinates]", "float"] 
     } 

     date { 
      match=> [ "timestamp", "dd/MMM/YYYY:HH:mm:ss Z" ] 
      remove_field=> [ "timestamp" ] 
     } 

     useragent { 
      source=> "agent" 
     } 
    } 
} 

output { elasticsearch {   hosts => "localhost:9200" } } 

Answer

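After testing the output with many different values, I realized that Logstash fails to parse log lines containing this kind of $remote_user because an email address is not a valid username as far as the grok pattern is concerned. So I added a mutate gsub filter that removes the @ and the rest of the email address, which leaves a valid $remote_user: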

gsub => ["message", "@(?:(?:[a-z0-9](?:[a-z0-9-]*[a-z0-9])?\.)+[a-z0-9](?:[a-z0-9-]*[a-z0-9])?|\[(?:(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(?:25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?|[a-z0-9-]*[a-z0-9]:(?:[\x01-\x08\x0b\x0c\x0e-\x1f\x21-\x5a\x53-\x7f]|\\[\x01-\x09\x0b\x0c\x0e-\x7f])+)\]) \[", " ["]

And now it works fine.
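
The idea behind the fix can be checked outside Logstash with a small Python sketch. It assumes that grok's USER pattern is equivalent to `[a-zA-Z0-9._-]+` (which would explain why an `@` breaks the match), and it uses a much simpler substitution than the gsub in this answer. The `PREFIX` regex, the `strip_email_domain` helper, and the `alice@example.com` address are all hypothetical and only stand in for the start of an access-log line.

```python
import re

# Assumption: grok's USER pattern behaves like this character class (no "@").
USER = r"[a-zA-Z0-9._-]+"

# Hypothetical, simplified prefix of an access-log line:
# client IP, ident, authenticated user, then the bracketed timestamp.
PREFIX = re.compile(r"^(\S+) (" + USER + r") (" + USER + r") \[([^\]]+)\]")


def strip_email_domain(line: str) -> str:
    """Drop '@domain' from the third space-separated field, if present."""
    return re.sub(r"^(\S+ \S+ [^@\s]+)@\S+", r"\1", line, count=1)


ok_line = '127.0.0.1 - - [17/Feb/2017:23:14:08 +0100] "GET /favicon.ico HTTP/1.1" 302 169'
bad_line = '127.0.0.1 - alice@example.com [17/Feb/2017:23:14:07 +0100] "GET /x HTTP/1.1" 304 0'

if __name__ == "__main__":
    for line in (ok_line, bad_line, strip_email_domain(bad_line)):
        match = PREFIX.match(line)
        print("parsed" if match else "no match", "|", line[:45])
```

Under these assumptions, the line with "-" as the user matches, the line with the email address does not, and the same line matches again once the domain part has been stripped. Note that this throws away the domain; if the full address is needed in the indexed event, a custom grok pattern for the user field would be the alternative to consider.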