filter chaining question

Description

I am attempting to filter through grok with named fields, then filter through a custom filter.

After grok runs and I call the custom filter, the event seems to be the same as it was before grok.

I am confused about how to chain filters and keep the changes as the event passes through them.

input {
  stdin {
    type => "stash-auth"
  }
}
filter {
  if "stash-auth" in [type] {
    grok {
      patterns_dir => "patterns"
      pattern => "%{STASH_CAPTCHA}"
      add_tag => ["stash-captcha"]
    }
    # event being passed to next block is still same as before grok
    if "stash-captcha" in [tags] {
      mutate {
        add_field => { "userName" => "%{user1}" }
      }
      crowd {
        crowdURL => "https://crowd/rest/usermanagement/1/user?username="
        timeout => 2
      }
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

I took the basic DNS filter and started with that.

class LogStash::Filters::Crowd < LogStash::Filters::Base
  config_name "crowd"
  milestone 1

  # Lookup email address of username.
  #config :userName, :validate => :string

  # Determine what action to do: append or replace the values in the field
  # specified under "username"
  config :action, :validate => [ "append", "replace" ], :default => "append"

  # Atlassian Crowd REST API URL
  config :crowdURL, :validate => :string

  # RestClient timeout
  config :timeout, :validate => :number, :default => 2

  public
  def register
    require "json"
    require "rest_client"
    @resource = RestClient::Resource.new(@crowdURL,
                                         :user => "secret",
                                         :password => "secret",
                                         :timeout => @timeout,
                                         :accept => 'application/json')
  end # def register

  public
  def filter(event)
    puts event[userName]  # <--- NOTHING prints here
    puts event.inspect    # <--- The original message from before grok prints here
    return unless filter?(event)
    @response = @resource[event[userName]].get  # <--- undefined local variable or method `userName'
    @responseHash = JSON.parse(@response)
    @email = @responseHash["email"]
    filter_matched(event)
  end
end # class LogStash::Filters::Crowd

The grok filter works: if I run it without the second filter (crowd), I get the results I want in the output.


Activity


Zachary Buckholz July 8, 2014 at 4:45 PM

Thank you very much Philippe, your input has been very valuable.

I will document my experience and hopefully others will be able to learn from my confusion.

All is working now, I implemented your recommended changes.

It clicked in my head this morning that event["something"] is completely different from event[something] (without the surrounding double quotes).

I also got the params working for rest-client. I think my first failed attempt, shown at the beginning of this ticket, was due to my not understanding how event[something] differs from event["something"].
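To make the distinction concrete, here is a minimal sketch in plain Ruby. It uses an ordinary Hash as a stand-in for the Logstash event, which is an assumption made only so the snippet runs outside Logstash; the field name and value come from the grok output quoted in this ticket.

```ruby
# A plain Hash stands in for the Logstash event (assumption for illustration).
event = { "user1" => "MICKEY" }

# A quoted key is a String literal, so the lookup finds the field.
quoted = event["user1"]

# A bare word is read as a local variable or method call. Nothing named
# user1 is defined here, so Ruby raises NameError before any lookup happens.
bare_error = begin
  event[user1]
  nil
rescue NameError => e
  e.class
end

# A bare word only works when it is a variable that holds the key.
field_name = "user1"
via_variable = event[field_name]
```

This is the same failure as the "undefined local variable or method `userName'" message in the original filter code.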

Thanks again!

Philippe Weber July 8, 2014 at 4:48 AM

Yes, you're right about the resource creation, but you should be able to keep the resource in an instance variable and create it once in the register method.
Have a (maybe second?) look at the rest_client README: https://github.com/rest-client/rest-client
You should be able to pass your parameter only when calling the get method.

It should look more like this:

public
def register
  require "json"
  require "rest_client"
  @resource = RestClient::Resource.new(@crowdURL,
                                       :user => @crowdUsername,
                                       :password => @crowdPassword,
                                       :timeout => timeout)
end # def register

public
def filter(event)
  return unless filter?(event)
  if @username_field  ## As you declare :required => true, this will never fail
    event["username_field"] = @username_field  ## why do you store the filter configuration in each event?
    username = event[@username_field]  ## However, this can be null if the event does not contain the field, so it should be tested
    response = @resource.get(:accept => 'json', :params => {:username => username})
    responseHash = JSON.parse(response)
    email = responseHash["email"]
    filter_matched(event)  # This should be the last call of a successful execution of the filter
    event['email'] = email
  end
end
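The comment above about the field possibly being missing could be handled with a small guard. This is only a sketch in plain Ruby: the helper name lookup_username is hypothetical, and a Hash stands in for the Logstash event so the snippet runs on its own.

```ruby
# Hypothetical helper (not part of Logstash): returns the username to look
# up, or nil when the event has no usable value in the configured field.
def lookup_username(event, username_field)
  username = event[username_field]
  return nil if username.nil? || username.to_s.strip.empty?
  username.to_s
end

# A plain Hash stands in for the event (assumption for illustration).
event = { "user1" => "MICKEY" }

present = lookup_username(event, "user1")  # field exists
missing = lookup_username(event, "user9")  # field absent, so nil
```

Inside the filter, a nil result would be the cue to return early instead of calling the REST service.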

Zachary Buckholz July 7, 2014 at 10:21 PM

I got it working through a lot of trial and error, but I am still confused about how Logstash handles an event as it passes through the filter chain.

The documentation is not very clear; it's very high-level. Maybe I was looking for more detail than I needed.

Here is what I ended up doing. I don't like the solution and will have to refactor it.

logstash.conf

if "stash-auth" in [type] {
  grok {
    patterns_dir => "patterns"
    pattern => "%{STASH_CAPTCHA}"
    add_tag => ["stash-captcha"]
  }
  if "stash-captcha" in [tags] {
    crowd {
      crowdURL => "https://crowd/rest/usermanagement/1/user"
      crowdUsername => "username"
      crowdPassword => "password"
      timeout => 2
      username_field => "user1"
    }
  }
}

crowd.rb

# Atlassian Crowd Filter
#
# This filter will lookup user email from Crowd REST API using username.
# Before using this you must create an application account in Crowd and
# allow the IP of your logstash indexing server access.
#
require "logstash/filters/base"
require "logstash/namespace"

# The Atlassian Crowd filter performs a lookup of a user email address
# if given a username.
#
# The config should look like this:
#
#     filter {
#       crowd {
#         crowdURL => "https://crowd/rest/usermanagement/1/user"
#         crowdUsername => "username"
#         crowdPassword => "password"
#         timeout => 2
#         username_field => "user1"
#       }
#     }
#
class LogStash::Filters::Crowd < LogStash::Filters::Base
  config_name "crowd"
  milestone 1

  # Username field that contains the lookup value. In my grok filter I parse
  # the stash logs and assign the user having a captcha problem to user1.
  config :username_field, :validate => :string, :required => true

  # Determine what action to do: append or replace the values in the field
  # specified under "username"
  config :action, :validate => [ "append", "replace" ], :default => "append"

  # Atlassian Crowd REST API URL
  config :crowdURL, :validate => :string, :required => true

  # Atlassian Crowd REST API Username
  config :crowdUsername, :validate => :string, :required => true

  # Atlassian Crowd REST API Password
  config :crowdPassword, :validate => :string, :required => true

  # RestClient timeout
  config :timeout, :validate => :number, :default => 2

  public
  def register
    require "json"
    require "rest_client"
  end # def register

  public
  def filter(event)
    return unless filter?(event)
    if @username_field
      event["username_field"] = @username_field
      username = event[username_field]
      completeURL = crowdURL + "?username" + "=" + username
      resource = RestClient::Resource.new(completeURL,
                                          :user => crowdUsername,
                                          :password => crowdPassword,
                                          :timeout => timeout)
      response = resource.get(:accept => 'json')
      responseHash = JSON.parse(response)
      email = responseHash["email"]
      filter_matched(event)
      event['email'] = email
    end
  end
end # class LogStash::Filters::Crowd

I am not happy with this because I am under the impression that I am supposed to create the rest-client object in register and then update the params in filter as needed.

I wasn't able to get the rest-client params to work with GET. The Atlassian REST service uses ? query parameters rather than a /user/username/value style path.
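Independent of rest-client, the query string can be built with Ruby's standard library instead of string concatenation, which also escapes special characters in the username. A minimal sketch follows; the helper name crowd_user_uri is hypothetical, and the base URL is the placeholder used in this ticket.

```ruby
require "uri"

# Hypothetical helper: appends ?username=<value> to the base URL,
# form-encoding the value so spaces and symbols are safe in a query string.
def crowd_user_uri(base_url, username)
  uri = URI.parse(base_url)
  uri.query = URI.encode_www_form("username" => username)
  uri
end

uri = crowd_user_uri("https://crowd/rest/usermanagement/1/user", "mickey mouse")
puts uri  # prints https://crowd/rest/usermanagement/1/user?username=mickey+mouse
```

The base URL stays fixed, so only the parameter changes from one event to the next.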

So my impression is that when Logstash first starts it loads the crowd filter into memory, but then creates a new rest-client object every time the filter is called. Is this what is happening?
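The difference between the two placements can be seen in a small simulation. This is a sketch in plain Ruby, not Logstash code: FakeClient is a hypothetical stand-in for the rest-client resource that counts how often it is constructed, and the two filter classes only mimic the register/filter split, assuming register runs once and filter runs once per event.

```ruby
# Hypothetical stand-in for an HTTP client; counts constructions.
class FakeClient
  @created = 0
  class << self
    attr_accessor :created
  end

  def initialize(url)
    @url = url
    self.class.created += 1
  end

  def get(params)
    "#{@url}?username=#{params[:username]}"
  end
end

URL = "https://crowd/rest/usermanagement/1/user"

# Builds a client inside filter, as in the crowd.rb above.
class PerEventFilter
  def register; end

  def filter(event)
    FakeClient.new(URL).get(:username => event["user1"])
  end
end

# Builds the client once in register and reuses it in filter.
class RegisterOnceFilter
  def register
    @client = FakeClient.new(URL)
  end

  def filter(event)
    @client.get(:username => event["user1"])
  end
end

events = [{ "user1" => "MICKEY" }, { "user1" => "MINNIE" }, { "user1" => "DONALD" }]

# Runs one filter over all events and reports how many clients were built.
def clients_built(filter, events)
  FakeClient.created = 0
  filter.register
  events.each { |e| filter.filter(e) }
  FakeClient.created
end

per_event_count = clients_built(PerEventFilter.new, events)
register_once_count = clients_built(RegisterOnceFilter.new, events)
puts "per event: #{per_event_count}, register once: #{register_once_count}"
```

With three events, the first variant builds three clients and the second builds one.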

Thanks for your help Philippe!

Philippe Weber July 4, 2014 at 7:18 PM

As mentioned, just think of the event as an associative array.
Let's take two cases:

1. You decide that your filter will only lookup the field username, so you "hardcode"

username = event["username"]
response = @resource[username].get
responseHash = JSON.parse(response)
email = responseHash["email"]

2. You need the field name to be a parameter of the filter, so you do

config :username_field, :validate => :string
...
username = event[@username_field]
...

and you invoke two instances of your filter

filter {
  crowd {
    username_field => "user1"
    ... (rest of the config)
  }
  crowd {
    username_field => "user2"
    ... (rest of the config)
  }
}

3. Or, now that you understand the flow, you change the filter to accept an array of values to convert.
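The third option could look roughly like the sketch below. It is plain Ruby rather than a real filter: add_emails, the "<field>_email" target naming, and fake_lookup (which replaces the Crowd REST call with made-up example.com addresses) are all hypothetical. In an actual filter the list would presumably come from a setting declared with :validate => :array.

```ruby
# Hypothetical helper: for each configured field present in the event, look
# up an email and store it under "<field>_email". lookup is any callable
# that maps a username to an email address (or nil).
def add_emails(event, username_fields, lookup)
  username_fields.each do |field|
    username = event[field]
    next if username.nil?  # skip fields this event does not carry
    email = lookup.call(username)
    event["#{field}_email"] = email unless email.nil?
  end
  event
end

# Fake lookup standing in for the Crowd REST call (assumption).
fake_lookup = lambda { |username| "#{username.downcase}@example.com" }

# A plain Hash stands in for the Logstash event.
event = { "user1" => "MICKEY", "user2" => "mickey" }
add_emails(event, ["user1", "user2", "user3"], fake_lookup)
```

Fields that are absent from the event, such as user3 here, are skipped rather than treated as errors.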

Zachary Buckholz July 4, 2014 at 6:04 PM

Thanks Philippe, I spent an hour Friday after your comment, and again this morning.

I still seem to be misunderstanding the flow of data from logstash input -> filter (grok -> custom filter)

event before grok filter

filter received {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash"}>, :level=>:info, :file=>"(eval)", :line=>"18"}

event after grok filter

filters/LogStash::Filters::Grok: adding tag {:tag=>"stash-captcha", :level=>:debug, :file=>"/home/lsapp/logstash-1.3.2-flatjar.jar!/logstash/filters/base.rb", :line=>"146"} Event now: {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>, :level=>:debug, :file=>"/home/lsapp/logstash/build/ruby/logstash/filters/grok.rb", :line=>"307"}

So at this point grok has successfully parsed the event.

logstash conf

input {
  stdin {
    type => "stash-auth"
  }
}
filter {
  if "stash-auth" in [type] {
    grok {
      patterns_dir => "patterns"
      pattern => "%{STASH_CAPTCHA}"
      add_tag => ["stash-captcha"]
    }
    if "stash-captcha" in [tags] {
      crowd {
        crowdURL => "https://crowd/rest/usermanagement/1/user?username="
        crowdUsername => "username"
        crowdPassword => "password"
        timeout => 2
      }
    }
  }
}
output {
  stdout {
    codec => rubydebug
  }
}

In my custom filter I call event.inspect and can see that the data I want is in the event.

event.inspect output

#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.32.145.121,10.32.149.88 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>

In the output above, either the user1 or the user2 field would provide the information I need.

In my custom filter how do I reference these fields?

crowd.rb filter

def filter(event)
  return unless filter?(event)
  puts event.inspect  # <--- output shown above
  response = @resource[what would I use here??]
  responseHash = JSON.parse(response)
  email = responseHash["email"]
  filter_matched(event)
end

Details

Created July 3, 2014 at 7:12 PM
Updated July 8, 2014 at 4:48 PM