filter chaining question
Description
Activity

Zachary Buckholz July 8, 2014 at 4:45 PM
Thank you very much Philippe, your input has been very valuable.
I will document my experience and hopefully others will be able to learn from my confusion.
All is working now, I implemented your recommended changes.
It clicked in my head this morning that event["something"] is completely different than event[something] <--- without the surrounding double quotes.
I also got the params working for rest-client. I think my first failed attempt as shown at the beginning of this ticket was due to my lack of understanding how event[something] is different than event["something"].
Thanks again!
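For anyone landing here with the same confusion, the difference can be shown without logstash at all. The sketch below uses a plain Ruby Hash as a stand-in for the event (an assumption made for illustration; a real LogStash::Event is a different class, but field access with [] follows the same idea). The field names are the ones from the grok output in this ticket.

```ruby
# Plain Hash standing in for a logstash event (assumption for illustration).
event = { "user1" => "MICKEY", "user2" => "mickey" }

# A quoted key is a literal field name.
literal = event["user1"]

# An unquoted name is a Ruby variable (or method call); its *value* is used as the key.
field_name = "user2"
via_variable = event[field_name]

# An unquoted name that was never defined raises NameError, which matches the
# "undefined local variable or method" error reported earlier in this ticket.
begin
  event[userName]
  undefined_error = nil
rescue NameError => e
  undefined_error = e.class.name
end

puts literal, via_variable, undefined_error
```

With a config option such as username_field, the same rule applies: event[@username_field] looks up whichever field name the option holds, while event["username_field"] looks up a field literally called username_field.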

Philippe Weber July 8, 2014 at 4:48 AM
Yes, you're right about the resource creation, but you should be able to keep it in an instance variable and create it once in the register method.
Have a (maybe second?) look at the rest_client README: https://github.com/rest-client/rest-client
You should be able to pass your param only when calling the get method.
It should look more like this:
public
def register
require "json"
require "rest_client"
@resource = RestClient::Resource.new(@crowdURL,
:user => @crowdUsername,
:password => @crowdPassword,
:timeout => timeout)
end # def register
public
def filter(event)
return unless filter?(event)
if @username_field ## As you declared it with required => true, this check will never fail
event["username_field"] = @username_field ## Why do you store the filter configuration in each event?
username = event[@username_field] ## However, this can be nil if the event does not contain the field, so it should be tested
response = @resource.get(:accept => 'json', :params => {:username => username})
responseHash = JSON.parse(response)
email = responseHash["email"]
event['email'] = email
filter_matched(event) # This should be the last call of a successful execution of the filter
end
end
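The :params option means the query string does not have to be concatenated by hand. As a rough illustration of the kind of URL that results, the sketch below builds one with Ruby's standard library only. url_with_params is a hypothetical helper, not part of rest-client, and rest-client's own encoding may differ in details.

```ruby
require "uri"

# Hypothetical helper: append a params hash to a base URL as an encoded query string.
def url_with_params(base_url, params)
  "#{base_url}?#{URI.encode_www_form(params)}"
end

base = "https://crowd/rest/usermanagement/1/user"
puts url_with_params(base, :username => "MICKEY")
puts url_with_params(base, :username => "mickey mouse") # the space is encoded
```

Compared with crowdURL + "?username=" + username, building the query from a hash also takes care of characters that are not safe in a URL.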

Zachary Buckholz July 7, 2014 at 10:21 PM
I got it working through a lot of trial and error, but I still find it confusing how logstash handles an event as it passes through the filter chain.
The documentation is not very clear; it's very high-level. Maybe I was looking for more details than needed.
Here is what I ended up doing: I don't like the solution, and will have to refactor.
logstash.conf
if "stash-auth" in [type] {
grok {
patterns_dir => "patterns"
pattern => "%{STASH_CAPTCHA}"
add_tag => ["stash-captcha"]
}
if "stash-captcha" in [tags] {
crowd {
crowdURL => "https://crowd/rest/usermanagement/1/user"
crowdUsername => "username"
crowdPassword => "password"
timeout => 2
username_field => "user1"
}
}
}
crowd.rb
# Atlassian Crowd Filter
#
# This filter will lookup user email from Crowd REST API using username.
# Before using this you must create an application account in Crowd and
# allow the IP of your logstash indexing server access.
#
require "logstash/filters/base"
require "logstash/namespace"
# The Atlassian Crowd filter performs a lookup of a user email address
# if given a username.
#
# The config should look like this:
#
# filter {
# crowd {
# crowdURL => "https://crowd/rest/usermanagement/1/user"
# crowdUsername => "username"
# crowdPassword => "password"
# timeout => 2
# username_field => "user1"
# }
# }
#
class LogStash::Filters::Crowd < LogStash::Filters::Base
config_name "crowd"
milestone 1
# Username field that contains look up value, in my grok filter I parse the stash logs and assign the user having a captcha problem to user1
config :username_field, :validate => :string, :required => true
# Determine what action to do: append or replace the values in the field
# specified under "username"
config :action, :validate => [ "append", "replace" ], :default => "append"
# Atlassian Crowd REST API URL
config :crowdURL, :validate => :string, :required => true
# Atlassian Crowd REST API Username
config :crowdUsername, :validate => :string, :required => true
# Atlassian Crowd REST API Password
config :crowdPassword, :validate => :string, :required => true
# RestClient timeout
config :timeout, :validate => :number, :default => 2
public
def register
require "json"
require "rest_client"
end # def register
public
def filter(event)
return unless filter?(event)
if @username_field
event["username_field"] = @username_field
username = event[username_field]
completeURL = crowdURL + "?username" + "=" + username
resource = RestClient::Resource.new(completeURL,
:user => crowdUsername,
:password => crowdPassword,
:timeout => timeout)
response = resource.get(:accept => 'json')
responseHash = JSON.parse(response)
email = responseHash["email"]
filter_matched(event)
event['email'] = email
end
end
end # class LogStash::Filters::Crowd
I am not happy with this because I am under the impression that I am supposed to create the rest-client object in def register and then update the params in def filter as needed.
I wasn't able to get the rest-client params to work with GET; the Atlassian REST service uses ?name=value query parameters instead of a /user/username/value style path.
So when logstash first starts it loads the crowd filter into memory, but the filter creates a new rest-client object every time it is called. That is my impression... Is this what is happening?
Thanks for your help Philippe!
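The lifecycle in question can be mimicked in plain Ruby. This is a sketch under the assumption that logstash calls register once per filter instance and filter once per event; FakeClient, PerEventFilter and SharedClientFilter are hypothetical stand-ins, not logstash or rest-client classes.

```ruby
# Stand-in for an expensive client object; counts how many were created.
class FakeClient
  class << self
    attr_accessor :created
  end
  self.created = 0

  def initialize
    self.class.created += 1
  end
end

# Mirrors the code above: the client is built inside filter.
class PerEventFilter
  def register; end

  def filter(event)
    client = FakeClient.new
    event["client_id"] = client.object_id
  end
end

# Alternative: the client is built once in register and reused.
class SharedClientFilter
  def register
    @client = FakeClient.new
  end

  def filter(event)
    event["client_id"] = @client.object_id
  end
end

# Run register once, then filter once per event, and report clients created.
def count_clients(filter_class, events)
  FakeClient.created = 0
  filter = filter_class.new
  filter.register
  events.each { |event| filter.filter(event) }
  FakeClient.created
end

puts count_clients(PerEventFilter, [{}, {}, {}])
puts count_clients(SharedClientFilter, [{}, {}, {}])
```

Under that assumption, the first variant builds one client per event and the second builds a single client for the lifetime of the filter.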

Philippe Weber July 4, 2014 at 7:18 PM
As I said, just think of the event as an associative array.
Let's take two cases:
1. You decide that your filter will only lookup the field username, so you "hardcode"
username = event["username"]
response = @resource[username].get
responseHash = JSON.parse(response)
email = responseHash["email"]
2. You need the field name to be a parameter of the filter, so you do
config :username_field, :validate => :string
...
username = event[@username_field]
...
and you invoke two instances of your filter
filter {
crowd {
username_field => "user1"
... (rest of the config)
}
crowd {
username_field => "user2"
... (rest of the config)
}
}
3. Or, now that you understand the flow, you change the filter to accept an array of values to convert.
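A minimal sketch of the third option, in plain Ruby so it runs without logstash. The event is a plain Hash, DIRECTORY is a hypothetical lookup table standing in for the Crowd REST call, and the "<field>_email" target naming is an assumption made for this example.

```ruby
# Hypothetical lookup table standing in for the Crowd REST call.
DIRECTORY = { "MICKEY" => "mickey@example.com" }

# Look up an email for each configured field name and store it on the event.
def lookup_emails(event, username_fields, directory)
  username_fields.each do |field|
    username = event[field]
    next if username.nil?      # field missing from this event
    email = directory[username]
    next if email.nil?         # user unknown to the directory
    event["#{field}_email"] = email
  end
  event
end

event = { "user1" => "MICKEY", "user2" => "mickey" }
lookup_emails(event, ["user1", "user2", "user3"], DIRECTORY)
puts event.inspect
```

Fields that are absent from the event, or usernames the lookup does not know, are skipped rather than raising an error.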

Zachary Buckholz July 4, 2014 at 6:04 PM
Thanks Philippe, I spent an hour Friday after your comment, and again this morning.
I still seem to be misunderstanding the flow of data from logstash input -> filter (grok -> custom filter)
event before grok filter
filter received {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash"}>, :level=>:info, :file=>"(eval)", :line=>"18"}
event after grok filter
filters/LogStash::Filters::Grok: adding tag {:tag=>"stash-captcha", :level=>:debug, :file=>"/home/lsapp/logstash-1.3.2-flatjar.jar!/logstash/filters/base.rb", :line=>"146"}
Event now: {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>, :level=>:debug, :file=>"/home/lsapp/logstash/build/ruby/logstash/filters/grok.rb", :line=>"307"}
So at this point grok has successfully parsed the event
logstash conf
input {
stdin { type => "stash-auth" }
}
filter {
if "stash-auth" in [type] {
grok {
patterns_dir => "patterns"
pattern => "%{STASH_CAPTCHA}"
add_tag => ["stash-captcha"]
}
if "stash-captcha" in [tags] {
crowd {
crowdURL => "https://crowd/rest/usermanagement/1/user?username="
crowdUsername => "username"
crowdPassword => "password"
timeout => 2
}
}
}
}
output {
stdout { codec => rubydebug }
}
In my custom filter I have event.inspect and can see the data I want is in the event.
event.inspect output
#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.32.145.121,10.32.149.88 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>
Above, either the user1 or the user2 field would provide the info.
In my custom filter how do I reference these fields?
crowd.rb filter
def filter(event)
return unless filter?(event)
puts event.inspect # <--- output shown above
response = @resource[what would I use here??]
responseHash = JSON.parse(response)
email = responseHash["email"]
filter_matched(event)
end
I am attempting to filter through grok with named fields, then filter through a custom filter.
After grok when I call the custom filter the event seems to be the same as before grok.
I am confused on how to chain filters and maintain the changes as the event trickles through them.
input {
  stdin { type => "stash-auth" }
}
filter {
  if "stash-auth" in [type] {
    grok {
      patterns_dir => "patterns"
      pattern => "%{STASH_CAPTCHA}"
      add_tag => ["stash-captcha"]
    }
    #event being passed to next block is still same as before grok
    if "stash-captcha" in [tags] {
      mutate {
        add_field => { "userName" => "%{user1}" }
      }
      crowd {
        crowdURL => "https://crowd/rest/usermanagement/1/user?username="
        timeout => 2
      }
    }
  }
}
output {
  stdout { codec => rubydebug }
}
I took the basic DNS filter and started with that.
class LogStash::Filters::Crowd < LogStash::Filters::Base
  config_name "crowd"
  milestone 1
  # Lookup email address of username.
  #config :userName, :validate => :string
  # Determine what action to do: append or replace the values in the field
  # specified under "username"
  config :action, :validate => [ "append", "replace" ], :default => "append"
  # Atlassian Crowd REST API URL
  config :crowdURL, :validate => :string
  # RestClient timeout
  config :timeout, :validate => :number, :default => 2
  public
  def register
    require "json"
    require "rest_client"
    @resource = RestClient::Resource.new(@crowdURL,
      :user => "secret",
      :password => "secret",
      :timeout => @timeout,
      :accept => 'application/json')
  end # def register
  public
  def filter(event)
    puts event[userName] # <--- NOTHING prints here
    puts event.inspect # <--- The original message from before grok prints here
    return unless filter?(event)
    @response = @resource[event[userName]].get # <--- undefined local variable or method `userName'
    @responseHash = JSON.parse(@response)
    @email = @responseHash["email"]
    filter_matched(event)
  end
end # class LogStash::Filters::Crowd
The grok filter works, if I run it without the second filter for crowd I get the results I want in output.
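As a mental model only: the debug output in this ticket shows the same event object (same object id) before and after grok, which suggests that each filter receives the one event and modifies it in place, so fields added by grok are visible to whatever runs next. The sketch below imitates that in plain Ruby. grok_like, crowd_like and run_chain are hypothetical stand-ins, the event is a plain Hash, and the email address is made up.

```ruby
# Stand-in for grok: parses the message and adds a field and a tag to the event.
def grok_like(event)
  parts = event["message"].split(" | ")
  event["user1"] = parts[2]
  (event["tags"] ||= []) << "stash-captcha"
end

# Stand-in for the custom filter: reads what the previous filter added.
def crowd_like(event)
  return unless event["tags"].to_a.include?("stash-captcha")
  username = event["user1"]
  event["email"] = "#{username.downcase}@example.com" # made-up address
end

# Pass one event through each filter in order; every filter mutates it in place.
def run_chain(event, filters)
  filters.each { |name| send(name, event) }
  event
end

event = { "message" => "10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021" }
run_chain(event, [:grok_like, :crowd_like])
puts event.inspect
```

In this model the second filter only needs the right key, event["user1"], to read what the first filter wrote.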