filter chaining question

Description

I am attempting to filter through grok with named fields, then filter through a custom filter.

After grok runs, the event that my custom filter receives appears to be the same as it was before grok.

I am confused about how to chain filters and keep the changes as the event passes through them.

input {
  stdin { type => "stash-auth" }
}
filter {

        if "stash-auth" in [type] {
                grok { 
                        patterns_dir => "patterns"
                        pattern => "%{STASH_CAPTCHA}"
                        add_tag => ["stash-captcha"]
                }
                #event being passed to next block is still same as before grok
                if "stash-captcha" in [tags] {
                        mutate {
                                add_field => { "userName" => "%{user1}" }
                        }
                        crowd {
                                crowdURL => "https://crowd/rest/usermanagement/1/user?username="
                                timeout => 2
                        }
                }
        }

}
output {
  stdout { codec => rubydebug }
}

I took the basic DNS filter and started with that.

class LogStash::Filters::Crowd < LogStash::Filters::Base

  config_name "crowd"
  milestone 1

  # Lookup email address of username.
  #config :userName, :validate => :string

  # Determine what action to do: append or replace the values in the field
  # specified under "username"
  config :action, :validate => [ "append", "replace" ], :default => "append"

  # Atlassian Crowd REST API URL
  config :crowdURL, :validate => :string

  # RestClient timeout
  config :timeout, :validate => :number, :default => 2

  public
  def register
    require "json"
    require "rest_client"
    @resource = RestClient::Resource.new(@crowdURL,
                        :user => "secret",
                        :password => "secret",
                        :timeout => @timeout,
                        :accept => 'application/json')
  end # def register

  public
  def filter(event)
        puts event[userName] # <--- NOTHING prints here
        puts event.inspect   # <--- The original message from before grok prints here
    return unless filter?(event)
    @response = @resource[event[userName]].get # <--- undefined local variable or method `userName'
    @responseHash = JSON.parse(@response)
    @email = @responseHash["email"]


    filter_matched(event)
  end


end # class LogStash::Filters::Crowd

The grok filter works: if I run it without the second (crowd) filter, I get the results I want in the output.

Activity

Zachary Buckholz July 8, 2014 at 4:45 PM

Thank you very much Philippe, your input has been very valuable.

I will document my experience and hopefully others will be able to learn from my confusion.

All is working now, I implemented your recommended changes.

It clicked in my head this morning that event["something"] is completely different than event[something] <--- without the surrounding double quotes.

I also got the params working for rest-client. I think my first failed attempt as shown at the beginning of this ticket was due to my lack of understanding how event[something] is different than event["something"].
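
The difference between the two forms can be reproduced in plain Ruby. This is a minimal sketch that uses an ordinary Hash as a stand-in for the event (an assumption for illustration; a real LogStash::Event is not a Hash, although the filters in this ticket index it with string field names in the same way). The method name lookup_demo is hypothetical.

```ruby
# A plain Hash stands in for the Logstash event (illustration only).
def lookup_demo(event)
  # Quoted: "user1" is a String literal, so this looks up the field named user1.
  quoted = event["user1"]

  # Unquoted: user1 is parsed as a local variable or method call. Nothing by
  # that name exists here, so Ruby raises NameError before any lookup happens.
  unquoted_error =
    begin
      event[user1]
    rescue NameError => e
      e.class
    end

  # A variable works when it holds the field name as a String.
  field_name = "user2"
  via_variable = event[field_name]

  [quoted, unquoted_error, via_variable]
end

p lookup_demo({ "user1" => "MICKEY", "user2" => "mickey" })
```

The NameError in the middle case is the same kind of failure as the "undefined local variable or method `userName'" message in the original description.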

Thanks again!

Philippe Weber July 8, 2014 at 4:48 AM

Yes, you're right about the resource creation, but you should be able to store it in an instance variable and create it once in the register method.
Have a (maybe second?) look at the rest_client README, https://github.com/rest-client/rest-client
You should be able to pass your parameters only when calling the get method.

It should look more like this:

public
  def register
    require "json"
    require "rest_client"
    @resource = RestClient::Resource.new(@crowdURL,
                        :user => @crowdUsername,
                        :password => @crowdPassword,
                        :timeout => timeout)
  end # def register

  public
  def filter(event)
    return unless filter?(event)
        if @username_field ## As you declared :required => true, this will never fail
                event["username_field"] = @username_field ## why do you store the filter configuration in each event?
                username = event[@username_field] ## However, this can be nil if the event does not contain the field, so it should be tested
                response = @resource.get(:accept => 'json', :params => {:username => username})
                responseHash = JSON.parse(response)
                email = responseHash["email"]
                event['email'] = email
                filter_matched(event)  # This should be the last call of a successful execution of the filter
        end
  end
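
The comment in the code above notes that the looked-up field can be missing from the event and should be tested. One possible guard is sketched below with a plain Hash standing in for the event. The helper name username_for_lookup is hypothetical and not part of Logstash or rest-client.

```ruby
# Hypothetical helper: returns the username to look up, or nil when the
# configured field is absent or blank, so the caller can skip the HTTP request.
def username_for_lookup(event, username_field)
  value = event[username_field]
  return nil if value.nil?

  value = value.to_s.strip
  value.empty? ? nil : value
end

event_with_user    = { "user1" => "MICKEY" }
event_without_user = { "message" => "no user parsed" }

p username_for_lookup(event_with_user, "user1")
p username_for_lookup(event_without_user, "user1")
```

Inside filter, the result could be checked right after the filter?(event) test, returning early when it is nil so that no request is sent for events grok did not fully parse.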

Zachary Buckholz July 7, 2014 at 10:21 PM

I got it working through a lot of trial and error, but it's still very confusing how Logstash handles an event as it passes through the filter chain.

The documentation is not very clear; it's very high-level. Maybe I was looking for more details than needed.
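
One way to picture the chain is sketched below in plain Ruby. This is a toy model and not Logstash's implementation: the assumption is that every filter receives the same event object and changes it in place, which matches the debug output elsewhere in this ticket, where the event has the same object id (LogStash::Event:0x298b2a7) before and after grok. The method names grok_like, crowd_like, and run_chain and the example.com address are hypothetical.

```ruby
# Toy "grok": parses the message and adds a field and a tag to the same event.
def grok_like(event)
  parts = event["message"].split(" | ")
  event["user1"] = parts[2]
  (event["tags"] ||= []) << "stash-captcha"
end

# Toy "crowd": only acts on tagged events, and reads the field grok added.
def crowd_like(event)
  return unless event.fetch("tags", []).include?("stash-captcha")

  event["email"] = "#{event["user1"].downcase}@example.com"
end

# The chain passes one object through each filter in config order.
def run_chain(event)
  grok_like(event)
  crowd_like(event)
  event
end

event = { "message" => "10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021" }
p run_chain(event)
```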

Here is what I ended up doing: I don't like the solution, and will have to refactor.

logstash.conf

if "stash-auth" in [type] {
        grok {
                patterns_dir => "patterns"
                pattern => "%{STASH_CAPTCHA}"
                add_tag => ["stash-captcha"]
        }
        if "stash-captcha" in [tags] {
                crowd { 
                        crowdURL => "https://crowd/rest/usermanagement/1/user"
                        crowdUsername => "username"
                        crowdPassword => "password"
                        timeout => 2
                        username_field => "user1" 
                }       
        }       
}

crowd.rb

# Atlassian Crowd Filter
#
# This filter will lookup user email from Crowd REST API using username.
# Before using this you must create an application account in Crowd and
# allow the IP of your logstash indexing server access.
#

require "logstash/filters/base"
require "logstash/namespace"

# The Atlassian Crowd filter performs a lookup of a user email address
# if given a username.
#
# The config should look like this:
#
#     filter {
#       crowd {
#               crowdURL => "https://crowd/rest/usermanagement/1/user"
#               crowdUsername => "username"
#               crowdPassword => "password"
#               timeout => 2
#               username_field => "user1"
#       }
#     }
#
class LogStash::Filters::Crowd < LogStash::Filters::Base

  config_name "crowd"
  milestone 1

  # Username field that contains look up value, in my grok filter I parse the stash logs and assign the user having a captcha problem to user1
  config :username_field, :validate => :string, :required => true

  # Determine what action to do: append or replace the values in the field
  # specified under "username"
  config :action, :validate => [ "append", "replace" ], :default => "append"

  # Atlassian Crowd REST API URL
  config :crowdURL, :validate => :string, :required => true

  # Atlassian Crowd REST API Username
  config :crowdUsername, :validate => :string, :required => true

  # Atlassian Crowd REST API Password
  config :crowdPassword, :validate => :string, :required => true

  # RestClient timeout
  config :timeout, :validate => :number, :default => 2


  public
  def register
    require "json"
    require "rest_client"
  end # def register

  public
  def filter(event)
    return unless filter?(event)
        if @username_field
                event["username_field"] = @username_field
                username = event[username_field]
                completeURL = crowdURL + "?username" + "=" + username
                resource = RestClient::Resource.new(completeURL,
                        :user => crowdUsername,
                        :password => crowdPassword,
                        :timeout => timeout)
                response = resource.get(:accept => 'json')
                responseHash = JSON.parse(response)
                email = responseHash["email"]
                filter_matched(event)
                event['email'] = email
        end
  end


end # class LogStash::Filters::Crowd

I am not happy with this because I am under the impression that I am supposed to create the rest-client object in register, then update the params in filter as needed.

I wasn't able to get the rest-client params to work with GET; the Atlassian REST service takes ?name=value query parameters instead of a /user/username/value style path.
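
For reference, the query string that the concatenation in crowd.rb builds by hand can also be produced with Ruby's standard library, which escapes the value as well. This is a sketch only: the helper name crowd_user_url is hypothetical, and it does not use rest-client at all.

```ruby
require "uri"

# Hypothetical helper: append a ?username=... query string to the base URL.
# URI.encode_www_form escapes characters that would otherwise break the query.
def crowd_user_url(base_url, username)
  "#{base_url}?#{URI.encode_www_form("username" => username)}"
end

puts crowd_user_url("https://crowd/rest/usermanagement/1/user", "MICKEY")
puts crowd_user_url("https://crowd/rest/usermanagement/1/user", "mickey mouse&x=1")
```

Escaping matters when a username parsed from a log line contains spaces, "&", or "=", because plain string concatenation would send those characters through unchanged.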

So when logstash is first started it loads the crowd filter into memory, but creates a new rest-client object every time it's called. This is my impression.... Is this what is happening?
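
In the crowd.rb above, RestClient::Resource.new is called inside filter, so a new resource is indeed built for every event that reaches the filter. The sketch below shows the alternative shape with stand-in classes: FakeClient and SketchFilter are hypothetical, no HTTP request is made, and the only point is that an object created in register is reused by every later filter call.

```ruby
# Stand-in for an expensive client such as a rest-client resource (hypothetical).
class FakeClient
  @created = 0
  class << self
    attr_accessor :created
  end

  def initialize
    self.class.created += 1 # count how many clients get built
  end

  def lookup(username)
    "#{username.downcase}@example.com"
  end
end

# Hypothetical filter shape: build the client once, reuse it per event.
class SketchFilter
  def register
    @client = FakeClient.new
  end

  def filter(event)
    event["email"] = @client.lookup(event["user1"])
  end
end

sketch = SketchFilter.new
sketch.register
[{ "user1" => "MICKEY" }, { "user1" => "MINNIE" }].each { |e| sketch.filter(e) }
puts FakeClient.created
```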

Thanks for your help Philippe!

Philippe Weber July 4, 2014 at 7:18 PM

As I said, just think of the event as an associative array.
Let's take two cases:

1. You decide that your filter will only lookup the field username, so you "hardcode"

    username = event["username"]

    response = @resource[username].get
    responseHash = JSON.parse(response)
    email = responseHash["email"]

2. You need the field name to be a parameter of the filter, so you do

    config :username_field, :validate => :string
    ...
    username = event[@username_field]
    ...

and you invoke two instances of your filter:

filter {
  crowd {
    username_field => "user1"
    ... (rest of the config)
  }
  crowd {
    username_field => "user2"
    ... (rest of the config)
  }
}

3. Or, now that you understand the flow, you change the filter to accept an array of fields to convert.
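
The third option could look roughly like the sketch below, again with a plain Hash standing in for the event. Everything here is an assumption for illustration: the method names, the fake lookup that replaces the Crowd REST call, and the "<field>_email" naming for the result fields.

```ruby
# Hypothetical lookup standing in for the Crowd REST call.
def fake_email_lookup(username)
  "#{username.downcase}@example.com"
end

# For each configured field that is present on the event, store the result
# in "<field>_email". Fields missing from the event are skipped.
def add_emails(event, username_fields)
  username_fields.each do |field|
    username = event[field]
    next if username.nil?

    event["#{field}_email"] = fake_email_lookup(username)
  end
  event
end

event = { "user1" => "MICKEY", "user2" => "mickey" }
p add_emails(event, ["user1", "user2", "user3"])
```

With this shape a single crowd block in the config would cover both user1 and user2, instead of one block per field as in case 2.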

Zachary Buckholz July 4, 2014 at 6:04 PM

Thanks Philippe, I spent an hour Friday after your comment, and again this morning.

I still seem to be misunderstanding the flow of data from logstash input -> filter (grok -> custom filter)

event before grok filter

filter received {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash"}>, :level=>:info, :file=>"(eval)", :line=>"18"}

event after grok filter

filters/LogStash::Filters::Grok: adding tag {:tag=>"stash-captcha", :level=>:debug, :file=>"/home/lsapp/logstash-1.3.2-flatjar.jar!/logstash/filters/base.rb", :line=>"146"}
Event now:  {:event=>#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.0.0.1,127.0.0.1 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>, :level=>:debug, :file=>"/home/lsapp/logstash/build/ruby/logstash/filters/grok.rb", :line=>"307"}

So at this point grok has successfully parsed the event

logstash conf

input {
  stdin { type => "stash-auth" }
}
filter {

        if "stash-auth" in [type] {
                grok { 
                        patterns_dir => "patterns"
                        pattern => "%{STASH_CAPTCHA}"
                        add_tag => ["stash-captcha"]
                }
             
                if "stash-captcha" in [tags] {

                        crowd {
                                crowdURL => "https://crowd/rest/usermanagement/1/user?username="
                                crowdUsername => "username"
                                crowdPassword => "password"
                                timeout => 2
                        }
                }
        }

}
output {
  stdout { codec => rubydebug }
}

In my custom filter I have event.inspect and can see the data I want is in the event.

event.inspect output

#<LogStash::Event:0x298b2a7 @cancelled=false, @data={"message"=>"10.32.145.121,10.32.149.88 | AuthenticationFailureEvent | MICKEY | 1404402178021 | mickey | {\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"} | 642x2336498x9 | -", "@version"=>"1", "@timestamp"=>"2014-07-04T17:48:58.986Z", "type"=>"stash-auth", "host"=>"stash", "proxy"=>"127.0.0.1", "client"=>"10.0.0.1", "error"=>["AuthenticationFailureEvent", "{\"authentication-method\":\"basic\",\"error\":\"For security reasons you must answer a CAPTCHA question.\"}"], "user1"=>"MICKEY", "epoch_time"=>"1404402178021", "user2"=>"mickey", "minuteinday"=>"642", "reqnumsincerestart"=>"2336498", "concurrentreqs"=>"9", "tags"=>["stash-captcha"]}>

Above the field user1 or user2 would provide the info.

In my custom filter how do I reference these fields?

crowd.rb filter

def filter(event)
    return unless filter?(event)
    puts event.inspect # <--- output shown above
    
    response = @resource[what would I use here??]
    responseHash = JSON.parse(response)
    email = responseHash["email"]


    filter_matched(event)
  end

Details

Assignee: Logstash Developers
Reporter: Zachary Buckholz
Labels: atlassian, chaining, filter, grok, stash
Created: July 3, 2014 at 7:12 PM
Updated: July 8, 2014 at 4:48 PM