%{pattern:subname:type} has some issue

Description

i tried this, when my pattern defined as:

And my /var/log/messages contains content as:
Mar 28 01:20:05 MyVM ntpd[2262]: time reset -15.714794 s

However the result comes out as:
{"@source"=>"file://MyVM/var/log/messages", "@tags"=>[], "@fields"=>{"min"=>[20]}, "@timestamp"=>"2013-03-28T01:24:48.314Z", "@source_host"=>"MyVM", "@source_path"=>"/var/log/messages", "@message"=>"Mar 28 01:20:05 MyVM ntpd[2262]: time reset -15.714794 s", "@type"=>"tyler-syslog"}

Referring to the codes: grok.rb, line 362-371:
if !value.nil? && (!value.empty? rescue true)

  1. Store fields as an array unless otherwise instructed with the

  2. 'singles' config option
    if !event.fields.include?(key) and @singles
    event.fields[key] = value
    else
    event.fields[key] ||= []
    event.fields[key] << value
    end
    end

Please note that, the output is: "min"=>[20], actually, i thought it should be "min"=>["20",20], right? Please correct me if i miss anything.

Gliffy Diagrams

Activity

Show:

logstash beginner April 11, 2013 at 12:07 PM

Not a bug, just misunderstanding.

logstash beginner April 11, 2013 at 12:05 PM

Thanks so much for your patient explanation!
i found my fault, i thought the event fields should be filled with the string value on line 377:

But actually, event.fields[key] is initialized as nil here.
Only when i defined the same key more than once as:

Then for the second key "min", event.fields[key] would be "[20]", and finally event.fields["min"]=[20,05].
Thanks again, and sorry for bothering you!

Philippe Weber April 11, 2013 at 4:44 AM

Hey, that's my previous comment about how coerce is handled
Let's trace the code "together"
0. You have defined one pattern:

that means grok will match this regex against the @message field.
1. Grok does the regular expression matching work on line 330
At this point the match variable is an array containing one hash with the string value {min => "20"}.
Nothing is put inside the event fields yet!
2. Then this match is processed in the loop of line 342
your variables are key="min" and value="20"
3. As you defined coerce to int,line 359
change the content of the value variable from "20" to 20
4.You get to the line you focus on, line 393
your variables are key="min" and value=20,
and event.fields[key] is nil, so it is initialized with an empty array
5. Line 394 we finally push << the value variable to this array so the final expected result.
event.fields[key] = [20]
That's the only moment where information is added to the event.
As a general rule, you won't get TWO values when your GROKPATTERN %{MINUTE} is defined only ONCE

I hope I was clear enough to help you.
(Pretty difficult to do this by non-interactively)

logstash beginner April 11, 2013 at 1:39 AM

Yes, what i did is just your case3:

and i truly got the result as "min => [20]" as you mentioned.
The question is: when i see around the code
https://github.com/logstash/logstash/blob/master/lib/logstash/filters/grok.rb#L393
i thought it should be "min => ["20", 20]", since the value(20) should be append to event.field[key], and not replacing it for code:

i just feel confusing why it come back with "min => [20]". Maybe i miss any other code which dropping the string value?
Thanks.

Philippe Weber April 7, 2013 at 11:49 AM

Maybe better than looking at the code, you should look at the spec tests to understand the expected behavior of the filter https://github.com/logstash/logstash/blob/master/spec/filters/grok.rb
. For sure in this simple case grok filter will not store both string and coerced value.
A good way for you would be to write your own spec to validate your config.
To sum it up, coerce and single are two different features.

Case 1: simple grok

Results is

Case 2: simple grok with singles => true

Results is

Case 3: simple grok with coerce

Results is

Case 3: simple grok with coerce and singles => true

Results is

Fixed

Details

Assignee

Reporter

Labels

Created March 28, 2013 at 3:22 AM
Updated April 19, 2013 at 8:38 PM
Resolved April 11, 2013 at 12:07 PM