urldecode filter doesn't recognize + as a space

Description

The urldecode filter decodes percent-escapes such as %20 but does not treat + characters as spaces. Reproduced with Logstash 1.3.3:

[vagrant@localhost ~]$ java -jar /opt/logstash/logstash-1.3.3-flatjar.jar agent -e 'filter {urldecode{ field => "message" }}'
Using milestone 2 filter plugin 'urldecode'. This plugin should be stable, but if you see strange behavior, please let us know! For more information on plugin milestones, see http://logstash.net/docs/1.3.3/plugin-milestones {:level=>:warn}
hello%20world
{
"message" => "hello world",
"@version" => "1",
"@timestamp" => "2014-03-09T22:09:21.454Z",
"type" => "stdin",
"host" => "localhost.localdomain"
}
hello+world
{
"message" => "hello+world",
"@version" => "1",
"@timestamp" => "2014-03-09T22:09:27.304Z",
"type" => "stdin",
"host" => "localhost.localdomain"
}

Activity

Darren Foo March 10, 2014 at 7:40 AM

A setting to have the filter treat + as a space seems reasonable. My use case is the user-agent field in IIS logs, whose format I don't believe I can change, unfortunately.

Jordan Sissel March 10, 2014 at 6:09 AM

To be honest, I'm not sure what the correct behavior is. Some systems interpret "+" to mean space (0x20) and some see a "+" and read it as a plus sign (0x2b).
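For illustration only, the two interpretations can be seen side by side in Python's standard library, which ships one decoder for each convention. This is a sketch of the ambiguity, not of how the urldecode filter is implemented; the sample strings are the ones from the report plus one made-up value containing %2B.

```python
from urllib.parse import unquote, unquote_plus

# Sample inputs: two from the report, one illustrative value with an escaped plus.
samples = ["hello%20world", "hello+world", "1%2B1"]

# unquote: "+" is a literal plus sign (0x2b).
# unquote_plus: "+" means space (0x20), the HTML form-encoding convention.
results = {s: (unquote(s), unquote_plus(s)) for s in samples}

for raw, (literal_plus, plus_as_space) in results.items():
    print(f"{raw!r}: literal plus -> {literal_plus!r}, plus as space -> {plus_as_space!r}")
```

Only the unescaped + is decoded differently; %20 and %2B give the same result under both conventions.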

Best I can offer is that maybe we add a setting to urldecode to allow plus-to-space conversion.
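A minimal sketch of what such an opt-in setting could look like, written in Python rather than as a Logstash plugin change. The function name urldecode and the parameter plus_as_space are hypothetical; no such option is confirmed to exist in the filter. The default keeps the behavior shown in the report.

```python
from urllib.parse import unquote


def urldecode(value: str, plus_as_space: bool = False) -> str:
    """Percent-decode value; optionally treat '+' as a space.

    plus_as_space is a hypothetical setting name used for illustration.
    """
    if plus_as_space:
        # Replace '+' BEFORE percent-decoding so that an escaped plus
        # ("%2B") still decodes to a literal '+' afterwards.
        value = value.replace("+", " ")
    return unquote(value)


print(urldecode("hello+world"))                      # default: plus is kept
print(urldecode("hello+world", plus_as_space=True))  # opt-in: plus becomes space
print(urldecode("a%2Bb+c", plus_as_space=True))      # escaped plus survives
```

The order of the two steps matters: decoding first and replacing afterwards would also turn a legitimately escaped %2B into a space.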

Details

Created March 9, 2014 at 10:10 PM
Updated March 25, 2015 at 6:03 AM