elasticsearch_http fails to handle UTF-8 characters in log messages
Activity

Jie Pan March 14, 2013 at 7:08 AM
The problem is resolved in the 1.1.10 dev package.
Thank you guys for your quick response!
Jordan Sissel March 14, 2013 at 6:25 AM
This bug has been fixed in master; it was a bug in the HTTP library (FTW) that logstash uses.
This build should have the fix:
http://r.logstash.net/jenkins/job/logstash.jar.daily/190/artifact/build/logstash-1.1.10.dev-monolithic.jar

Philippe Weber March 14, 2013 at 6:20 AM (edited)
It seems your issue is related to this fix in FTW: https://github.com/jordansissel/ruby-ftw/pull/7
The content length specified in the header would be too short, truncating the output sent to elasticsearch.
Jordan has just included a new version of FTW in the logstash dependencies: https://github.com/logstash/logstash/commit/4cf6a680816ceaed043bfb797efb78cf32792ac0
Can you please update your custom logstash build and check whether the problem is resolved?
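
For reference, the likely failure mode behind that fix (an inference from the pull request, not something stated in this thread): HTTP's Content-Length header must carry a byte count, but counting characters gives a smaller number for multibyte UTF-8 text, so the receiver stops reading before the JSON body ends. A minimal Ruby sketch:

# encoding: utf-8
# Suspected bug, sketched: Content-Length needs bytes, not characters.
body = %({"@message":"[2] 你好,熊猫"})
puts body.length    # character count -- too small to use as Content-Length
puts body.bytesize  # byte count -- the value the header actually needs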

Jie Pan March 14, 2013 at 6:00 AM
btw, my locale is "en_US.UTF-8".
Description
elasticsearch version: 0.20.5
Starting logstash using:
java -jar logstash-1.1.9-monolithic.jar agent -f logstash.conf
logstash.conf:
input {
  file {
    path => "/tmp/testdoc"
    type => "php_error"
  }
}
filter {
  multiline {
    type => "php_error"
    pattern => "^\["
    negate => true
    what => "previous"
  }
}
output {
  elasticsearch_http {
    host => "192.168.8.10"
    flush_size => 1
  }
  file {
    path => "/tmp/logstash-input.%{@type}.log"
  }
}
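
For context, the multiline filter above joins continuation lines into one event: with negate => true and what => "previous", every line that does not match ^\[ is appended to the previous line's event. A rough Ruby sketch of that joining logic (illustrative only, not the plugin's implementation):

pattern = /^\[/   # lines starting with "[" begin a new event
events = []
File.foreach('/tmp/testdoc') do |line|
  if events.empty? || line =~ pattern
    events << line.chomp                 # matching line starts a new event
  else
    events.last << "\n" << line.chomp    # continuation joins the previous event
  end
end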
At the beginning, /tmp/testdoc is an empty file. After I execute:
echo "[2] 你好,熊猫" >> /tmp/testdoc
ES throws an exception:
[2013-03-14 10:49:28,004][DEBUG][action.index ] [Earthquake] [logstash-2013.03.14][0], node[JhAc_GZeRsqWneXm_1OVWQ], [P], s[STARTED]: Failed to execute [index {[logstash-2013.03.14][php_error][cFNqAKgfTYCdf5HGeRsTtw], source[{"@source":"file://DeVServer/tmp/testdoc","@tags":[],"@fields":{},"@timestamp":"2013-03-14T02:49:27.291Z","@source_host":"DeVServer","@source_path":"/tmp/testdoc","@message":"[2] 你好,熊猫","@type":"p]}]
org.elasticsearch.index.mapper.MapperParsingException: Failed to parse [@type]
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:320)
at org.elasticsearch.index.mapper.object.ObjectMapper.serializeValue(ObjectMapper.java:587)
at org.elasticsearch.index.mapper.object.ObjectMapper.parse(ObjectMapper.java:459)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:486)
at org.elasticsearch.index.mapper.DocumentMapper.parse(DocumentMapper.java:430)
at org.elasticsearch.index.shard.service.InternalIndexShard.prepareCreate(InternalIndexShard.java:297)
at org.elasticsearch.action.index.TransportIndexAction.shardOperationOnPrimary(TransportIndexAction.java:211)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction.performOnPrimary(TransportShardReplicationOperationAction.java:532)
at org.elasticsearch.action.support.replication.TransportShardReplicationOperationAction$AsyncShardOperationAction$1.run(TransportShardReplicationOperationAction.java:430)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
at java.lang.Thread.run(Thread.java:722)
Caused by: org.elasticsearch.common.jackson.core.JsonParseException: Unexpected end-of-input in VALUE_STRING
at [Source: [B@403ecbf2; line: 1, column: 413]
at org.elasticsearch.common.jackson.core.JsonParser._constructError(JsonParser.java:1378)
at org.elasticsearch.common.jackson.core.base.ParserMinimalBase._reportError(ParserMinimalBase.java:599)
at org.elasticsearch.common.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:532)
at org.elasticsearch.common.jackson.core.base.ParserMinimalBase._reportInvalidEOF(ParserMinimalBase.java:526)
at org.elasticsearch.common.jackson.core.base.ParserBase.loadMoreGuaranteed(ParserBase.java:432)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser._finishString2(UTF8StreamJsonParser.java:2111)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser._finishString(UTF8StreamJsonParser.java:2092)
at org.elasticsearch.common.jackson.core.json.UTF8StreamJsonParser.getText(UTF8StreamJsonParser.java:275)
at org.elasticsearch.common.xcontent.json.JsonXContentParser.text(JsonXContentParser.java:83)
at org.elasticsearch.common.xcontent.support.AbstractXContentParser.textOrNull(AbstractXContentParser.java:106)
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:281)
at org.elasticsearch.index.mapper.core.StringFieldMapper.parseCreateField(StringFieldMapper.java:46)
at org.elasticsearch.index.mapper.core.AbstractFieldMapper.parse(AbstractFieldMapper.java:307)
... 11 more
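That stack trace is consistent with a truncated request body: the DEBUG line above shows the source cut off mid-value at "@type":"p, and "Unexpected end-of-input in VALUE_STRING" is exactly what a JSON parser reports when the stream ends inside a string. A small Ruby illustration of the same effect (the document here is a stand-in, not the exact event):

require 'json'

doc = %({"@message":"[2] 你好,熊猫","@type":"php_error"})
declared  = doc.length                  # character count, as the buggy header would declare
truncated = doc.byteslice(0, declared)  # the prefix the server actually reads

begin
  JSON.parse(truncated)
rescue JSON::ParserError => e
  puts "parse failed: #{e.message}"     # end-of-input inside a string value
end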
and logstash keeps retrying to post the log to ES, filling the screen with messages (ES denies each retry just as it did the first post):
Error writing to elasticsearch {:response=>#<FTW::Response:0x819634c @headers=FTW::HTTP::Headers <{"content-type"=>"text/plain; charset=UTF-8", "content-length"=>"74"}>, @body=<FTW::Connection(@3924) @destinations=["192.168.8.10:9200"] @connected=true @remote_address="192.168.8.10" @secure=false >, @status=400, @logger=#<Cabin::Channel:0x85d044f @subscriber_lock=#<Mutex:0x86027a9>, @metrics=#<Cabin::Metrics:0x86027a6 @channel=#<Cabin::Channel:0x85d044f ...>, @metrics={}, @metrics_lock=#<Mutex:0x86027bb>>, @data={}, @subscribers={}, @level=:info>, @reason="Bad Request", @version=1.1>, :response_body=>"No handler found for uri /logstash-2013.03.14/php_error and method [GET]", :level=>:error}
It's all OK with the "file" output: the UTF-8 characters were captured successfully in /tmp/logstash-input.php_error.log.
But if I change the output from elasticsearch_http to elasticsearch, with the configuration below (shouldn't that plugin be used with ES version 0.20.2?):
elasticsearch {
  host => "192.168.8.10"
}
the UTF-8 problem is gone and everything is fine. I can see those log messages in Kibana, elasticsearch-head, etc.
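
As a quick sanity check that the messages really made it into the index, something like this works (a sketch using Ruby's standard library; the host and index name are taken from this report, and the query is illustrative):

require 'net/http'
require 'json'
require 'uri'

# Search the day's logstash index and print the stored messages.
uri = URI('http://192.168.8.10:9200/logstash-2013.03.14/_search?q=%40type:php_error')
response = Net::HTTP.get_response(uri)
JSON.parse(response.body)['hits']['hits'].each do |hit|
  puts hit['_source']['@message']   # expect: [2] 你好,熊猫
end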