
Issue with elasticsearch input

Description

I configured Logstash to retrieve all documents of one index from Elasticsearch using the elasticsearch input plugin -> http://logstash.net/docs/1.2.2/inputs/elasticsearch

The output goes to a file. But some strange behaviour occurred: instead of writing all documents collected from that index to the output file, Logstash wrote them back into the index itself, duplicating all of the documents (4 million). Now I don't know how to search for the duplicate entries, because it copied all of the fields exactly as they were, changing only the _id field.
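Since the copies differ only in _id, one way to locate them is to group documents by a hash of every field except _id. This is only a sketch of the idea, assuming documents have already been exported as plain dicts (the field names in the sample are hypothetical):

```python
import hashlib
import json

def content_key(doc):
    """Hash every field except _id, so exact copies collapse to one key."""
    body = {k: v for k, v in doc.items() if k != "_id"}
    return hashlib.sha1(json.dumps(body, sort_keys=True).encode()).hexdigest()

def find_duplicates(docs):
    """Group documents by content hash; keep only groups with more than one _id."""
    groups = {}
    for doc in docs:
        groups.setdefault(content_key(doc), []).append(doc["_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}

# Hypothetical sample: the first two documents are identical except for _id.
docs = [
    {"_id": "a1", "@timestamp": "2013-11-07T10:00:00", "message": "logon"},
    {"_id": "b2", "@timestamp": "2013-11-07T10:00:00", "message": "logon"},
    {"_id": "c3", "@timestamp": "2013-11-07T10:05:00", "message": "logoff"},
]
print(find_duplicates(docs))  # one group containing the ids "a1" and "b2"
```

From each group, all but one _id could then be deleted.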

My Logstash instance is running with only one config file loaded:

root@LogServer-6-27:/opt/logstash/conf# ps aux | grep logstash
root     12791 12.1 16.6 4534288 1361804 ?  SNl  06:44 107:43 /usr/bin/java -jar /opt/logstash/bin/logstash.jar agent -f /opt/logstash/conf/logstash.conf --log /var/log/logstash.log
root     19782  0.0  0.0    9392     932 pts/2 R+  21:33   0:00 grep --color=auto logstash

/opt/logstash/logstash.conf

input {
  tcp {
    port  => 5140
    type  => "ms_eventlog"
    codec => line
  }
#  tcp {
#    port  => 5141
#    type  => "ms_eventlog_santander"
#    codec => line
#  }
  tcp {
    port  => 5142
    type  => "ms_dhcp_auditlog"
    codec => line
  }
  elasticsearch {
    host  => "localhost"
    index => "logstash-2013.11.07"
    type  => "es_reindex"
  }
  file {
    start_position => "beginning"
    sincedb_path   => "/opt/logstash/sincedb"
    path           => [ "/var/log/logstash.conf" ]
    codec          => rubydebug
    type           => "localhost"
  }
  redis {
    type      => "redis_evt"
    host      => "127.0.0.1"
    data_type => "list"
    key       => "logstash"
    port      => 6379
    codec     => json
  }
}

filter {
  if [type] == "ms_eventlog" or [type] == "ms_eventlog_santander" or [type] == "ms_dhcp_auditlog" or [type] == "es_reindex" {
    json {
      source       => "message"
      remove_field => [ "EventReceivedTime", "SourceModuleName", "SourceModuleType", "message" ]
    }
    grok {
      match => [ "host", "%{IP:source_host}" ]
    }
    mutate {
      remove_field => [ "host" ]
    }
  }
  if [type] == "ms_dhcp_auditlog" or [type] == "es_reindex" {
    date {
      match  => [ "EventTime", "dd/MM/YY HH:mm:ss" ]
      target => "EventTime"
    }
  }
}

output {
  if [type] == "ms_eventlog" or [type] == "redis_evt" {
    elasticsearch {
      host => localhost
    }
  }
  if [type] == "es_reindex" {
#    elasticsearch {
#      host  => localhost
#      index => "logstash-2013.11.07_new"
#    }
    file {
      path => [ '/elasticsearch_data/json.log' ]
    }
  }
  if [type] == "ms_eventlog_santander" {
    elasticsearch {
      host  => localhost
      index => "logstash-santander-%{+YYYY.MM.dd}"
    }
  }
  if [type] == "ms_dhcp_auditlog" {
    elasticsearch {
      host  => localhost
      index => "logstash-dhcp-%{+YYYY.MM.dd}"
    }
#    file {
#      path => [ '/tmp/json.log' ]
#    }
  }
  if [type] == "localhost" {
    elasticsearch {
      host  => localhost
      index => "logstash-localhost-%{+YYYY.MM.dd}"
    }
  }
}
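One thing worth double-checking (this is an assumption, not a confirmed diagnosis): in Logstash, an input's type setting does not override a type field the event already carries, so documents pulled back from Elasticsearch may keep their stored type (e.g. ms_eventlog) rather than becoming es_reindex, and would then match the plain elasticsearch output branch instead of the file branch. A minimal sketch that forces the type on re-indexed events could look like this (the mutate workaround is hypothetical and would need testing on 1.2.2):

filter {
  # Hypothetical workaround: overwrite whatever type came back from the index.
  # In this sketch the only input is the elasticsearch one, so the filter is
  # safe to apply unconditionally; in the full config it would need a guard.
  mutate {
    replace => [ "type", "es_reindex" ]
  }
}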

What did I do wrong?

Environment

None

Status

Assignee

Logstash Developers

Reporter

Bruno Galindro da Costa

Affects versions

1.2.2

Priority