
Duplicated documents elasticsearch embedded

Description

On my Windows server, I have NxLog configured to send the Setup event log to Logstash as JSON. The Setup event log contains 1991 events in total.

Logstash is configured to send those logs to two different destinations: a file and Elasticsearch.

Here is my config:

logstash.conf

input {
  tcp {
    port => 5140
    type => "ms_eventlog"
    codec => line
  }
}

filter {
  if [type] == "ms_eventlog" {
    json {
      source => "message"
      remove_field => [ "EventReceivedTime", "SourceModuleName", "SourceModuleType", "message" ]
    }
  }
}

output {
  file {
    path => [ "/elasticsearch_data/json.log" ]
  }
  elasticsearch {
    embedded => true
  }
}

All 1991 logs are written to the file correctly, but they are duplicated when inserted into Elasticsearch. The duplicated events have different _id values.

See the example below:

I searched for a single event with a RecordNumber of 1991 (my last event). RecordNumber is a unique event log identifier that Windows increments every time the system generates an event in a particular log scope (in my case, Setup).

/elasticsearch_data/json.log

# grep '"RecordNumber":1991' /elasticsearch_data/json.log
{"@timestamp":"2013-11-06T10:39:25.788Z","@version":"1","type":"ms_eventlog","host":"x.x.x.x:56833","EventTime":"2013-11-04 11:50:45","Hostname":"xxxx.xxxx.xx","Keywords":-9223372036854775808,"EventType":"INFO","SeverityValue":2,"Severity":"INFO","EventID":2,"SourceName":"Microsoft-Windows-Servicing","ProviderGuid":"{BD12F3B8-FC40-4A61-A307-B7A013A069C1}","Version":0,"Task":1,"OpcodeValue":0,"RecordNumber":1991,"ProcessID":428,"ThreadID":392,"Channel":"Setup","Domain":"NT AUTHORITY","AccountName":"SYSTEM","UserID":"SYSTEM","AccountType":"User","Message":"Package KB2871777 was successfully changed to the Installed state.","Opcode":"Info"}
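The grep above checks a single RecordNumber. To check the whole file at once, a minimal Python sketch along these lines could be used. It assumes the file holds one JSON event per line, as the file output above produces; the function name `duplicate_record_numbers` is made up for illustration.

```python
import json
from collections import Counter


def duplicate_record_numbers(lines):
    """Return {RecordNumber: count} for every RecordNumber seen more than once.

    `lines` is any iterable of strings, each holding one JSON event
    (for example an open file handle on json.log).
    """
    counts = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        event = json.loads(line)
        counts[event.get("RecordNumber")] += 1
    return {number: n for number, n in counts.items() if n > 1}
```

Calling it as `duplicate_record_numbers(open("/elasticsearch_data/json.log"))` should return an empty dict if the file really contains each event only once.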

In Elasticsearch, I ran this query:

elasticsearch query

GET /logstash-2013.11.06/_search?pretty
{
  "query": {
    "filtered": {
      "query": {
        "bool": {
          "should": [
            {
              "query_string": {
                "query": "*"
              }
            }
          ]
        }
      },
      "filter": {
        "bool": {
          "must": [
            {
              "match_all": {}
            },
            {
              "range": {
                "@timestamp": {
                  "from": 1383648840642,
                  "to": "now"
                }
              }
            },
            {
              "fquery": {
                "query": {
                  "field": {
                    "RecordNumber": {
                      "query": "1991"
                    }
                  }
                },
                "_cache": true
              }
            },
            {
              "bool": {
                "must": [
                  {
                    "match_all": {}
                  }
                ]
              }
            }
          ]
        }
      }
    }
  },
  "highlight": {
    "fields": {},
    "fragment_size": 2147483647,
    "pre_tags": [
      "@start-highlight@"
    ],
    "post_tags": [
      "@end-highlight@"
    ]
  },
  "size": 500,
  "sort": [
    {
      "@timestamp": {
        "order": "desc"
      }
    }
  ]
}

And this is the result:

elasticsearch query results

{
  "took": 12,
  "timed_out": false,
  "_shards": {
    "total": 4,
    "successful": 4,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": null,
    "hits": [
      {
        "_index": "logstash-2013.11.06",
        "_type": "logs",
        "_id": "rhZQM3heR2m5MJGpH-Jm0w",
        "_score": null,
        "_source": {
          "@timestamp": "2013-11-06T10:39:25.788Z",
          "@version": "1",
          "type": "ms_eventlog",
          "host": "xx.xx.xx.xx:56833",
          "EventTime": "2013-11-04 11:50:45",
          "Hostname": "xxxx.xxxx.xx",
          "Keywords": -9223372036854776000,
          "EventType": "INFO",
          "SeverityValue": 2,
          "Severity": "INFO",
          "EventID": 2,
          "SourceName": "Microsoft-Windows-Servicing",
          "ProviderGuid": "{BD12F3B8-FC40-4A61-A307-B7A013A069C1}",
          "Version": 0,
          "Task": 1,
          "OpcodeValue": 0,
          "RecordNumber": 1991,
          "ProcessID": 428,
          "ThreadID": 392,
          "Channel": "Setup",
          "Domain": "NT AUTHORITY",
          "AccountName": "SYSTEM",
          "UserID": "SYSTEM",
          "AccountType": "User",
          "Message": "Package KB2871777 was successfully changed to the Installed state.",
          "Opcode": "Info"
        },
        "sort": [
          1383734365788
        ]
      },
      {
        "_index": "logstash-2013.11.06",
        "_type": "logs",
        "_id": "ByGda_aSTq6BkfN-O5zS-Q",
        "_score": null,
        "_source": {
          "@timestamp": "2013-11-06T10:39:25.788Z",
          "@version": "1",
          "type": "ms_eventlog",
          "host": "xx.xx.xx.xx:56833",
          "EventTime": "2013-11-04 11:50:45",
          "Hostname": "xxxx.xxx.xx",
          "Keywords": -9223372036854776000,
          "EventType": "INFO",
          "SeverityValue": 2,
          "Severity": "INFO",
          "EventID": 2,
          "SourceName": "Microsoft-Windows-Servicing",
          "ProviderGuid": "{BD12F3B8-FC40-4A61-A307-B7A013A069C1}",
          "Version": 0,
          "Task": 1,
          "OpcodeValue": 0,
          "RecordNumber": 1991,
          "ProcessID": 428,
          "ThreadID": 392,
          "Channel": "Setup",
          "Domain": "NT AUTHORITY",
          "AccountName": "SYSTEM",
          "UserID": "SYSTEM",
          "AccountType": "User",
          "Message": "Package KB2871777 was successfully changed to the Installed state.",
          "Opcode": "Info"
        },
        "sort": [
          1383734365788
        ]
      }
    ]
  }
}
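To see how widespread the duplication is across a whole search response rather than one RecordNumber, the hits can be grouped by the fields that should identify an event. The following is only a sketch: it assumes the response has already been parsed into a Python dict with the same shape as the result above, and the function name `group_duplicate_hits` and the choice of `Channel` plus `RecordNumber` as the key are illustrative assumptions.

```python
from collections import defaultdict


def group_duplicate_hits(response, key_fields=("Channel", "RecordNumber")):
    """Group hit _ids by the values of `key_fields` in each hit's _source.

    Returns only the groups with two or more hits, i.e. the events
    that were indexed more than once.
    """
    groups = defaultdict(list)
    for hit in response["hits"]["hits"]:
        source = hit["_source"]
        key = tuple(source.get(field) for field in key_fields)
        groups[key].append(hit["_id"])
    return {key: ids for key, ids in groups.items() if len(ids) > 1}
```

For the response above, this would report one group, keyed by `("Setup", 1991)`, holding the two different `_id` values.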

As you can see, the _id is different for each one.

I have already cleaned everything out with the command below, purged the Elasticsearch index directory (/elasticsearch_data/*), and restarted Logstash, but without success.
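Elasticsearch generates a random _id whenever a document is indexed without one, so an event that reaches the output twice ends up as two separate documents. As a hedged illustration of that point only, the sketch below derives a stable ID from fields that should be unique per event. The function name `stable_event_id` and the choice of `Hostname`, `Channel`, and `RecordNumber` are assumptions, and this does not explain why the event is delivered twice in the first place.

```python
import hashlib


def stable_event_id(event):
    """Build a deterministic ID from fields assumed to identify one event.

    The same event always maps to the same ID, so indexing it twice
    under this ID would overwrite the document instead of adding a copy.
    """
    parts = (event["Hostname"], event["Channel"], event["RecordNumber"])
    raw = "|".join(str(part) for part in parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()
```

Two copies of the same event produce the same ID, while events with different RecordNumber values produce different ones.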

clean elasticsearch

# curl -XDELETE 'http://localhost:9200/'

Any idea what might be happening?

Environment

None

Status

Assignee

Logstash Developers

Reporter

Bruno Galindro da Costa

Affects versions

1.2.2

Priority