Logstash may emit duplicates when restarted
Description
Details
Assignee
Logstash Developers
Reporter
Zdenek Pavlas
Affects versions
Created May 29, 2014 at 11:44 AM
Updated May 29, 2014 at 11:51 AM
Logstash bundles a custom version of the filewatch gem, which is buggy. It reads data in 16 kB chunks and tries to sync the sincedb after each chunk. This is flawed because 1) mid-line file offsets are saved, and 2) when all chunks, including the last one, are processed in less than 10 s, the last position (EOF) is not saved. When Logstash is killed and restarted, duplicate entries are emitted.
The sincedb should not be updated inside the read loop, only once the loop has finished. https://github.com/jordansissel/ruby-filewatch has the correct code.
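To illustrate the intended behaviour, here is a minimal sketch in Python rather than the gem's Ruby. It is not the filewatch code: `read_new_lines` and `save_offset` are hypothetical names, and storing the position as a single integer is a simplification of the real sincedb format. The point is that the offset only ever advances past complete lines and is persisted once, after the read loop.

```python
import os

CHUNK_SIZE = 16 * 1024  # 16 kB chunks, as described in the report


def read_new_lines(path, start_offset):
    """Return (lines, new_offset) for data found after start_offset.

    new_offset always points just past the last complete line, so a
    restart never resumes in the middle of a line.
    """
    lines = []
    buffer = b""
    offset = start_offset
    with open(path, "rb") as f:
        f.seek(start_offset)
        while True:
            chunk = f.read(CHUNK_SIZE)
            if not chunk:
                break
            buffer += chunk
            # Split off every complete line; keep the trailing partial line.
            *complete, buffer = buffer.split(b"\n")
            for line in complete:
                lines.append(line)
                offset += len(line) + 1  # +1 for the newline
    return lines, offset


def save_offset(sincedb_path, offset):
    """Persist the offset atomically: write a temp file, then rename it."""
    tmp = sincedb_path + ".new"
    with open(tmp, "w") as f:
        f.write(str(offset))
    os.replace(tmp, sincedb_path)
```

A caller would invoke `read_new_lines`, emit the returned lines, and only then call `save_offset` with the returned offset. Any unterminated trailing line is left unread and picked up on the next pass, once its newline has been written.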