SQS Input - slow performance

Description

Logstash SQS input throughput is very low. The best rates we observed in the Metrics filter output are:
~80-90 messages per second (use_ssl => true); we do need to use SSL
~200 messages per second (use_ssl => false)
I would expect these numbers to be significantly higher.

Here's what I've tried to increase throughput:

  • Increased JVM heap (to 1G, 3G, 6G...)

  • Increased "threads" in SQS input configuration for Logstash (to 50, 100, 300, 500...)

  • Increased filter thread pool via the "-w" flag (to 20, 50, 100, 300, 500...)

  • Increased output thread pool (workers/threads settings) (to 20, 50, 100...)

  • Ran with only the SQS input defined (no filter, no output); the SQS queue count decreased at approximately the same rate

Environment:
m3.xlarge EC2 instance on AWS (64-bit, vCPU: 4, ECU: 13, 15 GB RAM, 2 x 40 GB SSD instance storage, Network Performance: High)
CentOS_6.3_x64_v5.8.8
JDK 1.7.0_51
Logstash: 1.3.3

Logstash agent command:

logstash.conf:

Attachments: 3

Created March 6, 2014 at 9:42 PM
Updated February 14, 2015 at 6:00 AM

Activity

Al Belsky November 10, 2014 at 10:47 PM

Updated sqs_jsdk input for Logstash 1.4.2

Al Belsky November 10, 2014 at 10:44 PM

The Logstash 1.4.2 packaging format is different. Here's a link to a logstash-1.4.2 tar that contains the updated sqs_jsdk.rb file, as well as the Java AWS SDK and its dependencies: http://goo.gl/WRjPrS

Here's the list of changes we made in order to add the sqs_jsdk.rb input to Logstash 1.4.2 tar:

  • Add sqs_jsdk.rb to: logstash-1.4.2\lib\logstash\inputs

  • Add AWS SDK for Java, along with its dependencies, to logstash-1.4.2\extra_deps. Here's the list of files we added to this folder:
    aws-java-sdk-1.7.9.jar
    commons-codec-1.6.jar
    commons-logging-1.1.1.jar
    httpclient-4.2.jar
    httpcore-4.2.jar
    jackson-annotations-2.1.1.jar
    jackson-core-2.1.1.jar
    jackson-databind-2.1.1.jar
    joda-time-2.5.jar

Al Belsky March 17, 2014 at 4:51 PM

I wrote an SQS input that uses the Java AWS SDK. With this implementation, I observed rates of 6,000+ messages per second (the rate drops to 2,200 with our Grok filter, but that's another story).

Here's what I did to add the new SQS input and the Java AWS SDK to logstash.jar:

  • added aws-java-sdk.jar and all of its jar dependencies to logstash-1.3.3-flatjar.jar/ext

  • deleted logstash-1.3.3-flatjar.jar/org/apache/http package, as its version (4.1) wasn't compatible with the http client the Java AWS SDK required (4.2)

  • added sqs_jsdk.rb to logstash-1.3.3-flatjar.jar/logstash/inputs

Since Amazon's primary focus appears to be the Java AWS SDK rather than the Ruby SDK, I think Logstash would benefit greatly from using the Java AWS SDK.

Al Belsky March 10, 2014 at 6:28 PM

It looks like the SQS client in the Ruby AWS SDK is the bottleneck. I attached the sqs.rb file, which tests the client throughput directly (outside of Logstash); the best throughput I got was 215 messages per second with 25 threads.
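
A minimal Ruby sketch of this kind of multi-threaded throughput test follows. It is not the attached sqs.rb: `StubQueue`, `measure_throughput`, and the simulated latency are hypothetical stand-ins so the harness runs without AWS credentials. A real test would replace `StubQueue#receive` with calls to the SQS client.

```ruby
require 'thread'

# Hypothetical stand-in for an SQS client (an assumption, not a real SDK class).
# Each receive call sleeps to simulate a network round trip, then hands out
# up to batch_size of the remaining messages.
class StubQueue
  def initialize(total, latency_s)
    @remaining = total
    @latency_s = latency_s
    @lock = Mutex.new
  end

  # Returns the number of messages "received" by this call (0 when drained).
  def receive(batch_size)
    sleep(@latency_s)
    @lock.synchronize do
      n = [batch_size, @remaining].min
      @remaining -= n
      n
    end
  end
end

# Drains the queue with the given number of worker threads and reports
# the total message count and the observed messages-per-second rate.
def measure_throughput(queue, threads, batch_size)
  received = 0
  lock = Mutex.new
  started = Time.now
  workers = Array.new(threads) do
    Thread.new do
      loop do
        n = queue.receive(batch_size)
        break if n.zero?
        lock.synchronize { received += n }
      end
    end
  end
  workers.each(&:join)
  elapsed = Time.now - started
  { received: received, rate: received / elapsed }
end

if __FILE__ == $PROGRAM_NAME
  # 25 threads, one message per call, 10 ms simulated round trip.
  result = measure_throughput(StubQueue.new(1000, 0.01), 25, 1)
  puts format('%d messages, %.0f msg/s', result[:received], result[:rate])
end
```

Running the same harness with different thread counts and batch sizes shows how much of the rate is determined by per-call latency rather than by CPU.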

The Java AWS SDK seems to offer many optimization options, in terms of pooling and available clients (such as AmazonSQSAsyncClient), that the Ruby AWS SDK doesn't seem to have. This page states that, when using the batch API from the Java SDK, the expected throughput is 2,500 messages per second.
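
As a rough back-of-the-envelope check on why batching matters, the sketch below uses an idealized model in which every thread issues calls back to back and each call returns a fixed number of messages. The function names are hypothetical, and the model assumes the 215 messages per second measurement fetched one message per call, which the report does not state. SQS returns at most 10 messages per receive call.

```ruby
# Idealized model (an assumption): each thread issues calls back to back,
# every call takes latency_s seconds and returns batch_size messages.
def estimated_rate(threads, batch_size, latency_s)
  threads * batch_size / latency_s.to_f
end

# Per-call latency implied by an observed rate under the same model.
def implied_latency(threads, batch_size, observed_rate)
  threads * batch_size / observed_rate.to_f
end

# 25 threads at 215 msg/s, assuming one message per call.
latency = implied_latency(25, 1, 215)
puts format('implied per-call latency: %.3f s', latency)

# Same latency and thread count, but 10 messages per call.
puts format('with batches of 10: %.0f msg/s', estimated_rate(25, 10, latency))
```

The model ignores thread contention and SSL overhead, so it gives only an order-of-magnitude estimate.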

I'll try to integrate Logstash with the Java AWS SDK and see what kind of throughput we get with the Java SQS client...
