SQS Input - slow performance
Activity

Al Belsky November 10, 2014 at 10:47 PM
Updated sqs_jsdk input for Logstash 1.4.2

Al Belsky November 10, 2014 at 10:44 PM
The Logstash 1.4.2 packaging format is different. Here's a link to the logstash-1.4.2 tar, which contains the updated sqs_jsdk.rb file as well as the Java AWS SDK and its dependencies: http://goo.gl/WRjPrS
Here's the list of changes we made in order to add the sqs_jsdk.rb input to Logstash 1.4.2 tar:
Add sqs_jsdk.rb to: logstash-1.4.2\lib\logstash\inputs
Add AWS SDK for Java, along with its dependencies, to logstash-1.4.2\extra_deps. Here's the list of files we added to this folder:
aws-java-sdk-1.7.9.jar
commons-codec-1.6.jar
commons-logging-1.1.1.jar
httpclient-4.2.jar
httpcore-4.2.jar
jackson-annotations-2.1.1.jar
jackson-core-2.1.1.jar
jackson-databind-2.1.1.jar
joda-time-2.5.jar
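
To sanity-check the second step, a small Ruby script can confirm that every jar listed above is actually present in the extra_deps folder. This is only a sketch: the helper name missing_jars is hypothetical, and the relative path assumes the script is run from the directory that contains the unpacked logstash-1.4.2 tar.

```ruby
# Jar names copied from the list above.
REQUIRED_JARS = %w[
  aws-java-sdk-1.7.9.jar
  commons-codec-1.6.jar
  commons-logging-1.1.1.jar
  httpclient-4.2.jar
  httpcore-4.2.jar
  jackson-annotations-2.1.1.jar
  jackson-core-2.1.1.jar
  jackson-databind-2.1.1.jar
  joda-time-2.5.jar
].freeze

# Hypothetical helper: returns the required jars that are NOT found in dir.
def missing_jars(dir)
  REQUIRED_JARS.reject { |jar| File.exist?(File.join(dir, jar)) }
end

if __FILE__ == $PROGRAM_NAME
  # Assumed location, relative to the current directory.
  dir = File.join("logstash-1.4.2", "extra_deps")
  missing = missing_jars(dir)
  if missing.empty?
    puts "All #{REQUIRED_JARS.size} jars found in #{dir}"
  else
    puts "Missing from #{dir}:"
    missing.each { |jar| puts "  #{jar}" }
  end
end
```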

Al Belsky March 17, 2014 at 4:51 PM
I wrote an SQS input that uses the Java AWS SDK. With this implementation, I observed rates of 6,000+ messages per second (the rate drops to 2,200 with our Grok filter, but that's another story...).
Here's what I did to add the new SQS input and the Java AWS SDK to logstash.jar:
Added aws-java-sdk.jar and all of its jar dependencies to logstash-1.3.3-flatjar.jar/ext
Deleted the logstash-1.3.3-flatjar.jar/org/apache/http package, because its version (4.1) wasn't compatible with the HTTP client version the Java AWS SDK requires (4.2)
Added sqs_jsdk.rb to logstash-1.3.3-flatjar.jar/logstash/inputs
Since Amazon's primary focus appears to be the Java AWS SDK rather than the Ruby SDK, I think Logstash would benefit greatly from using the Java AWS SDK.

Al Belsky March 10, 2014 at 6:28 PM
It looks like the SQS client in the Ruby AWS SDK is the bottleneck. I attached the sqs.rb file, which tests the client throughput directly (outside of Logstash); the best throughput I got was 215 messages per second with 25 threads.
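
For readers without the attachment, a multi-threaded throughput test of this kind can be sketched roughly as follows. This is not the attached sqs.rb: the method name measure_throughput is hypothetical, and the block passed to it is a stand-in that sleeps instead of calling SQS, so the numbers it prints say nothing about SQS itself. To measure a real client, replace the block with a call that receives messages and returns how many it got.

```ruby
# Hypothetical harness: runs the given block from thread_count threads for
# roughly duration seconds. The block must return the number of messages it
# received in one call.
def measure_throughput(thread_count, duration, &receive)
  counts = Array.new(thread_count, 0) # one slot per thread, so no locking needed
  started = Time.now
  deadline = started + duration
  workers = (0...thread_count).map do |i|
    Thread.new do
      counts[i] += receive.call while Time.now < deadline
    end
  end
  workers.each(&:join)
  elapsed = Time.now - started
  total = counts.inject(0, :+)
  { :total => total, :elapsed => elapsed, :rate => total / elapsed }
end

if __FILE__ == $PROGRAM_NAME
  # Stand-in for a real receive call: pretends one message takes 5 ms to fetch.
  fake_receive = lambda { sleep 0.005; 1 }
  result = measure_throughput(25, 1.0, &fake_receive)
  puts format("%d messages in %.2fs (%.0f msg/s)",
              result[:total], result[:elapsed], result[:rate])
end
```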
The Java AWS SDK seems to have many more optimization options in terms of pooling and available clients (such as AmazonSQSAsyncClient), which the Ruby AWS SDK doesn't seem to have. This page states that, when using the batch API from the Java SDK, the expected throughput is 2,500 messages per second.
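
As a back-of-envelope check on why batching might matter, the figures above can be turned into per-thread numbers. The sketch below rests on two loud assumptions that the ticket does not confirm: that the Ruby test fetched one message per call, and that a batched call would take about the same time as a single-message call. SQS returns at most 10 messages per receive call, so 10 is used as the batch size. The helper names are hypothetical.

```ruby
# Messages per second handled by each thread.
def per_thread_rate(total_rate, threads)
  total_rate.to_f / threads
end

# Implied wall-clock seconds per receive call, per thread.
def seconds_per_call(total_rate, threads, messages_per_call)
  messages_per_call / per_thread_rate(total_rate, threads)
end

# Projected total rate if only the batch size changes (ASSUMES that the time
# per call stays constant, which is an idealization).
def projected_rate(total_rate, current_batch, new_batch)
  total_rate.to_f * new_batch / current_batch
end

if __FILE__ == $PROGRAM_NAME
  observed = 215 # msg/s, from the Ruby SDK test above
  threads = 25
  puts format("per thread: %.1f msg/s", per_thread_rate(observed, threads))
  puts format("implied time per call: %.0f ms",
              seconds_per_call(observed, threads, 1) * 1000)
  puts format("idealized rate with batches of 10: %.0f msg/s",
              projected_rate(observed, 1, 10))
end
```

The projection is an upper-bound style estimate under those assumptions, not a measurement.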
I'll try to integrate Logstash with the Java AWS SDK and see what kind of throughput we get with the Java SQS client...

Description
Logstash SQS input performance seems to be very slow. The best numbers we observed from the Metrics filter output are:
~80-90 messages per second (use_ssl => true); we do need to use SSL
~200 messages per second (use_ssl => false)
I'd expect these numbers to be significantly higher...
Here's what I've tried to increase throughput:
Increased JVM heap (to 1G, 3G, 6G...)
Increased "threads" in SQS input configuration for Logstash (to 50, 100, 300, 500...)
Increased filter thread pool via the "-w" flag (to 20, 50, 100, 300, 500...)
Increased output thread pool (workers/threads settings) (to 20, 50, 100...)
Also tried defining only the SQS input (no filter, no output); the SQS queue counts still decrease at approximately the same rate
Environment:
m3.xlarge EC2 instance on AWS (64-bit, vCPU: 4, ECU: 13, 15 GB RAM, 2 x 40 GB SSD instance storage, Network Performance: High)
CentOS_6.3_x64_v5.8.8
JDK 1.7.0_51
Logstash: 1.3.3
Logstash agent command:
logstash.conf: