
Performance Tuning

Ambud edited this page Jan 1, 2017 · 3 revisions

Performance tuning is an area of continuous enhancement in Linea.

Tuning Basics

Linea is essentially a producer-consumer system. The largest throughput gains come from tuning the ratio of the number of producers to the number of consumers.

If producers generate more data than the consumers can keep up with, throughput will drop. Throughput will also suffer from context switching if producers aren't generating enough data for the number of consumers running.

These ratios are expressed as Spout and Bolt parallelism, together with the number of workers running. If your Spout is too fast compared to the downstream Bolts, adding more Bolts will help process more events, provided that your Bolts are not locking on, or contending for, an external resource.
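The effect of the producer/consumer ratio can be explored outside Linea with a small, self-contained sketch. It does not use any Linea API: the class name RatioSketch, the bounded ArrayBlockingQueue, and the poison-pill shutdown are all assumptions made for illustration, and a plain blocking queue will not reproduce Linea's actual numbers. It only shows how to vary the two thread counts and compare events per second.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.atomic.AtomicLong;

public class RatioSketch {
    // Sentinel telling a consumer to stop; real events are non-negative.
    private static final int POISON = -1;

    /** Runs producers and consumers over one bounded queue; returns the number of events consumed. */
    public static long run(int producers, int consumers, int eventsPerProducer) {
        BlockingQueue<Integer> queue = new ArrayBlockingQueue<>(1024);
        AtomicLong consumed = new AtomicLong();
        List<Thread> producerThreads = new ArrayList<>();
        List<Thread> consumerThreads = new ArrayList<>();

        for (int c = 0; c < consumers; c++) {
            consumerThreads.add(new Thread(() -> {
                try {
                    // take() blocks when producers are too slow for this consumer
                    while (queue.take() != POISON) {
                        consumed.incrementAndGet();
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        for (int p = 0; p < producers; p++) {
            producerThreads.add(new Thread(() -> {
                try {
                    for (int i = 0; i < eventsPerProducer; i++) {
                        queue.put(i); // blocks when consumers fall behind
                    }
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }));
        }
        try {
            for (Thread t : consumerThreads) t.start();
            for (Thread t : producerThreads) t.start();
            for (Thread t : producerThreads) t.join();
            // All events are queued by now, so one pill per consumer ends the run.
            for (int c = 0; c < consumers; c++) queue.put(POISON);
            for (Thread t : consumerThreads) t.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for workers", e);
        }
        return consumed.get();
    }

    public static void main(String[] args) {
        int[][] ratios = {{1, 1}, {2, 1}, {1, 2}, {2, 4}};
        for (int[] r : ratios) {
            long start = System.nanoTime();
            long events = run(r[0], r[1], 1_000_000);
            double seconds = (System.nanoTime() - start) / 1e9;
            System.out.printf("producers=%d consumers=%d eps=%.0f%n", r[0], r[1], events / seconds);
        }
    }
}
```

Running main prints one line per ratio. The absolute figures depend on the machine, so only the relative differences between ratios are meaningful.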

A profiler is the best tool for detecting misconfigured ratios. If you are bottlenecked on producers because there are not enough consumers, you will see the MultiProducer sequencer.get method consuming CPU cycles. If consumers aren't getting enough data, you will see the BlockWaitStrategy.wait method consuming CPU cycles.
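When a full profiler is not at hand, a rough picture of where threads spend their time can be had by sampling thread dumps from inside the JVM. The sketch below is an assumption-laden stand-in, not a replacement for a profiler: TopFrameSampler is a hypothetical class name, and it counts how often a method appears as the top stack frame (wall-clock presence), which is not the same as CPU time. It uses only the standard java.lang.management API.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.Map;
import java.util.TreeMap;

public class TopFrameSampler {

    /** Takes the given number of thread dumps and counts how often each method is the top stack frame. */
    public static Map<String, Integer> sample(int samples, long intervalMillis) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        Map<String, Integer> hits = new TreeMap<>();
        for (int s = 0; s < samples; s++) {
            // false, false: skip lock details, which are not needed for this count
            for (ThreadInfo info : threads.dumpAllThreads(false, false)) {
                StackTraceElement[] stack = info.getStackTrace();
                if (stack.length == 0) {
                    continue; // thread had no Java frames at this instant
                }
                String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
                hits.merge(frame, 1, Integer::sum);
            }
            try {
                Thread.sleep(intervalMillis);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return hits; // stop sampling early but keep what was collected
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        // Print every top frame seen across 50 samples taken 20 ms apart.
        sample(50, 20).forEach((frame, count) -> System.out.println(count + "\t" + frame));
    }
}
```

Called from a thread inside a running topology, frames that dominate the counts point at the same producer-side or consumer-side waits that a profiler would highlight.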

Linea is designed to have workers run on multiple nodes. Running more workers on the same node will lower performance, since networking between workers requires serialization of data.

Going distributed is not recommended unless you are running into severe scaling problems on a single node.
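To get a feel for why serialization hurts, the sketch below round-trips an event through bytes and compares that with handing the same object over by reference. It uses standard java.io serialization purely as an assumption: this page does not say which serializer Linea uses, and SerializationCostSketch and the sample event fields are hypothetical. The timing in main is a crude loop, not a proper benchmark.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;
import java.util.HashMap;

public class SerializationCostSketch {

    /** Encodes an event to bytes, as would be needed before sending it to another worker. */
    public static byte[] serialize(Serializable event) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(buffer)) {
            out.writeObject(event);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    /** Decodes bytes back into an event, as the receiving worker would. */
    public static Object deserialize(byte[] bytes) {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Full cost paid per event when it crosses a worker boundary in this sketch. */
    public static Object roundTrip(Serializable event) {
        return deserialize(serialize(event));
    }

    public static void main(String[] args) {
        HashMap<String, Object> event = new HashMap<>();
        event.put("host", "node-1"); // hypothetical event fields
        event.put("value", 42L);

        int iterations = 200_000;
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            roundTrip(event);
        }
        double seconds = (System.nanoTime() - start) / 1e9;
        System.out.printf("serialized round trips per second: %.0f%n", iterations / seconds);
    }
}
```

Passing the same map between threads inside one worker costs only a reference copy, which is why the single-worker path avoids this overhead entirely.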

Tuning Strategy

The best way to tune performance in Linea is to start with a single-worker topology and use a profiler to benchmark Spouts and Bolts and identify bottlenecks.

Since Linea is just a plain Java process, you can launch topologies directly from your IDE and run JVisualVM or other profiling tools to analyze performance.

Benchmarks

The following numbers were captured over several runs of the Example topology:

  • Single node: ~1M EPS (parallelism of 2 for each of spout, acker and bolt)
  • 2 nodes: ~300K EPS (parallelism of 1 for each of spout, acker and bolt; 2 workers)

EPS = Events per second

Since Linea requires acking on all events, these numbers represent end-to-end processing throughput as measured at the Spout.
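One way to read "end-to-end throughput measured from the Spout" is acked events divided by elapsed time. The helper below is a minimal sketch of that arithmetic under that assumption. AckedThroughput and onAck are hypothetical names, not part of Linea, and how the benchmark above actually collected its numbers is not stated on this page.

```java
import java.util.concurrent.atomic.AtomicLong;

public class AckedThroughput {
    private final AtomicLong acked = new AtomicLong();
    private final long startNanos;

    /** startNanos is the System.nanoTime() reading taken when the spout starts emitting. */
    public AckedThroughput(long startNanos) {
        this.startNanos = startNanos;
    }

    /** Call once for every event whose ack has come back to the spout. */
    public void onAck() {
        acked.incrementAndGet();
    }

    /** Events per second: acked events divided by elapsed seconds. */
    public double eps(long nowNanos) {
        double seconds = (nowNanos - startNanos) / 1e9;
        return seconds <= 0 ? 0.0 : acked.get() / seconds;
    }
}
```

Counting acks rather than emits means an event contributes to EPS only after every downstream Bolt has finished with it.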

Benchmark Environment:
Hardware: Intel Core i7 6850K (6C 12T, 15MB L3 Cache) 3.6GHz, 32GB DDR4 3000MHz RAM
Software: Java 8 u111, 4GB Heap / worker

References

Here are a few useful references for further understanding multi-threading in Java:

LMAX Disruptor: https://lmax-exchange.github.io/disruptor/files/Disruptor-1.0.pdf

Tools

JVisualVM: http://docs.oracle.com/javase/6/docs/technotes/tools/share/jvisualvm.html
