LogIsland is an event mining platform based on Spark and Kafka, built to handle huge volumes of log files (https://github.com/Hurence/logisland). The framework alleviates the burden of deploying and managing complex stream processing applications at big data scale. It works especially well in conjunction with Apache NiFi, which can route the raw data into a Kafka topic; LogIsland then streams the data through its distributed processors (parsers, complex analysers, aggregators, alerters), which generate events that go into other Kafka topics for further asynchronous processing. The strength of the solution is its high event throughput on only a few nodes and the ability to write complex distributed processing plugins in a few lines. The presentation will show the framework at work on a Hadoop cluster with a stream search percolator and an outlier detection processor.
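
The topic-to-topic flow described above can be illustrated with a small, self-contained sketch. It is not LogIsland's API: the in-memory ``topics`` dictionary stands in for Kafka, and every name in it (``parse_log_line``, ``detect_outlier``, ``run_pipeline``, the topic names, the log line layout, the fixed latency threshold) is a hypothetical chosen for illustration only. It shows a parser stage turning raw lines into events and an alerter stage writing flagged events to a separate topic.

```python
import json
from collections import defaultdict

# Hypothetical in-memory stand-in for Kafka: topic name -> list of messages.
topics = defaultdict(list)


def parse_log_line(line):
    """Parser stage: turn a raw 'host status latency_ms' line into an event dict."""
    host, status, latency = line.split()
    return {"host": host, "status": int(status), "latency_ms": float(latency)}


def detect_outlier(event, threshold_ms=1000.0):
    """Alerter stage: flag events whose latency exceeds a fixed threshold.

    A fixed threshold is an assumption made to keep the sketch short;
    it is not the outlier detection method used by LogIsland.
    """
    if event["latency_ms"] > threshold_ms:
        return {"type": "alert", "host": event["host"],
                "latency_ms": event["latency_ms"]}
    return None


def run_pipeline(raw_topic="raw_logs", event_topic="events", alert_topic="alerts"):
    """Read raw lines from one topic and write events and alerts to two others."""
    for line in topics[raw_topic]:
        event = parse_log_line(line)
        topics[event_topic].append(json.dumps(event))
        alert = detect_outlier(event)
        if alert is not None:
            topics[alert_topic].append(json.dumps(alert))


# Two raw log lines, as an upstream router such as NiFi might deliver them.
topics["raw_logs"].extend(["web1 200 35.0", "web2 500 2400.0"])
run_pipeline()
```

After the run, ``topics["events"]`` holds one JSON event per input line and ``topics["alerts"]`` holds only the event whose latency exceeded the threshold. In a real deployment each stage would run as a distributed processor between Kafka topics rather than as a local loop.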

.. image:: https://readthedocs.org/projects/logisland/badge/?version=latest
   :target: https://logisland.readthedocs.io/en/latest/?badge=latest
   :alt: Documentation Status