Elastic Search (ELK)

Published Wed, Jan 2, 2019

Estimated reading time: 3 min

Introduction

Elastic Search is used when you want quickly search through lots of data in near real time. You can find it on https://www.elastic.co/. It provides distributed system for quering and filtering stored data through API and is compatible with JSON out of the box. I used Apache Solr in the past when I was working on projects that needed full text search capabilities. But because because Elastic Search is able not to just search text but easily query, filter and group data I quickly switched. It also much easily able to scale and expand the cluster (automatic shard-rebalancing).

Elastic Search is designed to run in cluster which consists from Nodes and has clross-cluster replication feature which is used when main cluster is down and you need a hot backup. A database in RDBMS view is called Index which in short means a collection of documents with similar characteristics. The Document is like row in RDBMS, and is basically a JSON file.

Shards

When you create an index, you can simply define the number of shards you want. Each shard is fully-functional “index” that can be hosted on different nodes in the cluster which allows to horizontally split data volume and make parallelization/high availability to happen.

ELK

Search engine is just a core, what you also commonly get is the full stack - Logstash and Kibana and these combination is called ELK.

Kibana

From visualization point of view - it is similiar to Grafana which I quite heavily use almost on every project.

It lets you visualize Elasticsearch data - there are number of classic charts (histograms, line and pie charts etc.) and it also contains Geo data vizualization. You can also chart data with relationships (drill-downs). Please find more details on https://www.elastic.co/kibana.

Logstash

Server-side data processing pipeline which is able to ingest data from multiple data sources, transform it and then send it to Elastic Search to store it - and query it. There are bundle of filters available making a job easy for you - like for example anonymization of data.

Please find more details on https://www.elastic.co/logstash

Summary

In a way, Logstash is similar to Prometheus when you want to store metrics and monitor the application. But when you need something more than just a metrics - like searching, it is better to start using ELK. And once you do, it makes sense to use it also for metrics gathering. The benefit of Prometheus-Kibana solution is low HW requirements in comparison to ELK.

Usecases

Fast Searchable datastore
Vizualization for existing databases or services buses (Apache Kafka)
Subscriptions to event - alerting and conditional watch capabilities
Metrics, monitoring and analytics capabilities (keep in mind a bit lightweight Prometheus)