Introduction

ES|QL is a new query language for Elasticsearch. It is the unified language for all kinds of use cases like simple queries, aggregations, performing correlations, finding logs, etc. It provides simple easy syntax to perform complex queries. If you come from SQL background, You going to find this very handy.

It is a piped separated langugage with a combination of source commands and process commands. The Elasticsearch Query Language (ES|QL) makes use of “pipes” (|) to manipulate and transform data in a step-by-step fashion. This means output of the first step will go as an input for second step.

ES|QL is more than langugage. The execution engine is developed by considering performance in mind. Here ES|QL is not going to convert into Query DSL instead it will be directly executed within Elasticsearch. It operates on blocks at a time instead of per row, targets vectorization and cache locality, and embraces specialization and multi-threading.

ES|QL - Filter, Transform and Analyze

Example

Below is a few examples of ES|QL. I am considering you have an Elasticsearch and kibana is installed and running. Please import the sample dataset (Sample web logs) from kibana. Navigate in side menu -> Management -> Dev Tools to perform the below query.

Source commands

FROM

# Format

POST /_query?format=csv
{
  "query": """
    from kibana_sample_data_logs
  """
}

ROW

POST _query?format=txt
{
  "query":"""
    row a = "Mozilla/5.0 (X11; Linux x86_64; rv:6.0a1) Gecko/20110421 Firefox/6.0a1"
    | dissect a "%{browser}/%{version}"
    | keep browser
  """
}

SHOW

POST /_query?format=txt
{
  "query": """
    show info
  """
}

Processing commands

keep

POST _query?format=txt
{
  "query": """
    from kibana_sample_data_logs
    | keep @timestamp, clientip, host, tags, bytes
  """
}

where, limit, sort

POST _query?format=txt
{
  "query": """
    from kibana_sample_data_logs
    | keep @timestamp, clientip, host, tags, bytes
    | where bytes > 1000
    | sort bytes desc
    | limit 5
  """
}

like

POST _query?format=txt
{
  "query":"""
    FROM sample_data
    | where message like "*error*"
  """
}

GROK, STATS…BY

POST _query?format=txt
{
  "query":"""
    from kibana_sample_data_logs
    | grok agent "%{WORD:browser}/%{NUMBER:version}"
    | keep browser, version, @timestamp
  """
}

POST _query?format=txt
{
  "query":"""
    from kibana_sample_data_logs
    | grok agent "%{WORD:browser}/%{NUMBER:version}"
    | keep browser, version, @timestamp
    | stats count(*) by version
  """
}

DISSECT

POST _query?format=txt
{
  "query": """
    from kibana_sample_data_logs 
    | dissect message "%{ip} - - %{time} %{web_call} "
    | keep ip, time, web_call
  """
}

EVAL

POST /_query?format=txt
{
  "query": """
    from kibana_sample_data_logs
    | eval t = replace(agent, "Mozilla", "Chrome")
    | eval l = length(agent)
    | dissect agent "%{browser}/%{version} "
    | eval lt = left(t, 6)
    | keep lt, browser, version, t, l
  """
}

You can check more string functions.

Data enrichment (ENRICH)


# Create mappings

PUT lang
{
  "mappings": {
    "properties": {
      "lang_id": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      }
    }
  }
}

PUT devs
{
  "mappings": {
    "properties": {
      "lang_id": {
        "type": "keyword"
      },
      "name": {
        "type": "keyword"
      }
    }
  }
}

# Create index "lang"

PUT lang/_bulk
{ "index" : {}}
{ "lang_id": "1x", "name": "java" }
{ "index" : {}}
{ "lang_id": "2x", "name": "php" }
{ "index" : {}}
{ "lang_id": "3x", "name": "node" }
{ "index" : {}}
{ "lang_id": "4x", "name": "python" }
{ "index" : {}}
{ "lang_id": "5x", "name": "ruby" }

# create index "devs"

PUT devs/_bulk
{ "index" : {}}
{ "lang_id": "5x", "developer": "bob" }
{ "index" : {}}
{ "lang_id": "3x", "developer": "mark" }
{ "index" : {}}
{ "lang_id": "1x", "developer": "max" }
{ "index" : {}}
{ "lang_id": "2x", "developer": "david" }
{ "index" : {}}
{ "lang_id": "4x", "developer": "ashish" }


# Create enrich policy

PUT /_enrich/policy/dev_lang
{
  "match": {
    "indices": "lang",
    "match_field": "lang_id",
    "enrich_fields": ["name"]
  }
}

PUT /_enrich/policy/dev_lang/_execute?wait_for_completion=true

POST _query?format=txt
{
  "query":"""
    from devs
    | keep lang_id, name, developer
    | enrich dev_lang on lang_id with name
  """
}