tontinton/miso

Introduction

A query engine over semi-structured (JSON) logs.

Similar to Trino, but it doesn't require a table's schema (columns and types) before executing a query.

While Trino receives SQL and starts returning results only once the entire query finishes (batch ETL), miso's query API receives an AST-like representation of the query plan (so that any query language can be used on the frontend) and streams the results back using server-sent events, SSE (stream ETL).
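To make the streaming side concrete, here is a minimal sketch of how a client might consume an SSE response line by line. It follows the generic SSE wire format (`data:` fields, blank line between events); the assumption that each event carries exactly one JSON document is mine, not something this README specifies, and `iter_sse_data` / `iter_documents` are hypothetical helper names.

```python
import json
from typing import Any, Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete SSE event.

    Only the generic SSE framing is handled: `data:` fields are collected
    and an empty line ends the event. Other fields (event:, id:, retry:)
    are ignored, and an unterminated trailing event is dropped.
    """
    buffer: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith(":"):
            # Comment line, commonly used as a keep-alive.
            continue
        elif line.startswith("data:"):
            value = line[len("data:"):]
            # A single leading space after the colon is not part of the value.
            if value.startswith(" "):
                value = value[1:]
            buffer.append(value)


def iter_documents(lines: Iterable[str]) -> Iterator[Any]:
    """ASSUMPTION: every event's data is one JSON document."""
    for payload in iter_sse_data(lines):
        yield json.loads(payload)
```

Any iterable of text lines works as input, for example the decoded lines of an HTTP response body, so results can be processed as they arrive rather than after the query completes.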

It supports the same optimization-based predicate pushdown mechanism as Trino: a query transpiles as many of its steps as the connector supports into the connector's own query language. Fewer documents are then returned over the network (which is usually the bottleneck), so queries return much faster.
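The pushdown idea can be illustrated with a toy sketch. This is not miso's optimizer: `split_pushdown` and the `supported` capability set are hypothetical, and the only thing borrowed from this README is the shape of a plan as a list of single-key step objects. The sketch pushes the longest prefix of steps the connector can run and leaves the rest for the engine.

```python
from typing import Any

Step = dict[str, Any]


def step_name(step: Step) -> str:
    """Return the operator name of a single-key step such as {"scan": [...]}."""
    return next(iter(step))


def split_pushdown(steps: list[Step], supported: set[str]) -> tuple[list[Step], list[Step]]:
    """Split a plan into (pushed_to_connector, run_by_engine).

    The leading scan always goes to the connector. Following steps are
    pushed down until the first one the connector does not support; from
    there on, everything runs in the engine so step order is preserved.
    """
    if not steps or step_name(steps[0]) != "scan":
        raise ValueError("plan must start with a scan step")
    pushed = [steps[0]]
    for i, step in enumerate(steps[1:], start=1):
        if step_name(step) not in supported:
            return pushed, steps[i:]
        pushed.append(step)
    return pushed, []


# Hypothetical connector that can translate filters and sorts, but not summarize.
plan = [
    {"scan": ["localqw", "hdfs1"]},
    {"filter": {"gt": ["tenant_id", "10"]}},
    {"summarize": {"aggs": {"count": "count"}, "by": ["severity_text"]}},
    {"sort": [{"by": "count", "order": "desc"}]},
]
pushed, remaining = split_pushdown(plan, supported={"filter", "sort"})
```

Here the scan and filter are handed to the connector, while the summarize and the sort after it stay in the engine. The sort is supported in isolation but cannot be moved ahead of the summarize it depends on.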

Here's an example of a query supported today by miso (localqw is a Quickwit connector to localhost:7280/):

# scan localqw.hdfs1
# | union (scan localqw.hdfs2)
# | summarize
#     min_tenant = min(tenant_id)
#     max_tenant = max(tenant_id)
#     count = count()
#   by timestamp, severity_text 
# | join (
#     scan localqw.stackoverflow
#     | where questionId > 80
#   ) on min_tenant, questionId
# | order by count desc;

# The -N flag disables curl's output buffering, so SSE events are printed as they arrive.
curl -N -H 'Content-Type: application/json' localhost:8080/query -d '{
  "query": [
    { "scan": ["localqw", "hdfs1"] },
    { "union": [{ "scan": ["localqw", "hdfs2"] }] },
    {
      "summarize": {
        "aggs": {
          "min_tenant": {"min": "tenant_id"},
          "max_tenant": {"max": "tenant_id"},
          "count": "count"
        },
        "by": ["timestamp", "severity_text"]
      }
    },
    {
      "join": [
        {"on": ["min_tenant", "questionId"]},
        [
          { "scan": ["localqw", "stackoverflow"] },
          { "filter": {"gt": ["questionId", "80"]} }
        ]
      ]
    },
    { "sort": [{"by": "count", "order": "desc"}] }
  ]
}'
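
The same request body can also be assembled programmatically, which avoids hand-editing JSON inside a shell string. The following is a sketch using only the Python standard library; `build_query_request` is a hypothetical helper, and the `http://` scheme on `localhost:8080/query` is an assumption. The step objects mirror the curl example above, including the string literal "80" in the filter.

```python
import json
import urllib.request

# ASSUMPTION: the server from the curl example is reachable over plain HTTP.
MISO_URL = "http://localhost:8080/query"

QUERY_STEPS = [
    {"scan": ["localqw", "hdfs1"]},
    {"union": [{"scan": ["localqw", "hdfs2"]}]},
    {
        "summarize": {
            "aggs": {
                "min_tenant": {"min": "tenant_id"},
                "max_tenant": {"max": "tenant_id"},
                "count": "count",
            },
            "by": ["timestamp", "severity_text"],
        }
    },
    {
        "join": [
            {"on": ["min_tenant", "questionId"]},
            [
                {"scan": ["localqw", "stackoverflow"]},
                {"filter": {"gt": ["questionId", "80"]}},
            ],
        ]
    },
    {"sort": [{"by": "count", "order": "desc"}]},
]


def build_query_request(steps: list, url: str = MISO_URL) -> urllib.request.Request:
    """Wrap the steps in {"query": [...]} and build a POST request (nothing is sent)."""
    body = json.dumps({"query": steps}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(QUERY_STEPS)
```

Passing `request` to `urllib.request.urlopen` would send it; the response object can then be iterated line by line to read the event stream as it arrives.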
