Description
Take a look at this test I ran against Elassandra with Python.
I wrote a function that queries data through the Cassandra driver, paging manually:
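from datetime import timedelta
from timeit import default_timer as timer  # assumed source of timer()

import pandas as pd
from cassandra import ConsistencyLevel
from cassandra.query import SimpleStatement

# `session` is assumed to be an already-connected cassandra.cluster.Session.
def process_query_cassandra(query, fetch_size=5000, consistency_level=ConsistencyLevel.LOCAL_ONE):
    start = timer()
    paging_state = None
    rows = []
    # The statement is identical for every page, so build it once.
    statement = SimpleStatement(query, fetch_size=fetch_size, consistency_level=consistency_level)
    while True:
        # Fetch one page, resuming from the previous page's paging state.
        results = session.execute(statement, paging_state=paging_state)
        paging_state = results.paging_state
        rows.extend(results.current_rows)
        # A paging state of None means the last page has been read.
        if paging_state is None:
            break
    df = pd.DataFrame(rows)
    end = timer()
    return df, timedelta(seconds=end - start)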
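To rule out the client-side loop as the source of any row loss, the same paging pattern can be exercised without a cluster. The following is only a sketch: `collect_all_pages` and `make_fake_source` are hypothetical helpers, not part of the driver, and the fake source merely imitates a server that returns a page of rows plus a state that is `None` on the last page.

```python
def collect_all_pages(fetch_page):
    """Drain a paged source.

    fetch_page(paging_state) must return (rows, next_state), where
    next_state is None once the last page has been returned.
    """
    rows, paging_state = [], None
    while True:
        page, paging_state = fetch_page(paging_state)
        rows.extend(page)
        if paging_state is None:
            return rows


def make_fake_source(data, page_size):
    """Build a fetch_page function that serves `data` in fixed-size pages."""
    def fetch_page(paging_state):
        start = paging_state or 0  # None means "start from the beginning"
        end = start + page_size
        next_state = end if end < len(data) else None
        return data[start:end], next_state
    return fetch_page


data = list(range(12))
rows = collect_all_pages(make_fake_source(data, page_size=5))
print(len(rows), rows == data)
```

If a loop of this shape returns every row from a well-behaved source, a shortfall is more likely to come from what the server returns per page than from the way pages are collected.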
Table f0101 has 872390 rows.
When I query using CQL only, results are OK:
query1 = """
select *
from "dlfinjdep"."f0101"
ALLOW FILTERING
"""
Running Cassandra #1
(22-06-01 12:43) Rows: 872390 seconds: 0:03:17.609349
Running Cassandra #2
(22-06-01 12:46) Rows: 872390 seconds: 0:03:04.289089
However, when I query the Elasticsearch index through CQL, I get fewer rows, and the count differs between runs:
query2 = """
select *
from "dlfinjdep"."f0101"
WHERE es_query='{"query":{"match_all":{}}}'
AND es_options='indices=dlfinjdep-f0101-index'
ALLOW FILTERING
"""
Running Elastic #1
(22-06-01 12:50) Rows: 841350 seconds: 0:03:49.136313
Running Elastic #2
(22-06-01 12:54) Rows: 834372 seconds: 0:03:33.985948
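A row count alone cannot tell whether rows were simply dropped or whether some were dropped and others returned twice. One way to narrow it down is to compare the primary keys returned by the two queries. The following is a minimal sketch, not part of the original test: `compare_keys` is a hypothetical helper, and the toy tuples stand in for the table's real primary key values, which are not shown here.

```python
from collections import Counter


def compare_keys(expected_keys, actual_keys):
    """Compare two iterables of hashable primary-key values.

    Returns (missing, duplicated):
      missing    - keys in expected_keys that never appear in actual_keys
      duplicated - keys that appear more than once in actual_keys
    """
    actual_counts = Counter(actual_keys)
    missing = sorted(set(expected_keys) - set(actual_counts))
    duplicated = sorted(k for k, n in actual_counts.items() if n > 1)
    return missing, duplicated


# Toy data standing in for the key columns of the two result sets.
cql_keys = [(1,), (2,), (3,), (4,)]
es_keys = [(1,), (3,), (3,)]

missing, duplicated = compare_keys(cql_keys, es_keys)
print("missing:", missing)
print("duplicated:", duplicated)
```

With the DataFrames returned by process_query_cassandra, the key tuples could be obtained with something like `df[key_columns].itertuples(index=False, name=None)`, where `key_columns` is a list of the table's primary key column names (an assumption, since the schema of f0101 is not given).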