KV-STK is an efficient and scalable framework for spatio-temporal keyword (STK) queries.
- Efficient: KV-STK proposes a hybrid index that combines an in-memory index with an on-disk index to answer STK queries on key-value data stores. The hybrid index keeps both querying and insertion efficient.
- Scalable: KV-STK designs a framework for managing in-memory filters that supports loading and evicting filters within a given memory threshold, so memory use stays bounded as the number of filters grows.
KV-STK consists of three main modules: Key Generator, Filter, and Filter Manager.
Key Generator converts each object into a one-dimensional key and each query into multiple key ranges. The main code is in the org.urbcomp.startdb.stkq.keyGenerator package. The Hilbert curve implementation comes from https://github.com/davidmoten/hilbert-curve.
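To illustrate the idea of mapping a spatio-temporal object to a one-dimensional key, here is a minimal, self-contained sketch. It does not use the davidmoten library or the actual KV-STK classes: the class name `HilbertKeySketch`, the method names, the grid resolution, and the key layout (hour bucket in the high bits, Hilbert index in the low bits) are all assumptions made for illustration only.

```java
// Sketch only: class, method names, and key layout are hypothetical,
// not the KV-STK implementation.
final class HilbertKeySketch {

    // Map a coordinate in [min, max] to a grid cell in [0, 2^bits - 1].
    public static int toCell(double value, double min, double max, int bits) {
        int n = 1 << bits;
        int cell = (int) Math.floor((value - min) / (max - min) * n);
        return Math.max(0, Math.min(n - 1, cell));
    }

    // Classic iterative mapping from cell (x, y) to its distance along a
    // Hilbert curve that covers a 2^bits x 2^bits grid.
    public static long hilbertIndex(int bits, int x, int y) {
        int n = 1 << bits;
        long d = 0;
        for (int s = n / 2; s > 0; s /= 2) {
            int rx = (x & s) > 0 ? 1 : 0;
            int ry = (y & s) > 0 ? 1 : 0;
            d += (long) s * s * ((3 * rx) ^ ry);
            // Rotate/flip the quadrant so the next level is oriented correctly.
            if (ry == 0) {
                if (rx == 1) {
                    x = n - 1 - x;
                    y = n - 1 - y;
                }
                int t = x;
                x = y;
                y = t;
            }
        }
        return d;
    }

    // Assumed layout: time bucket in the high bits, Hilbert index in the low bits.
    public static long key(long hourBucket, int bits, int x, int y) {
        return (hourBucket << (2 * bits)) | hilbertIndex(bits, x, y);
    }

    public static void main(String[] args) {
        int bits = 16;                                   // assumed grid resolution
        int x = toCell(-73.98, -180.0, 180.0, bits);     // longitude
        int y = toCell(40.75, -90.0, 90.0, bits);        // latitude
        long hourBucket = 1_700_000_000L / 3600;         // epoch seconds -> hours
        System.out.println(key(hourBucket, bits, x, y));
    }
}
```

Because nearby cells tend to receive nearby Hilbert indexes, a rectangular query window can usually be covered by a small number of contiguous key ranges, which is what makes range scans on a key-value store practical.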
Filter narrows down the key ranges produced by Key Generator. The main code is in the org.urbcomp.startdb.stkq.filter package. The InfiniFilter implementation is based on the original author's GitHub repository: https://github.com/nivdayan/FilterLibrary.
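The narrowing step can be sketched as follows. This example uses a plain Bloom filter as a stand-in for InfiniFilter, and it assumes that the filter stores (key, keyword) pairs; both choices, along with the class name `RangeFilterSketch` and its methods, are illustrative assumptions rather than the KV-STK design.

```java
import java.util.ArrayList;
import java.util.BitSet;
import java.util.List;

// Sketch only: a Bloom filter stand-in, not InfiniFilter and not KV-STK code.
final class RangeFilterSketch {
    private final BitSet bits;
    private final int size;
    private final int hashes;

    public RangeFilterSketch(int size, int hashes) {
        this.bits = new BitSet(size);
        this.size = size;
        this.hashes = hashes;
    }

    // Double hashing: derive the i-th bit position from two base hashes.
    private int position(long key, String keyword, int i) {
        int h1 = 31 * Long.hashCode(key) + keyword.hashCode();
        int h2 = (Integer.rotateLeft(h1, 16) * 0x9E3779B9) | 1;
        return Math.floorMod(h1 + i * h2, size);
    }

    public void add(long key, String keyword) {
        for (int i = 0; i < hashes; i++) {
            bits.set(position(key, keyword, i));
        }
    }

    // May return false positives, never false negatives.
    public boolean mightContain(long key, String keyword) {
        for (int i = 0; i < hashes; i++) {
            if (!bits.get(position(key, keyword, i))) {
                return false;
            }
        }
        return true;
    }

    // Keep only the keys in [lo, hi] that may hold the keyword and merge
    // consecutive survivors back into ranges.
    public List<long[]> narrow(long lo, long hi, String keyword) {
        List<long[]> out = new ArrayList<>();
        for (long k = lo; k <= hi; k++) {
            if (!mightContain(k, keyword)) {
                continue;
            }
            if (!out.isEmpty() && out.get(out.size() - 1)[1] == k - 1) {
                out.get(out.size() - 1)[1] = k;
            } else {
                out.add(new long[] {k, k});
            }
        }
        return out;
    }

    public static boolean covers(List<long[]> ranges, long key) {
        for (long[] r : ranges) {
            if (r[0] <= key && key <= r[1]) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        RangeFilterSketch filter = new RangeFilterSketch(1 << 16, 3);
        filter.add(10L, "coffee");
        filter.add(11L, "coffee");
        for (long[] r : filter.narrow(0L, 100L, "coffee")) {
            System.out.println("[" + r[0] + ", " + r[1] + "]");
        }
    }
}
```

Since the filter never produces false negatives, the narrowed ranges always cover every key that actually holds the keyword; false positives only cost extra reads from the store.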
Filter Manager evicts “cold” filters from memory and loads the required filters from the key-value data store. The main code is in the org.urbcomp.startdb.stkq.filter.manager package.
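A minimal sketch of such a manager is shown below. It assumes a least-recently-used (LRU) policy for deciding which filters are "cold", represents each filter as a byte array, and uses an in-memory `Map` as a stand-in for the key-value store. The class name `FilterCacheSketch` and all of these choices are assumptions for illustration; the actual eviction policy in KV-STK may differ, and writing modified filters back to the store is omitted.

```java
import java.util.HashMap;
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch only: LRU policy and byte[] filters are assumptions, not KV-STK code.
final class FilterCacheSketch {
    private final long maxBytes;
    private long usedBytes = 0;
    private final Map<String, byte[]> store;  // stand-in for the key-value store
    // accessOrder = true: iteration goes from least to most recently used.
    private final LinkedHashMap<String, byte[]> inMemory =
            new LinkedHashMap<>(16, 0.75f, true);

    public FilterCacheSketch(long maxBytes, Map<String, byte[]> store) {
        this.maxBytes = maxBytes;
        this.store = store;
    }

    // Return the filter, loading it from the store on a miss.
    public byte[] get(String filterId) {
        byte[] filter = inMemory.get(filterId);  // a hit marks it as recently used
        if (filter != null) {
            return filter;
        }
        filter = store.get(filterId);
        if (filter == null) {
            return null;
        }
        inMemory.put(filterId, filter);
        usedBytes += filter.length;
        evictColdFilters();
        return filter;
    }

    // Drop least recently used filters until usage fits the threshold,
    // always keeping the filter that was just loaded.
    private void evictColdFilters() {
        Iterator<Map.Entry<String, byte[]>> it = inMemory.entrySet().iterator();
        while (usedBytes > maxBytes && inMemory.size() > 1 && it.hasNext()) {
            Map.Entry<String, byte[]> coldest = it.next();
            usedBytes -= coldest.getValue().length;
            it.remove();
        }
    }

    public boolean isInMemory(String filterId) {
        return inMemory.containsKey(filterId);
    }

    public static void main(String[] args) {
        Map<String, byte[]> store = new HashMap<>();
        store.put("a", new byte[4]);
        store.put("b", new byte[4]);
        store.put("c", new byte[4]);
        FilterCacheSketch cache = new FilterCacheSketch(8, store);
        cache.get("a");
        cache.get("b");
        cache.get("a");  // "a" is now more recently used than "b"
        cache.get("c");  // exceeds the threshold, so "b" is evicted
        System.out.println(cache.isInMemory("b"));
    }
}
```

An access-ordered `LinkedHashMap` keeps the bookkeeping small: a hit moves the filter to the end of the iteration order, so the coldest filter is always the first entry.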
The default query parameters are set according to the table below and can be adjusted for different data sets.
| Parameter | Range | Default |
|---|---|---|
| Spatial Query Range (km²) | 1x1, 2x2, 3x3, 4x4, 5x5 | 3x3 |
| Temporal Query Range (h) | 1, 2, 3, 4, 5 | 3 |
| Keywords Count | 1, 2, 3, 4, 5 | 3 |
The following resources need to be downloaded and installed:
- Java 8 download: https://www.oracle.com/java/technologies/downloads/#java8
- Scala download: https://www.scala-lang.org/download/2.12.4.html
- IntelliJ IDEA download: https://www.jetbrains.com/idea/
- Git download: https://git-scm.com/download
- Maven download: https://archive.apache.org/dist/maven/maven-3/
Download and install jdk-8, IntelliJ IDEA, and Git. IntelliJ IDEA bundles Maven for its Maven projects; to use your own Maven installation instead, change it in the settings.
- Open IntelliJ IDEA, find the Git menu, and select Clone...
- In the Repository URL dialog, select Git as the version control.
- Enter the URL: https://anonymous.4open.science/r/StreamingTrajSegment-683E.git
File -> Project Structure -> Project -> Project SDK -> add SDK
Click JDK and select the path where jdk-8 is located.
Please refer to https://www.jetbrains.com/help/idea/get-started-with-scala.html#new-scala-project-sbt
The main code for the experiments is in the org.urbcomp.startdb.stkq.experiments package. Before testing, complete the following steps:
- Edit the ZooKeeper address in org.urbcomp.startdb.stkq.io.HBaseUtil.
- Run org.urbcomp.startdb.stkq.preProcessing.QueryGenerator to generate the queries. You can adjust the path the queries are written to for later testing.
- Run org.urbcomp.startdb.stkq.preProcessing.BatchWrite to load the dataset into HBase.
The Yelp dataset can be found at the following link; the tweet dataset is not publicly available for the time being.