248 changes: 125 additions & 123 deletions docs-overrides/main.html
@@ -165,7 +165,7 @@ <h2 class="section-title">
Install SedonaDB, or run Sedona on distributed systems when you need additional scale.
</div>
<div class="bth-group">
-<a href="https://github.com/apache/sedona-db" class="btn btn-red">
+<a href="sedonadb" class="btn btn-red">
<span class="caption">
Install SedonaDB
</span>
@@ -176,6 +176,130 @@ <h2 class="section-title">
</div>
</section>

<!-- Section Deploy -->
<section class="section-deploy">
<div class="container">
<h2 class="section-title">Deploy Sedona where you need it</h2>
<div class="section-description editor">Choose the right runtime for your infrastructure, from local setups to distributed and cloud-native systems.</div>
<div class="info-grid">

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/desktop.svg" alt="" class="">
<div class="caption">
Local
</div>
</div>
<h3 class="info-item__title">SedonaDB</h3>
<div class="info-item__description editor">Standalone runtime for local processing and development.</div>
<div class="info-item__cta">
<a href="sedonadb" class="btn-link">
<span class="caption">SedonaDB</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/database.svg" alt="" class="">
<div class="caption">
Batch
</div>
</div>
<h3 class="info-item__title">SedonaSpark</h3>
<div class="info-item__description editor">Distributed batch processing on Apache Spark clusters.</div>
<div class="info-item__cta">
<a href="sedonaspark" class="btn-link">
<span class="caption">SedonaSpark</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/signal.svg" alt="" class="">
<div class="caption">
Streaming
</div>
</div>
<h3 class="info-item__title">SedonaFlink</h3>
<div class="info-item__description editor">Real-time spatial analytics using Apache Flink.</div>
<div class="info-item__cta">
<a href="sedonaflink" class="btn-link">
<span class="caption">SedonaFlink</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/snowflake.svg" alt="" class="">
<div class="caption">
Batch
</div>
</div>
<h3 class="info-item__title">SedonaSnow</h3>
<div class="info-item__description editor">
Native spatial support inside Snowflake environments.
</div>
<div class="info-item__cta">
<a href="sedonasnow" class="btn-link">
<span class="caption">SedonaSnow</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item info-item--wide">
<div class="info-item__tag">
<img src="image/home/deploy/cloud.svg" alt="" class="">
<div class="caption">
Cloud
</div>
</div>
<h3 class="info-item__title">Sedona in the Cloud</h3>
<div class="info-item__description editor">Integrated spatial support in your preferred cloud environment</div>
<div class="info-item__cta">
<a href="setup/overview/" class="btn-link">
<span class="caption">Explore the ecosystem</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

</div>
</div>
</section>


<!-- Section Industries Tabs -->
<section class="section-industries-tabs">
<div class="container">
@@ -378,128 +502,6 @@ <h3 class="feature-item__title">Familiar</h3>
</section>
-->

<!-- Section Deploy -->
<section class="section-deploy">
<div class="container">
<h2 class="section-title">Deploy Sedona where you need it</h2>
<div class="section-description editor">Choose the right runtime for your infrastructure, from local setups to distributed and cloud-native systems.</div>
<div class="info-grid">

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/desktop.svg" alt="" class="">
<div class="caption">
Local
</div>
</div>
<h3 class="info-item__title">SedonaDB</h3>
<div class="info-item__description editor">Standalone runtime for local processing and development.</div>
<div class="info-item__cta">
<a href="https://github.com/apache/sedona-db" class="btn-link">
<span class="caption">SedonaDB</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/database.svg" alt="" class="">
<div class="caption">
Batch
</div>
</div>
<h3 class="info-item__title">SedonaSpark</h3>
<div class="info-item__description editor">Distributed batch processing on Apache Spark clusters.</div>
<div class="info-item__cta">
<a href="setup/overview/" class="btn-link">
<span class="caption">SedonaSpark</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/signal.svg" alt="" class="">
<div class="caption">
Streaming
</div>
</div>
<h3 class="info-item__title">SedonaFlink</h3>
<div class="info-item__description editor">Real-time spatial analytics using Apache Flink.</div>
<div class="info-item__cta">
<a href="setup/flink/install-scala/" class="btn-link">
<span class="caption">SedonaFlink</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item">
<div class="info-item__tag">
<img src="image/home/deploy/snowflake.svg" alt="" class="">
<div class="caption">
Batch
</div>
</div>
<h3 class="info-item__title">SedonaSnow</h3>
<div class="info-item__description editor">
Native spatial support inside Snowflake environments.
</div>
<div class="info-item__cta">
<a href="setup/snowflake/install/" class="btn-link">
<span class="caption">SedonaSnow</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

<div class="info-item info-item--wide">
<div class="info-item__tag">
<img src="image/home/deploy/cloud.svg" alt="" class="">
<div class="caption">
Cloud
</div>
</div>
<h3 class="info-item__title">Sedona in the Cloud</h3>
<div class="info-item__description editor">Integrated spatial support in your preferred cloud environment</div>
<div class="info-item__cta">
<a href="setup/overview/" class="btn-link">
<span class="caption">Explore the ecosystem</span>
<span class="icon">
<svg width="19" height="19" viewBox="0 0 19 19" fill="none" xmlns="http://www.w3.org/2000/svg">
<path d="M3.95831 9.5H15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
<path d="M9.5 3.95825L15.0417 9.49992L9.5 15.0416" stroke="#CA463A" stroke-width="1.5" stroke-linecap="round" stroke-linejoin="round" />
</svg>
</span>
</a>
</div>
</div>

</div>
</div>
</section>

<!-- Section Community -->
<section class="section-community">
Binary file added docs/image/nyc_base_water.png
111 changes: 111 additions & 0 deletions docs/sedonaflink.md
@@ -0,0 +1,111 @@
<!--
Licensed to the Apache Software Foundation (ASF) under one
or more contributor license agreements. See the NOTICE file
distributed with this work for additional information
regarding copyright ownership. The ASF licenses this file
to you under the Apache License, Version 2.0 (the
"License"); you may not use this file except in compliance
with the License. You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing,
software distributed under the License is distributed on an
"AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
KIND, either express or implied. See the License for the
specific language governing permissions and limitations
under the License.
-->

# SedonaFlink

SedonaFlink integrates geospatial functions into Apache Flink, making it a good fit for streaming pipelines that process geospatial data.

Here are some example SedonaFlink use cases:

* Read geospatial data from Kafka and write to Iceberg
* [Analyze real-time traffic density](https://www.alibabacloud.com/help/en/flink/realtime-flink/use-cases/analyze-traffic-density-with-flink-and-apache-sedona)
* Real-time network planning and optimization for telecommunication

Here are some example code snippets:

=== "Java"

```java
sedona.createTemporaryView("myTable", tbl);
Table geomTbl = sedona.sqlQuery("SELECT ST_GeomFromWKT(geom_polygon) as geom_polygon, name_polygon FROM myTable");
geomTbl.execute().print();
```

=== "PyFlink"

```python
table_env.sql_query("SELECT ST_AsBinary(ST_Point(1.0, 2.0))").execute().collect()
```

## Key features

* **Real-time geospatial stream processing** for low-latency processing needs.
* **Scalable** processing suitable for large streaming pipelines.
* **Event time processing** with Flink’s time-windowing.
* **Exactly once** processing guarantees.
* **Portable** and easy to run in any Flink runtime.
* **Open source** and managed according to the Apache Software Foundation's guidelines.

## Why Sedona on Flink?

Flink is built for streaming data, and Sedona enhances it with geospatial functionality.

Most geospatial processing occurs in batch systems such as Spark or PostGIS, which is fine for use cases that can tolerate higher latency.

Sedona on Flink shines when you need to process geospatial data in real time.

Flink can deliver millisecond-level latency for geospatial queries.

Flink has solid fault tolerance, so your geospatial pipelines won't lose data, even when things break.

Sedona on Flink runs anywhere Flink runs, including Kubernetes, YARN, and standalone clusters.

## How it works

Sedona integrates directly into Flink's Table API and SQL engine.

You register Sedona's spatial functions when you set up your Flink environment. Then, you can use functions such as `ST_Point`, `ST_Contains`, and `ST_Distance` in your SQL queries.
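
To make the semantics of these functions concrete, here is a plain-Python sketch of what `ST_Point`, `ST_Distance`, and `ST_Contains` compute for planar coordinates. It is an illustration only, not how Sedona implements them: the helper names (`st_point`, `st_distance`, `st_contains`) are hypothetical, the polygon is a simple ring of vertices, and points lying exactly on the boundary are not handled specially.

```python
import math

def st_point(x, y):
    """Hypothetical stand-in for ST_Point: a 2D point as a tuple."""
    return (float(x), float(y))

def st_distance(a, b):
    """Planar (Euclidean) distance between two points, in coordinate units."""
    return math.hypot(a[0] - b[0], a[1] - b[1])

def st_contains(polygon, point):
    """Ray-casting point-in-polygon test for a simple polygon ring.

    `polygon` is a list of (x, y) vertices; boundary points are not
    treated specially in this sketch.
    """
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

square = [(0, 0), (4, 0), (4, 4), (0, 4)]
print(st_contains(square, st_point(1, 2)))          # True
print(st_contains(square, st_point(5, 2)))          # False
print(st_distance(st_point(0, 0), st_point(3, 4)))  # 5.0
```

In SedonaFlink itself you would call the SQL functions directly rather than writing this logic by hand.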

Sedona works with both Flink's DataStream API and Table API. Use whichever fits your workflow.

The spatial operations run as part of Flink's distributed execution, so your geospatial computations are automatically parallelized across your cluster.

Sedona stores geometries as binary data in Flink's internal format. This keeps memory usage low and processing fast.

When you perform spatial joins, Sedona uses spatial indexing under the hood so that queries execute quickly.
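
This page does not say which index structure is used, so the following is only a generic illustration of why an index helps. A uniform grid buckets points by cell, so each query rectangle is compared only against points in the cells it overlaps rather than against every point. The names (`build_grid_index`, `points_in_rect`, `cell_size`) are hypothetical, and this is not Sedona's implementation.

```python
import math
from collections import defaultdict

def build_grid_index(points, cell_size):
    """Bucket (x, y) points by the grid cell that contains them."""
    index = defaultdict(list)
    for p in points:
        cell = (math.floor(p[0] / cell_size), math.floor(p[1] / cell_size))
        index[cell].append(p)
    return index

def points_in_rect(index, cell_size, rect):
    """Return points inside rect = (min_x, min_y, max_x, max_y), bounds inclusive."""
    min_x, min_y, max_x, max_y = rect
    hits = []
    # Visit only the cells that the rectangle overlaps.
    for cx in range(math.floor(min_x / cell_size), math.floor(max_x / cell_size) + 1):
        for cy in range(math.floor(min_y / cell_size), math.floor(max_y / cell_size) + 1):
            for (x, y) in index.get((cx, cy), []):
                # Exact check, since a cell can extend beyond the rectangle.
                if min_x <= x <= max_x and min_y <= y <= max_y:
                    hits.append((x, y))
    return hits

points = [(0.5, 0.5), (1.5, 1.5), (9.0, 9.0), (2.5, 0.2)]
index = build_grid_index(points, cell_size=1.0)
print(sorted(points_in_rect(index, 1.0, (0.0, 0.0, 2.0, 2.0))))
# [(0.5, 0.5), (1.5, 1.5)]
```

The point at `(9.0, 9.0)` is never examined because its cell does not overlap the query rectangle, which is the saving an index provides over a brute-force comparison.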

Flink's checkpointing system handles fault tolerance. If a node crashes, your geospatial state is restored from the last checkpoint.
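
The recovery idea can be illustrated with a toy model in plain Python. This is not Flink's API: `run`, `checkpoint_every`, and `crash_at` are hypothetical names. The operator counts events per grid cell, snapshots its state together with the input offset, and after a simulated crash resumes from the last snapshot and replays the events that followed it.

```python
import copy

def run(events, checkpoint_every, crash_at=None):
    """Count events per cell, with periodic checkpoints and one optional crash.

    A checkpoint stores (next input offset, copy of state). On a crash the
    operator restores the last checkpoint and replays from its offset.
    """
    state, offset = {}, 0
    checkpoint = (0, {})
    crashed = False
    while offset < len(events):
        if crash_at is not None and offset == crash_at and not crashed:
            crashed = True
            # Simulated failure: drop in-memory state, restore the snapshot.
            offset, state = checkpoint[0], copy.deepcopy(checkpoint[1])
            continue
        cell = events[offset]
        state[cell] = state.get(cell, 0) + 1
        offset += 1
        if offset % checkpoint_every == 0:
            checkpoint = (offset, copy.deepcopy(state))
    return state

events = ["a", "b", "a", "c", "a", "b", "c"]
clean = run(events, checkpoint_every=3)
recovered = run(events, checkpoint_every=3, crash_at=5)
print(clean == recovered)  # True
```

Because the snapshot pairs the state with the input offset it corresponds to, replaying from that offset neither drops nor double-counts events in this model.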

You read geospatial data from sources such as Kafka or file systems, process it using Sedona's spatial functions, and write the results to sinks such as Iceberg.

The entire SedonaFlink pipeline runs continuously, allowing new events to flow through your spatial transformations in real time.

## Comparison with alternatives

For small datasets, you may not need a distributed cluster and can use SedonaDB.

For large batch pipelines, you can use SedonaSpark.

Here are some direct comparisons of SedonaFlink vs. streaming alternatives.

**SedonaFlink vs. Sedona on Spark Structured Streaming**

Spark Structured Streaming uses micro-batches by default, whereas Flink processes events one at a time. This can give Flink lower latency for some workflows.
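
A rough back-of-the-envelope model shows where the difference comes from. Assume, purely for illustration, that processing itself is instantaneous, that a micro-batch engine handles everything that arrived since the previous trigger at fixed trigger times, and that a per-event engine handles each event on arrival. The trigger interval and arrival times below are made-up values, not measurements.

```python
import math

def micro_batch_latency(arrival_ms, trigger_interval_ms):
    """Wait time until the next trigger at a multiple of the interval."""
    next_trigger = math.ceil(arrival_ms / trigger_interval_ms) * trigger_interval_ms
    return next_trigger - arrival_ms

def per_event_latency(arrival_ms):
    """Per-event processing starts on arrival in this idealized model."""
    return 0

arrivals = [10, 250, 499, 730]   # hypothetical arrival times in ms
interval = 500                   # hypothetical trigger interval in ms

batch = [micro_batch_latency(t, interval) for t in arrivals]
print(batch)                     # [490, 250, 1, 270]
print(sum(batch) / len(batch))   # 252.75
```

In this model the average wait approaches half the trigger interval, so shortening the interval narrows the gap; real systems add processing and scheduling overhead on both sides.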

Flink's state management is also more sophisticated.

Use Spark if you're already invested in the Spark ecosystem and the Spark Structured Streaming latency is sufficiently low for your use case. Use Flink if you need very low latency.

**SedonaFlink vs. PostGIS**

PostGIS is great for storing and querying geospatial data for OLTP workflows. But it's not built for streaming.

If you use PostGIS for streaming workflows, you need to constantly query the database from your stream processor, which adds latency and puts load on your database.

SedonaFlink processes geospatial data in-flight, eliminating the need for database round-trips.