@Mehak3010

Fixes #177

This PR adds a PostGIS integration guide for SedonaDB.

The documentation is written as a Jupyter notebook and rendered to Markdown
using the existing docs build pipeline.

The page covers:

  • A GeoPandas-based workflow for simplicity and exploratory use
  • A high-performance ADBC-based workflow using adbc_ingest() and fetch_arrow()
    to avoid row-wise iteration and intermediate Pandas DataFrames

This PR only updates documentation and does not modify any source code.

@paleolimbot
Member

Thank you for opening!

I see a few CI issues: pre-commit run --all-files should take care of the formatting, and for the other you'll need to add a license comment to the first cell of your notebook. You should be able to copy this pattern from one of the other notebooks.

I'm sorry I didn't get to this today...I will take a look Monday!

@Mehak3010
Author

Thanks for the review and for the pointers!

I’ll add the Apache license header to the first cell of the notebook (following the pattern from the existing docs) and run pre-commit run --all-files to address the formatting issues.

I’ll push the updates shortly — thanks, and no worries at all. Looking forward to your review on Monday!

@Mehak3010
Author

I’ve added the Apache license header to the first cell of the notebook and addressed the formatting concerns. The updates are pushed.
Looking forward to your review—thanks!

docs/postgis.md Outdated
Comment on lines 51 to 54
> Note:
> SedonaDB is not currently distributed via PyPI.
> To run the SedonaDB examples in this notebook, you must install SedonaDB
> from source or use a development environment where SedonaDB is available.
Member

You can remove this part (SedonaDB is distributed via PyPI)

docs/postgis.md Outdated
Comment on lines 57 to 77
## PostGIS Setup

Keep SQL static(do NOT execute).

### Preparing a PostGIS table

```md

The following SQL creates a simple PostGIS table that SedonaDB can read.

```sql
CREATE TABLE my_places (
id SERIAL PRIMARY KEY,
name VARCHAR(100),
geom GEOMETRY(Point, 4326)
);

INSERT INTO my_places (name, geom) VALUES
('New York', ST_SetSRID(ST_MakePoint(-74.006, 40.7128), 4326)),
('Los Angeles', ST_SetSRID(ST_MakePoint(-118.2437, 34.0522), 4326)),
('Chicago', ST_SetSRID(ST_MakePoint(-87.6298, 41.8781), 4326));
Member

Instead of leaving this as unexecuted, I think it would be better to (1) use the built-in PostGIS container we have in the repo (you can start it via docker compose up --detach) and (2) start the tutorial by writing a simple GeoDataFrame into PostGIS. That way when this tutorial needs editing somebody else can easily recreate the data required.

Comment on lines +110 to +113
sd = sedona.db.connect()
df = sd.create_data_frame(gdf)
df.show()
df.schema
Member

I think outputs like df.show() and df.schema should be shown. If you implement the suggestion about making the notebook reproducible above, it should be easy to execute all cells in the notebook and have the output rendered automatically.

Member

Can you run this notebook such that the cells render?

Comment on lines +199 to +201
```python
df.head(5).show()
```
Member

This is another output that would be great to show.

Member

Can you run this notebook so that the cells render?
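
For reference, the reproducible setup requested earlier in this review (start the PostGIS container, then write a simple GeoDataFrame into PostGIS) might look like the sketch below. It reuses the three cities from the SQL in the draft. The function name `seed_postgis` and its `connection_url` parameter are hypothetical; the URL depends on the repo's compose file and is not given in this thread. `GeoDataFrame.to_postgis` also needs SQLAlchemy, GeoAlchemy2 and a PostgreSQL driver installed.

```python
# Seed data mirroring the three cities from the SQL in the draft.
PLACES = [
    ("New York", -74.006, 40.7128),
    ("Los Angeles", -118.2437, 34.0522),
    ("Chicago", -87.6298, 41.8781),
]


def seed_postgis(connection_url: str, table: str = "my_places") -> None:
    """Write PLACES to PostGIS as a point table in EPSG:4326 (hypothetical helper)."""
    # Imported lazily so this module loads without the optional dependencies.
    import geopandas as gpd
    from sqlalchemy import create_engine

    names = [name for name, _, _ in PLACES]
    lons = [lon for _, lon, _ in PLACES]
    lats = [lat for _, _, lat in PLACES]
    gdf = gpd.GeoDataFrame(
        {"name": names},
        geometry=gpd.points_from_xy(lons, lats),
        crs="EPSG:4326",
    )
    engine = create_engine(connection_url)
    # if_exists="replace" lets the notebook be re-run from a clean state.
    gdf.to_postgis(table, engine, if_exists="replace", index=False)
```

Because the table is rebuilt on every run, anyone editing the tutorial later can execute all cells and have outputs such as df.show() render without manual database setup.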

@Mehak3010
Author

Just a quick note that I’ve pushed the latest updates addressing the feedback. Thanks!

@paleolimbot left a comment
Member

Thank you for continuing to work on this!

Member

Can you revert the changes to this file?

Author

sure!

Comment on lines +149 to +150

```python
Member

Can you briefly explain these steps? Like:

First, use adbc_ingest to insert the values into a temporary table with geometry encoded as WKB. Then, use a CREATE TABLE AS query to create a table with the appropriate geometry column.

Author

Good suggestion. I’ll add a brief step-by-step explanation in the notebook describing the adbc_ingest → temporary table → CREATE TABLE AS flow and push an update shortly.

```
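
The write path discussed in the thread above (adbc_ingest into a temporary table with geometry as WKB, then CREATE TABLE AS) can be sketched as follows. This is a minimal illustration, not the notebook's code: `build_ctas`, `write_via_staging`, the `_staging` suffix and the `geom_wkb` column name are all hypothetical, and the `temporary=True` keyword is assumed to be supported by the installed ADBC driver version.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_ctas(staging: str, target: str, columns: list[str],
               wkb_column: str = "geom_wkb", srid: int = 4326) -> str:
    """Build a CREATE TABLE AS statement that turns a WKB column into geometry."""
    # Identifiers are interpolated into SQL, so accept only plain names.
    for ident in (staging, target, wkb_column, *columns):
        if not _IDENT.match(ident):
            raise ValueError(f"unsafe identifier: {ident!r}")
    geom_expr = f"ST_GeomFromWKB({wkb_column}, {int(srid)}) AS geom"
    select_list = ", ".join([*columns, geom_expr])
    return f"CREATE TABLE {target} AS SELECT {select_list} FROM {staging}"


def write_via_staging(cursor, arrow_data, target: str, columns: list[str]) -> None:
    """Bulk-load Arrow data into a staging table, then materialise the geometry."""
    staging = f"{target}_staging"
    # Step 1: geometry travels as plain binary (WKB), avoiding row-wise inserts.
    # Assumption: the driver's adbc_ingest accepts temporary=True.
    cursor.adbc_ingest(staging, arrow_data, mode="create", temporary=True)
    # Step 2: PostGIS parses the WKB and tags the result with the SRID.
    cursor.execute(build_ctas(staging, target, columns))
```

The cursor is expected to come from an ADBC PostgreSQL DBAPI connection, and both steps must run on that same connection because a temporary table is only visible to the session that created it. The caller is responsible for committing afterwards.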

### Reading data from PostGIS into SedonaDB using ADBC

Member

Can you describe the steps here? Like:

First, write a query that returns geometry as well-known binary (WKB). Then, use SedonaDB to transform that column into a geometry column.
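
To make "geometry as WKB" concrete, here is a minimal standard-library sketch of what PostGIS hands back for a 2D point when the query wraps the column in ST_AsBinary. The helper names and the `geom_wkb` alias are hypothetical. In practice SedonaDB performs the decoding step; the manual decoder is only here to show the byte layout.

```python
import struct


def encode_point_wkb(x: float, y: float) -> bytes:
    """Encode a 2D point as little-endian WKB (21 bytes)."""
    # Layout: 1 byte for byte order (1 = little endian), a uint32 geometry
    # type (1 = Point), then the x and y coordinates as float64.
    return struct.pack("<BIdd", 1, 1, x, y)


def decode_point_wkb(wkb: bytes) -> tuple[float, float]:
    """Decode a 2D WKB point, honouring the byte-order flag."""
    endian = "<" if wkb[0] == 1 else ">"
    geom_type, x, y = struct.unpack(endian + "Idd", wkb[1:21])
    if geom_type != 1:
        raise ValueError(f"not a 2D point: geometry type {geom_type}")
    return x, y


# Shape of the first step's query; the alias geom_wkb is hypothetical.
READ_QUERY = "SELECT name, ST_AsBinary(geom) AS geom_wkb FROM my_places"

wkb = encode_point_wkb(-74.006, 40.7128)
print(len(wkb), decode_point_wkb(wkb))
```

Since the query returns an ordinary binary column, the result can travel through Arrow without any geometry-aware driver support, which is why the conversion to a geometry column is left to the second step.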

Author

Yes, that makes sense — I’ll add a brief explanation in the notebook clarifying that we first query PostGIS to return geometries as WKB, then use SedonaDB to convert that WKB column back into a geometry column.

- `adbc-driver-postgresql`

### Optional: Installing dependencies in a Jupyter environment

Member

Can these two sections be merged?

Author

Yes, sure — I’ll merge these two sections to avoid duplication.


````bash
pip install geopandas sqlalchemy psycopg2-binary adbc-driver-postgresql

Member

These two bash blocks are missing the closing backticks.

Author

Good catch, thanks! I’ll fix the missing closing backticks.

@Mehak3010
Author

Regarding running the notebook so the cells render: I attempted to run all cells locally, but rendering the outputs requires a working SedonaDB runtime (which isn’t available via PyPI) and a fully configured PostGIS + SedonaDB development environment.

To keep the notebook reproducible for contributors, I’ve kept all cells executable and focused on correctness rather than committing environment-specific outputs. Please let me know if you’d prefer placeholder outputs or screenshots instead.

@paleolimbot
Member

Running docker compose up and pip install "apache-sedona[db]" should have you covered for this one (you would need a nightly or development version if you were using recent features, but I don't think you are).

Development

Successfully merging this pull request may close these issues.

epic: new docs pages
