Implement TripsLayer for animating moving objects and connect to MovingPandas #292

Merged
merged 50 commits into from
Oct 7, 2024
Changes from 16 commits
Commits
c582438 bump deck version (kylebarron, Dec 5, 2023)
1bd1ad3 Add trips layer (kylebarron, Dec 5, 2023)
8075fbb Add movingpandas import (kylebarron, Dec 5, 2023)
8e5a479 Merge branch 'main' into kyle/trips-layer (kylebarron, Mar 25, 2024)
96ac30b update lockfile (kylebarron, Mar 25, 2024)
acaaddb Restore TimestampAccessor trait (kylebarron, Mar 25, 2024)
daae1b4 fmt (kylebarron, Mar 25, 2024)
702a853 Merge branch 'main' into kyle/trips-layer (kylebarron, Sep 19, 2024)
159ff0b Use isDefined on the model (kylebarron, Sep 19, 2024)
d3154b6 Add dev dep (kylebarron, Sep 19, 2024)
0fb89c4 Update trips layer (kylebarron, Sep 19, 2024)
72e0bdb Unpack typing extensions (kylebarron, Sep 19, 2024)
c707aad Update timestamp accessor trait (kylebarron, Sep 19, 2024)
7247452 fix default parameters (kylebarron, Sep 19, 2024)
fa65b87 fix test (kylebarron, Sep 19, 2024)
1ced621 Implement generic MovingPandas to GeoArrow (kylebarron, Sep 22, 2024)
fff32f6 Update lonboard/_geoarrow/movingpandas_interop.py (kylebarron, Sep 23, 2024)
0dcf3fe lint (kylebarron, Sep 23, 2024)
a4bc5b9 fix timestamp type (kylebarron, Sep 23, 2024)
dd56ef3 Handle timezone in timestamp dtype (kylebarron, Sep 23, 2024)
9944c66 Merge branch 'main' into kyle/trips-layer (kylebarron, Sep 25, 2024)
b32939d trait updates for timestamp accessor (kylebarron, Sep 25, 2024)
ca9dcd1 Manage precision reduction (kylebarron, Sep 27, 2024)
37a64c6 Validate list offsets (kylebarron, Sep 27, 2024)
6addb2e Store min timestamp on the layer trait (kylebarron, Sep 27, 2024)
b346810 Add custom timestamp accessor serialization (kylebarron, Sep 27, 2024)
b08965a Add trips layer to docs (kylebarron, Sep 30, 2024)
d2a5202 Update layer docs (kylebarron, Sep 30, 2024)
150c46b Update arro3 (kylebarron, Sep 30, 2024)
b94194a Add ship-data example (kylebarron, Sep 30, 2024)
2bbcd3e Improved docs (kylebarron, Oct 1, 2024)
ddb64f8 Update fiona, arro3 (kylebarron, Oct 1, 2024)
9391062 Mapping back to real time (kylebarron, Oct 1, 2024)
3297403 wip (kylebarron, Oct 2, 2024)
6cca53a Merge branch 'main' into kyle/trips-layer (kylebarron, Oct 3, 2024)
28938c8 update lockfile (kylebarron, Oct 3, 2024)
837bb1d fix import (kylebarron, Oct 3, 2024)
49b20fc Fix test (kylebarron, Oct 3, 2024)
7d262bf Merge branch 'main' into kyle/trips-layer (kylebarron, Oct 4, 2024)
4b0ee7b WIP: air traffic control example (kylebarron, Oct 4, 2024)
61c5923 Add from_geopandas and from_duckdb methods to TripsLayer (kylebarron, Oct 7, 2024)
e859ce1 Bump arro3 to 0.4.1 (kylebarron, Oct 7, 2024)
5a026eb Change interval to fps (kylebarron, Oct 7, 2024)
3e75895 Change top-level description (kylebarron, Oct 7, 2024)
79cc520 Update air traffic control notebook (kylebarron, Oct 7, 2024)
186c419 Update animate (kylebarron, Oct 7, 2024)
c790753 print tz info (kylebarron, Oct 7, 2024)
b64b330 Updated ship data example (kylebarron, Oct 7, 2024)
7f59af9 Add gif to atc notebook (kylebarron, Oct 7, 2024)
5247af0 Add examples to docs (kylebarron, Oct 7, 2024)
2 changes: 1 addition & 1 deletion lonboard/__init__.py
@@ -2,7 +2,7 @@
 Python library for fast, interactive geospatial vector data visualization in Jupyter.
 """

-from . import colormap, controls, layer_extension, traits
+from . import colormap, controls, experimental, layer_extension, traits
 from ._layer import (
     BaseArrowLayer,
     BaseLayer,
199 changes: 199 additions & 0 deletions lonboard/_geoarrow/movingpandas_interop.py
@@ -0,0 +1,199 @@
from __future__ import annotations

import json
from typing import TYPE_CHECKING, Dict, List, Literal, Tuple

import numpy as np
from arro3.core import (
    Array,
    ChunkedArray,
    DataType,
    Field,
    RecordBatch,
    Schema,
    Table,
    fixed_size_list_array,
    list_array,
)

if TYPE_CHECKING:
    import movingpandas as mpd
    import pyarrow as pa
    from movingpandas import TrajectoryCollection


# TODO (lonboard-specific):
# - update timestamp serialization to cast to float32 at that point
# # offset by earliest timestamp
# timestamps -= timestamps.min()

# # Cast to float32
# timestamps = timestamps.astype(np.float32)


def movingpandas_to_geoarrow(
    traj_collection: TrajectoryCollection,
) -> Tuple[Table, ChunkedArray]:
    """Convert a MovingPandas TrajectoryCollection to GeoArrow

    Args:
        traj_collection: The MovingPandas TrajectoryCollection to convert.

    Returns:
        A tuple of two items: a table with one row per trajectory, holding the
        attribute columns nested as lists plus a GeoArrow linestring `geometry`
        column, and a chunked array with one list of timestamps per trajectory.
    """
    import pyarrow as pa
    import shapely

    crs = traj_collection.get_crs()
    crs_json = crs.to_json_dict() if crs is not None else None

    num_coords = 0
    num_trajectories = len(traj_collection)
    offsets = np.zeros(num_trajectories + 1, dtype=np.int32)
    datetime_dtypes = set()
    attr_schemas: List[pa.Schema] = []

    # Loop the first time to infer offsets for each trajectory
    for i, traj in enumerate(traj_collection.trajectories):
        traj: mpd.Trajectory

        num_coords += traj.size()
        offsets[i + 1] = num_coords
        datetime_dtypes.add(traj.df.index.dtype)

        geom_col_name = traj.get_geom_col()
        df_attr = traj.df.drop(columns=[geom_col_name])

        # Explicitly drop index because the index is a DatetimeIndex that we convert
        # manually later.
        arrow_schema = pa.Schema.from_pandas(df_attr, preserve_index=False)
        attr_schemas.append(arrow_schema)

    assert (
        len(datetime_dtypes) == 1
    ), "Expected only one datetime dtype across all trajectories."
    datetime_dtype = list(datetime_dtypes)[0]

    # We currently always provision space for XYZ coordinates, and then only use 2d if
    # the Z dimension is always NaN
    coords = np.zeros((num_coords, 3), dtype=np.float64)

    # Infer an arrow time unit from the numpy dtype
    time_unit, time_arrow_dtype = infer_time_unit(datetime_dtype)

    # TODO: switch this to just using `time_arrow_dtype.bit_width` once
    # https://github.com/kylebarron/arro3/pull/190 is released
    if time_unit in {"s", "ms"}:
        timestamps = np.zeros(num_coords, dtype=np.int32)
    elif time_unit in {"us", "ns"}:
        timestamps = np.zeros(num_coords, dtype=np.int64)
    else:
        raise ValueError(f"Unexpected time unit: {time_unit}.")

    attr_schema = pa.unify_schemas(attr_schemas, promote_options="permissive")
    attr_tables: List[pa.Table] = []

    # Loop second time to fill timestamps and coords
    for i, traj in enumerate(traj_collection.trajectories):
        start_offset = offsets[i]
        end_offset = offsets[i + 1]

        timestamps[start_offset:end_offset] = traj.df.index
        coords[start_offset:end_offset, 0] = shapely.get_x(traj.df.geometry)
        coords[start_offset:end_offset, 1] = shapely.get_y(traj.df.geometry)
        coords[start_offset:end_offset, 2] = shapely.get_z(traj.df.geometry)

        geom_col_name = traj.get_geom_col()
        df_attr = traj.df.drop(columns=[geom_col_name])

        attr_table = pa.Table.from_pandas(
            df_attr, schema=attr_schema, preserve_index=False
        )
        attr_tables.append(attr_table)

    attr_table = pa.concat_tables(attr_tables, promote_options="none")
    attr_table = Table.from_arrow(attr_table)

    offsets = Array.from_numpy(offsets)

    nested_attr_table = apply_offsets_to_table(attr_table, offsets=offsets)

    # np.alltrue was removed in NumPy 2.0; np.all is the equivalent call
    if np.all(np.isnan(coords[:, 2])):
        coord_list_size = 2
        # Cast to 2D coords
        coords = coords[:, :2]
    else:
        assert not np.any(
            np.isnan(coords[:, 2])
        ), "Mixed 2D and 3D coordinates not currently supported"
        coord_list_size = 3

    coords_arr = Array.from_numpy(coords.ravel("C"))
    coords_fixed_size_list = fixed_size_list_array(coords_arr, coord_list_size)
    linestrings_arr = list_array(offsets, coords_fixed_size_list)

    extension_metadata: Dict[str, str] = {"ARROW:extension:name": "geoarrow.linestring"}
    if crs_json is not None:
        extension_metadata["ARROW:extension:metadata"] = json.dumps({"crs": crs_json})

    linestrings_field = Field(
        "geometry",
        linestrings_arr.type,
        nullable=True,
        metadata=extension_metadata,
    )

    timestamp_values = Array.from_numpy(timestamps).cast(time_arrow_dtype)
    timestamp_arr = list_array(offsets, timestamp_values)
    timestamp_col = ChunkedArray([timestamp_arr])

    table = nested_attr_table.append_column(
        linestrings_field, ChunkedArray([linestrings_arr])
    )
    return table, timestamp_col
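
The two-pass layout used in `movingpandas_to_geoarrow` (a first loop that accumulates per-trajectory lengths into an offsets array, then a second loop that writes each trajectory into its own slice of a preallocated flat buffer) can be sketched with NumPy alone. This is a minimal illustration and not code from the PR: the helper name `flatten_with_offsets` and the toy coordinate groups are assumptions made for the example.

```python
import numpy as np


def flatten_with_offsets(groups):
    """Pack variable-length 2D coordinate groups into one flat buffer plus offsets.

    Hypothetical helper for illustration; it is not part of lonboard.
    """
    offsets = np.zeros(len(groups) + 1, dtype=np.int32)
    # First pass: the running row total gives each group's end offset.
    for i, group in enumerate(groups):
        offsets[i + 1] = offsets[i] + len(group)

    # Allocate the flat buffer once, now that the total length is known.
    coords = np.zeros((offsets[-1], 2), dtype=np.float64)
    # Second pass: copy each group into its own slice of the buffer.
    for i, group in enumerate(groups):
        coords[offsets[i] : offsets[i + 1]] = group
    return offsets, coords


groups = [
    np.array([[0.0, 0.0], [1.0, 1.0], [2.0, 2.0]]),
    np.array([[5.0, 5.0], [6.0, 6.0]]),
]
offsets, coords = flatten_with_offsets(groups)
```

Knowing the total length before allocating avoids repeated concatenation, and the same offsets can then be reused for every per-point column (coordinates, timestamps, attributes).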


def infer_time_unit(dtype: np.dtype) -> Tuple[Literal["s", "ms", "us", "ns"], DataType]:
    """Infer an arrow time unit from the numpy data type

    Raises:
        ValueError: If not a known numpy datetime dtype
    """

    if dtype.name == "datetime64[s]":
        code = "s"
        return code, DataType.timestamp(code)

    if dtype.name == "datetime64[ms]":
        code = "ms"
        return code, DataType.timestamp(code)

    if dtype.name == "datetime64[us]":
        code = "us"
        return code, DataType.timestamp(code)

    if dtype.name == "datetime64[ns]":
        code = "ns"
        return code, DataType.timestamp(code)

    raise ValueError(f"Unexpected datetime type: {dtype}")
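
For comparison, the unit of a NumPy `datetime64` dtype can also be read with `np.datetime_data` instead of matching `dtype.name` against four strings. The sketch below is an alternative shown only for illustration, not what this PR does; `numpy_time_unit` and `ARROW_TIME_UNITS` are hypothetical names, and the helper returns just the unit string without building an arro3 `DataType`.

```python
import numpy as np

# Units that have a matching Arrow timestamp unit.
ARROW_TIME_UNITS = {"s", "ms", "us", "ns"}


def numpy_time_unit(dtype):
    """Return the unit of a numpy datetime64 dtype, or raise ValueError.

    Hypothetical helper for illustration; it is not part of lonboard.
    """
    dtype = np.dtype(dtype)
    # kind "M" identifies datetime64; np.datetime_data rejects other dtypes.
    if dtype.kind != "M":
        raise ValueError(f"Unexpected datetime type: {dtype}")
    unit, count = np.datetime_data(dtype)
    # Reject units Arrow timestamps cannot represent (e.g. "D") and
    # multiples such as datetime64[10ms].
    if unit not in ARROW_TIME_UNITS or count != 1:
        raise ValueError(f"Unexpected datetime type: {dtype}")
    return unit


unit = numpy_time_unit(np.dtype("datetime64[ms]"))
```

The string comparison in `infer_time_unit` is more explicit about which dtypes are accepted; the variant here trades that for a single code path.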


def apply_offsets_to_table(table: Table, offsets: Array) -> Table:
    batch = table.combine_chunks().to_batches()[0]

    new_fields = []
    new_arrays = []

    for field_idx in range(batch.num_columns):
        field = batch.schema.field(field_idx)
        new_field = field.with_type(DataType.list(field))

Contributor:
This relies on arro3's list(..) to convert the pyarrow field through the Arrow PyCapsule interface, and then pyarrow's with_type() to convert again in the other direction?
Nice ;)

Contributor:
Ah, no, I see that you converted the pyarrow Table to an arro3 one (attr_table = Table.from_arrow(attr_table)) before calling this function ;)

Member Author:
Here batch is actually an arro3.RecordBatch; the conversion back to arro3 happened here:

attr_table = Table.from_arrow(attr_table)

But yes, these lines should work with pyarrow input as well!

Member Author:
> Ah, no, I see that you converted the pyarrow Table to an arro3 one (attr_table = Table.from_arrow(attr_table)) before calling this function ;)

Right. I could've left it as a pyarrow table; the main difference is mostly for type hinting. It's nice to get IDE completions and type checking, and pyarrow doesn't have that yet.

        new_array = list_array(offsets, batch[field_idx], type=new_field)

        new_fields.append(new_field)
        new_arrays.append(new_array)

    new_schema = Schema(new_fields, metadata=batch.schema.metadata)
    new_batch = RecordBatch(new_arrays, schema=new_schema)
    return Table.from_batches([new_batch])
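
To make the effect of `apply_offsets_to_table` concrete: pairing a flat per-point column with the shared offsets yields one list per trajectory, so each row of the resulting table describes a whole trajectory. The sketch below mimics that nesting with plain Python lists rather than arro3 arrays. `nest_by_offsets` is a hypothetical helper, the sample values are made up, and the checks follow the general Arrow list-offset rules (non-negative start, non-decreasing, ending within the values) rather than any specific validation in this PR.

```python
from typing import List, Sequence


def nest_by_offsets(values: Sequence[float], offsets: Sequence[int]) -> List[List[float]]:
    """Group a flat sequence into sublists; sublist i spans offsets[i]:offsets[i + 1].

    Hypothetical helper for illustration; it is not part of lonboard or arro3.
    """
    if len(offsets) == 0 or offsets[0] < 0 or offsets[-1] > len(values):
        raise ValueError("offsets must start at >= 0 and end within values")
    if any(start > end for start, end in zip(offsets, offsets[1:])):
        raise ValueError("offsets must be non-decreasing")
    return [list(values[start:end]) for start, end in zip(offsets, offsets[1:])]


# Five per-point speed values belonging to two trajectories (3 points, then 2).
speed = [4.2, 4.5, 4.1, 9.0, 9.3]
offsets = [0, 3, 5]
nested = nest_by_offsets(speed, offsets)
```

An Arrow list array stores the same information without copying: the flat values buffer is kept as-is and the offsets describe where each sublist begins and ends.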
2 changes: 1 addition & 1 deletion lonboard/experimental/__init__.py
@@ -4,4 +4,4 @@
 unexpected behavior when using them.
 """

-from ._layer import ArcLayer, TextLayer
+from ._layer import ArcLayer, TextLayer, TripsLayer