Skip to content

Conversation

prantogg
Copy link
Member

@prantogg prantogg commented Sep 29, 2025

  • Update markdown cells to be concise and follow Jeff's standard
  • Confirm it runs on Tiny GPU
  • Add relevant doc links
  • rename file to Raster_Text_To_Segments.ipynb

Copy link

gitnotebooks bot commented Sep 29, 2025

Found 2 changed notebooks. Review the changes at https://app.gitnotebooks.com/wherobots/wherobots-examples/pull/88

"Before running inference, it’s useful to explore the imagery itself. \n",
"We’ll visualize the footprints of the raster tiles with SedonaKepler and preview a few raw images using SedonaUtils. This gives us confidence that the data is being read correctly and aligned spatially before applying any models.\n",
"\n",
"Tip: You can also save the Kepler map as an interactive HTML file with `kepler_map.save_to_html()`."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this tip could be indented like the others

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like this name better; we need to make sure to update the solution notebook deep link and any places it's used (homepage and maybe the SAM2 blog post?)

"\n",
"We will demonstrate how to apply these models to NAIP imagery of Miami Airport, using `RS_Text_to_BBoxes` (OWLv2) for object detection and `RS_Text_to_Segments` (SAM2) for segmentation. \n",
"\n",
"[Read more about Wherobots Raster Inference in the documentation](https://docs.wherobots.com/latest/tutorials/wherobotsai/wherobots-inference/raster-inference-overview/?h=raster+inference). "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I find the highlighting in the docs distracting. Consider removing the query string.

"\n",
"We will use the native raster reader to load Geotiffs as out-of-database or \"out-db\" rasters and perform dyanamic tiling on read.\n",
"Spliting the large GeoTIFF into small tiles improves the distribution of workload across the cluster."
]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typos

  • Geotiffs → GeoTIFs
  • Maimi → Miami

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pettiross GeoTIFs is not correct. "Geotiffs" is more common, technically it is "GeoTIFFs"

"cell_type": "markdown",
"metadata": {},
"source": [
"## Running Segementation using SAM2 model\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Typo: Segementation → segmentation

"Now we’ll run inference over the raster tiles using Wherobots’ SQL function `RS_Text_to_Segments()`.\n",
"\n",
"For this example, we specify the follwing parameters -\n",
"- Model: `\"sam2\"`\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be helpful to add some info about the model and confidence parameters. Why 0.5? Why not 0.9 or 10? There's nothing in the docs about confidence, either, afaict.

Typos:

  • Wherobots' → Wherobots (no apostrophe)
  • follwing → following

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

could say that the model is not very robust, most detections that are positive are at most .7 confident. so for this particular model, a .5 confidence score will show us more actual positive and false positive detections so we get a sense of how the model performs

"metadata": {},
"outputs": [],
"source": [
"model_id = \"sam2\"\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need to put the parameters in constants, but I'm willing to be talked out of this. My two cents is that we've put this info in the markdown and it will make the code more concise.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it is helpful since our SQL docs for inference functions call this parameter model_id and it highlights that this is configurable. Otherwise the reader (particularly one unfamiliar with SQL) could miss where this parameter is passed in the SQL statement and that it is configurable. Readable code > concise code imo.

"metadata": {},
"outputs": [],
"source": [
"exploded = sedona.sql(\"\"\"\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe add a markdown cell above this to explain why explode is used? I don't want to re-explain query techniques in every notebook, but this notebook might attract the attention of modelers and other AI-centered folx who aren't as conversant with this.

"cell_type": "markdown",
"metadata": {},
"source": [
"## <span style=\"color: #7b73e2;\">Next Steps with Raster Inference</span>\n",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One snuck by! Remove the color here. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

3 participants