To help you make sense of the data points and patterns in a scatter plot, Jupyter Scatter supports a tooltip that can show a point's properties and related media.
scatter.tooltip(True)
Each row in the tooltip corresponds to a property. From left to right, each property features the:
- visual property (like `x`, `y`, `color`, `opacity`, or `size`) or data property
- name as specified by the column name in the bound DataFrame
- actual data value
- histogram or treemap of the property distribution
For numerical properties, the distribution is visualized as a bar chart. For categorical properties, it is visualized as a treemap in which each rectangle represents the proportion of one category compared to the whole. Treemaps are useful in scenarios with many categories.
In both cases, the highlighted bar / rectangle indicates how the hovered point compares to the other points.
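To make the treemap encoding concrete, the following minimal sketch computes the share of each category and looks up the share that belongs to a hovered point. It only illustrates the proportions a treemap encodes and is not Jupyter Scatter's implementation. The helper `category_shares` and the sample data are assumptions made up for this illustration.

```python
from collections import Counter


def category_shares(values):
    """Return each category's share of all values (shares sum to 1)."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


# Hypothetical categorical column with three groups
groups = ["A", "A", "B", "C", "A", "B", "A", "C"]
shares = category_shares(groups)

# The rectangle highlighted for a hovered point is the one of its category
hovered_group = groups[3]
hovered_share = shares[hovered_group]
```

In this made-up example, the hovered point belongs to group `C`, so the highlighted rectangle would cover a quarter of the treemap.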
::: info
For demos of how to use tooltips with a variety of data, see https://github.com/flekschas/jupyter-scatter-tutorial.
:::
By default, the tooltip shows all properties that are visually encoded, but you can limit which properties are shown:
scatter.tooltip(properties=["color", "opacity"])
Importantly, you can also show other properties in the tooltip that are not directly visualized with the scatter plot. Other properties have to be referenced by their respective column names.
scatter.tooltip(
    properties=[
        "color",
        "opacity",
        "group",
        "effect_size",
    ]
)
Here, for instance, we're showing the point's `group` and `effect_size` properties, which are two other DataFrame columns we didn't visualize.
::: tip
The order of `properties` defines the order of the entries in the tooltip.
:::
The histograms of numerical data properties consist of `20` bins by default and cover the entire data range, i.e., they start at the minimum and end at the maximum value. You can adjust both aspects globally for all histograms as follows:
scatter.tooltip(histograms_bins=40, histograms_ranges=(0, 1))
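As a rough illustration of what the number of bins and the range control, the sketch below assigns a value to one of several bins over a fixed range. It assumes equal-width bins, which is an assumption of this sketch rather than something stated above, and `bin_index` is a hypothetical helper, not part of Jupyter Scatter.

```python
def bin_index(value, value_range, bins=20):
    """Return the index of the equal-width bin that contains `value`.

    Assumes `value_range[0] <= value <= value_range[1]`.
    """
    lower, upper = value_range
    fraction = (value - lower) / (upper - lower)
    # The maximum value falls into the last bin instead of opening a new one
    return min(int(fraction * bins), bins - 1)


# With 20 bins over (0, 1), the value 0.5 lands in bin 10
default_bin = bin_index(0.5, (0, 1))
# Doubling the number of bins halves the bin width
finer_bin = bin_index(0.5, (0, 1), bins=40)
last_bin = bin_index(1.0, (0, 1), bins=40)
```

More bins give a finer picture of the distribution, while a narrower range focuses the bins on the part of the data you care about.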
To customize the number of bins and the range per property, pass dictionaries keyed by property:
scatter.tooltip(
    histograms_bins={"color": 10, "effect_size": 30},
    histograms_ranges={"color": (0, 1), "effect_size": (0.25, 0.75)}
)
Since an increased number of bins can make it harder to read the histogram, you can also adjust the size as follows:
scatter.tooltip(histograms_size="large")
If you set the histogram range to be smaller than the data extent, some points might lie outside the histogram. For instance, previously we restricted the `effect_size` to `[0.25, 0.75]`, meaning we disregarded part of the lower and upper end of the data.
In this case, a hovered point with an `effect_size` less than `0.25` is marked by a red `]` to the left of the histogram, indicating that its value is smaller than the value represented by the left-most bar.
Likewise, a hovered point with an `effect_size` larger than `0.75` is marked by a red `[` to the right of the histogram, indicating that its value is larger than the value represented by the right-most bar.
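The out-of-range behavior described above can be summarized in a small sketch. The function `range_marker` is hypothetical and only mirrors the behavior described in the text. It is not how Jupyter Scatter renders the markers.

```python
def range_marker(value, histogram_range):
    """Describe where `value` lies relative to the histogram range."""
    lower, upper = histogram_range
    if value < lower:
        return "] left of the histogram"
    if value > upper:
        return "[ right of the histogram"
    return "highlighted bar"


# Range used for effect_size in the example above
effect_size_range = (0.25, 0.75)
markers = [range_marker(v, effect_size_range) for v in (0.1, 0.5, 0.9)]
```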
Finally, if you want to transform the histogram in some other way, use your favorite method and save the transformed data as a new column before referencing it. For instance, in the following, we winsorize `effect_size` to the `[10, 90]` percentile range:
from scipy.stats.mstats import winsorize
df['effect_size_winsorized'] = winsorize(df.effect_size, limits=[0.1, 0.1])
scatter.tooltip(properties=['effect_size_winsorized'])
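To see what winsorizing does to the values, here is a plain-Python sketch of the same idea on made-up numbers. It is a simplified stand-in for `winsorize`, not SciPy's implementation, and `winsorize_values` and the sample values are assumptions for illustration only.

```python
def winsorize_values(values, lower=0.1, upper=0.1):
    """Replace the lowest and highest fractions of values with the nearest kept value."""
    ordered = sorted(values)
    n = len(ordered)
    low_cut = int(lower * n)   # number of values replaced at the low end
    high_cut = int(upper * n)  # number of values replaced at the high end
    low_value = ordered[low_cut]
    high_value = ordered[n - high_cut - 1]
    # Clamp every value into the kept range, preserving the original order
    return [min(max(v, low_value), high_value) for v in values]


# Hypothetical effect sizes with one outlier at each end
effect_sizes = [0.02, 0.31, 0.35, 0.4, 0.42, 0.45, 0.5, 0.55, 0.6, 0.98]
winsorized = winsorize_values(effect_sizes)
```

Because the two outliers are pulled in to `0.31` and `0.6`, a histogram over the winsorized values spans a much narrower range than one over the raw values.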
In cases where your data has a media representation like text, images, or audio, you can show a preview of the media in the tooltip by referencing a column name that holds either plain text, URLs referencing images, or URLs referencing audio.
scatter.tooltip(preview="headline")
By default, the media type is set to `text`. If you want to show an image or audio file as the preview, you additionally need to specify the corresponding media type.
scatter.tooltip(preview="url", preview_type="image")
You can further customize the media preview via media type-specific arguments. For instance, in the following, we limit the audio preview to 2 seconds and loop the audio playback.
scatter.tooltip(
    preview="audio_url",
    preview_type="audio",
    preview_audio_length=2,
    preview_audio_loop=True
)
For more details on how to customize the tooltip preview, see the API docs for `tooltip()`.