Commit 547670a

Add post on TidyTuesday palm tree data
1 parent 70b9688 commit 547670a

3 files changed

+142
-0
lines changed

@@ -0,0 +1,16 @@
{
"hash": "77c4d21024610ecf037e9e1d9caae5cb",
"result": {
"engine": "jupyter",
"markdown": "---\ntitle: \"TidyTuesday: Palm Trees\"\ndescription: \"Analyzing palm tree data by fruit color and size with the great_tables package.\"\ndate: \"2025-03-22\"\ncategories: [TidyTuesday, Python, Pandas]\nimage: \"image.png\"\n---\n\n\nThis week's TidyTuesday [dataset](https://github.com/rfordatascience/tidytuesday/blob/main/data/2025/2025-03-18/readme.md) contains information about palm trees.\nI decided to make a table about this data using `Pandas` and the `great_tables` Python packages.\nThe table counts the number of palm tree species by fruit color and gives some statistics about fruit width.\n\nFirst, let's import the packages needed for this analysis.\n\n::: {#a0163c3f .cell execution_count=1}\n``` {.python .cell-code}\nimport pandas as pd\nimport great_tables\n```\n:::\n\n\nThen we download the data from GitHub into a Pandas dataframe.\n\n::: {#7bbd1a2f .cell execution_count=2}\n``` {.python .cell-code}\nbase_url = \"https://raw.githubusercontent.com/rfordatascience/tidytuesday/main/data/2025/2025-03-18\"\ndf = pd.read_csv(f\"{base_url}/palmtrees.csv\", encoding=\"windows-1252\")\n```\n:::\n\n\nThe dataset includes information about 2557 palm tree species.\nOf these, 758 do not report a main fruit color.\n\nNext we wrangle the data into the desired tabular format.\n\n::: {#4e742571 .cell execution_count=3}\n``` {.python .cell-code}\ndf_table = (\n df.dropna(subset=[\"main_fruit_colors\"])\n # Create one row for each fruit color\n .assign(fruit_color=df[\"main_fruit_colors\"].str.split(\"; \"))\n .explode(\"fruit_color\")\n # Compute summary statistics by fruit color\n .groupby(\"fruit_color\")\n .agg(\n n=(\"spec_name\", \"size\"),\n min_average_fruit_width_cm=(\"average_fruit_width_cm\", \"min\"),\n max_average_fruit_width_cm=(\"average_fruit_width_cm\", \"max\"),\n # Sample one species for each fruit color\n sample_row_index=(\"spec_name\", lambda x: x.sample(1, random_state=0).index[0]),\n )\n # Extract information for sampled fruit\n 
.assign(\n spec_name=lambda x: df.loc[x[\"sample_row_index\"], \"spec_name\"].values,\n average_fruit_width_cm=lambda x: df.loc[\n x[\"sample_row_index\"], \"average_fruit_width_cm\"\n ].values,\n )\n # Clean-up dataframe for final table presentation\n .drop(columns=\"sample_row_index\")\n .reset_index()\n .assign(\n fruit_color=lambda x: x[\"fruit_color\"]\n .str.capitalize()\n .replace(\"Straw-coloured\", \"Straw\"),\n )\n .sort_values(\"n\", ascending=False)\n)\n```\n:::\n\n\nNow we create the desired table with the `great_tables` package.\n\n::: {#d59250c9 .cell execution_count=4}\n``` {.python .cell-code}\n(\n great_tables.GT(df_table)\n .tab_header(\n title=\"Palm Tree Fruit Characteristics\",\n subtitle=\"A guide for relating fruit size to fruit color\",\n )\n .tab_spanner(\n label=\"Across all Species\",\n columns=[\n \"n\",\n \"min_average_fruit_width_cm\",\n \"max_average_fruit_width_cm\",\n ],\n )\n .tab_spanner(\n label=\"Sample Species\",\n columns=[\n \"spec_name\",\n \"average_fruit_width_cm\",\n ],\n )\n .cols_label(\n spec_name=\"Species Name\",\n fruit_color=\"Fruit Color\",\n n=\"Number of Species\",\n average_fruit_width_cm=\"Average Fruit Width (cm)\",\n min_average_fruit_width_cm=\"Min Average Fruit Width (cm)\",\n max_average_fruit_width_cm=\"Max Average Fruit Width (cm)\",\n )\n .fmt_number(\n columns=[\n \"average_fruit_width_cm\",\n \"min_average_fruit_width_cm\",\n \"max_average_fruit_width_cm\",\n ],\n decimals=2,\n use_seps=False,\n )\n .tab_source_note(\n source_note=\"TidyTuesday: 2025, week 11 | PalmTraits 1.0 Database.\"\n )\n .tab_source_note(\n f\"Note, some species can have multiple fruit colors, and \\\n {df['main_fruit_colors'].isna().sum()} species \\\n have no reported main fruit color.\"\n )\n .opt_row_striping()\n # Save table as an image for the blog listing, also shows the table\n .save(\"./image.png\")\n)\n```\n\n::: {.cell-output .cell-output-display execution_count=45}\n```{=html}\n<div id=\"bdkikluxjn\" 
style=\"padding-left:0px;padding-right:0px;padding-top:10px;padding-bottom:10px;overflow-x:auto;overflow-y:auto;width:auto;height:auto;\">\n<style>\n#bdkikluxjn table {\n font-family: -apple-system, BlinkMacSystemFont, 'Segoe UI', Roboto, Oxygen, Ubuntu, Cantarell, 'Helvetica Neue', 'Fira Sans', 'Droid Sans', Arial, sans-serif;\n -webkit-font-smoothing: antialiased;\n -moz-osx-font-smoothing: grayscale;\n }\n\n#bdkikluxjn thead, tbody, tfoot, tr, td, th { border-style: none; }\n tr { background-color: transparent; }\n#bdkikluxjn p { margin: 0; padding: 0; }\n #bdkikluxjn .gt_table { display: table; border-collapse: collapse; line-height: normal; margin-left: auto; margin-right: auto; color: #333333; font-size: 16px; font-weight: normal; font-style: normal; background-color: #FFFFFF; width: auto; border-top-style: solid; border-top-width: 2px; border-top-color: #A8A8A8; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #A8A8A8; border-left-style: none; border-left-width: 2px; border-left-color: #D3D3D3; }\n #bdkikluxjn .gt_caption { padding-top: 4px; padding-bottom: 4px; }\n #bdkikluxjn .gt_title { color: #333333; font-size: 125%; font-weight: initial; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; border-bottom-color: #FFFFFF; border-bottom-width: 0; }\n #bdkikluxjn .gt_subtitle { color: #333333; font-size: 85%; font-weight: initial; padding-top: 3px; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; border-top-color: #FFFFFF; border-top-width: 0; }\n #bdkikluxjn .gt_heading { background-color: #FFFFFF; text-align: center; border-bottom-color: #FFFFFF; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; }\n #bdkikluxjn .gt_bottom_border { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: 
#D3D3D3; }\n #bdkikluxjn .gt_col_headings { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; }\n #bdkikluxjn .gt_col_heading { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; overflow-x: hidden; }\n #bdkikluxjn .gt_column_spanner_outer { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: normal; text-transform: inherit; padding-top: 0; padding-bottom: 0; padding-left: 4px; padding-right: 4px; }\n #bdkikluxjn .gt_column_spanner_outer:first-child { padding-left: 0; }\n #bdkikluxjn .gt_column_spanner_outer:last-child { padding-right: 0; }\n #bdkikluxjn .gt_column_spanner { border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: bottom; padding-top: 5px; padding-bottom: 5px; overflow-x: hidden; display: inline-block; width: 100%; }\n #bdkikluxjn .gt_spanner_row { border-bottom-style: hidden; }\n #bdkikluxjn .gt_group_heading { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; 
vertical-align: middle; text-align: left; }\n #bdkikluxjn .gt_empty_group_heading { padding: 0.5px; color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; vertical-align: middle; }\n #bdkikluxjn .gt_from_md> :first-child { margin-top: 0; }\n #bdkikluxjn .gt_from_md> :last-child { margin-bottom: 0; }\n #bdkikluxjn .gt_row { padding-top: 8px; padding-bottom: 8px; padding-left: 5px; padding-right: 5px; margin: 10px; border-top-style: solid; border-top-width: 1px; border-top-color: #D3D3D3; border-left-style: none; border-left-width: 1px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 1px; border-right-color: #D3D3D3; vertical-align: middle; overflow-x: hidden; }\n #bdkikluxjn .gt_stub { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; }\n #bdkikluxjn .gt_stub_row_group { color: #333333; background-color: #FFFFFF; font-size: 100%; font-weight: initial; text-transform: inherit; border-right-style: solid; border-right-width: 2px; border-right-color: #D3D3D3; padding-left: 5px; padding-right: 5px; vertical-align: top; }\n #bdkikluxjn .gt_row_group_first td { border-top-width: 2px; }\n #bdkikluxjn .gt_row_group_first th { border-top-width: 2px; }\n #bdkikluxjn .gt_striped { background-color: rgba(128,128,128,0.05); }\n #bdkikluxjn .gt_table_body { border-top-style: solid; border-top-width: 2px; border-top-color: #D3D3D3; border-bottom-style: solid; border-bottom-width: 2px; border-bottom-color: #D3D3D3; }\n #bdkikluxjn .gt_sourcenotes { color: #333333; background-color: #FFFFFF; border-bottom-style: none; border-bottom-width: 2px; border-bottom-color: #D3D3D3; border-left-style: none; 
border-left-width: 2px; border-left-color: #D3D3D3; border-right-style: none; border-right-width: 2px; border-right-color: #D3D3D3; }\n #bdkikluxjn .gt_sourcenote { font-size: 90%; padding-top: 4px; padding-bottom: 4px; padding-left: 5px; padding-right: 5px; text-align: left; }\n #bdkikluxjn .gt_left { text-align: left; }\n #bdkikluxjn .gt_center { text-align: center; }\n #bdkikluxjn .gt_right { text-align: right; font-variant-numeric: tabular-nums; }\n #bdkikluxjn .gt_font_normal { font-weight: normal; }\n #bdkikluxjn .gt_font_bold { font-weight: bold; }\n #bdkikluxjn .gt_font_italic { font-style: italic; }\n #bdkikluxjn .gt_super { font-size: 65%; }\n #bdkikluxjn .gt_footnote_marks { font-size: 75%; vertical-align: 0.4em; position: initial; }\n #bdkikluxjn .gt_asterisk { font-size: 100%; vertical-align: 0; }\n \n</style>\n<table class=\"gt_table\" data-quarto-disable-processing=\"false\" data-quarto-bootstrap=\"false\">\n<thead>\n\n <tr class=\"gt_heading\">\n <td colspan=\"6\" class=\"gt_heading gt_title gt_font_normal\">Palm Tree Fruit Characteristics</td>\n </tr>\n <tr class=\"gt_heading\">\n <td colspan=\"6\" class=\"gt_heading gt_subtitle gt_font_normal gt_bottom_border\">A guide for relating fruit size to fruit color</td>\n </tr>\n<tr class=\"gt_col_headings gt_spanner_row\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"2\" colspan=\"1\" scope=\"col\" id=\"fruit_color\">Fruit Color</th>\n <th class=\"gt_center gt_columns_top_border gt_column_spanner_outer\" rowspan=\"1\" colspan=\"3\" scope=\"colgroup\" id=\"Across-all-Species\">\n <span class=\"gt_column_spanner\">Across all Species</span>\n </th>\n <th class=\"gt_center gt_columns_top_border gt_column_spanner_outer\" rowspan=\"1\" colspan=\"2\" scope=\"colgroup\" id=\"Sample-Species\">\n <span class=\"gt_column_spanner\">Sample Species</span>\n </th>\n</tr>\n<tr class=\"gt_col_headings\">\n <th class=\"gt_col_heading gt_columns_bottom_border gt_right\" rowspan=\"1\" 
colspan=\"1\" scope=\"col\" id=\"n\">Number of Species</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_right\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"min_average_fruit_width_cm\">Min Average Fruit Width (cm)</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_right\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"max_average_fruit_width_cm\">Max Average Fruit Width (cm)</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_left\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"spec_name\">Species Name</th>\n <th class=\"gt_col_heading gt_columns_bottom_border gt_right\" rowspan=\"1\" colspan=\"1\" scope=\"col\" id=\"average_fruit_width_cm\">Average Fruit Width (cm)</th>\n</tr>\n</thead>\n<tbody class=\"gt_table_body\">\n <tr>\n <td class=\"gt_row gt_left\">Red</td>\n <td class=\"gt_row gt_right\">501</td>\n <td class=\"gt_row gt_right\">0.21</td>\n <td class=\"gt_row gt_right\">11.00</td>\n <td class=\"gt_row gt_left\">Bactris timbuiensis</td>\n <td class=\"gt_row gt_right\">1.40</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Brown</td>\n <td class=\"gt_row gt_right gt_striped\">484</td>\n <td class=\"gt_row gt_right gt_striped\">0.40</td>\n <td class=\"gt_row gt_right gt_striped\">20.00</td>\n <td class=\"gt_row gt_left gt_striped\">Astrocaryum confertum</td>\n <td class=\"gt_row gt_right gt_striped\">3.00</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Black</td>\n <td class=\"gt_row gt_right\">462</td>\n <td class=\"gt_row gt_right\">0.40</td>\n <td class=\"gt_row gt_right\">20.00</td>\n <td class=\"gt_row gt_left\">Chamaedorea guntheriana</td>\n <td class=\"gt_row gt_right\">0.60</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Orange</td>\n <td class=\"gt_row gt_right gt_striped\">265</td>\n <td class=\"gt_row gt_right gt_striped\">0.40</td>\n <td class=\"gt_row gt_right gt_striped\">15.50</td>\n <td class=\"gt_row gt_left gt_striped\">Ceroxylon pityrophyllum</td>\n <td class=\"gt_row 
gt_right gt_striped\">1.70</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Yellow</td>\n <td class=\"gt_row gt_right\">206</td>\n <td class=\"gt_row gt_right\">0.35</td>\n <td class=\"gt_row gt_right\">6.00</td>\n <td class=\"gt_row gt_left\">Calamus delicatulus</td>\n <td class=\"gt_row gt_right\">1.00</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Green</td>\n <td class=\"gt_row gt_right gt_striped\">195</td>\n <td class=\"gt_row gt_right gt_striped\">0.30</td>\n <td class=\"gt_row gt_right gt_striped\">14.00</td>\n <td class=\"gt_row gt_left gt_striped\">Dypsis eriostachys</td>\n <td class=\"gt_row gt_right gt_striped\">0.60</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Purple</td>\n <td class=\"gt_row gt_right\">175</td>\n <td class=\"gt_row gt_right\">0.40</td>\n <td class=\"gt_row gt_right\">20.00</td>\n <td class=\"gt_row gt_left\">Calamus divaricatus</td>\n <td class=\"gt_row gt_right\">0.90</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">White</td>\n <td class=\"gt_row gt_right gt_striped\">87</td>\n <td class=\"gt_row gt_right gt_striped\">0.30</td>\n <td class=\"gt_row gt_right gt_striped\">5.00</td>\n <td class=\"gt_row gt_left gt_striped\">Areca hutchinsoniana</td>\n <td class=\"gt_row gt_right gt_striped\">1.70</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Pink</td>\n <td class=\"gt_row gt_right\">36</td>\n <td class=\"gt_row gt_right\">0.20</td>\n <td class=\"gt_row gt_right\">3.20</td>\n <td class=\"gt_row gt_left\">Pinanga philippinensis</td>\n <td class=\"gt_row gt_right\">0.70</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Straw</td>\n <td class=\"gt_row gt_right gt_striped\">22</td>\n <td class=\"gt_row gt_right gt_striped\">0.60</td>\n <td class=\"gt_row gt_right gt_striped\">3.17</td>\n <td class=\"gt_row gt_left gt_striped\">Daemonorops scapigera</td>\n <td class=\"gt_row gt_right gt_striped\">2.45</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Blue</td>\n <td class=\"gt_row 
19</td>">
gt_right\">19</td>\n <td class=\"gt_row gt_right\">0.47</td>\n <td class=\"gt_row gt_right\">2.80</td>\n <td class=\"gt_row gt_left\">Livistona saribus</td>\n <td class=\"gt_row gt_right\">1.50</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Cream</td>\n <td class=\"gt_row gt_right gt_striped\">11</td>\n <td class=\"gt_row gt_right gt_striped\">0.50</td>\n <td class=\"gt_row gt_right gt_striped\">1.30</td>\n <td class=\"gt_row gt_left gt_striped\">Calamus moti</td>\n <td class=\"gt_row gt_right gt_striped\">1.30</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left\">Grey</td>\n <td class=\"gt_row gt_right\">10</td>\n <td class=\"gt_row gt_right\">0.47</td>\n <td class=\"gt_row gt_right\">2.00</td>\n <td class=\"gt_row gt_left\">Calamus kiahii</td>\n <td class=\"gt_row gt_right\">2.00</td>\n </tr>\n <tr>\n <td class=\"gt_row gt_left gt_striped\">Ivory</td>\n <td class=\"gt_row gt_right gt_striped\">9</td>\n <td class=\"gt_row gt_right gt_striped\">0.60</td>\n <td class=\"gt_row gt_right gt_striped\">4.00</td>\n <td class=\"gt_row gt_left gt_striped\">Daemonorops oblata</td>\n <td class=\"gt_row gt_right gt_striped\">2.65</td>\n </tr>\n</tbody>\n <tfoot class=\"gt_sourcenotes\">\n \n <tr>\n <td class=\"gt_sourcenote\" colspan=\"6\">TidyTuesday: 2025, week 11 | PalmTraits 1.0 Database.</td>\n </tr>\n\n\n <tr>\n <td class=\"gt_sourcenote\" colspan=\"6\">Note, some species can have multiple fruit colors, and 758 species have no reported main fruit color.</td>\n </tr>\n\n</tfoot>\n\n</table>\n\n</div>\n \n```\n:::\n:::\n\n\nWe see that red is the most common fruit color among palm tree species and that ivory is the least common.\nBrown, black, and purple fruit share the largest maximum average fruit width, 20 cm.\nCream-colored fruit have the smallest maximum average fruit width, 1.3 cm.\nAlso, the species names sound like spells from Harry Potter.\n\nOverall, I suspect this table can be made nicer with additional styling, such as adding a border or rearranging columns, but 
this was only meant to be a quick analysis so I'll leave it here for now!\n\n",
"supporting": [
"index_files"
],
"filters": [],
"includes": {
"include-in-header": [
"<script src=\"https://cdnjs.cloudflare.com/ajax/libs/require.js/2.3.6/require.min.js\" integrity=\"sha512-c3Nl8+7g4LMSTdrm621y7kf9v3SDPnhxLNhcjFJbKECVnmZHTdo+IRO05sNLTH/D3vA6u1X32ehoLC7WFVdheg==\" crossorigin=\"anonymous\"></script>\n<script src=\"https://cdnjs.cloudflare.com/ajax/libs/jquery/3.5.1/jquery.min.js\" integrity=\"sha512-bLT0Qm9VnAYZDflyKcBaQ2gg0hSYNQrJ8RilYldYQ1FxQYoCLtUjuuRuZo+fjqhx/qtq/1itJ0C2ejDxltZVFg==\" crossorigin=\"anonymous\" data-relocate-top=\"true\"></script>\n<script type=\"application/javascript\">define('jquery', [],function() {return window.jQuery;})</script>\n"
]
}
}
}
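
The wrangling step in the frozen post (split multi-valued colors, explode to one row per color, then aggregate) can be tried on a small scale. The following is a minimal sketch, assuming a toy dataframe whose species names and widths are invented for illustration. Only the column names `spec_name`, `main_fruit_colors`, and `average_fruit_width_cm` come from the post; the shortened output names `min_width_cm` and `max_width_cm` are hypothetical. It also passes a lambda to `assign`, so the split is computed on the already-filtered frame instead of relying on index alignment with the original `df`.

```python
import pandas as pd

# Toy stand-in for palmtrees.csv; every value here is invented for illustration.
toy = pd.DataFrame(
    {
        "spec_name": ["Species A", "Species B", "Species C", "Species D"],
        "main_fruit_colors": ["red; black", "red", None, "black"],
        "average_fruit_width_cm": [1.0, 2.5, 4.0, 0.5],
    }
)

summary = (
    # Drop species with no reported main fruit color
    toy.dropna(subset=["main_fruit_colors"])
    # Split "red; black" into ["red", "black"] on the filtered frame
    .assign(fruit_color=lambda d: d["main_fruit_colors"].str.split("; "))
    # One row per (species, color) pair
    .explode("fruit_color")
    # Summary statistics by fruit color
    .groupby("fruit_color")
    .agg(
        n=("spec_name", "size"),
        min_width_cm=("average_fruit_width_cm", "min"),
        max_width_cm=("average_fruit_width_cm", "max"),
    )
    .reset_index()
    .sort_values("n", ascending=False)
)
print(summary)
```

Because a species with two colors contributes a row to each color group, the per-color counts can sum to more than the number of species, which is the caveat the table's source note mentions.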

posts/2025-03-22_palm-trees/image.png

105 KB