int64 summation fails #1545

Closed
ghost opened this issue Mar 11, 2024 · 3 comments

@ghost

ghost commented Mar 11, 2024

Problem: 4 + 3 + 4 = 5.43e-323

Reprex:

tibble::tribble(
              ~ group, ~ `1`, ~ `2`, ~ `3`,
              "A", bit64::as.integer64(4), bit64::as.integer64(3), bit64::as.integer64(4)
            ) %>%
    dplyr::group_by(group) %>%
    tidyr::nest() %>% # creates 'data' column
    dplyr::mutate(
             x = purrr::map(data, function(x) {list("sum"=sum(unlist(x)))})
           ) %>%
    tidyr::unnest_wider(c(data, x))
@munoztd0

The issue you're experiencing is due to the use of the base R sum function with integer64 objects from the bit64 package. The base R sum function does not handle integer64 objects correctly, which is causing the incorrect result.

You should use the bit64::sum.integer64 function instead, which is designed to correctly handle integer64 objects. Here's the corrected code:

tibble::tribble(
  ~ group, ~ `1`, ~ `2`, ~ `3`,
  "A", bit64::as.integer64(4), bit64::as.integer64(3), bit64::as.integer64(4)
) |>
  dplyr::group_by(group) |>
  tidyr::nest() |> # creates 'data' column
  dplyr::mutate(
    x = purrr::map(data, function(x) {list("sum"=bit64::sum.integer64(unlist(x)))})
  ) |>
  tidyr::unnest_wider(c(data, x))
#> # A tibble: 1 × 5
#> # Groups:   group [1]
#>   group     `1`     `2`     `3`     sum
#>   <chr> <int64> <int64> <int64> <int64>
#> 1 A           4       3       4      11

This should resolve the issue and give you the correct sum for your integer64 values.

@hadley
Member

hadley commented Mar 14, 2024

Technically it's the unlist causing the problem, not sum.
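
A minimal sketch of why that happens, assuming the bit64 package is installed. bit64 stores each integer64 value in the bits of a double and relies on the class attribute to interpret it; unlist() drops that attribute, so sum() adds the raw doubles. The variable names here are illustrative only.

```r
# Assumes the bit64 package is installed; names are illustrative.
x <- list(
  a = bit64::as.integer64(4),
  b = bit64::as.integer64(3),
  c = bit64::as.integer64(4)
)

flat <- unlist(x)
class(flat)         # the integer64 class is gone; this is a plain double vector

total <- sum(flat)  # base sum() now adds the raw doubles

# The integer n is stored as the double with bit pattern n,
# which is n times the smallest positive denormal, 2^-1074.
identical(total, 11 * 2^-1074)

# The bits are still correct; only the class was lost.
class(total) <- "integer64"
total

# Combining with c() instead of unlist() keeps the class,
# so sum() dispatches to the integer64 method.
kept <- sum(do.call(c, unname(x)))
kept
```

This would also explain why calling the integer64 method directly on the unlisted values appears to work: the method reads the underlying bits as 64-bit integers regardless of the missing class.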

Fortunately, there's an easier way to tackle this with dplyr: use c_across():

library(dplyr, warn.conflicts = FALSE)

df <- tribble(
  ~ group, ~ `1`, ~ `2`, ~ `3`,
  "A", bit64::as.integer64(4), bit64::as.integer64(3), bit64::as.integer64(4)
)

df |> mutate(sum = sum(c_across("1":"3")))
#> # A tibble: 1 × 5
#>   group     `1`     `2`     `3`     sum
#>   <chr> <int64> <int64> <int64> <int64>
#> 1 A           4       3       4      11

Created on 2024-03-14 with reprex v2.1.0

I had actually forgotten how to do this, and found what I needed in https://dplyr.tidyverse.org/articles/rowwise.html#per-row-summary-statistics.
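
The example above has a single row. For data with several rows, a hedged sketch following the linked article is to add rowwise() so that c_across() is evaluated once per row. The second row and the name `res` are made up for illustration, and this again assumes bit64 is installed.

```r
library(dplyr, warn.conflicts = FALSE)

# Hypothetical two-row version of the data from the issue.
df <- tibble(
  group = c("A", "B"),
  `1` = bit64::as.integer64(c(4, 10)),
  `2` = bit64::as.integer64(c(3, 20)),
  `3` = bit64::as.integer64(c(4, 30))
)

res <- df |>
  rowwise() |>                            # evaluate the mutate() once per row
  mutate(sum = sum(c_across(`1`:`3`))) |> # c_across() keeps the integer64 class
  ungroup()

res
```

Without rowwise(), c_across() would combine every row of the selected columns into a single vector, so each row would receive the grand total instead of its own sum.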

@hadley hadley closed this as completed Mar 14, 2024
@ghost
Author

ghost commented Mar 14, 2024

@munoztd0 @hadley Thank you both for explaining the problem and solution!
