unexpected result with groupby apply on categorical data #9603

@tmoroz

Description

import pandas

df = pandas.DataFrame({'a': [1, 0, 0, 0]})
df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).apply(lambda x: len(x))

Out[3]:
a
(0, 1]    1
(1, 2]    0
(2, 3]    0
(3, 4]    3
dtype: int64

The length of the final group, (3, 4], should be 0, not 3: the three 0 values fall outside every bin, since (0, 1] excludes 0.
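
To make the expected counts concrete, here is a minimal sketch that tallies the bins without going through groupby at all. It reads the category codes produced by `pandas.cut` and counts them with NumPy; the variable names (`bins`, `codes`, `expected`) are illustrative and not from the original report.

```python
import numpy as np
import pandas

df = pandas.DataFrame({'a': [1, 0, 0, 0]})
bins = pandas.cut(df.a, [0, 1, 2, 3, 4])

# Intervals are right-closed, so 0 lies outside (0, 1] and is binned as NaN,
# which the categorical stores as code -1.
codes = bins.cat.codes.to_numpy()
n_bins = len(bins.cat.categories)

# Count only the rows that landed in a real bin; minlength keeps empty bins.
expected = np.bincount(codes[codes >= 0], minlength=n_bins)
print(expected.tolist())
```

Only the value 1 lands in a bin, so every bin other than (0, 1] should report a length of 0.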

Activity

added this to the 0.16.1 milestone on Mar 6, 2015
jreback (Contributor) commented on Mar 6, 2015

I recall another, similar issue (but can't find it at the moment).

You can use these instead (they are much faster anyhow):

In [6]: df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).size()
Out[6]: 
a
(0, 1]     1
(1, 2]   NaN
(2, 3]   NaN
(3, 4]   NaN
dtype: float64

In [7]: df.groupby(pandas.cut(df.a, [0, 1, 2, 3, 4])).count()
Out[7]: 
        a
a        
(0, 1]  1
(1, 2]  0
(2, 3]  0
(3, 4]  0
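
For anyone re-checking this on a newer pandas, the following sketch runs the same two workarounds and asserts that empty bins come back as 0. It assumes a release where `groupby` accepts the `observed` keyword (not available in the version discussed in this thread); passing `observed=False` explicitly asks for unobserved categories to be kept. The outputs quoted above come from the original session and may differ from what a current release prints.

```python
import pandas

df = pandas.DataFrame({'a': [1, 0, 0, 0]})
bins = pandas.cut(df.a, [0, 1, 2, 3, 4])

# observed=False keeps bins that contain no rows in the result.
grouped = df.groupby(bins, observed=False)

sizes = grouped.size()          # rows per bin
counts = grouped["a"].count()   # non-null values of column a per bin

# The three 0 values fall outside every bin, so only (0, 1] is non-empty.
assert sizes.tolist() == [1, 0, 0, 0]
assert counts.tolist() == [1, 0, 0, 0]
```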

jreback (Contributor) commented on Apr 29, 2015

closed by #10014

          unexpected result with groupby apply on categorical data · Issue #9603 · pandas-dev/pandas