Open
Labels: Enhancement, Reshaping (Concat, Merge/Join, Stack/Unstack, Explode)
Description
`merge_asof` with interpolation would be a new feature. When you merge `df1` and `df2`, the merged columns of `df2` would not be the exact "asof" values from `df2`, but interpolated values (for example, in a linearly time-weighted fashion).
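To make "linearly time-weighted" concrete, here is a minimal sketch of the pointwise calculation. The helper name `time_weighted` and the sample values are made up for illustration and are not part of pandas or of this proposal.

```python
import pandas as pd


def time_weighted(t, t0, y0, t1, y1):
    """Linearly interpolate between (t0, y0) and (t1, y1) at time t."""
    # Timedelta / Timedelta gives a plain float between 0 and 1.
    weight = (t - t0) / (t1 - t0)
    return y0 + weight * (y1 - y0)


# Illustrative observations on the right-hand side, 2 seconds apart.
t0 = pd.Timestamp("2018-07-23 09:00:00.5")
t1 = pd.Timestamp("2018-07-23 09:00:02.5")

# The left-hand timestamp sits a quarter of the way from t0 to t1.
value = time_weighted(pd.Timestamp("2018-07-23 09:00:01"), t0, 1.0, t1, 5.0)
print(value)  # 2.0
```

A plain "asof" lookup would return 1.0 here (the last known value), whereas the interpolated lookup moves a quarter of the way towards 5.0.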
Define
import numpy as np
import pandas as pd

np.random.seed(0)
start = pd.Timestamp("2018-07-23 09:00:00")
df1 = pd.DataFrame(np.random.normal(size=5), index=pd.date_range(start, periods=5, freq="s"), columns=["something"])
df2 = pd.DataFrame(np.random.normal(size=5), index=pd.date_range(start, periods=5, freq="s") + pd.Timedelta(seconds=0.5), columns=["something_else"])
# Drop the third row so df2 has a gap (DataFrame.append was removed in pandas 2.0).
df2 = pd.concat([df2.iloc[:2], df2.iloc[3:]])
`df1`:
| | something |
|---|---|
| 2018-07-23 09:00:00 | 1.764052 |
| 2018-07-23 09:00:01 | 0.400157 |
| 2018-07-23 09:00:02 | 0.978738 |
| 2018-07-23 09:00:03 | 2.240893 |
| 2018-07-23 09:00:04 | 1.867558 |
`df2`:
| | something_else |
|---|---|
| 2018-07-23 09:00:00.500 | -0.977278 |
| 2018-07-23 09:00:01.500 | 0.950088 |
| 2018-07-23 09:00:03.500 | -0.103219 |
| 2018-07-23 09:00:04.500 | 0.410599 |
This is how `merge_asof` works:
pd.merge_asof(df1, df2, left_index=True, right_index=True)
Returns:
| | something | something_else |
|---|---|---|
| 2018-07-23 09:00:00 | 1.764052 | NaN |
| 2018-07-23 09:00:01 | 0.400157 | -0.977278 |
| 2018-07-23 09:00:02 | 0.978738 | 0.950088 |
| 2018-07-23 09:00:03 | 2.240893 | 0.950088 |
| 2018-07-23 09:00:04 | 1.867558 | -0.103219 |
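For comparison, the existing matching behaviour can be sketched on a small frame with round numbers. The values below are illustrative only; `direction` is an existing `merge_asof` parameter, and the default `"backward"` picks the last right-hand row at or before each left-hand timestamp.

```python
import pandas as pd

# One shared half-second grid, so both indexes have the same dtype.
grid = pd.date_range("2018-07-23 09:00:00", periods=10, freq="500ms")

# Left frame on whole seconds; right frame on half seconds, with 09:00:02.5 missing.
left = pd.DataFrame({"something": [1.0, 2.0, 3.0, 4.0, 5.0]}, index=grid[::2])
right = pd.DataFrame(
    {"something_else": [10.0, 20.0, 40.0, 50.0]}, index=grid[1::2].delete(2)
)

# Default: take the most recent right-hand value (NaN if there is none yet).
backward = pd.merge_asof(left, right, left_index=True, right_index=True)

# Alternative: take the next right-hand value instead.
forward = pd.merge_asof(
    left, right, left_index=True, right_index=True, direction="forward"
)

print(backward["something_else"].tolist())  # [nan, 10.0, 20.0, 20.0, 40.0]
print(forward["something_else"].tolist())   # [10.0, 20.0, 40.0, 40.0, 50.0]
```

In both directions the result is always one of the observed right-hand values, never a blend of the two neighbours, which is the gap the proposed interpolation would fill.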
And so
pd.merge_asof(df1, df2, left_index=True, right_index=True, method="linear")
would return
| | something | something_else |
|---|---|---|
| 2018-07-23 09:00:00 | 1.764052 | NaN |
| 2018-07-23 09:00:01 | 0.400157 | -0.013595 |
| 2018-07-23 09:00:02 | 0.978738 | 0.598986 |
| 2018-07-23 09:00:03 | 2.240893 | 0.247884 |
| 2018-07-23 09:00:04 | 1.867558 | 0.153690 |
`method="last"` could reproduce the default behaviour.
This could also be reproduced by doing these operations manually:
pd.merge(df1, df2, left_index=True, right_index=True, how="outer").interpolate("linear").loc[df1.index]
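The manual recipe can be wrapped in a small helper, sketched below. `merge_asof_interpolated` is a hypothetical name, not an existing pandas function, and the frames are rebuilt from the rounded values printed above. One detail worth noting: `interpolate("linear")` ignores the index and treats rows as equally spaced, which is what produces the `something_else` values in the table above, while `interpolate("time")` weights by the actual timestamps.

```python
import pandas as pd


def merge_asof_interpolated(left, right, method="time"):
    """Hypothetical helper: join `right` onto `left`'s index, interpolating
    `right`'s columns rather than taking the last known value."""
    union = left.index.union(right.index)
    # limit_area="inside" fills only gaps between observations, so rows
    # before the first (or after the last) right-hand value stay NaN.
    filled = right.reindex(union).interpolate(method=method, limit_area="inside")
    return left.join(filled.reindex(left.index))


# Rebuild df1 and df2 from the rounded values shown in the tables.
grid = pd.date_range("2018-07-23 09:00:00", periods=10, freq="500ms")
df1 = pd.DataFrame(
    {"something": [1.764052, 0.400157, 0.978738, 2.240893, 1.867558]},
    index=grid[::2],
)
df2 = pd.DataFrame(
    {"something_else": [-0.977278, 0.950088, -0.103219, 0.410599]},
    index=grid[1::2].delete(2),
)

by_position = merge_asof_interpolated(df1, df2, method="linear")
by_time = merge_asof_interpolated(df1, df2, method="time")
print(by_position)
print(by_time)
```

The two results agree at 09:00:01 and 09:00:04, where the neighbouring `df2` rows are evenly placed around the `df1` timestamp, and differ at 09:00:02 and 09:00:03, which fall inside the two-second gap in `df2`. Which of the two a `method="linear"` option should mean is a design choice for the proposal.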