Description
Sorry to semi-revive #167, but I think mirroring DataFrame.interpolate() will help ease some pain points that multiple people have raised concerning NaNs, until we find a way to get their behavior in sync with commonly-accepted boolean logic. See #181 as an illustration for funky workarounds people have tried.
Pandas has DataFrame.interpolate(), which makes use of scipy's interpolation methods. Since these are tools from well-known libraries that are already dependencies of ours, we don't have to start from scratch; we can just port the functionality.
By implementing this, we can at least for now say something like "sorry, we're working on the NaNs, but here's some interpolation (or extrapolation) you can do until we figure it out."
Is your feature request aligned with the scope of the package?
Describe the solution you'd like, or your current workaround.
I'll illustrate here with a linear example, which is where we can start, but for the full description of where we can take this, see the Pandas docs and Scipy docs:
Linear, origin axis:
import chainladder as cl
import numpy as np

# 4x4 annual triangle with the first development period missing for 1985 and 1986
tri = cl.Triangle(
    data={
        'origin': [1985, 1985, 1985, 1985, 1986, 1986, 1986, 1987, 1987, 1988],
        'development': [1985, 1986, 1987, 1988, 1986, 1987, 1988, 1987, 1988, 1988],
        'paid': [np.nan, 600, 700, 800, np.nan, 1000, 1100, 1200, 1300, 1400]
    },
    origin='origin',
    development='development',
    columns=['paid'],
    cumulative=True
)
tri
          12      24      36      48
1985     NaN   600.0   700.0   800.0
1986     NaN  1000.0  1100.0     NaN
1987  1200.0  1300.0     NaN     NaN
1988  1400.0     NaN     NaN     NaN
tri.interpolate(method='linear', axis=3, extrapolate=True)
          12      24      36      48
1985   500.0   600.0   700.0   800.0
1986   900.0  1000.0  1100.0     NaN
1987  1200.0  1300.0     NaN     NaN
1988  1400.0     NaN     NaN     NaN
Linear, development axis:
tri.interpolate(method='linear', axis=4, extrapolate=True)
          12      24      36      48
1985   800.0   600.0   700.0   800.0
1986  1000.0  1000.0  1100.0     NaN
1987  1200.0  1300.0     NaN     NaN
1988  1400.0     NaN     NaN     NaN
From trying to make these examples, I would guess missing values in the earliest and latest origin periods would be the most common scenario. Filling those is technically extrapolation, but scipy uses the term interpolate to cover both interpolation and extrapolation.
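To make the two expected outputs above concrete, here is a minimal numpy-only sketch of the arithmetic. It is not chainladder code: `fill_linear` and the `valid` mask are hypothetical names made up for illustration, and the axis numbers are plain numpy axes on a 2D array (axis=1 fills within each origin row, axis=0 fills down each development column), not the `axis=3` / `axis=4` arguments proposed above. The mask is an assumption about how the real method would keep the unobserved lower-right cells as NaN.

```python
import numpy as np

def fill_linear(values, axis, valid):
    """Hypothetical helper: linearly interpolate/extrapolate NaNs along `axis`,
    filling only cells where `valid` is True."""
    out = np.array(values, dtype=float)  # copy so the input is left untouched
    valid = np.asarray(valid, dtype=bool)
    # Iterate over 1D views: rows when axis=1, columns when axis=0.
    lanes = out if axis == 1 else out.T
    masks = valid if axis == 1 else valid.T
    for lane, mask in zip(lanes, masks):
        known = ~np.isnan(lane)
        missing = np.isnan(lane) & mask
        if known.sum() < 2 or not missing.any():
            continue  # a line needs at least two known points
        xk = np.flatnonzero(known)    # positions with data
        xt = np.flatnonzero(missing)  # positions to fill
        # Pick the bracketing segment; outside the data, reuse the end segment.
        hi = np.clip(np.searchsorted(xk, xt), 1, len(xk) - 1)
        lo = hi - 1
        slope = (lane[xk[hi]] - lane[xk[lo]]) / (xk[hi] - xk[lo])
        lane[xt] = lane[xk[lo]] + slope * (xt - xk[lo])
    return out

nan = np.nan
paid = np.array([
    [nan,    600.0,  700.0,  800.0],
    [nan,    1000.0, 1100.0, nan],
    [1200.0, 1300.0, nan,    nan],
    [1400.0, nan,    nan,    nan],
])
# Assumed mask: only the observed upper-left part of the triangle may be filled.
valid = np.add.outer(np.arange(4), np.arange(4)) <= 3

within_origin = fill_linear(paid, axis=1, valid=valid)   # first expected table
across_origins = fill_linear(paid, axis=0, valid=valid)  # second expected table
```

Here `within_origin` fills the 12-month cells for 1985 and 1986 with 500.0 and 900.0, and `across_origins` fills them with 800.0 and 1000.0, matching the two tables. As far as I know, scipy's `interp1d(..., fill_value="extrapolate")` does the same end-segment extrapolation, so a real implementation could lean on that instead of the hand-rolled slope.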
Do you have any additional supporting notes?
Pandas DataFrame.interpolate() doesn't support extrapolation (at least not in a way that was obvious to me). I still think it would be useful to offer an augmented analogue that allows for extrapolation, though.
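For reference, a small sketch of what I mean, using the 1985 row from the example as a plain pandas Series (the variable names are only for illustration). With the default settings the leading NaN is left alone, and with `limit_direction="both"` it is filled with the nearest known value rather than a linearly extrapolated one.

```python
import numpy as np
import pandas as pd

# The 1985 origin row from the example triangle, indexed by development age.
row_1985 = pd.Series([np.nan, 600.0, 700.0, 800.0], index=[12, 24, 36, 48])

# Default linear interpolation only fills forward, so the leading NaN stays.
default_fill = row_1985.interpolate(method="linear")

# Allowing both directions fills the leading NaN, but with the first known
# value (600.0), not the 500.0 that a linear extrapolation would give.
both_fill = row_1985.interpolate(method="linear", limit_direction="both")
```

So to get 500.0 in that cell, the proposed method would need its own extrapolation step on top of what pandas does out of the box.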
Would you be willing to contribute this ticket?