Fix splines and add followup_splines_df parameter #47

Merged
remlapmot merged 7 commits into main from devel-2026-05-05
May 5, 2026

@remlapmot
Contributor

The fixes here for the splines are much smaller.

This PR:

  • Run autoformat.yml workflow on any branch beginning devel
  • Pre-compute spline knots from full dataset to ensure consistency across bootstrap replications
  • Replace squared spline term with the spline basis itself
  • Add followup_spline_df parameter to control spline degrees of freedom
  • Add Modelling Followup with a Natural Cubic Spline section to docs/vignettes/more_advanced_models.md

(Had to modify expected values of a test.)

remlapmot and others added 7 commits May 5, 2026 10:42
…oss bootstraps

`cr(followup, df=3)` in patsy computes interior knot positions from the quantiles of the training data. When bootstrapping, each replicate is fit on a resampled dataset that may have a different followup distribution, causing the spline basis to differ between the original model and bootstrap models, and between bootstrap replicates. Fix this by computing knot positions (interior knots, lower bound, upper bound) once from the full unexpanded dataset at the start of the original fit, storing them on the object, and embedding them as literals in the formula string passed to patsy for all subsequent fits (bootstrap replicates included). This mirrors the fix applied to the R package SEQTaRget.
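
The effect described above can be reproduced in isolation. The following is a minimal sketch, not the package's code: the synthetic continuous follow-up values, the helper `basis_on_grid`, and the choice of tertile knots are all assumptions made for illustration. It learns the spline from the full data and from one bootstrap resample, then evaluates both on a common grid.

```python
import numpy as np
from patsy import build_design_matrices, dmatrix

rng = np.random.default_rng(0)
# Synthetic, continuous follow-up times (illustrative data only).
followup = rng.uniform(0.0, 60.0, size=500)
boot = rng.choice(followup, size=followup.size, replace=True)
grid = {"followup": np.linspace(10.0, 50.0, 5)}


def basis_on_grid(formula, train):
    """Learn the spline from `train`, then evaluate it on the common grid."""
    info = dmatrix(formula, {"followup": train}).design_info
    return np.asarray(build_design_matrices([info], grid)[0])


# Knots chosen by patsy from whichever data set is passed in.
data_driven = "cr(followup, df=3) - 1"
drift = not np.allclose(
    basis_on_grid(data_driven, followup), basis_on_grid(data_driven, boot)
)

# Knots and bounds computed once from the full data and embedded as literals.
# (Placing the inner knots at the tertiles is an assumption of this sketch.)
inner = np.quantile(followup, [1 / 3, 2 / 3]).tolist()
lo, hi = float(followup.min()), float(followup.max())
fixed = f"cr(followup, knots={inner}, lower_bound={lo}, upper_bound={hi}) - 1"
stable = np.allclose(basis_on_grid(fixed, followup), basis_on_grid(fixed, boot))

print(drift, stable)
```

With `df=` the basis depends on the data used to learn the knots, whereas the formula with literal `knots`, `lower_bound`, and `upper_bound` yields the same basis whichever resample it is fit on.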
`I(cr(followup, df=3)**2)` was being appended to the formula to represent the non-linear followup effect, but this squares the spline basis matrix element-wise rather than including the spline basis directly. When no treatment-by-followup interaction term is present (e.g. hazard estimation or km_curves=False), this meant the model contained only a squared spline with no main spline effect, completely misrepresenting the followup relationship. Replace the squared term with the spline basis itself. When an interaction term such as `treatment*cr(followup, ...)` is also present, patsy's formula expansion already includes the spline as a main effect and deduplicates it.

Update expected test coefficients accordingly.
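
Both points in this commit message can be checked directly in patsy. This is a small sketch on synthetic data (the variable values and names are illustrative assumptions, not taken from the package or its tests): it compares the squared term with the plain basis, and then compares the terms patsy keeps when the spline is written both inside an interaction and again as a main effect.

```python
import numpy as np
from patsy import dmatrix

# Synthetic data, for illustration only.
followup = np.linspace(0.0, 36.0, 50)
treatment = np.tile([0, 1], 25)
data = {"followup": followup, "treatment": treatment}

basis = np.asarray(dmatrix("cr(followup, df=4) - 1", data))
squared = np.asarray(dmatrix("I(cr(followup, df=4)**2) - 1", data))

# The I(...**2) term is the basis squared entry by entry, not the basis itself.
squares_elementwise = np.allclose(squared, basis**2)
same_as_basis = np.allclose(squared, basis)

# Writing the spline twice does not add a second copy of the term.
main_only = dmatrix("treatment * cr(followup, df=4)", data)
with_repeat = dmatrix(
    "treatment * cr(followup, df=4) + cr(followup, df=4)", data
)
same_terms = (
    main_only.design_info.term_names == with_repeat.design_info.term_names
)

print(squares_elementwise, same_as_basis, same_terms)
print(main_only.design_info.term_names)
```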
Add `followup_spline_df` (default 4) to `SEQopts` to allow users to control the number of degrees of freedom for the natural cubic spline fit to followup, matching the `followup.spline.df` parameter introduced in the R package SEQTaRget. The default changes from the previously hardcoded value of 3 to 4, consistent with R. A validation check ensures the value is at least 2, which is the minimum supported by patsy's natural cubic spline implementation. Update expected test coefficients accordingly.
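
As a rough sketch of the validation described in this commit message: the class `SplineOptions` and its `followup_term` method are hypothetical stand-ins written for this example, not the actual `SEQopts` implementation. Only the default of 4 and the minimum of 2 come from the message above.

```python
from dataclasses import dataclass

# Minimum stated in the commit message for patsy's natural cubic spline.
MIN_SPLINE_DF = 2


@dataclass
class SplineOptions:
    """Hypothetical stand-in for the spline-related part of the options."""

    followup_spline_df: int = 4  # default stated in the commit message

    def __post_init__(self):
        df = self.followup_spline_df
        if not isinstance(df, int) or df < MIN_SPLINE_DF:
            raise ValueError(
                f"followup_spline_df must be an integer >= {MIN_SPLINE_DF}, got {df!r}"
            )

    def followup_term(self) -> str:
        """Return the formula term for the follow-up spline (illustrative)."""
        return f"cr(followup, df={self.followup_spline_df})"


print(SplineOptions().followup_term())
print(SplineOptions(followup_spline_df=3).followup_term())
```

Constructing `SplineOptions(followup_spline_df=1)` raises `ValueError` in this sketch, so an invalid value is rejected before any formula is built.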
@remlapmot remlapmot requested a review from ryan-odea May 5, 2026 10:43
Collaborator

@ryan-odea ryan-odea left a comment

These look good! Thanks for tackling this. I got a bit busy last week and this week finishing up some other projects.

@remlapmot remlapmot merged commit 8c1782b into main May 5, 2026
5 checks passed
@remlapmot remlapmot deleted the devel-2026-05-05 branch May 5, 2026 12:52