| Title: | GAM-based Recursive Partitioning |
|---|---|
| Description: | Recursively partitions the observations in a dataset, based on the parameters of a generalized additive model (GAM). Function splinetree implements partitioning based on parametric or non-penalized spline models. It offers faster computational speed and more flexibility for modeling multilevel and longitudinal data. Function gamtree implements partitioning based on semi-parametric or penalized spline models, a.k.a. smoothing splines. It requires less user involvement in correctly specifying the splines and knots, but has a heavier computational load and offers less flexibility in modeling multilevel and longitudinal data. Packages mgcv and splines are used for spline and model estimation, packages partykit and merDeriv are used for partitioning and derivative computations. |
| Authors: | Marjolein Fokkema [aut, cre] |
| Maintainer: | Marjolein Fokkema <[email protected]> |
| License: | GPL-2 \| GPL-3 |
| Version: | 0.0.3 |
| Built: | 2026-05-17 08:01:10 UTC |
| Source: | https://github.com/marjoleinF/gamtree |
coef.gamtree extracts fixed- or random-effects coefficients from
a GAM tree.
## S3 method for class 'gamtree'
coef(object, which = "fixed", ...)
object |
an object of class |
which |
character. Either |
... |
further arguments to be passed to |
The data comprises light-response curves, which describe the relationship between photosynthetically
active radiation (PAR) and photosynthetic rate (Pn).
Observations are repeated measures on the same plants. Variable 'Specimen' provides an identifier
for individual plants. Dataset was copied from
https://stackoverflow.com/questions/37037445/using-mob-trees-partykit-package-with-nls-model
To illustrate the functionality of package gamtree, two variables
were added: Specimen, which identifies individual plants, and noise,
an artificially generated continuous covariate that is pure (Gaussian) noise.
data(eco)
## 'eco'
A data.frame with 628 observations of 5 variables:
Species |
categorical partitioning variable, indicator for species (signal). |
PAR |
continuous predictor variable for the node-specific model (signal). |
Pn |
continuous response variable. |
Specimen |
identification number of individual plants. |
noise |
artificially generated continuous covariate (noise). |
https://stackoverflow.com/questions/37037445/using-mob-trees-partykit-package-with-nls-model
summary(eco)
fitted.gamm4 extracts fitted values from objects of class gamm4.
## S3 method for class 'gamm4'
fitted(object, ...)
object |
an object of class |
... |
currently not used. |
fixef.gamtree extracts fixed-effects coefficients from a GAM tree.
## S3 method for class 'gamtree'
fixef(object, ...)
object |
an object of class |
... |
further arguments to be passed to |
gamtree recursively partitions a dataset into subgroups with
penalized GAMs, characterized by differences in the parameter estimates.
gamtree(
  formula, data, weights = NULL, REML = TRUE, method = "mob",
  cluster = NULL, offset = NULL, verbose = FALSE, parm = c(1, 2, 4),
  gam_ctrl = list(), tree_ctrl = list(), alt_formula = NULL, ...
)
formula |
specifies the model formula, consisting of three
parts: the response variable followed by a tilde ('~'); the terms for the
node-specific GAMs, followed by a vertical bar ('|') and the potential
partitioning variables (separated by a '+').
The 'by' argument of function
|
data |
|
weights |
numeric vector of length |
REML |
logical, defaults to |
method |
character, one of |
cluster |
optional, a name referring to a column of |
offset |
numeric vector of length |
verbose |
logical. Should progress be printed to the command line in every iteration? If TRUE, the iteration number, information on the splitting procedure, and the log-likelihood (with df) value of the fitted full mixed-effects GAM are printed. |
parm |
vector of one or more integers, indicating which parameters should be
included in the parameter stability tests. The default |
gam_ctrl |
a list of fit control parameters to replace defaults returned by
|
tree_ctrl |
a |
alt_formula |
list with two elements, for specifying non-standard model formulae
for GAM. E.g., the formula list required for use of the |
... |
additional arguments to be passed to function |
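The three-part structure described for argument formula can be inspected with base R alone. The sketch below only illustrates how R parses such a formula; it is not how gamtree processes the formula internally, and the object names (f, node_terms, part_vars, and so on) are illustrative. It uses noise as a second partitioning variable purely as an example.

```r
# Base-R sketch: split a gamtree-style formula into its three parts.
# Assumption: this mirrors the documented formula layout only; gamtree's own
# formula handling may differ.
f <- Pn ~ s(PAR, k = 5L) | Species + noise

response  <- f[[2]]    # part 1: response, left of '~'
rhs       <- f[[3]]    # everything right of '~', a call to `|`
node_part <- rhs[[2]]  # part 2: terms of the node-specific GAM, left of '|'
part_part <- rhs[[3]]  # part 3: potential partitioning variables, right of '|'

# The top-level operator on the right-hand side is the vertical bar.
stopifnot(identical(rhs[[1]], as.name("|")))

response_name <- deparse(response)
node_terms    <- deparse(node_part)
part_vars     <- all.vars(part_part)

response_name
node_terms
part_vars
```

Because '~' binds less tightly than '|', and '|' less tightly than '+', no parentheses are needed to keep the partitioning variables together.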
MOB is short for model-based recursive
partitioning; ctree is short for conditional inference tree. MOB is
based more strongly on parametric theory, which allows clustering structures
to be included easily in the estimation procedure (see also argument
cluster). This yields an approach similar to GEE-type estimation for
multilevel and longitudinal data structures. However, computation time for MOB is much
longer than for ctree, mostly because of how MOB searches for
the optimal splitting value after the splitting variable
has been selected. ctree uses tests based on permutation theory
and thereby offers a less parametrically oriented approach. It is much
faster than MOB, but does not provide a natural way of accounting
for multilevel or longitudinal data structures.
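The difference in computation time between the two methods can be checked on one's own data. The following is a minimal sketch that reuses the two calls from the Examples and wraps them in system.time(); the object names has_gamtree, time_mob and time_ctree are illustrative, and the block is skipped when package gamtree is not installed. Timings depend on the machine and the data, so no expected output is shown.

```r
# Sketch: compare elapsed fitting time for method = "mob" and method = "ctree".
# Assumption: package gamtree (with dataset eco) is installed; otherwise skip.
has_gamtree <- requireNamespace("gamtree", quietly = TRUE)

if (has_gamtree) {
  library("gamtree")
  data(eco)

  # MOB, accounting for repeated measures on the same plants via 'cluster'
  time_mob <- system.time(
    gt_m <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, cluster = Specimen)
  )

  # ctree, which does not account for the clustering
  time_ctree <- system.time(
    gt_c <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, method = "ctree")
  )

  # Elapsed wall-clock time in seconds for each method
  print(c(mob = time_mob[["elapsed"]], ctree = time_ctree[["elapsed"]]))
}
```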
Returns an object of class "gamtree". This is a list containing
(among others) the GAM-based recursive partition (in $tree).
The following methods are available to extract information from the fitted object:
predict.gamtree, for obtaining predicted values for training and new
observations; plot.gamtree, for plotting the tree and variables' effects;
coef.gamtree, fixef.gamtree and ranef.gamtree,
for extracting estimated coefficients; VarCorr.gamtree, for extracting
random-effects (co)variances; and summary.gamtree, for a summary of the
fitted models.
predict.gamtree plot.gamtree
coef.gamtree summary.gamtree
gt_m <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, cluster = Specimen)
summary(gt_m)
gt_c <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, method = "ctree")
summary(gt_c)
Takes a fitted GAM tree and plots the smooth functions fitted in each of the terminal nodes of the tree.
## S3 method for class 'gamtree'
plot(
  x, which = "both", ylim = "firstnode",
  treeplot_ctrl = list(), gamplot_ctrl = list(), ...
)
x |
object of class |
which |
character. The default ( |
ylim |
|
treeplot_ctrl |
list of (named) arguments to be passed to
|
gamplot_ctrl |
list of (named) arguments to be passed to
|
... |
further arguments, currently not used. |
By default, the plotted terms also include confidence bands. These should be interpreted with considerable caution: they do NOT account for the search for the tree structure, but assume the tree structure was known in advance, and are therefore overly optimistic.
gt <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, cluster = Specimen)
plot(gt, which = "tree") # default is which = 'both'
plot(gt, which = "terms")
plot.splinetree takes a fitted (g)lmertree with splines and plots it.
It is a wrapper for plot.glmertree and
plot.lmertree, with critical adjustments for
better visualization of spline models in the terminal nodes.
## S3 method for class 'splinetree'
plot(x, which = "all", fitted = "marginal", ...)
x |
fitted object of class |
which |
character, "both", "tree" or "ranef". Other options are available, see
|
fitted |
character, "marginal" (default), "combined" or "none".
Specifies whether and how fitted values should be computed and visualized.
See |
... |
additional arguments to be passed to |
setup.spline predict.splinetree
plot.lmertree plot.glmertree
st <- splinetree(Pn ~ ns(PAR, df = 5) | Specimen | Species, data = eco, cluster = Specimen)
plot(st)
predict.gamm4 extracts predictions from objects of class gamm4.
## S3 method for class 'gamm4'
predict(object, newdata, ...)
object |
an object of class |
newdata |
an optional |
... |
currently not used. |
Takes a fitted GAM tree (of class "gamtree") and returns
predictions given a new set of values for the model covariates,
or for the original covariate values used for fitting the GAM tree.
## S3 method for class 'gamtree'
predict(object, newdata = NULL, type = "link", ...)
object |
an object of class |
newdata |
a |
type |
character vector of length 1, specifying the type of prediction
to be returned. |
... |
further arguments to be passed to |
Returns a vector of predicted values.
predict.splinetree computes predictions for a fitted
(g)lmertree that is based on splines.
## S3 method for class 'splinetree'
predict(object, newdata, ...)
object |
fitted object of class |
newdata |
|
... |
additional arguments to be passed to |
setup.spline predict.splinetree
lmertree glmertree
st <- splinetree(Pn ~ ns(PAR, df = 5) | Specimen | Species, data = eco, cluster = Specimen)
predict(st, newdata = eco[1L:5L, ])
Prints the local and/or global terms in a fitted GAM tree.
## S3 method for class 'gamtree'
print(x, ...)
x |
object of class |
... |
further arguments to be passed to |
gt <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, cluster = Specimen)
gt ## or: print(gt)
ranef.gamtree extracts random-effects coefficients from a GAM tree.
## S3 method for class 'gamtree'
ranef(object, ...)
object |
an object of class |
... |
further arguments to be passed to |
setup.spline takes a dataset and spline specification as input, and returns the
dataset with spline bases added.
setup.spline(spline, data, ...)
spline |
character vector of length 1, describing the spline basis to be created.
Currently, functions |
data |
a |
... |
additional arguments to be passed to the function specified in |
a data.frame with as many rows and columns as data. The spline
consists of df basis functions, but is contained in a single column named
spline, followed by the name of the predictor variable specified.
plot.splinetree predict.splinetree
lmertree glmertree
data <- setup.spline("ns(PAR, df = 3)", data = eco)
head(data)
matplot(x = data$PAR[order(data$PAR)], y = data$spline.PAR[order(data$PAR),], type = "l")
data <- setup.spline("bs(PAR, degree = 2, df = 4)", data = eco)
head(data)
matplot(x = data$PAR[order(data$PAR)], y = data$spline.PAR[order(data$PAR),], type = "l")
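The return value of setup.spline stores a multi-column spline basis in a single data.frame column. The sketch below shows how such a matrix column behaves in base R, using splines::ns on simulated data. It is not the implementation of setup.spline; the simulated data.frame dat and its value range are assumptions made for illustration, and the column name spline.PAR follows the naming used in the examples above.

```r
library("splines")  # ships with R

# Simulated predictor values (assumption: stand-in for the PAR variable in eco)
set.seed(1)
dat <- data.frame(PAR = sort(runif(50, min = 0, max = 2000)))

# Store the whole natural-spline basis as ONE data.frame column (a matrix column)
dat$spline.PAR <- ns(dat$PAR, df = 3)

dim(dat)             # the data.frame itself has only two columns
dim(dat$spline.PAR)  # the single column holds one basis function per df

# The basis functions can be plotted directly from the matrix column
matplot(x = dat$PAR, y = dat$spline.PAR, type = "l")
```

Keeping the basis in one matrix column means the spline can be referred to by a single name in a model formula, while all basis functions still enter the model.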
splinetree is a wrapper for functions (g)lmertree to simplify
fitting, visualizing and predicting spline-based trees.
splinetree(formula, data, family = "gaussian", ...)
formula |
A four-part formula. See Examples below, and
|
data |
a |
family |
family specification for |
... |
additional arguments to be passed to function |
An object of class "splinetree" and "lmertree" or "glmertree".
plot.splinetree predict.splinetree
lmertree glmertree
sp <- splinetree(Pn ~ ns(PAR, df = 5) | Specimen | Species, data = eco)
sp
Prints a summary of the local and/or global terms in a fitted GAM tree.
## S3 method for class 'gamtree'
summary(object, ...)
object |
object of class |
... |
further arguments to be passed to |
By default, the printed results also include standard errors and significance tests. These should be interpreted with considerable caution: they do NOT account for the search for the tree structure, but assume the tree structure was known in advance, and are therefore overly optimistic.
## GAM tree without global terms:
gt <- gamtree(Pn ~ s(PAR, k = 5L) | Species, data = eco, cluster = Specimen)
summary(gt)
VarCorr.gamtree extracts random-effects covariance
matrices from the nodes of a GAM tree.
## S3 method for class 'gamtree'
VarCorr(x, sigma = 1, which = "terminal", ...)
x |
an object of class |
sigma |
an optional numeric value used as a multiplier for the standard deviations. |
which |
character. |
... |
additional arguments to be passed to |