| Title: | Post-Selection Inference via Simultaneous Confidence Intervals |
|---|---|
| Description: | Post-selection inference in linear regression models, constructing simultaneous confidence intervals across a user-specified universe of models. Implements the methodology described in Kuchibhotla, Kolassa, and Kuffner (2022) "Post-Selection Inference" <doi:10.1146/annurev-statistics-100421-044639> to ensure valid inference after model selection, with applications in high-dimensional settings like Lasso selection. |
| Authors: | Henry Chukwuma [aut, cre] |
| Maintainer: | Henry Chukwuma <[email protected]> |
| License: | MIT + file LICENSE |
| Version: | 0.1.2 |
| Built: | 2026-06-09 09:59:56 UTC |
| Source: | https://github.com/chukyhenry/posir |
Visualizes confidence intervals returned by simultaneous_ci() using base R graphics.
Estimates are shown as points with corresponding CI segments, grouped and labeled by
model and coefficient name. Supports customization for log scale, character sizes,
label trimming, and reference lines.
## S3 method for class 'simultaneous_ci_result'
plot(
  x, y = NULL, subset_pars = NULL, log.scale = FALSE,
  cex = 0.8, cex.labels = 0.8, las.labels = 1, pch = 16,
  col.estimate = "blue", col.ci = "darkgray", col.ref = "red",
  ref.line.pos = 0, lty.ref = 2,
  main = "Simultaneous Confidence Intervals",
  xlab = NULL, label.trim = NULL, ...
)
| Argument | Description |
|---|---|
| x | An object of class simultaneous_ci_result, as returned by simultaneous_ci(). |
| y | Ignored. |
| subset_pars | Optional character vector of coefficient names used to subset the plot. Default: all coefficients. |
| log.scale | Logical. Plot on a logarithmic scale. Intervals crossing 0 or with nonpositive bounds are excluded. |
| cex | Point size for estimates. Default = 0.8. |
| cex.labels | Label size for the y-axis. Default = 0.8. |
| las.labels | Orientation of y-axis labels (0, 1, 2, or 3). Default = 1. |
| pch | Plot character for point estimates. Default = 16. |
| col.estimate | Color of point estimates. Default = "blue". |
| col.ci | Color of confidence interval lines. Default = "darkgray". |
| col.ref | Color of reference line(s). Default = "red". |
| ref.line.pos | Position(s) of vertical reference line(s). Default = 0. Set to NULL to omit. |
| lty.ref | Line type for reference lines. Default = 2 (dashed). |
| main | Plot title. Default = "Simultaneous Confidence Intervals". |
| xlab | X-axis label. If NULL and |
| label.trim | Optional integer. Trims long coefficient labels to this width and appends "...". |
| ... | Additional arguments reserved for future use (currently ignored). |
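The log.scale and label.trim rules described in the arguments above can be illustrated with a few lines of base R. This is only a minimal sketch, not the package's internal code: the data frame `ci`, its column names (`label`, `estimate`, `lower`, `upper`), and the helper `trim_label()` are hypothetical, and the exact trimming rule used by the package may differ.

```r
# Hypothetical interval table; the column names are assumptions for illustration.
ci <- data.frame(
  label    = c("mod: (Intercept)", "mod: X1", "mod: a_very_long_coefficient_name"),
  estimate = c(1.10, 0.45, 2.30),
  lower    = c(0.80, -0.10, 1.50),
  upper    = c(1.40, 1.00, 3.10),
  stringsAsFactors = FALSE
)

# On a log scale only strictly positive intervals can be drawn, so intervals
# that cross 0 or have a nonpositive bound are dropped.
keep <- ci$lower > 0 & ci$upper > 0
ci_log <- ci[keep, ]

# Hypothetical helper: shorten labels longer than `width` and append "...".
trim_label <- function(x, width) {
  too_long <- nchar(x) > width
  x[too_long] <- paste0(substr(x[too_long], 1, width), "...")
  x
}

ci_log$label <- trim_label(ci_log$label, width = 12)
print(ci_log)
```

In this toy table the X1 interval spans 0, so it would not appear on a log-scale plot, while the two remaining labels are shortened to 12 characters plus the "..." suffix.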
Invisibly returns a list:
ycoords: Named vector of y-axis positions for each label
xlim: Range of x-axis limits used
ylim: Range of y-axis limits used
If no valid intervals are available for plotting, returns invisible(NULL).
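To make the layout and the returned list concrete, here is a rough base R sketch of a plot of this kind. It is an illustration under assumptions, not the method's actual source: the function `draw_ci()`, the data frame `ci`, and its columns are hypothetical, and only standard graphics calls (`plot`, `segments`, `points`, `abline`, `axis`) are used.

```r
# Hypothetical interval table (labels, estimates, and bounds are made up).
ci <- data.frame(
  label    = c("mod: (Intercept)", "mod: X1", "mod: X2"),
  estimate = c(1.0, 0.9, -1.1),
  lower    = c(0.7, 0.6, -1.4),
  upper    = c(1.3, 1.2, -0.8),
  stringsAsFactors = FALSE
)

# Hypothetical helper mimicking the documented behavior: one row per
# coefficient, a segment for the interval, a point for the estimate.
draw_ci <- function(ci, ref.line.pos = 0) {
  ycoords <- setNames(rev(seq_len(nrow(ci))), ci$label)  # first label on top
  xlim <- range(c(ci$lower, ci$upper, ref.line.pos))
  ylim <- c(0.5, nrow(ci) + 0.5)
  op <- par(mar = c(4, 8, 3, 1))                         # room for labels
  on.exit(par(op))
  plot(NA, xlim = xlim, ylim = ylim, yaxt = "n",
       xlab = "Estimate", ylab = "",
       main = "Simultaneous Confidence Intervals")
  segments(ci$lower, ycoords, ci$upper, ycoords, col = "darkgray")
  points(ci$estimate, ycoords, pch = 16, col = "blue", cex = 0.8)
  if (!is.null(ref.line.pos)) abline(v = ref.line.pos, col = "red", lty = 2)
  axis(2, at = ycoords, labels = names(ycoords), las = 1, cex.axis = 0.8)
  invisible(list(ycoords = ycoords, xlim = xlim, ylim = ylim))
}

pdf(NULL)                 # null device so the sketch runs non-interactively
out <- draw_ci(ci)
invisible(dev.off())
str(out)
```

Returning the y positions invisibly, as the documented method does, lets a caller add annotations (for example with `text()`) at the row of a given label while the device is still open.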
set.seed(1)
X <- matrix(rnorm(100*2), 100, 2, dimnames = list(NULL, c("X1", "X2")))
y <- 1 + X[,1] - X[,2] + rnorm(100)
res <- simultaneous_ci(X, y, list(mod = 1:3), B = 100, add_intercept = TRUE)
plot(res)
Implements Algorithm 1 from the reference paper using bootstrap-based max-t statistics to construct valid simultaneous confidence intervals for selected regression coefficients across a user-specified universe of linear models.
simultaneous_ci(
  X, y, Q_universe,
  alpha = 0.05, B = 1000, add_intercept = TRUE,
  bootstrap_method = "pairs", cores = 1, use_pbapply = TRUE,
  seed = NULL, verbose = TRUE, ...
)
| Argument | Description |
|---|---|
| X | Numeric matrix (n x p): design matrix. Must have unique column names. Do not include an intercept if add_intercept = TRUE. |
| y | Numeric vector (length n): response vector. |
| Q_universe | Named list of numeric vectors. Each element specifies a model as a vector of column indices (accounting for the intercept if add_intercept = TRUE). |
| alpha | Significance level for the confidence intervals. Default is 0.05. |
| B | Integer. Number of bootstrap samples. Default is 1000. |
| add_intercept | Logical. If TRUE, adds an intercept as the first column of the design matrix. Default is TRUE. |
| bootstrap_method | Character. Bootstrap type. Only "pairs" is currently supported. |
| cores | Integer. Number of CPU cores to use for bootstrap parallelization. Default is 1. |
| use_pbapply | Logical. Use |
| seed | Optional numeric. Random seed for reproducibility. Used for parallel-safe RNG. |
| verbose | Logical. Whether to display status messages. Default is TRUE. |
| ... | Reserved for future use. |
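Because the intercept is added as the first column, indices in Q_universe refer to positions in the augmented design matrix rather than in X itself. The sketch below shows that bookkeeping in base R; it assumes the intercept occupies column 1 and the original columns shift by one, and the names `X_full` and `model_terms` are hypothetical.

```r
set.seed(1)
X <- matrix(rnorm(20 * 3), 20, 3, dimnames = list(NULL, c("X1", "X2", "X3")))

# Assumed layout with add_intercept = TRUE: the intercept is column 1,
# so the original columns of X move to positions 2, 3, and 4.
X_full <- cbind("(Intercept)" = 1, X)

# A universe of two candidate models, written as column indices of X_full.
Q_universe <- list(
  small = c(1, 2),        # intercept + X1
  full  = c(1, 2, 3, 4)   # intercept + X1 + X2 + X3
)

# Translate each index vector back to column names as a sanity check.
model_terms <- lapply(Q_universe, function(idx) colnames(X_full)[idx])
print(model_terms)
```

Checking the names this way before running the bootstrap is a cheap guard against off-by-one mistakes in the model universe.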
Supports parallel execution, internal warnings capture, and returns structured results with estimates, intervals, bootstrap diagnostics, and inference statistics.
A list of class simultaneous_ci_result with elements:
intervals: Data frame with estimates, confidence intervals, variances, and SEs
K_alpha: Bootstrap (1 - alpha) quantile of max-t statistics
T_star_b: Vector of bootstrap max-t statistics
n_valid_T_star_b: Number of finite bootstrap max-t statistics
alpha, B, bootstrap_method: Metadata
warnings_list: Internal warnings collected during bootstrap/model fitting
valid_bootstrap_counts: Valid bootstrap replicates per parameter
n_bootstrap_errors: Total bootstrap fitting errors
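The roles of K_alpha and T_star_b can be illustrated with a simplified pairs-bootstrap max-t construction in base R. This is a sketch under stated assumptions, not the package's implementation: it uses model-based OLS standard errors, centers each bootstrap estimate at the original estimate, studentizes by the bootstrap standard error, and forms intervals as estimate plus or minus K_alpha times SE. The helper `fit_model()` is hypothetical, and the package may use different variance estimates or error handling.

```r
set.seed(42)
n <- 100
X <- cbind("(Intercept)" = 1,
           matrix(rnorm(n * 2), n, 2, dimnames = list(NULL, c("X1", "X2"))))
y <- 1 + 0.5 * X[, "X1"] + rnorm(n)

# Universe of models as column indices of X (the intercept is column 1).
Q_universe <- list(m1 = c(1, 2), m2 = c(1, 2, 3))

# Hypothetical helper: OLS estimates and model-based SEs for one model.
fit_model <- function(X, y, idx) {
  Xm <- X[, idx, drop = FALSE]
  XtX_inv <- solve(crossprod(Xm))
  beta <- drop(XtX_inv %*% crossprod(Xm, y))
  resid <- y - drop(Xm %*% beta)
  sigma2 <- sum(resid^2) / (nrow(Xm) - ncol(Xm))
  list(beta = beta, se = sqrt(sigma2 * diag(XtX_inv)))
}

# Estimates and SEs for every coefficient in every model of the universe.
fits <- lapply(Q_universe, function(idx) fit_model(X, y, idx))
beta_hat <- unlist(lapply(fits, `[[`, "beta"))
se_hat <- unlist(lapply(fits, `[[`, "se"))

B <- 200
alpha <- 0.05
T_star_b <- replicate(B, {
  rows <- sample.int(n, n, replace = TRUE)  # pairs bootstrap: resample (x_i, y_i)
  boot <- lapply(Q_universe, function(idx) {
    fit_model(X[rows, , drop = FALSE], y[rows], idx)
  })
  beta_b <- unlist(lapply(boot, `[[`, "beta"))
  se_b <- unlist(lapply(boot, `[[`, "se"))
  max(abs(beta_b - beta_hat) / se_b)        # max-t over all models and terms
})

# Simultaneous critical value and the resulting intervals.
K_alpha <- unname(quantile(T_star_b, 1 - alpha))
intervals <- data.frame(
  estimate = beta_hat,
  lower = beta_hat - K_alpha * se_hat,
  upper = beta_hat + K_alpha * se_hat
)
print(round(intervals, 3))
```

Since the maximum is taken over every coefficient in every model, a single K_alpha is shared by all intervals, which is what makes the coverage simultaneous across the universe.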
Kuchibhotla, A., Kolassa, J., & Kuffner, T. (2022). Post-selection inference. Annual Review of Statistics and Its Application, 9(1), 505–527.
set.seed(123)
X <- matrix(rnorm(100 * 2), 100, 2, dimnames = list(NULL, c("X1", "X2")))
y <- X[,1] * 0.5 + rnorm(100)
Q <- list(model = 1:2)
res <- simultaneous_ci(X, y, Q, B = 100, cores = 1)
print(res$intervals)
plot(res)