fastcpd (Beta version) 2023
fastcpd implements a change-point analysis algorithm based on sequential gradient descent and a quasi-Newton method.
It can be applied to change-point detection in linear models, generalized linear models, robust regression, penalized regression, autoregressive models, etc.
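To make the change-point setting concrete, here is a minimal sketch of penalized change-point detection for shifts in the mean. It uses plain optimal partitioning with a squared-error segment cost and an exhaustive search over the last change point; it is a swapped-in textbook technique, not fastcpd's sequential gradient descent update, and the names detect_mean_changes, segment_cost, and penalty are hypothetical.

```python
def segment_cost(prefix, prefix_sq, i, j):
    """Sum of squared deviations from the segment mean for data[i:j]."""
    n = j - i
    s = prefix[j] - prefix[i]
    return (prefix_sq[j] - prefix_sq[i]) - s * s / n


def detect_mean_changes(data, penalty):
    """Return change-point positions minimizing total cost + penalty per segment.

    Assumption: each segment has a constant mean and is scored by squared
    error. A position k in the result means a new segment starts at data[k].
    """
    n = len(data)
    # Prefix sums let every segment cost be computed in O(1).
    prefix, prefix_sq = [0.0], [0.0]
    for x in data:
        prefix.append(prefix[-1] + x)
        prefix_sq.append(prefix_sq[-1] + x * x)

    best = [0.0] * (n + 1)   # best[t]: optimal penalized cost of data[:t]
    best[0] = -penalty       # so the first segment's penalty cancels out
    last = [0] * (n + 1)     # last[t]: start of the final segment of data[:t]
    for t in range(1, n + 1):
        best_val, best_s = None, 0
        for s in range(t):
            val = best[s] + segment_cost(prefix, prefix_sq, s, t) + penalty
            if best_val is None or val < best_val:
                best_val, best_s = val, s
        best[t], last[t] = best_val, best_s

    # Walk back through the stored segment starts.
    change_points = []
    t = n
    while last[t] > 0:
        change_points.append(last[t])
        t = last[t]
    return sorted(change_points)
```

The exhaustive inner loop costs O(n^2) overall, which is acceptable for illustration only. For models without a closed-form segment cost, each evaluation would require fitting the model on the segment, which is the expense that gradient-based updates aim to avoid.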
MicrobiomeStat: Statistical Methods for Microbiome
Compositional Data 2022
A suite of methods for powerful and robust microbiome data analysis
addressing zero-inflation, phylogenetic structure and compositional
effects (Zhou et al., 2021). The methods can be applied to the
analysis of other (high-dimensional) compositional data arising from
sequencing experiments.
TDFDR: Two-dimensional false discovery rate control for powerful confounder adjustment in omics association studies 2021
The package implements two-dimensional false discovery rate control for powerful confounder adjustment in omics association analysis. The method is based on the idea that the confounder(s) usually affect only part of the omics features, so adjusting for the confounder(s) across ALL omics features is over-adjustment and reduces statistical power. The proposed procedure starts with the unadjusted analysis (first dimension: filtering) to narrow down the list of omics features that are more likely to be affected by the confounder, the variable of interest, or both. In the second dimension, we conduct confounder-adjusted analysis on these 'top' candidates, which are enriched in signals, to reduce the multiple testing burden and increase power. The method belongs to the general topic of using auxiliary data to increase the power of multiple testing, which has recently received tremendous research interest. In our case, the auxiliary data are the unadjusted statistics, which can inform the probability of the null hypotheses being true. The difficulty is to take into account the correlation between the auxiliary data (unadjusted statistics) and the main data (adjusted statistics). We provide a procedure that is theoretically guaranteed to control the false discovery rate while maximizing power.
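The filter-then-test structure described above can be sketched in a few lines. This is a deliberately naive illustration: it filters on unadjusted p-values with a fixed threshold and then applies the standard Benjamini-Hochberg step-up procedure to the adjusted p-values of the surviving features. It ignores the correlation between unadjusted and adjusted statistics, so it does not carry the FDR guarantee of the package's procedure. The names two_stage_screen, benjamini_hochberg, and filter_threshold are hypothetical.

```python
def benjamini_hochberg(pvalues, alpha):
    """Return the indices rejected by the BH step-up procedure at level alpha."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    cutoff = 0
    # Find the largest rank k with p_(k) <= alpha * k / m.
    for rank, idx in enumerate(order, start=1):
        if pvalues[idx] <= alpha * rank / m:
            cutoff = rank
    return sorted(order[:cutoff])


def two_stage_screen(unadjusted_p, adjusted_p, filter_threshold, alpha):
    """Naive two-dimensional screen (assumption: fixed filter threshold).

    First dimension: keep features whose unadjusted p-value passes the filter.
    Second dimension: run BH on the confounder-adjusted p-values of the
    kept features only, which shrinks the multiple testing burden.
    """
    candidates = [i for i, p in enumerate(unadjusted_p) if p <= filter_threshold]
    rejected = benjamini_hochberg([adjusted_p[i] for i in candidates], alpha)
    return [candidates[k] for k in rejected]
```

Note that a feature with a small adjusted p-value is still dropped if it fails the first-dimension filter, which is why the choice of the two thresholds matters and why the package selects them jointly rather than fixing the filter in advance.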
OrderShapeEM 2021
OrderShapeEM implements the optimal false discovery rate (FDR)
control procedure with auxiliary information, particularly for prior
ordering information. The framework is based on local FDR with
hypothesis-specific null probability. The prior null probabilities
are estimated using isotonic regression (PAVA algorithm) with
respect to the prior ordering information. The inputs of our
OrderShapeEM are simply P-values and their prior ordering.
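As an illustration of the isotonic regression step, the following is a minimal sketch of the pool-adjacent-violators algorithm (PAVA) for a non-decreasing fit. It shows only the monotone smoothing along a given ordering, not the EM procedure or the local FDR computation of the package. The function name pava_nondecreasing and the idea of feeding it raw per-hypothesis estimates sorted by the prior ordering are assumptions made for this example.

```python
def pava_nondecreasing(values, weights=None):
    """Weighted least-squares non-decreasing fit via pool-adjacent-violators.

    `values` must already be sorted by the prior ordering (assumption:
    hypotheses later in the ordering should have larger fitted values).
    """
    n = len(values)
    if weights is None:
        weights = [1.0] * n

    blocks = []  # each block is [weighted mean, total weight, item count]
    for v, w in zip(values, weights):
        blocks.append([float(v), float(w), 1])
        # Merge backwards while the last two blocks violate monotonicity.
        while len(blocks) > 1 and blocks[-2][0] > blocks[-1][0]:
            m2, w2, c2 = blocks.pop()
            m1, w1, c1 = blocks.pop()
            total = w1 + w2
            blocks.append([(m1 * w1 + m2 * w2) / total, total, c1 + c2])

    # Expand pooled blocks back to one fitted value per input.
    fitted = []
    for mean, _, count in blocks:
        fitted.extend([mean] * count)
    return fitted
```

For example, pava_nondecreasing([1, 3, 2, 4]) pools the violating pair (3, 2) into their mean, giving [1.0, 2.5, 2.5, 4.0].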
CAMT 2020
The CAMT package implements two covariate adaptive multiple testing
procedures (FDR and FWER) described in "Covariate Adaptive False
Discovery Rate Control with Applications to Omics-Wide Multiple
Testing" and "Covariate Adaptive Family-wise Error Control with
Applications to Genome-wide Association Studies". CAMT allows the
prior null probability and/or the alternative distribution to depend
on covariates. It is robust to model mis-specification and is
computationally efficient. The package also contains functions for
testing the informativeness of the covariates for multiple testing,
and a comprehensive simulation function, which covers a wide range
of settings.
GUniFrac: Generalized UniFrac Distances,
Distance-Based Multivariate Methods and Feature-Based Univariate
Methods for Microbiome Data Analysis
2021
A suite of methods for powerful and robust microbiome data analysis
including data normalization, data simulation, community-level
association testing and differential abundance analysis. It
implements generalized UniFrac distances, Geometric Mean of Pairwise
Ratios (GMPR) normalization, a semiparametric data simulator,
distance-based statistical methods, and feature-based statistical
methods. The distance-based statistical methods include three
extensions of PERMANOVA: (1) PERMANOVA using the Freedman-Lane
permutation scheme, (2) PERMANOVA omnibus test using multiple
matrices, and (3) an analytical approach to approximating the PERMANOVA
p-value. Feature-based statistical methods include linear
model-based methods for differential abundance analysis of
zero-inflated high-dimensional compositional data.
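For readers unfamiliar with PERMANOVA, here is a minimal sketch of the basic one-factor test on a distance matrix: the pseudo-F statistic is computed from squared pairwise distances and its p-value is obtained by permuting group labels. This is the plain label-permutation version only. It does not implement the Freedman-Lane scheme, the omnibus test, or the analytical p-value approximation listed above, and the names pseudo_f and permanova are hypothetical.

```python
import random


def pseudo_f(dist, labels):
    """One-factor PERMANOVA pseudo-F from a symmetric distance matrix."""
    n = len(labels)
    groups = sorted(set(labels))
    a = len(groups)
    # Total sum of squares from all pairwise squared distances.
    ss_total = sum(dist[i][j] ** 2 for i in range(n) for j in range(i + 1, n)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for g in groups:
        members = [i for i in range(n) if labels[i] == g]
        ss = sum(dist[i][j] ** 2
                 for k, i in enumerate(members) for j in members[k + 1:])
        ss_within += ss / len(members)
    ss_between = ss_total - ss_within
    return (ss_between / (a - 1)) / (ss_within / (n - a))


def permanova(dist, labels, n_perm=999, seed=0):
    """Return (observed pseudo-F, permutation p-value)."""
    rng = random.Random(seed)
    observed = pseudo_f(dist, labels)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # exchange labels under the null hypothesis
        if pseudo_f(dist, shuffled) >= observed:
            hits += 1
    # Add one to numerator and denominator so the p-value is never zero.
    return observed, (hits + 1) / (n_perm + 1)
```

The sketch assumes at least two groups, more samples than groups, and non-zero within-group variation; a production implementation would guard against those degenerate cases.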
jdcov 2019
jdcov computes joint distance covariance (JdCov) among more than two
random vectors of arbitrary dimensions (see Chakraborty and Zhang,
2019) and implements a bootstrap-based test for joint independence
among the random vectors based on JdCov.
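As background for the joint measure, the sketch below computes the classical squared sample distance covariance between two scalar samples using double-centered distance matrices. This is only the pairwise building block, not the JdCov estimator for more than two vectors and not the package's bootstrap test; the function names centered_distances and dcov_sq are hypothetical.

```python
def centered_distances(x):
    """Double-centered matrix of pairwise absolute differences."""
    n = len(x)
    d = [[abs(x[i] - x[j]) for j in range(n)] for i in range(n)]
    # The matrix is symmetric, so row means equal column means.
    row_mean = [sum(r) / n for r in d]
    grand_mean = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def dcov_sq(x, y):
    """Squared sample distance covariance (V-statistic form) of two samples."""
    n = len(x)
    a = centered_distances(x)
    b = centered_distances(y)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / (n * n)
```

The statistic is zero when one sample is constant and positive when the samples are dependent, for instance when y equals x. Extending to vectors would only require swapping the absolute difference for a Euclidean norm.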
SILM: Simultaneous Inference for Linear Models
2019
Simultaneous inference procedures for high-dimensional linear models
as described by Zhang and Cheng (2017).