Di Chang
COR: An R package for Optimal Subset Selection in Distributed Estimation
Di Chang
Guangbao Guo

and 2 more

May 29, 2025
In distributed regression, selecting an optimal subset that eliminates redundant information is crucial for model performance. Distributed data subsets often face several challenges at once, including outliers, high variability, duplicated data, an excess of independent variables, and point redundancy. Managing and reducing this redundancy effectively is an important way to mitigate inconsistencies in statistical inference. In this paper, we present the R package COR, which implements optimal subset selection with respect to the covariance matrix, observation matrix, and response vector (COR), and also estimates the optimal subset length. We describe the implementation of the COR package and demonstrate its superior performance through a series of simulation studies and real-world applications ranging from low to high dimensions, including the estate and riboflavin datasets.
