Speaker: Cheng Yong Tang
Venue: Lecture Hall, 4th Floor, School of Mathematics and Statistics
Time: Monday, July 1, 2019, 10:30-11:30
Host:
Abstract:
It is well known that strong correlations between explanatory variables are problematic for high-dimensional regularized regression methods. When such correlations violate the irrepresentable condition, the popular LASSO method may suffer from false inclusions of non-contributing variables. In this paper, we propose preprocessing orthogonal decompositions (PROD) for the explanatory variables in high-dimensional regressions. The PROD procedure is built on a generic orthogonal decomposition of the design matrix. We investigate in detail three specific cases of the PROD: one by conventional principal component analysis, one by a novel optimization incorporating the impact of the response variable, and one by random projections. The PROD can be flexibly adapted to account for multiple objectives, such as alleviating strong correlations between the explanatory variables while avoiding an increase in the variance of the resulting estimator. Extensive numerical studies with simulations and data analysis show that the PROD improves the performance of high-dimensional penalized regression. Our theoretical analysis also confirms its effect and benefit for high-dimensional regularized regression methods.
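As a rough illustration of the principal-component case mentioned in the abstract, the sketch below splits a centered design matrix into a low-rank part spanned by its leading principal components and an orthogonal remainder, then compares column correlations before and after. This is only one plausible reading of "orthogonal decomposition of the design matrix": the function names, the simulated one-factor data, and the choice k=1 are assumptions made for illustration, and the abstract does not specify how the penalized regression is subsequently fitted.

```python
import numpy as np


def pca_orthogonal_decomposition(X, k):
    """Split the centered X into X_low + X_res using the top-k principal directions.

    Hypothetical helper for illustration; not taken from the talk.
    """
    Xc = X - X.mean(axis=0)                      # center each column
    U, s, Vt = np.linalg.svd(Xc, full_matrices=False)
    U_k = U[:, :k]                               # top-k left singular vectors
    X_low = U_k @ (U_k.T @ Xc)                   # projection onto their span
    X_res = Xc - X_low                           # orthogonal remainder
    return X_low, X_res


def mean_abs_offdiag_corr(X):
    """Average absolute correlation between distinct columns of X."""
    C = np.corrcoef(X, rowvar=False)
    mask = ~np.eye(C.shape[0], dtype=bool)       # drop the diagonal
    return np.abs(C[mask]).mean()


# Simulated design (an assumption): one shared factor makes all columns
# strongly correlated, with population correlation 0.8.
rng = np.random.default_rng(0)
n, p = 200, 20
factor = rng.normal(size=(n, 1))
X = 2.0 * factor @ np.ones((1, p)) + rng.normal(size=(n, p))

X_low, X_res = pca_orthogonal_decomposition(X, k=1)

print("mean |corr| in original design:", mean_abs_offdiag_corr(X))
print("mean |corr| after removing top component:", mean_abs_offdiag_corr(X_res))
```

In this toy setting the remainder X_res has much weaker column correlations than X, which is the kind of effect the preprocessing step is meant to achieve before a penalized regression is applied.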
About the speaker:
Dr. Cheng Yong Tang is Associate Professor of Statistics and the Seymour Wolfbein Senior Research Fellow at the Fox School of Business, Temple University. He is an Associate Editor for Reproducibility of the Journal of the American Statistical Association, Applications and Case Studies. Dr. Tang’s research covers topics in data science, finance, econometrics, survey sampling, and statistical learning. His research has been funded by the NSF and NIH. Dr. Tang is an Elected Member of the International Statistical Institute, a Fellow of the Royal Statistical Society, a member of the American Statistical Association, and a member of the Institute of Mathematical Statistics.