Speaker: Qizhai Li (李启寨)
Venue: Lecture Hall 104, School of Mathematics and Statistics
Time: Thursday, November 22, 2018, 10:30-11:30
Abstract:
Recent advances in high-throughput biotechnologies have provided an unprecedented opportunity for biomarker discovery, which, from a statistical point of view, can be cast as a variable selection problem. This problem is challenging because omics data are high-dimensional and nonlinear, and in general it suffers from three difficulties: (i) the unknown functional form of the nonlinear system, (ii) variable selection consistency, and (iii) high computational demand. To circumvent the first difficulty, we employ a feed-forward neural network to approximate the unknown nonlinear function, motivated by the network's universal approximation ability. To circumvent the second difficulty, we conduct structure selection for the neural network, which induces variable selection, by choosing appropriate prior distributions that lead to consistency of variable selection. To circumvent the third difficulty, we implement the population stochastic approximation Monte Carlo algorithm, a parallel adaptive Markov chain Monte Carlo algorithm, on the OpenMP platform, which provides a speedup of the simulation that is linear in the number of cores of the computer. The numerical results indicate that the proposed method works very well for identifying relevant variables in high-dimensional nonlinear systems. The proposed method is successfully applied to identify genes associated with anticancer drug sensitivities, based on data collected in the Cancer Cell Line Encyclopedia study.
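To illustrate how structure selection for a network can induce variable selection, here is a minimal sketch. It is not the speaker's implementation: the one-hidden-layer architecture, the tanh activation, the names `forward`, `selected_variables`, and `gamma`, and the rule that an input counts as selected when at least one of its input-to-hidden connections is active are all assumptions made for illustration. The priors and the sampling algorithm mentioned in the abstract are not shown.

```python
import numpy as np


def forward(x, W1, b1, w2, b2, gamma):
    """One-hidden-layer feed-forward network with masked input connections.

    gamma is a 0/1 array with the same shape as W1 (hypothetical notation):
    gamma[j, k] = 1 keeps the connection from input j to hidden unit k,
    gamma[j, k] = 0 removes it from the network structure.
    """
    hidden = np.tanh(x @ (W1 * gamma) + b1)  # masked input-to-hidden layer
    return hidden @ w2 + b2                  # linear output layer


def selected_variables(gamma):
    """Indices of inputs that keep at least one connection to the hidden layer."""
    return np.flatnonzero(gamma.any(axis=1))


# Example structure: 5 candidate inputs, 3 hidden units.
gamma = np.zeros((5, 3), dtype=int)
gamma[0, 1] = 1   # input 0 feeds hidden unit 1
gamma[3, 0] = 1   # input 3 feeds hidden units 0 and 2
gamma[3, 2] = 1
print(selected_variables(gamma))
```

Under this assumed rule, an input whose connections are all removed cannot influence the network output, so choosing the connection structure also chooses the variables.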
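About the speaker:
Qizhai Li received his bachelor's degree from the University of Science and Technology of China in 2001 and his Ph.D. from the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, in 2006. From 2006 to 2009 he was a postdoctoral researcher at the National Cancer Institute of the U.S. National Institutes of Health. He has worked at the Academy of Mathematics and Systems Science, Chinese Academy of Sciences, since July 2006, as Assistant Professor from 2006 to 2010, Associate Professor from 2010 to 2015, and Professor since 2015. He has published, or had accepted for publication, more than 80 SCI-indexed papers in journals including Nature Genetics, the American Journal of Human Genetics, the Journal of the Royal Statistical Society: Series B, and the Journal of the American Statistical Association, and his work has been cited more than 1,600 times by international peers. His honors include the National Science Fund for Excellent Young Scholars, the Lu Jiaxi Young Talent Award of the Chinese Academy of Sciences, and the Outstanding Young Scholar Award of the China Society for Industrial and Applied Mathematics.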