A&S Graduate Studies
[PAST EVENT] Qiong Wu, Computer Science - Dissertation Proposal
Abstract:
The big data era has witnessed an explosion in data collection. The trend is toward more observations and, even more dramatically, larger numbers of features: automatic and systematic collection of hyper-informative details about each observation. This phenomenon is characterized by high dimensionality. Owing to the "curse of dimensionality" and its negative impact on generalization performance, the high-dimensional setting carries a high risk of overfitting. In addition, high-dimensional data often possess a low signal-to-noise ratio, which makes the overfitting problem more severe.
To overcome these overfitting issues and produce non-trivial forecasts, we propose regularization techniques for linear and non-linear models with theoretical guarantees in the high-dimensional setting. Our regularization techniques can be directly applied to a wide range of applications, such as social networks and finance.
Provable regularization for linear models. We consider the linear regression problem, where existing low-rank regularization techniques are not optimized for the high-dimensional setting. We propose Adaptive-RRR, an efficient algorithm that involves only two SVDs, and we establish statistical guarantees on its performance. The algorithm decouples the problem by first estimating the precision matrix of the features and then solving a matrix denoising problem. To complement the upper bound, we introduce new techniques for establishing lower bounds on the performance of any algorithm for this problem. Our experimental results demonstrate the efficacy of our algorithm.
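To make the two-stage structure above concrete, the following is a minimal Python/NumPy sketch of a generic two-SVD estimator of this flavor: the first SVD whitens the features (standing in for the precision-matrix estimation step), and the second SVD denoises the resulting coefficient matrix by singular-value thresholding. The function name, the whitening shortcut, and the hard-thresholding rule are illustrative assumptions for exposition, not the Adaptive-RRR algorithm analyzed in the dissertation.

import numpy as np

def two_svd_rrr_sketch(X, Y, threshold):
    """Illustrative two-stage low-rank estimator (not Adaptive-RRR itself):
    (1) one SVD of X whitens the features, standing in for estimating
        the feature precision matrix;
    (2) a second SVD denoises the whitened coefficient matrix by keeping
        only singular values above a placeholder threshold."""
    n, p = X.shape

    # Stage 1: SVD of the design matrix (first SVD); U plays the role
    # of a whitened feature matrix.
    U, s, Vt = np.linalg.svd(X, full_matrices=False)

    # Stage 2: regress Y on the whitened features, then denoise the
    # coefficient matrix with a second SVD (matrix denoising step).
    Z = U.T @ Y / np.sqrt(n)
    U2, s2, Vt2 = np.linalg.svd(Z, full_matrices=False)
    s2_kept = np.where(s2 > threshold, s2, 0.0)   # placeholder hard threshold
    Z_denoised = U2 @ np.diag(s2_kept) @ Vt2

    # Map back to a coefficient matrix for the original features.
    return Vt.T @ np.diag(1.0 / s) @ Z_denoised * np.sqrt(n)

# Toy usage on synthetic low-rank data.
rng = np.random.default_rng(0)
n, p, q, r = 200, 50, 30, 3
X = rng.normal(size=(n, p))
B_true = rng.normal(size=(p, r)) @ rng.normal(size=(r, q))
Y = X @ B_true + rng.normal(scale=2.0, size=(n, q))
B_hat = two_svd_rrr_sketch(X, Y, threshold=2.0)
print(np.linalg.norm(B_hat - B_true) / np.linalg.norm(B_true))

With the threshold set to zero, the sketch reduces to an ordinary least-squares fit; the thresholding of the second SVD is what imposes the low-rank regularization.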
Integrating high-dimensional and machine learning techniques for non-linear models. We use high-dimensional kernel-based techniques to design a provable algorithm that is robust against overfitting. We further integrate our high-dimensional regularization with machine learning techniques such as deep learning, gradient boosting, and non-parametric methods to effectively extract non-linear signals. Extensive experiments demonstrate that all of these methods deliver significantly stronger forecasting power than recent state-of-the-art methods.
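As a generic illustration of how kernel-based regularization guards against overfitting when features are numerous and the signal is weak, here is a short kernel ridge regression sketch in Python/NumPy. The RBF kernel, the ridge penalty, and the parameter values are standard textbook choices used purely for illustration; they are not the specific provable algorithm proposed in the dissertation.

import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian (RBF) kernel matrix between rows of A and rows of B."""
    sq = (np.sum(A**2, axis=1)[:, None]
          + np.sum(B**2, axis=1)[None, :]
          - 2.0 * A @ B.T)
    return np.exp(-gamma * sq)

def kernel_ridge_fit(X, y, gamma, ridge):
    """Solve (K + ridge * I) alpha = y; the ridge term penalizes model
    complexity and limits overfitting in the high-dimensional setting."""
    K = rbf_kernel(X, X, gamma)
    return np.linalg.solve(K + ridge * np.eye(K.shape[0]), y)

def kernel_ridge_predict(X_train, X_new, alpha, gamma):
    """Predict by weighting kernel similarities to the training points."""
    return rbf_kernel(X_new, X_train, gamma) @ alpha

# Toy usage: many noisy features, a weak non-linear signal.
rng = np.random.default_rng(1)
n, p = 300, 200
X = rng.normal(size=(n, p))
y = np.sin(X[:, 0]) + 0.1 * rng.normal(size=n)   # low signal-to-noise ratio
alpha = kernel_ridge_fit(X[:200], y[:200], gamma=1.0 / p, ridge=1.0)
pred = kernel_ridge_predict(X[:200], X[200:], alpha, gamma=1.0 / p)
print(np.mean((pred - y[200:]) ** 2))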
Bio:
Qiong Wu is a Ph.D. candidate in the Department of Computer Science at William & Mary, advised by Prof. Zhenming Liu. Her research focuses on high-dimensional models for low signal-to-noise problems, with the goal of integrating high-dimensional regularization methods with deep learning, non-parametric techniques, and recommendation systems. Her Ph.D. research has been published in NeurIPS 2020, ICAIF 2020, AAAI 2019, and ICSC 2019. Her current and past efforts include collaborations with AT&T on customer care analysis and with the Alan Turing Institute on high-dimensional regularization methods and robust forecasting models for financial instruments. Before joining William & Mary, she received her B.Eng. degree from the Dalian University of Technology in 2014 and an M.Sc. degree from the University of Hong Kong in 2016.