[PAST EVENT] Jianing Zhao, Computer Science - Oral Preliminary Exam

October 24, 2017
1pm - 2:30pm
Location
ISC1 (Integrated Science Center), Room 1291
540 Landrum Dr
Williamsburg, VA 23185
Jianing Zhao

Abstract:

Causal inference from observational studies is an important research area in machine learning, data mining and artificial intelligence. As more and more data are generated and collected, we have great opportunities, and also face challenges, in performing causal inference on observational data. We focus on causal inference methods that can be applied to data sets of World Bank aid projects.

The World Bank provides billions of dollars in development finance to countries across the world every year. As many projects are related to the environment, we want to understand the impact of World Bank projects on forest cover. However, the global extent of these projects results in substantial heterogeneity in impacts due to geographic, cultural, and other factors. Recent research by Athey and Imbens has illustrated the potential of techniques that combine machine learning with causal inference and may be able to capture such heterogeneity. We apply their approach to a geolocated dataset of World Bank projects, and augment this data with satellite-retrieved characteristics of their geographic context (including temperature, precipitation, slope, distance to urban areas, and many others). We use this information in conjunction with causal tree (CT) and causal forest (CF) approaches to contrast 'control' and 'treatment' geographic locations and estimate the impact of World Bank projects on vegetative cover.

Quantifying the impact of an intervention or treatment in a real setting is a common and challenging problem. For example, we would like to calculate the environmental implications of aid projects in third world countries that target economic development. For causal inference problems of this kind, the Rubin causal model is one of several popular theoretical frameworks, and it comes with a set of algorithmic methods to quantify treatment effects. However, for a given data set, we neither know the ground truth nor can we easily increase the size of the data set. Simulation is therefore a natural choice for evaluating the applicability of a set of methods to a particular problem. In this paper, we report findings of a simulation study with four causal inference approaches, namely two single tree approaches (transformed outcome tree, causal tree) and two random forest versions of the former.
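
As a rough illustration of why simulation helps here, the following Python sketch generates synthetic data with a known treatment effect and checks whether a transformed outcome recovers it. The data-generating process, the fixed treatment probability of 0.5, the function names, and the single split at x = 0.5 are all assumptions made for illustration and are not taken from the study, which fits trees and forests rather than using a fixed split.

```python
import random


def simulate(n, seed=0):
    """Generate (x, w, y) rows with a known, heterogeneous treatment effect.

    Assumed setup: x is uniform on [0, 1], treatment w is assigned by a fair
    coin, and the true effect is 2.0 when x > 0.5 and 0.0 otherwise.
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.random()
        w = 1 if rng.random() < 0.5 else 0
        tau = 2.0 if x > 0.5 else 0.0
        y = x + tau * w + rng.gauss(0.0, 0.5)
        rows.append((x, w, y))
    return rows


def transformed_outcome(w, y, e=0.5):
    """Transformed outcome y * (w - e) / (e * (1 - e)).

    With a known treatment probability e, its conditional mean given x
    equals the conditional average treatment effect.
    """
    return y * (w - e) / (e * (1.0 - e))


def mean(values):
    return sum(values) / len(values)


rows = simulate(20000)

# Average the transformed outcome on each side of the assumed split.
est_low = mean([transformed_outcome(w, y) for x, w, y in rows if x <= 0.5])
est_high = mean([transformed_outcome(w, y) for x, w, y in rows if x > 0.5])

print(f"estimated effect for x <= 0.5: {est_low:.2f} (true value 0.0)")
print(f"estimated effect for x >  0.5: {est_high:.2f} (true value 2.0)")
```

Because the true effect is fixed by the simulation, the estimates can be compared against it directly, which is not possible with the observational data alone.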

Bio:
Jianing Zhao is a Ph.D. candidate at William & Mary, working with Dr. Peter Kemper. His research interests are in causal inference, data mining and machine learning. He received his B.S. degree in EE and a master's degree in Computer Engineering from China Agricultural University.


Contact

Vicki Dopp (vlthompsondopp)