[PAST EVENT] System Reliability and Data Movement Challenges at Scale: Challenges, Insights, and Opportunities

February 12, 2016
8am - 9am
McGlothlin-Street Hall, Room 020
251 Jamestown Rd
Williamsburg, VA 23185

Devesh Tiwari, Oak Ridge National Lab


Continued increases in computing power and faster storage subsystems have enabled scientists to expedite the process of scientific discovery and helped businesses increase profit on investments. As a result, emerging data-intensive workloads (e.g., scientific simulations and data analytics) are able to continuously produce and analyze unprecedented amounts of data on large-scale computing systems. However, these trends have aggravated system reliability and data movement challenges for data-centric computing systems, significantly limiting system efficiency. Current strategies for mitigating reliability and data movement challenges will be far from optimal for future computing systems because of their high performance and I/O overheads.

In this talk, I will discuss my research addressing these challenges. I will present evidence that provides new insights and challenges conventional wisdom about system characteristics, workload behavior, and the interactions between them. I will share novel techniques that exploit these characteristics and interactions to improve the overall efficiency of large-scale systems. Finally, I will discuss the challenges in efficient and reliable management of future heterogeneous data-centric computing systems, and possible approaches toward mitigating these challenges.


Devesh Tiwari is a Staff Scientist at the Oak Ridge National Laboratory. His primary research interests are in large-scale data management and improving the efficiency of large-scale computing systems. His papers have received best paper award nominations at conferences including Supercomputing (SC), DSN, and IPDPS. His work has appeared in various conferences such as USENIX FAST, SC, DSN, HPCA, IPDPS, ICAC, and LCTES, and has been covered by media outlets including Slashdot and HPCwire. Devesh earned his Ph.D. degree in Electrical and Computer Engineering from North Carolina State University. Before that, he obtained his B.S. degree in Computer Science and Engineering from the Indian Institute of Technology (IIT) Kanpur, India.