USRC Students and Postdocs

This page lists all current and past students who have worked at the USRC.

Doug Otstott,

PhD Student, School of Computing & Information Sciences, Florida International University


dotst001@fiu.edu
Doug is a PhD student at Florida International University's School of Computing and Information Sciences in Miami. He began his graduate studies there after graduating in the summer of 2011 and receiving the CS program's award for Outstanding Graduate.

While in Miami, Doug works at the lab for Virtualized Infrastructure, Software and Applications at FIU, doing research on caching for large-scale systems and scheduling optimizations for SSDs.

During his time at USRC, Doug will be working to extend a preexisting project, "Transparently Consistent Asynchronous Shared Memory", by exploring applications in checkpointing and in-situ data analysis.


Dr. Sean Williams,

Postdoctoral Researcher, New Mexico Consortium


swilliams@newmexicoconsortium.org
Sean Williams is a postdoctoral researcher with the New Mexico Consortium. Williams is working to create a unified abstraction for complex memory systems, along with a suite of libraries to automate the placement of allocations on memory devices.

Kristina Frye,

PhD Student, Portland State University


kfrye@pdx.edu
Kristina Frye has a BA in Mathematics from Reed College and is currently a PhD student at Portland State University.

Frye's research is in the area of interference and performance variability on HPC systems. At the USRC, Frye works on application performance characterization through application monitoring.


Atanu Barai,

Master's Student, New Mexico State University


atanu@nmsu.edu
Atanu Barai is a first-year master's student at New Mexico State University. Barai is a member of the Performance Evaluation and Architecture Laboratory (PEARL) at NMSU and works with Dr. Badawy. His research interests lie in the general area of computer architecture and performance analysis.

At the USRC, Barai is working on the Performance Prediction Toolkit, a parameterized hardware/software co-design framework based on discrete-event simulation.


Isaiah Liberda,

Student, USRC


isaiahliberda@gmail.com
Isaiah Liberda is a high school student who works in the USRC machine room and completes much-needed tasks such as organizing and labeling cords and running fibers to different machines.

Yehia Arafa,

Master's Student, Klipsch School of Electrical and Computer Engineering


yarafa@nmsu.edu
Yehia Arafa is currently a master's student at the Klipsch School of Electrical and Computer Engineering, New Mexico State University. Yehia received his bachelor's degree from the Department of Computer and Communications, Faculty of Engineering, Alexandria University, Egypt, in 2017. His research interests include high performance computing (HPC), distributed systems and big data, performance analysis and prediction, and software/hardware co-design.

At USRC, Yehia is working on performance prediction for supercomputers by contributing to and validating the hardware model of the Performance Prediction Toolkit (PPT), an open source tool developed at Los Alamos National Lab (LANL), and by improving its prediction accuracy.


Michael Sevilla,

PhD Student, University of California, Santa Cruz


msevilla@ucsc.edu
Michael evaluates and designs distributed file system metadata management systems. His publications prototype ideas on CephFS, the file system that uses the Ceph distributed object store. His lab also has a special interest in storage system programmability and reproducibility in systems research. At USRC, Michael is working with Brad Settlemyer on load balancing policies for HXHIM, an HPC key-value store.

Alexandra DeLucia,

Undergraduate Student, Rollins College


aadelucia@gmail.com
Alexandra is working on a bachelor's degree in Computer Science and Mathematics at Rollins College. At the USRC she is creating a model of system logs from high performance computers. The model will later be used in anomaly detection.

Michael Lane,

Post Bachelor, Los Alamos National Laboratory


mlane@lanl.gov
Michael Lane is researching techniques to make exact computations on fault-prone hardware.

Kai Wu,

PhD Student, University of California, Merced


kwu42@ucmerced.edu
Kai is currently a Computer Science PhD student in EECS at the University of California, Merced. Before coming to UC Merced, he received his master's degree in Computer Science and Engineering from Michigan State University in 2016. His research falls broadly within High Performance Computing (large-scale parallel systems). Specifically, he focuses on the following areas: (i) parallel programming models and runtimes; (ii) performance optimization and modeling; (iii) resilience and consistency; (iv) non-volatile memory; (v) fault tolerance in extreme-scale parallel systems. At USRC, Kai is working on building fault models for serial codes and predicting faults in parallel codes.

Olena Tkachenko,

Post Bachelor, Los Alamos National Laboratory


otkac001@fiu.edu
Olena graduated with a B.S. in Computer Science from FIU's School of Computing and Information Sciences in Miami. At FIU she did research at the VISA lab as a URA working on masquerading network traffic for Mission Critical Cloud Computing, and isolation benchmarking of containers. While working at LANL as a PostBac she designed an application model (IMCSim) of the implicit Monte Carlo particle code IMC using the Performance Prediction Toolkit (PPT), a discrete-event simulation-based modeling framework for predicting code performance on a large range of parallel platforms. At USRC she is currently working on predicting DRAM fault locations in HPC systems using structured learning and various ML techniques. Her research interests include HPC, ML, and fault prediction/mitigation.

Jian Peng,

PhD Student, Computer Science, Illinois Institute of Technology


jpeng10@hawk.iit.edu
I'm a second-year PhD student majoring in Computer Science at Illinois Institute of Technology (IIT), advised by Dr. Ioan Raicu. My research interest is HPC.

Currently at USRC I'm working on burst buffer simulation in Dragonfly networks. The goal is to develop a simulator that models supercomputers with a Dragonfly network and a burst buffer storage architecture. With such a simulator, we will be able to carry out more research on problems such as system bottlenecks and burst-buffer-related scheduling.


Nathaniel Graham,

Graduate Student, Computer Science, UNM


ngraham@lanl.gov
Nathaniel Graham earned his bachelor's degree in Computer Science from Eastern Washington University and is currently pursuing a master's degree in Computer Science at Georgia Tech.

At the USRC, Nathaniel works on networking software. He provides bug fixes, documentation, and enhancements for Open MPI, as well as some work with UCX. He is also the primary maintainer of the Open MPI Java bindings.


Abida Haque,

PhD Student, Computer Science, North Carolina State University


haqueabida@gmail.com
Abida will be a PhD student at North Carolina State University in computer science. She has a bachelor's degree in mathematics from Carnegie Mellon University and a master's degree in computer science from Georgia Tech.

During her time at USRC, Abida will help with the project Latent Anomaly Detection for Supercomputing System Performance.


Mitchell Klein,

Student, Los Alamos National Laboratory


mjklein@lanl.gov
Mitchell Klein is a summer intern at Los Alamos where he is working on a project that tests the resilience of algorithms. He recently earned his B.A. in Applied Mathematics from the University of St. Thomas in St. Paul, Minnesota. In the future, Mitchell plans to pursue graduate studies in mathematics or a related field.

Emily Vecchia,

Undergraduate Student, University of St. Thomas


evecchia@lanl.gov
Emily Vecchia is an undergraduate student majoring in Mathematics with a minor in Computer and Information Sciences at the University of Saint Thomas in Saint Paul, Minnesota. This summer, she is working with her mentor, Laura Monroe, on a probabilistic computing project testing the resilience of algorithms to faults. In the fall, Emily will return to St. Thomas where she will be a senior.

Ryan Slechta,

Graduate Student, Ohio State University


rslechta@lanl.gov
Ryan Slechta received his bachelor's degree in Mathematics and Computer Science from the University of St. Thomas in May 2016. He works with NMC staff and scientists on problems of algorithmic resilience, and is currently working to improve the reliability of erasure code techniques. In the fall, he will be joining the Topology, Geometry, and Data Analysis group at The Ohio State University.

Qing Zheng,

Graduate Student, Carnegie Mellon University


zhengq@cs.cmu.edu
Qing is a 4th-year Ph.D. student in the Computer Science Department at Carnegie Mellon University. At Carnegie Mellon, Qing works with Professor Garth Gibson, researchers at the Carnegie Mellon Parallel Data Lab, and scientists at Los Alamos National Lab (LANL) on file system metadata designs (IndexFS and DeltaFS) for massive-scale science applications. Their IndexFS paper won the Best Paper Award at the Supercomputing Conference (SC) 2014. At the NMC Ultrascale Systems Research Center (USRC), Qing works with Brad Settlemyer and other USRC and LANL scientists on VPIC and DeltaFS integration, and on high-performance metadata implementation and demonstration.

Rusty H Davis,

Graduate Student, Clemson University


zopppo@gmail.com
Rusty graduated with his B.S. in computer science from the School of Computing at Clemson University in May 2016. He will begin pursuing his master's degree in Computer Science at Clemson University in Fall 2016. Rusty has been working with the USRC since the summer of 2014. His initial work was with Dr. Nathan DeBardeleben and Dr. William Jones on Algorithm-Based Fault Tolerant Matrix Multiplication. His current work is focused on quantifying the resiliency of Algorithm-Based Fault Tolerant Fast Fourier Transforms and creating an interface for the F-SEFI fault injector. His research interests include high performance computing, operating systems, and resilience and fault tolerance.

Scott Lavigne,

Graduate Student, Ohio State University


lavigne@lanl.gov
Scott is a graduate student studying network errors on LANL's Trinity supercomputer. While obtaining his BS in Computer Science with a minor in Applied Mathematics from Coastal Carolina University, Scott worked on various projects for the USRC, ranging from fault injection studies with F-SEFI to analyzing ECC of interest to the team. In the fall, Scott will begin the direct PhD track at The Ohio State University.

Adam P Morrow,

Undergraduate Student, Brigham Young University


Adam is an undergraduate intern at the New Mexico Consortium where he is working on mapping error clusters to originating faults and errors in DRAM units on leadership-class supercomputing machines. He is pursuing a B.S. degree at Brigham Young University in Applied Computational Mathematics and Computer Science.

Dr. Li Tan,

Postdoctoral Researcher, Los Alamos National Laboratory


Li Tan graduated with a Ph.D. degree in Computer Science from the University of California, Riverside (UCR) in 2015. His chief research interest is High Performance Computing (HPC), in particular improving resilience/reliability and energy/power efficiency for high performance scientific algorithms and applications, and software debugging in large-scale HPC environments. At USRC, he works on fine-grained resilience and low-power modeling and provisioning for HPC applications, using fault injection and near-threshold voltage reduction techniques. He has served as a reviewer for prestigious conferences and journals on high performance parallel and distributed computing, such as SC, IPDPS, PACT, CCGrid, IEEE TPDS, IJHPCA, and JSS. He received the Dean's Distinguished Fellowship from UCR in 2010. He is a Member of the IEEE and a Member of the ACM.

Jason Lee,

Master's Student, Computer Science, Florida State University


Jason Lee is a master's student at Florida State University and received a B.S. from Rensselaer Polytechnic Institute. His interests include cryptography, parallel programming, and networking.

At the USRC he is working on software-defined networking with InfiniBand for high performance computing systems.


Song Huang,

Ph.D. Student, Department of Computer Science and Engineering, University of North Texas


SongHuang@my.unt.edu
Song Huang is a Ph.D. student in the Department of Computer Science and Engineering at the University of North Texas. He works in the Dependable Computing Systems Lab directed by Dr. Song Fu. His research interests include power and energy consumption in HPC systems, disk failure modeling and analysis, and resilience and fault tolerance techniques for HPC systems. Currently, he works at USRC on characterizing the power consumption of the Haswell machines and on resource allocation and scheduling in HPC systems.

Panruo Wu,

PhD Student, Computer Science & Engineering, University of California Riverside


pwu011@ucr.edu
Panruo Wu received his bachelor’s degree in mathematics from the University of Science and Technology of China in 2011. He is currently a PhD candidate at UCR. His research interests include fault tolerance in parallel and distributed systems and numerical algorithms. At the USRC he works on the F-SEFI fault injector and on developing highly fault-tolerant algorithms that can run correctly and efficiently in the presence of numerous architectural faults.

Nicholas Nelson,

Graduate Student, New Mexico State University


njnelson@nmsu.edu
Nicholas received his bachelor's degree in microbiology from Cornell University in 2010. After working in flow cytometry and scanning electron microscopy, he entered the computer science graduate program at New Mexico State University. Nicholas is researching parallel programming in high performance computing (HPC). At the USRC, Nicholas worked on benchmarking OpenSHMEM against MPI.

Zhou Tong,

PhD Student, Computer Science, Florida State University


tong@cs.fsu.edu
Zhou is a Ph.D. student in the Department of Computer Science at Florida State University, and he received his B.S. from Millsaps College. His research includes interconnection networks, parallel application performance modeling, and data mining. At the USRC, Zhou worked on performance evaluation and communication modeling of MPI applications to provide fast classification assessment based on an understanding of the characteristics of various parallel applications in production HPC systems.

Michael Carlton,

Undergraduate Student, Computer Engineering and Computer Science, University of Kentucky


mcarlton93@gmail.com
Michael Carlton is an undergraduate with a dual major in Computer Engineering and Computer Science at the University of Kentucky. This summer at LANL he worked with his mentors, Nathan DeBardeleben and Sean Blanchard, to develop a new method for handling memory hardware errors. This project was a success and will hopefully be utilized in the near future on production machines. Michael recently returned to college to begin his senior year and will be working as both a Teaching Assistant and Resident Advisor during the academic year.

Lewis Tseng,

PhD Student, Computer Science, University of Illinois at Urbana-Champaign


ltseng3@uiuc.edu
Lewis Tseng is a PhD student in the Department of Computer Science at the University of Illinois at Urbana-Champaign. His research interests include fault-tolerant distributed algorithms and systems. In particular, he focuses mainly on issues of consensus and consistency in theoretical models that capture the characteristics of large-scale systems in both wired and wireless networks. At USRC, Lewis designed and developed a distributed key-value store to support service registration and lookup for HPC systems. The main goals of the new key-value store are: (i) autonomous system management, (ii) configurable per-key consistency and replication at the data granularity, and (iii) flexible and configurable deployment based on different policies for power, performance, and resilience.

Zhenjie Chen,

PhD Student, Computer Science, University of New Mexico


zhenjie@cs.unm.edu
505 615-9186 (o)
Zhenjie is a graduate student at The University of New Mexico and joined the Scalable Systems Lab at CS@UNM in 2012. He mainly focuses on scalable systems and fault tolerance. He likes hiking, snowboarding, skating and coding, and ...

At USRC, Zhenjie is working on integrating the Scalable Information Propagation service into LIBI (Lightweight Infrastructure-Bootstrapping Infrastructure) in order to improve the performance of bootstrapping numerous processes, especially the wire-up procedure.




© 2018 New Mexico Consortium