Machine Learning and Statistics

Reinforcement Learning

Natural Language Processing

Computer Vision

Applications (Science, Climate, Health, etc…)

Supervised Learning

Unsupervised Learning

Online Learning

Graph Neural Networks

Faculty

Ari Holtzman

Assistant Professor of Computer Science and Data Science

Nikos Ignatiadis

Assistant Professor of Statistics and Data Science

Frederic Koehler

Assistant Professor of Statistics and Data Science

Bo Li

Associate Professor of Computer Science and Data Science

Tian Li

Assistant Professor of Computer Science and Data Science

Dan Nicolae

Elaine M. and Samuel D. Kersten, Jr. Distinguished Service Professor; Faculty Co-Director, Data Science Institute and Committee on Data Science; Professor of Statistics, Human Genetics, Medicine, Section of Genetic Medicine and the College

Veronika Rockova

Professor of Econometrics and Statistics at the Booth School of Business

Aaron Schein

Assistant Professor of Statistics and Data Science

Nathan Srebro

Professor, Toyota Technological Institute at Chicago; Professor, Department of Computer Science

Matthew Stephens

Chair, Department of Statistics; Ralph W. Gerard Professor of Statistics, Human Genetics, and the College

Chenhao Tan

Assistant Professor of Computer Science and Data Science

David Uminsky

Executive Director, Data Science Institute; Senior Research Associate, Department of Computer Science

Victor Veitch

Assistant Professor of Statistics and Data Science

Jingshu Wang

Assistant Professor, Department of Statistics and the College

Rebecca Willett

Faculty Director of AI, Data Science Institute; Professor, Statistics, Computer Science, and the College

Haifeng Xu

Assistant Professor of Computer Science and Data Science

Ari is an incoming Assistant Professor of Computer Science and Data Science, starting July 2024.

His research has focused broadly on generative models of text: how we can use them and how we can understand them better. His research interests have spanned everything from dialogue, including winning the first Amazon Alexa Prize in 2017, to fundamental research on text generation, such as proposing Nucleus Sampling, a decoding algorithm used broadly in deployed systems such as the OpenAI API. With the new wave of powerful generative models being continually released, Ari has argued for using the lens of Complex Systems to understand generative models of human media, suggesting that a lack of precise behavioral vocabulary to describe what language models are doing is the bottleneck to explaining how language models are capable of such impressive performance on a range of tasks. He completed his PhD in Computer Science at the University of Washington, studying “Interpretation Errors” in how we understand generative models, after an interdisciplinary degree at NYU combining Computer Science and the Philosophy of Language.

I am an assistant professor of Statistics and Data Science at the University of Chicago. Previously, I was a postdoctoral research scientist in the Department of Statistics at Columbia University. I received my Ph.D. from Stanford’s Statistics department in the summer of 2022, and my thesis was recognized with the Jerome H. Friedman dissertation award. Before that, I received degrees in Mathematics (B.Sc.), Molecular Biotechnology (B.Sc.), and Scientific Computing (M.Sc.) at the University of Heidelberg in Germany, where I was a researcher at the European Molecular Biology Laboratory.

As a statistician with formal training in mathematics, molecular biology, and computation, I seek to develop practical and theoretically justified statistical methods, accompanied by robust software implementations, for the analysis of datasets generated from modern technologies. My research is inspired by new modeling and inference opportunities made possible through the wealth of modern data. My methodological interests encompass empirical Bayes analysis, causal inference, multiple testing, and statistics in the presence of contextual side-information.

Frederic Koehler is an incoming Assistant Professor of Statistics and Data Science, starting in January 2024.

Frederic is currently at Stanford University as a Motwani Postdoctoral Fellow. Previously, he was a research fellow at UC Berkeley’s Simons Institute in the Program on Computational Complexity of Statistical Inference. He received his PhD in Mathematics and Statistics from MIT, where he was co-advised by Ankur Moitra and Elchanan Mossel. Before that, he received his undergraduate degree in Mathematics at Princeton University.

His current research interests include computational learning theory and related topics: probability theory, high-dimensional statistics, optimization, related aspects of statistical physics, etc. In particular, he is very interested in learning and inference in graphical models.

Bo is an Associate Professor in the Computer Science Department and Data Science Institute at UChicago.

Bo’s research addresses trustworthy machine learning from both theoretical and practical aspects and aims to enable reliable machine learning algorithms and systems in the real world, such as safe autonomous vehicles and federated (distributed) learning. She focuses on three interconnected aspects, robustness, privacy, and generalization, along with their underlying connections.

Bo received her Ph.D. in Computer Science from Vanderbilt University in 2016. She was a Postdoctoral Researcher at UC Berkeley from 2017 to 2018 (working with Prof. Dawn Song) and joined the faculty at UIUC in 2018.

She has been recognized by a long list of notable awards and fellowships for young faculty. She is a Sloan Fellow, MIT Technology Review TR-35 innovator, and recipient of the IJCAI Computers and Thought Award, NSF CAREER, Intel Rising Star Faculty award, Symantec Research Labs Fellowship, Rising Stars in EECS, Research Awards from Amazon/Facebook/Google, and best paper awards at multiple top machine learning and security conferences. Her research has been featured by major publications and media outlets such as Nature, Wired, the New York Times, and Fortune, and is on display at the Science Museum in London.

Tian Li is an Assistant Professor of Computer Science and Data Science starting in July 2024. Her research interests are in distributed optimization, large-scale machine learning, federated learning, and data-intensive systems. Prior to CMU, she received her undergraduate degrees in Computer Science and Economics from Peking University. She was a research intern at Google Research in 2022. She received the Best Paper Award at the ICLR Workshop on Security and Safety in Machine Learning Systems (2021), was selected for Rising Stars in Machine Learning (2021), and was invited to participate in the EECS Rising Stars Workshop (2022).

Dan Nicolae obtained his Ph.D. in statistics from The University of Chicago and has been a faculty member at the same institution since 1999, with appointments in Statistics (since 1999) and Medicine (since 2006). His research focuses on developing statistical and computational methods for understanding human genetic variation and its influence on the risk for complex traits, with an emphasis on asthma-related phenotypes. His current statistical genetics research centers on data integration and system-level approaches using large datasets that include clinical and environmental data as well as various genetics/genomics data types: DNA variation, gene expression (RNA-seq), methylation, and microbiome.

Veronika Rockova is a Professor of Econometrics and Statistics and the James S. Kemper Faculty Scholar at the Booth School of Business at the University of Chicago. She joined Booth after completing her postdoctoral training in statistics at the Wharton School of the University of Pennsylvania. She teaches a course on Big Data at Booth. Her research interests lie at the intersection of statistics and machine learning, with a primary focus on creating innovative decision-centric tools for extracting insights from extensive datasets. She specializes in Bayesian computation, high-dimensional decision theory, and hierarchical modeling. Her applied areas of interest include healthcare analytics and computational medicine. Her research was acknowledged with the National Science Foundation CAREER Award in 2020 and the COPPS Emerging Leader Award in 2023. She currently serves as an associate editor for the Annals of Statistics, Journal of the American Statistical Association, and Journal of the Royal Statistical Society. Beyond her academic pursuits, Veronika is an avid pianist, tennis enthusiast, and golf neophyte.

Aaron is an Assistant Professor in the Statistics Department and Data Science Institute at UChicago. His research develops methodology in Bayesian statistics, causal inference, and machine learning for applied problems in political science, economics, and genetics, among other fields. Prior to joining UChicago, Aaron was a postdoctoral fellow in the Data Science Institute at Columbia University. He received his PhD in Computer Science from UMass Amherst, as well as an MA in Linguistics and a BA in Political Science.

Dr. Srebro is interested in statistical and computational aspects of machine learning, and the interaction between them. He has done theoretical work in statistical learning theory and in algorithms, devised novel learning models and optimization techniques, and has worked on applications in computational biology, text analysis and collaborative filtering. Before coming to TTIC, Dr. Srebro was a postdoctoral fellow at the University of Toronto and a visiting scientist at IBM Research.

My lab works on a wide variety of problems at the interface of Statistics and Genetics. We often tackle problems where novel statistical methods are required, or where they can teach us something new compared with existing approaches. Thus, much of our research involves developing new statistical methods, many of which have a non-trivial computational component. And because data sets are getting larger and larger, our work often involves modern methods for “high-dimensional statistics”. Our work often makes extensive use of Bayesian hierarchical models to borrow information across data sets or sampling units.

Recently my lab has been increasingly focussed on making its research more open, reproducible and extensible. This is because I see this as the first step towards greater cooperation of scientists to achieve common goals.

See http://github.com/stephenslab/ash for an example of a recent project I conducted “in the open”. And see https://jdblischak.github.io/workflowr/ for an R package we have developed to help students and others make research websites of their analyses. In learning to do research this way, my lab uses git for version control, GitHub for sharing code, and knitr and RStudio for helping make our R analyses clear and shareable.

Current research interests include:

Sparsity, shrinkage, and false discovery rates, particularly for complex inter-related datasets
Factor analysis, dimension reduction, and estimation of large covariance matrices
Clustering methods and generalizations (e.g., grade of membership)
Applications of multi-scale and wavelet methods to genomic data
Reproducible research and open science

Chenhao Tan is an assistant professor at the Department of Computer Science and the UChicago Data Science Institute. His main research interests include language and social dynamics, human-centered machine learning, and multi-community engagement. He is also broadly interested in computational social science, natural language processing, and artificial intelligence.

David Uminsky joined the University of Chicago in September 2020 as a senior research associate and Executive Director of Data Science. He was previously an associate professor of Mathematics and Executive Director of the Data Institute at the University of San Francisco (USF). His research interests are in machine learning, signal processing, pattern formation, and dynamical systems. David is an associate editor of the Harvard Data Science Review. He was selected in 2015 by the National Academy of Sciences as a Kavli Frontiers of Science Fellow. He is also the founding Director of the BS in Data Science at USF and served as Director of the MS in Data Science program from 2014 to 2019. During the summer of 2018, David served as the Director of Research for the Mathematical Sciences Research Institute Undergrad Program on the topic of Mathematical Data Science.

Before joining USF he was a combined NSF and UC President’s Fellow at UCLA, where he was awarded the Chancellor’s Award for outstanding postdoctoral research. He holds a Ph.D. in Mathematics from Boston University and a BS in Mathematics from Harvey Mudd College.

I am an assistant professor of Statistics and Data Science at the University of Chicago and a research scientist at Google Cambridge. My recent work revolves around the intersection of machine learning and causal inference, as well as the design and evaluation of safe and credible AI systems. Other notable areas of interest include network data and the foundations of learning and statistical inference.

I was previously a Distinguished Postdoctoral Researcher in the department of statistics at Columbia University, where I worked with the groups of David Blei and Peter Orbanz. I completed my Ph.D. in statistics at the University of Toronto, where I was advised by Daniel Roy. In a previous life, I worked on quantum computing at the University of Waterloo. I won a number of awards, including the Pierre Robillard award for best statistics thesis in Canada.

My main research interest is in developing statistical methods for cutting-edge biotechnologies and genetic problems. I currently work on problems in single-cell omics, Mendelian randomization, and structural variation in the 3D genome. My research also includes developing general statistical methodology in causal inference and hypothesis testing, motivated by new challenges in genetics and public health.

Rebecca Willett is a Professor of Statistics and Computer Science at the University of Chicago. She completed her PhD in Electrical and Computer Engineering at Rice University in 2005 and was an Assistant and then tenured Associate Professor of Electrical and Computer Engineering at Duke University from 2005 to 2013. She was an Associate Professor of Electrical and Computer Engineering, Harvey D. Spangler Faculty Scholar, and Fellow of the Wisconsin Institutes for Discovery at the University of Wisconsin-Madison from 2013 to 2018. Prof. Willett received the National Science Foundation CAREER Award in 2007, was a member of the DARPA Computer Science Study Group from 2007 to 2011, and received an Air Force Office of Scientific Research Young Investigator Program award in 2010. Prof. Willett has also held visiting researcher positions at the Institute for Pure and Applied Mathematics at UCLA in 2004, the University of Wisconsin-Madison from 2003 to 2005, the French National Institute for Research in Computer Science and Control (INRIA) in 2003, and the Applied Science Research and Development Laboratory at GE Medical Systems (now GE Healthcare) in 2002. Her research interests include network and imaging science with applications in medical imaging, wireless sensor networks, astronomy, and social networks. She is also an instructor for FEMMES (Females Excelling More in Math, Engineering, and Science) and a local exhibit leader for Sally Ride Festivals. She was a recipient of the National Science Foundation Graduate Research Fellowship, the Rice University Presidential Scholarship, the Society of Women Engineers Caterpillar Scholarship, and the Angier B. Duke Memorial Scholarship.

Haifeng Xu is an assistant professor in the Department of Computer Science and the Data Science Institute at UChicago. He directs the Strategic IntelliGence for Machine Agents (SIGMA) research lab, which focuses on designing algorithms and systems that can effectively elicit, process, and exploit information, particularly in strategic environments. Haifeng has published more than 55 papers at leading venues in computational economics, machine learning, and theoretical computer science, such as EC, ICML, NeurIPS, STOC, and SODA. His research has been recognized by multiple awards, including the Google Faculty Research Award, ACM SIGecom Dissertation Award (honorable mention), IFAAMAS Victor Lesser Distinguished Dissertation Award (runner-up), Google PhD fellowship, and multiple best paper awards.

The following research themes are the recent focus of our research lab. Please refer to our lab’s website for more details.

The economics of data/information, including selling, acquiring, and exploiting information
Machine learning in multi-agent setups under information asymmetry, incentive conflicts, and deception
Resource allocation in adversarial domains, with applications to security and privacy protection
