Use-Inspired Data Science

Image processing

Audio Signal Processing

Natural Language Processing

Time Series

Unstructured and Semi-structured data

Genetics & Genomics Data

Health Data

Spatial Data

Multi-modal Data

Reproducibility

Experiments

Faculty

Luc Anselin

Stein-Freiler Distinguished Service Professor of Sociology and the College; Director, Center for Spatial Data Science

Raul Castro Fernandez

Assistant Professor, Computer Science

Greg Green

Associate Senior Instructional Professor; Director of MS in Applied Data Science Program; Senior Director of Industrial Partnerships and Strategy

Robert Grossman

Frederick H. Rawson Distinguished Service Professor in Medicine and Computer Science; Jim and Karen Frank Director of the Center for Translational Data Science; Chief, Section of Biomedical Data Science, Dept. of Medicine

Nikos Ignatiadis

Assistant Professor of Statistics and Data Science

Dan Nicolae

Elaine M. and Samuel D. Kersten, Jr. Distinguished Service Professor; Faculty Co-Director, Data Science Institute and Committee on Data Science; Professor of Statistics, Human Genetics, Medicine, Section of Genetic Medicine and the College

Samantha Riesenfeld

Assistant Professor of Molecular Engineering and Medicine

Nick Ross

Data Science Clinic Director, Data Science Institute; Associate Senior Instructional Professor

Aaron Schein

Assistant Professor of Statistics and Data Science

Matthew Stephens

Chair, Department of Statistics; Ralph W. Gerard Professor of Statistics, Human Genetics, and the College

Jingshu Wang

Assistant Professor, Department of Statistics and the College

Professor Anselin is the developer of the SpaceStat and GeoDa software packages for spatial data analysis. His publications include many hundreds of articles and several edited books in the fields of quantitative geography, regional science, geographic information science, econometrics, economics, and computer science.

In my research, I ask what “the value of data” is and explore the potential of data markets to unlock that value. My group collaborates with economists, legal scholars, statisticians, and domain scientists. We build systems to share, discover, prepare, integrate, and process data. I have traditionally worked on distributed query processing systems and continue to do so. I received a SIGMOD’23 Test-of-Time Award for my PhD work.

Greg Green is Senior Instructional Professor and Director of the MS in Applied Data Science Program at the University of Chicago, and Senior Director for Industrial Partnerships and Strategy at the Data Science Institute. Dr. Green helps University of Chicago professional data science students learn to apply data science to solve complex industry problems with greater impact.

Dr. Green is reshaping the content and approaches used to educate the next generation of professional data scientists at the University of Chicago. Additionally, Greg is designing new, creative offerings more deeply connected to MS and PhD research programs in Data Science, Computer Science, Statistics, and Financial Mathematics. New course offerings developed and launched since he joined the University of Chicago include an innovative approach to “Leadership in Data Science and Artificial Intelligence”, “Consulting in Data Science” and “Your Career in Data Science”.

Throughout his professional career, Greg has used his expertise in digital strategies, business analytics, and new product development to drive rapid revenue growth and accelerate business transformation. His previous work bringing innovation to an academic environment included authoring a Marketing Analytics course, designing a prerequisite applied statistics course, and serving as a lecturer for Marketing Analytics at Northwestern University.

Greg’s industry roles include Chief Analytics Officer at Harland Clarke Holdings, Director at Google, EVP/Managing Director at Publicis Groupe, and Analytics Practice Lead at PwC. Greg’s patented cloud-based media analytics platform was highlighted in Harvard Business Review and Fast Company.

Greg holds a Doctor of Philosophy in Mathematics from Claremont Graduate School and a Master of Science in Statistics from Michigan State University. Born in Owosso, Michigan, Greg is married to Jill, an artist, and their adult children include two more artists, a teacher, and an engineer. Greg and his family enjoy snowboarding, snow/water skiing and live theatre—as well as good food and friendships. Their passion for the environment is reflected in a love for Lake Michigan where they like to spend as much of the summer as possible.

I am the Frederick H. Rawson Distinguished Service Professor of Medicine and Computer Science and the Jim and Karen Frank Director of the Center for Translational Data Science (CTDS) at the University of Chicago.

I am the Chief of the Section of Biomedical Data Science in the Department of Medicine at the University of Chicago.

I am the Chair of the not-for-profit Open Commons Consortium, which develops and operates clouds to support research in science, medicine, health care, and the environment. I am also a Partner of Analytic Strategy Partners LLC.

I am an assistant professor of Statistics and Data Science at the University of Chicago. Previously, I was a postdoctoral research scientist in the Department of Statistics at Columbia University. I received my Ph.D. from Stanford’s Statistics department in the summer of 2022, and my thesis was recognized with the Jerome H. Friedman dissertation award. Before that, I received degrees in Mathematics (B.Sc.), Molecular Biotechnology (B.Sc.), and Scientific Computing (M.Sc.) at the University of Heidelberg in Germany, where I was a researcher at the European Molecular Biology Laboratory.

As a statistician with formal training in mathematics, molecular biology, and computation, I seek to develop practical and theoretically justified statistical methods, accompanied by robust software implementations, for the analysis of datasets generated from modern technologies. My research is inspired by new modeling and inference opportunities made possible through the wealth of modern data. My methodological interests encompass empirical Bayes analysis, causal inference, multiple testing, and statistics in the presence of contextual side-information.

Dan Nicolae obtained his Ph.D. in statistics from The University of Chicago and has been a faculty member at the same institution since 1999, with appointments in Statistics (since 1999) and Medicine (since 2006). His research focuses on developing statistical and computational methods for understanding human genetic variation and its influence on the risk for complex traits, with an emphasis on asthma-related phenotypes. His current statistical genetics research centers on data integration and system-level approaches using large datasets that include clinical and environmental data as well as various genetics/genomics data types: DNA variation, gene expression (RNA-seq), methylation, and microbiome.

Samantha Riesenfeld is Assistant Professor in the UChicago Pritzker School of Molecular Engineering, with additional affiliations in the Department of Medicine, Section of Genetic Medicine, the Institute for Biophysical Dynamics, the Comprehensive Cancer Center, and the Committee on Immunology, where she co-chairs the Computational and Systems Immunology track of the PhD training program. She leads a highly interdisciplinary research group that develops and applies machine learning methods to use functional genomics, including single-cell transcriptomics, and multimodal data to investigate complex biological systems. Areas of focus include inflammatory immune responses, neuroimmune interactions, and solid tumor cancers. Dr. Riesenfeld has a BA in mathematics and computer science from Harvard University and a PhD in theoretical computer science from UC Berkeley. She did postdoctoral training at the interface of machine learning, systems biology, and immunology at the Broad Institute of MIT and Harvard, Brigham and Women’s Hospital, and the Gladstone Institutes at UCSF. Her honors include a PhRMA Foundation Post Doctoral Fellowship, an NIH F32 NRSA postdoctoral fellowship, a BroadIgnite postdoctoral award, and a Cancer Research Foundation Young Investigator Award.

Dr. Ross is an experienced data science executive and academic leader who specializes in leveraging business, engineering, and data to optimize decision-making. His various roles have ranged from architecting and designing production ML/AI systems, to hiring, growing, and leading engineering and data science teams.

Previously, Dr. Ross led the data science and backend engineering efforts at The Meta, an esports training platform used by millions of competitive gamers. Before joining The Meta, Dr. Ross was a Professor of Data Science at the University of San Francisco, where his research focused on how to effectively use data and data science techniques to answer business questions. During this time, he was also the Assistant Director of the University of San Francisco’s Data Institute, where he led and developed academic-industry partnerships to create a world-class master’s program in data science. Under his leadership, the Data Institute placed hundreds of students into top data science positions in both the private and public sectors, with a job placement rate of over 90% within 3 months of graduation. As a consultant, he spearheaded data efforts at leading tech companies in the video and online game industry, from early-stage startups to multinational companies.

Dr. Ross received his PhD from UCLA, his master’s from UC Davis, and his Bachelor of Science from UC Berkeley. He has published papers in a variety of journals and given talks in both academic and industry settings.

Aaron is an Assistant Professor in the Statistics Department and Data Science Institute at UChicago. His research develops methodology in Bayesian statistics, causal inference, and machine learning for applied problems in political science, economics, and genetics, among other fields. Prior to joining UChicago, Aaron was a postdoctoral fellow in the Data Science Institute at Columbia University. He received his PhD in Computer Science from UMass Amherst, as well as an MA in Linguistics and a BA in Political Science.

My lab works on a wide variety of problems at the interface of Statistics and Genetics. We often tackle problems where novel statistical methods are required, or where they can reveal something new compared with existing approaches. Thus, much of our research involves developing new statistical methods, many of which have a non-trivial computational component. And because data sets are getting larger and larger, our work often involves modern methods for “high-dimensional statistics”. Our work often makes extensive use of Bayesian hierarchical models to borrow information across data sets or sampling units.

Recently my lab has been increasingly focussed on making its research more open, reproducible and extensible. This is because I see this as the first step towards greater cooperation of scientists to achieve common goals.

See http://github.com/stephenslab/ash for an example of a recent project I conducted “in the open”. And see https://jdblischak.github.io/workflowr/ for an R package we have developed to help students and others make research websites of their analyses. In learning to do research this way, my lab uses git for version control, GitHub for sharing code, and knitr and RStudio for helping make our R analyses clear and shareable.

Current research interests include:

Sparsity, shrinkage, and false discovery rates, particularly for complex inter-related datasets.
Factor Analysis, dimension reduction, and estimation of large covariance matrices.
Clustering methods and generalizations (e.g., grade of membership).
Applications of multi-scale and wavelet methods to genomic data.
Reproducible research and open science.

My main research interest is in developing statistical methods for cutting-edge bio-technologies and genetic problems. I currently work on problems in single-cell omics, Mendelian Randomization, and structural variation in the 3D genome. My research also includes developing general statistical methodology in causal inference and hypothesis testing, arising from new challenges in genetics and public health.
