Data. Policy. Impact.

The Data Science and Public Policy Lab at Carnegie Mellon University, spanning the Machine Learning Department and the Heinz College of Public Policy, works to develop and further the use of ML/AI/data science for social good and in public policy research and practice. Our work includes educating current and future policymakers, researchers, and practitioners; working on ML/AI/data science projects with government, nonprofit, academic, and foundation partners; conducting new research; and developing new methods, open-source tools, and guides that support and extend the use of ML/AI/data science for public policy and social impact. Our team consists of data scientists and researchers with computer science, statistics, and social science backgrounds, who bring in methods from all of these disciplines; software engineers, who make sure our work becomes usable, implemented code; domain and policy experts, who provide context and relevance; and project managers, who help get things done.

We believe that effective use of data, ML, and AI is critical to making adaptive and personalized policies that improve the lives of everyone in a measurable, fair, and equitable manner.

Our Work

Collaborative Projects

We work with governments, non-profits, and other organizations on data science projects across health, criminal justice, public safety, education, economic development, transportation, and more. Most of our projects tackle operational problems with tangible impact and result in software that our partner organizations (and others) can use for social impact and improved policies. Recent examples of our projects include:
  • Building Data-Driven Police Early Intervention Systems
  • Prioritizing Preventative Lead Hazard Inspections
  • Prioritizing Health and Safety Housing Inspections
  • Reducing Incarceration by Identifying At-Risk Individuals in Need of Social Services

Research Areas

Our research initiatives are motivated by working on hands-on data science projects with governments, non-profits, and other policy organizations. As we tackle policy problems, we identify open areas where existing methods from computer science, machine learning, artificial intelligence, or the social sciences are lacking, and we formulate our research initiatives to fill those gaps. We then push the results of our research back into our data science tools so they can be used across our projects and by our project partners. We are currently working on:
  • Auditing and correcting for bias and equity issues in data science systems
  • Increasing the interpretability and transparency of machine learning models used in policy decisions
  • Designing experimental validation methodologies for machine learning systems
  • Developing methods for monitoring and updating deployed data science systems

ML/AI/Data Science Pipelines and Tools

We believe in open and reusable code and tools. All of our (non-confidential) project code is available under an open-source license on our GitHub page. All of our internal data science tools are also available for other organizations to use. Examples of such tools include:
  • Triage: Our data science pipeline platform that’s used in many of our internal projects, which contains components for generating features, building machine learning models, and evaluating those models.
  • Entity Deduplication Tool (pgdedupe)
  • Post-Modeling Tools: For analyzing the models built and their feature importances, and for exploring the outputs of those models before deployment.
  • Bias Audits: For running bias audits on the outputs of machine learning models.
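
To make the idea of a bias audit concrete, here is a minimal sketch in Python. It is not the lab's tool or its API. The function names, the choice of false positive rate as the metric, and the toy records are all assumptions made for illustration. The sketch compares how often a model wrongly flags people in each group, relative to a reference group.

```python
from collections import defaultdict


def false_positive_rates(records):
    """Return the false positive rate per group.

    `records` is an iterable of (group, predicted, actual) tuples with
    0/1 labels. This input format is a hypothetical one for illustration.
    """
    false_pos = defaultdict(int)
    negatives = defaultdict(int)
    for group, predicted, actual in records:
        if actual == 0:  # only true negatives can become false positives
            negatives[group] += 1
            if predicted == 1:
                false_pos[group] += 1
    return {g: false_pos[g] / n for g, n in negatives.items() if n > 0}


def disparity_ratios(rates, reference_group):
    """Divide each group's rate by the reference group's rate."""
    ref = rates[reference_group]
    if ref == 0:
        raise ValueError("reference group rate is zero; ratio is undefined")
    return {g: r / ref for g, r in rates.items()}


# Toy data (made up): (group, model prediction, true outcome)
records = [
    ("A", 1, 0), ("A", 0, 0), ("A", 0, 0), ("A", 1, 1),
    ("B", 1, 0), ("B", 1, 0), ("B", 0, 0), ("B", 0, 1),
]

rates = false_positive_rates(records)
ratios = disparity_ratios(rates, reference_group="A")
print(rates)
print(ratios)
```

A ratio well above or below 1 for a group suggests the model's errors fall unevenly across groups. Which metric and which threshold matter depends on the policy context, so an audit like this is a starting point for discussion rather than a verdict.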


We run training programs, workshops, and tutorials for students, government agencies, non-profits, foundations, and corporations. Our trainings for governments and non-profits are designed for directors and executives of organizations as well as analysts and policymakers.

Our Project Partners


The future of public policy is open, adaptive, scalable micro-policies that benefit everyone in a measurable, equitable, and fair manner. We can help get there.