Bay Area Homelessness Analysis

DSWG Teammates: Catherine Zhang (Co-Lead), Matt Mollison, Ph.D. (Co-Lead), Qianqian Ye, Sameeran Kunche, Annalie Kruseman

Project Detail

HMIS Data Science Study

Members of the Data Science Working Group at Code for San Francisco have been charged with answering the Community Technology Alliance’s prompt about homelessness programs.


Which variables best predict whether an individual's outcome is categorized as ‘in permanent housing’, for each of the following population segments:

  • Veterans
  • Chronically Homeless
  • Continuously Homeless
  • Has Disabling Condition
  • Domestic Violence Victim
  • Male/Female
  • Latino/Non-Latino
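As a rough illustration of the question above, the sketch below restricts a set of records to one segment and compares the permanent-housing rate across the values of a single candidate variable. It is a crude univariate screen, not the method used in the study's notebooks. The records are made-up toy data, and the field names (veteran, has_income, in_permanent_housing) and the helper housing_rate_by_value are hypothetical, not HMIS column names.

```python
from collections import defaultdict

# Hypothetical toy records; field names are illustrative, not HMIS columns.
records = [
    {"veteran": 1, "has_income": 1, "in_permanent_housing": 1},
    {"veteran": 1, "has_income": 0, "in_permanent_housing": 0},
    {"veteran": 1, "has_income": 1, "in_permanent_housing": 1},
    {"veteran": 0, "has_income": 1, "in_permanent_housing": 0},
    {"veteran": 0, "has_income": 0, "in_permanent_housing": 0},
    {"veteran": 1, "has_income": 0, "in_permanent_housing": 1},
]


def housing_rate_by_value(rows, segment, variable,
                          outcome="in_permanent_housing"):
    """Within rows where `segment` == 1, map each value of `variable`
    to the share of those rows whose `outcome` is 1."""
    counts = defaultdict(lambda: [0, 0])  # value -> [housed, total]
    for row in rows:
        if row[segment] != 1:
            continue  # skip individuals outside the segment
        tally = counts[row[variable]]
        tally[0] += row[outcome]
        tally[1] += 1
    return {value: housed / total
            for value, (housed, total) in counts.items()}


# Compare outcome rates among veterans, split by the candidate variable.
rates = housing_rate_by_value(records, segment="veteran",
                              variable="has_income")
print(rates)
```

A large gap between the rates for different values of a variable suggests it may be worth including in a fuller model for that segment; a gap on toy data like this says nothing about the real dataset.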


Data is in HMIS format, a data standard defined by the US Department of Housing and Urban Development.


View the HMIS Data Science Study Presentation for a summary of our findings.

Featured Notebooks


Install Jupyter Notebook; this is most easily done by installing Anaconda.

Install seaborn. To do this in a new conda environment:
conda create --name datasci seaborn

To activate/deactivate the environment (on conda 4.4 and later, use conda activate datasci and conda deactivate instead):
source activate datasci
source deactivate

Get Started

  1. Fork this repository and clone it locally.
  2. Locate the dataset (pinned in #datasci-homeless on Slack).
  3. Run jupyter notebook.
  4. Navigate to notebooks/load_data_example_v2.ipynb to start exploring the data.

Additional information on completed and open items can be found in the pinned documents in #datasci-homeless.