This project contains data mining exercises and project work on a topic that grew out of personal and professional interest. It attempts to answer the question "Is stage at diagnosis a predictor of breast cancer survival?" The data come from the NPCR and SEER-U.S. Cancer Statistics public use databases, obtained through an academic access request. The dataset contains over one million records (rows) and is too large to upload to GitHub. Programming language: Python. IDE: Jupyter Notebook. An audio/PowerPoint presentation of the final project is available at https://screenrec.com/share/vsAxEcmQge