Hospital Finder Using Semantic Web Technologies

Final project for the course - CS 6315: Semantic Web, Spring 2018

Correlation between leading causes of death and neighboring hospitals using Semantic Web technologies

Team members

  • Ankita Patil
  • Krupali Patel
  • Shivdev Kumar

Demo: A demonstration of the project can be found here.


Project Description

The project identifies the top five leading causes of death, then focuses on one of them and suggests hospitals near the user that treat that condition, so that patients can get care sooner and the death rate for that condition can be reduced. For this purpose, the "Potentially Excess Deaths from the Five Leading Causes of Death (NCHS)" data from the National Vital Statistics System is combined with the "Timely and Effective Care - Hospital" and "500 Cities: Local Data for Better Health" datasets for the year 2014.

Target Audience

  1. The general population of the United States: to locate specialized hospitals within a given radius of their location, so that they can get the appropriate treatment as quickly as possible.
  2. Government officials: to address the mismatch between a state's excess death count and its scarcity of hospitals.
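
The radius search described in the first item can be sketched as a great-circle distance filter. This is only an illustration, not the project's actual code: the function names and the `lat`/`lng` fields on the user and hospital objects are assumptions, and the app may instead rely on the Google Maps API for distances.

```javascript
// Mean Earth radius in kilometres.
const EARTH_RADIUS_KM = 6371;

// Great-circle (haversine) distance between two {lat, lng} points, in kilometres.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLng = toRad(b.lng - a.lng);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLng / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Keep only hospitals within radiusKm of the user, nearest first.
// Each hospital is assumed to carry numeric lat/lng fields (hypothetical shape).
function hospitalsWithinRadius(user, hospitals, radiusKm) {
  return hospitals
    .map((hospital) => ({ ...hospital, distanceKm: haversineKm(user, hospital) }))
    .filter((hospital) => hospital.distanceKm <= radiusKm)
    .sort((x, y) => x.distanceKm - y.distanceKm);
}
```

Calling `hospitalsWithinRadius(userLocation, hospitals, 20)` would return the hospitals within 20 km, each annotated with a `distanceKm` field.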

Datasets

  1. Timely and Effective Care – Hospital
    Link: https://catalog.data.gov/dataset/timely-and-effective-care-hospital-e4aad
    Domain: Federal
    Description: This data set includes provider-level data for measures of heart attack care, heart failure care, pneumonia care, surgical care, emergency department care, preventive care, children's asthma care, blood clot prevention and treatment, pregnancy and delivery care, and cancer care.
    Format available: CSV, RDF, JSON, XSL

  2. Potentially Excess Deaths from the Five Leading Causes of Death
    Link: https://catalog.data.gov/dataset/nchs-potentially-excess-deaths-from-the-five-leading-causes-of-death
    Domain: Federal
    Description: This dataset contains Potentially Excess Deaths from the Five Leading Causes of Death in Nonmetropolitan and Metropolitan Areas, United States, 2005-2015. Mortality data for U.S. residents come from the National Vital Statistics System.
    Format available: CSV, RDF, JSON, XSL

  3. 500 Cities: Local Data for Better Health
    Link: https://catalog.data.gov/dataset/500-cities-local-data-for-better-health-b32fd
    Domain: Federal
    Description: This is the complete dataset for the 500 Cities project. This dataset includes 2013, 2014 model-based small area estimates for 27 measures of chronic disease related to unhealthy behaviors, health outcomes and use of preventive services. It includes estimates for the 500 largest US cities and approximately 28,000 census tracts within these cities. These estimates can be used to identify emerging health problems and to inform development and implementation of effective, targeted public health prevention activities.
    Format available: CSV, RDF, JSON, XSL


Data Pre-processing

  • The datasets downloaded from www.data.gov are in CSV format.
  • Apache Fuseki requires data in RDF format, so the csv2rdf4lod-automation tool is used to convert the CSV files to RDF.
  • The converted, pre-processed data can be found here.
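
To give a feel for what a CSV-to-RDF conversion produces, here is a minimal sketch that turns one already-parsed CSV row into N-Triples lines. It does not reproduce what csv2rdf4lod-automation actually emits: the `BASE` IRI, the `vocab/` predicate scheme, and the function names are all hypothetical and chosen only for illustration.

```javascript
// Hypothetical base IRI; the real converter derives its own IRIs.
const BASE = "http://example.org/hospital/";

// Escape characters that are not allowed raw inside an N-Triples string literal.
function escapeLiteral(value) {
  return value
    .replace(/\\/g, "\\\\")
    .replace(/"/g, '\\"')
    .replace(/\n/g, "\\n")
    .replace(/\r/g, "\\r");
}

// Convert one CSV row (array of strings) into N-Triples lines.
// The idColumn value names the subject; every other non-empty cell becomes a literal.
function rowToNTriples(headers, row, idColumn) {
  const id = row[headers.indexOf(idColumn)];
  const subject = `<${BASE}${encodeURIComponent(id)}>`;
  return headers
    .map((header, i) => ({ header, value: row[i] }))
    .filter(({ header, value }) => header !== idColumn && value != null && value !== "")
    .map(
      ({ header, value }) =>
        `${subject} <${BASE}vocab/${encodeURIComponent(header)}> "${escapeLiteral(value)}" .`
    );
}
```

Each CSV column becomes a predicate and each cell a plain string literal, which is the simplest possible mapping; a real conversion would typically also assign datatypes and reuse existing vocabularies.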

Technology Stack

  • Apache Fuseki 3.6 (SPARQL server) to load the data and create SPARQL endpoints
  • csv2rdf4lod-automation tool for converting the data from CSV to RDF format
  • Bootstrap 4 and JavaScript for front-end
  • Google Visualization API
  • Google Maps API

Instructions to run the code

  1. Download Apache Jena Fuseki

  2. Go to the location where Apache Jena Fuseki is downloaded

  3. Create a folder named dataDir in Apache Jena Fuseki's directory

  4. To start the Fuseki server, open a command prompt and run the command

      java -Xmx8096M -jar fuseki-server.jar --update --loc=dataDir /myDataset
    
    
  5. Now, in the browser go to http://localhost:3030/

  6. Click on the "manage datasets" tab and, beside /myDataset, click on "upload data". Load all three datasets from the dataset folder, giving each its own graph name: for example, load city.rdf with the graph name "city", hospital.rdf with the graph name "hospital", and nchs.rdf with the graph name "nchs".

  7. After the datasets are uploaded successfully, open index.html from the Webapp folder to view the results
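
Once the server is running, a quick way to confirm that the three graphs were loaded is to query the SPARQL endpoint directly. The sketch below assumes the dataset's query service is at `http://localhost:3030/myDataset/query` (Fuseki's usual default for a dataset named `/myDataset`) and that `fetch` is available (a browser or a recent Node.js); the helper names are hypothetical and are not taken from the project's Webapp code.

```javascript
// Assumed endpoint: Fuseki's default query service for the /myDataset dataset.
const ENDPOINT = "http://localhost:3030/myDataset/query";

// Lists every named graph in the dataset; handy for checking the uploads from step 6.
const LIST_GRAPHS = "SELECT DISTINCT ?g WHERE { GRAPH ?g { ?s ?p ?o } }";

// Build a SPARQL Protocol GET URL with the query passed as a URL parameter.
function buildQueryUrl(endpoint, sparql) {
  return `${endpoint}?query=${encodeURIComponent(sparql)}`;
}

// Flatten the SPARQL 1.1 JSON results format into plain { variable: value } objects.
function bindingsToRows(json) {
  return json.results.bindings.map((binding) =>
    Object.fromEntries(
      Object.entries(binding).map(([name, term]) => [name, term.value])
    )
  );
}

// Send the query and return the flattened rows.
async function runQuery(sparql, endpoint = ENDPOINT) {
  const response = await fetch(buildQueryUrl(endpoint, sparql), {
    headers: { Accept: "application/sparql-results+json" },
  });
  if (!response.ok) {
    throw new Error(`SPARQL query failed: HTTP ${response.status}`);
  }
  return bindingsToRows(await response.json());
}
```

Running `runQuery(LIST_GRAPHS).then(console.log)` should print one row per named graph; the exact graph IRIs depend on how Fuseki resolved the names entered during upload.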