Harsimar Singh
- Live stream of the project.
- The project can be viewed in my GitHub repository.
This is a project for the Udacity Nanodegree. It works with a large database designed for a newspaper site. The database has three tables (articles, authors and log) with millions of rows, and it is analyzed with SQL queries. It contains the articles that were read, the authors of those articles and the web server log for the newspaper site. This data is used to draw conclusions for different reports.
- Python
- Vagrant
- VirtualBox
1. Install Vagrant and VirtualBox.
2. Download or clone the fullstack-nanodegree-vm repository from GitHub.
3. With newsdata.sql in the vagrant directory, we are good to go.
- Download the newsdata.sql file.
- Launch the Vagrant VM by running vagrant up, then log in with vagrant ssh.
- Load the data into a database named news. Run this command only once:
  psql -d news -f newsdata.sql
- Useful psql commands:
  - use \c news to connect to the news database
  - use \dt to see the tables in the database
  - use \dv to see the views in the database
  - use \q to quit psql
- Connect to the database by running psql -d news.
- Create a view: connect with psql -d news, then run the SQL statement given below.
- SQL query for creating the view v8:
  create view v8 as
  select date(time) as Date,
         round(sum(case when status not like '%200%' AND status like '%404%' then 1 else 0 end) * 100.0 / count(status), 2) as Errorr
  from log
  group by Date;
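To make the view's arithmetic easier to follow, here is a minimal Python sketch of the same calculation: the percentage of 404 requests per day, rounded to two places. The sample rows and the function name daily_error_percent are made up for illustration and do not come from the project; Python's round can also differ from PostgreSQL's in rare tie cases.

```python
from collections import defaultdict

# Hypothetical (time, status) rows standing in for the log table.
SAMPLE_LOG = [
    ("2016-07-17 08:00:00", "200 OK"),
    ("2016-07-17 09:30:00", "404 NOT FOUND"),
    ("2016-07-17 10:15:00", "200 OK"),
    ("2016-07-17 11:45:00", "200 OK"),
    ("2016-07-18 07:20:00", "200 OK"),
    ("2016-07-18 08:05:00", "200 OK"),
]


def daily_error_percent(rows):
    """Return {day: percent of 404 requests}, mirroring the v8 view."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for timestamp, status in rows:
        day = timestamp[:10]  # like date(time): keep only YYYY-MM-DD
        totals[day] += 1
        # Same condition as the CASE expression in the view.
        if "200" not in status and "404" in status:
            errors[day] += 1
    # errors * 100.0 / total requests, rounded to two decimal places.
    return {day: round(errors[day] * 100.0 / totals[day], 2) for day in totals}


print(daily_error_percent(SAMPLE_LOG))
```

Each day maps to one row of the view: the key plays the role of the Date column and the value plays the role of Errorr.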
- To execute the program, run python Log_Analysis.py from the command line.
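The contents of Log_Analysis.py are not shown here, so the following is only a sketch of how a script could read the v8 view and print the days with a high error rate. The function names, the 1.0 percent threshold and the %s placeholder style (the one psycopg2 uses) are assumptions, not the project's actual code.

```python
def fetch_error_days(cursor, threshold=1.0):
    """Return (date, percent) rows from view v8 above `threshold` percent.

    `cursor` is any DB-API cursor; the %s placeholder assumes a driver
    such as psycopg2. The threshold default is a hypothetical value.
    """
    cursor.execute(
        "SELECT Date, Errorr FROM v8 WHERE Errorr > %s ORDER BY Date;",
        (threshold,),
    )
    return cursor.fetchall()


def format_report(rows):
    """Turn (date, percent) rows into printable report lines."""
    return ["{} - {:.2f}% errors".format(day, percent) for day, percent in rows]
```

Assuming psycopg2 is installed, a cursor could be obtained with psycopg2.connect(dbname="news").cursor() and passed to fetch_error_days; the returned rows can then go through format_report and be printed one line at a time.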