Web-Scraping-IMDb

This is an example of web scraping with Scrapy, a Python package.

1. Install the latest version of Python on your computer.

2. After installing Python, open a terminal and install Scrapy with the following command:

$ C:\Users\YOU> pip install Scrapy
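
To confirm that the installation worked, a quick check along these lines may help. This is only a minimal sketch using the standard library; the helper name `installed_version` is made up for illustration and is not part of Scrapy or of this repository.

```python
from importlib import metadata


def installed_version(package):
    """Return the installed version of a package, or None if it is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    version = installed_version("Scrapy")
    if version is None:
        print("Scrapy is not installed; run: pip install Scrapy")
    else:
        print(f"Scrapy {version} is installed")
```

If the script reports that Scrapy is missing, make sure the `pip` you used belongs to the same Python installation that runs the script.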

3. Clone this repository anywhere on your computer.

4. In a terminal, cd into the project directory and navigate to the directory that contains the settings and pipelines files.

$ C:\Users\YOU\Desktop\MovieSpider>
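
Scrapy recognises a project by its `scrapy.cfg` file, which it looks for in the current directory and then in each parent directory. If you are unsure whether your terminal is inside the project, the sketch below mimics that lookup. The function name `find_project_root` is hypothetical and not part of Scrapy; it only illustrates the idea.

```python
from pathlib import Path


def find_project_root(start="."):
    """Return the nearest directory at or above `start` that holds scrapy.cfg."""
    current = Path(start).resolve()
    # Check the starting directory first, then walk up through its parents.
    for directory in [current, *current.parents]:
        if (directory / "scrapy.cfg").is_file():
            return directory
    return None


if __name__ == "__main__":
    root = find_project_root()
    if root is None:
        print("No scrapy.cfg found; cd into the cloned project first")
    else:
        print(f"Scrapy project found at {root}")
```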

5. Run the spider, which is named greatspider, with the following command:

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider

6. The spider will crawl the website and display the results in the terminal.

7. You can also write the results to a file in a format such as JSON or XML:

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider -o movies.json

$ C:\Users\YOU\Desktop\MovieSpider> scrapy crawl greatspider -o movies.xml
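
Scrapy's JSON export writes the scraped items as a single JSON array, so the resulting file can be read back with the standard library. The following is a rough sketch, assuming the output was saved as `movies.json` as in the command above. The helper names `load_items` and `field_names` are invented for illustration, and no particular item fields are assumed, since those depend on how the spider in this repository defines its items.

```python
import json


def load_items(path):
    """Load the list of scraped items from a Scrapy JSON export."""
    with open(path, encoding="utf-8") as handle:
        items = json.load(handle)
    if not isinstance(items, list):
        raise ValueError(f"expected a JSON array in {path}")
    return items


def field_names(items):
    """Return the sorted field names that appear in any of the items."""
    return sorted({key for item in items for key in item})
```

For example, `len(load_items("movies.json"))` gives the number of scraped items, and `field_names(...)` on the same list shows which fields the spider collected.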

Congratulations!

You have successfully scraped your first website on the Internet!
