Scrapy Beginning Tutorial

Hi all. Today I started learning Scrapy, and in this post I'm going to start with Scrapy from the very beginning.

What is Scrapy?

Scrapy is an application framework for crawling websites and extracting structured data, which can be used for a wide range of applications such as data mining and information processing.

Take a look at the Scrapy documentation for more information.

Scrapy is written in Python, so you need some knowledge of Python to work with it.

For those who are beginners in Python, I would suggest these books: “A Byte of Python”, “Learn Python the Hard Way”, or “Dive Into Python”.

For those who already know Python, keep this in mind: at the time of writing, Scrapy supports Python 2.6 and 2.7. It doesn't support Python 3.

Let's look at the installation.

Prerequisites:

  • Python 2.7
  • lxml
  • OpenSSL
  • pip or easy_install (Python package managers)
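
Before installing, it can help to confirm that the prerequisites above are visible to your Python interpreter. Here is a minimal sketch of such a check. The helper name check_prerequisites is my own, and I am assuming that the OpenSSL prerequisite means the pyOpenSSL bindings, which are imported under the module name OpenSSL. If your setup provides OpenSSL differently, adjust the module list.

```python
from __future__ import print_function

import importlib
import sys


def check_prerequisites(modules=("lxml", "OpenSSL")):
    """Return a dict mapping each module name to True if it can be imported.

    The default module names are assumptions: "lxml" for lxml and
    "OpenSSL" for the pyOpenSSL bindings.
    """
    results = {}
    for name in modules:
        try:
            importlib.import_module(name)
            results[name] = True
        except Exception:
            # Usually ImportError, but a broken install can raise other errors.
            results[name] = False
    return results


if __name__ == "__main__":
    # Show the interpreter version so it can be compared with the list above.
    print("Python version: %d.%d" % sys.version_info[:2])
    for name, ok in sorted(check_prerequisites().items()):
        print("%-8s %s" % (name, "found" if ok else "MISSING"))
```

Save it under any file name you like and run it with the same python you plan to use for Scrapy. Anything reported as MISSING should be installed first.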

To install Scrapy, open a terminal (Ctrl + Alt + T) and type:

$ pip install Scrapy

or

$ easy_install Scrapy

After installation, type $ scrapy and you should see output like this:

Scrapy 0.18.4 - no active project

Usage:
  scrapy <command> [options] [args]

Available commands:
  bench         Run quick benchmark test
  fetch         Fetch a URL using the Scrapy downloader
  runspider     Run a self-contained spider (without creating a project)
  settings      Get settings values
  shell         Interactive scraping console
  startproject  Create new project
  version       Print Scrapy version
  view          Open URL in browser, as seen by Scrapy

  [ more ]      More commands available when run from project directory

Use "scrapy <command> -h" to see more info about a command

It shows the installed Scrapy version and the list of available commands.
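
The same check can also be done from inside Python, which is handy if you have more than one interpreter installed. This is only a small sketch: the function name installed_scrapy_version is made up for this post, and it reads the __version__ attribute of the scrapy package, falling back to "unknown" if that attribute is not there.

```python
from __future__ import print_function


def installed_scrapy_version():
    """Return the installed Scrapy version string, or None if Scrapy cannot be imported."""
    try:
        import scrapy
    except ImportError:
        return None
    # Fall back to "unknown" in case the package does not expose __version__.
    return getattr(scrapy, "__version__", "unknown")


if __name__ == "__main__":
    version = installed_scrapy_version()
    if version is None:
        print("Scrapy is not installed for this interpreter")
    else:
        print("Scrapy", version)
```

If this reports that Scrapy is not installed while the scrapy command works in the terminal, pip probably installed it for a different Python than the one you are running.
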
That's it. In my next post we will get started with coding.

Happy times! Thanks!

 

Author: Balaji

Hi, my name is Balaji and I work as a Lead Developer in India. I am interested in shell scripts, Python, Erlang, the Linux kernel, and machine learning.
