Crawl Frontier
Logging
Note: TO-DO. This page has not been written yet.
Logger Object
class Logger
A Logger object represents ...
EventLogger Object
class EventLogger
An EventLogger object represents ...