FireCrawl Document Loader
The FireCrawl Document Loader is a node for loading web content using the FireCrawl API. It allows users to crawl or scrape web pages and convert the content into document objects that can be used in language models and other AI applications.
Node Details
- Name: FireCrawl
- Type: Document
- Category: Document Loaders
- Version: 1.0
Parameters
- Text Splitter (optional)
  - Type: TextSplitter
  - Description: A text splitter to process the loaded documents
- URLs
  - Type: string
  - Description: URL to be crawled/scraped
- Crawler Type
  - Type: options
  - Options:
    - Crawl: Crawl a URL and all accessible subpages
    - Scrape: Scrape a URL and get its content
  - Default: Crawl
- Max Crawl Pages (implied from the code)
  - Type: string
  - Description: Maximum number of pages to crawl
- Generate Image Alt Text (implied from the code)
  - Type: boolean
  - Description: Whether to generate alternative text for images
- Return Only URLs (implied from the code)
  - Type: boolean
  - Description: Whether to return only URLs without content
- Only Main Content (implied from the code)
  - Type: boolean
  - Description: Whether to extract only the main content of the page
- URL Patterns Excludes (implied from the code)
  - Type: string
  - Description: Comma-separated list of URL patterns to exclude from crawling
- URL Patterns Includes (implied from the code)
  - Type: string
  - Description: Comma-separated list of URL patterns to include in crawling
- Metadata (optional, implied from the code)
  - Type: string or object
  - Description: Additional metadata to add to the documents
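The include/exclude parameters take comma-separated URL patterns. A minimal sketch of how such filtering might behave, treating the patterns as shell-style globs (the helper name `filter_urls` and the glob semantics are assumptions for illustration, not FireCrawl's actual matching rules):

```python
from fnmatch import fnmatch

def filter_urls(urls, includes="", excludes=""):
    """Hypothetical sketch: keep URLs matching any include pattern
    (or all URLs when no includes are given), then drop any URL
    matching an exclude pattern. Patterns are comma-separated globs."""
    inc = [p.strip() for p in includes.split(",") if p.strip()]
    exc = [p.strip() for p in excludes.split(",") if p.strip()]
    kept = []
    for url in urls:
        if inc and not any(fnmatch(url, p) for p in inc):
            continue  # does not match any include pattern
        if any(fnmatch(url, p) for p in exc):
            continue  # matches an exclude pattern
        kept.append(url)
    return kept
```

For example, `filter_urls(["https://example.com/blog/a", "https://example.com/admin/x"], excludes="*/admin/*")` keeps only the blog URL.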
Credentials
- FireCrawl API
  - Type: credential
  - Credential Names: fireCrawlApi
Input
The node takes various configuration parameters as input, including the URL to crawl/scrape, crawler options, and API credentials.
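As a sketch, a crawl-mode input configuration might look like the following; the field names here are illustrative assumptions, not the node's exact schema:

```python
# Hypothetical input configuration for the FireCrawl node.
# Field names are illustrative, not the node's exact schema.
crawl_config = {
    "urls": "https://example.com",
    "crawlerType": "crawl",        # "crawl" or "scrape"
    "maxCrawlPages": "10",
    "generateImgAltText": False,
    "returnOnlyUrls": False,
    "onlyMainContent": True,
    "urlPatternsExcludes": "*/admin/*,*/login/*",
    "urlPatternsIncludes": "*/blog/*",
    "metadata": {"source_tag": "docs-crawl"},
}
```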
Output
The node outputs an array of Document objects. Each Document contains:
- pageContent: The content of the web page (in Markdown format)
- metadata: Associated metadata for the document
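The output shape can be sketched with a simplified stand-in class (not the actual Document class the node uses):

```python
from dataclasses import dataclass, field

@dataclass
class Document:
    """Simplified stand-in for the Document objects the node emits."""
    pageContent: str                           # page content, in Markdown
    metadata: dict = field(default_factory=dict)  # e.g. source URL, title

# Example of what one output document might look like.
doc = Document(
    pageContent="# Example Domain\n\nThis domain is for use in examples.",
    metadata={"source": "https://example.com", "title": "Example Domain"},
)
```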
Functionality
- The node initializes a FireCrawlLoader with the provided parameters.
- It then uses the loader to either crawl or scrape the specified URL(s).
- The resulting data is converted into Document objects.
- If a text splitter is provided, the documents are split accordingly.
- Additional metadata can be added to the documents if specified.
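The steps above can be sketched end to end. The loader is stubbed out (a real implementation would call the FireCrawl API in crawl or scrape mode) and the splitter is a naive paragraph splitter, so everything here is illustrative rather than the node's actual implementation:

```python
def load_documents(urls, crawler_type="crawl", text_splitter=None, extra_metadata=None):
    """Illustrative sketch of the node's flow: load raw pages (stubbed
    here instead of calling the FireCrawl API), wrap them as documents,
    optionally split them, and merge in extra metadata."""
    # 1. Stubbed loader: stands in for a FireCrawl crawl/scrape call.
    raw_pages = [{"content": "Intro paragraph.\n\nDetails paragraph.",
                  "url": url} for url in urls]

    # 2. Convert raw results into document dicts.
    docs = [{"pageContent": p["content"], "metadata": {"source": p["url"]}}
            for p in raw_pages]

    # 3. Optionally split each document's content into chunks.
    if text_splitter is not None:
        docs = [{"pageContent": chunk, "metadata": dict(d["metadata"])}
                for d in docs for chunk in text_splitter(d["pageContent"])]

    # 4. Merge any user-supplied metadata into every document.
    for d in docs:
        d["metadata"].update(extra_metadata or {})
    return docs

def split_paragraphs(text):
    """Naive splitter: break on blank lines."""
    return [chunk for chunk in text.split("\n\n") if chunk]

docs = load_documents(["https://example.com"],
                      text_splitter=split_paragraphs,
                      extra_metadata={"project": "demo"})
```

With the stubbed page above, this yields two documents, each carrying the source URL plus the merged `project` metadata.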
Use Cases
- Web scraping for content analysis
- Building training datasets from web content
- Creating knowledge bases from websites
- Automating data collection for research or business intelligence
Notes
- The FireCrawl API key is required and should be set up in the credentials.
- The node supports both crawling (multiple pages) and scraping (single page) modes.
- Various options allow for customization of the crawling/scraping process, such as limiting the number of pages, including/excluding URL patterns, and focusing on main content.