The Spider Document Loaders node is a component designed to scrape and crawl web content using the Spider API. It allows users to extract text content from web pages, either by scraping a single page or crawling multiple pages within the same domain.
Text Splitter (optional)
Mode
Web Page URL
Limit
Additional Metadata (optional)
Additional Parameters (optional)
Omit Metadata Keys (optional)
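As a rough illustration of how the inputs above fit together, the sketch below models them as a plain Python data class with basic validation. It is not the node's actual implementation. The class name `SpiderLoaderInputs`, the field names, the mode strings `"scrape"` and `"crawl"`, and the default limit of 25 are assumptions made for this example. The Text Splitter input is left out because it is a separate connected component rather than a simple value.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List
from urllib.parse import urlparse


@dataclass
class SpiderLoaderInputs:
    """Hypothetical container mirroring the node's inputs (names are assumed)."""

    url: str                      # Web Page URL
    mode: str = "scrape"          # Mode: assumed values "scrape" or "crawl"
    limit: int = 25               # Limit: assumed default, caps pages when crawling
    additional_metadata: Dict[str, Any] = field(default_factory=dict)
    additional_parameters: Dict[str, Any] = field(default_factory=dict)
    omit_metadata_keys: List[str] = field(default_factory=list)

    def validate(self) -> None:
        """Raise ValueError if the inputs are obviously inconsistent."""
        if self.mode not in ("scrape", "crawl"):
            raise ValueError(f"unknown mode: {self.mode!r}")
        parsed = urlparse(self.url)
        if parsed.scheme not in ("http", "https") or not parsed.netloc:
            raise ValueError(f"not an absolute http(s) URL: {self.url!r}")
        if self.limit < 1:
            raise ValueError("limit must be at least 1")


# Example: crawl up to five pages under one domain.
inputs = SpiderLoaderInputs(url="https://example.com", mode="crawl", limit=5)
inputs.validate()
```

Validating before any request is sent keeps configuration mistakes, such as a misspelled mode or a relative URL, from turning into failed API calls.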
An array of Document objects, each containing:
pageContent: The extracted text content from the web page
metadata: A combination of default metadata (e.g., source URL) and any additional metadata provided
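The output shape described above can be sketched in a few lines of Python. This is a minimal illustration, not the node's actual code. The `Document` class and the `build_document` helper are hypothetical, the Python field `page_content` stands in for `pageContent`, and the order of operations (default metadata first, then additional metadata, then removal of omitted keys) is an assumption.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Iterable, Optional


@dataclass
class Document:
    """Hypothetical stand-in for one output Document."""

    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def build_document(
    text: str,
    source_url: str,
    additional_metadata: Optional[Dict[str, Any]] = None,
    omit_keys: Iterable[str] = (),
) -> Document:
    """Combine default and additional metadata, then drop omitted keys."""
    # Default metadata: only the source URL is assumed here.
    metadata: Dict[str, Any] = {"source": source_url}
    # Additional metadata is layered on top (assumed precedence).
    metadata.update(additional_metadata or {})
    # Keys listed under Omit Metadata Keys are removed last.
    for key in omit_keys:
        metadata.pop(key, None)
    return Document(page_content=text, metadata=metadata)


doc = build_document(
    "Hello from the page",
    "https://example.com",
    additional_metadata={"team": "docs", "draft": True},
    omit_keys=["draft"],
)
print(doc.metadata)
```

Here the `draft` key is supplied as additional metadata and then omitted, so the resulting metadata contains only `source` and `team`.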