GitHub Document Loader

Node Details

Repo Link (required)
- Type: string
- Description: The URL of the GitHub repository
- Example: https://github.com/Ardor_Cerebrum/Ardor
Branch (required)
- Type: string
- Default: "main"
- Description: The branch of the repository to load from
Recursive (optional)
- Type: boolean
- Description: Whether to recursively traverse the repository
Max Concurrency (optional)
- Type: number
- Description: Maximum number of concurrent operations
Ignore Paths (optional)
- Type: string (JSON array)
- Description: An array of paths to be ignored
- Example: ["*.md"]
Max Retries (optional)
- Type: number
- Description: Maximum number of retries for a single call
- Default: 2
Text Splitter (optional)
- Type: TextSplitter
- Description: A text splitter to apply to the loaded documents
Additional Metadata (optional)
- Type: JSON
- Description: Additional metadata to be added to the extracted documents
Omit Metadata Keys (optional)
- Type: string
- Description: Comma-separated list of metadata keys to omit from the documents
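Two of the fields above take structured values encoded as plain strings: Ignore Paths is a JSON array, and Omit Metadata Keys is a comma-separated list. The following is a minimal sketch of how such values could be parsed and checked before being entered into the node. The helper names, the sample key names, and the use of shell-style glob matching are assumptions made for illustration; the node's own matcher may behave differently.

```python
import json
from fnmatch import fnmatchcase


def parse_ignore_paths(raw: str) -> list[str]:
    """Parse an Ignore Paths value: a JSON array of strings, passed as one string."""
    patterns = json.loads(raw)
    if not isinstance(patterns, list) or not all(isinstance(p, str) for p in patterns):
        raise ValueError("Ignore Paths must be a JSON array of strings")
    return patterns


def parse_omit_keys(raw: str) -> list[str]:
    """Parse an Omit Metadata Keys value: key names separated by commas."""
    return [key.strip() for key in raw.split(",") if key.strip()]


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Assumption: patterns are shell-style globs; the real node may match differently."""
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# Hypothetical field values, written the way they would be typed into the node.
patterns = parse_ignore_paths('["*.md"]')
omit_keys = parse_omit_keys("source, branch")

print(patterns, omit_keys)
print(is_ignored("README.md", patterns), is_ignored("src/index.ts", patterns))
```

JSON requires straight double quotes, so a value pasted with typographic quotes will fail to parse in this sketch.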

The node outputs an array of IDocument objects representing the loaded and processed documents from the GitHub repository.

When accessing private repositories, make sure to provide the appropriate GitHub API credentials.
The node supports various options for customizing the loading process, including recursive traversal, concurrency control, and retry logic.
Use the text splitter option to break down large documents into smaller chunks if needed.
The additional metadata and metadata key omission features allow for fine-grained control over the document metadata.
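To make the metadata options concrete, here is a small sketch of how additional metadata and key omission might combine on a single document. It assumes that additional metadata overrides existing keys on conflict and that omission is applied after merging; the page does not state the order, so treat both as assumptions. The function name and sample keys are hypothetical.

```python
from typing import Any


def apply_metadata_options(
    metadata: dict[str, Any],
    additional: dict[str, Any],
    omit_keys: list[str],
) -> dict[str, Any]:
    """Return a new metadata dict with extra keys merged in and unwanted keys dropped."""
    # Assumption: additional metadata wins when a key exists in both dicts.
    merged = {**metadata, **additional}
    # Assumption: omission runs after the merge, so it can also drop added keys.
    return {key: value for key, value in merged.items() if key not in omit_keys}


# Hypothetical metadata for one loaded document.
original = {"source": "README.md", "branch": "main"}
result = apply_metadata_options(original, {"project": "demo"}, ["branch"])

print(result)
```

The function builds a new dict instead of mutating its input, so the original metadata stays available for comparison.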
