
Node Details
- Name: folderFiles
- Type: Document
- Category: Document Loaders
- Version: 3.0
Parameters
- Folder Path
  - Type: string
  - Description: The path to the folder containing the files to be processed.
- Recursive
  - Type: boolean
  - Description: If set to true, the loader also searches subdirectories of the folder for files.
- Text Splitter
  - Type: TextSplitter
  - Optional: Yes
  - Description: A text splitter to be applied to the loaded documents.
- Pdf Usage
  - Type: options
  - Options:
    - One document per page
    - One document per file
  - Default: One document per page
  - Description: Determines whether each page of a PDF or each whole PDF file becomes one document.
- JSONL Pointer Extraction
  - Type: string
  - Optional: Yes
  - Description: The pointer that identifies which value to extract from each record of a JSONL file.
- Additional Metadata
  - Type: json
  - Optional: Yes
  - Description: Additional metadata to be added to the extracted documents.
- Omit Metadata Keys
  - Type: string
  - Optional: Yes
  - Description: Comma-separated list of metadata keys to be omitted from the extracted documents. Use * to omit all metadata keys except those specified in Additional Metadata.
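
The JSONL Pointer Extraction parameter can be pictured with a small Python sketch. This is an illustration only, not the node's implementation: it assumes the pointer follows the JSON Pointer style (for example "/text"), and the helper names resolve_pointer and extract_from_jsonl are hypothetical.

```python
import json
from typing import Any, List


def resolve_pointer(obj: Any, pointer: str) -> Any:
    """Follow a JSON Pointer style path such as "/text" or "/a/0/b"."""
    if pointer in ("", "/"):
        return obj
    # Assumption: a leading "/" is optional, so "text" and "/text" behave alike.
    path = pointer[1:] if pointer.startswith("/") else pointer
    for raw in path.split("/"):
        token = raw.replace("~1", "/").replace("~0", "~")
        # Lists are indexed by integer position, dicts by key.
        obj = obj[int(token)] if isinstance(obj, list) else obj[token]
    return obj


def extract_from_jsonl(jsonl_text: str, pointer: str) -> List[str]:
    """Return one extracted value (as text) per non-empty JSONL line."""
    values = []
    for line in jsonl_text.splitlines():
        if line.strip():
            values.append(str(resolve_pointer(json.loads(line), pointer)))
    return values


sample = '{"id": 1, "text": "first record"}\n{"id": 2, "text": "second record"}\n'
print(extract_from_jsonl(sample, "/text"))  # ['first record', 'second record']
```

Each JSONL line is parsed on its own, so one record yields one extracted value; whether the node expects the pointer with or without the leading slash is not stated here and should be checked against the node itself.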
Supported File Formats
- JSON (.json)
- JSONL (.jsonl)
- Text (.txt)
- CSV and Excel (.csv, .xls, .xlsx)
- Word Documents (.doc, .docx)
- PDF (.pdf)
- ASP (.aspx, .asp)
- C++ (.cpp, .h)
- C (.c)
- C# (.cs)
- CSS (.css)
- Go (.go)
- Kotlin (.kt)
- Java (.java)
- JavaScript (.js)
- Less (.less)
- TypeScript (.ts)
- PHP (.php)
- Protocol Buffers (.proto)
- Python (.python, .py)
- reStructuredText (.rst)
- Ruby (.ruby, .rb)
- Rust (.rs)
- Scala (.scala, .sc)
- Sass (.scss)
- Solidity (.sol)
- SQL (.sql)
- Swift (.swift)
- Markdown (.markdown, .md)
- LaTeX (.tex, .ltx)
- HTML (.html)
- Visual Basic (.vb)
- XML (.xml)
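
How the extension list and the Recursive parameter interact can be sketched in Python as follows. This is a rough illustration under stated assumptions, not the node's code: SUPPORTED_EXTENSIONS is a hypothetical subset of the list above, find_supported_files is a made-up helper, and case-insensitive extension matching is an assumption.

```python
import tempfile
from pathlib import Path
from typing import List

# Hypothetical subset of the extensions listed above, for illustration only.
SUPPORTED_EXTENSIONS = {".json", ".jsonl", ".txt", ".csv", ".pdf", ".md", ".py"}


def find_supported_files(folder: str, recursive: bool) -> List[str]:
    """Return sorted relative paths of files whose extension is supported."""
    root = Path(folder)
    # rglob("*") descends into subdirectories; glob("*") stays at the top level.
    candidates = root.rglob("*") if recursive else root.glob("*")
    return sorted(
        p.relative_to(root).as_posix()
        for p in candidates
        if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS
    )


# Build a throwaway folder to show the difference between the two modes.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "sub").mkdir()
    (Path(tmp) / "a.txt").write_text("hello")
    (Path(tmp) / "b.exe").write_text("skipped")
    (Path(tmp) / "sub" / "c.md").write_text("# nested")
    flat = find_supported_files(tmp, recursive=False)
    deep = find_supported_files(tmp, recursive=True)

print(flat)  # ['a.txt']
print(deep)  # ['a.txt', 'sub/c.md']
```

Files with an unlisted extension (b.exe here) are skipped in both modes, and the nested file only appears when recursive is true.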
Input
- Folder path and configuration options as specified in the parameters.
Output
- An array of document objects, each containing the content of a file and its associated metadata.
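
To make the output shape concrete, here is a minimal Python sketch of a document object and of the two Pdf Usage modes. The Document class, its field names (page_content, metadata), the metadata keys, the mode strings "per_page" and "per_file", and joining pages with blank lines are all assumptions chosen for illustration; the node's actual field names may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for one entry of the output array."""
    page_content: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def pdf_to_documents(source: str, pages: List[str], usage: str) -> List[Document]:
    """Turn the page texts of one PDF into documents according to the usage mode."""
    if usage == "per_page":
        # One document per page, each remembering its page number.
        return [
            Document(text, {"source": source, "page": number})
            for number, text in enumerate(pages, start=1)
        ]
    if usage == "per_file":
        # One document for the whole file; pages are joined with blank lines.
        return [Document("\n\n".join(pages), {"source": source})]
    raise ValueError(f"unknown usage mode: {usage}")


pages = ["Page one text.", "Page two text."]
per_page = pdf_to_documents("report.pdf", pages, "per_page")
per_file = pdf_to_documents("report.pdf", pages, "per_file")
print(len(per_page), len(per_file))  # 2 1
```

With the default, One document per page, a two-page PDF contributes two entries to the output array; with One document per file it contributes one.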
Functionality
- The node creates a DirectoryLoader with specific loaders for each supported file type.
- It loads documents from the specified folder, optionally searching recursively.
- If a text splitter is provided, it splits the loaded documents.
- Additional metadata is added to each document if specified.
- Metadata keys are omitted based on the “Omit Metadata Keys” parameter.
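
The last two steps, adding Additional Metadata and applying Omit Metadata Keys, can be sketched in Python like this. It is an interpretation of the parameter descriptions, not the node's source: apply_metadata_rules is a hypothetical helper, and letting Additional Metadata win on a key collision is an assumption.

```python
from typing import Any, Dict


def apply_metadata_rules(
    metadata: Dict[str, Any],
    additional: Dict[str, Any],
    omit_keys: str,
) -> Dict[str, Any]:
    """Merge additional metadata into a document's metadata, then drop omitted keys."""
    # "a, b ,c" -> ["a", "b", "c"]; empty entries are ignored.
    omit = [key.strip() for key in omit_keys.split(",") if key.strip()]
    if "*" in omit:
        # Wildcard: keep only what Additional Metadata supplies.
        return dict(additional)
    # Assumption: additional metadata overrides loader metadata on collisions.
    merged = {**metadata, **additional}
    return {key: value for key, value in merged.items() if key not in omit}


loaded = {"source": "notes/a.txt", "loc": 1}
extra = {"project": "demo"}

print(apply_metadata_rules(loaded, extra, ""))     # {'source': 'notes/a.txt', 'loc': 1, 'project': 'demo'}
print(apply_metadata_rules(loaded, extra, "loc"))  # {'source': 'notes/a.txt', 'project': 'demo'}
print(apply_metadata_rules(loaded, extra, "*"))    # {'project': 'demo'}
```

The wildcard case is the one worth noting: * discards every key produced by the loader, so the resulting documents carry only the keys given in Additional Metadata.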
Use Cases
- Bulk loading of documents from a file system for processing or analysis.
- Preparing diverse document sets for ingestion into language models or other NLP tasks.
- Extracting and organizing content from multiple file types in a structured manner.