
Node Details
- Name: Text_DocumentLoaders
- Label: Text File
- Version: 3.0
- Type: Document
- Category: Document Loaders
Input Parameters
- Txt File (required)
  - Type: file
  - Supported formats: .txt, .html, .aspx, .asp, .cpp, .c, .cs, .css, .go, .h, .java, .js, .less, .ts, .php, .proto, .python, .py, .rst, .ruby, .rb, .rs, .scala, .sc, .scss, .sol, .sql, .swift, .markdown, .md, .tex, .ltx, .vb, .xml
- Text Splitter (optional)
  - Type: TextSplitter
  - Purpose: Splits the loaded text into smaller chunks
- Additional Metadata (optional)
  - Type: JSON
  - Description: Extra metadata to be added to the extracted documents
- Omit Metadata Keys (optional)
  - Type: string
  - Description: Comma-separated list of metadata keys to omit from the default set. Use * to omit all keys except those specified in Additional Metadata.
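
The interaction between Additional Metadata and Omit Metadata Keys can be sketched as follows. This is a minimal illustration in plain Python, not the node's actual implementation: the function name `build_metadata`, the sample keys (`source`, `blobType`), and the rule that user-supplied values win on a key clash are all assumptions. It also assumes the omit list filters only the default keys, as the description above suggests.

```python
from typing import Any, Dict, Optional


def build_metadata(
    default_metadata: Dict[str, Any],
    additional_metadata: Optional[Dict[str, Any]] = None,
    omit_keys: str = "",
) -> Dict[str, Any]:
    """Combine default and user-supplied metadata, honouring the omit list."""
    additional = dict(additional_metadata or {})
    # Parse the comma-separated string, ignoring blanks and stray whitespace.
    omitted = {key.strip() for key in omit_keys.split(",") if key.strip()}

    if "*" in omitted:
        # "*" drops every default key; only Additional Metadata survives.
        return additional

    kept = {k: v for k, v in default_metadata.items() if k not in omitted}
    # Assumption: user-supplied values win when a key appears in both.
    return {**kept, **additional}


# Hypothetical default metadata and user-supplied extras.
defaults = {"source": "notes.txt", "blobType": "text/plain"}
extra = {"project": "demo"}

print(build_metadata(defaults, extra, "blobType"))
print(build_metadata(defaults, extra, "*"))
```

With `"blobType"` as the omit list, the result keeps `source` and adds `project`; with `"*"`, only `project` remains.
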
Outputs
- Document
  - Description: Array of document objects containing metadata and pageContent
  - Base Classes: Document, json
- Text
  - Description: Concatenated string from pageContent of documents
  - Base Classes: string, json
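
To make the difference between the two outputs concrete, here is a small sketch. The `Document` dataclass is a hypothetical stand-in that carries only the two fields named above, and the newline separator in `to_text` is an assumption; the node's real separator may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class Document:
    """Hypothetical stand-in for a document object (pageContent plus metadata)."""
    pageContent: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def to_text(documents: List[Document], separator: str = "\n") -> str:
    """Build the Text output by joining the pageContent of every document."""
    # Assumption: chunks are joined with a newline.
    return separator.join(doc.pageContent for doc in documents)


# The Document output is the list itself; the Text output is the joined string.
docs = [
    Document("First chunk.", {"source": "a.txt"}),
    Document("Second chunk.", {"source": "a.txt"}),
]
text = to_text(docs)
print(text)
```

The Document output preserves per-chunk metadata, while the Text output discards it and keeps only the content.
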
Functionality
- File Loading:
  - Supports loading from local storage or base64-encoded file data
  - Can handle a single file or multiple files (passed as a JSON array)
- Text Processing:
  - Uses LangChain's TextLoader to load text content
  - Optionally splits the text using the provided TextSplitter
- Metadata Management:
  - Adds user-provided additional metadata
  - Can omit specific or all default metadata keys
  - Merges existing and new metadata
- Output Formatting:
  - Can output either Document objects or concatenated text
  - Handles escape characters in text output
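
The file-loading behaviour described above (one file or a JSON array of files, each possibly base64-encoded) might look roughly like this. The helper names `parse_file_input` and `decode_base64_entry` are hypothetical, and the `data:<mime>;base64,<payload>` entry format and UTF-8 encoding are assumptions made for illustration; the node's actual wire format is not specified here.

```python
import base64
import json
from typing import List


def parse_file_input(raw: str) -> List[str]:
    """Return a list of file entries from a single value or a JSON array string."""
    stripped = raw.strip()
    if stripped.startswith("[") and stripped.endswith("]"):
        # Multiple files arrive as a JSON array of entries.
        return [str(item) for item in json.loads(stripped)]
    return [stripped]


def decode_base64_entry(entry: str) -> str:
    """Decode an assumed 'data:<mime>;base64,<payload>' entry into text."""
    header, _, payload = entry.partition("base64,")
    if not header.startswith("data:") or not payload:
        raise ValueError("entry is not in the assumed data-URI format")
    # Assumption: the file content is UTF-8 text.
    return base64.b64decode(payload).decode("utf-8")


single = "data:text/plain;base64,aGVsbG8="
multiple = json.dumps([single, "data:text/plain;base64,d29ybGQ="])

texts = [decode_base64_entry(entry) for entry in parse_file_input(multiple)]
print(texts)
```

Normalising the input to a list first means the rest of the pipeline can treat the single-file and multi-file cases identically.
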
Use Cases
- Loading and processing text-based documents from various sources
- Preparing text data for further NLP or machine learning tasks
- Extracting and managing metadata from text documents
- Splitting large text documents into manageable chunks
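
As an illustration of the chunking use case, the sketch below splits text into fixed-size character windows with overlap. The node itself delegates to whichever TextSplitter is connected; `split_text` and its default parameter values are hypothetical and only show the general idea.

```python
from typing import List


def split_text(text: str, chunk_size: int = 1000, chunk_overlap: int = 200) -> List[str]:
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")

    step = chunk_size - chunk_overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        # Stop once a window has reached the end of the text.
        if start + chunk_size >= len(text):
            break
    return chunks


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap repeats the tail of each chunk at the head of the next, which helps keep context that would otherwise be cut at a chunk boundary.
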
Notes
- The node accepts both direct file uploads and references to files in storage
- When a Text Splitter is connected, large documents are segmented into chunks as part of loading
- Additional Metadata and Omit Metadata Keys together determine which metadata is attached to each document