Plain Text Document Loader
The Plain Text Document Loader is a component designed to load and process plain text data. It can split the text into smaller chunks if needed and allows for additional metadata to be added to the documents.
Node Details
- Name: PlainText_DocumentLoaders
- Type: Document
- Category: Document Loaders
- Version: 2.0
Input Parameters
- Text (required)
  - Type: string
  - Description: The input plain text to be processed
  - UI: Multiline text input (4 rows)
- Text Splitter (optional)
  - Type: TextSplitter
  - Description: An optional text splitter to divide the input text into smaller chunks
- Additional Metadata (optional)
  - Type: JSON
  - Description: Additional metadata to be added to the extracted documents
- Omit Metadata Keys (optional)
  - Type: string
  - Description: A comma-separated list of metadata keys to omit from the final documents. Use '*' to omit all default metadata keys except those specified in Additional Metadata.
  - UI: Multiline text input (4 rows)
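The interaction between Additional Metadata and Omit Metadata Keys can be sketched as follows. This is a minimal illustration of the rules described above, not the node's actual source: the helper name applyMetadataRules, the Metadata type, and the sample keys (source, loc, project) are all hypothetical. It also assumes that named keys are removed after the additional metadata has been merged in, which the description does not state explicitly.

```typescript
// Hypothetical helper illustrating the documented rules; not the node's source.
type Metadata = Record<string, unknown>;

function applyMetadataRules(
  defaults: Metadata,
  additional: Metadata,
  omitKeys: string
): Metadata {
  const trimmed = omitKeys.trim();

  // '*' drops every default key, so only Additional Metadata survives.
  if (trimmed === "*") {
    return { ...additional };
  }

  // Otherwise parse the comma-separated list, ignoring blank entries.
  const keysToOmit = trimmed
    .split(",")
    .map((key) => key.trim())
    .filter((key) => key.length > 0);

  // Assumption: additional metadata is merged first, then named keys are removed.
  const merged: Metadata = { ...defaults, ...additional };
  const result: Metadata = {};
  for (const key of Object.keys(merged)) {
    if (keysToOmit.indexOf(key) === -1) {
      result[key] = merged[key];
    }
  }
  return result;
}

// Sample values are made up for illustration.
const sampleDefaults: Metadata = { source: "paste", loc: 1 };
const sampleExtra: Metadata = { project: "demo" };

const withoutLoc = applyMetadataRules(sampleDefaults, sampleExtra, "loc");
const onlyExtra = applyMetadataRules(sampleDefaults, sampleExtra, "*");
```

Under these assumptions, withoutLoc keeps source and project, while onlyExtra keeps only project.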
Outputs
- Document
  - Description: An array of document objects containing metadata and pageContent
  - Type: Document, JSON
- Text
  - Description: Concatenated string from pageContent of all documents
  - Type: string, JSON
Functionality
- The node takes plain text as input.
- If a text splitter is provided, it splits the text into multiple documents.
- If additional metadata is provided, it’s added to each document.
- The node can selectively omit certain metadata keys based on user input.
- The output can be either an array of document objects or a concatenated string of all document contents.
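The steps above can be sketched end to end. The example below is a simplified model under stated assumptions, not the node's implementation: Doc is a local stand-in for the document shape (pageContent plus metadata), fixedSizeSplitter is a hypothetical character-based splitter used in place of a real TextSplitter, and the newline separator in toText is an assumption, since the separator used for the Text output is not specified here.

```typescript
// Local stand-in for a document object: page content plus metadata.
interface Doc {
  pageContent: string;
  metadata: Record<string, unknown>;
}

// Hypothetical splitter type: takes text, returns chunks.
type Splitter = (text: string) => string[];

// Fixed-size character chunks with no overlap (illustrative only).
function fixedSizeSplitter(chunkSize: number): Splitter {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  return (text: string): string[] => {
    const chunks: string[] = [];
    for (let i = 0; i < text.length; i += chunkSize) {
      chunks.push(text.slice(i, i + chunkSize));
    }
    return chunks;
  };
}

// Build documents from plain text, optionally splitting it first.
function loadPlainText(
  text: string,
  splitter?: Splitter,
  additionalMetadata: Record<string, unknown> = {}
): Doc[] {
  // Without a splitter, the whole input becomes a single document.
  const pieces = splitter ? splitter(text) : [text];
  return pieces.map((piece) => ({
    pageContent: piece,
    // Each document gets its own copy of the additional metadata.
    metadata: { ...additionalMetadata },
  }));
}

// "Text" output: join the pageContent of every document.
// Assumption: documents are separated by a newline.
function toText(docs: Doc[]): string {
  return docs.map((doc) => doc.pageContent).join("\n");
}

const exampleDocs = loadPlainText("abcdefghij", fixedSizeSplitter(4), { lang: "en" });
const exampleText = toText(exampleDocs);
```

Here exampleDocs corresponds to the Document output (three chunks, each carrying the lang metadata) and exampleText to the Text output. Keeping the splitter as an optional function argument mirrors the optional Text Splitter input.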
Use Cases
- Loading and preprocessing plain text data for further NLP tasks
- Splitting large text documents into smaller, manageable chunks
- Adding custom metadata to text documents
- Preparing text data for use in language models or other AI applications
Notes
- The node handles escape characters in the output text when returning the concatenated string.
- It uses the @langchain/core/documents package for document handling.
- The node is flexible in handling metadata, allowing users to add custom metadata and omit default metadata as needed.