Node Details

  • Name: Text_DocumentLoaders
  • Label: Text File
  • Version: 3.0
  • Type: Document
  • Category: Document Loaders

Input Parameters

  1. Txt File (required)

    • Type: file
    • Supported formats: .txt, .html, .aspx, .asp, .cpp, .c, .cs, .css, .go, .h, .java, .js, .less, .ts, .php, .proto, .python, .py, .rst, .ruby, .rb, .rs, .scala, .sc, .scss, .sol, .sql, .swift, .markdown, .md, .tex, .ltx, .vb, .xml
  2. Text Splitter (optional)

    • Type: TextSplitter
    • Description: Splits the loaded text into smaller chunks
  3. Additional Metadata (optional)

    • Type: JSON
    • Description: Extra metadata to be added to the extracted documents
  4. Omit Metadata Keys (optional)

    • Type: string
    • Description: Comma-separated list of metadata keys to omit from the default set. Use * to omit all keys except those specified in Additional Metadata.
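
The interplay between Additional Metadata and Omit Metadata Keys can be sketched as follows. This is a minimal illustration, not the node's actual implementation: the helper name apply_metadata, the sample keys (source, blobType), and the order of operations (omit from the defaults first, then merge the additional values) are all assumptions made for the example.

```python
import json


def apply_metadata(default_meta, additional_json="", omit_keys=""):
    """Return the metadata for one document (hypothetical helper)."""
    additional = json.loads(additional_json) if additional_json else {}
    if omit_keys.strip() == "*":
        # "*" drops every default key; only Additional Metadata remains.
        return dict(additional)
    # Parse the comma-separated list, ignoring surrounding whitespace.
    omitted = {key.strip() for key in omit_keys.split(",") if key.strip()}
    kept = {k: v for k, v in default_meta.items() if k not in omitted}
    # Assumption: user-provided values win when a key appears in both.
    return {**kept, **additional}


# Sample default keys are illustrative, not the node's documented defaults.
default_meta = {"source": "notes.txt", "blobType": "text/plain"}
print(apply_metadata(default_meta, '{"project": "demo"}', "blobType"))
print(apply_metadata(default_meta, '{"project": "demo"}', "*"))
```

The first call keeps source and adds project; the second keeps only project, because * discards the whole default set.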

Outputs

  1. Document

    • Description: Array of document objects, each containing metadata and pageContent
    • Base Classes: Document, json
  2. Text

    • Description: A single string formed by concatenating the pageContent of all documents
    • Base Classes: string, json
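
The relationship between the two outputs can be sketched as follows. The documents are modeled here as plain dictionaries, and both the helper name to_text and the newline separator are assumptions for illustration; the node's actual separator is not stated in this document.

```python
def to_text(documents, separator="\n"):
    """Join the pageContent of every document into one string.

    Hypothetical helper; the separator is an assumption.
    """
    return separator.join(doc["pageContent"] for doc in documents)


# The Document output: an array of objects with metadata and pageContent.
documents = [
    {"pageContent": "First chunk.", "metadata": {"source": "notes.txt"}},
    {"pageContent": "Second chunk.", "metadata": {"source": "notes.txt"}},
]

# The Text output: the same content flattened, with metadata discarded.
print(to_text(documents))
```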

Functionality

  1. File Loading:

    • Supports loading from local storage or base64-encoded file data
    • Can handle single files or multiple files (passed as a JSON array)
  2. Text Processing:

    • Uses TextLoader from LangChain to load text content
    • Optionally splits text using provided TextSplitter
  3. Metadata Management:

    • Adds user-provided additional metadata
    • Can omit specific or all default metadata keys
    • Merges the loader's default metadata with the user-provided metadata
  4. Output Formatting:

    • Can output either as Document objects or concatenated text
    • Handles escape characters in text output
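
The file loading step for base64-encoded input can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names decode_file_input and as_data_uri are hypothetical, and the input is assumed to be a data URI of the form data:<mime>;base64,<payload>, either alone or inside a JSON array. The exact encoding used by the node, and the handling of files referenced in storage, are not covered here.

```python
import base64
import json


def decode_file_input(value):
    """Return the decoded text of each file in `value` (hypothetical helper)."""
    value = value.strip()
    # A JSON array carries several files; anything else is treated as one file.
    entries = json.loads(value) if value.startswith("[") else [value]
    texts = []
    for entry in entries:
        # Assumed layout: "data:<mime>;base64,<payload>".
        _header, _, payload = entry.partition(",")
        texts.append(base64.b64decode(payload).decode("utf-8"))
    return texts


def as_data_uri(text):
    """Build a sample data URI for the demo below."""
    payload = base64.b64encode(text.encode("utf-8")).decode("ascii")
    return "data:text/plain;base64," + payload


single = as_data_uri("hello")
several = json.dumps([as_data_uri("one"), as_data_uri("two")])
print(decode_file_input(single))
print(decode_file_input(several))
```

Returning a list in both cases lets the rest of the pipeline treat single and multiple files uniformly.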

Use Cases

  • Loading and processing text-based documents from various sources
  • Preparing text data for further NLP or machine learning tasks
  • Extracting and managing metadata from text documents
  • Splitting large text documents into manageable chunks
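
The last use case, splitting a large document into chunks, can be sketched with a simple fixed-size splitter. This is a stand-in written for illustration, not the algorithm of any particular TextSplitter; the function name split_text and its chunk_size and chunk_overlap parameters are assumptions.

```python
def split_text(text, chunk_size=20, chunk_overlap=5):
    """Split text into fixed-size character chunks that overlap (illustrative only)."""
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be >= 0 and smaller than chunk_size")
    step = chunk_size - chunk_overlap
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        # Stop once the current chunk reaches the end of the text,
        # so no redundant tail chunk is produced.
        if start + chunk_size >= len(text):
            break
        start += step
    return chunks


print(split_text("abcdefghij", chunk_size=4, chunk_overlap=1))
```

The overlap repeats the end of each chunk at the start of the next, which helps preserve context that would otherwise be cut at a chunk boundary.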

Notes

  • File input can be either a direct upload or a reference to a file already in storage
  • When a Text Splitter is connected, large documents are segmented into chunks as they are loaded
  • Additional Metadata and Omit Metadata Keys together control which metadata is attached to each document