The HtmlToMarkdown Text Splitter is a specialized text splitter that converts HTML content to Markdown and then splits the resulting Markdown text into smaller chunks based on headers. This node is particularly useful for processing HTML documents and preparing them for further natural language processing or analysis tasks.
It uses the NodeHtmlMarkdown.translate() function to convert the HTML to Markdown.
The splitting is then performed by the MarkdownTextSplitter class from the langchain/text_splitter package.
The node extends the MarkdownTextSplitter class to handle HTML input.
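
The convert-then-split pattern described above can be sketched without any dependencies. The block below is only an illustration under stated assumptions, not the node's actual code: `SimpleMarkdownSplitter`, `toyHtmlToMarkdown`, and `HtmlAwareSplitter` are hypothetical stand-ins for MarkdownTextSplitter, NodeHtmlMarkdown.translate(), and the node's own class. The toy converter handles only headings and paragraphs, the stand-in splitter ignores chunk-size settings, and it is synchronous for brevity, whereas the real splitText method returns a Promise.

```typescript
// Hypothetical stand-in for MarkdownTextSplitter: starts a new chunk at each
// Markdown header line ("# ", "## ", ...). The real class also honours chunk
// size and overlap settings, which this sketch leaves out.
class SimpleMarkdownSplitter {
  splitText(markdown: string): string[] {
    const chunks: string[] = [];
    let current: string[] = [];
    for (const line of markdown.split("\n")) {
      const isHeader = /^#{1,6}\s/.test(line);
      if (isHeader && current.length > 0) {
        chunks.push(current.join("\n").trim());
        current = [];
      }
      current.push(line);
    }
    if (current.length > 0) {
      chunks.push(current.join("\n").trim());
    }
    return chunks.filter((chunk) => chunk.length > 0);
  }
}

// Hypothetical stand-in for NodeHtmlMarkdown.translate(): converts only
// <h1>..<h6> and <p> elements and strips every other tag.
function toyHtmlToMarkdown(html: string): string {
  return html
    .replace(
      /<h([1-6])[^>]*>([\s\S]*?)<\/h\1>/gi,
      (_m: string, level: string, text: string) =>
        "\n" + "######".slice(0, Number(level)) + " " + text.trim() + "\n"
    )
    .replace(
      /<p[^>]*>([\s\S]*?)<\/p>/gi,
      (_m: string, text: string) => "\n" + text.trim() + "\n"
    )
    .replace(/<[^>]+>/g, "")
    .trim();
}

// Mirrors the "extend the Markdown splitter to handle HTML input" idea:
// convert first, then delegate to the parent's splitting logic.
class HtmlAwareSplitter extends SimpleMarkdownSplitter {
  splitText(html: string): string[] {
    const markdown = toyHtmlToMarkdown(html);
    return super.splitText(markdown);
  }
}

const sampleHtml =
  "<h1>Title</h1><p>Intro text.</p><h2>Section</h2><p>Body text.</p>";
const chunks = new HtmlAwareSplitter().splitText(sampleHtml);
console.log(chunks);
```

Overriding splitText in a subclass keeps the HTML handling in one place, so callers can pass raw HTML to the same method they would otherwise call with Markdown.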