LLMS.txt Parser API
API reference for @25xcodes/llmstxt-parser - parse, validate, and transform llms.txt files.
Functions
parseLLMSTxt
function parseLLMSTxt(markdown: string): LLMSTxtDocument
Parse llms.txt markdown content into a structured document.
Parameters:
markdown - Raw markdown content following the llmstxt.org spec
Returns: Parsed LLMSTxtDocument
Example:
import { parseLLMSTxt } from '@25xcodes/llmstxt-parser'
const doc = parseLLMSTxt(`# My Project
> A brief description of the project.
## Documentation
- [Getting Started](https://example.com/docs): Quick start guide
- [API Reference](https://example.com/api): Full API docs
`)
console.log(doc.title) // "My Project"
console.log(doc.summary) // "A brief description of the project."
console.log(doc.links) // Array of parsed links
validateLLMSTxt
function validateLLMSTxt(doc: LLMSTxtDocument): LLMSTxtValidationResult
Validate a parsed document against the llmstxt.org specification.
Parameters:
doc - Parsed LLMSTxtDocument
Returns: Validation result with errors, warnings, and quality score
Example:
import { parseLLMSTxt, validateLLMSTxt } from '@25xcodes/llmstxt-parser'
const doc = parseLLMSTxt(markdown)
const result = validateLLMSTxt(doc)
if (result.valid) {
console.log(`Score: ${result.score}/100`)
} else {
console.log('Errors:', result.errors)
}
fetchLLMSTxt
function fetchLLMSTxt(
urlOrDomain: string,
options?: FetchOptions
): Promise<LLMSTxtDocument>
Fetch and parse llms.txt from a URL or domain. Automatically tries well-known paths.
Discovery order:
/llms.txt
/llms-full.txt (if checkFull is true)
/.well-known/llms.txt
Parameters:
urlOrDomain - Full URL or domain name
options.timeout - Request timeout in ms (default: 10000)
options.checkFull - Also check for llms-full.txt (default: true)
options.corsProxy - CORS proxy URL for browser environments
Example:
import { fetchLLMSTxt } from '@25xcodes/llmstxt-parser'
// From domain (tries well-known paths)
const doc = await fetchLLMSTxt('example.com')
// With CORS proxy (for browsers)
const proxiedDoc = await fetchLLMSTxt('example.com', {
corsProxy: 'https://your-proxy.workers.dev'
})
discoverLLMSTxtFiles
function discoverLLMSTxtFiles(
domain: string,
options?: DiscoverOptions
): Promise<DiscoveredFile[]>
Discover all llms.txt files available for a domain.
Returns: Array of discovered files with their URLs and types
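The discovery order documented for fetchLLMSTxt can be pictured as a list of candidate URLs to probe. The sketch below is illustrative only: candidateUrls is a hypothetical helper, not part of the package, and it assumes HTTPS when only a bare domain is given.

```typescript
// Hypothetical helper (not part of @25xcodes/llmstxt-parser):
// lists the URLs to probe, in the documented discovery order.
function candidateUrls(domain: string, checkFull = true): string[] {
  // Accept "example.com" as well as "https://example.com/some/path".
  // Assumption: bare domains are probed over HTTPS.
  const withScheme = /^https?:\/\//i.test(domain) ? domain : `https://${domain}`
  const origin = new URL(withScheme).origin

  const paths = ['/llms.txt']
  if (checkFull) paths.push('/llms-full.txt')
  paths.push('/.well-known/llms.txt')

  return paths.map((path) => origin + path)
}

console.log(candidateUrls('example.com'))
```

Passing false as the second argument drops the llms-full.txt candidate, mirroring the checkFull option described above.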
estimateTokens
function estimateTokens(doc: LLMSTxtDocument): TokenEstimate
Estimate token count for a document (~4 characters per token).
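As a rough illustration of the 4-characters-per-token heuristic, the sketch below applies it to individual strings. approxTokens is a hypothetical helper, and rounding up is an assumption; the package may round or aggregate differently.

```typescript
// Hypothetical helper illustrating the ~4 characters per token heuristic.
// Assumption: partial tokens are rounded up.
function approxTokens(text: string): number {
  return Math.ceil(text.length / 4)
}

// Sum the estimates for two fields of a document.
const exampleTitle = 'My Project'
const exampleSummary = 'A brief description of the project.'
console.log(approxTokens(exampleTitle) + approxTokens(exampleSummary))
```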
Returns:
interface TokenEstimate {
total: number
breakdown: {
title: number
summary: number
sections: number
links: number
}
}
toRAGFormat
function toRAGFormat(doc: LLMSTxtDocument): string
Convert document to plain text format optimized for RAG/embedding.
Returns: Plain text string suitable for vector embedding
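The exact output layout of toRAGFormat is not specified here. To show the general idea of flattening a parsed document into embeddable text, here is a minimal sketch that uses local stand-in types and a hypothetical flattenForEmbedding function. The layout it produces is an assumption, not the package's format.

```typescript
// Local stand-ins for a subset of the documented document shape.
interface SketchLink {
  title: string
  url: string
  description?: string
}

interface SketchSection {
  title: string
  links: SketchLink[]
}

interface SketchDocument {
  title: string
  summary?: string
  sections: SketchSection[]
}

// Hypothetical flattening: title, summary, then one line per link under
// each section title, with a blank line before every section.
function flattenForEmbedding(doc: SketchDocument): string {
  const lines: string[] = [doc.title]
  if (doc.summary) lines.push(doc.summary)
  for (const section of doc.sections) {
    lines.push('', section.title)
    for (const link of section.links) {
      const suffix = link.description ? `: ${link.description}` : ''
      lines.push(`${link.title} (${link.url})${suffix}`)
    }
  }
  return lines.join('\n')
}
```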
extractLinksForIndex
function extractLinksForIndex(doc: LLMSTxtDocument): RAGLinkEntry[]
Extract structured link data for vector database indexing.
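Each entry carries a combined text field intended for embedding. How the package combines the fields is not stated, so the sketch below shows one plausible approach with a hypothetical buildEmbeddingText helper. The field order and the " - " separator are assumptions.

```typescript
// Local stand-in for the link fields that feed the combined text.
interface LinkFields {
  title: string
  url: string
  description?: string
  section?: string
}

// Hypothetical combination: section, title, and description, skipping
// any field that is missing. The URL is left out of the embedded text.
function buildEmbeddingText(link: LinkFields): string {
  return [link.section, link.title, link.description]
    .filter((part): part is string => Boolean(part))
    .join(' - ')
}
```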
Returns:
interface RAGLinkEntry {
title: string
url: string
description?: string
section?: string
embedding_text: string // Combined text for embedding
}
parseAndValidate
function parseAndValidate(markdown: string): {
document: LLMSTxtDocument
validation: LLMSTxtValidationResult
}
Parse and validate in a single call.
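The validation half of that result could be turned into a short report along these lines. This is a sketch with local stand-in types: formatValidationReport is a hypothetical helper, and because the warning shape is not documented on this page, warnings are only counted.

```typescript
// Local stand-ins mirroring the documented validation result fields.
interface ReportError {
  code: string
  message: string
  line?: number
}

interface ReportInput {
  valid: boolean
  score: number
  errors: ReportError[]
  warnings: unknown[] // shape not documented here, so only the count is used
}

// Hypothetical helper: one summary line, then one line per error.
function formatValidationReport(result: ReportInput): string[] {
  const status = result.valid ? 'valid' : 'invalid'
  const lines = [`${status}, score ${result.score}/100, ${result.warnings.length} warning(s)`]
  for (const err of result.errors) {
    const where = err.line === undefined ? '' : ` (line ${err.line})`
    lines.push(`${err.code}: ${err.message}${where}`)
  }
  return lines
}
```

With the real package, the object passed in would be the validation property returned by parseAndValidate.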
Types
LLMSTxtDocument
interface LLMSTxtDocument {
title: string // H1 heading
summary?: string // Blockquote after title
sections: LLMSTxtSection[] // H2 sections
links: LLMSTxtLink[] // All extracted links
raw: string // Original markdown
sourceUrl?: string // URL if fetched
isFull?: boolean // True if from llms-full.txt
}
LLMSTxtSection
interface LLMSTxtSection {
title: string
content: string
links: LLMSTxtLink[]
}
LLMSTxtLink
interface LLMSTxtLink {
title: string
url: string
description?: string
section?: string // Parent section name
optional?: boolean // True if in Optional section
}
LLMSTxtValidationResult
interface LLMSTxtValidationResult {
valid: boolean
score: number // 0-100 quality score
errors: LLMSTxtValidationError[]
warnings: LLMSTxtValidationWarning[]
}
LLMSTxtValidationError
interface LLMSTxtValidationError {
code: string // e.g., 'MISSING_TITLE'
message: string
line?: number
}
Browser Usage
For browser environments, use the corsProxy option:
import { fetchLLMSTxt } from '@25xcodes/llmstxt-parser'
const doc = await fetchLLMSTxt('example.com', {
corsProxy: 'https://your-cors-proxy.workers.dev'
})
ESM and CommonJS
The package supports both module formats:
// ESM
import { parseLLMSTxt, validateLLMSTxt } from '@25xcodes/llmstxt-parser'
// CommonJS
const { parseLLMSTxt, validateLLMSTxt } = require('@25xcodes/llmstxt-parser')