@25xcodes/llmfeed-health-monitor

Feed crawling, health tracking, and report generation for LLMFeed.

Format Support

Format	Crawling	Validation	Reports
LLMFeed JSON	✅ Full	✅ Full	✅ Full
llm.txt	🚧 Fetch only	🚧 Coming	🚧 Coming

Current Status

Health monitoring with full validation is available for the LLMFeed JSON format. The crawler can fetch llm.txt files, but validation and detailed reports currently require JSON feeds.

Features

🕷️ Feed Crawling - Fetch and validate feeds from URLs
📊 Health Tracking - Store and track feed health over time
📝 Report Generation - Generate HTML, JSON, and Markdown reports
💾 Flexible Storage - In-memory or custom storage backends
⏰ Scheduling - Run health checks on a schedule
🔔 Notifications - Alert on feed health changes

Quick Start

bash

npm install @25xcodes/llmfeed-health-monitor

typescript

import { 
  crawlFeed, 
  generateReport 
} from '@25xcodes/llmfeed-health-monitor'

// Crawl an LLMFeed JSON feed (full validation support)
const result = await crawlFeed('https://example.com/.well-known/mcp.llmfeed.json')

// Access the result
console.log(result.feed.url)                    // Feed URL
console.log(result.healthCheck.reachable)       // Was feed accessible?
console.log(result.healthCheck.validation?.score)  // Validation score (0-100), undefined if not validated
console.log(result.healthCheck.responseTimeMs)  // Response time in ms
console.log(result.optedOut)                    // Whether feed opted out

// Generate a report
if (!result.optedOut) {
  const report = generateReport(result.feed, result.healthCheck)
  console.log(report.html)
}

Core Functions

`crawlFeed(url, config?)`

Crawl a single feed:

typescript

import { crawlFeed, type CrawlResult } from '@25xcodes/llmfeed-health-monitor'

const result = await crawlFeed('https://example.com/.well-known/mcp.llmfeed.json', {
  timeoutMs: 10000,          // Request timeout
  respectRobotsTxt: true,    // Check robots.txt opt-out
  userAgent: 'My-Bot/1.0'    // Custom user agent
})

// CrawlResult structure
interface CrawlResult {
  feed: {
    id: string               // Unique feed identifier
    url: string              // Feed URL
    domain: string           // Domain extracted from URL
    discoveredAt: number     // Timestamp
    optedOut: boolean        // Whether feed has opted out
  }
  healthCheck: {
    timestamp: number        // Check timestamp
    reachable: boolean       // Was feed accessible?
    httpStatus?: number      // HTTP status code
    responseTimeMs?: number  // Response time in ms
    validation?: {           // Validation result (if reachable)
      valid: boolean
      score: number          // 0-100
      errorCount: number
      warningCount: number
      signatureValid?: boolean
    }
    errors: string[]         // Any errors encountered
  }
  optedOut: boolean          // Whether feed opted out
  optOutReason?: string      // Reason for opting out
}
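
Because `httpStatus`, `responseTimeMs`, and `validation` are optional, code that consumes a result has to guard against missing values. The following is a minimal sketch that assumes only the `CrawlResult` shape shown above. The types are re-declared locally so the snippet stands alone, and `describeResult` is a hypothetical helper, not part of the package:

```typescript
// Local copies of the documented shapes so the example is self-contained.
interface HealthCheckLike {
  timestamp: number
  reachable: boolean
  httpStatus?: number
  responseTimeMs?: number
  validation?: {
    valid: boolean
    score: number
    errorCount: number
    warningCount: number
    signatureValid?: boolean
  }
  errors: string[]
}

interface CrawlResultLike {
  feed: { id: string; url: string; domain: string; discoveredAt: number; optedOut: boolean }
  healthCheck: HealthCheckLike
  optedOut: boolean
  optOutReason?: string
}

// Hypothetical helper: turn a crawl result into a one-line summary.
function describeResult(result: CrawlResultLike): string {
  if (result.optedOut) {
    return `${result.feed.url}: opted out (${result.optOutReason ?? 'no reason given'})`
  }
  const { reachable, httpStatus, responseTimeMs, validation, errors } = result.healthCheck
  if (!reachable) {
    return `${result.feed.url}: unreachable (${errors.join('; ') || 'unknown error'})`
  }
  // Optional fields fall back to readable placeholders.
  const score = validation ? `score ${validation.score}/100` : 'not validated'
  const time = responseTimeMs !== undefined ? `${responseTimeMs}ms` : 'n/a'
  return `${result.feed.url}: HTTP ${httpStatus ?? '?'}, ${time}, ${score}`
}
```

Checking `optedOut` first mirrors the Quick Start, which only generates a report for feeds that have not opted out.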

`crawlFeeds(urls, config?)`

Crawl multiple feeds concurrently:

typescript

import { crawlFeeds } from '@25xcodes/llmfeed-health-monitor'

const results = await crawlFeeds([
  'https://example.com/.well-known/mcp.llmfeed.json',
  'https://another.com/.well-known/mcp.llmfeed.json'
], {
  maxConcurrency: 5,  // Max concurrent requests
  timeoutMs: 10000
})
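
How `maxConcurrency` is enforced internally is not documented here. Purely to illustrate the general technique, a bounded worker pool can be sketched as follows. `mapWithConcurrency` is a hypothetical name, and this is not the library's implementation:

```typescript
// Run `fn` over `items` with at most `limit` calls in flight at once.
// Results are returned in the same order as `items`.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  fn: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length)
  let next = 0

  // Each worker claims the next unprocessed index until none remain.
  async function worker(): Promise<void> {
    while (next < items.length) {
      const index = next++
      results[index] = await fn(items[index], index)
    }
  }

  const workerCount = Math.max(1, Math.min(limit, items.length))
  await Promise.all(Array.from({ length: workerCount }, () => worker()))
  return results
}
```

In this sketch a rejected call rejects the whole batch. A crawler would more likely catch per-item failures and record them in the result, as the `errors` array on `healthCheck` suggests.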

`generateReport(options)`

Generate health reports:

typescript

import { generateReport, MemoryStorage } from '@25xcodes/llmfeed-health-monitor'

const storage = new MemoryStorage()

// HTML report
const htmlReport = await generateReport({
  storage,
  format: 'html',
  outputPath: './report.html'
})

// JSON report
const jsonReport = await generateReport({
  storage,
  format: 'json',
  outputPath: './report.json'
})

// Markdown report
const mdReport = await generateReport({
  storage,
  format: 'markdown',
  outputPath: './report.md'
})

Utility Functions

typescript

import {
  normalizeUrl,
  generateFeedId,
  checkMetaOptOut,
  checkFeedOptOut
} from '@25xcodes/llmfeed-health-monitor'

// Normalize URLs for consistent storage
const normalized = normalizeUrl('example.com/llm.txt')
// => 'https://example.com/llm.txt'

// Generate stable feed IDs
const feedId = generateFeedId('https://example.com/llm.txt')

// Check for opt-out signals
// (htmlContent and feedContent are the page HTML and feed body you fetched earlier)
const hasMetaOptOut = checkMetaOptOut(htmlContent)
const hasFeedOptOut = checkFeedOptOut(feedContent)
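
The exact rules `normalizeUrl` applies are not spelled out here. As a rough sketch of the behaviour the example above implies (a missing scheme defaults to `https://`), normalization could look like the following. `normalizeUrlSketch` is a hypothetical stand-in, and dropping the fragment is an added assumption:

```typescript
// Hypothetical stand-in for normalizeUrl: add a scheme when one is missing,
// then let the WHATWG URL parser canonicalize the rest (lowercase host, etc.).
function normalizeUrlSketch(input: string): string {
  const trimmed = input.trim()
  const hasScheme = /^[a-z][a-z0-9+.-]*:\/\//i.test(trimmed)
  const url = new URL(hasScheme ? trimmed : `https://${trimmed}`) // throws TypeError on malformed input
  url.hash = '' // assumption: fragments are irrelevant for feed identity
  return url.toString()
}

console.log(normalizeUrlSketch('example.com/llm.txt'))
// => 'https://example.com/llm.txt'
```

Normalizing before storage means that `Example.com/llm.txt` and `https://example.com/llm.txt` map to the same record.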

Storage Backends

MemoryStorage

Simple in-memory storage for testing and single runs:

typescript

import { MemoryStorage } from '@25xcodes/llmfeed-health-monitor'

const storage = new MemoryStorage()

// Data lives only for the lifetime of the process and is lost on exit

Custom Storage

Implement the FeedStorage interface:

typescript

import type { FeedStorage, CrawlResult } from '@25xcodes/llmfeed-health-monitor'

class CustomStorage implements FeedStorage {
  async save(result: CrawlResult): Promise<void> {
    // Save to your database
  }

  async get(feedId: string): Promise<CrawlResult | null> {
    // Retrieve from database
    throw new Error('Not implemented')
  }

  async getAll(): Promise<CrawlResult[]> {
    // Get all results
    throw new Error('Not implemented')
  }

  async getHistory(feedId: string, limit?: number): Promise<CrawlResult[]> {
    // Get historical results
    throw new Error('Not implemented')
  }
}
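
To make the four methods concrete, here is a minimal sketch of a backend that keeps a per-feed history in a `Map`. The types are reduced local stand-ins for `FeedStorage` and `CrawlResult`. The semantics chosen are assumptions, not documented behaviour: `get` returns the latest result, `getAll` returns the latest result per feed, and `getHistory` returns newest first.

```typescript
// Reduced local stand-ins for the package's types.
interface StoredResult {
  feed: { id: string; url: string }
  healthCheck: { timestamp: number; reachable: boolean }
}

interface FeedStorageLike {
  save(result: StoredResult): Promise<void>
  get(feedId: string): Promise<StoredResult | null>
  getAll(): Promise<StoredResult[]>
  getHistory(feedId: string, limit?: number): Promise<StoredResult[]>
}

class HistoryStorage implements FeedStorageLike {
  // feed id -> results in insertion order (oldest first)
  private readonly byFeed = new Map<string, StoredResult[]>()

  async save(result: StoredResult): Promise<void> {
    const history = this.byFeed.get(result.feed.id) ?? []
    history.push(result)
    this.byFeed.set(result.feed.id, history)
  }

  async get(feedId: string): Promise<StoredResult | null> {
    const history = this.byFeed.get(feedId)
    return history && history.length > 0 ? history[history.length - 1] : null
  }

  async getAll(): Promise<StoredResult[]> {
    // Assumption: "all" means the latest result for each known feed.
    const latest: StoredResult[] = []
    this.byFeed.forEach((history) => latest.push(history[history.length - 1]))
    return latest
  }

  async getHistory(feedId: string, limit?: number): Promise<StoredResult[]> {
    // Copy before reversing so the stored order is not mutated.
    const newestFirst = [...(this.byFeed.get(feedId) ?? [])].reverse()
    return limit === undefined ? newestFirst : newestFirst.slice(0, limit)
  }
}
```

A database-backed version would keep the same method signatures and replace the `Map` operations with queries.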

CLI Usage

bash

# Crawl a single feed
npx llmfeed-health crawl https://example.com/.well-known/llm.txt

# Crawl multiple feeds from a file
npx llmfeed-health crawl --file ./feeds.txt

# Generate reports
npx llmfeed-health report --format html --output ./report.html
npx llmfeed-health report --format json --output ./report.json

# Run continuous monitoring
npx llmfeed-health monitor --interval 5m --feeds ./feeds.txt
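
The format of `feeds.txt` and the accepted `--interval` values are not documented on this page. The sketch below shows one plausible way to interpret them, assuming one URL per line (with `#` comments) and intervals written as an integer followed by `s`, `m`, or `h`. Both function names are hypothetical and nothing here is confirmed CLI behaviour:

```typescript
// Assumed file format: one URL per line; blank lines and '#' comments are skipped.
function parseFeedList(text: string): string[] {
  return text
    .split(/\r?\n/)
    .map((line) => line.trim())
    .filter((line) => line.length > 0 && !line.startsWith('#'))
}

// Assumed interval format: an integer followed by s, m, or h (for example "5m").
function parseInterval(value: string): number {
  const match = /^(\d+)([smh])$/.exec(value.trim())
  if (!match) throw new Error(`Unsupported interval: ${value}`)
  const unitMs: Record<string, number> = { s: 1_000, m: 60_000, h: 3_600_000 }
  return Number(match[1]) * unitMs[match[2]] // result in milliseconds
}

console.log(parseInterval('5m')) // => 300000
```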

Health Status

Feeds are classified into three health statuses:

Status	Criteria
🟢 healthy	HTTP 200, valid structure, and a valid signature (if present)
🟡 degraded	HTTP 200, but with validation warnings or a slow response
🔴 unhealthy	HTTP error, invalid structure, or signature failure
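
The table can be read as a decision procedure over a health check. The sketch below is one interpretation, not the package's actual classifier. The 2000 ms "slow" threshold and the treatment of a missing validation result as a failure are assumptions, and `classifyHealth` is a hypothetical name:

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy'

// Subset of the documented healthCheck fields needed for classification.
interface HealthCheckLike {
  reachable: boolean
  httpStatus?: number
  responseTimeMs?: number
  validation?: { valid: boolean; warningCount: number; signatureValid?: boolean }
}

// Assumed threshold: the docs do not define what counts as a slow response.
const SLOW_RESPONSE_MS = 2000

function classifyHealth(check: HealthCheckLike): HealthStatus {
  const { reachable, httpStatus, responseTimeMs, validation } = check

  // Unhealthy: HTTP error, invalid structure, or signature failure.
  if (!reachable || httpStatus !== 200) return 'unhealthy'
  // Assumption: without a validation result the structure cannot be confirmed.
  if (!validation || !validation.valid) return 'unhealthy'
  // signatureValid is undefined when no signature is present, which is allowed.
  if (validation.signatureValid === false) return 'unhealthy'

  // Degraded: HTTP 200 but with warnings or a slow response.
  const slow = responseTimeMs !== undefined && responseTimeMs > SLOW_RESPONSE_MS
  if (validation.warningCount > 0 || slow) return 'degraded'

  return 'healthy'
}
```

The unhealthy checks run first so that a slow but broken feed is never reported as merely degraded.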

Report Formats

HTML Report

Interactive HTML reports that include:

Summary statistics
Health status per feed
Response time graphs
Error details
Historical trends

JSON Report

Machine-readable format for integration:

json

{
  "generatedAt": "2025-12-01T14:30:00Z",
  "summary": {
    "total": 10,
    "healthy": 8,
    "degraded": 1,
    "unhealthy": 1
  },
  "feeds": [
    {
      "url": "https://example.com/llm.txt",
      "status": "healthy",
      "lastCrawled": "2025-12-01T14:25:00Z",
      "responseTime": 245
    }
  ]
}
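
For consumers of this format, the structure above can be typed and the `summary` block recomputed from the `feeds` array. This is a minimal sketch based only on the sample shown. The interface and function names are hypothetical, and `responseTime` is assumed to be in milliseconds:

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy'

interface FeedEntry {
  url: string
  status: HealthStatus
  lastCrawled: string   // ISO 8601 timestamp
  responseTime: number  // assumed to be milliseconds
}

interface HealthReport {
  generatedAt: string
  summary: { total: number; healthy: number; degraded: number; unhealthy: number }
  feeds: FeedEntry[]
}

// Build a report object whose summary counts are derived from the feed entries.
function buildReport(feeds: FeedEntry[], now: Date = new Date()): HealthReport {
  const summary = { total: feeds.length, healthy: 0, degraded: 0, unhealthy: 0 }
  for (const feed of feeds) summary[feed.status] += 1
  return { generatedAt: now.toISOString(), summary, feeds }
}
```

Deriving the counts from `feeds` keeps `summary.total` equal to the sum of the three status counts by construction.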

Markdown Report

Simple text format for documentation:

markdown

# LLMFeed Health Report

Generated: 2025-12-01 14:30:00

## Summary

- Total Feeds: 10
- Healthy: 8
- Degraded: 1
- Unhealthy: 1

## Feed Details

### https://example.com/llm.txt
- Status: 🟢 Healthy
- Response Time: 245ms
- Last Crawled: 2025-12-01 14:25:00
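
The Markdown layout above maps directly onto the fields of the JSON report. As a rough illustration of that mapping (not the package's renderer), the following sketch produces the same section structure. `renderMarkdown` and the input types are hypothetical, and timestamps are passed through as already-formatted strings:

```typescript
type HealthStatus = 'healthy' | 'degraded' | 'unhealthy'

interface ReportInput {
  generatedAt: string
  summary: { total: number; healthy: number; degraded: number; unhealthy: number }
  feeds: { url: string; status: HealthStatus; lastCrawled: string; responseTime: number }[]
}

// Status labels as they appear in the sample report.
const STATUS_LABEL: Record<HealthStatus, string> = {
  healthy: '🟢 Healthy',
  degraded: '🟡 Degraded',
  unhealthy: '🔴 Unhealthy'
}

function renderMarkdown(report: ReportInput): string {
  const lines: string[] = [
    '# LLMFeed Health Report',
    '',
    `Generated: ${report.generatedAt}`,
    '',
    '## Summary',
    '',
    `- Total Feeds: ${report.summary.total}`,
    `- Healthy: ${report.summary.healthy}`,
    `- Degraded: ${report.summary.degraded}`,
    `- Unhealthy: ${report.summary.unhealthy}`,
    '',
    '## Feed Details'
  ]
  // One subsection per feed, separated by a blank line.
  for (const feed of report.feeds) {
    lines.push(
      '',
      `### ${feed.url}`,
      `- Status: ${STATUS_LABEL[feed.status]}`,
      `- Response Time: ${feed.responseTime}ms`,
      `- Last Crawled: ${feed.lastCrawled}`
    )
  }
  return lines.join('\n')
}
```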

Next Steps

Installation - Detailed installation guide
Crawling - Complete crawling guide
Reports - Report customization
API Reference - Full API documentation
