New docusaurus llms-txt plugin #290
Closed
❌ Deploy Preview for signalwire-docs failed.
2 tasks
Updated readme
docusaurus-plugin-llms-txt
A powerful Docusaurus plugin that generates Markdown versions of your HTML pages and creates an llms.txt index file for AI/LLM consumption. Perfect for making your documentation easily accessible to Large Language Models while maintaining human-readable markdown files.
Features
Installation
Peer Dependencies
This plugin requires several peer dependencies. Install them if not already present:
Quick Start
Basic Setup
Add the plugin to your docusaurus.config.js:
Basic Configuration
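The original configuration snippet is not preserved in this copy. A minimal sketch of what a basic setup might look like follows. It assumes the plugin is registered under the name docusaurus-plugin-llms-txt, taken from this README's title (the published package name may be scoped differently), and it uses only options listed under Configuration Options. The site values are placeholders.

```javascript
// docusaurus.config.js (sketch; site values are placeholders)
const config = {
  title: 'My Site',
  url: 'https://mysite.com',
  baseUrl: '/',
  plugins: [
    [
      // Assumed package name, taken from this README's title.
      'docusaurus-plugin-llms-txt',
      {
        siteTitle: 'My Site Documentation',
        siteDescription: 'Guides and API reference for My Site',
        depth: 2,
        content: {
          includeDocs: true,
          includeBlog: false,
        },
      },
    ],
  ],
};

// In a real docusaurus.config.js, export the object,
// for example with `module.exports = config;`.
```

Every option in the second array element is optional, so an empty options object should behave the same as relying on the defaults.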
After building your site (npm run build), you'll find llms.txt in your build output directory.
Configuration Options
Main Plugin Options
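siteTitle - Type: string
siteDescription - Type: string | Default: undefined
depth - Type: 1|2|3|4|5 | Default: 1
enableDescriptions - Type: boolean | Default: true
optionalLinks - Type: OptionalLink[] | Default: []
includeOrder - Type: string[] | Default: []
runOnPostBuild - Type: boolean | Default: true
onRouteError - Type: 'ignore'|'log'|'warn'|'throw' | Default: 'warn'
logLevel - Type: 0|1|2|3 | Default: 1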
Content Options (content)
Note: All content options are optional. If you don't specify a content object, all options use their defaults.
enableMarkdownFiles - Type: boolean | Default: true
relativePaths - Type: boolean | Default: true
includeBlog - Type: boolean | Default: false
includePages - Type: boolean | Default: false
includeDocs - Type: boolean | Default: true
excludeRoutes - Type: string[] | Default: []
contentSelectors - Type: string[]
routeRules - Type: RouteRule[] | Default: []
remarkStringify - Type: object | Default: {}
remarkGfm - Type: boolean|object | Default: true
rehypeProcessTables - Type: boolean | Default: true
Detailed Configuration
Depth Configuration
The depth option controls how many leading path segments of a URL are used to group pages into categories in your document tree. It determines how your URLs are categorized.
How it works:
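The rule can be sketched as a small function. This is an illustration inferred from the route-to-category mappings listed in this section, not the plugin's actual implementation, and the helper name categoryForRoute is hypothetical.

```javascript
// Hypothetical helper: derive a category key from a route and a depth.
function categoryForRoute(route, depth) {
  // Drop empty segments produced by leading or trailing slashes.
  const segments = route.split('/').filter(Boolean);
  // Keep at most `depth` leading segments.
  return segments.slice(0, depth).join('/');
}

console.log(categoryForRoute('/api/users', 1)); // api
console.log(categoryForRoute('/api/users/create', 2)); // api/users
console.log(categoryForRoute('/api/users/create/advanced', 3)); // api/users/create
```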
depth: 1: /api/users → api category
depth: 2: /api/users/create → api/users category
depth: 3: /api/users/create/advanced → api/users/create category
Use cases:
depth: 1 - Simple sites with few top-level sections
depth: 2 - Most documentation sites with clear section/subsection structure
depth: 3+ - Complex sites with deep hierarchies or very specific organization needs
Optional Links
Add external or additional links to your llms.txt in a separate "Optional" section.
Structure:
Required fields:
title - Display text for the link
url - The URL to link to
Example:
Output:
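The original example and its output are not preserved in this copy. The sketch below shows an optionalLinks value built from the two required fields, plus a hypothetical renderer (renderOptionalSection) that formats it the way the llms.txt convention usually lays out an "Optional" section. The plugin's exact output may differ, and the URLs are placeholders.

```javascript
// Example optionalLinks value using only the two required fields.
const optionalLinks = [
  { title: 'Community Forum', url: 'https://example.com/forum' },
  { title: 'Changelog', url: 'https://example.com/changelog' },
];

// Hypothetical renderer: formats the links as an "Optional" section.
function renderOptionalSection(links) {
  if (links.length === 0) return '';
  const items = links.map((link) => `- [${link.title}](${link.url})`);
  return ['## Optional', '', ...items].join('\n');
}

console.log(renderOptionalSection(optionalLinks));
```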
Include Order
Controls the order in which categories appear in your llms.txt using glob patterns. Categories matching earlier patterns appear first.
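One way to picture this ordering is a sort keyed on the index of the first matching pattern. The sketch below is not the plugin's code. It assumes patterns are matched against route-style paths, uses a deliberately simplified glob converter that supports only `*` and `**`, and places unmatched categories last. Both helper names are hypothetical.

```javascript
// Simplified glob-to-RegExp conversion (assumption: only `*` and `**` are special).
function globToRegExp(glob) {
  const source = glob.replace(/\/\*\*$|\*\*|\*|[.+^${}()|[\]\\?]/g, (token) => {
    if (token === '/**') return '(?:/.*)?'; // trailing "/**": the directory and everything under it
    if (token === '**') return '.*';        // "**" may cross path separators
    if (token === '*') return '[^/]*';      // "*" stays within one path segment
    return `\\${token}`;                    // escape other RegExp metacharacters
  });
  return new RegExp(`^${source}$`);
}

// Hypothetical helper: categories matching earlier patterns come first.
function orderCategories(categories, includeOrder) {
  const matchers = includeOrder.map(globToRegExp);
  const rank = (category) => {
    const index = matchers.findIndex((re) => re.test(category));
    return index === -1 ? matchers.length : index; // unmatched categories go last
  };
  // Array.prototype.sort is stable, so ties keep their original order.
  return [...categories].sort((a, b) => rank(a) - rank(b));
}

const ordered = orderCategories(
  ['/blog', '/api/users', '/guides/intro', '/api/auth'],
  ['/guides/**', '/api/**'],
);
console.log(ordered);
```

A production setup would normally rely on an established glob library instead of a hand-written converter like this one.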
Pattern matching rules:
/** for matching entire directory trees
* for single-level wildcards
Error Handling
The onRouteError option controls what happens when individual pages fail to process. Valid values: 'ignore', 'log', 'warn', 'throw'.
'ignore': Skip failed routes silently
'log': Log failures but continue (no console output in normal mode)
'warn': Show warnings for failures but continue (recommended)
'throw': Stop entire build on first failure
Logging Levels
The logLevel option controls verbosity of console output. Range: 0-3 (integer).
0 (Quiet): Only errors and final success/failure
1 (Normal): Errors, warnings, and completion messages (default)
2 (Verbose): Above + processing info and statistics
3 (Debug): Above + detailed debug information
Path Configuration
The relativePaths option controls link format in both llms.txt and markdown files:
true: ./getting-started/index.md, ../api/reference.md
false: https://mysite.com/getting-started/, https://mysite.com/api/reference/
When to use relative paths:
When to use absolute paths:
Route Exclusion
Use glob patterns to exclude specific routes or route patterns:
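The original snippet is not preserved in this copy. A sketch of how such an exclusion list might be written follows, using patterns from the list of common exclusions in this section and assuming excludeRoutes sits under the content object, as in the Content Options list. The variable name exclusionOptions is only for illustration.

```javascript
// Plugin options fragment (sketch): excludeRoutes is listed under Content Options.
const exclusionOptions = {
  content: {
    excludeRoutes: [
      '**/_category_/**', // auto-generated category pages
      '/tags/**',         // blog tag pages
      '**/*.xml',         // sitemap and RSS files
      '**/internal/**',   // internal documentation
    ],
  },
};

console.log(exclusionOptions.content.excludeRoutes.length);
```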
Common exclusion patterns:
**/_category_/** - Docusaurus auto-generated category pages
/tags/** - Blog tag pages
/archive/** - Archived content
**/*.xml - Sitemap and RSS files
**/internal/** - Internal documentation
Content Selectors
CSS selectors used to extract main content from HTML pages. The plugin tries each selector in order until it finds content.
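The try-each-selector behaviour can be sketched as a first-match loop. This is an illustration only: the helper name findContent is hypothetical, the selector values are examples and not the plugin's defaults, and a small stand-in object replaces a real parsed page so the snippet runs without a DOM library.

```javascript
// Hypothetical helper: return the first element matched by any selector, in order.
function findContent(root, selectors) {
  for (const selector of selectors) {
    const element = root.querySelector(selector);
    if (element) return { selector, element };
  }
  return null; // no selector matched
}

// Stand-in for a parsed page; only the 'main' selector resolves to an element.
const fakePage = {
  elements: { main: { text: 'Main content' } },
  querySelector(selector) {
    return this.elements[selector] ?? null;
  },
};

const match = findContent(fakePage, ['.theme-doc-markdown', 'article', 'main']);
console.log(match.selector); // main
```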
Default selectors (used when contentSelectors is not specified):
How it works:
[] will use the default selectors above
Custom selectors for different themes:
Important notes:
'main' or 'article')
Debugging content extraction:
Set logLevel: 3 to see which selector is being used for each page
document.querySelector('your-selector')
Markdown Processing
remarkStringify Options
Controls how the HTML→Markdown conversion formats the output. These options are passed directly to the remark-stringify library.
📖 For complete option reference, see: remark-stringify options
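As an illustration, a remarkStringify object might look like the sketch below. The keys shown (bullet, emphasis, fences, listItemIndent) are standard remark-stringify options, the values are arbitrary examples, and the placement under content follows the Content Options list. The variable name markdownOptions is only for illustration.

```javascript
// Plugin options fragment (sketch): values are examples, not recommended defaults.
const markdownOptions = {
  content: {
    remarkStringify: {
      bullet: '-',           // marker for unordered list items
      emphasis: '_',         // marker for emphasis
      fences: true,          // always use fenced code blocks
      listItemIndent: 'one', // one space between the marker and the item content
    },
  },
};

console.log(Object.keys(markdownOptions.content.remarkStringify));
```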
remarkGfm Options
Controls table processing, strikethrough text, task lists, and other GitHub-style markdown features. These options are passed directly to the remark-gfm library.
Default values when remarkGfm: true:
When you set remarkGfm: true, the plugin automatically applies these defaults:
You can override any of these by providing an object instead of true:
📖 For complete option reference, see: remark-gfm options
rehypeProcessTables (Plugin Option)
Type: boolean | Default: true
This is a plugin-specific option that controls whether to process HTML tables for better markdown conversion. When enabled, the plugin intelligently processes HTML tables to create clean markdown tables. When disabled, tables are left as raw HTML in the markdown output.
Route Rules
Route rules provide powerful per-route customization capabilities. They allow you to override any processing option for specific routes or route patterns.
Basic Route Rule Structure
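The original structure snippet is not preserved in this copy. A sketch of a single rule built from the fields described in this section follows. It assumes routeRules sits under the content object, as in the Content Options list. The route, category name, and selector values are hypothetical.

```javascript
// Plugin options fragment (sketch): one route rule using every documented field.
const routeRuleOptions = {
  content: {
    routeRules: [
      {
        route: '/api/**',                                // required: glob pattern to match routes
        depth: 3,                                        // optional: integer 1-5
        categoryName: 'API Reference',                   // optional: any string
        contentSelectors: ['.api-content', 'article'],   // optional: CSS selector strings
        includeOrder: ['/api/auth/**', '/api/users/**'], // optional: glob pattern strings
      },
    ],
  },
};

console.log(routeRuleOptions.content.routeRules[0].route);
```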
Required fields:
route - The glob pattern to match routes against
Optional fields (all have validation constraints):
depth - Must be integer 1-5
categoryName - Any string
contentSelectors - Array of CSS selector strings
includeOrder - Array of glob pattern strings
Route Pattern Matching
Route rules use glob patterns to match routes:
Rule Priority
When multiple rules match the same route, the most specific rule wins:
Advanced Examples
API Documentation with Custom Structure:
Multi-Language Documentation:
CLI Commands
The plugin provides CLI commands for standalone operation and cleanup:
Generate Command
Generates llms.txt and markdown files using cached routes from a previous build.
Arguments:
siteDir (optional) - Path to your Docusaurus site directory. Defaults to current working directory.
Prerequisites:
You must run npm run build first to create the route cache.
Examples:
How it works:
When to use:
runOnPostBuild: false configured
Clean Command
Removes all generated markdown files and llms.txt using cached file information.
Arguments:
siteDir (optional) - Path to your Docusaurus site directory. Defaults to current working directory.
Options:
--clear-cache - Also clear the plugin cache directory
Examples:
What gets cleaned:
llms.txt index file
--clear-cache: The entire plugin cache directory
Safe operation:
When to use:
enableMarkdownFiles: true/false
Advanced Configuration Examples
Multi-Language Support
API Documentation Focus
Blog-Heavy Site
Custom Content Extraction
Understanding the Output
llms.txt Structure
The generated llms.txt follows this structure:
Markdown Files
When enableMarkdownFiles is true, individual markdown files are created for each page:
relativePaths setting
Troubleshooting
Common Issues
"No cached routes found"
Run npm run build first to generate the cache
docusaurus.config.js
Empty or minimal content
Check your contentSelectors configuration
Set logLevel: 3 for debug output
Route processing failures
Set onRouteError: 'ignore' to skip problematic routes
Set logLevel: 2 to see which routes are failing
Use excludeRoutes to filter out problematic paths
Debug Configuration
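The original debug snippet is not preserved in this copy. A sketch of options that should give the most diagnostic output follows, based on the logLevel and onRouteError descriptions in this README. The variable name debugOptions is only for illustration.

```javascript
// Plugin options fragment (sketch) for troubleshooting a build.
const debugOptions = {
  logLevel: 3,          // Debug: errors, warnings, statistics, and detailed debug information
  onRouteError: 'warn', // keep building, but show a warning for each failing route
};

console.log(debugOptions);
```

Switching onRouteError to 'throw' would instead stop the build at the first failing route, which can help isolate a single problematic page.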
Performance Optimization
For large sites:
Caching
The plugin uses intelligent caching to speed up subsequent builds:
.docusaurus/docusaurus-plugin-llms-txt/
llms-txt-clean --clear-cache to reset the cache
License
MIT