
dotenvx/llmstxt


llmstxt

Generate llms.txt using your sitemap.xml. An llms.txt file is a curated list of your website's pages in markdown format, perfect for training or fine-tuning language models with your content.



 

Quickstart

    $ npx -y llmstxt gen https://vercel.com/sitemap.xml
    - [Vercel Documentation](https://vercel.com/docs): Vercel's Frontend Cloud gives developers frameworks, workflows, and infrastructure to build a faster, more personalized web
    - [Accounts on Vercel](https://vercel.com/docs/accounts): Learn how to manage your Vercel account and team members.
    - [Create a Team](https://vercel.com/docs/accounts/create-a-team): Teams on Vercel allow you to collaborate with members on projects, and grant you access to additional resources. Learn how to create or join a team on Vercel.
    - [Create an Account](https://vercel.com/docs/accounts/create-an-account): Learn how to create a Hobby team on Vercel and manage your login connections through your dashboard.
    - [Manage Emails](https://vercel.com/docs/accounts/manage-emails): Learn how to manage your email addresses on Vercel.
    - [Account Plans on Vercel](https://vercel.com/docs/accounts/plans): Learn about the different plans available on Vercel.
    - [Vercel Enterprise Plan](https://vercel.com/docs/accounts/plans/enterprise): Learn about the Enterprise plan for Vercel, including features, pricing, and more.
    ...
    

 

Basics

Basic usage

  • `gen https://yoursite.com/sitemap.xml`

    Outputs to stdout.

    $ llmstxt gen https://vercel.com/sitemap.xml
    - [Vercel Documentation](https://vercel.com/docs): Vercel's Frontend Cloud gives developers frameworks, workflows, and infrastructure to build a faster, more personalized web
    - [Accounts on Vercel](https://vercel.com/docs/accounts): Learn how to manage your Vercel account and team members.
    - [Create a Team](https://vercel.com/docs/accounts/create-a-team): Teams on Vercel allow you to collaborate with members on projects, and grant you access to additional resources. Learn how to create or join a team on Vercel.
    - [Create an Account](https://vercel.com/docs/accounts/create-an-account): Learn how to create a Hobby team on Vercel and manage your login connections through your dashboard.
    - [Manage Emails](https://vercel.com/docs/accounts/manage-emails): Learn how to manage your email addresses on Vercel.
    - [Account Plans on Vercel](https://vercel.com/docs/accounts/plans): Learn about the different plans available on Vercel.
    - [Vercel Enterprise Plan](https://vercel.com/docs/accounts/plans/enterprise): Learn about the Enterprise plan for Vercel, including features, pricing, and more.
    ...
  • `gen https://yoursite.com/sitemap.xml > llms.txt`

    Write to file.

    $ llmstxt gen https://vercel.com/sitemap.xml > llms.txt
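One caveat with a plain `>` redirect: the shell truncates llms.txt before the command runs, so a failed run can leave an empty file behind. Below is a minimal sketch of a guard, assuming a POSIX shell. `generate_to` is a hypothetical helper, not part of llmstxt.

```shell
# Hypothetical helper (not part of llmstxt): run a command, capture its stdout
# in a temporary file next to the target, and replace the target only when the
# command succeeded and produced non-empty output.
generate_to() {
  target=$1
  shift
  tmp="$target.tmp.$$"
  if "$@" > "$tmp" && [ -s "$tmp" ]; then
    mv "$tmp" "$target"
  else
    rm -f "$tmp"
    return 1
  fi
}

# Intended usage (needs network access and npx):
# generate_to llms.txt npx -y llmstxt gen https://vercel.com/sitemap.xml
```

If the command fails or prints nothing, any existing llms.txt is left untouched.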

 

Advanced

Advanced options

  • `gen --exclude-path` - Exclude path(s)

    Exclude paths from generation.

    # exclude all blog posts
    $ llmstxt gen https://vercel.com/sitemap.xml --exclude-path "**/blog/**"
    
    # exclude all docs
    $ llmstxt gen https://vercel.com/sitemap.xml --exclude-path "**/docs/**"
  • `gen --include-path` - Include path(s)

    Include paths for generation.

    # include all docs only
    $ llmstxt gen https://vercel.com/sitemap.xml --include-path "**/docs/**"
    
    # include all blogs only
    $ llmstxt gen https://vercel.com/sitemap.xml -ip "**/blog/**"
  • `gen --replace-title s/pattern/replacement/` - Replace string(s) in title

    Use --replace-title to remove redundant text from your page titles. For example, dotenvx's page titles all end with | dotenvx. The command below replaces that suffix with an empty string.

    $ llmstxt gen https://vercel.com/sitemap.xml --replace-title 's/\| dotenvx//'
  • `gen --title 'Your Heading'` - set title

    Set your website's heading 1 title.

    $ llmstxt gen https://vercel.com/sitemap.xml --title 'dotenvx'
  • `gen --description 'Some description'` - set description

    Set your website's description.

    $ llmstxt gen https://vercel.com/sitemap.xml --description 'This is a description' 
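To get a feel for the path patterns and the `s/pattern/replacement/` expression before running them against a real sitemap, here is a rough local stand-in in plain shell. It is only an approximation: llmstxt's own glob and regex handling may differ, and `matches_blog` and `strip_suffix` are hypothetical names used for illustration.

```shell
# Rough stand-in for a "**/blog/**" style filter using a shell case pattern.
# Assumption: in a case pattern, * also matches "/", so */blog/* is used here
# as an approximation of **/blog/**.
matches_blog() {
  case "$1" in
    */blog/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Stand-in for --replace-title 's/\| dotenvx//' using sed -E, where \| is a
# literal pipe character. The space before the pipe is not part of the
# pattern, so it remains in the result.
strip_suffix() {
  printf '%s\n' "$1" | sed -E 's/\| dotenvx//'
}

# Example calls:
# matches_blog "https://vercel.com/blog/some-post" && echo "would be filtered"
# strip_suffix "Quickstart | dotenvx"
```

In this sed stand-in the space before the pipe is left in place. Whether llmstxt trims the resulting title is not stated here, so check the generated output if trailing whitespace matters to you.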

 

FAQ

Can you give me a real world example?

I'm using it to generate dotenvx.com/llms.txt with the following command:

$ npx -y llmstxt@latest gen https://example.com/sitemap.xml -ep "**/privacy**" -ep "**/terms**" -ep "**/blog/**" -ep "**/stats/**" -ep "**/support/**" -rt 's/\| dotenvx//' -t 'dotenvx' > llms.txt
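After a run like this, a quick sanity check on the generated file can confirm that the exclusions did what you expected. The sketch below assumes entries follow the `- [Title](url): description` shape shown in the example output. `count_entries` and `mentions_path` are hypothetical helpers, not llmstxt commands.

```shell
# Count lines that look like link entries: "- [Title](url): description".
count_entries() {
  grep -c '^- \[' "$1"
}

# Succeed if any line of the file contains the given path fragment.
# This is a plain substring match, so it also matches text in descriptions.
mentions_path() {
  grep -qF -- "$2" "$1"
}

# Example usage against a generated file:
# count_entries llms.txt
# mentions_path llms.txt '/blog/' && echo "blog pages are still present"
```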