28 points ericciarla | 8 comments

Hey HN! It’s Eric from Firecrawl (https://firecrawl.dev).

I just launched llms.txt Generator, a tool that transforms any website into a clean, structured text file optimized for feeding to LLMs. You can learn more about the standard at https://llmstxt.org.

Here’s how it works under the hood:

1. We use Firecrawl, our open-source scraper, to fetch the full site, handling JavaScript-heavy pages and complex structures.

2. The markdown content is parsed, and each page's title and description are extracted using GPT-4o-mini.

3. Everything is combined into a lightweight llms.txt file that you can paste into any LLM.
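The combining step (3) can be sketched roughly as below. This is a hypothetical helper, not Firecrawl's actual code: the per-page metadata that steps 1–2 would produce (via Firecrawl scraping and GPT-4o-mini extraction) is hard-coded, and the output follows the llms.txt layout described at llmstxt.org (an H1 title, a blockquote summary, then a list of links with descriptions):

```python
def build_llms_txt(site_title, site_summary, pages):
    """Render (title, url, description) tuples into the llms.txt markdown
    layout from llmstxt.org: H1 title, blockquote summary, link list."""
    lines = [f"# {site_title}", "", f"> {site_summary}", "", "## Docs", ""]
    for title, url, description in pages:
        lines.append(f"- [{title}]({url}): {description}")
    return "\n".join(lines) + "\n"

# Stand-in for the scraped + LLM-extracted metadata from steps 1-2.
pages = [
    ("Quickstart", "https://example.com/quickstart", "Get started in minutes."),
    ("API Reference", "https://example.com/api", "Endpoints and parameters."),
]
print(build_llms_txt("Example Docs", "Documentation for Example.", pages))
```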

Let me know what you think!

1. jondwillis ◴[] No.42214547[source]
Plain HTTP and passing an API key as a URL query parameter? Yikes!
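For context, the usual fix for this complaint is to send the key in an Authorization header over HTTPS rather than in the query string, since query parameters leak into server logs, proxies, and browser history. A minimal stdlib sketch (endpoint and key are placeholders, and no request is actually sent):

```python
from urllib.request import Request

# Build a request with the secret in a header, not the URL.
req = Request("https://api.example.com/v1/scrape",
              headers={"Authorization": "Bearer YOUR_API_KEY"})

assert "YOUR_API_KEY" not in req.full_url      # nothing secret in the URL
assert req.get_header("Authorization") == "Bearer YOUR_API_KEY"
```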
2. DrillShopper ◴[] No.42214869[source]
Thanks for facilitating even more widespread and frictionless copyright violations
replies(1): >>42220258 #
3. IndieCoder ◴[] No.42215031[source]
I like the idea, but Firecrawl and GPT-4o are quite heavy. I use https://github.com/unclecode/crawl4ai in some projects; it works very well without these dependencies and is modular, so you can use LLMs but don't have to :)
4. throwaway314155 ◴[] No.42217413[source]
For a simple solution, you can just right-click -> Save Page As… and upload the resulting `.html` file into Claude/ChatGPT as an attachment. They're both more than capable of parsing the article content from the HTML without any pre-processing.
5. e-clinton ◴[] No.42220258[source]
Wait till you hear about copy/paste
replies(2): >>42228406 #>>42229447 #
6. ◴[] No.42228406{3}[source]
7. DrillShopper ◴[] No.42229447{3}[source]
Not the same thing and not at the same scale