
28 points ericciarla | 4 comments

Hey HN! It’s Eric from Firecrawl (https://firecrawl.dev).

I just launched llms.txt Generator, a tool that transforms any website into a clean, structured text file optimized for feeding to LLMs. You can learn more about the standard at https://llmstxt.org.

Here’s how it works under the hood:

1. We use Firecrawl, our open-source scraper, to fetch the full site, handling JavaScript-heavy pages and complex structures.
2. The markdown content is parsed, and then the title and description of each page are extracted using GPT-4o-mini.
3. Everything is combined into a lightweight llms.txt file that you can paste into any LLM.
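For anyone curious what steps 2 and 3 might look like, here is a minimal sketch. It is not Firecrawl's actual code. It assumes the pages have already been scraped to markdown, and the names `Page`, `naive_extract`, and `build_llms_txt` are hypothetical. The GPT-4o-mini call is replaced by a simple heuristic (first heading as title, first body line as description) so the example runs offline, and the output is a simplified link list loosely modeled on the layout described at llmstxt.org.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Tuple


@dataclass
class Page:
    """Hypothetical container for one scraped page."""
    url: str
    markdown: str  # page content already converted to markdown


def naive_extract(page: Page) -> Tuple[str, str]:
    """Stand-in for the LLM step (assumption, not the real extraction).

    Uses the first heading as the title and the first non-heading,
    non-empty line as the description.
    """
    title = None
    description = None
    for line in page.markdown.splitlines():
        text = line.strip()
        if not text:
            continue
        if text.startswith("#"):
            if title is None:
                title = text.lstrip("#").strip()
        elif description is None:
            description = text
    # Fall back to the URL when the page has no heading.
    return title or page.url, description or ""


def build_llms_txt(
    site_name: str,
    pages: Iterable[Page],
    extract: Callable[[Page], Tuple[str, str]] = naive_extract,
) -> str:
    """Combine per-page titles and descriptions into one text file."""
    lines = [f"# {site_name}", ""]
    for page in pages:
        title, description = extract(page)
        entry = f"- [{title}]({page.url})"
        if description:
            entry += f": {description}"
        lines.append(entry)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    demo = [Page("https://example.com/docs", "# Docs\n\nHow to use the API.\n")]
    print(build_llms_txt("Example", demo))
```

Swapping `naive_extract` for a function that calls an LLM would keep the rest of the pipeline unchanged, which is why the extractor is passed in as a parameter.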

Let me know what you think!

1. DrillShopper ◴[] No.42214869[source]
Thanks for facilitating even more widespread and frictionless copyright violations
replies(1): >>42220258 #
2. e-clinton ◴[] No.42220258[source]
Wait till you hear about copy/paste
replies(2): >>42228406 #>>42229447 #
3. ◴[] No.42228406[source]
4. DrillShopper ◴[] No.42229447[source]
Not the same thing and not at the same scale