
261 points david927 | 1 comments

What are you working on? Any new ideas that you're thinking about?
sim04ful ◴[] No.43159838[source]
I'm working on a tool called Font of Web https://fontofweb.com that helps identify the fonts used on any website. It not only detects the fonts but also shows exactly where they're used (which HTML elements) and how they're styled (weight, line height, size, letter spacing).

My goal is to build a comprehensive database of font usage across the web. By collecting and analyzing this data, I believe we can uncover valuable trends, such as:

* Common font pairings
* Popular heading fonts over time
* Market share of commercial fonts
* Top font foundries based on actual usage

I originally built a version of this four years ago and saw a surprising amount of organic interest. I've now rebuilt the tool from the ground up, switching from a Puppeteer-based crawler to an invisible iframe approach. (More details in another post)

Check out the current version at https://fontofweb.com. I would appreciate any feedback.

replies(2): >>43160998 #>>43161473 #
jay-barronville ◴[] No.43160998[source]
> I've now rebuilt the tool from the ground up, switching from a Puppeteer-based crawler to an invisible iframe approach.

Where can I go to learn more about your invisible `<iframe>` approach/implementation?

replies(1): >>43161697 #
1. sim04ful ◴[] No.43161697[source]
I figured it out mostly from first principles. It's such a niche crawling method, one that happened to fit my use case perfectly, and there's a lot to say. But the main idea is that you can inject a crawling script into the HTML of the site via a proxy you control, e.g. proxy.yoursite.com?url=<SITE_YOU_WANT_TO_CRAWL>. Then, once you've got the data, you can call window.postMessage(data) to communicate with the main window.

It's somewhat similar to how browser proxies like https://proxyium.com/ and https://www.proxysite.com/ fetch the HTML on your behalf.
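
For anyone who wants to see the moving parts, here is a minimal sketch of the idea described above. It is not the actual Font of Web implementation. Every name in it (`PROXY_ORIGIN`, `buildProxyUrl`, `injectCrawler`, `isTrustedMessage`, `/crawl.js`) is hypothetical, and the proxy host is just the placeholder from the comment. It shows three pieces: building the URL the hidden iframe would load, the proxy-side step of inserting a script tag into the fetched HTML, and the origin check the main window would likely want before trusting a message.

```typescript
// Assumed proxy origin, taken from the placeholder in the comment above.
const PROXY_ORIGIN = "https://proxy.yoursite.com";

// URL the invisible iframe would point at: the proxy, with the target site
// passed as an encoded `url` query parameter.
function buildProxyUrl(target: string): string {
  return `${PROXY_ORIGIN}/?url=${encodeURIComponent(target)}`;
}

// Proxy-side step: after fetching the target's HTML, insert a script tag for
// the crawling script just before the last </body>, or append it if the
// document has no closing body tag.
function injectCrawler(html: string, scriptSrc: string): string {
  const tag = `<script src="${scriptSrc}"></script>`;
  const idx = html.toLowerCase().lastIndexOf("</body>");
  return idx === -1 ? html + tag : html.slice(0, idx) + tag + html.slice(idx);
}

// Minimal shape of the fields we read from a browser "message" event.
interface CrawlMessage {
  origin: string;
  data: unknown;
}

// Main-window step: only accept messages that come from the proxy origin,
// since the iframe content is served from there.
function isTrustedMessage(msg: CrawlMessage): boolean {
  return msg.origin === PROXY_ORIGIN;
}

// Browser-only wiring, shown as comments because it needs a DOM:
//   injected script:  window.parent.postMessage(fontData, "*");
//   main window:      window.addEventListener("message", (e) => {
//                       if (isTrustedMessage(e)) { /* store e.data */ }
//                     });
```

One detail worth noting: from inside an iframe, the call that reaches the embedding page is typically `window.parent.postMessage(...)`, and passing the main site's origin instead of `"*"` as the second argument restricts who can receive the data.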