1222 points phantomathkg | 7 comments
1. kepano ◴[] No.44065986[source]
I built Obsidian Web Clipper (open source, MIT) to replace my read-it-later app and save everything to local markdown files. Now that Obsidian Bases is available, it makes for a very nice web archival tool and reading experience. Here's a video:

https://mastodon.social/@kepano/114553164915046938

You can use Web Clipper with any app that supports Markdown, not just Obsidian.

Defuddle is the underlying HTML-to-Markdown library I made for Web Clipper, and can also be used as a CLI:

https://github.com/kepano/defuddle

https://github.com/kepano/defuddle-cli

replies(4): >>44067250 #>>44067701 #>>44067853 #>>44067859 #
2. jonahx ◴[] No.44067250[source]
This is very cool.

What I really want is this combined with some automated, local background process for LLM training or RAG (not sure what the right approach is), so that at the end of the day everything I bookmark gets saved locally, can be read in a nice format like the one in your video, and can be semantically queried, all locally:

"What was that article I read 1-3 months ago about some new type of LLM training?"

"Find that article with the really nice explanation of determinants"

etc...

Have you investigated anything like that?

replies(1): >>44067317 #
3. kepano ◴[] No.44067317[source]
Since the content is saved to Markdown you can use it with pretty much any tool that will ingest that content.
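To make "any tool that will ingest that content" concrete, here is a minimal sketch of a fully local search over a folder of clipped Markdown files. It uses plain keyword TF-IDF scoring, not real semantic search; an embedding model would have to be swapped in for that. The function names (`tokenize`, `load_markdown`, `rank_notes`) and the folder path are hypothetical and are not part of Web Clipper or Obsidian.

```python
import math
import re
from collections import Counter
from pathlib import Path


def tokenize(text: str) -> list[str]:
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def load_markdown(folder: str) -> dict[str, str]:
    """Read every .md file under `folder` (recursively) into a path -> text map."""
    return {
        str(path): path.read_text(encoding="utf-8")
        for path in Path(folder).rglob("*.md")
    }


def rank_notes(docs: dict[str, str], query: str) -> list[tuple[str, float]]:
    """Score each document against the query with a basic TF-IDF sum.

    Documents that share no term with the query are left out of the result.
    """
    counts = {name: Counter(tokenize(text)) for name, text in docs.items()}
    n_docs = len(counts)
    query_terms = set(tokenize(query))
    scores = []
    for name, tf in counts.items():
        total = sum(tf.values()) or 1
        score = 0.0
        for term in query_terms:
            # Document frequency: how many notes contain this term at all.
            df = sum(1 for c in counts.values() if term in c)
            if df == 0:
                continue
            idf = math.log((1 + n_docs) / (1 + df)) + 1.0
            score += (tf[term] / total) * idf
        if score > 0:
            scores.append((name, score))
    # Highest-scoring notes first.
    return sorted(scores, key=lambda item: item[1], reverse=True)
```

Usage would look something like `rank_notes(load_markdown("Clippings"), "determinant explanation")`, where `"Clippings"` stands in for wherever your clipped notes live. Because matching is on exact tokens, a query for "determinant" will not find a note that only says "determinants", which is exactly the gap an embedding-based approach would close.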

There's also Obsidian Web Clipper's Interpreter feature, which lets you run prompts on a web page before saving:

https://help.obsidian.md/web-clipper/interpreter

4. gausswho ◴[] No.44067701[source]
Thank you for your work on this! It's become my go-to since leaving Pocket.

I do have a bug report: even when I explicitly specify which vault to send clippings to, it sends them to the vault I opened last. This is on Android with Firefox Nightly and the extension.

5. dorian-graph ◴[] No.44067853[source]
Do you have a trick for getting the images as well, as opposed to them being links to the remotely hosted files?
replies(2): >>44070746 #>>44070820 #
6. keybits ◴[] No.44070746[source]
Obsidian recently introduced a native 'Download attachments for current file' command, which you can invoke from the command palette (Cmd/Ctrl+P).

I like this as I don't always want all the images for something I've clipped from the web. This gives me the choice.

7. DoingIsLearning ◴[] No.44070820[source]
I think linking to the remote image is just the convenient way to do it in Markdown.

Theoretically you could embed images in Markdown with the 'data:' scheme. But unless the images are very small, it will probably not be very efficient to embed the data in a text file.
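For a sense of the cost, here is a rough sketch of what inlining an image as a base64 'data:' URI looks like. The helper names (`image_to_markdown`, `base64_length`) are made up for illustration, and whether a given Markdown renderer actually displays 'data:' images varies, so treat this as an assumption to check against your own tools.

```python
import base64
import mimetypes


def image_to_markdown(data: bytes, filename: str, alt: str = "") -> str:
    """Return a Markdown image tag with the image bytes inlined as a data: URI."""
    # Guess the MIME type from the file extension; fall back to a generic type.
    mime = mimetypes.guess_type(filename)[0] or "application/octet-stream"
    encoded = base64.b64encode(data).decode("ascii")
    return f"![{alt}](data:{mime};base64,{encoded})"


def base64_length(n_bytes: int) -> int:
    """Number of base64 characters (with padding) needed for n_bytes of input."""
    return 4 * ((n_bytes + 2) // 3)
```

Base64 turns every 3 bytes into 4 characters, so a 300 kB image becomes roughly 400 kB of text on a single line of the note. That is about a third larger than the original file, and it makes the Markdown file hard to read or diff, which is why it only seems reasonable for very small images.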