
268 points by lermontov | 10 comments
1. nuz No.41905984
Seems like a non-pessimistic idea of something LLMs could help us out with: mass analysis of old texts for new finds like this. If this one exists, surely there are many more just a mass analysis away.
replies(2): >>41906047 >>41906058
2. steve_adams_86 No.41906047
I accidentally got Zed to parse way more code than I intended last night, and it cost close to $2 on the Anthropic API. All I can think is how incredibly expensive it would be to feed an LLM text in the hope of making those connections. I don't think you're wrong, though; this is the territory where their ability to find patterns can feel pretty magical. It would cost many, many, many $2s, though.
replies(2): >>41906078 >>41906144
3. hyperbrainer No.41906058
Copyright is going to be a big hurdle, though.
replies(2): >>41906158 >>41906170
4. pcthrowaway No.41906078
This is a pretty good case for just using a local model. Even if it's 50% worse than Anthropic's models, or whatever the gap is now between open models and the proprietary state of the art, it's still likely good enough to categorize a story in an old newspaper as missing from an author's known bibliography.
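A minimal sketch of what that could look like, assuming the local model is served through Ollama's default HTTP endpoint; the model name, prompt, and author are illustrative, not from the thread:

    # Ask a locally served model whether a digitized newspaper story
    # might be an uncatalogued work by a given author.
    import json
    import urllib.request

    def classify(story_text: str, author: str) -> str:
        prompt = (
            f"Here is a story from an old newspaper:\n\n{story_text}\n\n"
            f"Could this plausibly be an uncatalogued work by {author}? "
            "Answer yes or no, then give one sentence of reasoning."
        )
        req = urllib.request.Request(
            "http://localhost:11434/api/generate",  # Ollama's default endpoint
            data=json.dumps(
                {"model": "llama3.1", "prompt": prompt, "stream": False}
            ).encode(),
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)["response"]

    print(classify("Once upon a time in a small fishing village...", "Anton Chekhov"))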
replies(1): >>41907039
5. diggan No.41906144
> I accidentally got Zed to parse way more code than I intended last night, and it cost close to $2 on the Anthropic API

Is that one API call, or some out-of-control process slinging hundreds of requests?

It must have been a ton of data, as their most expensive model (Opus) seems to be $15 per million input tokens. I guess if you set it to use an entire project as the input, you'd hit a million input tokens quickly.
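For scale, a quick back-of-the-envelope using the Opus input pricing quoted above (output tokens and cheaper models ignored):

    # How many input tokens does a $2 spend buy at $15 per million?
    PRICE_PER_MILLION_INPUT = 15.00  # USD, Opus input pricing quoted above
    spend = 2.00                     # USD

    tokens = spend / PRICE_PER_MILLION_INPUT * 1_000_000
    print(f"${spend:.2f} ~ {tokens:,.0f} input tokens")  # ~133,000 tokens

So a single $2 request would still fit inside a 200k-token context window, consistent with the guess downthread.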

replies(1): >>41906402
6. diggan No.41906158
Why? Old texts would be out of copyright, and even if they weren't, as long as you're not publishing the source material or anything containing it (or anything that can output it verbatim), it seems you'd be in the clear.
replies(1): >>41907229
7. ebiester No.41906170
If we stick to works from the public-domain era, there's no worry about copyright.
8. steve_adams_86 No.41906402
Come to think of it, I'm not sure how Zed performs LLM requests with the inline assistant.

I wasn't working in an enormous file, but I meant to highlight a block, accidentally highlighted the entire file, and asked it to do something that made no sense in that context. It did its best with the situation and eventually ran out of steam, haha. It's possible that multiple requests were needed, or that I was up against the 200k-token context window.

Before this, I'm fairly sure most of my requests cost fractions of a penny; my credit takes ages to decrease by any meaningful amount. Until last night, anyway. It's normally an extremely cost-effective tool for me.

9. steve_adams_86 No.41907039
Good point. I use llama3.1 for a lot of small tasks and rarely feel like I need Claude instead. It's fine. I'm even running the model a (big) step down from 70B, since I've only got 32GB of RAM. It's a solid model that probably costs me next to nothing to run.
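As a rough illustration of why the smaller model fits and the 70B doesn't (ballpark assumptions, not measurements):

    # Approximate weight memory for a ~4-bit quantized model.
    def weights_gb(params_billion: float, bytes_per_param: float = 0.5) -> float:
        return params_billion * bytes_per_param

    print(weights_gb(8))   # ~4 GB: an 8B model fits easily in 32GB of RAM
    print(weights_gb(70))  # ~35 GB: a 70B model would not, even before KV cache and overhead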
10. hyperbrainer No.41907229
You are right! I forgot about this completely.