
169 points Tammilore | 5 comments

Documind is an open-source tool that turns documents into structured data using AI.

What it does:

- Extracts specific data from PDFs based on your custom schema
- Returns clean, structured JSON that's ready to use
- Works with just a PDF link + your schema definition

Just run npm install documind to get started.

1. vunderba No.42178522
OP, you've been accused of literally ripping off somebody's more popular repository and passing it off as your own.

https://news.ycombinator.com/item?id=42178413

You may wanna get ahead of this because the evidence is fairly damning. Failing to even give credit to the original project is a pretty gross move.

replies(1): >>42178774
2. Tammilore No.42178774
Hi. This was definitely not the intention.

I made sure to copy and paste the MIT license in Zerox exactly as it was into the folder of the code that uses it. I also included it in the main license file. If there's anything I could do to make corrections, please let me know so I can change that ASAP.

replies(1): >>42210625
3. ankenyr No.42210625
Your initial commit makes it look like you wrote all the code: https://github.com/DocumindHQ/documind/commit/d91121739df038... This is because you copied and uploaded the code instead of forking. You could do a lot by restoring attribution. Your history would then match https://github.com/getomni-ai/zerox/commits/main/ up to the point where you forked and diverge from there.

People are getting upset because this is not a nice thing to do. Attribution is significant. No one would care if you replaced all the names with the new ones in a fork because they would see commits that do that.

replies(1): >>42212363
4. Tammilore No.42212363
Hi. Thank you for pointing this out. I totally understand now that forking would have kept the commit history visible and made the attribution clearer. I have since added a direct note in the repo acknowledging that it is built on the original Zerox project and also linked back to it. If there’s anything else you’d suggest, happy to hear it. Thanks again.
replies(1): >>42216000
5. ankenyr No.42216000
It would be better to attribute. You can still do this by fixing the git commit history and doing a force push. It would do a lot to make people feel better.