
412 points thepuppet33r | 5 comments
random3 ◴[] No.42177658[source]
Fun fact about Google Scholar: it’s "free", but it’s just another soulless Google product - no clear strategy, no support, and a fragile proprietary dependency in what should be an open ecosystem. This creates inherent risks for the academic community. We need the equivalent of arXiv for Google Scholar.
sitkack ◴[] No.42178675[source]
And that is Semantic Scholar: https://www.semanticscholar.org/
1. mapmeld ◴[] No.42178841[source]
For people unfamiliar, Semantic Scholar is run by the Allen Institute and has been researching accurate AI summarization and semantic search for years. They also support author name changes.
2. crazygringo ◴[] No.42179115[source]
How does it compare with Google Scholar?

It advertises itself as "from all fields of science" -- does that include fields like economics? Sociology? Political science? What about law journals? In other words, is the coverage as broad? And if it doesn't include certain fields, where is the "science" line drawn?

And I'm curious if people find it to be as useful (or more) just in terms of UX, features, etc.

3. Onawa ◴[] No.42179753[source]
Semantic Scholar's search is pretty good, but there are also a variety of other (paid) projects that expand on its API. Look at tools like Scite and LitMaps for what's possible with the Semantic Scholar dataset.

As for coverage, I think it focuses more on the life sciences, but I'm not positive about that.

4. ninjin ◴[] No.42179807[source]
They are substantially smaller in coverage, but have higher quality in my experience. Remarkably, they are also willing to correct their data if you notify them. This is of course in stark contrast to Google Scholar, where the metadata of papers is frequently wildly inaccurate. On top of this, Semantic Scholar shares their underlying data (although you need to request an API key). Overall, they have been growing slowly and steadily over the years and I have a lot of respect for what their team is doing for researchers such as myself.

Now for the less great.

They are pushing the concept of "Highly Influential Citations" [1] as their default metric, which to the best of my knowledge is based on a single workshop publication that produced a classifier trained on about 500 training samples to classify citations. I am a very harsh critic of any metrics for scientific impact, but this is just utter madness. Guaranteeing that this metric is not grossly misleading is nearly impossible, and it feels like the only reason they picked it is that Etzioni (AI2 head) is the last author of the workshop paper. It should have been at best a novelty metric and certainly not the default one.

[1]: https://webflow.semanticscholar.org/faq/influential-citation...

Recently, they introduced their Semantic Reader functionality and are now pushing it as the default way to access PDFs on the website, forcing you to click on a drop-down to access plain PDFs. It may or may not be a great tool, but it feels somewhat obvious that they are attempting to use shady patterns to push you in the direction they want.

Lastly, they have started using Google Analytics, which is not great, but I can understand why they go for the industry default.

Overall, I use them nearly daily and they are the best offering out there for my area of research, although at times I feel tempted to grab the data and create an alternative (simpler) frontend with fewer distractions and less "modern" web nonsense.

5. crazygringo ◴[] No.42182914{3}[source]
Thank you so much!