I do not want to comment on number 20. I really wish I had joined CERN 10 years earlier, but then that is my parents' mistake :)
[1] https://chromewebstore.google.com/detail/google-scholar-pdf-...
More on Chester and his co-author status: https://en.wikipedia.org/wiki/F._D._C._Willard
22. Switching on sort-by-date imposes a filter restricting results to papers published within the past year, and you cannot do anything about that.
I literally get only 1-3 real spam mails per month without any filter.
!!! And here I thought it had been broken for years, and was a sign of decay due to lack of internal support.
Btw, Anurag's last name is misspelt under the picture. It reads "Achurya" instead of "Acharya"
Edit: They fixed it
Maybe I should start using random words though? Wonder if someone will go bananas seeing their brand's name on my domain.
Actually, I am surprised _any_ spammy website these days would even honor the part after the +, and not just directly send to the real mailbox name.
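(It's a one-line transformation, which is why I'd expect more of them to do it. A toy Python sketch of Gmail-style normalization; the address is made up:)

    def strip_plus_tag(address):
        # Gmail-style: everything from "+" to "@" in the local part is ignored
        local, _, domain = address.partition("@")
        return local.split("+", 1)[0] + "@" + domain

    strip_plus_tag("jane+somebrand@example.com")  # -> "jane@example.com"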
Google spanks everyone else on robustness and responsiveness
Has he still been working on it in the 10 years since this article? His name is in the byline of the new blog post, but it's not clear from that how much he's been working on it.
---
'Two-, Three-, and Four-Atom Exchange Effects in bcc 3He' by J. H. Hetherington and F. D. C. Willard [0, 1, 2]
[0] https://xkeys.com/media/wysiwyg/smartwave/porto/category/abo...
Interestingly, it highlighted the words as it read. I haven't seen that before online. Not sure how useful it is (especially for anyone interested in this particular topic), but I thought it was a neat innovation nevertheless.
https://www.theverge.com/2023/11/27/23978591/google-drive-de...
they remembered google scholar exists
it's a great product and I don't trust google at all not to break it or mess with it
I am referring to robustness at scale and every day: Google released auto-save years before MS. MS pales in comparison in the UX.
Note: I have no vested interest in Google, not ex-googler, etc.
[1] https://support.google.com/drive/thread/245861992/drive-for-...
https://youtu.be/DZ2Bgwyx3nU?t=315
I recommend you watch the rest of the video, on the subject of open/closed and enclosure of infrastructure.
An important feature request would be a view where only peer-reviewed publications (specifically, not ArXiv and other pre-print archives) are included in the citation counts, and self-citations are also excluded.
A way to download all citation sources would also be a great nice-to-have.
In most universities here in New Zealand, articles have to be published in a journal indexed by Elsevier's Scopus. If it's not in a Scopus-indexed journal, it doesn't count for any more than a reddit comment. This gives Elsevier tremendous power. But in CS/ML/AI most academics and students turn to Google Scholar first when doing searches.
My guess for a while has been that it was back down to two of them, if that!
Having pretty wide journal access through my institution means I don’t need to reach out to sci-hub.
Still, you'd think they'd do a cutoff of e.g. 500 or 1,000 items rather than filter by the past year.
So I can't help but wonder if it's a contractual limitation insisted on by publishers? Since the publishers also don't want all their papers being spidered via Scholar? It feels kind of like a limitation a lawyer came up with.
I had a domain for a while that people got spam "from" all the time. It had nothing to do with me and there was nothing I could do about it.
(the above is a joke comparing old school library work to search engines circa 2000; I didn't actually do all those steps. I'd usually just find the most recent review article and read the papers it cited).
Of the "too big to block outright" spam senders, behind Twilio Sendgrid and Weebly, Google is currently #3. Amazon is a close #4. None of the top four currently have useful abuse reporting mechanisms... Sendgrid used to be OK, but they no longer seem to take any action. Google doesn't even accept abuse reports, which is ironic because "does not accept or act upon abuse reports" is criteria for being blocked by Google.
Most spam from Google is fake invoices and 419 scams. This is trivially filtered on my end, which makes it perplexing Google doesn't choose to do so. I can guarantee that exactly 0% of Gmail users sending out renewal invoices for "N0rton Anti-Virus" are legitimate.
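The kind of rule I mean is trivial to write. A minimal Python sketch; the substitution map and brand list are made-up examples, not anything Gmail actually does:

    # Undo common digit-for-letter substitutions, then check for brand
    # names that scammers impersonate in fake renewal invoices.
    LEET = str.maketrans("013457@$", "oleastas")
    BRANDS = ("norton", "mcafee", "geek squad")

    def looks_like_fake_invoice(subject):
        normalized = subject.lower().translate(LEET)
        return any(brand in normalized for brand in BRANDS)

    looks_like_fake_invoice("Your N0rton Anti-Virus renewal")  # -> True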
AFAICT Scholar remains because Anurag built up massive cred in the early years (he was a critically important search engineer) with Larry Page and kept his infra costs and headcount really small, while also taking advantage of search infra.
Gmail is unlikely to let spam through.
But that doesn't make its spam filter great; it's also very prone to blocking personal communication on the grounds that it must actually have been spam. The principle of gmail's spam filter is just "don't let anything through".
It would be much better to get more spam and also not have my actual communications disappear.
And a "malicious" actor can get away with pretending to be another company by spoofing the username if they know your domain works like that. I don't think this has reached spammers' repertoire yet, but I wouldn't be surprised.
Eventually I'd like to have a way of generating random email addresses that accept mail on demand, and to put everything else in quarantine automatically.
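Something like this, assuming a catch-all domain; a toy Python sketch where the in-memory set stands in for real storage and a real MTA hook:

    import secrets

    minted = set()  # aliases handed out so far

    def new_alias(domain):
        # Mint a random, hard-to-guess address for one correspondent.
        alias = secrets.token_hex(6) + "@" + domain
        minted.add(alias)
        return alias

    def route(rcpt):
        # Deliver only addresses explicitly minted; quarantine everything else.
        return "inbox" if rcpt in minted else "quarantine"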
The block can take 1-2 weeks to go away before you can use it again. There's no way to get in contact with anyone. I tried the Chrome extension email and the support forums.
It's a good reality check. There's no real support behind it and it can go away just like Google Reader did.
I think the motivations behind it are laudable, but they should not be the answer to the actual problem.
One of my biggest gripes right now is that we heavily rely on Microsoft Teams. A lot of our work laptops are still stuck on 8 GB of RAM. I find Microsoft Teams can easily suck back a full gig or more of RAM, especially when in a video call. From my understanding, Teams runs essentially like an Electron app (except packaged with an Edge browser).
I have no problem with web based apps, but man, some optimization is called for.
[0] https://en.wikipedia.org/wiki/Andre_Geim
[1] https://repository.ubn.ru.nl//bitstream/handle/2066/249681/2...
It advertises itself as "from all fields of science" -- does that include fields like economics? Sociology? Political science? What about law journals? In other words, is the coverage as broad? And if it doesn't include certain fields, where is the "science" line drawn?
And I'm curious if people find it to be as useful (or more) just in terms of UX, features, etc.
It's crazy I can boot a kernel, with an entire graphics and network stack, X and a terminal in less than 200 MB but then the Teams webapp uses a massive amount of resources and grinds everything else to a halt.
Word 365 also becomes incredibly laggy on long documents with tons of comments, whereas Google Docs is just fine. But, apparently, this is also a thing on modern hardware. I guess these days Microsoft pays little attention to detail.
It seems like Scholar has an overall upward trend, although their methodology notes make it hard to compare some periods directly:
https://trends.google.com/trends/explore?date=all&q=%2Fm%2F0...
I'm basically assuming this is the rate of growth of graduate school, and no competing products have had any real effect?
Honestly, if we compare Google to Amazon, Microsoft, Apple, and Meta, isn't Google the least evil one?
Another interesting thing is the little popup form at the end of the post asking me if my opinion of Google changed for the better after reading it. I mean, maybe a bit, but the form definitely knocked the score back down.
I think overall many companies have gotten lazy/sloppy when it comes to optimization. Game dev is even worse for this. I like how Microsoft products integrate with each other, but often the whole thing feels sloppy and unoptimized.
As for coverage, I think it focuses more on the life sciences, but I'm not positive about that.
The Google of today is far more boring and less helpful.
Now for the less great.
They are pushing the concept of "Highly Influential Citations" [1] as their default metric, which to the best of my knowledge is based on a single workshop publication that produced a classifier trained on about 500 training samples to classify citations. I am a very harsh critic of any metrics for scientific impact, but this is just utter madness. Guaranteeing that this metric is not grossly misleading is nearly impossible, and it feels like the only reason they picked it is that Etzioni (AI2 head) is the last author of the workshop paper. It should have been at best a novelty metric and certainly not the default one.
[1]: https://webflow.semanticscholar.org/faq/influential-citation...
Recently, they introduced their Semantic Reader functionality and are now pushing it as the default way to access PDFs on the website, forcing you to click a drop-down to get the plain PDF. It may or may not be a great tool, but it feels somewhat obvious that they are using shady patterns to push you in the direction they want.
Lastly, they have started using Google Analytics. Which is not great, but I can understand why they go for the industry default.
Overall, I use them nearly daily and they are the best offering out there for my area of research. Although, I at times feel tempted to grab the data and create an alternative (simpler) frontend with fewer distractions and "modern" web nonsense.
Google has done a decent job of not turning fully into an Oracle, for example.
Unironically the plot of MGS5 the Phantom Pain literally happened IRL. Skullface would be proud!
- Google Search
- YouTube (more debatable, but I think it's a marvel)
- Google Books
- ChromeBooks
- Android
- Google Calendar
- Google Earth
- Google Drive
- Google Docs
- Waze
- Android Auto
- Google Pay
- Kubernetes
- Go
- VP8 / VP9
I'd rather take all those products than leave them.
A simple way to take a step away from encouraging bibliometrics (which would be a step in the right direction) would be to list publications by date (most recent first) on author pages rather than by citation count, or at least to let users and/or authors choose the default sorting (for users, when visiting a page; for authors, as the default for their own page).
https://people.cs.rutgers.edu/~watrous/plus-signs-in-email-a...
My uni (Northampton) has access to a LOT of journals... but has a blind spot in management, specifically accountancy-focused journals; I'm doing my lit review for my MSc dissertation and the number of times I hit a dead end is frustrating.
Sci-Hub and Anna's Archive are also not interested in that segment, so it's a double whammy.
But surprisingly Archive.org was able to help me out a bit, so thanks for that.
https://en.wikipedia.org/wiki/University_of_Oxford_v._Ramesh...
Bibliometrics, in use for over 150 years now, is not a game. That's like arguing there is no value in the PageRank algorithm, and no validity to trying to find out which journals or researchers or research teams publish better content using evidence to do so.
> which benefits the big publishers
Ignoring that it helps small researchers seems short-sighted.
> A simple way to make a step ... would be to list publications by date
Is it really that hard to click "year" and have it sorted?
It's almost a certainty that when someone looks up a scholar, they are looking for the more highly cited work, so the default is probably the best use of readers' time. I absolutely know that when I look up an author, I am interested in what other work of theirs is highly regarded, more than any other factor. Once in a while I look to see what they did recently, which is exactly one click away.
The friction is tremendously higher than on-demand downloadable options: LibGen, SciHub, ZLibrary, Anna's Archive, or even sources such as ArXiv, SocArXiv, SSRN, which are far more fragmentary and limited.
I actually respect this style a lot. There is a firehose of papers coming onto Google Scholar each day. You type in some keyword and you get 500 hits. This cut that down substantially for him in a way where he never missed anything big (reading Nature and Science), kept up with what the field was doing (reading the more niche field-specific journals and keeping up with the labs who put out this niche work), and saw what was coming up in the pipeline from the conferences or what sort of research new grants were requesting. I'm not sure that Scholar would have helped much.
The only ones I would take from your list are Kubernetes and Google Earth, and Kubernetes, being more of a dev tool, wouldn't really count as far as impact and usefulness to society (Go would fit there too).
Google Books _could_ have been great, but Google didn't take care of it. Same with Google Reader.
BUT... I'm not in formal academia, and I care very little about publishing research myself (at least not from a bibliometric perspective; for me "publishing" might be writing a blog post or maybe submitting a pre-print somewhere), so I'm just not part of that whole (racket|game|whatever-you-want-to-call-it).
The problem is my researchgate account was connected to my academic account. It’s been a while since I graduated so I’ve lost access to my own publications and page.
But I used to use researchgate and requests in researchgate quite a bit.
But Google remains focused on popularity because that is optimal for advertising, where large audiences are the only ones that matter and there is this insidious competition for top ranking (no expectation that anyone would ever want to dig deep into search results). That sort of focus is not ideal for non-commercial research, IMHO.
It works great for its audience, likely better than any other product. Do you think your desire for rare finds outweighs the masses who don't share it? If you want rare, why even use a tool designed for relevance? Go dig through the stacks at your favorite old library, bookstore, cellar, wherever.
I’d suspect if you were handed random low citation count articles you’d soon find they are not gems. They’re not cited for a reason.
Heck, want low citation count items? Go find a list of journal rankings (well crap, more rankings…) in the field you’re interested in, take the lowest rated ones, and go mine those crap journals for gems. Voila! Problem solved.
And I bet you find why they’re low ranked searching for gems in slop.
That said, I personally don't have any problem with Google Scholar since you can, as you say, trivially sort by date.
My conclusion is that any such system needs to be "complete" or almost complete to be useful. By system, I mean a service or some handcrafted system where I could track anything. In all fairness, Sci-Hub partially fits the bill here and it's a big plus to society.
But the point is Google Scholar is complete in the sense that with a high probability I will find any paper I'm looking for along with reliable metadata. That's great, but the fact that they go above and beyond to prevent sharing that data is IMO backwards, against all academic research principles and this should raise questions within the research communities that rely on it.
One of the nice things about Openstreetmap is that it doesn't do that weird behind the scenes manipulation.
This makes little sense to me. The citation count gives you an idea of what others are looking at and building upon. As far as I've seen, having a low citation count isn't an uncommon phenomenon, but having a high citation count is. In terms of information gained while triaging papers to read, a low citation count gives you almost no information.
To think that as an outsider to a field you are qualified to discover 'gems' (and between the lines here is a bit of an assumption that one is more qualified than researchers in the field, who are of course trying to discover 'gems') seems misguided.
But I am educated in my chosen field and I read the same books and journals and attend the same conferences, as the people you're referring to. The biggest difference is only in incentives and imposed constraints. I have a lot more freedom since I'm not operating within the "publish or perish" paradigm.
Assuming there’s some “incentives and imposed constraints” anywhere uniform to academics that you’re magically free from that lets you turn low cited papers into gems at a higher rate than all of academia combined is the most self delusional, simplistic, aggrandizing belief I’ve heard in a long time.