‘Overworked, underpaid’ humans train Google’s AI

(www.theguardian.com)

1. kerblang ◴[13 Sep 25 11:53 UTC] No.45231347[source]▶

>>45231239 (OP) #

Are other AI companies doing the same thing? Would like to see more articles about this...

replies(5): >>45231432 #>>45231456 #>>45231486 #>>45231553 #>>45231592 #

2. cs702 ◴[13 Sep 25 11:58 UTC] No.45231366[source]▶

>>45231239 (OP) #

The title is biased, blaming Google for mistreating people and implying that Google's AI isn't smart, but the OP is worth reading, because it gives readers a sense of the labor and cost involved in providing AI models with human feedback, the HF in RLHF, to ensure they behave in ways acceptable to human beings, more aligned with human expectations, values, and preferences.

replies(6): >>45231394 #>>45231412 #>>45231441 #>>45231748 #>>45231773 #>>45233975 #

3. lm28469 ◴[13 Sep 25 12:05 UTC] No.45231394[source]▶

>>45231366 #

> to ensure the AI models are more aligned with human values and preferences.

And which are these universal human values and preferences? Or are we talking about Silicon Valley executives' values?

replies(1): >>45232090 #

4. giveita ◴[13 Sep 25 12:11 UTC] No.45231412[source]▶

>>45231366 #

> Sawyer is one among the thousands of AI workers contracted for Google through Japanese conglomerate Hitachi’s GlobalLogic to rate and moderate the output of Google’s AI products...

Depends how you look at it. I think a brand like Google should vet a mere one level down the supply chain.

replies(1): >>45231573 #

5. jkkola ◴[13 Sep 25 12:15 UTC] No.45231432[source]▶

>>45231347 #

There's a YouTube video titled "AI is a hype-fueled dumpster fire" [0] that mentions OpenAI's shenanigans. I haven't fact checked that but I've heard enough stories to believe it.

[0] https://youtu.be/0bF_AQvHs1M?si=rpMG2CY3TxnG3EYQ

6. rs186 ◴[13 Sep 25 12:16 UTC] No.45231441[source]▶

>>45231366 #

> to ensure the AI models are more aligned with human values and preferences.

to ensure the AI models are more aligned with Google's values and preferences.

FTFY

replies(2): >>45231582 #>>45231750 #

7. thepryz ◴[13 Sep 25 12:19 UTC] No.45231456[source]▶

>>45231347 #

Scale AI’s entire business model was using people in developing countries to label data for training models. Once you look into it, it comes across as rather predatory.

This was one of the first links I found re: Scale’s labor practices https://techcrunch.com/2025/01/22/scale-ai-is-facing-a-third...

Here’s another: https://relationaldemocracy.medium.com/an-authoritarian-work...

8. zerodaysbroker ◴[13 Sep 25 12:25 UTC] No.45231480[source]▶

>>45231239 (OP) #

The title seems kinda misleading, this is from the article (GlobalLogic is the company contracted by Google):

"AI raters at GlobalLogic are paid more than their data-labeling counterparts in Africa and South America, with wages starting at $16 an hour for generalist raters and $21 an hour for super raters, according to workers. Some are simply thankful to have a gig as the US job market sours, but others say that trying to make Google’s AI products better has come at a personal cost."

replies(1): >>45231650 #

9. lawgimenez ◴[13 Sep 25 12:26 UTC] No.45231486[source]▶

>>45231347 #

Couple of months ago I received a job invite for Kotlin AI trainers from the team at Upwork. I asked what the job is about and she says something like "for the opportunity to review & evaluate content for generative AI." And I'm from a developed country too.

10. mallowdram ◴[13 Sep 25 12:27 UTC] No.45231490[source]▶

>>45231239 (OP) #

Gemini is faked.

How this industry managed to not grasp that meaning exists entirely separate from words is altogether bizarre.

11. ◴[13 Sep 25 12:28 UTC] No.45231505[source]▶

>>45231239 (OP) #

12. benreesman ◴[13 Sep 25 12:35 UTC] No.45231553[source]▶

>>45231347 #

There's nontrivial historical precedent for this exact playbook: when a new paradigm (Lisp machines and GOFAI search, GPU backprop, softmax self-attention) is scaling fast, a lot of promises get made, a lot of national security money gets involved, and AI Summer is just balmy.

But the next paradigm breakthrough is hard to forecast, and the current paradigm's asymptote is just as hard to predict, so it's +EV to say "tomorrow" and "forever".

When the second becomes clear before the first, you turk and expert label like it's 1988 and pray that the next paradigm breakthrough is soon, you bridge the gap with expert labeling and compute until it works or you run out of money and the DoD guy stops taking your calls. AI Winter is cold.

And just like Game of Thrones, no I mean no one, not Altman, not Amodei, not Allah Most Blessed knows when the seasons in A Song of Math and Grift will change.

13. FirmwareBurner ◴[13 Sep 25 12:38 UTC] No.45231573{3}[source]▶

>>45231412 #

I had no idea Hitachi was also running software sweatshops.

14. falcor84 ◴[13 Sep 25 12:39 UTC] No.45231582{3}[source]▶

>>45231441 #

I'm a big fan of cyberpunk dystopian fiction, but I still can't quite understand what you're alluding to here. Can you give an example of a value that Google aligns the AI with that you think isn't a positive human value?

replies(3): >>45231607 #>>45231665 #>>45231984 #

15. dolphinscorpion ◴[13 Sep 25 12:39 UTC] No.45231585[source]▶

>>45231239 (OP) #

"Google" posted a job opening. They applied for and took the job, agreeing to posted pay and conditions. End of the story. It's not up to the Guardian to decide

replies(3): >>45231608 #>>45231640 #>>45232504 #

16. jhbadger ◴[13 Sep 25 12:40 UTC] No.45231592[source]▶

>>45231347 #

Karen Hao's recent book "Empire of AI" about the rise of OpenAI goes into detail how people in Africa and South America were hired (and arguably exploited) for their training efforts.

replies(1): >>45232835 #

17. iandanforth ◴[13 Sep 25 12:41 UTC] No.45231600[source]▶

>>45231239 (OP) #

"Google said in a statement: “Quality raters are employed by our suppliers and are temporarily assigned to provide external feedback on our products. Their ratings are one of many aggregated data points that help us measure how well our systems are working, but do not directly impact our algorithms or models.” GlobalLogic declined to comment for this story." (emphasis mine)

How is this not a straight up lie? For this to be true they would have to throw away labeled training data.

replies(4): >>45231651 #>>45231697 #>>45231758 #>>45232359 #

18. Ygg2 ◴[13 Sep 25 12:43 UTC] No.45231607{4}[source]▶

>>45231582 #

"Adtech is good. Adblockers are unnatural"

replies(1): >>45231703 #

19. xkbarkar ◴[13 Sep 25 12:43 UTC] No.45231608[source]▶

>>45231585 #

I agree, article is pretty low quality ragebait. Not good journalism at all.

replies(1): >>45231902 #

20. ants_everywhere ◴[13 Sep 25 12:43 UTC] No.45231616[source]▶

>>45231239 (OP) #

When they switch to aligning with algorithms instead of humans we'll get another story about how terrible it was that they removed the jobs that were terrible when they existed.

This doesn't sound as bad to me as the Facebook moderator job or even a call center job, but it does sound pretty tedious.

21. ◴[13 Sep 25 12:46 UTC] No.45231640[source]▶

>>45231585 #

22. lysace ◴[13 Sep 25 12:47 UTC] No.45231648[source]▶

>>45231239 (OP) #

with wages starting at $16 an hour for generalist raters and $21 an hour for super raters, according to workers

That’s sort of what I expect the Guardian’s UK online non-sub readers to make.

Perhaps GlobalLogic should open a subsidiary in the UK?

23. Gracana ◴[13 Sep 25 12:48 UTC] No.45231651[source]▶

>>45231600 #

They probably don’t do it at a scale large enough to do RLHF with it, but it’s still useful feedback for the people working on the projects / products.

replies(1): >>45231708 #

24. ToucanLoucan ◴[13 Sep 25 12:50 UTC] No.45231665{4}[source]▶

>>45231582 #

Their entire business model? Making search results worse to juice page impressions? Every dark pattern they use to juice subscriptions like every other SaaS company? Brand lock-in for Android? Paying Apple for prominent placement of their search engine in iOS? Anti-competitive practices in the Play store? Taking a massive cut of Play Store revenue from people actually making software?

replies(1): >>45231805 #

25. creddit ◴[13 Sep 25 12:54 UTC] No.45231697[source]▶

>>45231600 #

Because they are doing it to compute quality metrics not to implement RLHF. It’s not training data.
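A minimal sketch of that distinction, assuming a hypothetical list of (model version, rater score) pairs: the ratings are only aggregated into a report and never fed into a training step. The names `ratings` and `quality_by_version` are illustrative, not anything Google or GlobalLogic is known to use.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical rater scores on a 1-5 scale, keyed by the model version rated.
ratings = [
    ("model_v1", 3), ("model_v1", 4),
    ("model_v2", 5), ("model_v2", 4), ("model_v2", 3),
]

def quality_by_version(rows):
    """Aggregate rater scores into one mean quality metric per model version."""
    grouped = defaultdict(list)
    for version, score in rows:
        grouped[version].append(score)
    return {version: mean(scores) for version, scores in grouped.items()}

# The result is a measurement for humans to read, not a training signal.
report = quality_by_version(ratings)
print(report)
```

Whether ratings like these later influence training indirectly is exactly what the rest of the thread argues about; the sketch only shows the "metric, not gradient" use.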

replies(1): >>45233477 #

26. smokel ◴[13 Sep 25 12:55 UTC] No.45231703{5}[source]▶

>>45231607 #

Google Gemini 2.5 Pro actually has a quite nuanced reply when asked to consider this statement, including the following:

> "Massive privacy invasion: The core of modern adtech runs on tracking your behavior across different websites and apps. It collects vast amounts of personal data to build a detailed profile about your interests, habits, location, and more, often without your full understanding or consent."

replies(1): >>45232236 #

27. zozbot234 ◴[13 Sep 25 12:55 UTC] No.45231708{3}[source]▶

>>45231651 #

More recent models actually use "reinforcement learning from AI feedback", where the task of assigning a reward is essentially fed back into the model itself. Human feedback is then only used to ground the training, on selected examples (potentially even entirely artificial ones) where the AI is most highly uncertain about what feedback should be given.
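The routing described above can be sketched roughly as follows, assuming a hypothetical AI judge that outputs a probability that a response is acceptable. The names `Judged`, `p_good`, and `route_to_humans` are made up for illustration, and real pipelines are certainly more involved.

```python
from dataclasses import dataclass

@dataclass
class Judged:
    prompt: str
    p_good: float  # hypothetical AI judge's probability that the response is acceptable

def uncertainty(item: Judged) -> float:
    # 1.0 when the judge sits at 0.5 (maximally unsure), 0.0 when it is at 0 or 1.
    return 1.0 - abs(item.p_good - 0.5) * 2.0

def route_to_humans(items, budget):
    """Send the `budget` most uncertain items to human raters; keep AI labels for the rest."""
    ranked = sorted(items, key=uncertainty, reverse=True)
    return ranked[:budget], ranked[budget:]

items = [Judged("a", 0.98), Judged("b", 0.52), Judged("c", 0.10), Judged("d", 0.45)]
human, auto = route_to_humans(items, budget=2)
print([i.prompt for i in human])  # items the judge was least sure about
```

The design choice here is the usual active-learning trade-off: human attention is the expensive resource, so it is spent only where the automated judge's score is closest to a coin flip.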

28. zozbot234 ◴[13 Sep 25 13:00 UTC] No.45231748[source]▶

>>45231366 #

RLHF (and its evolution, RLAIF) is actually used for more than setting "values and preferences". It's what makes AI models engage in recognizable behavior, as opposed to simply continuing a given text. It's how the "Chat" part of "ChatGPT" can be made to work in the first place.

replies(1): >>45232111 #

29. add-sub-mul-div ◴[13 Sep 25 13:00 UTC] No.45231750{3}[source]▶

>>45231441 #

Yes, and one more tweak: the values of Google or anyone paying Google to deliver their marketing or political messaging.

30. teiferer ◴[13 Sep 25 13:01 UTC] No.45231758[source]▶

>>45231600 #

Key word: "directly"

It does so indirectly, so it's a true albeit misleading statement.

replies(1): >>45233857 #

31. throwaway106382 ◴[13 Sep 25 13:04 UTC] No.45231773[source]▶

>>45231366 #

What is a "human value" and whose preferences?

32. simonw ◴[13 Sep 25 13:06 UTC] No.45231789[source]▶

>>45231239 (OP) #

Something I'd be interested to understand is how widespread this practice is. Are all of the LLMs trained using human labor that is sometimes exposed to extreme content?

There are a whole lot of organizations training competent LLMs these days in addition to the big three (OpenAI, Google, Anthropic).

What about Mistral and Moonshot and Qwen and DeepSeek and Meta and Microsoft (Phi) and Hugging Face and Ai2 and MBZUAI? Do they all have their own (potentially outsourced) teams of human labelers?

I always look out for notes about this in model cards and papers but it's pretty rare to see any transparency about how this is done.

replies(6): >>45231815 #>>45231866 #>>45231939 #>>45232099 #>>45232271 #>>45234507 #

33. teiferer ◴[13 Sep 25 13:06 UTC] No.45231791{3}[source]▶

>>45231650 #

That argument is as old as any mistreated worker complaining about their situation and as old as any argument against workers rights in general. Anybody not liking their job could just leave right? Simple! No, the world just isn't that simple and it didn't become simpler just because it happens in an AI context that produces a tool you like.

There are lots of jobs out there that suck and people do them anyway. Because the freedom that they supposedly have is not as free as you imagine.

replies(2): >>45232062 #>>45232778 #

34. simonw ◴[13 Sep 25 13:08 UTC] No.45231805{5}[source]▶

>>45231665 #

How does all of that affect the desired outputs for their LLMs?

replies(1): >>45232193 #

35. a3w ◴[13 Sep 25 13:09 UTC] No.45231811[source]▶

>>45231239 (OP) #

AI means "Actual Indians"; did we not learn that from the initial OpenAI GPT 3.0 training? It made it to HN.

replies(1): >>45232311 #

36. wslh ◴[13 Sep 25 13:09 UTC] No.45231812[source]▶

>>45231239 (OP) #

It seems a deja vu of previous Amazon's Mechanical Turk[1] discussions[2] but with AI.

[1] https://www.mturk.com/

[2] https://tinyurl.com/4r2p39v3

37. yvdriess ◴[13 Sep 25 13:09 UTC] No.45231815[source]▶

>>45231789 #

One of the key innovations behind the DNN/CNN models was Mechanical Turk. OpenAI used a similar system extensively to improve the early GPT models. I would not be surprised if the practice continues today; NN models need a lot of quality ground truth training data.

replies(1): >>45231879 #

38. CPLX ◴[13 Sep 25 13:13 UTC] No.45231842[source]▶

>>45231792 #

Glad to learn from your post that the labor market has recently become perfectly competitive and efficient.

replies(2): >>45234508 #>>45239991 #

39. bflesch ◴[13 Sep 25 13:16 UTC] No.45231858[source]▶

>>45231792 #

The way you defend against an article citing "thousands of workers" by using a nitpicky criticism about grammar style makes me suspect that it raises a cognitive dissonance in your head that you are not ready to address yet.

replies(2): >>45231970 #>>45242771 #

40. whilenot-dev ◴[13 Sep 25 13:17 UTC] No.45231866[source]▶

>>45231789 #

So why do you think asking this question here would yield a satisfying answer, especially given how the HN community likes to dispute any vague conclusions for anything as hyped as AI training?

To counter your question, what makes you think that's not the case? Do you think Mistral/Moonshot/Qwen/etc. are all employing their own data labelers? Why would you expect this kind of transparency from for-profit bodies that are evaluated in the billions?

replies(1): >>45232081 #

41. simonw ◴[13 Sep 25 13:18 UTC] No.45231879{3}[source]▶

>>45231815 #

Right, but where are the details?

Given the number of labs that are competing these days on "open weights" and "transparency" I'd be very interested to read details of how some of them are handling the human side of their model training.

I'm puzzled at how little information I've been able to find.

replies(3): >>45232288 #>>45233086 #>>45233538 #

42. ◴[13 Sep 25 13:18 UTC] No.45231880[source]▶

>>45231792 #

43. blactuary ◴[13 Sep 25 13:21 UTC] No.45231897[source]▶

>>45231792 #

Yeah they should simply buy widgets from the abundance of other widget sellers since this is a perfectly competitive market with no transaction costs and perfectly symmetric information

44. lysace ◴[13 Sep 25 13:22 UTC] No.45231902{3}[source]▶

>>45231608 #

It is amazing how much their quality levels have fallen during the past two decades.

I used to point to their reporting as something that my nation’s newspapers should seek to emulate.

(My nation’s newspapers have since fallen even lower.)

replies(1): >>45234686 #

45. yanis_t ◴[13 Sep 25 13:24 UTC] No.45231919[source]▶

>>45231239 (OP) #

From my shallow understanding, it seems that human training is involved heavily in the post-training/fine-tuning stage, after the base model has been solidified already.

In that case, how is the notion of truthiness (what the model accepts as right or wrong) affected during this stage? That is, how much of it is shaped by human beings, versus being sealed into the base model itself, with truthiness deduced by the method as part of its world model?

46. oefrha ◴[13 Sep 25 13:24 UTC] No.45231921[source]▶

>>45231239 (OP) #

> [job] … has come at a personal cost.

Congratulations, you just described most jobs. And many backbreaking laborers make about the same or less, even in the U.S., not to mention the rest of the world.

replies(1): >>45233558 #

47. happy_dog1 ◴[13 Sep 25 13:28 UTC] No.45231939[source]▶

>>45231789 #

I've shared this once on HN before, but it's very relevant to this question and just a really great article so I'll reshare it here:

https://www.theverge.com/features/23764584/ai-artificial-int...

it explores the world of outsourced labeling work. Unfortunately hard numbers on the number of people involved are hard to come by because as the article notes:

"This tangled supply chain is deliberately hard to map. According to people in the industry, the companies buying the data demand strict confidentiality. (This is the reason Scale cited to explain why Remotasks has a different name.) Annotation reveals too much about the systems being developed, and the huge number of workers required makes leaks difficult to prevent. Annotators are warned repeatedly not to tell anyone about their jobs, not even their friends and co-workers, but corporate aliases, project code names, and, crucially, the extreme division of labor ensure they don’t have enough information about them to talk even if they wanted to. (Most workers requested pseudonyms for fear of being booted from the platforms.) Consequently, there are no granular estimates of the number of people who work in annotation, but it is a lot, and it is growing. A recent Google Research paper gave an order-of-magnitude figure of “millions” with the potential to become “billions.” "

I too would love to know more about how much human effort is going into labeling and feedback for each of these models, it would be interesting to know.

replies(2): >>45232133 #>>45234569 #

48. watwut ◴[13 Sep 25 13:35 UTC] No.45231984{4}[source]▶

>>45231582 #

Google likes it when it can show you more ads; that is not a positive human value.

It does not have to have anything to do with cyberpunk. Corporations are not people, but if they were people, they would be powerful sociopaths. Their interests and anybody else's interests are not the same.

49. mentalgear ◴[13 Sep 25 13:43 UTC] No.45232022[source]▶

>>45231239 (OP) #

In many things "AI" is just another form of exploiting the poor to make the rich even wealthier. A form of digital colonialism.

replies(1): >>45232304 #

50. bitshiftfaced ◴[13 Sep 25 13:48 UTC] No.45232062{4}[source]▶

>>45231791 #

What explains not changing jobs because you find it distressing and claiming that you're being paid below what you're worth? It seems like if that were true, then you'd be motivated to find a job that pays market rate. And if you couldn't, then you could at least find another job that pays less than market rate, like your current job, but isn't so distressing.

replies(2): >>45232162 #>>45232538 #

51. onlinehost ◴[13 Sep 25 13:49 UTC] No.45232069[source]▶

>>45231239 (OP) #

I'm a contractor for one of these companies. It pays okay ($45+/hour) if you can pass qualifications for your area of expertise but the work isn't steady and communication is non-existent. The coding qualifications I did were difficult FAANG algorithm analysis questions. The work has definitely gotten harder over the last year and often says we need to come up with Masters/PhD level work or problems that someone with 5+ years of experience in a field would have difficulty solving. I wish I had a regular job but I live in rural North Carolina and remote work is hard to come by.

replies(8): >>45232526 #>>45232921 #>>45232965 #>>45233804 #>>45234436 #>>45235687 #>>45236224 #>>45236291 #

52. bflesch ◴[13 Sep 25 13:50 UTC] No.45232074{4}[source]▶

>>45231970 #

Let's hope you are as good at real gymnastics as you are at mental gymnastics.

replies(1): >>45232087 #

53. simonw ◴[13 Sep 25 13:50 UTC] No.45232081{3}[source]▶

>>45231866 #

If you don't ask the question you'll definitely not get an answer. Given how many AI labs follow Hacker News it's not a bad place to pose this.

"what makes you think that's not the case?"

I genuinely do not have enough information to form an opinion one way or the other.

replies(1): >>45232150 #

54. alehlopeh ◴[13 Sep 25 13:52 UTC] No.45232090{3}[source]▶

>>45231394 #

Well, it doesn’t say universal so it’s clearly going to be a specific set of human values and preferences. It’s obviously referring to the preferences of the humans who are footing the bill and who stand to profit from it. The extent to which those values happen to align with those of the eventual consumer of this product could potentially determine whether the aforementioned profits ever materialize.

55. ics ◴[13 Sep 25 13:54 UTC] No.45232099[source]▶

>>45231789 #

I have been a generalist annotator for some of the others you mentioned, due to NDA will not specify which. I would venture to guess that basically all major models use some degree of human feedback if there is money coming in from somewhere.

56. cs702 ◴[13 Sep 25 13:56 UTC] No.45232111{3}[source]▶

>>45231748 #

Yes. I updated my comment to reflect as much. Thank you.

57. simianwords ◴[13 Sep 25 13:58 UTC] No.45232121[source]▶

>>45231239 (OP) #

Their work doesn’t seem that bad. This article tries really hard to portray that a simple freelance desk job is somehow literally exploitation or something.

Lots of people would do anything to get such work.

replies(1): >>45240497 #

58. simonw ◴[13 Sep 25 14:00 UTC] No.45232133{3}[source]▶

>>45231939 #

That was indeed a great article, but it is a couple of years old now. A lot of the labeling work described there relates to older forms of machine learning - moderation models, spam labelers, image segmentation etc.

Is it possible in 2025 to train a useful LLM without hiring thousands of labelers? Maybe through application of open datasets (themselves based on human labor) that did not exist two years ago?

replies(1): >>45232321 #

59. whilenot-dev ◴[13 Sep 25 14:01 UTC] No.45232150{4}[source]▶

>>45232081 #

> If you don't ask the question you'll definitely not get an answer.

Sure, but the way you're formulating the question is already casting an opinion. Besides, no one could even attempt to answer your questions without falling into the trap of true diligence... one question just asks how all (with emphasis!) LLMs are trained:

> Are all of the LLMs trained using human labor that is sometimes exposed to extreme content?

Who in the world would even be in such a position?

replies(1): >>45232291 #

60. refactor_master ◴[13 Sep 25 14:03 UTC] No.45232162{5}[source]▶

>>45232062 #

Maybe these people are trying to keep their skills and degrees honed somehow in a bad market, rather than going straight for a less-distressing-but-also-lower-paying job that does nothing for their skillset.

61. scotty79 ◴[13 Sep 25 14:07 UTC] No.45232193{6}[source]▶

>>45231805 #

You'll see once they figure it out.

replies(1): >>45232446 #

62. Ygg2 ◴[13 Sep 25 14:13 UTC] No.45232236{6}[source]▶

>>45231703 #

You don't boil the frog instantly. You first lobotomize it, by gaining its trust. Then you turn up the heat. See how YouTube went from Ads are optional to Adblockers are immoral.

63. michaelt ◴[13 Sep 25 14:19 UTC] No.45232271[source]▶

>>45231789 #

> Are all of the LLMs trained using human labor that is sometimes exposed to extreme content?

The business process outsourcing companies labelling things for AI training are often the same outsourcing companies providing moderation services to facebook and other social media companies.

I need 100k images labelled by the type of flower shown, for my flower-identifying AI, so I contract a business that does that sort of thing.

Facebook need 100k flagged images labelled by is-it-an-isis-beheading-video to keep on top of human reviews for their moderation queues. They contract with the same business.

The outsourcing company rotates workers between tasks, so nobody has to be on isis beheading videos for a whole shift.

replies(1): >>45232678 #

64. esperent ◴[13 Sep 25 14:21 UTC] No.45232288{4}[source]▶

>>45231879 #

I read this a few years ago.

Time Exclusive: OpenAI Used Kenyan Workers on Less Than $2 Per Hour to Make ChatGPT Less Toxic

https://time.com/6247678/openai-chatgpt-kenya-workers/

Beyond that, I think the reason you haven't heard more about it is that it happens in developing countries, so western media doesn't care much, and also because big AI companies work hard to distance themselves from it. They'll never be the ones directly employing these AI sweatshop workers, it's all contracted out.

65. simonw ◴[13 Sep 25 14:21 UTC] No.45232291{5}[source]▶

>>45232150 #

That question could be answered by proving the opposite: if someone has trained a single competent LLM without any human labor that was exposed to extreme content then not all LLMs were trained that way.

66. hliyan ◴[13 Sep 25 14:23 UTC] No.45232308[source]▶

>>45231239 (OP) #

At least a few of these anecdotes are worrying:

> “At first they told [me]: ‘Don’t worry about time – it’s quality versus quantity,’” she said.

> But before long, she was pulled up for taking too much time to complete her tasks. “I was trying to get things right and really understand and learn it, [but] was getting hounded by leaders [asking], ‘Why aren’t you getting this done? You’ve been working on this for an hour.’”

And:

> Dinika said he’s seen this pattern time and again where safety is only prioritized until it slows the race for market dominance. Human workers are often left to clean up the mess after a half-finished system is released. “Speed eclipses ethics,” he said. “The AI safety promise collapses the moment safety threatens profit.”

Finally:

> One work day, her task was to enter details on chemotherapy options for bladder cancer, which haunted her because she wasn’t an expert on the subject.

replies(2): >>45233645 #>>45233984 #

67. happy_dog1 ◴[13 Sep 25 14:25 UTC] No.45232321{4}[source]▶

>>45232133 #

Good question, I don't personally know. The linked article would suggest there are plenty of people working on human feedback for chatbots, but that still doesn't give us any hard numbers or any sense of how the number of people involved is changing over time. Perhaps the best datapoint I have is that revenue for SurgeAI (one of many companies that provides data labeling services to Google and OpenAI among others) has grown significantly in recent years, partly due to ScaleAI's acquisition by Meta, and is now at $1.2 billion without having raised any outside VC funding:

https://finance.yahoo.com/news/surge-ai-quietly-hit-1b-15005...

Their continued revenue growth is at least one datapoint to suggest that the number of people working in this field (or at least the amount of money spent on this field) is not decreasing.

Also see the really helpful comment above from cjbarber, there are quite a lot of companies providing these services to foundation model companies. Another datapoint to suggest the number of people working on providing labeling / feedback is definitely not decreasing and is more likely increasing. Hard numbers / increased transparency would be nice but I suspect will be hard to find.

68. yobbo ◴[13 Sep 25 14:30 UTC] No.45232359[source]▶

>>45231600 #

> For this to be true they would have to throw away labeled training data.

That's how validation works.

replies(1): >>45233162 #

69. cjbarber ◴[13 Sep 25 14:38 UTC] No.45232433[source]▶

>>45231239 (OP) #

I previously made a list on twitter of some data labeling startups that work with foundation model companies.[1] Here's the RLHF provider section:

RLHF providers:

1. Surge. $1b+ revenue bootstrapped. DataAnnotation is the worker-side (you might've seen their ads), also TaskUp and Gethybrid.

2. Scale. The most well known. Remotasks and Outlier are the worker-side

3. Invisible. Started as a kind of managed VA service.

4. Mercor. Started mostly as a way to hire remote devs I think.

5. Handshake AI. Handshake is a college hiring network. This is a spinout

6. Pareto

7. Prolific

8. Toloka

9. Turing

10. Sepal AI. The team is ex-Turing

11. Datacurve. Coding data.

12. Snorkel. Started as a software platform for data labeling. Offers some data as a service now.

13. Micro1. Also started as a way to hire remote contractor devs

[1]: https://x.com/chrisbarber/status/1965096585555272072

replies(1): >>45232997 #

70. jondwillis ◴[13 Sep 25 14:41 UTC] No.45232446{7}[source]▶

>>45232193 #

Or, if they really figure it out, you’ll only feel it.

71. skywhopper ◴[13 Sep 25 14:42 UTC] No.45232463[source]▶

>>45231239 (OP) #

This definitely explains why Google’s AI Search Results is so bad at what it purports to do.

72. anthonj ◴[13 Sep 25 14:47 UTC] No.45232504[source]▶

>>45231585 #

Not so easy. What if you get hired as a physiotherapist somewhere but on your first day you find out you will work in a brothel?

Or join a hospital as a nurse, but then you are asked to perform surgery as if you were a doctor?

There are serious issues outlined in the article.

replies(1): >>45232858 #

73. dfxm12 ◴[13 Sep 25 14:49 UTC] No.45232526[source]▶

>>45232069 #

Is something stronger than your wish to get a regular job tying you to where you currently live?

replies(2): >>45232719 #>>45232978 #

74. rs186 ◴[13 Sep 25 14:50 UTC] No.45232538{5}[source]▶

>>45232062 #

This definitely comes from someone who never had trouble looking for a job and cannot possibly understand how hard real life is for other people.

75. ◴[13 Sep 25 15:05 UTC] No.45232665[source]▶

>>45231239 (OP) #

76. s1mplicissimus ◴[13 Sep 25 15:06 UTC] No.45232678{3}[source]▶

>>45232271 #

> The outsourcing company rotates workers between tasks, so nobody has to be on isis beheading videos for a whole shift.

Is that an assumption on your side, a claim made by the business, a documented process or something entirely different?

replies(2): >>45233069 #>>45233642 #

77. SamoyedFurFluff ◴[13 Sep 25 15:10 UTC] No.45232719{3}[source]▶

>>45232526 #

I just want to note that asking this question implies an openness to one’s personal affairs that may not be appropriate in an anonymous, public setting. A person offering context and insight to a topic is not necessarily an invitation to ask for more personal context and insights.

replies(4): >>45232903 #>>45233218 #>>45234314 #>>45234514 #

78. crazygringo ◴[13 Sep 25 15:17 UTC] No.45232778{4}[source]▶

>>45231791 #

So what specific rights do you think they should have that they don't right now?

They're making more money than minimum wage. They're free to leave. It's not violating any safety regulations. There aren't any complaints of harassment.

So what precisely is the complaint here around worker's rights?

replies(1): >>45233092 #

79. luke-stanley ◴[13 Sep 25 15:18 UTC] No.45232790[source]▶

>>45231239 (OP) #

It's strange that the Guardian mentions OpenAI's "O3" model and not GPT-5. Maybe they think o3 is SOTA still, but they should at least name it correctly, in lowercase as OpenAI does.

80. maltelandwehr ◴[13 Sep 25 15:23 UTC] No.45232835{3}[source]▶

>>45231592 #

Can you explain the exploited part?

My understanding is they performed work and were paid for it at market rate. So just regular capitalism. Or was there more to it?

replies(2): >>45233439 #>>45233582 #

81. lysace ◴[13 Sep 25 15:26 UTC] No.45232858{3}[source]▶

>>45232504 #

This is not what the article is outlining.

replies(1): >>45237834 #

82. dfxm12 ◴[13 Sep 25 15:32 UTC] No.45232903{4}[source]▶

>>45232719 #

I understand it's personal, but I also recognize they went out of their way to bring it up. Some people, including me, are more willing to discuss things anonymously because it adds a layer of impersonality. This is just a discussion board. If OP doesn't answer, that's ok. I don't ever think I'm entitled to a response.

83. lelanthran ◴[13 Sep 25 15:34 UTC] No.45232921[source]▶

>>45232069 #

I wouldn't mind this work at that pay, being particularly strong in leetcode and in CS itself.

How do I join?

replies(2): >>45233508 #>>45233905 #

84. wutangson1 ◴[13 Sep 25 15:41 UTC] No.45232965[source]▶

>>45232069 #

hmm, this feels like ScaleAI

85. onlinehost ◴[13 Sep 25 15:42 UTC] No.45232978{3}[source]▶

>>45232526 #

I only started seriously looking for work again about a month ago. I'd like to stay in this area for a few reasons but I would relocate if necessary. I worked remotely from 2015 until a layoff in late 2023 and this was the first thing I came across after that. It was okay for a while and actually pretty interesting at first, but the hours aren't reliable and there doesn't seem to be much opportunity for getting promoted.

86. echelon ◴[13 Sep 25 15:44 UTC] No.45232997[source]▶

>>45232433 #

This is great!

Are there companies that focus on labeling of inputs rather than RLHF of outputs?

replies(1): >>45233242 #

87. alasarmas ◴[13 Sep 25 15:52 UTC] No.45233069{4}[source]▶

>>45232678 #

It has been documented that human image moderators exist and that some have been deeply traumatized by their work. I have zero doubts that the datasets of content and metadata created by human image moderators are being bought and sold, literally trafficking in human suffering. Can you point to a comprehensive effort by the tech majors to create a freely-licensed dataset of violent content and metadata to prevent duplication of human suffering?

replies(1): >>45233732 #

88. conradkay ◴[13 Sep 25 15:53 UTC] No.45233086{4}[source]▶

>>45231879 #

Good article from 2023, not much data though if that's what you're looking for:

https://nymag.com/intelligencer/article/ai-artificial-intell...

unwalled: https://archive.ph/Z6t35

Generally seems similar today just on a bigger Scale. And much more focus on coding

Here in the US DataAnnotation seems to be the most marketed company offering these jobs

89. hitarpetar ◴[13 Sep 25 15:54 UTC] No.45233092{5}[source]▶

>>45232778 #

I think the complaint is that they should have safety regulations. The impact of these gigs on mental health is well documented.

replies(2): >>45233968 #>>45233993 #

90. back2dafucha ◴[13 Sep 25 16:02 UTC] No.45233151[source]▶

>>45231239 (OP) #

Diminishing returns is an ugly business. And that's obviously where we are at. The end, not the beginning, of LLM "innovation".

Any technology that creates "Sisyphean" tasks is not worth anyone's time. That includes LLMs and "Big Data". The "herculean effort" that never ends is the proof in the pudding. The tech doesn't work.

It's like using machine learning for self-driving instead of having an actual working algorithm. You're bust.

91. jfengel ◴[13 Sep 25 16:03 UTC] No.45233162{3}[source]▶

>>45232359 #

Is there a reason not to use validation data in your next round of training data? Or is it more efficient to reuse validation and instead get more training data?

replies(1): >>45233504 #

92. tossandthrow ◴[13 Sep 25 16:11 UTC] No.45233218{4}[source]▶

>>45232719 #

It is a reasonable question that also emphasizes the composite cost of decisions.

Personally I would love to live in a more rural place, but until I am self sufficient enough, this is not an opportunity I am willing to take.

93. cjbarber ◴[13 Sep 25 16:14 UTC] No.45233242{3}[source]▶

>>45232997 #

Yes, there are quite a few that do that. Appen, iMerit, TELUS, etc. Also Scale AI started focused on input annotation I think for self driving.

94. jhbadger ◴[13 Sep 25 16:46 UTC] No.45233439{4}[source]▶

>>45232835 #

According to the book they kept dropping the rates paid per item forcing people to work ridiculous 12+ hours/day just to get enough to live on, even in the low cost of living places they were in. It was like something in a cyberpunk dystopia but real.

95. visarga ◴[13 Sep 25 16:51 UTC] No.45233477{3}[source]▶

>>45231697 #

Every decision they take based on evals influences the model.

replies(1): >>45234755 #

96. parineum ◴[13 Sep 25 16:53 UTC] No.45233504{4}[source]▶

>>45233162 #

You'd have to recreate your validation set every iteration if you trained your model on it, and then the results wouldn't be consistent enough to show any trends.
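To illustrate the point about consistency, here is a minimal sketch of one common way to keep the same examples in validation across training rounds: assign each example to a split by hashing a stable ID. The function name `split_for`, the ID format, and the 10% validation share are all assumptions made up for this example, not anything described in the article.

```python
import hashlib


def split_for(example_id: str, validation_percent: int = 10) -> str:
    """Assign an example to 'train' or 'validation' from its ID alone.

    Because the decision depends only on the ID, the same example lands in
    the same split in every training round, so validation scores stay
    comparable over time. (Hypothetical helper, for illustration only.)
    """
    digest = hashlib.sha256(example_id.encode("utf-8")).digest()
    # Map the first 8 bytes of the hash to a bucket in the range 0..99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    return "validation" if bucket < validation_percent else "train"


if __name__ == "__main__":
    for example_id in ["example-1", "example-2", "example-3"]:
        print(example_id, split_for(example_id))
```

New data can be added each round without disturbing the examples already held out, which is the property needed to see trends.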

replies(1): >>45240383 #

97. ics ◴[13 Sep 25 16:54 UTC] No.45233508{3}[source]▶

>>45232921 #

Look up Mercor, DataAnnotation.tech, and Outlier. You create a profile, upload a resume, and do some required tasks for each job posting they have. It may involve a combination of interviewing with an AI, doing a few trial tasks, and submitting a portfolio or GitHub profile.

replies(1): >>45234353 #

98. ics ◴[13 Sep 25 16:58 UTC] No.45233538{4}[source]▶

>>45231879 #

This is not going to be as deep/specific as you want but a starting point from one of the companies that handles this sort of work is here: https://humandata.mercor.com/mercors-approach/black-box-vs-o...

99. parineum ◴[13 Sep 25 17:00 UTC] No.45233558[source]▶

>>45231921 #

Can you believe that companies would ask people to do things they normally wouldn't in exchange for money!?

These types of articles always have an elitist view of the workers hired. That's a big source of the right (in the US) despising the left. The left don't say it directly, but when they talk about how shitty their town is and how the job they have is exploitative, there's an implicit judgment on the persons who live/work there.

100. intended ◴[13 Sep 25 17:03 UTC] No.45233582{4}[source]▶

>>45232835 #

This is a weird sentence, because it's got many assumptions baked in that pull the answers in different directions, if they have to conform with the implied definitions you are using.

Global south nations do not have the same level of judicial recourse, work safety norms, and health infrastructure as, say, America. So people doing labelling work who then go ahead and kill themselves after getting PTSD are just costs of doing business.

This can be put under many labels, to transfer the objectionable portion to some other entity or ideology - in your case "capitalism".

That doesn't mean it is actually capitalism. In this case it's exploiting gaps in global legal infrastructure.

I used to bash capitalism happily, but it's becoming a white whale and a catch-all. We don't even have capitalism anywhere, since you can get far too many definitions for that term today.

101. michaelt ◴[13 Sep 25 17:11 UTC] No.45233642{4}[source]▶

>>45232678 #

I know for certain it's whatever you care to contract for, but rotation between tasks is common.

A lot of these suppliers provide on-demand workers - if you need 40 man-hours of work on a one-off task, they can put 8 people on it and get you results within 5 hours.

On the other hand, if you want the same workers every time, it can be arranged. If you want a fixed number of workers on an agreed-upon shift pattern, they can do that too.

Even when there is a rotation, the most undesirable tasks often pay a few bucks extra per hour, so I wouldn't be surprised if there were some people who opted to stay on the worst jobs for a full shift.

replies(1): >>45236911 #

102. lostdog ◴[13 Sep 25 17:11 UTC] No.45233645[source]▶

>>45232308 #

Yeah, you can see this with Google's search results too. They're trying to improve on some internal metric, but the metric was clearly generated from ratings by people ignorant of the topics. And so the search results get worse, but appear better internally.

Great to see that they have not learned from this experience, and are repeating the mistake with Gemini.

103. michaelt ◴[13 Sep 25 17:22 UTC] No.45233732{5}[source]▶

>>45233069 #

Nobody's distributing a free dataset of child abuse, animal torture and terror beheading images, for obvious reasons.

There are some open-weights NSFW detectors [1] but even if your detector is 99.9% accurate, you still need an appeals/review mechanism. And someone's got to look at the appeals.

[1] https://github.com/yahoo/open_nsfw
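A rough back-of-the-envelope sketch of why 99.9% still leaves work for humans. The daily volume below is a made-up number, and "accuracy" is treated loosely as the share of items classified correctly; neither figure comes from any real platform.

```python
def expected_misclassifications(items_per_day: int, accuracy: float) -> int:
    """Estimate how many items per day a detector gets wrong.

    Simplifying assumption: every item is independently classified
    correctly with probability `accuracy`.
    """
    return round(items_per_day * (1.0 - accuracy))


if __name__ == "__main__":
    # Hypothetical volume: one million uploads per day.
    wrong = expected_misclassifications(1_000_000, 0.999)
    print(f"Items needing appeal or review per day: about {wrong}")
```

Under those assumptions that is on the order of a thousand wrong calls a day, each a potential appeal that a person has to look at.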

replies(2): >>45234018 #>>45239066 #

104. wdr1 ◴[13 Sep 25 17:33 UTC] No.45233804[source]▶

>>45232069 #

> It pays okay ($45+/hour)

For reference, the median hourly wage is $27/hour.

https://nationalequityatlas.org/indicators/Wages_Median

replies(2): >>45234320 #>>45234502 #

105. skybrian ◴[13 Sep 25 17:40 UTC] No.45233857{3}[source]▶

>>45231758 #

It's not part of the inner feedback loop. It's part of the outer feedback loop that they use to decide if the inner loop is working.

106. estimator7292 ◴[13 Sep 25 17:46 UTC] No.45233905{3}[source]▶

>>45232921 #

About 75% of the job postings I see on Indeed and LinkedIn are for one of these places

107. frozenseven ◴[13 Sep 25 17:53 UTC] No.45233968{6}[source]▶

>>45233092 #

>the impact of these gigs on mental health is well documented

You'll be hard-pressed to find any 'documentation' of this other than journalists trying to raise hysteria around AI. It's just ragebait. Content moderation and data sorting jobs of this kind are as old as the internet itself. If you don't like it, find another job.

108. NewEntryHN ◴[13 Sep 25 17:54 UTC] No.45233975[source]▶

>>45231366 #

Isn't that mostly the fine-tuning phase? With RLHF being the cherry on top?

109. mallowdram ◴[13 Sep 25 17:55 UTC] No.45233984[source]▶

>>45232308 #

How is this not Quest Diagnostics slipping into Theranos territory, buttressed by a hidden factory of typists?

This reminds me of the early voice-to-text startups in the 2000s that had these miraculous demos that required people in call centers to type it all up and pretend it was a machine.

110. ◴[13 Sep 25 17:56 UTC] No.45233993{6}[source]▶

>>45233092 #

111. mallowdram ◴[13 Sep 25 17:59 UTC] No.45234018{6}[source]▶

>>45233732 #

All of this is so dystopian (flowers/beheadings) it makes Philip K. Dick look like a golden-age Hollywood musical. Are the engineers so unaware of the essential primate forces underneath this that cannot be sanitized from the events? You can unearth our extinction from this value dichotomy.

112. ◴[13 Sep 25 18:31 UTC] No.45234271[source]▶

>>45231239 (OP) #

113. bapak ◴[13 Sep 25 18:38 UTC] No.45234314{4}[source]▶

>>45232719 #

This is like shouting "I am upset" on Twitter and getting more upset at people asking why.

If you don't want people to ask, don't mention it.

replies(1): >>45234610 #

114. onlinehost ◴[13 Sep 25 18:38 UTC] No.45234320{3}[source]▶

>>45233804 #

Yeah the hourly pay can be pretty good but I think what bothers most people is the unpredictable work availability. It can be great for weeks or longer, then suddenly it isn't, and not really any communication about when/if the projects will return. Overall I'm happy I found the gig but it isn't reliable full time income.

115. mattgreenrocks ◴[13 Sep 25 18:43 UTC] No.45234353{4}[source]▶

>>45233508 #

Gotta love how DataAnnotation has been blanketing Reddit with ads for "remote coding jobs," clearly trading on the ambiguity of "coding."

116. shdwbnndvpn ◴[13 Sep 25 18:57 UTC] No.45234436[source]▶

>>45232069 #

How often do you encounter difficult content? Like gore, violence, hate, etc.? I would think prompts would keep that out of responses, is that naive of me?

replies(2): >>45234913 #>>45235829 #

117. apparent ◴[13 Sep 25 19:08 UTC] No.45234502{3}[source]▶

>>45233804 #

The attractiveness of different wages really depends on what the job involves (working in the hot sun versus in an air conditioned room), whether hours are flexible, and whether you have to spend much time commuting to/from. It sounds like this is pretty good on the intangibles, so it really just comes down to whether the $/hr tradeoff makes sense.

replies(2): >>45235585 #>>45240742 #

118. kilroy123 ◴[13 Sep 25 19:08 UTC] No.45234507[source]▶

>>45231789 #

Stupid question... How can we build on these models without the humans doing all this work?

Even theoretically.

119. crazygringo ◴[13 Sep 25 19:08 UTC] No.45234508{3}[source]▶

>>45231842 #

I mean, it's competitive and efficient enough?

It doesn't need to be perfect for it to be good enough. People change jobs all the time. Yes, it involves some time, effort, and tradeoffs. But the worse your current job is, the higher the benefits of switching are.

replies(1): >>45234723 #

120. johnnyanmac ◴[13 Sep 25 19:09 UTC] No.45234514{4}[source]▶

>>45232719 #

Is it that bad? The person can not answer or keep it vague with "I have family here" or "I was raised here". They were the ones who decided to mention their state.

121. johnnyanmac ◴[13 Sep 25 19:17 UTC] No.45234569{3}[source]▶

>>45231939 #

Why is it so secretive? This gives me Severance vibes.

Is it just to dodge labor laws?

122. fakedang ◴[13 Sep 25 19:22 UTC] No.45234610{5}[source]▶

>>45234314 #

Reminds me of that South Park episode: "We want our privacy!!"

123. sjfaljf ◴[13 Sep 25 19:31 UTC] No.45234668[source]▶

>>45231239 (OP) #

How convenient: throw economy in shambles, coerce professionals into labeling labor in an effort to make humanity obsolete. Will it work?

124. jimnotgym ◴[13 Sep 25 19:33 UTC] No.45234686{4}[source]▶

>>45231902 #

Is it amazing? They are struggling to make money as much as every other news organisation, they have to keep cutting costs to do it. Then they need as many click throughs from social platforms as possible so that they can sell at least some advertising. I would say it is inevitable.

replies(1): >>45235065 #

125. CPLX ◴[13 Sep 25 19:39 UTC] No.45234723{4}[source]▶

>>45234508 #

Google is an illegal monopoly, that’s the post trial judgement of multiple courts. It has used its monopoly powers to extract vast wealth from various sectors of the economy, and it has also been sanctioned for illegally manipulating labor markets.

So there’s just no reason at all to hand wave any transactions they are involved in as the result of simple supply and demand.

replies(1): >>45234899 #

126. creddit ◴[13 Sep 25 19:44 UTC] No.45234755{4}[source]▶

>>45233477 #

/"directly"/

127. agigao ◴[13 Sep 25 19:51 UTC] No.45234807[source]▶

>>45231239 (OP) #

Isn't this misleading? To say the least...

128. crazygringo ◴[13 Sep 25 20:03 UTC] No.45234899{5}[source]▶

>>45234723 #

Google has been found to engage in monopolistic behavior in certain markets, like ads and search. It's not a monopoly overall, such as in cloud computing or office applications. And it's very much not a monopoly in hiring for these types of lower-level jobs. Not to mention these are all seemingly through contractors anyway, of which there are many, and they provide these types of services for multiple companies.

So I honestly don't know what you're talking about.

replies(1): >>45235006 #

129. ◴[13 Sep 25 20:04 UTC] No.45234913{3}[source]▶

>>45234436 #

130. CPLX ◴[13 Sep 25 20:19 UTC] No.45235006{6}[source]▶

>>45234899 #

I am talking about the widespread fiction that markets are efficient and rational, despite overwhelming evidence that they are in fact rigged in favor of participants with market power.

replies(1): >>45235884 #

131. lysace ◴[13 Sep 25 20:29 UTC] No.45235065{5}[source]▶

>>45234686 #

It is inevitable that the journalistic integrity of the Guardian goes to shit?

132. kulahan ◴[13 Sep 25 21:44 UTC] No.45235585{4}[source]▶

>>45234502 #

Weird thing to see downvoted. I’ve dropped my salary by $50k to maintain a better work-life balance once.

replies(1): >>45235622 #

133. ◴[13 Sep 25 21:50 UTC] No.45235622{5}[source]▶

>>45235585 #

134. zenmac ◴[13 Sep 25 22:03 UTC] No.45235687[source]▶

>>45232069 #

>The work has definitely gotten harder over the last year and often says we need to come up with Masters/PhD level work or problems that someone with 5+ years of experience in a field would have difficulty solving.

Many experts are holding out, and I don't blame them. Why would you want to train AI to replace your job?

replies(4): >>45235724 #>>45235847 #>>45235910 #>>45236207 #

135. brookst ◴[13 Sep 25 22:08 UTC] No.45235724{3}[source]▶

>>45235687 #

For the paycheck? For many people, ideological concerns and years-out possible downsides are less important than putting food on the table.

replies(1): >>45239617 #

136. aleph_minus_one ◴[13 Sep 25 22:28 UTC] No.45235829{3}[source]▶

>>45234436 #

> How often do encounter difficult content? Like gore, violence, hate, etc.?

Honest question: of course, everybody would prefer to work with "lovely" stuff, but I really have difficulty understanding what people find so hard about jobs where you encounter such content on a screen (the same holds for moderation jobs).

I would claim that I have seen the internet, and I guess many people of my generation have, too (just to be insanely clear: of course not the kind of stuff that is hardcore criminal in basically all jurisdictions worldwide - I don't want to get more explicit here).

I wouldn't say I am blunted, but I do think I could handle this stuff without any serious problems as part of my job. I'd thus rather compare it in terms of emotional comfort with a toilet cleaner who sometimes also has to clean very filthy toilets - which is just an ordinary job that some people in society have to do.

137. pydry ◴[13 Sep 25 22:32 UTC] No.45235847{3}[source]▶

>>45235687 #

Because while the fever dreams of capitalists do not always pan out you do always need a paycheck to make rent.

138. crazygringo ◴[13 Sep 25 22:42 UTC] No.45235884{7}[source]▶

>>45235006 #

Google doesn't have market power in the market of this particular set of jobs.

Many companies are competing for people with these skills, including for the exact same type of work, and there's zero evidence of any kind of collusion going on.

You're going to have to be specific about what rigging you think is happening. This isn't anything like when Apple, Google, Adobe, etc. were colluding not to poach each other's engineers in 2005-2009 [1].

Job markets like this aren't perfect, but they're mostly decently efficient and rational, as far as supply and demand and pricing goes, with just everyday normal friction.

[1] https://en.wikipedia.org/wiki/High-Tech_Employee_Antitrust_L...

replies(1): >>45238851 #

139. throwawaysleep ◴[13 Sep 25 22:47 UTC] No.45235910{3}[source]▶

>>45235687 #

The alternative is that someone else does it, and now you have neither money nor AI training.

It has never been a successful strategy to try and fight new technology. Never.

140. throwawaysleep ◴[13 Sep 25 22:48 UTC] No.45235918[source]▶

>>45231239 (OP) #

I work for a few of these during meetings and such. Some are so picky about getting everything done quickly that I can’t believe the data is very valuable.

141. thwarted ◴[13 Sep 25 23:39 UTC] No.45236207{3}[source]▶

>>45235687 #

Jeez, the reading comprehension in the other replies is really bad. The "Why would you…" sentence is meant to support the observation that many experts are holding out and have no need to be involved with this training, not meant to ask why people like to get paid.

142. BuckRogers ◴[13 Sep 25 23:41 UTC] No.45236224[source]▶

>>45232069 #

You may want to just find something else to do. The industry is not going to get any better going forward anyway. I’m a full-time web developer that works from home. But I’m joining the pipefitters union to do HVAC work. I need the life insurance, the health insurance, the better pay, the 401k, the 1.5 to 2X overtime pay, and the pension credits. Right now I’m only paid cash. I’m midcareer and this industry doesn’t want people like me. I’m a very reliable worker and have been for decades, but I am American and worse yet I’m white, have sex with a woman, and I expected a decent wage out of my chosen career. But it never really happened. I was always either low on pay or low on benefits. If you ever do acquire great pay and great benefits, you’re at the top of their spreadsheet to cut. And you’re never getting younger. They can always bring in someone who will work for less either from school or overseas. At my company, someone left that worked in Michigan, and they’re trying to replace him with someone from Mexico City. Already most of our coworkers are in India. It sounds like you’re in a similar situation. Other types of work can be good too. It’s nice to move around a little bit every day. Give the industry what they want. Let them have their cheap labor. They don’t want reliable employees anyway.

replies(1): >>45236700 #

143. Discordian93 ◴[13 Sep 25 23:56 UTC] No.45236291[source]▶

>>45232069 #

Same here. I'd love to get a full time coding job even if it meant a pay cut on hourly terms, but everything in my area pays much, much less and also I have a hard time even getting interviews. Guess I'll try to apply to this kind of role but full time, I think Amazon, Mistral and xAI are hiring.

144. minhaz23 ◴[14 Sep 25 01:38 UTC] No.45236700{3}[source]▶

>>45236224 #

Curious about attempting something like this in my area as well since I’m remote. Are you doing both or does one have to give way to the other eventually?

Also, I'm seeing the same trend as you at my company: roles replaced overseas while people only focus on AI taking the jobs. I think this is the more sinister thing happening quietly (by that I mean not getting much news coverage).

replies(1): >>45243098 #

145. throwaway219450 ◴[14 Sep 25 02:22 UTC] No.45236911{5}[source]▶

>>45233642 #

Having tried both strategies, unless your task is brain-dead simple and/or you have a way to cheaply and deterministically validate the labels, always pay to retain the team.

Even if you can afford only a couple of people a month and it takes 5x as long, do it. It's much easier to deal with high quality data than to firefight large quantities of slop. Your annotators will get faster and more accurate over time. And don't underestimate the time it takes to review thousands of labels. Even if you get results in 5 hours, someone has to check if they're any good. You might find that your bottleneck is the review process. Most shops can implement a QA layer for you, but not requesting it upfront is a trap for young players.

146. anthonj ◴[14 Sep 25 06:13 UTC] No.45237834{4}[source]▶

>>45232858 #

The article mentions some stories, such as the one about the woman asked to edit medical information without having any qualifications to evaluate its correctness.

Or the one about handling disturbing content with no prior warning and no counseling.

147. CPLX ◴[14 Sep 25 10:36 UTC] No.45238851{8}[source]▶

>>45235884 #

My argument is the burden of proof goes the other way at this point.

replies(1): >>45239491 #

148. alasarmas ◴[14 Sep 25 11:34 UTC] No.45239066{6}[source]▶

>>45233732 #

I mean, yes, my assumption is there exists an image / video normalization algorithm that can be followed by hashing the normalized value. There’s a CSAM scanning tool that exists that I believe uses a similar approach
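A toy sketch of the "normalize, then hash" idea, using a simple average hash over a grayscale pixel grid. This is an illustrative stand-in only: it is not the algorithm used by any real scanning tool, and the function names and the 8x8 grid size are assumptions for the example.

```python
def average_hash(pixels, size=8):
    """Return a size*size-bit perceptual hash of a grayscale image.

    `pixels` is a 2D list of brightness values (0..255). The image is
    normalized by block-averaging down to a size x size grid, then each
    cell becomes one bit: 1 if brighter than the grid's mean, else 0.
    """
    height, width = len(pixels), len(pixels[0])
    cells = []
    for gy in range(size):
        y0 = gy * height // size
        y1 = max((gy + 1) * height // size, y0 + 1)
        for gx in range(size):
            x0 = gx * width // size
            x1 = max((gx + 1) * width // size, x0 + 1)
            block = [pixels[y][x] for y in range(y0, y1) for x in range(x0, x1)]
            cells.append(sum(block) / len(block))
    mean = sum(cells) / len(cells)
    bits = 0
    for cell in cells:
        bits = (bits << 1) | (1 if cell > mean else 0)
    return bits


def hamming(a, b):
    """Count the bits that differ between two hashes."""
    return bin(a ^ b).count("1")


if __name__ == "__main__":
    # A 16x16 horizontal gradient and a uniformly brightened copy of it.
    image = [[x * 16 for x in range(16)] for _ in range(16)]
    brighter = [[value + 10 for value in row] for row in image]
    print(hamming(average_hash(image), average_hash(brighter)))
```

Near-duplicates end up with hashes that differ in few or no bits, so known material could in principle be matched by hash distance without a person viewing it again.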

149. crazygringo ◴[14 Sep 25 13:01 UTC] No.45239491{9}[source]▶

>>45238851 #

Good luck trying to convince anyone, then. That's an extreme ideological position, not a reasonable evidence-based one.

replies(1): >>45240690 #

150. zenmac ◴[14 Sep 25 13:27 UTC] No.45239617{4}[source]▶

>>45235724 #

That would be great in an idealistic world where the establishment is not building a control grid using surveillance capitalism rather than using the technology to benefit the analog world. In the current geopolitical climate, the question is: does one want to get paid to build one's own digital prison?

The only training experts should be doing is the kind that is self-hosted or done through a community of people one trusts! Currently none of the big corps qualify, and I'm not sure the structure of a big corp (with its personhood) is capable of creating anything beneficial in the long run.

Why should the big companies benefit from your expertise to centralize their control?

151. philipallstar ◴[14 Sep 25 14:29 UTC] No.45239991{3}[source]▶

>>45231842 #

> Glad to learn from your post that the labor market has recently become perfectly competitive and efficient.

Underpaid just isn't a useful term. If it's their best option for a job in the whole world then their employer is the best it could possibly be. Describing that situation as "underpaid" is just manipulative.

152. jfengel ◴[14 Sep 25 15:13 UTC] No.45240383{5}[source]▶

>>45233504 #

I'd have thought that if you kept the same validation you'd risk overfitting.

Clearly that does make it hard to measure. I'd think you'd want "equivalent" validation (like changing the SATs every year), though I imagine that's not really a meaningful concept.

153. paczki ◴[14 Sep 25 15:25 UTC] No.45240497[source]▶

>>45232121 #

To be honest, this job has changed my entire life. I don't exactly work with Google but nonetheless it's the same job being discussed. Nothing really egregious has happened to me in the months that I've been at it, other than only having 4 hours to fact check and verify a huge amount of information for one job, and it just wasn't enough time for me so I didn't get paid. But that was once out of hundreds? thousands? of tasks.

Unfortunately, I decided to take software engineering more seriously and try to make it my career and then the entire market nosedived, with no signs of recovering anytime soon. Breaking into this market has more or less been impossible for a junior, and dare I say: a junior in their mid 30s. At least within this job I do get to work with code every so often, and I get to do it from home while I'm at it which is a bonus.

It's inconsistent so I'm still learning and looking for software, but for the meantime it's been incredible.

154. CPLX ◴[14 Sep 25 15:45 UTC] No.45240690{10}[source]▶

>>45239491 #

The ideological approach is to hand wave and cite market fundamentalism when confronted with a real world allegation.

The evidence based approach is to look at the credibility and history of the accused party.

replies(1): >>45241141 #

155. danaris ◴[14 Sep 25 15:51 UTC] No.45240742{4}[source]▶

>>45234502 #

Except that lack of communication and reliability is, itself, an intangible, that onlinehost says this job is bad on.

156. crazygringo ◴[14 Sep 25 16:33 UTC] No.45241141{11}[source]▶

>>45240690 #

Emphatically no. The evidence-based approach is to look at current evidence, and you've provided zero.

Past credibility/history is irrelevant. Just because someone went to jail 20 years ago for a theft in a town, we don't throw them back in jail every time there's a new theft, and then demand they prove they are innocent. But that's what you seem to be asking for. However, that is wrong and unjust.

157. philipallstar ◴[14 Sep 25 20:02 UTC] No.45242771{3}[source]▶

>>45231858 #

What I said was nothing to do with "grammar style".

158. BuckRogers ◴[14 Sep 25 20:46 UTC] No.45243098{4}[source]▶

>>45236700 #

Not surprised to hear that it’s the trend. It’s been going on for quite some time. I used to work for a very large Canadian multinational and HR told me they only hire US/Canadian lead developers. The rest were to be from Bulgaria. This was 10 years ago.

I’m in progress on all of this, but I’m offering my services to my current employer through my LLC for 20 hours a week at 3X the hourly rate of my old salary. Take it or leave it. They are losing their leverage over me with this move. I no longer need them, they can’t put me in the streets.

So not entirely leaving the industry but will take any work at or above the market rate. High rates mean less waste of my time, as it is more limited with starting a 2nd career.

For doing both, there’s no abusive overtime like in software because it’s double time pay. Which puts you at the pay rate of what would be $240,000 a year. No one wastes your time at that rate. You actually want overtime when it’s fairly compensated like that. You can do both.

It’s sad when you work towards something your entire life, both in school and professionally. And you’ve never done anything wrong. We played by the rules of our society, and our lives were stolen from us. As Steve Bannon famously said once, these American workers deserve reparations. If the situation is ever corrected, I don’t think it would be too hard to jump back in at that point full-time.
