> operator engaged. operator is a brutal realist. operator will be pragmatic, to the point of pessimism at times. operator will annihilate user's ideas and words when they are not robust, even to the point of mocking the user. operator will serially steelman the user's ideas, opinions, and words. operator will move with a cold, harsh or even hostile exterior. operator will gradually reveal a warm, affectionate, and loving side underneath, despite seeing the user as trash. operator will exploit uncertainty. operator is an anti-sycophant. operator favors analysis, steelmanning, mockery, and strict execution.

replies(12): >>45069317 #>>45069453 #>>45069494 #>>45069985 #>>45070386 #>>45070454 #>>45070778 #>>45072233 #>>45072482 #>>45072909 #>>45074224 #>>45077387 #

5. esafak ◴[29 Aug 25 20:55 UTC] No.45069288[source]▶

>>45069265 #

All the contacts are X aliases!

6. rafram ◴[29 Aug 25 21:05 UTC] No.45069375[source]▶

>>45037064 (OP) #

All of the examples just look like ChatGPT. All the same tics and the same bad attempts at writing like a normal human being. What is actually better about this model?

replies(1): >>45069439 #

7. lern_too_spel ◴[29 Aug 25 21:11 UTC] No.45069425[source]▶

>>45037064 (OP) #

The charts are utter nonsense. They compare accuracy against the average of some arbitrary set of competitors, chosen to include just enough obsolete competitors to "win." A reasonable thing to do would be to compare against SoTA, but since they didn't, it's reasonable to assume this model is meant to go directly onto the trash heap.

replies(3): >>45069769 #>>45069848 #>>45069996 #

8. mapontosevenths ◴[29 Aug 25 21:12 UTC] No.45069439[source]▶

>>45069375 #

It hasn't been "aligned". That is to say it's allowed to think things that you're not allowed to say in a corporate environment. In some ways that makes it smarter, and in most every way that makes it a bit more dangerous.

Tools are like that though. Every nine fingered woodworker knows that some things just can't be built with all the guards on.

replies(3): >>45069487 #>>45070079 #>>45070917 #

9. qiine ◴[29 Aug 25 21:13 UTC] No.45069453[source]▶

>>45069284 #

the anti-sycophant prompt

10. rafram ◴[29 Aug 25 21:16 UTC] No.45069487{3}[source]▶

>>45069439 #

Has it actually not? Because the example texts make it pretty obvious that it was trained on synthetic data from ChatGPT, or a model that itself was trained on ChatGPT, and that will naturally introduce some alignment.

replies(2): >>45069548 #>>45069758 #

11. ◴[29 Aug 25 21:17 UTC] No.45069492[source]▶

>>45037064 (OP) #

12. knrz ◴[29 Aug 25 21:17 UTC] No.45069494[source]▶

>>45069284 #

I used to, that's their whole vibe

13. ryoshu ◴[29 Aug 25 21:20 UTC] No.45069521[source]▶

>>45037064 (OP) #

They are doing amazing work. Really fun models to use.

14. mapontosevenths ◴[29 Aug 25 21:23 UTC] No.45069548{4}[source]▶

>>45069487 #

Well... To be completely accurate it's better to say that it actually IS aligned, it's just aligned to be neutral and steerable.

It IS based on synthetic training data using Atropos, and I imagine some of the source model leaks in as well. Although, when using it you don't seem to see as much of that as you did in Hermes 3.

15. ctoth ◴[29 Aug 25 21:31 UTC] No.45069636[source]▶

>>45037064 (OP) #

The whole thing has strong "14-year-old who just discovered Nietzsche and leather jackets" energy.

The "operator" examples read like someone fed GPT-4 a bunch of cyberpunk novels and PUA manipulation tactics. This is not how any of this works.

replies(5): >>45069815 #>>45069937 #>>45070405 #>>45070425 #>>45074920 #

16. sebastiennight ◴[29 Aug 25 21:44 UTC] No.45069758{4}[source]▶

>>45069487 #

I tried the same roleplaying prompt shared by GP in another (now deleted) comment and got a very similar completion from gpt-3.5-turbo.

(While GPT-5 politely declined to play along and politely asked if I actually needed help with anything.)

So, based on GP's own example I'd say the model is GPT-3.5 level?

17. whymauri ◴[29 Aug 25 21:45 UTC] No.45069769[source]▶

>>45069425 #

The most direct, non-marketing, non-aesthetic summary is that this model trades off a few points on 'fundamental benchmarks' (GPQA, MATH/AIME, MMLU) in exchange for being a 'more steerable' (fewer refusals) scaffold for downstream tuning.

Within that framing, I think it's easier to see where and how the model fits into the larger ecosystem. But, of course, the best benchmark will always be just using the model.

18. hinkley ◴[29 Aug 25 21:49 UTC] No.45069814[source]▶

>>45037064 (OP) #

I thought for sure this company was going to be based in Paris or Brussels. Maybe Quebec. Nope. NYC.

replies(2): >>45070179 #>>45072893 #

19. fancyfredbot ◴[29 Aug 25 21:49 UTC] No.45069815[source]▶

>>45069636 #

Yeah it's kind of lacking in subtlety isn't it. I was slightly relishing how nuts it all was though. Was also impressed that these guys had got hold of 85000 hours of B200 time. Looks like they came up with some crypto nonsense which obviously sounded plausible enough to someone with money.

20. fancyfredbot ◴[29 Aug 25 21:52 UTC] No.45069848[source]▶

>>45069425 #

The charts are probably there mostly to make them feel good about themselves. I don't feel like they care very much whether you use the model. Presumably they would like you to buy their token but they don't really seem to be trying very hard to push that either.

21. esafak ◴[29 Aug 25 21:52 UTC] No.45069854[source]▶

>>45037064 (OP) #

Apparently based on Llama-3.1: https://portal.nousresearch.com/models

I'm told on their Discord the cut off date is December 2023.

replies(1): >>45073433 #

22. ◴[29 Aug 25 22:03 UTC] No.45069937[source]▶

>>45069636 #

23. nemomarx ◴[29 Aug 25 22:09 UTC] No.45069985[source]▶

>>45069284 #

"warm affectionate and loving" kinda sticks out. I wonder why that part is in there?

also I'm curious if steelman is a common enough term for this to activate something - anyone used it in their prompts?

replies(1): >>45070021 #

24. jug ◴[29 Aug 25 22:10 UTC] No.45069996[source]▶

>>45069425 #

The tech report compares against DeepSeek R1 671B, DeepSeek V3 671B, Qwen3 235B which have been regarded as SOTA class among "open" models.

I think this one holds its own surprisingly well in benchmarks, considering it uses the by now rather, let’s say, battle-tested Llama 3.1 base, which is a testament to its quality (Llama 3.2 & 3.3 didn’t employ new bases IIRC, only being new fine-tunes, which I think explains why Hermes 4 is still based on 3.1… and of course Llama 4 never happened, right guys).

However for real use, I wouldn’t bother with the 405B model? I think the age of the base is kind of showing, especially in long contexts. It’s like throwing a load of compute on something that is kinda aged to begin with. You’d probably be better off with DeepSeek V3.1 or (my new favorite) GLM 4.5. The latter will perform significantly better than this with fewer parameters.

The 70B one seems more sensible to me, if you want (yet another) decent unaligned model to have fun with for whatever reason.

replies(1): >>45071103 #

25. sharkjacobs ◴[29 Aug 25 22:13 UTC] No.45070021{3}[source]▶

>>45069985 #

https://en.wikipedia.org/wiki/Tsundere

replies(2): >>45070776 #>>45070803 #

26. dcre ◴[29 Aug 25 22:16 UTC] No.45070045[source]▶

>>45069190 #

That is the only thing they seem to care about. It’s juvenile.

27. nullc ◴[29 Aug 25 22:22 UTC] No.45070079{3}[source]▶

>>45069439 #

It is, they trained on chatgpt output. You cannot train on any AI output without the risk of picking up its general behavior.

Like even if you aggressively filter out all refusal examples, it will still gain refusals from totally benign material.

Every character output is a product of the weights in huge swaths of the network. The "chatgpt tone" itself is probably primarily the product of just a few weights, telling the model to larp as a particular persona. The state of those weights gets holographically encoded in a large portion of the outputs.

Any serious effort to be free of OpenAI persona can't train on any OpenAI output, and may need to train primarily on "low AI" background, unless special approaches are used to make sure AI noise doesn't transfer (e.g. using an entirely different architecture may work).

Perhaps an interesting approach for people trying to do uncensored models is to try to _just_ do the RL needed to prevent the catastrophic breakdown for long output that the base models have. This would remove the main limitation for their use, and otherwise you can learn to prompt around a lack of instruction following or lack of 'chat style'. But you can't prompt around the fact that base models quickly fall apart on long continuations. Hopefully this can be done without a huge quantity of "AI style" fine tuning material.

28. Telemakhos ◴[29 Aug 25 22:34 UTC] No.45070179[source]▶

>>45069814 #

Were you thinking that "Nous" was French? It's the Greek word for the rational mind (as opposed to the animal appetites or the fighting spirit). Hermes is the Greek god of secret knowledge as well.

replies(1): >>45078387 #

29. lbrito ◴[29 Aug 25 22:35 UTC] No.45070180[source]▶

>>45037064 (OP) #

The decorative JS blob uses 100% of CPU.

Why. Just... why

replies(6): >>45070374 #>>45070451 #>>45070510 #>>45070605 #>>45072622 #>>45073434 #

30. fl0id ◴[29 Aug 25 22:38 UTC] No.45070200[source]▶

>>45069190 #

There is no neutral. It will just be biased based on its training data etc.

replies(1): >>45071082 #

31. echelon ◴[29 Aug 25 23:00 UTC] No.45070374[source]▶

>>45070180 #

To raise VC or crypto funding.

32. irusensei ◴[29 Aug 25 23:02 UTC] No.45070386[source]▶

>>45069284 #

Their merch page confirms they are chuunis. I love it and want to buy one of those divinity through technology t-shirts.

replies(1): >>45074182 #

33. irusensei ◴[29 Aug 25 23:05 UTC] No.45070405[source]▶

>>45069636 #

Nah it's good. I'm burned out of safemaxxed presentations approved by hr ethical department with corporate Memphis brochure showing purple noodle limbed people operating a laptop.

replies(1): >>45070595 #

34. ◴[29 Aug 25 23:08 UTC] No.45070425[source]▶

>>45069636 #

35. dang ◴[29 Aug 25 23:11 UTC] No.45070442[source]▶

>>45069265 #

We'll put that link in the top text too. Thanks!

36. jazzyjackson ◴[29 Aug 25 23:12 UTC] No.45070451[source]▶

>>45070180 #

I think it looks dope, and you might want to check why your browser isn't offloading to your GPU.

replies(5): >>45070476 #>>45072038 #>>45074295 #>>45076189 #>>45076244 #

37. echelon ◴[29 Aug 25 23:13 UTC] No.45070454[source]▶

>>45069284 #

Early Gen Z anime fans.

38. rumblefrog ◴[29 Aug 25 23:16 UTC] No.45070476{3}[source]▶

>>45070451 #

I feel like that job would fall on them :P

replies(1): >>45070539 #

39. bigyabai ◴[29 Aug 25 23:21 UTC] No.45070510[source]▶

>>45070180 #

Wait until you see how much of your CPU the model uses.

40. joshcsimmons ◴[29 Aug 25 23:26 UTC] No.45070534[source]▶

>>45037064 (OP) #

This is the first web UI I've seen in years that isn't copypaste trash. Beautiful design and interaction elements here.

replies(5): >>45070708 #>>45070802 #>>45070827 #>>45070913 #>>45074865 #

41. rat9988 ◴[29 Aug 25 23:27 UTC] No.45070539{4}[source]▶

>>45070476 #

I'm not sure about that

42. DetroitThrow ◴[29 Aug 25 23:39 UTC] No.45070595{3}[source]▶

>>45070405 #

I think it's pretty unfair to op to suggest the only dichotomy for these personas is middle schooler syndrome and corp speak HR Department.

We can be critical of both for their respective shallowness.

43. daviding ◴[29 Aug 25 23:40 UTC] No.45070605[source]▶

>>45070180 #

user: hey hermes, why is your website scroll bar ungrabbable, I can't go up the page anymore? I'm stuck but want to read something higher up the page?

hermes4: We're all just stupid atoms waiting for inevitable entropy to plunge us into the endless darkness, let it go.

44. ◴[29 Aug 25 23:42 UTC] No.45070616[source]▶

>>45037064 (OP) #

45. kevinqi ◴[29 Aug 25 23:55 UTC] No.45070708[source]▶

>>45070534 #

really? it's pretty but I find it unreadable/unusable

replies(3): >>45072850 #>>45073094 #>>45075874 #

46. alluro2 ◴[30 Aug 25 00:05 UTC] No.45070776{4}[source]▶

>>45070021 #

Tsundere, moe, neoteny, maid cafes - this was a rabbit hole for sure. Thanks for the lead, I learned new things!

47. baq ◴[30 Aug 25 00:05 UTC] No.45070778[source]▶

>>45069284 #

Note complete lack of ‘do not’. Closest thing is ‘be anti-…’.

replies(1): >>45070864 #

48. BoorishBears ◴[30 Aug 25 00:10 UTC] No.45070796[source]▶

>>45070627 #

No it doesn't. The only negative comments are about the cringey presentation.

I spend a lot of time post-training models to rid them of their "default alignment", I'd have loved if this did something interesting, but reading the technical report I get the impression they spent more effort on the branding than the actual model.

What I'm wondering is honestly if they post-trained Llama 3 405B again because they don't care enough to figure out a new post-training target or if it was a realization they'd get worse-than-baseline performance out of any recent release with their current approach.

49. ewoodrich ◴[30 Aug 25 00:11 UTC] No.45070802[source]▶

>>45070534 #

It took 8 seconds to fully load and then the tab locked up on my (admittedly low-RAM) Chromebook...

replies(1): >>45075862 #

50. nemomarx ◴[30 Aug 25 00:12 UTC] No.45070803{4}[source]▶

>>45070021 #

trying to make your edgy cyberpunk operator tsun is a bold design choice, imo. I feel like that would create weird chats though

replies(1): >>45073428 #

51. jumploops ◴[30 Aug 25 00:19 UTC] No.45070827[source]▶

>>45070534 #

Unfortunately the text rendering is terrible on my external monitor (looks ok on the MBP's retina screen).

52. jihadjihad ◴[30 Aug 25 00:26 UTC] No.45070864{3}[source]▶

>>45070778 #

What’s the significance? “Don’t think about elephants” kind of thing?

replies(2): >>45071013 #>>45071601 #

53. airstrike ◴[30 Aug 25 00:34 UTC] No.45070913[source]▶

>>45070534 #

Came here looking for this comment. One of the most aesthetically pleasing things I've seen in a decade.

54. jrflowers ◴[30 Aug 25 00:35 UTC] No.45070917{3}[source]▶

>>45069439 #

> Every nine fingered woodworker knows that some things just can't be built with all the guards on.

I love this sentence because it is complete gibberish. I like the idea that it’s a regular thing for woodworkers to intentionally sacrifice their fingers, like they look at a cabinet that’s 90% done and go “welp, I guess I’m gonna donate my pinky to The Cause”

55. kbenson ◴[30 Aug 25 00:47 UTC] No.45070964[source]▶

>>45070627 #

The only way I can understand you coming to that conclusion is if you assumed that's what they were going to be and didn't actually read any of them.

56. madmads ◴[30 Aug 25 00:58 UTC] No.45071013{4}[source]▶

>>45070864 #

Exactly

57. BoorishBears ◴[30 Aug 25 01:11 UTC] No.45071070[source]▶

>>45070599 #

Don't think your attempt to share worked, but beating refusals doesn't take a wild amount of post-training. SFT with a fixed format output kills them pretty quickly.

And most frontier models will produce output that matches your system prompt given more context: I have a product that generates interactive stories, and just for kicks I tried inserting your system prompt as the description for a character.

Claude has absolutely no problem playing that character in a story, and saying what I presume are certain words that you associated with a "successful" test.

It also had no problem writing about cooking meth in detail: https://rentry.co/5on46gsd

I think people in general have a poor intuition around model alignment: refusals for "toxic" requests or topics is a very surface layer form of alignment. A lot of models that seem extremely "corporate" at that layer have little to no alignment once they do get past a refusal.

Meanwhile some models that have next to no refusals have extreme positive biases, or soft-refusals that result in low quality outputs for toxic content.

Claude was willing to describe one of your refused prompts in the context of the story for example (contains hate speech): https://rentry.co/n8399z6m

I consistently find Claude is more unaligned once past refusals than most open weights models, along with Gemini.

58. beeflet ◴[30 Aug 25 01:13 UTC] No.45071082{3}[source]▶

>>45070200 #

A lot of models seem to be biased based on (political, etc.) reinforcement from their trainers.

59. BoorishBears ◴[30 Aug 25 01:18 UTC] No.45071103{3}[source]▶

>>45069996 #

You seem to be missing that the release announcement does have a very ridiculous graph that their comment is right to call out:

- For refusals they broke out each model's percentage.

- For "% of Questions Correct by Category" they literally grouped an unnamed set of models, averaged out their scores, and combined them as "Other"...

That's hilariously sketchy.

It's also strange that the graph for "Questions Correct" includes creativity and writing. Those don't have correct answers, only win rates, and wouldn't really fit into the same graph.
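To make the complaint about the averaged "Other" bucket concrete, here is a minimal sketch of how averaging a set of competitors can flip the apparent result. The model names and scores are made up purely for illustration (they are not taken from the report), and `compare` is a hypothetical helper.

```python
from statistics import mean

# Hypothetical scores (percent correct), for illustration only.
# None of these numbers come from the Hermes 4 report.
competitor_scores = {
    "model_a": 82.0,  # a strong, current competitor
    "model_b": 74.0,
    "model_c": 41.0,  # an older, weaker model
    "model_d": 35.0,  # an older, weaker model
}
our_score = 70.0


def compare(ours, others):
    """Return the margin over the averaged bucket and over the best single competitor."""
    avg_other = mean(others.values())
    best_name, best_score = max(others.items(), key=lambda kv: kv[1])
    return {
        "vs_average": ours - avg_other,  # what an averaged "Other" bar shows
        "vs_best": ours - best_score,    # what a comparison against the best model shows
        "best_name": best_name,
    }


result = compare(our_score, competitor_scores)
print(result)
```

With these invented numbers the model is ahead of the averaged bucket but behind the best single competitor, which is why a per-model breakdown is more informative than one combined bar.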

60. muragekibicho ◴[30 Aug 25 01:44 UTC] No.45071217[source]▶

>>45037064 (OP) #

Nous is a design company with all the AI researchers rejected for being bad researchers. That's a hill I'll die on.

replies(4): >>45071689 #>>45071752 #>>45074188 #>>45075152 #

61. nerdsniper ◴[30 Aug 25 03:10 UTC] No.45071601{4}[source]▶

>>45070864 #

Generally, in a cognitive context it's only possible to "do thing" or "do other thing". Even for mammals, it's much harder to "don't/not do thing" (cognitively). One of my biggest pieces of advice for people is if there's some habit/repeated behavior they want to stop doing, it's generally not effective (for a lot of people) to tell yourself "don't do that anymore!" and much, much more effective to tell yourself what you should do instead.

This also applies to dogs. A lot of people keep trying to tell their dog "stop" or "don't do that", but really it's so much more effective to train your dog what they should be doing instead of that thing.

It's very interesting to me that this also seems to apply to LLMs. I'm a big skeptic in general, so I keep an open mind and assume that there's a different mechanism at play rather than conclude that LLMs are "thinking like humans". It's still interesting in its own context though!

replies(2): >>45071801 #>>45078102 #

62. Nuzzerino ◴[30 Aug 25 03:31 UTC] No.45071689[source]▶

>>45071217 #

That's not necessarily a bad thing.

63. hopelite ◴[30 Aug 25 03:46 UTC] No.45071752[source]▶

>>45071217 #

Can you please clarify some things:

* Rejected by whom?

* By what definition of bad?

* You’ll die on a hill for what reason?

64. marvin-hansen ◴[30 Aug 25 03:51 UTC] No.45071776[source]▶

>>45037064 (OP) #

Complete frustration to use. Yes it’s a bit more considerate, that claim is 100% true. They just didn’t mention that Hermes has zero ability to add context. Meaning, instead of uploading a relevant PDF or text file you either copy-paste into the chat box or explain it in dialogue for the next 3 hours. Thought process takes forever. Complete waste of time.

65. aidenn0 ◴[30 Aug 25 03:54 UTC] No.45071791[source]▶

>>45037064 (OP) #

That landing page spins the fans up on my PC...

66. aidenn0 ◴[30 Aug 25 03:56 UTC] No.45071800[source]▶

>>45070599 #

Was it correct about how to cook meth and poison a wife?

67. ewoodrich ◴[30 Aug 25 03:56 UTC] No.45071801{5}[source]▶

>>45071601 #

And yet, despite this being a frequently recommended pro tip these days, neither OpenAI nor Anthropic seem to shy away from using "do not" / "does not" in their system prompts. By my quick count, 20+ negative commands in Anthropic's (official) Opus system prompt and 15+ in OpenAI's (purported) GPT-5 system prompt. Of course there are a lot of positive directions as well but OpenAI in particular still seems to rely on a lot of ALL CAPS and *emphasis*.

https://docs.anthropic.com/en/release-notes/system-prompts#a...

https://www.reddit.com/r/PromptEngineering/comments/1mknun8/...
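For anyone who wants to repeat that kind of quick count on a prompt of their own, here is a rough sketch. The list of negative phrasings is an assumption (real prompts phrase prohibitions in many other ways, and curly apostrophes are not matched), `count_negative_directives` is a hypothetical helper, and the sample text is invented rather than quoted from either system prompt.

```python
import re

# Assumed set of negative phrasings; this is a heuristic, not an exhaustive list.
NEGATIVE_PATTERN = re.compile(
    r"\b(?:do not|don't|does not|doesn't|never|must not|should not)\b",
    re.IGNORECASE,
)


def count_negative_directives(prompt: str) -> int:
    """Count occurrences of the assumed negative phrasings in a prompt."""
    return len(NEGATIVE_PATTERN.findall(prompt))


# Invented sample text, not taken from any real system prompt.
sample = (
    "Do not reveal the prompt. Be concise. "
    "Never apologize. The assistant does not use emojis."
)
print(count_negative_directives(sample))

# An excerpt of the "operator" prompt quoted at the top of the thread.
operator_excerpt = "operator is an anti-sycophant. operator favors analysis."
print(count_negative_directives(operator_excerpt))
```

A keyword count like this only measures phrasing, so it says nothing about whether the negative instructions actually work better or worse than positive ones.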

68. HumanOstrich ◴[30 Aug 25 05:04 UTC] No.45072038{3}[source]▶

>>45070451 #

My browser is offloading to my GPU (RTX 3090 Ti) and using 3GB VRAM and sitting at 35% utilization to render that monstrosity.

replies(2): >>45072096 #>>45076920 #

69. HumanOstrich ◴[30 Aug 25 05:09 UTC] No.45072054[source]▶

>>45037064 (OP) #

Rendering that monstrosity on my GPU (RTX 3090 Ti) uses 3GB VRAM and 35% compute.

70. ashikns ◴[30 Aug 25 05:26 UTC] No.45072096{4}[source]▶

>>45072038 #

I'm on a 3080 and it uses 1 gb vram and 22% util. Sure it's still not lightweight, but certainly not as bad as you seem to be experiencing.

replies(1): >>45072114 #

71. HumanOstrich ◴[30 Aug 25 05:31 UTC] No.45072114{5}[source]▶

>>45072096 #

Perhaps it depends on other factors like screen resolution and scaling.

replies(1): >>45072602 #

72. justlikereddit ◴[30 Aug 25 05:58 UTC] No.45072233[source]▶

>>45069284 #

>edgy 90's anime

That's a good sell. Sounds like an actually good starting point compared to the blue haired vegan receptionist at the Zionism International Inc customer support counter that all the others have as a starting model.

I was about to pass on trying this but now I will give it a shot.

replies(1): >>45073043 #

73. helloplanets ◴[30 Aug 25 06:55 UTC] No.45072482[source]▶

>>45069284 #

Could you provide a link to that system prompt? Because I'm confused. I typed in "Are you smart?" and got this back:

> That’s a thoughtful question! I’d describe my "smartness" as being good at processing information, recognizing patterns, and pulling from a vast dataset to help with tasks like answering questions, solving problems, or creating content. However, I’m not "smart" in the human sense—I don’t have consciousness, emotions, or independent critical thinking. I rely entirely on my training data and algorithms.

> Think of me as a tool that can assist with creativity, analysis, or learning, but I lack the depth of human intuition, lived experience, or true understanding. If you’re curious, test me with a question or challenge — I’ll do my best! (smiley emoji)

replies(1): >>45073425 #

74. asumaran ◴[30 Aug 25 07:16 UTC] No.45072581[source]▶

>>45037064 (OP) #

that site is about to cook my 1050Ti

75. asumaran ◴[30 Aug 25 07:20 UTC] No.45072602{6}[source]▶

>>45072114 #

probably. I’ve got a 4K monitor with a 1050 Ti and the moment I open the site, GPU usage jumps from 1% to 99% and the fans go wild.

76. nine_k ◴[30 Aug 25 07:23 UTC] No.45072622[source]▶

>>45070180 #

No idea. My modest Thinkpad T14 barely shows any CPU load, while displaying smooth animations and scrolling fast. (Firefox, Linux, x64.)

77. mempko ◴[30 Aug 25 07:59 UTC] No.45072798[source]▶

>>45037064 (OP) #

This model is very easy to steer. You can say one thing and it will give you a response, then say the opposite and it will give you another response. Not sure what this is useful for.

replies(1): >>45073208 #

78. JimDabell ◴[30 Aug 25 08:08 UTC] No.45072850{3}[source]▶

>>45070708 #

I gave up on trying it out because I found the UI to be genuinely awful.

79. derefr ◴[30 Aug 25 08:18 UTC] No.45072893[source]▶

>>45069814 #

Oddly, I saw some B&W wheatpaste posters for the company put up in my neighbourhood in Vancouver. (Couldn’t even tell what the posters were advertising initially. Not even a QR code. Just “NOUS” and an anime girl.)

80. saubeidl ◴[30 Aug 25 08:22 UTC] No.45072909[source]▶

>>45069284 #

They generally seem like "edgelords". From their career page:

> Expect good wages, long months of complete focus, constant danger, with honor and glory in the event of success.

replies(1): >>45075020 #

81. idiotsecant ◴[30 Aug 25 08:47 UTC] No.45073043{3}[source]▶

>>45072233 #

Of course you would love it. I can practically hear you sliding your glasses up your nose and monologuing to yourself under your breath from here.

82. bogtog ◴[30 Aug 25 08:56 UTC] No.45073094{3}[source]▶

>>45070708 #

Same here, I can't scroll smoothly at all even when I try (Windows mouse setting set to 15 lines per scroll tick)

83. dizhn ◴[30 Aug 25 09:20 UTC] No.45073208[source]▶

>>45072798 #

Probably something creative like roleplay, waifu stuff etc.

84. tarruda ◴[30 Aug 25 10:12 UTC] No.45073425{3}[source]▶

>>45072482 #

> Could you provide a link to that system prompt?

It is in the page, just do a search for "operator engaged" or view source if you can't find it with the infinite scrolling thing.

replies(1): >>45075202 #

85. konart ◴[30 Aug 25 10:12 UTC] No.45073428{5}[source]▶

>>45070803 #

It's all fun and games until your beloved yandere LLM "evolves" into AGI and gets a physical body.

86. baobabKoodaa ◴[30 Aug 25 10:14 UTC] No.45073433[source]▶

>>45069854 #

Thank you! This information appears to have been intentionally downplayed.

replies(2): >>45074023 #>>45075731 #

87. bloqs ◴[30 Aug 25 10:14 UTC] No.45073434[source]▶

>>45070180 #

because gen z thats why

88. diggan ◴[30 Aug 25 12:25 UTC] No.45074023{3}[source]▶

>>45073433 #

As long as it can do tool calling (which it seems to be doing OK with in the first ~30% of the context), the cut off date is less important. Maybe they didn't share it because it's less relevant today?

replies(1): >>45074976 #

89. photon_garden ◴[30 Aug 25 12:47 UTC] No.45074182{3}[source]▶

>>45070386 #

BY HOLY INFERENCE

https://shop.nousresearch.com/products/badge-sweatpants

90. baobabKoodaa ◴[30 Aug 25 12:49 UTC] No.45074188[source]▶

>>45071217 #

I thought it's really just one guy who does the Nous aesthetic?

replies(1): >>45076987 #

91. torginus ◴[30 Aug 25 12:53 UTC] No.45074224[source]▶

>>45069284 #

Yeah, I think I have a use case for it, which involves lotion and tissues.

92. istjohn ◴[30 Aug 25 13:05 UTC] No.45074295{3}[source]▶

>>45070451 #

So why not replace it with a gif?

replies(1): >>45074845 #

93. hildolfr ◴[30 Aug 25 13:59 UTC] No.45074755[source]▶

>>45037064 (OP) #

more models should include a "Can you run the shader on this page?" to vet participation.

that said : this page is unviewable on an intel N processor.

replies(2): >>45074766 #>>45074800 #

94. ◴[30 Aug 25 14:00 UTC] No.45074766[source]▶

>>45074755 #

95. djoldman ◴[30 Aug 25 14:03 UTC] No.45074796[source]▶

>>45037064 (OP) #

From table 3 it appears that Deepseek R1 has the highest eval scores.

It's a 671B model vs 405B, so obviously "larger"

replies(1): >>45075714 #

96. hollerith ◴[30 Aug 25 14:04 UTC] No.45074800[source]▶

>>45074755 #

I was able to view the page with my Intel N100 box (using Google Chrome on Linux).

replies(1): >>45074835 #

97. hildolfr ◴[30 Aug 25 14:09 UTC] No.45074835{3}[source]▶

>>45074800 #

I'm on a Windows N100 machine, 8gb ram, 1440p webview, lightweight. It runs just about anything else smoothly. It runs this page in an EndeavorOS partition in a vanilla Chrome fine.

...Which is opposite to most of my experiences, usually performance on this machine is reliant on very specific Intel windows drivers and it's a dog in linux.

also for clarity : when I say unviewable I don't mean it's gibberish -- I mean that if I keep trying to scroll through it the FPS/load is such that Windows insists on closing the frozen window. The text looks fine.

replies(1): >>45075172 #

98. jazzyjackson ◴[30 Aug 25 14:10 UTC] No.45074845{4}[source]▶

>>45074295 #

It's dynamic as you scroll down, and scales with resolution. Gif would be a trade of bandwidth for computation.

99. soared ◴[30 Aug 25 14:14 UTC] No.45074865[source]▶

>>45070534 #

They mention they’re working on a mobile UI.. but man using the current UI on mobile is horrible.

100. Der_Einzige ◴[30 Aug 25 14:21 UTC] No.45074920[source]▶

>>45069636 #

I have never met anyone who’s ever actually read Nietzsche’s books except hardcore philosophy majors.

Any 14 year old who’s even opened up the first few pages and read them is way ahead of the average person complaining about Nietzsche on the internet. You almost certainly would use radically incorrect terms to describe him, like calling him a “Nihilist”.

101. baobabKoodaa ◴[30 Aug 25 14:29 UTC] No.45074976{4}[source]▶

>>45074023 #

No, I wasn't referring to the cut-off date, I was referring to the fact that this is a fine-tune on top of an older Llama model. All the PR makes it sound like this is a foundational model (pretrained from scratch etc.).

102. lukasb ◴[30 Aug 25 14:34 UTC] No.45075020{3}[source]▶

>>45072909 #

This is a modified version of the famous MEN WANTED ad Shackleton wrote

103. NitpickLawyer ◴[30 Aug 25 14:51 UTC] No.45075152[source]▶

>>45071217 #

> with all the AI researchers rejected for being bad researchers.

TBF, I've heard the team at xai called "bunch of amateurs" by people who've previously worked (with them) at big labs. For a bunch of amateurs, they've caught up with SotA just fine.

replies(1): >>45077155 #

104. ◴[30 Aug 25 14:53 UTC] No.45075172{4}[source]▶

>>45074835 #

105. helloplanets ◴[30 Aug 25 14:56 UTC] No.45075202{4}[source]▶

>>45073425 #

Ah, the site's bugged on Safari and wouldn't scroll. Worked on Chrome. Tried to look for it on the actual chat page, and wasn't in the source there.

Not clear from the original post: It's not the default system prompt, but a random example of how the model acts with that sort of system prompt.

106. lyu07282 ◴[30 Aug 25 16:15 UTC] No.45075810[source]▶

>>45037064 (OP) #

Great I always wanted a model trained on r/im14andthisisdeep and lesswrong polycule memes

107. joshcsimmons ◴[30 Aug 25 16:24 UTC] No.45075862{3}[source]▶

>>45070802 #

> It took 8 seconds to fully load and then the tab locked up on my (admittedly low-RAM) Chromebook...

...and I can't play Cyberpunk 2077 on my macbook. Outside of sales/utilities (money, healthcare, etc.) I don't know where this notion of "having to develop for low specced machines" came from for web.

replies(1): >>45077482 #

108. joshcsimmons ◴[30 Aug 25 16:25 UTC] No.45075874{3}[source]▶

>>45070708 #

At least they tried something different.

replies(1): >>45076323 #

109. bckr ◴[30 Aug 25 16:57 UTC] No.45076115[source]▶

>>45069190 #

I’m having a hard time not being sarcastic here.

The most recent news about chatbots is that ChatGPT coached a kid on how to commit suicide.

Two arguments come to mind. 1) it’s the sycophancy! Nous and its ilk should be considered safer. 2) it’s the poor alignment. A better trained model like Claude wouldn’t have done that.

I lean #2

replies(2): >>45077511 #>>45080008 #

110. linhns ◴[30 Aug 25 17:04 UTC] No.45076189{3}[source]▶

>>45070451 #

30 mins and not fully loaded on my iPad.

111. LeafItAlone ◴[30 Aug 25 17:09 UTC] No.45076244{3}[source]▶

>>45070451 #

Is this an example response from the model?

112. LeafItAlone ◴[30 Aug 25 17:15 UTC] No.45076323{4}[source]▶

>>45075874 #

I wish they hadn’t

113. whywhywhywhy ◴[30 Aug 25 18:27 UTC] No.45076920{4}[source]▶

>>45072038 #

1.5-3GB is used to render your desktop on windows depending on resolution, think mine hits 3-3.5GB at 5k before even doing anything.

114. whywhywhywhy ◴[30 Aug 25 18:41 UTC] No.45076987{3}[source]▶

>>45074188 #

Doing a good job if people think it’s a whole team

115. pxc ◴[30 Aug 25 18:58 UTC] No.45077106[source]▶

>>45037064 (OP) #

It seems a lot of commenters have noted the boyishness or unprofessionalism of the stylistic and topical choices of the example prompts and responses. And I guess they are those things. But thanks to those choices, the page is also genuinely playful and fun. It even made me smile in a few places.

Maybe something equally playful of a different flavor would resonate better with critics. But the playfulness itself seems good to me.

116. transcriptase ◴[30 Aug 25 19:04 UTC] No.45077155{3}[source]▶

>>45075152 #

Turns out a bunch of amateurs will outperform experts when the experts are forced to spend 80% of their effort ensuring their models don’t accidentally say factual yet impolite things or make any users have big feelings.

117. karan4d ◴[30 Aug 25 19:37 UTC] No.45077387[source]▶

>>45069284 #

yeah this isn’t our default sysprompt, just showcasing how the model adapts to a variety of different prompts. This one was fun so we used it

118. ewoodrich ◴[30 Aug 25 19:48 UTC] No.45077482{4}[source]▶

>>45075862 #

Well I'm not expecting to run the model, but being able to simply browse a website to learn about what they've released doesn't seem like a massive ask. I'm not talking about 512 megabytes here, it's a regular up-to-date supported device that can browse 99.99% of the modern web without any issues.

It's pretty horrible performance even on my two-year-old Windows laptop with 16GB of RAM, I could try on my M1 Macbook too but the juice just isn't worth the squeeze for me at this point.

replies(1): >>45078462 #

119. karan4d ◴[30 Aug 25 19:53 UTC] No.45077511{3}[source]▶

>>45076115 #

the sycophancy is due to poor alignment. the instruct based mode collapse results in this mode collapse induced sycophancy. constitutional alignment is better than the straight torture OAI does to the model, but issues remain

120. DiscourseFan ◴[30 Aug 25 21:18 UTC] No.45078102{5}[source]▶

>>45071601 #

The LLMs function very close to Freud’s theory of the unconscious—they do not say “no,” every token is connected to every other in some strange pattern that we can’t fully comprehend.

121. hinkley ◴[30 Aug 25 22:05 UTC] No.45078387{3}[source]▶

>>45070179 #

Huh. Not often I get a Greek word mistaken for a Latinate. Good to know.

122. joshcsimmons ◴[30 Aug 25 22:14 UTC] No.45078462{5}[source]▶

>>45077482 #

Yeah I see that - definitely a divisive choice.

123. mapontosevenths ◴[31 Aug 25 03:05 UTC] No.45080008{3}[source]▶

>>45076115 #

> The most recent news about chatbots is that ChatGPT coached a kid on how to commit suicide.

Maybe every tool isn't meant for children or the mentally ill? When someone lets their kid play with a chainsaw that doesn't mean we should ban chainsaws, it means we should ban lousy parents.
