222 points by futurisold | 17 comments
1. sram1337 ◴[] No.44400463[source]
This is the voodoo that excites me.

Examples I found interesting:

Semantic map lambdas

  from symai import Symbol  # symbolicai's core primitive

  S = Symbol(['apple', 'banana', 'cherry', 'cat', 'dog'])
  print(S.map('convert all fruits to vegetables'))
  # => ['carrot', 'broccoli', 'spinach', 'cat', 'dog']

Comparison parameterized by context

  # Contextual greeting comparison
  greeting = Symbol('Hello, good morning!')
  similar_greeting = 'Hi there, good day!'

  # Compare with specific greeting context
  result = greeting.equals(similar_greeting, context='greeting context')
  print(result) # => True

  # Compare with different contexts for nuanced evaluation
  formal_greeting = Symbol('Good morning, sir.')
  casual_greeting = 'Hey, what\'s up?'

  # Context-aware politeness comparison
  politeness_comparison = formal_greeting.equals(casual_greeting, context='politeness level')
  print(politeness_comparison) # => False

Bitwise ops

  # Semantic logical conjunction - combining facts and rules
  horn_rule = Symbol('The horn only sounds on Sundays.', semantic=True)
  observation = Symbol('I hear the horn.')
  conclusion = horn_rule & observation # => Logical inference

`interpret()` seems powerful.

OP, what inspired you to make this? Where are you applying it? What has been your favorite use case so far?

replies(3): >>44400592 #>>44401514 #>>44401936 #
2. futurisold ◴[] No.44400592[source]
That's gonna be a very, very long answer. What's funny is that not much has changed since 2022 (eoy) when the project started; the models just got better, but we've had a good chunk of the primitives since gpt-3.

What's more recent is the DbC (design by contract) contribution, which I think is unique. It has solved literally everything agent-related I've thrown at it -- especially because I can chain contracts together and the guardrails propagate nicely.
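
To give a flavor of the chaining idea, here's a bare-bones plain-Python sketch -- the decorator and the two step functions below are just illustrative, not the actual DbC API (the real contracts validate typed LLM inputs/outputs); the point is only the shape: each step checks its input and output, and the next step consumes the previous step's validated output, so the guarantees propagate down the chain.

  # Illustrative only -- stand-ins for LLM-backed steps
  def contract(pre, post):
      def wrap(fn):
          def inner(x):
              assert pre(x), f"{fn.__name__}: precondition failed"
              out = fn(x)
              assert post(out), f"{fn.__name__}: postcondition failed"
              return out
          return inner
      return wrap

  @contract(pre=lambda q: len(q) > 0, post=lambda plan: plan.startswith("1."))
  def make_plan(query):
      return "1. Collect system prompts\n2. Compare tag conventions"  # stand-in for an LLM call

  @contract(pre=lambda plan: plan.startswith("1."), post=lambda doc: len(doc) > 40)
  def write_report(plan):
      return "Report based on:\n" + plan  # stand-in for a second LLM call

  print(write_report(make_plan("compare provider system prompts")))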

I've built most of the custom tools myself. For instance, not only was perplexity rendered useless by openai's web search, but openai's web search itself isn't good enough compared to what you can customize yourself. To this end, I've built my own deep research agent. Here's a thread with some results from the first day it was working: https://x.com/futurisold/status/1931751644233945216

I'm also running a company, and we've built an e2e document generation pipeline just from contracts (3 contracts chained together in this case). Here's an output (sorry about the PDF rendering -- that's not what we serve, just something I quickly hacked together for local dev): https://drive.google.com/file/d/1Va7ALq_N-fTYeumKhH4jSxsTrWD...

This was the input:

---

Prompt:

> I want the files to be analyzed and I am interested in finding patterns; feel free to make suggestions as well. I want to understand how different providers use their system prompts, therefore things like: what kind of tags do they use – are they XML, markdown, etc, are they prone toward sycophancy or trying to manipulate the user, are they using tools and if so how, etc. I want the tech report to deconstruct and synthesize and compare the information, find interesting patterns that would be hard to spot.

Generated instructions:

(a) Query: Conduct a comparative analysis of system prompts across major AI providers (OpenAI, Google, Anthropic, xAI, etc.) to identify structural patterns, linguistic frameworks, and operational constraints that shape AI behavior and responses.

(b) Specific Questions:

1. What syntactic structures and formatting conventions (XML, markdown, JSON, etc.) are employed across different AI system prompts, and how do these technical choices reflect different approaches to model instruction?

2. To what extent do system prompts encode instructions for deference, agreeability, or user manipulation, and how do these psychological frameworks vary between commercial and research-focused models?

3. How do AI providers implement and constrain tool usage in their system prompts, and what patterns emerge in permission structures, capability boundaries, and function calling conventions?

4. What ethical guardrails and content moderation approaches appear consistently across system prompts, and how do implementation details reveal different risk tolerance levels between major AI labs?

5. What unique architectural elements in specific providers' system prompts reveal distinctive engineering approaches to model alignment, and how might these design choices influence downstream user experiences?

---

Contracts were introduced in March in this post: https://futurisold.github.io/2025-03-01-dbc/

They evolved a lot since then, but the foundation and motivation didn't change.

replies(2): >>44400743 #>>44400879 #
3. futurisold ◴[] No.44400743[source]
Btw, besides the prompt, the other input to the technical report (the gdrive link) was this repo: https://github.com/elder-plinius/CL4R1T4S/tree/main
4. futurisold ◴[] No.44400879[source]
One last comment here on contracts -- an excerpt from the linked post that I think is extremely relevant for LLMs; maybe it triggers an interesting discussion here:

"The scope of contracts extends beyond basic validation. One key observation is that a contract is considered fulfilled if both the LLM’s input and output are successfully validated against their specifications. This leads to a deep implication: if two different agents satisfy the same contract, they are functionally equivalent, at least with respect to that specific contract.

This concept of functional equivalence through contracts opens up promising opportunities. In principle, you could replace one LLM with another, or even substitute an LLM with a rule-based system, and as long as both satisfy the same contract, your application should continue functioning correctly. This creates a level of abstraction that shields higher-level components from the implementation details of underlying models."
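
A toy way to see the equivalence claim in code (plain Python, deliberately not the actual contract API -- just the shape of the argument): two very different backends pass the same input/output checks, so a caller that relies only on the contract can't tell them apart.

  FRUITS = {"apple", "banana", "cherry"}

  def pre(items):   # input spec: a list of strings
      return isinstance(items, list) and all(isinstance(i, str) for i in items)

  def post(items):  # output spec: no fruits remain
      return not any(i in FRUITS for i in items)

  def llm_backend(items):         # imagine an LLM call here
      return ["carrot" if i in FRUITS else i for i in items]

  def rule_based_backend(items):  # plain lookup table, no model at all
      table = {"apple": "carrot", "banana": "broccoli", "cherry": "spinach"}
      return [table.get(i, i) for i in items]

  data = ["apple", "banana", "cherry", "cat", "dog"]
  for backend in (llm_backend, rule_based_backend):
      assert pre(data) and post(backend(data))  # both satisfy the same contract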

replies(1): >>44403730 #
5. haileys ◴[] No.44401514[source]
Why is carrot the vegetablefication of apple?
replies(3): >>44401518 #>>44401567 #>>44401606 #
6. herval ◴[] No.44401518[source]
Also if you run it twice, is it gonna be a carrot again?
replies(3): >>44401564 #>>44403465 #>>44405055 #
7. ◴[] No.44401564{3}[source]
8. pfdietz ◴[] No.44401567[source]
Are you asking for the root cause?
9. HappMacDonald ◴[] No.44401606[source]
I think it's interpreting the command as "replace each fruit with a vegetable", and it might intuit "make the resulting vegetables unique from one another" but otherwise it's not trying to find the "most similar" vegetable to every fruit or anything like that.
replies(1): >>44403378 #
10. lmeyerov ◴[] No.44401936[source]
You might enjoy Lotus: https://github.com/lotus-data/lotus

It takes all the core relational operators and makes an easy semantic version of each as a Python dataframe library extension. Each call ends up being a 'model' point in case you also want to do fancier things later, like more learning-based approaches. Afaict, Snowflake and friends are moving in this direction for their cloud SQLs as well.

We ended up doing something similar for louie.ai, where you use AI notebooks/dashboards/APIs (ex: MCP) to talk to your data (splunk, databricks, graph db, whatever), and it'll figure out symbolic + semantic operators based on the context. Super helpful in practice.

My 80% case here is:

- semantic map: "get all the alerts from splunk index xyz, add a column flagging anything suspicious and another explaining why" <--- generates an enriched dataframe

- semantic map => semantic reduce: "... then summarize what you found" <--- then tells you about it in natural text
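
Roughly what that looks like with Lotus-style operators -- this is from memory, so treat the model-config call and exact signatures as approximate; the repo's README is the source of truth:

  import pandas as pd
  import lotus
  from lotus.models import LM  # wrapper class name may differ by Lotus version

  lotus.settings.configure(lm=LM(model="gpt-4o-mini"))

  alerts = pd.DataFrame({"alert": [
      "200 failed ssh logins from 203.0.113.7 in 5 minutes",
      "scheduled nightly backup completed",
  ]})

  # semantic map: add a per-row natural-language judgment as a new column
  enriched = alerts.sem_map("Is this suspicious, and why? {alert}")

  # semantic reduce: collapse the enriched frame into a prose summary
  summary = enriched.sem_agg("Summarize any suspicious activity across {alert}")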

11. futurisold ◴[] No.44403378{3}[source]
This is the correct view. Since the instruction was ambiguous, the LLM did its best to satisfy it -- and it did.
12. futurisold ◴[] No.44403465{3}[source]
It's subject to randomness. But you're ultimately in control of the LLM's hyperparams -- temperature, top_p, and seed -- so you get deterministic outputs if that's what you need. However, there are downsides to these kinds of deterministic tweaks because of the inherent autoregressive nature of the LLM.

For instance, with temperature 1 there *could be* a path that satisfies your instruction which otherwise gets missed. There's interesting work here at the intersection of generative grammars and LLMs, where you can cast the problem as an FSM/pushdown automaton such that you only sample from that grammar with the LLM (you use something like logit_bias to turn off unwanted tokens and keep only those that define the grammar). You can define grammars with libs like lark or parsimonious, and this is how people solved JSON output with LLMs -- JSON is a formal grammar.

Contracts alleviate some of this through post validation, *as long as* you find a way to semantically encode your deterministic constraint.
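
Concretely, the post-validation half can be as simple as "does the raw output parse under a small grammar" -- a quick sketch with lark; the grammar below is made up for the list-of-words example upthread, not anything symai ships:

  from lark import Lark
  from lark.exceptions import LarkError

  # Accept only a bracketed list of quoted words, e.g. ['carrot', 'cat']
  parser = Lark(r"""
      start: "[" [item ("," item)*] "]"
      item: ESCAPED_STRING | /'[^']*'/
      %import common.ESCAPED_STRING
      %import common.WS
      %ignore WS
  """)

  def satisfies_format(llm_output: str) -> bool:
      try:
          parser.parse(llm_output)
          return True
      except LarkError:
          return False

  print(satisfies_format("['carrot', 'broccoli', 'cat']"))    # True
  print(satisfies_format("Sure! Here are some vegetables:"))  # False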

13. dogcomplex ◴[] No.44403730{3}[source]
Anyone interested in this from a history / semiotics / language-theory perspective should look into the triad concepts of:

Sign (Signum) - the thing which points
Locus - the thing being pointed to
Sense (Sensus) - the effect/sense in the interpreter

Also known as: Representation/Object/Interpretation, Symbol/Referent/Thought, Signal/Data/User, Symbol/State/Update. The same pattern has been independently identified many times throughout history, always ending up with the same triplet, just renamed over and over.

What you're describing above is the "Locus" -- the essential object being pointed to, fulfilled by different contracts/LLMs/systems but always the same essential thing being alluded to. There's an elegant stability to it from a systems design pov. It makes strong sense to build around those as the indexes/keys being pointed towards, with various implementations (Signs) attempting to achieve them. I'm building a similar system atm.

replies(1): >>44403784 #
14. futurisold ◴[] No.44403784{4}[source]
Thanks for bringing this up. I'm fairly familiar with Peirce's triadic semiotics and Montague's semantics, and they show up in some of my notes. I haven't turned those sketches into anything applied yet, but the design space feels *huge* and quite promising intuitively.
replies(1): >>44405289 #
15. d0100 ◴[] No.44405055{3}[source]
Since these seem like short prompts, you can send, as context, data that was verified correct in past prompts.

You can also create a test suite for your code that verifies results against another prompt or a dictionary check:

  # 't.test' is a hypothetical harness: the second argument is a
  # natural-language assertion that another prompt (or a dictionary
  # check) would verify against the first argument's output
  t.test(
     Symbol(['apple', 'banana', 'cherry', 'cat', 'dog']).map('convert all fruits to vegetables'),
     "list only has vegetables plus cat and dog"
  )
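
Or skip the second prompt and verify deterministically against a word list -- a rough sketch reusing the example from upthread (assumes symai's `Symbol`; the check itself is plain Python):

  from symai import Symbol  # the package from the post

  FRUITS = {"apple", "banana", "cherry"}

  result = Symbol(["apple", "banana", "cherry", "cat", "dog"]).map(
      "convert all fruits to vegetables"
  )
  out = list(getattr(result, "value", result))  # unwrap if map() returns a Symbol

  assert "cat" in out and "dog" in out   # non-fruits untouched
  assert not FRUITS.intersection(out)    # no fruits survived the mapping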
16. VinLucero ◴[] No.44405289{5}[source]
Agreed. This is a very interesting discussion! Thanks for bringing it to light.

Have you read Gödel, Escher, Bach: An Eternal Golden Braid?

replies(1): >>44461571 #
17. dogcomplex ◴[] No.44461571{6}[source]
Of course! And yes, a Locus appears to be very close in concept to a strange attractor. I am especially interested in the idea of the holographic principle, where each node has its own low-fidelity map of the rest of the (graph?) system and can self-direct its own growth and positioning. Becomes more of a marketplace of meaning, and useful for the fuzzier edges of entity relationships that we're working with now.