←back to thread

1401 points alankay | 2 comments | | HN request time: 0s | source

This request originated via recent discussions on HN, and the forming of HARC! at YC Research. I'll be around for most of the day today (though the early evening).
Show context
wdanilo ◴[] No.11941656[source]
Hi Alan! I've got some assumptions regarding the upcoming big paradigm shift (and I believe it will happen sooner than later):

1. focus on data processing rather than imperative way of thinking (esp. functional programming)

2. abstraction over parallelism and distributed systems

3. interactive collaboration between developers

4. development accessible to a much broader audience, especially to domain experts, without sacrificing power users

In fact the startup I'm working in aims exactly in this direction. We have created a purely functional visual<->textual language Luna ( http://www.luna-lang.org ).

By visual<->textual I mean that you can always switch between code, graph and vice versa.

What do you think about these assumptions?

replies(2): >>11942789 #>>11945722 #
alankay ◴[] No.11945722[source]
What if "data" is a really bad idea?
replies(3): >>11945869 #>>11956981 #>>11984719 #
richhickey ◴[] No.11945869[source]
Data like that sentence? Or all of the other sentences in this chat? I find 'data' hard to consider a bad idea in and of itself, i.e. if data == information, records of things known/uttered at a point in time. Could you talk more about data being a bad idea?
replies(2): >>11946532 #>>11948698 #
alankay ◴[] No.11946532[source]
What is "data" without an interpreter (and when we send "data" somewhere, how can we send it so its meaning is preserved?)
replies(3): >>11946764 #>>11957966 #>>11959640 #
richhickey ◴[] No.11946764[source]
Data without an interpreter is certainly subject to (multiple) interpretation :) For instance, the implications of your sentence weren't clear to me, in spite of it being in English (evidently, not indicated otherwise). Some metadata indicated to me that you said it (should I trust that?), and when. But these seem to be questions of quality of representation/conveyance/provenance (agreed, important) rather than critiques of data as an idea. Yes, there is a notion of sufficiency ('42' isn't data).

Data is an old and fundamental idea. Machine interpretation of un- or under-structured data is fueling a ton of utility for society. None of the inputs to our sensory systems are accompanied by explanations of their meaning. Data - something given, seems the raw material of pretty much everything else interesting, and interpreters are secondary, and perhaps essentially, varied.

replies(2): >>11946935 #>>11946989 #
alankay ◴[] No.11946935[source]
There are lots of "old and fundamental" ideas that are not good anymore, if they ever were.

The point here is that you were able to find the interpreter of the sentence and ask a question, but the two were still separated. For important negotiations we don't send telegrams, we send ambassadors.

This is what objects are all about, and it continues to be amazing to me that the real necessities and practical necessities are still not at all understood. Bundling an interpreter for messages doesn't prevent the message from being submitted for other possible interpretations, but there simply has to be a process that can extract signal from noise.

This is particularly germane to your last paragraph. Please think especially hard about what you are taking for granted in your last sentence.

replies(5): >>11947046 #>>11947082 #>>11947341 #>>11947809 #>>11958689 #
richhickey ◴[] No.11947809[source]
Without the 'idea' of data we couldn't even have a conversation about what interpreters interpret. How could it be a "really bad" idea? Data needn't be accompanied by an interpreter. I'm not saying that interpreters are unimportant/uninteresting, but they are separate. Nor have I said or implied that data is inherently meaningful.

Take a stream of data from a seismometer. The seismometer might just record a stream of numbers. It might put them on a disk. Completely separate from that, some person or process, given the numbers and the provenance alone (these numbers are from a seismometer), might declare "there is an earthquake coming". But no object sent an "earthquake coming" "message". The seismometer doesn't "know" an earthquake is coming (nor does the earth, the source of the 'messages' it records), so it can't send a "message" incorporating that "meaning". There is no negotiation or direct connection between the source and the interpretation.

We will soon be drowning in a world of IoT sensors sending context-or-provenance-tagged but otherwise semantic-free data (necessarily, due to constraints, without accompanying interpreters) whose implications will only be determined by downstream statistical processing, aggregation etc, not semantic-rich messaging.

If you meant to convey "data alone makes for weak messages/ambassadors", well ok. But richer messages will just bottom out at more data (context metadata, semantic tagging, all more data) Ditto, as someone else said, any accompanying interpreter (e.g. bytecode? - more data needing interpretation/execution). Data remains a perfectly useful and more fundamental idea than "message". In any case, I thought we were talking about data, not objects. I don't think there is a conflict between these ideas.

replies(1): >>11948729 #
alankay ◴[] No.11948729[source]
2nd Paragraph: How do they know they are even bits? How do they know the bits are supposed to be numbers? What kind of numbers? Relating to what?

Etc

replies(2): >>11949133 #>>11956853 #
richhickey ◴[] No.11949133[source]
It contravenes the common and historical use of the word 'data' to imply undifferentiated bits/scribbles. It means facts/observations/measurements/information and you must at least grant it sufficient formatting and metadata to satisfy that definition. The fact that most data requires some human involvement for interpretation (e.g. pointing the right program at the right data) in no way negates its utility (we've learned a lot about the universe by recording data and analyzing it over the centuries), even though it may be insufficient for some bootstrapping system you envision.
replies(3): >>11957323 #>>11957898 #>>11958852 #
jsprogrammer ◴[] No.11958852[source]
Your selection of data is arbitrary.

Not only is your perception based on an interpreter, but how can you be sure that you were even given all of the relevant bits? Or, even what the bits really meant/are?

replies(1): >>11978703 #
1. madmax96 ◴[] No.11978703{3}[source]
Of course the selection of data is arbitrary -- but Rich gives us a definition, which he makes abundantly clear and uses consistently. All definitions can be considered arbitrary. He's not making any claim that we have all the relevant bits of data or that we can be sure what the data really means or represents.

But we can expound on this problem in general. In any experiment where we gather data, how can we be sure we have collected a sufficient quantity to justify conclusions (and even if we are using statistical methods that our underlying assumptions are indeed consistent with reality) and that we have accrued all the necessary components? What you're really getting at is an __epistemological__ problem.

My school of thought is that the only way to proceed is to do our best with the data we have. We'll make mistakes, but that's better than the alternative (not extrapolating on data at all.)

replies(1): >>11984230 #
2. jsprogrammer ◴[] No.11984230[source]
I hope we can do our best, I'm just not sure there is really a satisfactory way to define/measure/judge that we have actually done so....