171 points voat | 4 comments
thenaturalist ◴[] No.42158900[source]
I don't want to come off as overconfident, but I would be very hard-pressed to see the value of this.

At face value, I shudder at the syntax.

Example from their tutorial:

EmployeeName(name:) :- Employee(name:);

Engineer(name:) :- Employee(name:, role: "Engineer");

EngineersAndProductManagers(name:) :- Employee(name:, role:), role == "Engineer" || role == "Product Manager";

vs. the equivalent SQL:

SELECT Employee.name AS name

FROM t_0_Employee AS Employee

WHERE (Employee.role = "Engineer" OR Employee.role = "Product Manager");

SQL is much more concise and extremely easy to follow.

No weird OOP-style class instantiation for something as simple as just getting the name.

As already noted in the 2021 discussion, what's actually the killer though is adoption and, three years later, ecosystem.

SQL for analytics has come an extremely long way with the ecosystem that was ignited by dbt.

The tooling is so much better today when it comes to testing, modelling, and running in memory, with tools like DuckDB, Ibis, or Apache Iceberg.

There is value to abstracting on top of SQL, but it does very much seem to me like this is not it.

replies(4): >>42158997 #>>42159072 #>>42159873 #>>42162215 #
aseipp ◴[] No.42159072[source]
Logica is in the Datalog/Prolog/Logic family of programming languages. It's very familiar to anyone who knows how to read it. None of this has anything to do with OOP at all and you will heavily mislead yourself if you try to map any of that thinking onto it. (Beyond that, and not specific to Logica or SQL in any way -- comparing two 3-line programs to draw conclusions is effectively meaningless. You have to actually write programs bigger than that to see the whole picture.)

Datalog is not really a query language, actually. But it is relational, like SQL, so it lets you express relations between "facts" (the rows) inside tables. But it is more general, because it also lets you express relations between tables themselves (e.g. this "table" is built from the relationship between two smaller tables), and it does so without requiring extra special case semantics like VIEWs.

Because of this, it's easy to write small fragments of Datalog programs and then stick them together with other fragments without a lot of planning ahead of time, meaning that as a language it is very compositional. This is one of the primary reasons why many people are interested in it as a SQL alternative, aside from avoiding your typical weird SQL quirks through better language design (those quirks are annoying, but not really the big picture).
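To make the composition point concrete, here is a minimal sketch in plain Python (not Logica, and not a real Datalog engine). It treats rows as facts in a set and each rule as a small function that derives a new relation; the sample data and the function names are made up for illustration, and only the Engineer rule mirrors the tutorial snippet quoted upthread.

```python
# Base facts: each row of the Employee "table" is one (name, role) fact.
# The rows are invented sample data.
EMPLOYEE = {
    ("Alice", "Engineer"),
    ("Bob", "Product Manager"),
    ("Carol", "Designer"),
}


def engineer(employee):
    # Roughly: Engineer(name:) :- Employee(name:, role: "Engineer");
    return {name for (name, role) in employee if role == "Engineer"}


def product_manager(employee):
    # A second small rule, written without knowing how it will be reused.
    return {name for (name, role) in employee if role == "Product Manager"}


def engineers_and_product_managers(employee):
    # A "table" built from two smaller derived relations.
    return engineer(employee) | product_manager(employee)


result = engineers_and_product_managers(EMPLOYEE)
print(sorted(result))
```

Each rule is usable on its own or as an input to another rule, which is the property being described; a real Datalog engine would additionally handle recursion and evaluate rules to a fixed point, which this sketch does not attempt.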

replies(3): >>42159145 #>>42159449 #>>42159858 #
1. cess11 ◴[] No.42159449[source]
Right, so that's what they claim, that you'll get small reusable pieces.

But: "Logica compiles to SQL".

With the caveat that it only kind of does, since it seems constrained to three database engines: probably the one they optimise the output to perform well on, one where it usually doesn't matter, and one that's kind of mid performance-wise anyway.

In light of that quote it's also weird that they mention that they are able to run the SQL they compiled to "in interactive time" on a rather large dataset, which they supposedly already could with SQL.

Admittedly I'm not very good with Datalog and have mostly used Prolog, but to me it doesn't look much like a Datalog. Predicates seem to be variadic with named parameters, making variables implicit at the call site, so to understand a complex predicate you need to hop away and look at how the composite predicates are defined to understand what they return. Maybe I misunderstand how it works, but at first glance that doesn't look particularly attractive to me.

Can you put arithmetic in the head of clauses in Datalog proper? As far as I can remember, that's not part of the language. To me it isn't obvious what this is supposed to do in this query language.

replies(1): >>42159693 #
2. aseipp ◴[] No.42159693[source]
For the record, I don't use Logica myself so I'm not familiar with every design decision or feature -- I'm not a Python programmer. I'm speaking about Datalog in general.

> making variables implicit at the call site

What example are you looking at? The NewsData example for instance seems pretty understandable to me. It seems like for any given predicate you can either take the implicit name of the column or you can map it onto a different name e.g. `date: date_num` for the underlying column on gdelt-bq.gdeltv2.gkg.

Really it just seems like a way to make the grammar less complicated; the `name: foo` syntax is their way of expressing 'AS' clauses, and `name:` is just a shorthand for `name: name`.
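As a rough illustration of that reading (my interpretation of the syntax, not how Logica's compiler actually works), the toy Python helper below expands named arguments into a SQL select list. `expand_named_args` is a hypothetical function invented for this sketch, and `None` is used here to stand for the bare `name:` form.

```python
def expand_named_args(args):
    """Render named arguments as a SQL select list.

    `args` maps a column name to the variable it is bound to. A value of
    None stands for the bare `name:` form, treated as `name: name`.
    This is a hypothetical helper, not part of Logica.
    """
    parts = []
    for column, variable in args.items():
        alias = column if variable is None else variable
        parts.append(f"{column} AS {alias}")
    return ", ".join(parts)


# The bare `name:` form behaves like the explicit `name: name`.
shorthand = expand_named_args({"name": None})
explicit = expand_named_args({"name": "name"})

# `date: date_num` binds the underlying column to a different name.
renamed = expand_named_args({"date": "date_num"})

print(shorthand)
print(renamed)
```

Under this reading the shorthand and the explicit form produce the same select list, and only the renaming case changes the alias.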

> In light of that quote it's also weird that they mention that they are able to run the SQL they compiled to "in interactive time" on a rather large dataset, which they supposedly already could with SQL.

The query in question is run on BigQuery (which IIRC was the original and only target database for Logica), and in that setup you might do a query over 4TB of data but get a response in milliseconds due to partitioning, column compression, parallel aggregation, etc. This is actually really common for many queries. So, in that kind of setup the translation layer needs to be fast so it doesn't spoil the benefit for the end user. I think the statement makes complete sense, tbh. (This also probably explains why they wrote it in Python, so you could use it in Jupyter notebooks hooked up to BigQuery.)

replies(1): >>42162228 #
3. cess11 ◴[] No.42162228[source]
They define a NewsData/5, but use a NewsData/2.

Are you aware of any SQL transpilers that spend so much time transpiling that you get irritated? I'm not.

replies(1): >>42165693 #
4. aseipp ◴[] No.42165693{3}[source]
Ah, I see what you mean. I'm not sure predicates like NewsData can actually be overloaded by arity; I'd have to check the docs. It mostly just seems like a shorter way to write the predicate with unbound variables.

> Are you aware of any SQL transpilers that spend so much time transpiling that you get irritated? I'm not.

Again, when you are running a tool on something that returns results in ~millisecond time, it is important that the tool does not spoil that. Even 100-200ms is noticeable when you're typing things out. They could have worded it differently; it's probably just typical "A programmer wrote these docs" stuff, so it's just bad copy. A dedicated technical writer would probably do something different.