
171 points voat | 1 comments | | HN request time: 0s | source
thenaturalist ◴[] No.42158900[source]
I don't want to come off as overconfident, but I would be very hard-pressed to see the value of this.

At face value, I shudder at the syntax.

Example from their tutorial:

EmployeeName(name:) :- Employee(name:);

Engineer(name:) :- Employee(name:, role: "Engineer");

EngineersAndProductManagers(name:) :- Employee(name:, role:), role == "Engineer" || role == "Product Manager";

vs. the equivalent SQL:

SELECT Employee.name AS name

FROM t_0_Employee AS Employee

WHERE (Employee.role = "Engineer" OR Employee.role = "Product Manager");

SQL is much more concise, extremely easy to follow.

No weird OOP-style class instantiation for something as simple as just getting the name.

As already noted in the 2021 discussion, what's actually the killer though is adoption and, three years later, ecosystem.

SQL for analytics has come an extremely long way with the ecosystem that was ignited by dbt.

There is so much better tooling today when it comes to testing, modelling, running in memory with tools like DuckDB or Ibis, Apache Iceberg.

There is value to abstracting on top of SQL, but it does very much seem to me like this is not it.

replies(4): >>42158997 #>>42159072 #>>42159873 #>>42162215 #
aseipp ◴[] No.42159072[source]
Logica is in the Datalog/Prolog/Logic family of programming languages. It's very familiar to anyone who knows how to read it. None of this has anything to do with OOP at all and you will heavily mislead yourself if you try to map any of that thinking onto it. (Beyond that, and not specific to Logica or SQL in any way -- comparing two 3-line programs to draw conclusions is effectively meaningless. You have to actually write programs bigger than that to see the whole picture.)

Datalog is not really a query language, actually. But it is relational, like SQL, so it lets you express relations between "facts" (the rows) inside tables. But it is more general, because it also lets you express relations between tables themselves (e.g. this "table" is built from the relationship between two smaller tables), and it does so without requiring extra special case semantics like VIEWs.

Because of this, it's easy to write small fragments of Datalog programs, and then stick them together with other fragments, without a lot of planning ahead of time, meaning as a language it is very compositional. This is one of the primary reasons many people are interested in it as a SQL alternative, aside from the typical weird SQL quirks that better language design avoids (which are annoying, but not really the big picture).

replies(3): >>42159145 #>>42159449 #>>42159858 #
thenaturalist ◴[] No.42159145[source]
> but it is more general, because it also lets you express relations between tables themselves (e.g. this "table" is built from the relationship between two smaller tables), and it does so without requiring extra special case semantics like VIEWs.

If I understand you correctly, you can easily get the same with ephemeral models in dbt or CTEs generally?

> Because of this, it's easy to write small fragments of Datalog programs, and then stick it together with other fragments, without a lot of planning ahead of time, meaning as a language it is very compositional.

This can be a benefit in some cases, I guess, but how can you guarantee correctness with flexibility involved?

With SQL, I get either table or column level lineage with all modern tools, can audit each upstream output before going into a downstream input. In dbt I have macros which I can reuse everywhere.

It's very compositional while at the same time perfectly documented and testable at runtime.

Could you share a more specific example or scenario where you have seen Datalog/ Logica outperform a modern SQL setup?

Genuinely curious.

I am not at all familiar with the Logica/Datalog/Prolog world.

replies(4): >>42159326 #>>42159431 #>>42159555 #>>42160072 #
aseipp ◴[] No.42159555[source]
> If I understand you correctly, you can easily get the same with ephemeral models in dbt or CTEs generally?

You can bolt on any number of 3rd party features or extensions to get some extra thing, that goes for any tool in the world. The point of something like Datalog is that it can express a similar class of relational programs that SQL can, but with a smaller set of core ideas. "Do more with less."

> I guess, but how can you guarantee correctness with flexibility involved?

How do you guarantee the correctness of anything? How do you know any SQL query you write is correct? Well, as the author, you typically have a good idea. The point of being compositional is that it's easier to stick together arbitrary things defined in Datalog, and have the resulting thing work smoothly.

Going back to the previous example, you can define any two "tables" and then just derive a third "table" from these, using the same language features you already use to define relationships between rows. Datalog can define relations between rules (tables) and between facts (rows), all with a single syntactic/semantic concept, while SQL by default can only express relations between rows. Therefore, raw SQL is kind of "the bottom half" of Datalog, and to get the upper half you need features like CTEs, VIEWs, etc., and you need to apply them appropriately. You need more concepts to cover both the bottom and top half; Datalog covers them with one concept. Datalog also makes it easy to express things like queries on graph structures, and again, you don't need extra features like CTEs for this to happen.
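
To make the "derive a third table from two others" idea concrete, here is a minimal sketch in plain Python rather than Datalog. The relation names (employee, manages, engineer_manager) and the data are hypothetical, made up for illustration. The rule it mimics would read roughly as "a manager is an engineer manager if they manage someone whose role is Engineer".

```python
# Facts (rows) stored as sets of tuples. All names and data are hypothetical.
employee = {
    ("alice", "Engineer"),
    ("bob", "Product Manager"),
    ("carol", "Engineer"),
}
manages = {("dave", "alice"), ("erin", "bob"), ("dave", "carol")}


def engineer_manager(manages, employee):
    """Derived relation: managers with at least one engineer report.

    Built from two base relations with the same join-and-filter step
    that would relate individual rows.
    """
    return {
        (manager,)
        for (manager, report) in manages
        for (name, role) in employee
        if report == name and role == "Engineer"
    }


def engineer_manager_names(derived):
    """A further relation defined on top of the derived one."""
    return sorted(manager for (manager,) in derived)


result = engineer_manager_names(engineer_manager(manages, employee))
print(result)
```

The derived relation is consumed exactly like a base relation, which is the compositional property described above. This is only an analogy: a real Datalog engine plans and optimizes such joins itself.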

There are of course lots of tricky bits (e.g. optimization) but the general idea works very well.

> Could you share a more specific example or scenario where you have seen Datalog/ Logica outperform a modern SQL setup?

Again, Datalog is not about SQL. It's a logic programming language. You need to actually spend time doing logic programming with something like Prolog or Datalog to appreciate the class of things it can do well. It just so happens Datalog is also good for expressing relational programs, which is what you do in SQL.

Most of the times I'm doing logic programming I'm actually writing programs, not database queries. Trying to do things like analyze programs to learn facts about them (Souffle Datalog, "can this function ever call this other function in any circumstance?") or something like a declarative program as a decision procedure. For example, I have a prototype Prolog program sitting around that scans a big code repository, figures out all 3rd party dependencies and their licenses, then tries to work out whether they are compatible.
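
The "can this function ever call this other function?" question can be sketched as a transitive closure over a call graph. The Python below is a naive fixpoint loop, assuming a hypothetical set of direct-call facts. It is not how Souffle or any other engine evaluates rules, just the simplest evaluation of the two rules shown in the docstring.

```python
def can_call(calls):
    """Naive bottom-up evaluation of the Datalog-style rules:

        CanCall(a, b) :- Calls(a, b).
        CanCall(a, c) :- CanCall(a, b), Calls(b, c).

    `calls` is a set of (caller, callee) pairs.
    """
    reachable = set(calls)
    while True:
        # Extend every known path by one more direct call.
        new = {
            (a, c)
            for (a, b) in reachable
            for (b2, c) in calls
            if b == b2
        }
        if new <= reachable:  # nothing new derived: fixpoint reached
            return reachable
        reachable |= new


# Hypothetical direct-call facts extracted from some program.
calls = {
    ("main", "parse"),
    ("parse", "lex"),
    ("lex", "read_char"),
    ("main", "report"),
}
closure = can_call(calls)
print(("main", "read_char") in closure)
print(("report", "lex") in closure)
```

Production engines use more efficient strategies than recomputing the whole join on each pass, but the declarative part (the two rules) stays the same.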

It's a bit like Lisp, in the sense that it's a core formulation of a set of ideas that you aren't going to magically adopt without doing it yourself a bunch. I could show you a bunch of logic programs, but without experience all the core ideas are going to be lost and the comparison would be meaningless.

For the record, I don't use Logica with SQL, but not because I wouldn't want to. It seems like a good approach. I would use Datalog over SQL happily for my own projects if I could. The reasons I don't use Logica, for instance, are more technical than anything -- it is a Python library, and I don't use Python.

replies(1): >>42160577 #
1. kthejoker2 ◴[] No.42160577{3}[source]
CTEs aren't really an "extra" feature; they are just composable, reusable subqueries. This just adds the benefit of storing CTEs as function calls, aka table-valued functions (TVFs), which are also not really an "extra" feature.

The main advantage of any non-SQL language is its ability to more efficiently express recursion (graph / hierarchical queries) and dynamic expressions like transposition and pivots.

You can do those in SQL; it's just clunky.
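
For reference, this is roughly what a hierarchical query looks like in SQL with a recursive CTE, run here through Python's built-in sqlite3 module. The reports_to table and its rows are hypothetical, and SQLite is just a convenient stand-in; other engines differ in syntax details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical hierarchy: each row says who an employee reports to.
conn.execute("CREATE TABLE reports_to (employee TEXT, manager TEXT)")
conn.executemany(
    "INSERT INTO reports_to VALUES (?, ?)",
    [("bob", "alice"), ("carol", "bob"), ("dave", "carol"), ("erin", "alice")],
)

# Recursive CTE: start from direct managers, then repeatedly follow
# the manager's own manager until no new rows appear.
query = """
WITH RECURSIVE chain(employee, manager) AS (
    SELECT employee, manager FROM reports_to
    UNION
    SELECT chain.employee, reports_to.manager
    FROM chain
    JOIN reports_to ON chain.manager = reports_to.employee
)
SELECT manager FROM chain WHERE employee = ? ORDER BY manager
"""
managers = [row[0] for row in conn.execute(query, ("dave",))]
print(managers)
```

The base case and the recursive step have to be packed into one CTE joined by UNION, which is arguably the clunky part. A Datalog-style language would state the same thing as two short rules.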