61 points captaintobs | 9 comments
1. whinvik ◴[] No.41853594[source]
Can someone who understands it explain what dbt is and how it is used? I hear a lot about it but I just haven't figured out what it is useful for.
replies(4): >>41853616 #>>41853867 #>>41853925 #>>41855656 #
2. bitlad ◴[] No.41853616[source]
I am not sure if it is that popular these days. A couple of years ago it was pretty popular.
replies(2): >>41854582 #>>41854808 #
3. tiew9Vii ◴[] No.41853867[source]
Some opinionated conventions around defining templated SQL queries (in .sql files, with configuration in YAML files) for ETL.

Then it provides additional tooling around that: GUIs, governance, everything your average large corporate asks for.

4. gkapur ◴[] No.41853925[source]
Basically, people are constantly calculating metrics based on existing tables. Think of something as simple as a moving average or the sum of two separate columns in a table. Once upon a time you would set up a cron job and populate these every day with a SQL query in some Python or Perl script.

dbt introduced a language for managing these “metrics” at scale, including the ability to use variables and more complex templates (Jinja).

Then you do dbt run (https://docs.getdbt.com/reference/commands/run) and kapow, the metric is populated in your database.
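
To make that concrete, here is a rough sketch of the idea in plain Python, not dbt's actual API. A templated SELECT (the "model") is rendered with variables and materialized as a table. Assumptions: `string.Template` stands in for Jinja, SQLite stands in for the warehouse, and `run_model`, `raw_orders`, and `order_totals` are made-up names.

```python
import sqlite3
from string import Template

# Hypothetical "model": a templated SELECT. string.Template stands in for Jinja.
MODEL_SQL = Template(
    "SELECT order_id, $price_col + $shipping_col AS order_total FROM $source"
)


def run_model(conn, name, template, **variables):
    """Render the template and materialize the result as a table."""
    select_sql = template.substitute(**variables)
    conn.execute(f"DROP TABLE IF EXISTS {name}")
    conn.execute(f"CREATE TABLE {name} AS {select_sql}")
    return select_sql


conn = sqlite3.connect(":memory:")  # in-memory SQLite as a stand-in warehouse
conn.execute("CREATE TABLE raw_orders (order_id INTEGER, price REAL, shipping REAL)")
conn.executemany(
    "INSERT INTO raw_orders VALUES (?, ?, ?)",
    [(1, 10.0, 2.5), (2, 20.0, 0.0)],
)

# Roughly the "dbt run" step: rebuild the derived table from its definition.
run_model(
    conn,
    "order_totals",
    MODEL_SQL,
    price_col="price",
    shipping_col="shipping",
    source="raw_orders",
)
rows = conn.execute(
    "SELECT order_id, order_total FROM order_totals ORDER BY order_id"
).fetchall()
print(rows)
```

The point is only that the metric lives as a versionable definition that gets rebuilt on demand, instead of as a query buried in a cron script.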

More broadly, dbt did two other things: 1. It pushed the paradigm from ETL to ELT (so stick all the data in your warehouse and then transform it, rather than transform it at extraction time). 2. It created the concept of an “analytics engineer” (previously known as guy who knows SQL or business analyst).
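
A minimal sketch of the ELT ordering, assuming SQLite as a stand-in warehouse and hypothetical `raw_payments` / `payments` tables: the source rows are loaded untouched first, and the cleanup happens afterwards in SQL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Extract + Load: land the source rows as-is, everything as text.
# (Table and column names are made up for illustration.)
conn.execute("CREATE TABLE raw_payments (id TEXT, amount_cents TEXT, status TEXT)")
conn.executemany(
    "INSERT INTO raw_payments VALUES (?, ?, ?)",
    [("p1", "1250", "SUCCEEDED"), ("p2", "900", "failed"), ("p3", "300", "succeeded")],
)

# Transform: done later, in SQL, inside the warehouse.
conn.execute(
    """
    CREATE TABLE payments AS
    SELECT id,
           CAST(amount_cents AS INTEGER) / 100.0 AS amount,
           LOWER(status) AS status
    FROM raw_payments
    """
)

paid = conn.execute(
    "SELECT SUM(amount) FROM payments WHERE status = 'succeeded'"
).fetchone()[0]
raw_count = conn.execute("SELECT COUNT(*) FROM raw_payments").fetchone()[0]
print(paid, raw_count)
```

Because the raw table is kept, the transform can be changed and rerun without going back to the source system.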

replies(1): >>41854395 #
5. ◴[] No.41854395[source]
6. riku_iki ◴[] No.41854582[source]
Sounds like a typical hype-tech lifecycle.
7. jburbank ◴[] No.41854808[source]
The hype may have gone down, but its usage is good. It's used where I work. It has a Slack channel that's pretty busy.
replies(1): >>41855681 #
8. christoff12 ◴[] No.41855656[source]
I built the first half of my career as "a guy who knows SQL" (and Excel macros but I digress). I then rode the early wave of Analytics Engineering.

dbt is kinda like Vite (dbt = data build tool) for folks working with data warehouses. Their biggest contribution was a mindset shift that applied principles of the SDLC to the traditional BI/Analytics space.

Almost overnight, analysts went from building business logic in GUIs like Talend or Tableau to code-based models (SQL) checked into git repos instead. It took what Looker was doing with LookML and generalized it across the BI stack.

This shift (+ associated tooling) resulted in less brittle data pipelines, increased uptime for dashboards/reporting, and more sanity when working with more than 2-3 people in a data environment.

Imagine a situation where you're at an e-commerce company and need to reconcile orders from WooCommerce with shipments in ShipStation, returns from tickets in HubSpot, and refunds issued in Stripe. dbt simplifies the management of the relationships between these various systems.

Based on this, you can build data models that allow you and, increasingly, your business stakeholders to answer questions like "Which SKUs have seen an uptick in refunds due to reason X this quarter?" and "Where were they shipped?"
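
Here is a rough sketch of what "managing the relationships" can look like, under stated assumptions. In dbt the dependencies between models are inferred from `ref()` calls in the SQL; in this sketch they are written out by hand in a hypothetical `MODELS` dict, `graphlib` works out a build order, and SQLite stands in for the warehouse. All table names and data are made up.

```python
import sqlite3
from graphlib import TopologicalSorter

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE raw_orders  (order_id INTEGER, sku TEXT);
    CREATE TABLE raw_refunds (order_id INTEGER, reason TEXT);
    INSERT INTO raw_orders  VALUES (1, 'A'), (2, 'A'), (3, 'B');
    INSERT INTO raw_refunds VALUES (1, 'damaged'), (2, 'damaged'), (3, 'late');
    """
)

# model name -> (models it depends on, SELECT that defines it)
MODELS = {
    "refunds_by_sku": (
        {"stg_orders", "stg_refunds"},
        "SELECT o.sku, r.reason, COUNT(*) AS refunds "
        "FROM stg_orders o JOIN stg_refunds r USING (order_id) "
        "GROUP BY o.sku, r.reason",
    ),
    "stg_orders": (set(), "SELECT order_id, sku FROM raw_orders"),
    "stg_refunds": (set(), "SELECT order_id, reason FROM raw_refunds"),
}

# Build every model after the models it depends on.
graph = {name: deps for name, (deps, _) in MODELS.items()}
build_order = list(TopologicalSorter(graph).static_order())
for name in build_order:
    conn.execute(f"CREATE TABLE {name} AS {MODELS[name][1]}")

# "Which SKUs were refunded for reason X?"
damaged = conn.execute(
    "SELECT sku, refunds FROM refunds_by_sku WHERE reason = 'damaged'"
).fetchall()
print(build_order, damaged)
```

The downstream model is listed first in the dict on purpose: the build order comes from the declared dependencies, not from the order things were written in.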

Having standard abstractions means you can build metrics on top of the models, as [gkapur](https://news.ycombinator.com/item?id=41853925) mentions, such that "revenue" is the same when marketing pulls it for calculating CAC as when finance pulls it for their monthly reports, etc.
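
A small sketch of that "one definition, many consumers" idea, assuming SQLite and a made-up `orders` table. The definition used here (revenue = gross minus refunds) is purely an assumption for illustration; the point is that both teams read from the same `revenue` model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (
        order_id INTEGER, month TEXT, channel TEXT, gross REAL, refunded REAL
    );
    INSERT INTO orders VALUES
        (1, '2024-09', 'ads',     100.0,  0.0),
        (2, '2024-09', 'organic',  50.0, 50.0),
        (3, '2024-10', 'ads',      80.0, 20.0);

    -- The single shared definition (assumed: revenue = gross minus refunds).
    CREATE VIEW revenue AS
        SELECT order_id, month, channel, gross - refunded AS revenue
        FROM orders;
    """
)

# Marketing slices by channel, finance slices by month, same underlying model.
by_channel = dict(
    conn.execute("SELECT channel, SUM(revenue) FROM revenue GROUP BY channel")
)
by_month = dict(
    conn.execute("SELECT month, SUM(revenue) FROM revenue GROUP BY month")
)
print(by_channel, by_month)
```

If the definition of revenue changes, it changes in one place and both reports move together.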

9. christoff12 ◴[] No.41855681{3}[source]
dbt isn't going anywhere. It's the standard.

That said, SQLMesh and other tools are pretty interesting and I look forward to new growth in the space.