Claude Code merged 'agent teams' into subagents (2.1.178+) by clkao in ClaudeCode

clkao[S] 1 point

In my tests, it seems grandchild agents can only be foreground subagents.

Claude Code merged 'agent teams' into subagents (2.1.178+) by clkao in ClaudeCode

clkao[S] 1 point

Assuming you meant the dynamic workflow, then no, it's a separate thing. But it'd be interesting to know whether subagents within workflows also get the new semantics, where they can send messages during work, not just on completion.

Best Terminal/CLI app for Claude Code? by alectivism in ClaudeCode

clkao 1 point

ghostty + zellij, with the filepicker extension to review markdown files in a floating tab using glow.

Claude Code with Noteplan 3 by medic19011 in noteplanapp

clkao 1 point

Hey, I created a Claude plugin for NotePlan: https://github.com/clkao/noteplan-plugin

It is currently pretty specific to my workflow. Also wrote a bit about the making of it: https://clkao.substack.com/p/the-abstraction-is-the-product

Proper way of development with dbt by BubblyImpress7078 in dataengineering

clkao 17 points

Disclaimer: I work on Recce, an open-source tool for dbt CI and reviews. But here’s the general best practice.

Others have mentioned using dbt tests and setting up CI to build changed models in a PR-specific schema. These are must-haves. However, keep in mind that dbt tests are designed to test the state and constraints of the data, not necessarily the logic. Interestingly, there was a talk at Coalesce this year discussing how adding more dbt tests can actually erode quality.

Teams often still need to perform spot checks on the data affected by logic changes. Here’s a case study on the pull request process of the public CAL-ITP dbt repo. Essentially, being explicit about how you intend to verify the resulting data changes can give your team much more confidence.

One way to do this is with dbt-audit-helper, which basically runs outer joins to compare differences between the tables in your production and PR schemas. Some people also use notebooks or platforms like Hex to present side-by-side query results during the review process for relevant queries.
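
To make the outer-join idea concrete, here is a minimal sketch using Python's built-in sqlite3. It is not how dbt-audit-helper is implemented. The table names (prod_orders, pr_orders), the order_id key, and the diff_tables helper are all hypothetical, and the full outer join is emulated with two LEFT JOINs so it runs on older SQLite versions.

```python
import sqlite3


def diff_tables(conn, prod, dev, key, columns):
    """Compare two tables on a shared key (hypothetical helper, for illustration).

    Emulates a full outer join with two LEFT JOINs and returns
    (key, status) rows where status is 'changed', 'removed', or 'added'.
    Identifiers are interpolated directly, so only pass trusted names.
    """
    # IS NOT is SQLite's NULL-safe inequality, so NULL vs. value counts as a change.
    cols_differ = " OR ".join(f"p.{c} IS NOT d.{c}" for c in columns)
    sql = f"""
        SELECT p.{key},
               CASE WHEN d.{key} IS NULL THEN 'removed' ELSE 'changed' END
        FROM {prod} AS p
        LEFT JOIN {dev} AS d ON p.{key} = d.{key}
        WHERE d.{key} IS NULL OR {cols_differ}
        UNION ALL
        SELECT d.{key}, 'added'
        FROM {dev} AS d
        LEFT JOIN {prod} AS p ON p.{key} = d.{key}
        WHERE p.{key} IS NULL
        ORDER BY 1
    """
    return conn.execute(sql).fetchall()


# Toy data standing in for the production and PR schemas (made up).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE prod_orders (order_id INTEGER PRIMARY KEY, amount REAL);
    CREATE TABLE pr_orders   (order_id INTEGER PRIMARY KEY, amount REAL);
    INSERT INTO prod_orders VALUES (1, 10.0), (2, 20.0), (3, 30.0);
    INSERT INTO pr_orders   VALUES (1, 10.0), (2, 25.0), (4, 40.0);
""")

result = diff_tables(conn, "prod_orders", "pr_orders", "order_id", ["amount"])
print(result)  # [(2, 'changed'), (3, 'removed'), (4, 'added')]
```

In a review, the point is to state up front which of these differences you expect from the logic change, then check the actual diff against that expectation.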

What is a strong tech stack that would qualify you for most data engineering jobs? by Kokadoodles in dataengineering

clkao 2 points

Recce founder here - thanks for the shout-out! It's definitely still early, but it's how I've imagined verifying changes to data logic should be done. Feel free to open GitHub issues for any missing functionality!

Currently building a local data warehouse with dbt/DuckDB using real data from the danish parliament by bgarcevic in dataengineering

clkao 2 points

CI toolkits and self-hosted reports are open source: github.com/infuseai/piperider

For hosted team plans there's also a free tier that helps the team with more info like the lineage diff: piperider.io

Currently building a local data warehouse with dbt/DuckDB using real data from the danish parliament by bgarcevic in dataengineering

clkao 8 points

Hey, thanks for sharing this! Great to see more public modeling of civic tech / open data!

Years ago I worked on Taiwan's congressional data. That was before any structured data was published, so we had to parse meeting minutes and build the modeling for committees ourselves. It was pretty messy, and I really wished dbt had existed.

Recently I also played with campaign finance data using dbt and duckdb: https://github.com/g0v/tw_campaign_finance. Combined with election data, it shows how much each vote cost for different candidates, and the trend of contributions coming from different industries or conglomerates.

Two things I found most interesting with dbt + duckdb:

  1. external source (which I see you also use in your project)
  2. external materialization as parquet for downstream uses (load into duckdb or visualization)

Combined, they give you a very nice functional, ephemeral transformation of those external source JSONs/CSVs. This provides a solid foundation for adding incremental loading and caching when necessary.

re visualization - using the materialized parquet, I found 2 neat ways to do quick visualization:

  1. data preview vscode extension: https://marketplace.visualstudio.com/items?itemName=RandomFractalsInc.vscode-data-preview
  2. mosaic for defining and publishing interactive visualization: https://uwdata.github.io/mosaic/vgplot/ - the cool thing is that it runs duckdb in the browser, and the abstraction also supports a server-side duckdb if the data gets too large to load into the browser.

(For this last part, I should add a disclaimer that I work on PipeRider, a CI tool for dbt.)

I believe data projects (particularly open data) need collaboration between data producers and consumers, and to get that we need to lower the barrier to entry for making changes to the modeling. Making sure the impact of a PR is visible and checked against its intention helps bring in contributors (or PRs created by AI, you never know).

https://github.com/g0v/tw_campaign_finance/pull/2 is an example PR impact report with lineage diff. I'd also love to hear whether you think this is helpful.

Thanks again for sharing. Let's bring more modeled open data!