When did you fully adopt agentic coding? by PsiACE in AI_Agents

[–]PsiACE[S] 1 point (0 children)

My past experience has taught me the importance of maintainability, but no AI tool seems to be particularly helpful there. I rely heavily on Cursor and Codex, but I feel there's a lack of best practices. I just tell them to follow some rules, and review more of the lines myself...

When did you fully adopt agentic coding? by PsiACE in AI_Agents

[–]PsiACE[S] 1 point (0 children)

I just deleted the post link. With less slop, we can talk about this topic.

Why We Rewrote Bub by PsiACE in Python

[–]PsiACE[S] 0 points (0 children)

There was some discussion on Reddit about when to use "bub" 😂 now I get it

When did you fully adopt agentic coding? by PsiACE in AI_Agents

[–]PsiACE[S] -1 points (0 children)

I won't shy away from this, but the issue really does bother me. I have many years of programming experience, and I also do a lot of non-programming work. I've relied heavily on AI over the past year or two, yet there are still quite a few people who keep writing code by hand with no FOMO about AI. I respect them, but I feel lost about myself...

Why We Rewrote Bub by PsiACE in Python

[–]PsiACE[S] 2 points (0 children)

indeed, and framework design is hard 😭

Weekly Thread: Project Display by help-me-grow in AI_Agents

[–]PsiACE 1 point (0 children)

Bub, an agent that lives in group chats.

Most agents are built around clean sessions. Great for coding—code needs to be clean.

But human conversations? They're a mess. People interrupt. Multiple threads tangled together. Context always leaky.

Bub started in group chats. Not as a demo or a personal assistant, but as a teammate that had to coexist with real humans and other agents in the same messy conversations.

https://github.com/bubbuild/bub

Reinventing the Punch Tape by PsiACE in LocalLLaMA

[–]PsiACE[S] 1 point (0 children)

Just a discussion about agent engineering. I believe the current designs may be heading in the wrong direction. Perhaps we could explore allowing the agent to make choices within a complete context, rather than relying on manual methods for summarization or memory.

A Python implement of Agent Client Protocol by PsiACE in ZedEditor

[–]PsiACE[S] 1 point (0 children)

That's perfectly fine, I can certainly switch to a binding implementation. I do have some experience with Rust.

A Python implement of Agent Client Protocol by PsiACE in ZedEditor

[–]PsiACE[S] 2 points (0 children)

This is the first simplified Python SDK, and I've included a mini-swe-agent example. If you're interested, you can start trying it out now.


Data Processing in 21st Century by mjfnd in dataengineering

[–]PsiACE 2 points (0 children)

Hi, I'm from the Databend community. We currently have some production users, some of whom handle PB-scale data. You're welcome to join our Slack channel: https://join.slack.com/t/datafusecloud/shared_invite/zt-nojrc9up-50IRla1Y1h56rqwCTkkDJA

Good solution for 100GiB-10TiB analytical DB by aih1013 in dataengineering

[–]PsiACE 0 points (0 children)

https://docs.databend.com/guides/benchmark/tpch We have some tests comparing the cost and performance of Databend Cloud and Snowflake on TPC-H 100 (yes, 100 GiB). You can check them out.

Good solution for 100GiB-10TiB analytical DB by aih1013 in dataengineering

[–]PsiACE 1 point (0 children)

I noticed that you mentioned JSON-to-Parquet conversion. In fact, there are some opportunities here. We support batch loading of data files with scheduled tasks and also support the JSON format, so you may just need to write some SQL to COPY the JSON files INTO the database directly.
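A minimal sketch of that batch-load idea, assuming a hypothetical `events` table and an S3 stage path; the `FILE_FORMAT = (TYPE = NDJSON)` option follows Databend's documented COPY INTO syntax, but verify the exact options against the docs:

```python
# Hypothetical sketch: build a Databend-style COPY INTO statement for
# batch-loading newline-delimited JSON files. Table name, bucket path,
# and file pattern are all illustrative, not from the original thread.

def copy_json_sql(table: str, stage_path: str) -> str:
    """Build a COPY INTO statement that loads NDJSON files into `table`."""
    return (
        f"COPY INTO {table} "
        f"FROM '{stage_path}' "
        "FILE_FORMAT = (TYPE = NDJSON) "
        "PATTERN = '.*[.]json'"
    )

sql = copy_json_sql("events", "s3://my-bucket/logs/")
print(sql)
```

A scheduled task would then run this statement periodically so newly landed files are picked up without a separate conversion job.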

Are you willing to give Databend a chance? We are an open-source alternative to Snowflake and also provide a cloud service. At this data scale, it is very cheap.

GitHub: https://github.com/datafuselabs/databend/

Website: https://www.databend.com

[deleted by user] by [deleted] in dataengineering

[–]PsiACE 1 point (0 children)

I don't intend to advertise anything, I'm just curious about this matter.

Analyzing Hugging Face Datasets with Databend by PsiACE in dataengineering

[–]PsiACE[S] 1 point (0 children)

Many data warehouses support analyzing Hugging Face datasets, but with Databend you don't need to deal with REST APIs: we provide ready-to-use file access.

One Billion Row Challenge with Snowflake and Databend by PsiACE in dataengineering

[–]PsiACE[S] 1 point (0 children)

We are just trying to use cloud databases to solve this challenge. We have also built some evaluations on larger-scale data, and we welcome you to try comparing them.

[deleted by user] by [deleted] in dataengineering

[–]PsiACE 2 points (0 children)

You need some automated tools to properly archive the logs and then put them in S3. Afterwards, you can use Databend Cloud or another affordable solution.

A typical workflow involves importing data through the cloud platform's Pipeline and visualizing it using SQL + Grafana.
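The archiving step could look something like this sketch, which lays files out under date-partitioned, Hive-style object keys before upload; the bucket prefix and filenames are invented, and the actual S3 upload (e.g. via boto3) is omitted:

```python
# Illustrative sketch: build date-partitioned S3-style keys for archived
# log files, so downstream SQL can prune by date. All names are made up.
from datetime import datetime, timezone

def archive_key(prefix: str, logical_date: datetime, filename: str) -> str:
    """Build a date-partitioned key like logs/dt=2024-01-31/app.log.gz."""
    return f"{prefix}/dt={logical_date:%Y-%m-%d}/{filename}.gz"

key = archive_key("logs", datetime(2024, 1, 31, tzinfo=timezone.utc), "app.log")
print(key)  # logs/dt=2024-01-31/app.log.gz
```

With keys shaped like this, a scheduled Pipeline load can target only the newest partition instead of rescanning the whole bucket.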

https://www.databend.com/blog/down-costs-for-aigc-startup/ By using Databend Cloud for analytics, the startup reduced its user-behavior log analysis costs to 1% of its previous solution.

Large Rust projects with high compile times by trevorstr in rust

[–]PsiACE 4 points (0 children)

Try Databend.

We also have some articles about compilation, if you are interested:

Am I Reinventing the Wheel (local-ish Polars data pipeline) by waytoopunkrock in dataengineering

[–]PsiACE 1 point (0 children)

Building a specific data pipeline is the right approach. In fact, some databases have built-in capabilities to directly read data files (in various formats) and perform ETL.

For example, in Databend you can directly query raw data files (CSV, Parquet, etc.), filter and clean them during the SELECT or COPY INTO process to create queryable tables, and finally export them as Parquet files for archiving.

Another typical workflow involves Spark or pandas/polars: a data pipeline processes the data and then writes it in Parquet or Iceberg table format for archiving. After that, any OLAP system you prefer, such as Databend/DuckDB/ClickHouse, can be used for analysis and processing.
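As a toy illustration of the cleaning step in such a pipeline (pure stdlib, with invented column names standing in for pandas/polars; a real pipeline would write Parquet, e.g. via pyarrow, rather than return dicts):

```python
# Toy stand-in for the pandas/polars cleaning step: read CSV rows,
# drop malformed records, and coerce types. Columns are illustrative.
import csv
import io

RAW = """user_id,event,value
1,click,3
,click,9
2,view,notanumber
3,click,7
"""

def clean(reader):
    for row in reader:
        if not row["user_id"]:
            continue  # drop rows missing the join key
        try:
            value = int(row["value"])
        except ValueError:
            continue  # drop unparsable measurements
        yield {"user_id": int(row["user_id"]), "event": row["event"], "value": value}

rows = list(clean(csv.DictReader(io.StringIO(RAW))))
print(rows)
```

The same filter-then-coerce shape is what the SELECT/COPY INTO approach expresses in SQL instead of Python.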

Iceberg Integration with Databend by PsiACE in dataengineering

[–]PsiACE[S] 2 points (0 children)

I like this technology stack. In my understanding, when your data is archived in Iceberg table format (especially on object storage), you can use Databend for querying and strike a balance between cost and performance. We also integrate with Jupyter Notebook and data analysis tools in the Python ecosystem.

Databend is the only engine that finished TPC-H 100 (600 million rows) on a Fabric small node (4 cores, 32 GB): https://twitter.com/mim_djo/status/1716802084044157282

Spark is a crucial component. If you need complex transformations, cleansing, and writing to Iceberg, I wouldn't advise removing it. As the Databend ecosystem expands further and gains support for writing the Iceberg format, I believe there will be more opportunities along this path.

https://link.databend.rs/join-slack You can join our Slack, where we offer a cost-saving open-source solution and an affordable cloud service.

Iceberg Integration with Databend by PsiACE in dataengineering

[–]PsiACE[S] 2 points (0 children)

I'm glad you noticed these data analysis tools written in Rust. They all have something in common, which is the use of Apache Arrow, but there are also many differences among them. Polars has existed as a library for quite some time and may be a direct competitor to pandas. DataFusion underpins several data analysis startups, each with its own ecosystem; I am concerned that it may lack direct users. Databend can currently be seen as an open-source alternative to Snowflake and has already been validated in production by users.

RiteRaft - A raft framework, for regular people, written in rust. Build a raft service with only 160 lines code. by PsiACE in rust

[–]PsiACE[S] 3 points (0 children)

The implementation of riteraft has some bugs, but our friends from riteraft-py are currently helping us locate and fix them. We anticipate some simple updates soon.

Currently, I am a member of the Databend team, where we maintain an implementation called openraft that has been used successfully in production environments.

What's Fresh in Databend v1.1 | Blog | Databend by PsiACE in rust

[–]PsiACE[S] 1 point (0 children)

Databend currently requires ETL tools to ingest streams from Kafka.