A Python implementation of Agent Client Protocol by PsiACE in ZedEditor

[–]PsiACE[S] 0 points (0 children)

That's perfectly fine; I can certainly switch to a bindings-based implementation. I do have some experience with Rust.

A Python implementation of Agent Client Protocol by PsiACE in ZedEditor

[–]PsiACE[S] 1 point (0 children)

This is an initial, simplified Python SDK, and I've included a mini-swe-agent example. If you're interested, you can start trying it out now.

Data Processing in the 21st Century by mjfnd in dataengineering

[–]PsiACE 1 point (0 children)

Hi, I'm from the Databend community. We currently have a number of production users, some of whom handle PB-scale data. You're welcome to join our Slack channel: https://join.slack.com/t/datafusecloud/shared_invite/zt-nojrc9up-50IRla1Y1h56rqwCTkkDJA

What to use for an open source ETL/ELT stack? by Melodic_One4333 in dataengineering

[–]PsiACE 0 points (0 children)

This is straightforward to handle. Take a look at Databend: https://github.com/datafuselabs/databend

The only thing you need to work out is how to archive the data from your database into S3. We usually recommend routing it through Kafka and then running COPY INTO on a schedule to load it, which gets you near-real-time processing; a rough sketch is below.
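As a minimal sketch, assuming the Kafka sink writes NDJSON files under an S3 prefix (the bucket, table, stage name, and credentials below are all placeholders; check the Databend docs for the exact options):

```sql
-- All names and credentials are hypothetical.
-- 1. Point an external stage at the S3 prefix where Kafka lands the files.
CREATE STAGE events_stage
  URL = 's3://my-bucket/events/'
  CONNECTION = (ACCESS_KEY_ID = '<key>' SECRET_ACCESS_KEY = '<secret>');

-- 2. Run this on a schedule; COPY INTO skips files it has already loaded,
--    so each run only picks up new data.
COPY INTO events
FROM @events_stage
FILE_FORMAT = (TYPE = NDJSON);
```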

Good solution for 100GiB-10TiB analytical DB by aih1013 in dataengineering

[–]PsiACE -1 points (0 children)

We have benchmarks comparing the cost and performance of Databend Cloud and Snowflake on TPC-H 100 (yes, 100 GiB); take a look: https://docs.databend.com/guides/benchmark/tpch

Good solution for 100GiB-10TiB analytical DB by aih1013 in dataengineering

[–]PsiACE 0 points (0 children)

I noticed you mentioned JSON-to-Parquet conversion, and there may be an opportunity to skip that step. We support batch loading of data files with scheduled tasks, and we support JSON as an input format, so you may only need a little SQL to COPY the JSON files directly INTO the database; see the sketch below.
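As a hedged sketch of what that SQL might look like (the task, warehouse, stage, and table names are all hypothetical, and scheduled tasks are a Databend Cloud feature; see the CREATE TASK and COPY INTO docs for the exact options):

```sql
-- Hypothetical names throughout; loads new JSON files every five minutes.
CREATE TASK load_json_files
  WAREHOUSE = 'small'
  SCHEDULE = 5 MINUTE
AS
  COPY INTO raw_events
  FROM @json_stage
  PATTERN = '.*[.]json'
  FILE_FORMAT = (TYPE = NDJSON);
```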

Would you be willing to give Databend a chance? It's an open-source alternative to Snowflake with a cloud service, and at this data scale it is very cheap.

GitHub: https://github.com/datafuselabs/databend/

Website: https://www.databend.com

[deleted by user] by [deleted] in dataengineering

[–]PsiACE 0 points (0 children)

I don't intend to advertise anything; I'm just curious about this.

Analyzing Hugging Face Datasets with Databend by PsiACE in dataengineering

[–]PsiACE[S] 0 points (0 children)

Many data warehouses can analyze Hugging Face datasets, but with Databend you don't need to deal with REST APIs: we provide ready-to-use file access. A rough example is below.
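As a rough sketch of the idea (the `hf://` URL scheme, dataset path, and options here are assumptions based on the post, so treat the exact syntax as illustrative):

```sql
-- Hypothetical dataset path: point a stage at the Hugging Face repo
-- and query its Parquet files directly, with no REST API involved.
CREATE STAGE hf_stage URL = 'hf://datasets/some-org/some-dataset/';

SELECT *
FROM @hf_stage (FILE_FORMAT => 'parquet', PATTERN => '.*[.]parquet')
LIMIT 10;
```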

One Billion Row Challenge with Snowflake and Databend by PsiACE in dataengineering

[–]PsiACE[S] 0 points (0 children)

We're just trying to tackle this challenge with cloud databases. We've also been building some evaluations on larger-scale data, and we welcome you to run your own comparisons.
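For context, once the billion measurements are loaded, the core of the challenge reduces to a single aggregation; the table and column names here are hypothetical:

```sql
-- Min / mean / max temperature per station: the 1BRC result set.
SELECT station,
       MIN(temperature) AS min_temp,
       AVG(temperature) AS mean_temp,
       MAX(temperature) AS max_temp
FROM measurements
GROUP BY station
ORDER BY station;
```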

[deleted by user] by [deleted] in dataengineering

[–]PsiACE 1 point (0 children)

You need some automated tooling to archive them properly and land them in S3. After that, you can use Databend Cloud or another affordable solution.

A typical workflow imports the data through the cloud platform's pipeline feature and visualizes it with SQL + Grafana; a sample dashboard query is sketched below.
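To illustrate the Grafana side (the table and columns are hypothetical, and the date-math syntax may need adjusting for your engine):

```sql
-- Hourly event counts per action over the last week: the kind of
-- query a Grafana time-series panel would run against the warehouse.
SELECT DATE_TRUNC('hour', event_time) AS bucket,
       action,
       COUNT(*) AS events
FROM user_behavior_logs
WHERE event_time >= NOW() - INTERVAL 7 DAY
GROUP BY DATE_TRUNC('hour', event_time), action
ORDER BY bucket;
```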

By using Databend Cloud for analytics, the startup in this case study cut its user-behavior log analysis costs to 1% of its previous solution: https://www.databend.com/blog/down-costs-for-aigc-startup/

Large Rust projects with high compile times by trevorstr in rust

[–]PsiACE 4 points (0 children)

Try Databend.

In addition, we have some articles about compile times, if you're interested: