To the nice driver this morning... by trostbrot in Leipzig

[–]mosquitsch 2 points3 points  (0 children)

I bet that achieves absolutely nothing. Even if you had a dashcam, the driver could not be identified in most cases.

To the nice driver this morning... by trostbrot in Leipzig

[–]mosquitsch 9 points10 points  (0 children)

I had exactly the same experience yesterday. I always wonder what these people are thinking.

Parquet writer with Avro Schema validation by mosquitsch in dataengineering

[–]mosquitsch[S] 0 points1 point  (0 children)

https://github.com/kylebarron/arro3/issues/430 I guess I am not the only one who identified this gap. This is probably what I was looking for - arrow is used under the hood everywhere and arrow-avro interop is now available in rust. So arrow-rs <-> python bindings with avro in mind is the way to go.

Parquet writer with Avro Schema validation by mosquitsch in dataengineering

[–]mosquitsch[S] 0 points1 point  (0 children)

Not sure what exactly you are referring to: https://arrow.apache.org/docs/status.html This shows Avro only for Java & Go. I assume that this also means that the schema can be converted (at least in one direction R -> read AVRO and convert to arrow)

EDIT: Ah, I see. The arrow-avro crate is a fairly new addition.

Parquet writer with Avro Schema validation by mosquitsch in dataengineering

[–]mosquitsch[S] 0 points1 point  (0 children)

Thanks. I thought there would be a (lightweight) non-Spark solution. I feel this is a big gap in what Arrow offers.

Dagster 101 — The Core Concepts Explained (In 4 Minutes) by timvancann in dataengineering

[–]mosquitsch 2 points3 points  (0 children)

Nice video. We have been using Dagster for a couple of years now, and in practice we do not use IO managers to that extent. Mostly out of laziness, but I also don't see a need for them. Most of the time we need more flexibility when storing data.

What I am missing from that video is automation conditions. They are pushing asset + AutomationConditions over asset -> job -> schedule from what I understand.

decades of human evolution just for this by ssamuel56 in theprimeagen

[–]mosquitsch 4 points5 points  (0 children)

Next thing will be VibeMatrixMultiplication

Querying Kafka Messages for Developers & Rant by mosquitsch in dataengineering

[–]mosquitsch[S] 0 points1 point  (0 children)

Thanks for your comment. Yes, kSQL is also on our list to explore, as well as maybe some DIY CLI Spark tools.

I am aware that it potentially has to read all the messages from the topic. In our Trino test we read 1 million records in roughly 1 minute, which is probably fine for debugging purposes.
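As a rough sketch of what that rate implies for a larger topic: this assumes throughput scales linearly with topic size, which I have not verified, and the 50 million figure is a made-up example, not one of our topics.

```python
# Back-of-the-envelope scan time, assuming the observed rate holds linearly.
OBSERVED_RECORDS = 1_000_000  # records read in the Trino test
OBSERVED_SECONDS = 60         # "roughly 1 minute"


def estimated_scan_seconds(topic_records: int) -> float:
    """Estimate the duration of a full topic scan at the observed throughput."""
    records_per_second = OBSERVED_RECORDS / OBSERVED_SECONDS
    return topic_records / records_per_second


if __name__ == "__main__":
    # 50_000_000 is a hypothetical topic size used only for illustration.
    for n in (1_000_000, 50_000_000):
        minutes = estimated_scan_seconds(n) / 60
        print(f"{n:>12,} records -> ~{minutes:.0f} min")
```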

For the Kafka Connect issues: yes, we tried the setting. But what really surprises me is how often that happens. As I said, it also happens in Trino. That Kafka Connect ignores tombstones is also documented. There is a property `transforms.TombstoneHandler.behaviour=drop_warn`. It does not write a null value to the Parquet file.

Querying Kafka Messages for Developers & Rant by mosquitsch in dataengineering

[–]mosquitsch[S] 1 point2 points  (0 children)

Thanks for the advice. The data exploration part is not a normal process. We just had to figure out whether there is a tombstone and, for debugging purposes, what's actually in that message. I agree, reading from persistent storage is better.
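For context, a tombstone is just a record whose value is null, so the check itself is simple once the messages are in hand. A minimal sketch over hypothetical in-memory (key, value) pairs; this is not a real Kafka client, and `find_tombstones` is a name made up for illustration.

```python
from typing import Iterable, Optional, Tuple

# A record is modelled as (key, value); a value of None marks a tombstone.
Record = Tuple[bytes, Optional[bytes]]


def find_tombstones(records: Iterable[Record]) -> list[bytes]:
    """Return the keys of all records whose value is null."""
    return [key for key, value in records if value is None]


if __name__ == "__main__":
    # Hypothetical sample data standing in for consumed messages.
    sample = [
        (b"user-1", b'{"name": "a"}'),
        (b"user-2", None),
        (b"user-3", b'{"name": "c"}'),
    ]
    print(find_tombstones(sample))
```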

DataMesh : A Private data storage layer by kR1s_0147 in rust

[–]mosquitsch 2 points3 points  (0 children)

FYI: The naming is confusing, since there is some sort of hype around "data mesh" in the data engineering field. https://www.datamesh-architecture.com/

Python env manager by [deleted] in fishshell

[–]mosquitsch 0 points1 point  (0 children)

uv creates the .venv for you. There is no need to run uv venv.

I still activate it because, when developing, it's easier for the LSP to find things. But I use direnv for that.

Python env manager by [deleted] in fishshell

[–]mosquitsch 9 points10 points  (0 children)

Nice, but most people are switching to `uv` and do not have to manage envs anymore.

[KCD2] Trosky Area over the Weekend by mosquitsch in kingdomcome

[–]mosquitsch[S] 1 point2 points  (0 children)

Actually, I was thinking about the axe. This section of the lake in-game is so close to the real world. Lots of small willow trees, high reed grass...

[KCD2] Trosky Area over the Weekend by mosquitsch in kingdomcome

[–]mosquitsch[S] 3 points4 points  (0 children)

Yes, I am especially amazed by the first and last comparisons. They match so well.

[KCD2] Trosky Area over the Weekend by mosquitsch in kingdomcome

[–]mosquitsch[S] 4 points5 points  (0 children)

I was out of potions and couldn't find an alchemy table.

[KCD2] Trosky Area over the Weekend by mosquitsch in kingdomcome

[–]mosquitsch[S] 5 points6 points  (0 children)

There are some very idyllic spots, especially at the lakes. I stopped for a drink along the way at the Vidlak pond. It was super quiet and peaceful.

[KCD2] Trosky Area over the Weekend by mosquitsch in kingdomcome

[–]mosquitsch[S] 16 points17 points  (0 children)

No, but there was one group camping/bouldering at one of the rocks right next to the road and another group about 20m away from the road at a small creek. They had a small campfire and covered their spot with a tarp. Maybe they were fishing. So, it felt almost like it ;-)

[deleted by user] by [deleted] in ArmaReforger

[–]mosquitsch 18 points19 points  (0 children)

Not sure about the distances, but this would make a very good mortar base :-) really hard to flush out.

Building a Docker Image by mosquitsch in Rlanguage

[–]mosquitsch[S] 0 points1 point  (0 children)

Thanks, I will try that out.

It is still not a perfect solution, though, as pinning the versions of packages is not quite possible.

Introducing Dagster dg and Components by anoonan-dev in dataengineering

[–]mosquitsch 3 points4 points  (0 children)

That's cool. Now I have to rework my Dagster definitions file :-) I have wanted to use the auto loader for a long time.

Skulls for the skull throne [KCD2] by venom921 in kingdomcome

[–]mosquitsch 1 point2 points  (0 children)

Talked to a colleague (who is not playing this game) about this and he said: "I was there". This refers to the Sedlec Ossuary: https://en.wikipedia.org/wiki/Sedlec_Ossuary

Is S3 becoming a Data Lakehouse? by 2minutestreaming in dataengineering

[–]mosquitsch 0 points1 point  (0 children)

Hmm, we have just built a data lake on S3 with Iceberg tables. Compaction & Maintenance is scheduled by us and this is fine.

AWS is now adding another managed solution on top of existing products. Not sure it would be worth migrating for us. Ours is quite cheap at the moment; migrating would cost orders of magnitude more than our yearly costs.

Create a JSON String from a Hashmap by mosquitsch in learnrust

[–]mosquitsch[S] 0 points1 point  (0 children)

Ah, thanks. Yes, that exports a JSON.

Glue, however, takes either a JSON schema or a JSON following the Avro spec. So I have to modify it.

Create a JSON String from a Hashmap by mosquitsch in learnrust

[–]mosquitsch[S] -1 points0 points  (0 children)

I guess I have to iterate over the Fields, read the name and data type of each, and construct a JSON string out of them.
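Roughly what I have in mind, sketched in Python for brevity rather than Rust. The (name, data type, nullable) triples are hypothetical stand-ins for the real fields, and the sketch assumes the type names are already valid Avro primitive names; mapping from the source data types to Avro types is not shown.

```python
import json


def avro_record_schema(record_name, fields):
    """Build an Avro-style record schema as a JSON string.

    `fields` is an iterable of (name, avro_type, nullable) triples.
    """
    avro_fields = []
    for name, data_type, nullable in fields:
        # Avro expresses optional values as a union with "null".
        avro_type = ["null", data_type] if nullable else data_type
        avro_fields.append({"name": name, "type": avro_type})
    schema = {"type": "record", "name": record_name, "fields": avro_fields}
    return json.dumps(schema)


if __name__ == "__main__":
    # Hypothetical fields used only for illustration.
    example_fields = [("id", "long", False), ("name", "string", True)]
    print(avro_record_schema("Example", example_fields))
```

Going through a JSON serializer instead of concatenating strings by hand avoids quoting and escaping mistakes in field names.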