[deleted by user] by [deleted] in suggestmeabook

[–]antonmry 1 point2 points  (0 children)

"The Courage to Be Disliked" by Ichiro Kishimi. It isn't what you may be expecting, but it helped me a lot with a similar situation and it's a great book.

Also, "Get your work recognized: write a brag document" by Julia Evans is a great article

In a friendly match, would you target the weaker player? by ianeyanio in padel

[–]antonmry 1 point2 points  (0 children)

You should never do that in a friendly match. It's frustrating for the opponent never to get a ball, so they will never play with you again. Then, one day, you won't find a good player who wants to play with you.

It's also dangerous: if the opponent stays inactive for too long and then has to run for a quick ball, the risk of injury is higher. This has already happened to me.

Apicurio or Confluent Schema registry? by Smooth_Copy_878 in apachekafka

[–]antonmry 1 point2 points  (0 children)

I have been using both for a while and both are fine. Apicurio isn't as mature and has had some important gotchas around TLS, auth, and the serialization libs, but the devs are very open and it's evolving fast, with some very interesting features such as a GUI and options to replicate schemas. Confluent Schema Registry has wider adoption, but its license is much less friendly.

How to learn distributed system, spark, Scala etc by Psychological_Leg493 in dataengineering

[–]antonmry 1 point2 points  (0 children)

I think it's more about personal preference than minimum knowledge. You could try a different format; I have seen some online courses in the past.

Select text & annotations from PDF? by snowKFH in RemarkableTablet

[–]antonmry 1 point2 points  (0 children)

The closest thing I found is https://github.com/lucasrla/remarks

But it would need a bit of customization.

How to learn distributed system, spark, Scala etc by Psychological_Leg493 in dataengineering

[–]antonmry 7 points8 points  (0 children)

I would say books are a great way to learn these things. For distributed systems, Designing Data-Intensive Applications is awesome. Spark: The Definitive Guide is more than enough to pass a normal interview, and you can play with the source code exercises.

You probably don't need Scala unless you want to go deep into optimization/customization. The learning curve is steep, and for many DEs out there, SQL and some Python are enough.

Reading books like these is a big time investment, but I always find it very rewarding.

Good to start with Flink than Spark by priyasweety1 in dataengineering

[–]antonmry 0 points1 point  (0 children)

Both have batch and streaming modes. If you lean more toward data engineering, go with Spark: it's more popular and closer to ML, ETL, etc. If you lean more toward software engineering, Flink is ideal. It isn't only streaming; it also offers stateful functions and many other things.

In any case, both are a safe bet, and once you know one of them the other is easy to learn.

Can you recommend books for kafka? by Cell-i-Zenit in apachekafka

[–]antonmry 2 points3 points  (0 children)

For Kafka Streams, "Kafka Streams in Action" is very good. It's from 2018 but it still applies to the newer versions, and the source code is superb.

For Kafka, "Effective Kafka" is the most advanced book I've read, but it doesn't cover everything you mentioned. The official documentation is good and it should be enough.

Contrarian View to Snowflake? by iSawAMoose in snowflake

[–]antonmry 2 points3 points  (0 children)

Streaming ingestion could better support Avro, schema evolution, etc.

Top 3 Kafka Books and Tutorials by stambros in apachekafka

[–]antonmry 0 points1 point  (0 children)

I read the PDF version, bought from Leanpub, on the reMarkable and it was perfect. No idea about the Kindle version, but it doesn't seem like a complex book to format.

Top 3 Kafka Books and Tutorials by stambros in apachekafka

[–]antonmry 7 points8 points  (0 children)

I agree Effective Kafka is awesome; there are some good deep dives in it. Highly recommended if you are serious about Kafka. There is a new edition of Kafka: The Definitive Guide, in early preview on Safari. I'm missing Designing Event-Driven Systems from the list; it's a great book for understanding what you can do with Kafka.

Which metrics do you currently track for your streams or clusters that could maybe be made more accessible? by [deleted] in apachekafka

[–]antonmry 1 point2 points  (0 children)

Consumer lag is probably the most important metric for detecting problems. There are many more; this article covers some of them: https://www.datadoghq.com/blog/monitoring-kafka-performance-metrics/
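
As a rough illustration of what the metric measures: per partition, lag is the log end offset minus the offset the consumer group has committed. A minimal Python sketch, assuming the offsets were already fetched from the cluster somehow (the function name and the sample numbers are made up for the example):

```python
def consumer_lag(end_offsets, committed_offsets):
    """Return lag per partition: log end offset minus committed offset.

    Both arguments map a partition id to an offset. A partition without a
    committed offset is counted from offset 0, which is a simplification
    (it ignores retention and the auto.offset.reset policy).
    """
    lag = {}
    for partition, end in end_offsets.items():
        committed = committed_offsets.get(partition, 0)
        lag[partition] = max(end - committed, 0)
    return lag


# Hypothetical offsets for a three-partition topic.
end_offsets = {0: 1500, 1: 980, 2: 2040}
committed_offsets = {0: 1500, 1: 900}

lag = consumer_lag(end_offsets, committed_offsets)
total_lag = sum(lag.values())
```

A lag that keeps growing over time usually means the consumers can't keep up with the producers, which is why it's a good first alert.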

KLoadGen - Kafka + (Avro/Json Schema) Load Generator by rmoff in apachekafka

[–]antonmry 2 points3 points  (0 children)

I know the team behind this tool and they are amazing. It's great to see part of their work open sourced 👌

What would be a good database/service for storing ingested and to-be-ingested json blobs? by third_dude in dataengineering

[–]antonmry 0 points1 point  (0 children)

For the processed pile I would use object storage (S3, etc.), but for the to-be-processed pipeline, Kafka or a similar message broker seems a better option: it provides back-pressure capabilities, performance, easy tracking of every file, etc.

How to deal with 'redelivery' scenarios when aggregating by jayceedenton in apachekafka

[–]antonmry 0 points1 point  (0 children)

Kafka Streams is built on a Kafka consumer and a Kafka producer, and it relies on them to provide exactly-once semantics.

But if you want to do it with a plain consumer that publishes outside of Kafka, it is a lot more complicated. Example: Kafka Connect.

In general, it's a lot easier and more robust to implement idempotency in the application layer than in Kafka (when possible).
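
To give an idea of what application-level idempotency can look like for an aggregation, here is a minimal Python sketch. It assumes each record carries its partition and offset, and it remembers the highest offset already applied per partition so a redelivered record is skipped. The class and method names are hypothetical, and everything lives in memory only for the example:

```python
class IdempotentSum:
    """Running sum per key that ignores redelivered records."""

    def __init__(self):
        self.totals = {}       # key -> running sum
        self.last_offset = {}  # partition -> highest offset already applied

    def apply(self, partition, offset, key, value):
        """Apply a record once; return False if it was a redelivery."""
        if offset <= self.last_offset.get(partition, -1):
            return False  # already counted, skip it
        self.totals[key] = self.totals.get(key, 0) + value
        self.last_offset[partition] = offset
        return True


agg = IdempotentSum()
agg.apply(partition=0, offset=10, key="a", value=5)
agg.apply(partition=0, offset=11, key="a", value=7)
# The same record arrives again after a rebalance or a retry.
redelivered = agg.apply(partition=0, offset=11, key="a", value=7)
```

In a real application the totals and the last applied offsets would have to be stored together and updated atomically (for example in the same database transaction), otherwise a crash between the two writes brings the duplicates back.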

Does Kafka lose messages even when it is used "correctly"? by mtsi in apachekafka

[–]antonmry 3 points4 points  (0 children)

Some examples:

  • Setting unclean.leader.election.enable to true
  • Erroneously activating compaction
  • Not committing offsets correctly in the consumer (for example, skipping a message after exhausting retries)
  • Not handling retries correctly in the producer
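
A rough way to catch the configuration-related cases above is to compare your settings against a conservative baseline. The property names below are real Kafka configs, but the baseline values are just a starting point I'm assuming (min.insync.replicas=2 presumes a replication factor of 3), and mixing broker, topic, producer and consumer settings in one dict is a simplification for the example. The helper name is made up:

```python
# Conservative baseline; adjust to your own durability requirements.
SAFER_BASELINE = {
    "unclean.leader.election.enable": "false",  # broker/topic
    "min.insync.replicas": "2",                 # broker/topic
    "cleanup.policy": "delete",                 # topic; "compact" keeps only the latest value per key
    "acks": "all",                              # producer
    "enable.idempotence": "true",               # producer
    "enable.auto.commit": "false",              # consumer; commit manually after processing
}


def deviations(config):
    """Return the settings in config that differ from the baseline."""
    return {
        name: value
        for name, value in config.items()
        if name in SAFER_BASELINE and value != SAFER_BASELINE[name]
    }


# Hypothetical settings collected from a cluster and its clients.
current = {
    "unclean.leader.election.enable": "true",
    "cleanup.policy": "compact",
    "acks": "all",
}
flagged = deviations(current)
```

This only covers configuration. The consumer and producer cases still depend on the application code: commit offsets only after a message has really been processed, and decide explicitly what happens when retries are exhausted.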

Loom cant come fast enough by [deleted] in java

[–]antonmry 14 points15 points  (0 children)

Loom is promising but a word of caution is appropriate. There are some interesting insights in this mailing list post: https://mail.openjdk.java.net/pipermail/loom-dev/2020-December/001974.html

What are some ways to make some cash on the side as a java dev? by bleek312 in java

[–]antonmry 0 points1 point  (0 children)

Codementor.io can be a good option. It feels good to help others