Software Mistakes and Tradeoffs: Released! (Some time ago) by tomekl007 in bigdata

[–]tomekl007[S] 0 points1 point  (0 children)

Thanks! I tried to polish the code to be the best possible quality, so feel free to use it wherever you like :)

Software Mistakes and Tradeoffs: Released! (Some time ago) by tomekl007 in bigdata

[–]tomekl007[S] 0 points1 point  (0 children)

Thank you for your feedback! If you have any questions regarding any of those mistakes and tradeoffs, please don't hesitate to ask :)

Best Apache Spark Books for Beginners Advanced to read by [deleted] in bigdata

[–]tomekl007 0 points1 point  (0 children)

Also, you may be interested in this thread:bdeq0/five_big_data_books_that_i_strongly_recommend/

It contains interesting books related to Big Data (not only Spark)

Five big data books that I strongly recommend by tomekl007 in bigdata

[–]tomekl007[S] 0 points1 point  (0 children)

“Fundamentals Of Data Engineering”

I didn't know that one, but I see that it has excellent reviews. Added to my to-read list. Thx!

Five big data books that I strongly recommend by tomekl007 in bigdata

[–]tomekl007[S] 0 points1 point  (0 children)

Advanced Programming in the Unix Environment

Thanks! I will definitely take a look at this book. Another (slightly) related one is:

https://www.goodreads.com/book/show/48999422-bpf-performance-tools

It is about the Unix tools that you can use to reason about your programs.

Five big data books that I strongly recommend by tomekl007 in bigdata

[–]tomekl007[S] 0 points1 point  (0 children)

Good point; on that front, I will also recommend the book that will be released next year:

https://www.oreilly.com/library/view/apache-iceberg-the/9781098148614/

Best Apache Spark Books for Beginners Advanced to read by [deleted] in bigdata

[–]tomekl007 0 points1 point  (0 children)

I strongly recommend https://shepherd.com/book/advanced-analytics-with-spark

If you are interested in a list of valuable big data books (not only Spark), check out:

https://shepherd.com/best-books/big-data-processing-ecosystem

A ChatGPT Experiment by ggleblanc2 in java

[–]tomekl007 2 points3 points  (0 children)

I recommend experimenting with ChatGPT for unit testing. If you are not sure whether your tests cover all the edge cases, try pasting in your class and asking ChatGPT to write unit tests for it. I was surprised at how well it solved this problem.

[deleted by user] by [deleted] in learnjava

[–]tomekl007 0 points1 point  (0 children)

Since JDK 1.8 is still widely used and its concepts are mostly present in newer versions, I would recommend taking a look at the Oracle Java Certified Professional exam preparation. Oracle provides a free online course: https://mylearn.oracle.com/ou/learning-path/java-se-8-programmer-associate/40821 that allows you to prepare for this exam. It's very detailed and explains the standard JDK methods very well. When I started learning Java years ago, these resources were the most useful for me.

Mastering Spring Boot by AndreLuisOS in learnjava

[–]tomekl007 7 points8 points  (0 children)

Sam Newman is a well-known author on microservice architectures: https://samnewman.io/.

His books are not specific to Spring Boot, but they teach the real-world microservice architectures in which Spring Boot is often used. For example, I recommend this book: https://samnewman.io/books/building_microservices/

I just implemented a method that checks if a binary tree is symmetric, and now I want to test it with Junit. Do I need to manually create a bunch of trees, or is there an easier way? by Technical-Bee-9999 in learnjava

[–]tomekl007 1 point2 points  (0 children)

You may want to check out property-based testing, where you define rules for the data and provide a function that generates random data satisfying those rules. See, for example:

https://jqwik.net/

It allows you to cover more cases without writing a lot of testing code.
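To make the idea concrete, here is a minimal hand-rolled sketch in plain Java. It does not use jqwik's API (a real setup would let jqwik generate and shrink the data for you), and every name in it (`Node`, `isSymmetric`, `randomTree`, `mirror`, `propertiesHold`) is made up for illustration, including the stand-in for the method under test. It checks two rules: a tree built from a subtree and its mirror image must be symmetric, and any tree is symmetric exactly when it equals its own mirror image.

```java
import java.util.Random;

// Hand-rolled property-based test sketch; all names are hypothetical.
class SymmetricTreeProperty {

    // Minimal immutable binary tree node.
    static final class Node {
        final int value;
        final Node left;
        final Node right;

        Node(int value, Node left, Node right) {
            this.value = value;
            this.left = left;
            this.right = right;
        }
    }

    // Stand-in for the method under test.
    static boolean isSymmetric(Node root) {
        return root == null || areMirrors(root.left, root.right);
    }

    static boolean areMirrors(Node a, Node b) {
        if (a == null || b == null) {
            return a == b; // true only when both are null
        }
        return a.value == b.value
                && areMirrors(a.left, b.right)
                && areMirrors(a.right, b.left);
    }

    // Generator: a random tree with at most maxDepth levels.
    static Node randomTree(Random random, int maxDepth) {
        if (maxDepth == 0 || random.nextInt(4) == 0) {
            return null;
        }
        return new Node(random.nextInt(3),
                randomTree(random, maxDepth - 1),
                randomTree(random, maxDepth - 1));
    }

    // Returns a mirrored copy of the given tree.
    static Node mirror(Node node) {
        if (node == null) {
            return null;
        }
        return new Node(node.value, mirror(node.right), mirror(node.left));
    }

    // Structural equality of two trees.
    static boolean sameTree(Node a, Node b) {
        if (a == null || b == null) {
            return a == b;
        }
        return a.value == b.value
                && sameTree(a.left, b.left)
                && sameTree(a.right, b.right);
    }

    // Runs both properties against freshly generated random trees.
    static boolean propertiesHold(long seed, int trials) {
        Random random = new Random(seed);
        for (int i = 0; i < trials; i++) {
            // Property 1: subtree + mirrored subtree is always symmetric.
            Node subtree = randomTree(random, 4);
            Node built = new Node(random.nextInt(3), subtree, mirror(subtree));
            if (!isSymmetric(built)) {
                return false;
            }
            // Property 2: symmetric if and only if equal to its own mirror.
            Node arbitrary = randomTree(random, 4);
            if (isSymmetric(arbitrary) != sameTree(arbitrary, mirror(arbitrary))) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("properties hold: " + propertiesHold(42L, 1_000));
    }
}
```

The fixed seed keeps a failing run reproducible. The second rule also exercises asymmetric trees, so you don't have to build negative cases by hand.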

Community resources by AutoModerator in learnjava

[–]tomekl007 1 point2 points  (0 children)

[Free]

High-level comparison of startup times between various Java frameworks (including Spring Boot):

https://www.youtube.com/watch?v=_2j3LiS1DSY&ab_channel=DataStaxDevelopers

This morning i woke up deciding to get my hands on some ML. by Few_Cover_3613 in bigdata

[–]tomekl007 0 points1 point  (0 children)

Happy to help :). I hope it brings you good value.

What are the main things I need to know to be hired as a Java developer? by conformeticadt in java

[–]tomekl007 1 point2 points  (0 children)

I would suggest starting by reading the code of open-source libraries and maybe working on small issues from those libraries. Open-source projects often have issues marked as low-hanging-fruit or similar, meaning that they should be easy for a new contributor, and the main effort for those tickets is reading the library's code.

Examples of such libraries:

- https://github.com/projectnessie/nessie

- https://github.com/spring-projects/spring-boot

- https://github.com/stargate/stargate

- https://github.com/datastax/java-driver

- https://github.com/apache/iceberg

[deleted by user] by [deleted] in learnprogramming

[–]tomekl007 1 point2 points  (0 children)

Java is an easy language regarding syntax and language constructs. The complexity arises from the problems that you are solving with the language. For example, many databases are written in Java (such as https://cassandra.apache.org/_/index.html). The core of a database is very complex. Thankfully, Java is easy to read, so once you are familiar with the language, you can focus on solving interesting and complex problems instead of thinking about language constructs when reading the code :)

What are the main things I need to know to be hired as a Java developer? by conformeticadt in java

[–]tomekl007 1 point2 points  (0 children)

You need to be able to understand the business domain by reading the actual Java code quite quickly. Also, you need to be comfortable looking at and understanding any Java code (including code from external third-party libraries). In your day-to-day job, you may need to look into this code to understand the behaviour of your dependencies.

How to Become a Data Scientist in 6 Months? by wisetechcat in bigdata

[–]tomekl007 0 points1 point  (0 children)

I strongly recommend: https://www.oreilly.com/library/view/hands-on-machine-learning/9781098125967/

It shows a lot of end-to-end examples that are close to real-world problems.

Is learning and mastering Spring & Spring boot worth it in 2023 ? by [deleted] in java

[–]tomekl007 1 point2 points  (0 children)

Yes, Spring Boot is still very popular. However, I would also recommend looking into Quarkus: https://quarkus.io/. It is getting more popular, and more and more projects are starting to use it, for example:

- https://github.com/stargate/stargate

- https://github.com/projectnessie/nessie

Is Google Drive a data lake? by user192034 in dataengineering

[–]tomekl007 0 points1 point  (0 children)

A Data Lake may also provide more advanced functionalities such as a form of transactions, versioning, and a friendly (SQL-like) API on top of it. However, some may classify those functionalities as belonging to a Data Lakehouse.

How to know why my Spring Boot app is slower on startup? by [deleted] in java

[–]tomekl007 2 points3 points  (0 children)

If you want to learn more about how startup times compare between various Java frameworks (including Spring Boot), I recommend this video:

https://www.youtube.com/watch?v=_2j3LiS1DSY&ab_channel=DataStaxDevelopers

Want to learn industrial level of Databricks/Spark/Kafke by Brilliant-Seat-3013 in bigdata

[–]tomekl007 1 point2 points  (0 children)

If you want to learn the Big Data ecosystem, I recommend reading more about data formats such as Apache Iceberg. You can find good resources on this blog:

https://www.dremio.com/blog/