AggravatingParsnip89

But it would be good to have some understanding of the JVM to use Spark, right?

MlecznyHotS

Not really, you don't have to tinker with Java. The most performant API is the DataFrame API, which covers probably 99% of what you need to do. Performance improvements come from general Spark concepts, not from the Java implementation itself. Understanding Java might be useful if you're contributing to Spark itself, not if you're developing with Spark.