Apache Spark learning log [#2] - Learning resources

Hi! This short episode covers the resources I'm using to learn Apache Spark (or PySpark - to be more precise).

1: Apache Spark documentation

No surprise here - the official documentation is usually the first resource you'll use while learning a new language or framework.

The official Apache Spark documentation can be found here and its PySpark part here.

2: Book: Learning Spark, 2nd Edition

Learning Spark is written by Jules S. Damji, Brooke Wenig, Tathagata Das, and Denny Lee. It's worth mentioning that I'm using the 2nd edition, which covers Spark 3.x. You can buy both print and electronic versions; I'm using the latter since it's included in my O'Reilly subscription.

So far I've read the first three chapters and I'm pretty happy with the content. The authors explain Spark's concepts in a clear and concise manner, and they include lots of examples and code snippets that help you reinforce newly gained knowledge.

Code examples in the book are written in Python and Scala. I'm focusing mainly on Python but, since Spark's API is quite unified across languages, you shouldn't have any major problems understanding the Scala snippets as well. Some of the examples also include Spark SQL code.

3: Google

Again - no surprise here: I'm doing a lot of googling.

That's it for this post.

Thanks for reading,
Kuba
