Apache Spark and Cassandra Integration

Big Data Engineering covers Apache Spark and Cassandra Integration in this insightful video.
Spark Executor & Driver Memory Calculation

Big Data Engineering takes a closer look at Spark executor and driver memory calculation.
Unboxing Spark Standalone Architecture

Big Data Engineering closely examines the Spark Standalone Architecture. Apache Spark has a well-defined layered architecture where all the Spark components and layers are loosely coupled....
Apache Spark Streaming in K8s with ArgoCD & Spark Operator

Here’s an interesting talk by Albert Franziu Cros on a CI/CD setup composed of a Spark Streaming job in K8s consuming from Kafka. Over the last...
Real-Time Health Score Application using Apache Spark on Kubernetes

This talk on the Databricks YouTube channel presents a web application that calculates real-time health scores using Spark on Kubernetes. A...
Cost Efficiency Strategies for Managed Apache Spark Service

With the rise of cloud-native, the conversation about infrastructure costs has spread from R&D directors to every person in R&D: “How much does a VM cost?” “can...
Getting Started with Apache Spark on Kubernetes

Community adoption of Kubernetes (instead of YARN) as a scheduler for Apache Spark has been accelerating since the major improvements in the Spark 3.0 release. Companies...
Advanced Natural Language Processing with Apache Spark NLP

NLP is a key component in many data science systems that must understand or reason about text. This hands-on tutorial uses the open-source Spark NLP...
Comprehensive View on Date-time APIs of Apache Spark 3.0

This talk from the Databricks YouTube channel is about date-time processing in Spark 3.0, its API, and the implementation changes made since Spark 2.4. In particular, it...
How to Use SQL with Delta Lake

Delta Lake is an open-source storage management system (storage layer) that brings ACID transactions and time travel to Apache Spark and big data workloads. The...