Apache Spark

Spark

Comprehensive View on Intervals in Apache Spark 3.2

Here’s an overview of intervals in Apache Spark before version 3.2, and the changes coming in future releases.

Big Data Microsoft Spark

What’s New in .NET for Apache Spark v1.1.1?

.NET for Apache Spark empowers .NET developers to participate in the world of big data analytics. In this episode, Jeremy chats with Michael Rys about some of the new features and capabilities available in this release. Related links: .NET for Apache Spark™, .NET for Apache Spark™ tutorial, .NET for Apache Spark™ documentation.

Big Data Data Data Warehouse

Data + AI Summit 2021 – Full Thursday AM Keynote on Apache Spark, Data Science, and Machine Learning

Here is the entire AM keynote from the Data + AI Summit 2021. The pursuit of AI is one of the biggest priorities in data today. The Thursday morning keynote will be led by Databricks Co-founder and CEO Ali Ghodsi and cover advances in data science, machine learning, MLOps and more in both open source […]

Azure Synapse Data

Getting Started with Accessing Azure Data Explorer using Apache Spark for Azure Synapse Analytics

In this episode of Data Exposed, Manoj Raheja shows us how to seamlessly integrate with Azure Data Explorer from Apache Spark for Azure Synapse Analytics. Resources: Connect to Azure Data Explorer using Apache Spark for Azure Synapse Analytics; GitHub (Sample Code).

Azure Synapse CosmosDB

Overview of Azure Synapse Link featuring CosmosDB

In this video, Chris Seferlis gives an overview of Azure Synapse Link, a newer feature of the Synapse Analytics suite of tools. Find out why this feature is important, how it moves operational data to analytical data, and what you can then do with it. More details about the service and some great tutorials […]

Databricks Machine Learning

Using Machine Learning at Scale: A Gaming Industry Experience

Databricks: Games earn more money than movies and music combined. Games also generate a lot of data. How can that data best be used? This session walks through how to develop a fully automated and scalable machine learning pipeline, using the example of an innovative gaming company whose games are played by millions of people […]

Spark

Unboxing Spark Standalone Architecture

Big Data Engineering closely examines Spark Standalone Architecture. Apache Spark has a well-defined layered architecture in which all the Spark components and layers are loosely coupled. This architecture is further integrated with various extensions and libraries. Apache Spark architecture is based on two main abstractions: the Resilient Distributed Dataset (RDD) and the Directed Acyclic Graph (DAG).

Containers Spark

Real-Time Health Score Application using Apache Spark on Kubernetes

This video on the Databricks YouTube channel presents a web application that calculates real-time health scores at high speed using Spark on Kubernetes. A health score represents a machine’s lifetime, and it is commonly used as a landmark for deciding whether to replace the machine with a new one for high productivity […]
