Data Engineering

AWS, Databricks

Databricks on AWS Cloud Integration Demo

Did you ever wonder how Databricks plays a role in AWS data integrations? Well, wonder no more.

Timestamps:
0:00 Databricks Lakehouse on AWS overview
0:20 Connecting to EC2, S3, Glue, and IAM
0:46 Ingesting Kinesis streams into Delta Lake
1:32 Viewing Delta Lake tables in the Glue console
1:49 Databricks – Redshift integration
2:20 […]

Data

Dataiku End-to-End Demo

This demo uses a project that predicts flight delays to demonstrate connecting to data, preparing and enriching it, building machine learning models, and operationalizing your work entirely in Dataiku.

Big Data, Data

Code Once Use Often with Declarative Data Pipelines

In this video, developer Anthony Awuley and data engineer Carter Kilgour explain the value of declarative data pipelines.

Big Data, Databricks

Tech Talk Series Part Four: Continuous Integration and Continuous Delivery with Delta Lake

Join Databricks for the final installment in a four-part series with Salesforce Engineering.

Databricks

Introduction to Databricks Unified Data Platform

Simplify your data lake. Simplify your data architecture. Simplify your data engineering. Powered by Delta Lake, Databricks combines the best of data warehouses and data lakes into a lakehouse architecture, giving you one platform to collaborate on all of your data, analytics and AI workloads.

Spark

Unboxing Spark Standalone Architecture

Big Data Engineering closely examines the Spark Standalone architecture. Apache Spark has a well-defined layered architecture in which all the Spark components and layers are loosely coupled. This architecture is further integrated with various extensions and libraries. The Apache Spark architecture is based on two main abstractions: the Resilient Distributed Dataset (RDD) and the Directed Acyclic Graph (DAG).
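To make the two abstractions a little more concrete, here is a minimal sketch in plain Python. It is not Spark's API: `LazyDataset`, `from_list`, and the `lineage` attribute are hypothetical names invented for illustration. The idea it tries to show is that transformations only record steps in a lineage (here a linear chain, the simplest kind of DAG), and nothing is computed until an action such as `collect` runs.

```python
from typing import Any, Callable, Iterator, List


class LazyDataset:
    """Toy stand-in for an RDD (hypothetical, not Spark's API)."""

    def __init__(self, compute: Callable[[], Iterator[Any]], lineage: List[str]):
        self._compute = compute  # zero-argument function that yields the data
        self.lineage = lineage   # names of the steps recorded so far

    @classmethod
    def from_list(cls, items: List[Any]) -> "LazyDataset":
        return cls(lambda: iter(items), ["source"])

    def map(self, fn: Callable[[Any], Any]) -> "LazyDataset":
        # Transformation: record the step, do no work yet.
        return LazyDataset(lambda: (fn(x) for x in self._compute()),
                           self.lineage + ["map"])

    def filter(self, pred: Callable[[Any], bool]) -> "LazyDataset":
        # Transformation: record the step, do no work yet.
        return LazyDataset(lambda: (x for x in self._compute() if pred(x)),
                           self.lineage + ["filter"])

    def collect(self) -> List[Any]:
        # Action: walk the recorded chain and materialize the result.
        return list(self._compute())


calls = []  # records every element the mapped function actually sees


def double(x: int) -> int:
    calls.append(x)
    return x * 2


numbers = LazyDataset.from_list([1, 2, 3, 4])
evens_doubled = numbers.filter(lambda x: x % 2 == 0).map(double)

print(evens_doubled.lineage)    # ['source', 'filter', 'map']
print(calls)                    # [] because no action has run yet
print(evens_doubled.collect())  # [4, 8]
print(calls)                    # [2, 4]
```

In real Spark the lineage graph can branch and join, and the scheduler splits it into stages across a cluster; this sketch only captures the lazy, lineage-recording behavior.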

Azure, Azure Synapse, Data Warehouse

Azure Synapse and On-Demand Serverless Compute and Querying

Microsoft Mechanics learns how UK-based data engineering consultancy endjin is evaluating Azure Synapse for on-demand serverless compute and querying. Endjin specializes in big data analytics solutions for customers across a range of industries such as ocean research, financial services, and retail. Host Jeremy Chapman speaks with Jess Panni, Principal and Data Architect at […]

Data Warehouse, Databricks

Slowly Changing Dimensions (SCD) Type 2

Databricks recently streamed this tech chat on Slowly Changing Dimensions (SCD). We will discuss a popular online analytical processing (OLAP) fundamental, slowly changing dimensions, and specifically Type 2. As we have discussed in various other Delta Lake tech talks, the reliability brought to data lakes by Delta Lake has brought a resurgence of […]
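As a rough illustration of what Type 2 means in practice, the sketch below keeps history by closing out the current row and appending a new version rather than overwriting in place. It is plain Python, not the Delta Lake approach shown in the talk, and the names `DimRow`, `apply_scd2`, `customer_id`, and `city` are hypothetical.

```python
from dataclasses import dataclass, replace
from datetime import date
from typing import List, Optional


@dataclass(frozen=True)
class DimRow:
    customer_id: int           # business key (hypothetical)
    city: str                  # tracked attribute (hypothetical)
    valid_from: date
    valid_to: Optional[date]   # None means the row is still current
    is_current: bool


def apply_scd2(dim: List[DimRow], customer_id: int, city: str,
               as_of: date) -> List[DimRow]:
    """Return a new dimension table with one Type 2 change applied."""
    current = next((r for r in dim
                    if r.customer_id == customer_id and r.is_current), None)
    if current is not None and current.city == city:
        return list(dim)  # attribute unchanged, so no new version is needed

    result = []
    for row in dim:
        if row is current:
            # Close out the old version instead of overwriting it.
            row = replace(row, valid_to=as_of, is_current=False)
        result.append(row)
    # Append the new version as the current row.
    result.append(DimRow(customer_id, city, as_of, None, True))
    return result


dim = [DimRow(1, "Leeds", date(2020, 1, 1), None, True)]
dim = apply_scd2(dim, 1, "York", date(2021, 6, 1))
for row in dim:
    print(row)
```

After the update the table holds two rows for customer 1: the Leeds row, now closed with `valid_to` set, and the York row marked current. On a real lakehouse table the same close-and-append logic would typically be expressed as a merge operation rather than a Python loop.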

Databricks

Databricks for Data Engineering

ThorogoodBI explores the use of Databricks for data engineering purposes in this webinar. Whether you’re looking to transform and clean large volumes of data or collaborate with colleagues to build advanced analytics jobs that can be scaled and run automatically, Databricks offers a Unified Analytics Platform that promises to make your life easier. In the […]

Azure, Data

The Modern Data Warehouse in Azure – Data Processing

In this video, Chris Seferlis continues discussing the Modern Data Platform in Azure with Part 3: Data Processing.

Tools discussed:
Azure Data Factory Data Flows – https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
Azure Marketplace – https://azuremarketplace.microsoft.com/en-us/marketplace/
