ETL

Databricks

dbt Projects Integration in Databricks Workflows

In this video, Olya from Databricks reviews how to automate dbt tasks, integrate dbt projects into larger workflows, and monitor dbt transformations.

Read More
Big Data Databricks

dbt Core and the Lakehouse

From Databricks' YouTube channel: Previously, Olya walked through how the dbt-Databricks adapter enables data analysts to build, test, and deploy data models on Delta Lake. Databricks can also connect to open-source dbt Core to run dbt Core projects as a task in a Databricks job. This allows you to automate your dbt tasks, include […]

Read More
AWS Data

AWS Glue 5th Anniversary Developer Retrospective | Amazon Web Services

AWS Glue, the serverless ETL and data integration service, celebrates its 5th anniversary. Watch this video to meet the engineers who built Glue from the ground up, and hear their vision for Glue’s future.

Read More
Azure Synapse

Azure Synapse Full Course

Azure Synapse Analytics (ASA) is changing the way we work with data services in Azure. The ASA workspace combines the core technologies required for data warehousing, Big Data Analytics and Data Science. In this Learn with the Nerds event, Mitchell Pearson will teach you how you can use Synapse Analytics to solve the paradox of […]

Read More
Azure Synapse

Data Science and Predictive Analytics with Azure Synapse

Discover new Azure Synapse features to integrate predictive analytics capabilities into your organization—using both code-free and code-first options for AI/ML.

Read More
Databricks

How Databricks Leverages Auto Loader to Ingest Millions of Files an Hour

Continuously and incrementally ingesting data as it arrives in cloud storage has become a common workflow in our customers’ ETL pipelines. However, managing this workflow is rife with challenges, such as scalable and efficient file discovery, schema inference and evolution, and fault tolerance with exactly-once guarantees. Auto Loader is a new Structured Streaming source in […]

Read More
Big Data Data Databricks

Empowering Zillow’s Developers with Self-Service ETL

Databricks shows how its technology empowers Zillow’s developers via self-service ETL. These tools abstract away the orchestration, deployment, and Apache Spark processing implementation from their respective users. In this talk, Zillow engineers discuss two internal platforms they created to address the specific needs of two distinct user groups: data analysts and data producers. Each platform […]

Read More
Azure Data

The Modern Data Warehouse in Azure – Data Processing

In this video, Chris Seferlis continues discussing the Modern Data Platform in Azure with Part 3: Data Processing.

Tools discussed:
Azure Data Factory Data Flows – https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
Azure Marketplace – https://azuremarketplace.microsoft.com/en-us/marketplace/

Read More
Azure Big Data Databricks

How to Build a Cloud Data Platform with Databricks Part 2 – ETL Processing

Learn how to use Apache Spark and Delta Lake on Databricks to perform ETL processing, manage late-arriving data, and repair corrupted data. Companies look to support both business analytics and machine learning initiatives within their organizations, but often face challenges with complex operations, proprietary technologies, and unreliable data.

Read More