Data Lake

Azure Big Data Data

Where Should You Put Your Data in Azure?

A frequently asked question is: what goes where, or where should I put my data? With Amy Boyd, Frank (not me) invited different product teams to share what type of data goes in their service. Let’s meet the Synapse Analytics, Cosmos DB, Azure Data Lake, and Azure Data Explorer product managers. Each one will […]

Databricks

Databricks on Databricks: AMA with Data Engineering SMEs

Data engineers and data leaders are the linchpin of every data-driven organization. Today’s data engineers face a number of critical use cases: ensuring the organization has access to clean, reliable data, maintaining governance and security as the organization scales, and providing access to data teams for analysis. Watch this session for live Q&A with Databricks […]

Big Data Databricks

Clean Your Data Swamp by Migrating Off of Hadoop

In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along […]

AWS Databricks

Large Scale Lakehouse Implementation Using Structured Streaming

Business leads, executives, analysts, and data scientists rely on up-to-date information to make business decisions, adjust to the market, meet the needs of their customers, or run effective supply chain operations. Come hear how Asurion used Delta, Structured Streaming, AutoLoader and SQL Analytics to improve production data latency from day-minus-one to near real time. Asurion’s technical […]

Data Driven

Dave Wentzel on Why You Don’t Need a Data Warehouse

In this episode of Data Driven, Frank and Andy chat with Philadelphia Microsoft Technology Center Data Architect Dave Wentzel on why you do not need a data warehouse. Also, Frank discusses leaving Microsoft, Frank and Andy talk about five seasons of Data Driven, and even BAILeY has a sentimental moment. Show Notes: Coming Soon. Press the play […]

Data Warehouse Databricks

What is Delta Lake?

Delta Lake is an open format storage layer that delivers reliability, security and performance on your data lake — for both streaming and batch operations. By replacing data silos with a single home for structured, semi-structured and unstructured data, Delta Lake is the foundation of a cost-effective, highly scalable lakehouse. Learn more: https://databricks.com/product/delta-lake-on-databricks

Databricks

Introduction to Databricks Unified Data Platform

Simplify your data lake. Simplify your data architecture. Simplify your data engineering. Powered by Delta Lake, Databricks combines the best of data warehouses and data lakes into a lakehouse architecture, giving you one platform to collaborate on all of your data, analytics and AI workloads.

Big Data Data

How to Design and Implement a Real-time Data Lake with Dynamically Changing Schema

Building a curated data lake on real-time data is an emerging data warehouse pattern with Delta. In the real world, however, we often face dynamically changing schemas, which are hard to incorporate without downtime. In this presentation we will show how we built a robust streaming ETL […]
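
To make the schema problem concrete, here is a minimal sketch in plain Python of one common way to absorb new fields without stopping the pipeline: widen the known schema as records arrive, then backfill older rows with nulls. This is not the presenters' implementation and not a Delta or Spark API; the function names `merge_schema` and `conform` and the sample records are hypothetical.

```python
from typing import Any, Dict, List


def merge_schema(known: Dict[str, str], record: Dict[str, Any]) -> Dict[str, str]:
    """Return a copy of `known` widened with any fields first seen in `record`."""
    merged = dict(known)
    for name, value in record.items():
        # Keep the type recorded the first time a field appears.
        merged.setdefault(name, type(value).__name__)
    return merged


def conform(record: Dict[str, Any], schema: Dict[str, str]) -> Dict[str, Any]:
    """Project a record onto the schema, filling missing fields with None."""
    return {name: record.get(name) for name in schema}


# Hypothetical stream: the second record introduces a new "region" field.
stream: List[Dict[str, Any]] = [
    {"id": 1, "status": "open"},
    {"id": 2, "status": "closed", "region": "EU"},
]

schema: Dict[str, str] = {}
for rec in stream:
    schema = merge_schema(schema, rec)

# Older rows are backfilled with None for columns added later.
table = [conform(rec, schema) for rec in stream]
```

The sketch only handles added columns; renamed fields or changed types would need an explicit policy.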

Big Data Spark

How to Use SQL with Delta Lake

Delta Lake is an open-source storage layer that brings ACID transactions and time travel to Apache Spark and big data workloads. Delta Lake 0.7.0, the latest release, requires Apache Spark 3 and includes, among other features, full coverage of SQL DDL and DML commands.
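
As a rough intuition for how time travel can work, the toy model below keeps an ordered, append-only log of commits and rebuilds any past version by replaying the log up to that point. It is a simplified sketch in plain Python, loosely modeled on the idea of a transaction log that records files added and removed. It is not Delta Lake's API, and the class `ToyTableLog` and the file names are hypothetical.

```python
class ToyTableLog:
    """Append-only commit log; each commit lists data files added and removed."""

    def __init__(self):
        self.commits = []

    def commit(self, add=(), remove=()):
        """Record one atomic change and return its version number."""
        self.commits.append({"add": list(add), "remove": list(remove)})
        return len(self.commits) - 1

    def files_at(self, version=None):
        """Replay the log up to `version` (default: latest) to list live files."""
        if version is None:
            version = len(self.commits) - 1
        live = set()
        for entry in self.commits[: version + 1]:
            live -= set(entry["remove"])
            live |= set(entry["add"])
        return sorted(live)


log = ToyTableLog()
v0 = log.commit(add=["part-0.parquet"])
v1 = log.commit(add=["part-1.parquet"])
# A rewrite: one file is replaced by another in a single commit.
v2 = log.commit(add=["part-2.parquet"], remove=["part-0.parquet"])

latest = log.files_at()       # state after all commits
as_of_v0 = log.files_at(v0)   # "time travel" back to the first version
```

Because old commits are never rewritten, reading an earlier version is just a shorter replay of the same log.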

Databricks

Delta Lakehouse Data Profiler and SQL Analytics Demo

Coming from a data warehousing and BI background, Franco Patano wanted to have a catalogue of the Lakehouse, including schema and profiling statistics. He created the Lakehouse Data Profiler notebook using Python and SQL to analyze the data and generate schema and statistics tables. He then uses the new SQL Analytics product from Databricks to […]
