PySpark

Databricks

Architecting for Data Quality in the Lakehouse with Delta Lake and PySpark

From null values and duplicate rows to modeling errors and schema changes, data can break for millions of reasons. To combat this, teams are increasingly adopting best practices from DevOps and software engineering to identify, resolve, and even prevent this “data downtime” from happening in the first place. Join Prateek Chawla and Ryan Kearns as […]

Databricks

Databricks Pyspark: Merge (Upsert) using Pyspark and Spark SQL

Here’s an interesting video on doing a merge operation with PySpark and Spark SQL.

Big Data Databricks Spark

Data Quality Testing in the Medallion Architecture with PyTest and PySpark

Here’s a great lightning talk from Data + AI Summit 2020 by Carter Kilgour on “Why data quality is especially important in the medallion architecture, and how to ensure it with scheduled testing and reporting.”

Big Data Databricks Python

Automate Data Pipelines with PySpark SQL

Are you struggling with your cloud data management costs and architecture? Are you looking for ways to accelerate your data engineering capacity? Dynamic SQL, the age-old tactic of generating SQL statements at runtime, can accelerate the development of data pipelines. Watch this Data Collab Lab to learn more.
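
To make "generating SQL statements at runtime" concrete, here is a minimal sketch in plain Python. It is not taken from the Data Collab Lab session; the helper name `build_merge_sql`, the table names, and the column names are all hypothetical. It assembles a Delta-style `MERGE` (upsert) statement from a list of key and data columns, so a single function could serve many tables.

```python
import re
from typing import Sequence

# Hypothetical guard: allow only plain (optionally dotted) identifiers,
# since the names are interpolated directly into SQL text.
_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*(\.[A-Za-z_][A-Za-z0-9_]*)*$")


def build_merge_sql(target: str, source: str,
                    keys: Sequence[str], columns: Sequence[str]) -> str:
    """Build a MERGE (upsert) statement as text from table metadata."""
    if not keys:
        raise ValueError("at least one key column is required")
    for name in [target, source, *keys, *columns]:
        if not _IDENT.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")

    # Rows match when every key column is equal.
    on_clause = " AND ".join(f"t.{k} = s.{k}" for k in keys)
    # Only non-key columns are updated on a match.
    non_keys = [c for c in columns if c not in keys]
    all_cols = list(keys) + non_keys

    lines = [
        f"MERGE INTO {target} AS t",
        f"USING {source} AS s",
        f"ON {on_clause}",
    ]
    if non_keys:
        set_clause = ", ".join(f"t.{c} = s.{c}" for c in non_keys)
        lines.append(f"WHEN MATCHED THEN UPDATE SET {set_clause}")
    insert_cols = ", ".join(all_cols)
    insert_vals = ", ".join(f"s.{c}" for c in all_cols)
    lines.append(
        f"WHEN NOT MATCHED THEN INSERT ({insert_cols}) VALUES ({insert_vals})"
    )
    return "\n".join(lines)


if __name__ == "__main__":
    # Table and column names below are made up for illustration.
    print(build_merge_sql("silver.customers", "staged_customers",
                          ["customer_id"], ["customer_id", "email"]))
```

In a PySpark job, the generated string would typically be passed to `spark.sql(...)`, assuming an active `SparkSession` and a target table that supports `MERGE` (for example a Delta table). The identifier check is a deliberately strict simplification; real pipelines may need quoting rules that fit their own catalog.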

AI Azure

OSS Framework Support in Azure Machine Learning Service

Learn how Azure ML supports open source ML frameworks and MLflow. Walk through scikit-learn and PyTorch examples that show the built-in support for ML frameworks. Learn More: Azure ML Examples https://aka.ms/AIShow/AzureMLExamples Azure ML Curated Environments https://aka.ms/AIShow/AMLCuratedEnvironments Track and Monitor ML Flow https://aka.ms/AIShow/TrackandMonitorMLFlow Create a Free account (Azure) https://aka.ms/aishow-seth-azurefree Deep […]

TensorFlow

Deploying a Fraud Detection Microservice using TensorFlow, PySpark, and Cortex

The most popular dataset on Kaggle is Credit Card Fraud Detection. It’s an easy-to-understand problem space that impacts just about everyone. Fraud detection is a practical application that many businesses care about. There’s also something intrinsically cool about stopping crime with AI. Here’s an interesting article on how to implement a fraud […]
