AI Open Source Python Red Hat

Accelerating AI with Python-native Ray and the Importance of Open Source in AI

On this episode of Data Driven, I explore distributed computing frameworks for AI and ML workloads. I also discuss advances in Ray, a newer Python-based technology, with performance gains that could range from 10-12 times faster to thousands of times faster in extreme cases. We delve into the […]

Read More
Big Data Data IoT

Kafka Streams 101: Getting Started

To understand Kafka Streams, you have to begin with Apache Kafka®, a distributed, scalable, elastic, and fault-tolerant event streaming platform. The storage nodes in Kafka, brokers, are just instances of the Kafka storage layer process running on your laptop or server. At the heart of each broker is a log, an append-only file that holds […]
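
The append-only log described above can be illustrated with a minimal sketch. This is a toy in-memory model, not Kafka's actual API or storage format; the class `AppendOnlyLog` and its methods are hypothetical names chosen for illustration. It shows only the core idea: records are added at the end, each gets a sequential offset, and reading does not remove anything.

```python
from dataclasses import dataclass, field


@dataclass
class AppendOnlyLog:
    """Toy in-memory stand-in for a broker's log (hypothetical, not Kafka's API)."""

    _records: list = field(default_factory=list)

    def append(self, key, value):
        """Add a record at the end of the log and return its offset."""
        self._records.append((key, value))
        return len(self._records) - 1

    def read(self, offset, max_records=10):
        """Return up to max_records records starting at offset, without removing them."""
        return self._records[offset:offset + max_records]


log = AppendOnlyLog()
first = log.append("user-1", "login")
second = log.append("user-2", "click")

# Two independent readers can start from different offsets and see the same data.
from_start = log.read(0)
from_second = log.read(second)
```

Because nothing is deleted on read, any number of consumers can replay the log from whichever offset they choose.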

Read More
Big Data Databricks

Clean Your Data Swamp by Migrating Off of Hadoop

In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along […]

Read More

Unboxing Spark Standalone Architecture

Big Data Engineering closely examines Spark Standalone Architecture. Apache Spark has a well-defined layered architecture in which all the Spark components and layers are loosely coupled. This architecture is further integrated with various extensions and libraries. Apache Spark Architecture is based on two main abstractions: the Resilient Distributed Dataset (RDD) and the Directed Acyclic Graph (DAG).
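
To make the two abstractions concrete, here is a rough sketch in plain Python. It does not use Spark; `ToyRDD` is a hypothetical class that only mimics the idea that transformations are recorded as a lineage and nothing is computed until an action such as `collect()` runs. A linear chain like this is the simplest possible DAG; real Spark plans can branch and merge.

```python
class ToyRDD:
    """Hypothetical miniature of an RDD: records lineage, computes only on collect()."""

    def __init__(self, source, lineage=()):
        self._source = source
        self._lineage = lineage  # ordered tuple of (kind, function) steps

    def map(self, fn):
        # Transformation: return a new ToyRDD with one more step; no work is done yet.
        return ToyRDD(self._source, self._lineage + (("map", fn),))

    def filter(self, fn):
        # Transformation: also lazy, and the original ToyRDD is left unchanged.
        return ToyRDD(self._source, self._lineage + (("filter", fn),))

    def collect(self):
        # Action: replay the recorded steps over the source data.
        data = iter(self._source)
        for kind, fn in self._lineage:
            data = map(fn, data) if kind == "map" else filter(fn, data)
        return list(data)


numbers = ToyRDD(range(6))
even_squares = numbers.map(lambda x: x * x).filter(lambda x: x % 2 == 0)
result = even_squares.collect()  # [0, 4, 16]
```

Since each `ToyRDD` keeps its full lineage, a lost result can be rebuilt by replaying the steps from the source, which is the intuition behind the "resilient" in RDD.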

Read More
Big Data

Kafka + Spark Streaming + Hive Example

Davis Busteed walks us through building a proof of concept for Spark Streaming from a Kafka Source to Hive. Check out the README and resource files at https://github.com/dbusteed/kafka-spark-streaming-example 

Read More
Azure Containers FinTech

Banking in Latin America: Planning for Hybrid Cloud | Part 2

In this second part of the series, Fernando Mejia walks through everything you need to plan for in a Hybrid Cloud architecture for Azure Kubernetes Service. This includes IP address concerns from on-premises to Azure, hub-and-spoke topology, and the different options available in Azure Kubernetes Service. Watch Part 1. Learn more: https://azure.microsoft.com/en-us/overview/kubernetes-on-azure

Read More
Azure Databricks

Introduction to Azure Databricks

Ayman El-Ghazali recently presented this Introduction to Databricks from the perspective of a SQL DBA at the NoVA SQL Users Group. Code available at: https://github.com/thesqlpro/blog Come learn about the following topics: basics of how Spark works; basics of how Databricks works (cluster setup, basic […]

Read More
Azure Data

The Modern Data Warehouse in Azure – Data Processing

In this video, Chris Seferlis continues discussing the Modern Data Platform in Azure with Part 3: Data Processing. Tools discussed:
Azure Data Factory Data Flows – https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
Azure Marketplace – https://azuremarketplace.microsoft.com/en-us/marketplace/

Read More
Azure Data Data Warehouse

Modern Data Warehousing Part 1: Azure Data Ingestion Options

In this video, Chris Seferlis discusses stage one of the Modern Data Warehouse Process: Data Ingestion. Related links:
Azure Data Factory – https://azure.microsoft.com/en-us/services/data-factory/
Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
Azure Synapse – https://azure.microsoft.com/en-us/services/synapse-analytics/
Azure Data Box – https://azure.microsoft.com/en-us/services/databox/
Event Hubs – https://azure.microsoft.com/en-us/services/event-hubs/
Kafka on HDInsight – https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-introduction
IoT Hub – https://azure.microsoft.com/en-us/services/iot-hub/

Read More