
Kafka Streams 101: Getting Started
- Frank
- September 13, 2022
- apache kafka
- API
- confluent
- confluent cloud
- Data
- data in motion
- database
- Developers
- event streaming
- event-driven architecture
- Hadoop
- introduction to kafka streams
- Kafka
- kafka streams
- kafka streams architecture
- kafka streams for beginners
- kafka streams for dummies
- kafka streams fundamentals
- kafka streams introduction
- kafka tutorial
- ksqldb
- kstream
- kstreams
- messaging queue
- Microservices
- Open Source
- real-time
- Stream Processing
- STREAMING
- streams
- what is kafka streams
To understand Kafka Streams, you have to begin with Apache Kafka®, a distributed, scalable, elastic, and fault-tolerant event streaming platform. The storage nodes in Kafka, brokers, are just instances of the Kafka storage layer process running on your laptop or server. At the heart of each broker is a log, an append-only file that holds […]
Clean Your Data Swamp by Migrating Off of Hadoop
In this session, learn how to quickly supplement your on-premises Hadoop environment with a simple, open, and collaborative cloud architecture that enables you to generate greater value with scaled application of analytics and AI on all your data. You will also learn five critical steps for a successful migration to the Databricks Lakehouse Platform along […]
Unboxing Spark Standalone Architecture
- Frank
- January 22, 2021
- apache hadoop
- apache hive
- Apache Spark
- apache spark architecture
- Big Data
- big data in tamil
- big data training
- Data Engineering
- Hadoop
- hadoop ecosystem
- hadoop framework
- hadoop in tamil
- hadoop overview
- hadoop training
- scala wordcount
- spark architecture
- spark big data
- spark data frames
- spark in tamil
- spark rdd
- spark standalone cluster
- Spark Streaming
- spark vs hadoop
- spark wordcount
- what is big data
- what is hadoop
Big Data Engineering closely examines the Spark Standalone architecture. Apache Spark has a well-defined layered architecture in which all the Spark components and layers are loosely coupled. This architecture is further integrated with various extensions and libraries. The Apache Spark architecture is based on two main abstractions: the Resilient Distributed Dataset (RDD) and the Directed Acyclic Graph (DAG).
Kafka + Spark Streaming + Hive Example
Davis Busteed walks us through building a proof of concept for Spark Streaming from a Kafka Source to Hive. Check out the README and resource files at https://github.com/dbusteed/kafka-spark-streaming-example
Banking in Latin America: Planning for Hybrid Cloud | Part 2
- Frank
- July 21, 2020
- AKS
- Application Gateway
- architecture
- Azure
- Azure Front Door
- Azure in the Enterprise
- Azure Kubernetes Service
- azure networking
- Cloud Native
- cluster
- Deployment
- Developers
- dns
- Fernando Mejia
- Gateway Subnet
- Hadoop
- Hub
- Hybrid Cloud
- Hybrid Networking
- IP
- Kubernetes
- kubernetes cluster
- Lyle Dodge
- machines
- Microservices
- Microsoft
- Microsoft Azure
- network
- Network Peering
- Nodes
- On-Premises
- Pod
- Spoke topology
- sql database
- Virtual Firewalls
- Virtual Network
- VNET
- VPN
- Workloads
In this second part of the episode, Fernando Mejia walks through everything you need to plan for in a hybrid cloud architecture for Azure Kubernetes Service. This includes IP address concerns from on-premises to Azure, hub-and-spoke topology, and the different options you have in Azure Kubernetes Service. Watch Part 1. Learn more: https://azure.microsoft.com/en-us/overview/kubernetes-on-azure
Introduction to Azure Databricks
Ayman El-Ghazali recently presented this introduction to Databricks from the perspective of a SQL DBA at the NoVA SQL Users Group. Code available at: https://github.com/thesqlpro/blog Come learn about the following topics: basics of how Spark works; basics of how Databricks works (cluster setup, basic […]
The Modern Data Warehouse in Azure – Data Processing
In this video, Chris Seferlis continues discussing the Modern Data Platform in Azure with Part 3: Data Processing. Tools discussed:
- Azure Data Factory Data Flows – https://docs.microsoft.com/en-us/azure/data-factory/concepts-data-flow-overview
- Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
- Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
- Azure Marketplace – https://azuremarketplace.microsoft.com/en-us/marketplace/
Modern Data Warehousing Part 1: Azure Data Ingestion Options
In this video, Chris Seferlis discusses stage one of the Modern Data Warehouse process: Data Ingestion. Related links:
- Azure Data Factory – https://azure.microsoft.com/en-us/services/data-factory/
- Azure Databricks – https://azure.microsoft.com/en-us/services/databricks/
- Azure HDInsight – https://azure.microsoft.com/en-us/services/hdinsight/
- Azure Synapse – https://azure.microsoft.com/en-us/services/synapse-analytics/
- Azure Data Box – https://azure.microsoft.com/en-us/services/databox/
- Event Hubs – https://azure.microsoft.com/en-us/services/event-hubs/
- Kafka on HDInsight – https://docs.microsoft.com/en-us/azure/hdinsight/kafka/apache-kafka-introduction
- IoT Hub – https://azure.microsoft.com/en-us/services/iot-hub/
Big Data Cluster High Availability
In this video, learn about the high availability options available for the mission-critical services running within SQL Server Big Data Clusters. Find out more here:
https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-high-availability?view=sql-server-ver15&WT.mc_id=dataexposed-c9-niner
https://docs.microsoft.com/en-us/sql/big-data-cluster/deployment-high-availability-hdfs-spark?view=sql-server-ver15&WT.mc_id=dataexposed-c9-niner
Introduction to Azure Data Lake Storage Gen 2
Data Lake Storage Gen 2 is the best storage solution for big data analytics in Azure. With its Hadoop-compatible access, it is a perfect fit for existing platforms such as Databricks, Cloudera, Hortonworks, Hadoop, HDInsight, and many more. Take advantage of both blob storage and data lake in one service! In this video, Azure 4 […]