Running complex aggregations and analytical functions on real-time operational databases is a powerful capability in Azure SQL. In the last part of this three-part series with Silvano Coriani, we will see how Window Functions can be a great tool to express analytical calculations on real-time data sets.
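To make the idea concrete, here is a minimal sketch of the kind of analytical calculation window functions enable. The table and column names are invented for illustration, and SQLite (3.25+) stands in for Azure SQL since the `OVER` clause syntax is shared:

```python
import sqlite3

# Illustrative only: a per-region running total over "operational" order data.
# The same SUM(...) OVER (PARTITION BY ... ORDER BY ...) clause works in
# Azure SQL; SQLite needs version 3.25+ for window-function support.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, "EU", 100.0), (2, "EU", 200.0), (3, "US", 50.0), (4, "US", 150.0)],
)

# Unlike GROUP BY, the window aggregate keeps every detail row while
# attaching the aggregate -- handy for analytics over live operational data.
rows = conn.execute(
    """
    SELECT id, region, amount,
           SUM(amount) OVER (PARTITION BY region ORDER BY id) AS running_total
    FROM orders
    ORDER BY id
    """
).fetchall()
for row in rows:
    print(row)
```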

Real-Time Operational Analytics:

In this video Chris Seferlis gives an overview of Azure Synapse Link, a newer feature of the Azure Synapse Analytics suite of tools.

Find out why this feature is important, how it moves operational data into analytical stores, and what you can then do with it.

In this session from Ignite 2020, learn how you can build real time BI dashboards with deep granularity using Azure Synapse and Azure Cosmos DB.

Chris Seferlis discusses one of the lesser-known and newer data services in Azure: Data Explorer.

If you’re looking to run extremely fast queries over large sets of log and IoT data, this may be the right tool for you. He also discusses where it’s not a replacement for Azure Synapse or Azure Databricks, but works nicely alongside them in the overall architecture of the Azure Data Platform.

In BlueGranite’s recent webinar, you will see several examples of Python in action for data modeling and visualization in Power BI. You will also learn where and how Python fits into a Power BI development workflow.

You’ll also see how to balance Python with native Power BI functionality and determine what limitations must be considered when using Python in Power BI.

Are you looking to gain more control of the costs of operating your growing Azure HDInsight clusters?

Are you interested in driving higher utilization of your Azure HDInsight clusters?

If so, watch this video to learn how the Azure HDInsight Autoscale feature can help you achieve higher cost efficiency.

Time Index:

  • [00:33] Introduction
  • [01:00] Customer challenge
  • [01:55] Static clusters
  • [03:40] The solution: HDInsight Autoscale
  • [06:00] Demo of deployment
  • [09:05] Demo of load-balanced scaling
  • [10:00] Demo of scheduled scaling
  • [11:20] Validating cluster size
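For orientation, load-based autoscale is configured on the worker-node role of the cluster’s compute profile. The snippet below is a rough, illustrative fragment; the property names follow the HDInsight ARM template schema as I understand it (schedule-based scaling replaces `capacity` with a `recurrence` block), so verify against the current Azure documentation before use:

```json
{
  "name": "workernode",
  "targetInstanceCount": 4,
  "autoscale": {
    "capacity": {
      "minInstanceCount": 3,
      "maxInstanceCount": 10
    }
  }
}
```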

PowerPoint Designer uses machine learning to offer users redesigned slides that maximize engagement and visual appeal.

Up to 4.1 million Designer slides are created daily and the Designer team is adding new types of content continuously.

Time Index:

  • [02:39] Demo – PowerPoint suggests design ideas to help users build memorable slides effortlessly
  • [03:28] A behind-the-scenes look at how PowerPoint was built to make intelligent design recommendations
  • [04:47] AI focused on intelligently cropping images in photos and centering the objects, positioning the images, and even using multi-label classifiers to determine the best treatment.
  • [06:00] How PowerPoint is solving for Natural Language Processing (NLP).
  • [07:32] Providing recommendations when image choices don’t meet the users’ needs.
  • [09:30] How Azure Machine Learning helps the dev team scale and increase throughput for data scientists.
  • [11:10] How distributed GPUs help the team work more quickly and run multiple models at once.

Here’s an interesting talk on Dask from AnacondaCon 2018.

Tom Augspurger: Scikit-Learn, NumPy, and pandas form a great toolkit for single-machine, in-memory analytics. Scaling them to larger datasets can be difficult, as you have to adjust your workflow to use chunking or incremental learners. Dask provides NumPy- and pandas-like data containers for manipulating larger-than-memory datasets, and dask-ml provides estimators and utilities for modeling larger-than-memory datasets.

These tools scale your usual workflow out to larger datasets. We’ll discuss some of the challenges data scientists run into when scaling out to larger datasets. We’ll then focus on demonstrations of how dask and dask-ml solve those challenges. We’ll see examples of how dask can expose a cluster of machines to scikit-learn’s built-in parallelization framework. We’ll see how dask-ml can train estimators on large datasets.
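The “chunking or incremental learners” workflow mentioned above is the pattern Dask and dask-ml automate across a cluster. As a plain-Python sketch (names and data are invented for illustration), a statistic can be updated one chunk at a time so the full dataset never has to fit in memory:

```python
# Incremental (chunked) mean: the hand-rolled version of what Dask's
# larger-than-memory containers do for you automatically.
def chunked_mean(chunks):
    total, count = 0.0, 0
    for chunk in chunks:          # each chunk is a small, in-memory list
        total += sum(chunk)
        count += len(chunk)
    return total / count

# Simulate a larger-than-memory dataset as a lazy stream of chunks.
def chunk_stream(n_chunks=1000, chunk_size=100):
    for i in range(n_chunks):
        yield [float(i % 10)] * chunk_size

print(chunked_mean(chunk_stream()))  # values 0..9 repeated evenly -> 4.5
```

Incremental learners (e.g. estimators with a `partial_fit` method) follow the same shape: the model state plays the role of the running total, updated per chunk.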

Learn what Spark is and how to use it in Big Data Clusters.

Time Index:

  • [00:00] Introduction
  • [00:30] One-sentence definition of Spark
  • [00:47] Storing Big Data
  • [01:44] What is Spark?
  • [02:35] Language choice
  • [03:27] Unified compute engine
  • [04:57] Spark with SQL Server
  • [05:47] Learning more
  • [06:10] Wrap-up