Here’s an interesting documentary on how scientists use big data to monitor the earth.
With the rapid emergence of digital devices, an unstoppable, invisible force is changing human lives in incredible ways. That force is data. We generated more data in 2017 than in all the previous 5,000 years of human history.
The massive gathering and analysis of data in real time is allowing us to address some of humanity’s biggest challenges, but as Edward Snowden and the release of NSA documents have shown, the accessibility of all this data comes at a steep price.
This documentary captures the promise and peril of this extraordinary knowledge revolution.
Frank Kane, creator of many great Udemy courses, explains Kafka in his trademark clear and straightforward manner.
Spark is gaining momentum in the big data space. Watch this video for a demonstration of how you can use your favorite developer tools to debug Spark applications.
Product info: azure.microsoft.com/en-us/services/hdinsight/apache-spark/
Learn more: docs.microsoft.com/en-us/azure/hdinsight/spark/apache-spark-load-data-run-query
Raghav Mohan joins Scott Hanselman to talk about Apache Kafka on HDInsight. HDInsight added the open-source distributed streaming platform last year, completing a scalable big data streaming scenario on Azure.
Kafka is capable of processing millions of events per second and petabytes of data per day, powering scenarios such as Toyota’s connected car, Office 365’s clickstream analytics, and fraud detection for large banks.
Find out how to deploy managed, cost-effective Kafka clusters on Azure HDInsight with a 99.9% SLA, using just four clicks or pre-created ARM templates.
For more information, see:
- Announcing Apache Kafka for Azure HDInsight General Availability
- Apache Kafka for HDInsight
- Start with Apache Kafka on HDInsight (docs)
- Use Apache Kafka with Storm on HDInsight (docs)
- Apache Spark streaming (DStream) example with Kafka on HDInsight (docs)
- Analyze logs for Apache Kafka on HDInsight (docs)
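To put the throughput figures quoted for Kafka above into perspective, here is a rough back-of-the-envelope sketch in plain Python. The event rate of 1 million per second and the average event size of 1 KB are assumptions picked for illustration, not numbers from the video, and the helper names are hypothetical.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds in a day


def events_per_day(events_per_second):
    """Total events in one day at a constant event rate."""
    return events_per_second * SECONDS_PER_DAY


def bytes_per_day(events_per_second, avg_event_bytes):
    """Total bytes in one day, given a constant rate and average event size."""
    return events_per_day(events_per_second) * avg_event_bytes


# Assumed workload (illustrative only): 1 million events/sec, 1 KB per event.
assumed_rate = 1_000_000
assumed_event_bytes = 1_000

daily_events = events_per_day(assumed_rate)
daily_bytes = bytes_per_day(assumed_rate, assumed_event_bytes)

print(f"events per day:    {daily_events:,}")
print(f"terabytes per day: {daily_bytes / 1e12:.1f}")
```

Under these assumptions, a million events per second works out to tens of terabytes per day, so reaching a petabyte per day would take either a higher event rate or larger events.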
Deep learning and AI are fundamentally changing the way data is used in computation. They enable computing capabilities that will transform almost every industry, scientific domain, and public usage of data and compute.
The recent success of deep learning algorithms can be seen as the culmination of decades of progress in three areas: research in deep learning algorithms, broad availability of big data infrastructure, and the massive growth of computing power produced by Moore’s law and the advent of parallel compute architectures.
Deep learning has been employed successfully in such diverse areas as healthcare, transportation, industrial IoT, finance, entertainment, and retail, in addition to high-performance computing.
Examples shown in this video illustrate how the approach works and how it complements high-performance data analytics and traditional business intelligence.
Here is a great overview of Hadoop for the beginner.
Hadoop is most often associated with big data.
This video looks at the different Hadoop solutions, such as Cloudera, Hortonworks, MapR, and Intel.
Hear Pythian’s CTO Alex Gorbachev discuss these tools and the overall Hadoop ecosystem.
Mike Olson, Chief Strategy Officer and Co-Founder at Cloudera, explains Apache Spark’s origins, its rise in popularity in the open source community, and how Spark is primed to replace MapReduce as the general processing engine in Hadoop.
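If the MapReduce model mentioned above is new to you, the following is a minimal single-process sketch of its three phases (map, shuffle, reduce) applied to a word count. It is plain Python for illustration only; it does not use the Hadoop or Spark APIs, and all function names are hypothetical.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Run the three phases in sequence on an in-memory list of lines."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


counts = word_count(["big data big", "data streams"])
print(counts)
```

In a real cluster, the map and reduce steps run in parallel across many machines and the shuffle moves data between them over the network. This sketch only shows the shape of the computation.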