Data lakes are notoriously bad at single-record lookups: the kind of query where you are looking for a specific ID among millions of records.

Wouldn’t it be great if we could just pop an index over the top to speed up this type of operation?

Turns out we can!

In this video Simon runs through a quick introduction to using Bloom Filter Indexes with Databricks Delta.
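
If you want a feel for the idea before watching, here is a minimal sketch of a Bloom filter in plain Python. It is not how Databricks implements its indexes; the class, the `files_to_scan` helper, the file names and the sizing parameters are all made up for illustration. The point it shows is that a Bloom filter can say "definitely not here" or "possibly here", so a lookup can skip any file whose filter rules out the ID.

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter (illustrative only, not the Databricks implementation)."""

    def __init__(self, num_bits=1024, num_hashes=3):
        # num_bits and num_hashes are arbitrary illustrative defaults.
        self.num_bits = num_bits
        self.num_hashes = num_hashes
        self.bits = bytearray(num_bits)  # one byte per bit, for simplicity

    def _positions(self, value):
        # Derive num_hashes bit positions by salting SHA-256 with a seed.
        for seed in range(self.num_hashes):
            digest = hashlib.sha256(f"{seed}:{value}".encode("utf-8")).digest()
            yield int.from_bytes(digest[:8], "big") % self.num_bits

    def add(self, value):
        for pos in self._positions(value):
            self.bits[pos] = 1

    def might_contain(self, value):
        # False means "definitely absent"; True means "possibly present".
        return all(self.bits[pos] for pos in self._positions(value))


def files_to_scan(file_filters, record_id):
    """Return the (hypothetical) files whose filter cannot rule out record_id."""
    return [name for name, bf in file_filters.items() if bf.might_contain(record_id)]


# Hypothetical data: two files, each with its own filter over the ID column.
file_filters = {"part-0001": BloomFilter(), "part-0002": BloomFilter()}
for record_id in ("A100", "A101", "A102"):
    file_filters["part-0001"].add(record_id)
for record_id in ("B200", "B201", "B202"):
    file_filters["part-0002"].add(record_id)

candidates = files_to_scan(file_filters, "B201")
```

The file that really holds `B201` is always in `candidates`, because a Bloom filter never gives a false negative. Other files are usually skipped, but a false positive can occasionally let one through, which costs an unnecessary read rather than a wrong answer.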

