Data lakes are notoriously bad at single-record lookups, the kind of query where you are looking for one specific ID among millions of records.
Wouldn’t it be great if we could just pop an index over the top to speed this type of operation up?
Turns out we can!
In this video, Simon runs through a quick introduction to using Bloom filter indexes with Databricks Delta.
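
To give a feel for why a Bloom filter helps with this kind of lookup, here is a minimal, self-contained Python sketch. It is not how Databricks Delta implements its index; the file names, the `BloomFilter` class and the `candidate_files` helper are all hypothetical and exist only for illustration. The idea it shows is that a small filter kept per data file can answer "this ID is definitely not in here" or "this ID might be in here", so a lookup only has to open the files that might contain the record.

```python
import hashlib
import math


class BloomFilter:
    """Toy Bloom filter: can give false positives, never false negatives."""

    def __init__(self, expected_items: int, fpp: float = 0.01):
        # Standard sizing formulas for the number of bits and hash functions.
        self.num_bits = max(
            1, math.ceil(-expected_items * math.log(fpp) / (math.log(2) ** 2))
        )
        self.num_hashes = max(
            1, round(self.num_bits / expected_items * math.log(2))
        )
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, value: str):
        # Derive all bit positions from one SHA-256 digest (double hashing).
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big") | 1
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, value: str) -> None:
        for pos in self._positions(value):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def might_contain(self, value: str) -> bool:
        return all(
            self.bits[pos // 8] & (1 << (pos % 8))
            for pos in self._positions(value)
        )


# Hypothetical data files, each holding a batch of record IDs.
files = {
    "part-0000": [f"id-{n}" for n in range(0, 1000)],
    "part-0001": [f"id-{n}" for n in range(1000, 2000)],
    "part-0002": [f"id-{n}" for n in range(2000, 3000)],
}

# Build one small filter per file, as an index would at write time.
file_filters = {}
for name, ids in files.items():
    bloom = BloomFilter(expected_items=len(ids), fpp=0.01)
    for record_id in ids:
        bloom.add(record_id)
    file_filters[name] = bloom


def candidate_files(record_id: str) -> list:
    """Return only the files whose filter says the ID might be present."""
    return [
        name
        for name, bloom in file_filters.items()
        if bloom.might_contain(record_id)
    ]


print(candidate_files("id-1500"))
```

The file that really holds the ID is always in the result; occasionally another file sneaks in as a false positive, at a rate controlled by `fpp`. That trade-off, a small amount of extra reading in exchange for a very compact index, is what makes the approach attractive for needle-in-a-haystack queries.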