In this episode we speak with Paul Singman Developer Advocate at Treeverse / LakeFS. LakeFS is an open source project  that allows you to transform your object storage into a Git-like repository. 

Top 3 takeaways

  • LakeFS enables use cases like debugging to quickly view historical versions of your data at a specific point in time and running ML experiments over the same set of data with branching..
  • The current data landscape is very fragmented with many tools available.. Over the coming years there will most likely be consolidation of tools that are more open and integrated. 
  • Data quality and observability continue to be key components of successful data lakes and having visibility into job runs.