
All very good and useful points. One additional thing to mention: because a data lake(house) queries the raw data directly, performance is fundamentally worse, even if a lot of the marketing material will tell you otherwise. In practice it is usually significantly worse than if your data were in a columnar database.

Depending on your use case, this may or may not be a problem. For most companies, I'd wager that it is a bigger problem than it first appears.


