We are processing petabytes of data every day; only a few kilobytes are relevant. Storage is cheap, but processing is still expensive, especially when the users need to wait for it.
Back in the monolithic/RDBMS platform era, we cared about not to store too much data and optimizing the queries as much as we could. Now with our de-normalized, microservices-based data pipelines, we are storing everything we can just in case.
As we design our next-gen data platforms, we need to think deeply about the consumption layer and have well-defined maps (aka APIs) where users can quickly discover the data treasures. Data and product thinking need to converge.
Latest posts by Leo Celis (see all)
- The AI Automation Engineer - 05/13/25
- Hire One Developer to Press One Key - 05/06/25
- Stop Consuming AI and Start Riding AI - 04/29/25