We are processing petabytes of data every day; only a few kilobytes are relevant. Storage is cheap, but processing is still expensive, especially when users have to wait for it.

Back in the monolithic/RDBMS platform era, we took care not to store too much data and optimized our queries as much as we could. Now, with our denormalized, microservices-based data pipelines, we store everything we can, just in case.

As we design our next-gen data platforms, we need to think deeply about the consumption layer and provide well-defined maps (aka APIs) that let users quickly discover the data treasures. Data and product thinking need to converge.

Hi! My name is Leo Celis. I’m an entrepreneur and Python developer specializing in Ad Tech and MarTech.
