We, as marketing engineers, spend most of our time moving data around. We collect, reformat, and save data coming from multiple origins.

For example, Google Trends serves data through a website, and a Python library then stores that data in a dictionary or DataFrame.
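As a rough illustration of the "collect, reformat, save" loop, here is a minimal sketch using only the standard library. The payload shape, keywords, and numbers are made up for the example, and `to_rows` and `to_csv` are hypothetical helper names, not part of any Google Trends library.

```python
import csv
import io

# Hypothetical payload: weekly interest scores keyed by date, shaped like
# the kind of dictionary a trends-style library might hand back.
raw = {
    "2021-01-03": {"marketing automation": 62, "ad tech": 48},
    "2021-01-10": {"marketing automation": 70, "ad tech": 51},
}


def to_rows(payload):
    """Flatten {date: {keyword: score}} into one record per date/keyword."""
    return [
        {"date": date, "keyword": keyword, "interest": score}
        for date, scores in sorted(payload.items())
        for keyword, score in scores.items()
    ]


def to_csv(rows):
    """Serialize the records as CSV text, ready to be saved somewhere."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["date", "keyword", "interest"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = to_rows(raw)
csv_text = to_csv(rows)
```

The flat, one-record-per-row shape is a deliberate choice: it loads cleanly into a database table or a DataFrame later, whichever you end up using.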

If you follow the data trail all the way to the first-party data platform’s databases, you will find that the origin is the users. When you run a search on Google, retweet a tweet, or log in to Facebook, you are generating marketing data.

Each platform has its own mechanisms and data pipelines to collect the data (or the data you allowed them to collect). Eventually, they expose it, aggregated in most cases, to marketing engineers.

Since you don’t know what the platforms are collecting, and engineers have no way to validate the accuracy of the data (because it is anonymized and aggregated), the true ROI of ad spend is a moving target. It rests on trusting the ad platforms.

In the same way that platforms rely on trust to sell you marketing data, you can rely on trust to build an audience. You can become the first-party data collector by providing a platform (like a web or mobile app, or a blog) to your audience, and then validate that the behavior you see from them matches what the ad platforms report.
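One way to picture that validation step is to put your own numbers next to the platform's and flag the gaps. The sketch below is only an assumption about how you might do it: the campaign names, counts, the 10% tolerance, and the `compare_counts` helper are all hypothetical.

```python
def compare_counts(first_party, platform, tolerance=0.10):
    """Flag campaigns where the platform's number drifts from your own.

    Returns {campaign: relative_gap} for every campaign whose reported
    count differs from the first-party count by more than `tolerance`.
    The gap is None when you recorded nothing but the platform reports
    activity, since a relative gap is undefined in that case.
    """
    mismatches = {}
    for campaign, reported in platform.items():
        observed = first_party.get(campaign, 0)
        if observed == 0:
            if reported > 0:
                mismatches[campaign] = None
            continue
        gap = (reported - observed) / observed
        if abs(gap) > tolerance:
            mismatches[campaign] = gap
    return mismatches


# Made-up numbers: conversions you logged yourself vs. what the ad
# platform reports for the same campaigns.
first_party = {"spring_sale": 100, "newsletter": 40}
platform = {"spring_sale": 105, "newsletter": 60}

mismatches = compare_counts(first_party, platform)
```

With these sample numbers, `spring_sale` is within tolerance (5% over) while `newsletter` is flagged (50% over). A flagged campaign is not proof that the platform is wrong; it is a prompt to look at how each side counts.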

Marketing data should originate with your current audience. Once you learn directly from your users what they want, you can validate it against the platforms you are spending ad money on.


Hi! My name is Leo Celis. I’m an entrepreneur and Python developer specialized in Ad Tech and MarTech.
