Each advertising channel has its own data model. They all seem to share the “campaign > ad group > ad” hierarchy, but the attributes and stats differ from channel to channel.

Tired of running ALTER TABLE statements in relational databases, developers saw NoSQL and its schemaless (or only partially enforced) schemas as a good opportunity to store whatever they wanted, without caring about entity models.

These decisions come from the endless need to use the latest technologies, without stopping for a second to ask whether we are using the right tool for the job.

All systems have clear entities, required attributes, and relationships. Attributes might be added (e.g., Facebook launching a metric) or removed (e.g., Facebook retiring a metric).
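To make the point concrete, here is a minimal sketch of the shared hierarchy in Python. The class and field names (Campaign, AdGroup, Ad, metrics, total) are made up for illustration and are not taken from any channel's API. The entities and relationships are fixed, while the per-channel stats sit in a free-form mapping so a metric can appear or disappear without touching the model.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Ad:
    ad_id: str
    name: str
    # Channel-specific stats vary and change over time, so they live in a
    # free-form mapping instead of fixed fields (hypothetical design).
    metrics: Dict[str, float] = field(default_factory=dict)


@dataclass
class AdGroup:
    ad_group_id: str
    name: str
    ads: List[Ad] = field(default_factory=list)


@dataclass
class Campaign:
    campaign_id: str
    channel: str
    name: str
    ad_groups: List[AdGroup] = field(default_factory=list)

    def total(self, metric: str) -> float:
        """Sum one metric across every ad, counting a missing metric as 0."""
        return sum(
            ad.metrics.get(metric, 0.0)
            for group in self.ad_groups
            for ad in group.ads
        )
```

If a channel retires a metric, the ads that never reported it simply contribute zero to the total, and the campaign, ad group, and ad classes stay exactly as they were.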

Choosing Elasticsearch over PostgreSQL because it is new, or because you don’t have to worry about your data model, is dead wrong.

NoSQL was invented for one single purpose: performance (a lightweight database). Support for unstructured data was a design choice made in service of that performance. It was all about how the data is stored and retrieved, and about reducing the overhead of the relational model.

If you need to store and aggregate/analyze large volumes of data (which is the main use case in the ad tech industry), then NoSQL becomes a good choice.

You can still keep your old and dusty MySQL for storing structured data. Use a schemaless alternative only if you have a good use case for it.
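One way to picture that split is sketched below. It is an illustration under assumptions, not a recommended schema: Python's built-in sqlite3 stands in for MySQL or PostgreSQL so the example runs anywhere, and every table, column, and function name is hypothetical. The stable entities get ordinary relational tables with foreign keys, and only the volatile per-channel stats are stored as a JSON document.

```python
import json
import sqlite3


def build_db() -> sqlite3.Connection:
    """Create an in-memory database with a fixed campaign > ad group > ad schema."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(
        """
        CREATE TABLE campaign (
            id INTEGER PRIMARY KEY,
            channel TEXT NOT NULL,
            name TEXT NOT NULL
        );
        CREATE TABLE ad_group (
            id INTEGER PRIMARY KEY,
            campaign_id INTEGER NOT NULL REFERENCES campaign(id),
            name TEXT NOT NULL
        );
        CREATE TABLE ad (
            id INTEGER PRIMARY KEY,
            ad_group_id INTEGER NOT NULL REFERENCES ad_group(id),
            name TEXT NOT NULL
        );
        -- Only the stats are schemaless: one JSON document per ad per day.
        CREATE TABLE ad_stats (
            ad_id INTEGER NOT NULL REFERENCES ad(id),
            day TEXT NOT NULL,
            metrics TEXT NOT NULL,
            PRIMARY KEY (ad_id, day)
        );
        """
    )
    return conn


def record_stats(conn: sqlite3.Connection, ad_id: int, day: str, metrics: dict) -> None:
    """Store whatever metrics the channel reported for one ad on one day."""
    conn.execute(
        "INSERT INTO ad_stats (ad_id, day, metrics) VALUES (?, ?, ?)",
        (ad_id, day, json.dumps(metrics)),
    )


def campaign_total(conn: sqlite3.Connection, campaign_id: int, metric: str) -> float:
    """Sum one metric over every ad in a campaign, counting a missing metric as 0."""
    rows = conn.execute(
        """
        SELECT s.metrics
        FROM ad_stats AS s
        JOIN ad ON ad.id = s.ad_id
        JOIN ad_group AS g ON g.id = ad.ad_group_id
        WHERE g.campaign_id = ?
        """,
        (campaign_id,),
    )
    return sum(json.loads(doc).get(metric, 0) for (doc,) in rows)
```

With this layout, a new Facebook metric is just a new key in the JSON document and needs no ALTER TABLE, while the database still rejects stats for an ad that does not exist. At larger volumes the stats table is the part you would consider moving to a dedicated store.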

Leo Celis