The Data Layer is, I believe, a well-known concept these days. At the same time, there is no widely accepted convention for using it.

Most of the time we choose from the following tools that already include their own implementation/specification:

Besides these tools, there was a worthy effort by a fairly large committee, including well-known names from the analytics community, that resulted in the Customer Experience Digital Data Layer 1.0 specification (CEDDL), published under the umbrella of the World Wide Web Consortium (W3C). You might expect the vendors to try to comply with such an effort, but it just didn’t happen. Let me share my ideas about why it didn’t happen and why I see space for some new ideas and improvement.

Ten Things I Hate About …

Google Tag Manager’s Way

I don’t really like the idea that the Data Layer is tightly integrated into the Tag Management Solution.

The reasons are:

Other issues that make me uncomfortable:

Tealium

In this case the integration of the Data Layer into the TMS is not so tight. Basically, the recommendation is a global JavaScript object called utag_data.

The reasons why I hate this one are pretty simple:

  • Tealium recommends using a flat structure for their Data Layer.

Reasonable flattening wouldn’t be such a terrible idea, because traversing deeply nested objects in JavaScript is simply a painful experience. If you want fail-safe code, you have to test whether an object has a particular child attribute before you try to access it. So when you ask for digitalData.page.pageInfo.pageId (I didn’t make this up – this is a recommendation of the aforementioned W3C specification), the code might look like this:

var digitalData = digitalData || {}; // make sure it exists, but don't override it
if (typeof digitalData.page !== "undefined") {
  if (typeof digitalData.page.pageInfo !== "undefined") {
    if (typeof digitalData.page.pageInfo.pageId !== "undefined") {
      console.log(digitalData.page.pageInfo.pageId);
    }
  }
}

In a real-world scenario, you would probably write some sort of function to traverse the object, but basically the logic above needs to happen.
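Such a traversal function could look roughly like the sketch below. The name getPath and its dot-separated path argument are my own assumptions for illustration; they come neither from the W3C specification nor from any TMS.

```javascript
// Hypothetical helper: getPath is an illustrative name, not part of any TMS
// or of the W3C specification.
function getPath(obj, path, fallback) {
  var keys = path.split(".");
  var current = obj;
  for (var i = 0; i < keys.length; i++) {
    // Stop as soon as a level is missing instead of throwing a TypeError.
    if (current === null || typeof current !== "object" || !(keys[i] in current)) {
      return fallback;
    }
    current = current[keys[i]];
  }
  return current;
}

var digitalData = { page: { pageInfo: { pageId: "home" } } };

console.log(getPath(digitalData, "page.pageInfo.pageId"));         // home
console.log(getPath(digitalData, "page.category.primary", "n/a")); // n/a
```

The nested checks are still there, they just live in one place instead of being repeated at every read.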

On the other hand, if you try to fight this problem by flattening the Data Layer object unreasonably, you end up with an unreadable, error-prone solution like the following (this is part of the official Tealium documentation):

var utag_data = {
  "site_region": "eu",
  "site_currency": "EUR",
  "page_name": "order",
  "page_type": "order",
  "page_section_name": "Men´s",
  "page_subcategory_name": "Sports",
  "page_category_name": "Clothing",
  "product_id": ["12", "13"],
  "product_sku": ["1234", "5678"],
  "product_name": ["Shorts", "Socks"]
}

See how the product data is spread across the Data Layer: not very readable. I can imagine that checking at a glance that the attributes of product 1 are the right ones, and likewise for product 2, product 3, and so on, is a real nightmare (if you have an array of products, let it be an array of objects with further details, right?).
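To make the difference concrete, here is a minimal sketch that turns the parallel product arrays into an array of product objects. The helper toProducts is hypothetical, and I assume that every key starting with product_ holds one value per product in the same order.

```javascript
// Product keys taken from the utag_data example above.
var utag_data = {
  "product_id": ["12", "13"],
  "product_sku": ["1234", "5678"],
  "product_name": ["Shorts", "Socks"]
};

// Hypothetical helper: collects every "product_*" array into one array of objects.
function toProducts(flat) {
  var prefix = "product_";
  var products = [];
  Object.keys(flat).forEach(function (key) {
    if (key.indexOf(prefix) !== 0 || !Array.isArray(flat[key])) {
      return; // not one of the parallel product arrays
    }
    var attribute = key.slice(prefix.length);
    flat[key].forEach(function (value, index) {
      products[index] = products[index] || {};
      products[index][attribute] = value;
    });
  });
  return products;
}

var products = toProducts(utag_data);
console.log(JSON.stringify(products[1])); // {"id":"13","sku":"5678","name":"Socks"}
```

With this shape each product can be checked on its own, and an array that is one item short shows up as a missing attribute on a single object instead of silently shifting the data.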

W3C CEDDL 1.0

You already saw the example I gave in the Tealium section to rant about deeply nested JavaScript objects. Right now I can’t imagine a website so complicated that it would need such a level of abstraction to introduce a clear naming convention for all aspects of the measured application.

The only thing I like about this specification is this section:

6.13 Extending the Specification

Extending this specification is straightforward: implementers can add appropriate sub-objects or properties as needed.

The way I read this paragraph is simple: you don’t need to use our bloated specification, and you can introduce your own way of describing the objects you wish to measure. I like this approach very much. It lets me do it my way while still following the standard.

… so do you just hate every single solution? Well, no – there is also Adobe Dynamic Tag Management

Once upon a time, there was a new player in the TMS field called Satellite. TLDR: Adobe bought it and made it an integral part of their own solution – Adobe Marketing Cloud.

I like this tool very much, because it doesn’t actually have a full-blown Data Layer of its own. This is convenient, because I can take the best parts of the other solutions and combine them into my own Data Layer. The TMS itself has an interesting concept of Data Elements, which can read data from various sources that are available to the TMS on the website (basically via the DOM).

So as a best practice, I would recommend reading data from a CEDDL-like Data Layer JavaScript object and, only if there is a really good reason to do so, reading data from the page itself. Good reasons are:

  • You need it just for a limited time or as a proof-of-concept.
  • It is insanely expensive to add new data to the Data Layer on the server side.
  • The data is available in a reliable format (e.g. a data attribute or a meta tag attribute).
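The reading order recommended above (Data Layer first, the page itself only as a fallback) might be sketched like this. Everything here is an assumption for illustration: readPageId is not an Adobe DTM API, and the meta name page-id and the attribute data-page-id are made-up examples.

```javascript
// Hypothetical helper, not an Adobe DTM API. `doc` is the browser `document`
// (or any object with a compatible querySelector, which keeps this testable).
function readPageId(dataLayer, doc) {
  // 1. Preferred source: the Data Layer object.
  if (dataLayer && dataLayer.page && dataLayer.page.pageInfo &&
      typeof dataLayer.page.pageInfo.pageId !== "undefined") {
    return dataLayer.page.pageInfo.pageId;
  }
  // 2. Fallback: a meta tag, e.g. <meta name="page-id" content="home">.
  var meta = doc.querySelector('meta[name="page-id"]');
  if (meta) {
    return meta.getAttribute("content");
  }
  // 3. Fallback: a data attribute, e.g. <body data-page-id="home">.
  var tagged = doc.querySelector("[data-page-id]");
  return tagged ? tagged.getAttribute("data-page-id") : undefined;
}

// In the browser this would be called as:
//   readPageId(window.digitalData, document);
```

Keeping the fallbacks in one function makes it easy to delete them once the value finally lands in the Data Layer.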

The resolution?

The ideal Data Layer (and I am very interested in whether you agree with me on this):

  • is independent from the TMS
  • is transparent, so you can easily check what messages it received and what data it holds
  • is reasonably nested, so it describes reality precisely but in the simplest possible way
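To show what I mean by these properties, here is a minimal sketch of such a Data Layer. The names createDataLayer, push, subscribe, getState and getMessages are hypothetical, and the sketch assumes that all values are JSON-serializable. It is one possible shape, not a finished design.

```javascript
// Hypothetical sketch: createDataLayer and its method names are illustrative.
function createDataLayer(initialState) {
  // Copy the initial state so the caller's object is never modified.
  var state = JSON.parse(JSON.stringify(initialState || {}));
  var messages = [];  // every message received, in order (transparency)
  var listeners = []; // a TMS, or anything else, can subscribe (independence)

  return {
    push: function (message) {
      messages.push(message);
      // Shallow merge: top-level keys of the message overwrite the state.
      Object.keys(message).forEach(function (key) {
        state[key] = message[key];
      });
      listeners.forEach(function (listener) {
        listener(message, state);
      });
    },
    subscribe: function (listener) {
      listeners.push(listener);
    },
    getState: function () {
      return JSON.parse(JSON.stringify(state)); // a copy, so readers cannot mutate it
    },
    getMessages: function () {
      return messages.slice();
    }
  };
}

var dataLayer = createDataLayer({ page: { name: "order" } });
dataLayer.subscribe(function (message) {
  console.log("received", JSON.stringify(message));
});
dataLayer.push({ event: "purchase", products: [{ id: "12" }, { id: "13" }] });

console.log(dataLayer.getMessages().length); // 1
console.log(dataLayer.getState().event);     // purchase
```

The TMS is just one subscriber here, so swapping it for another tool would not touch the Data Layer itself.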

There are still a lot of questions, though.

  • Is the Data Layer a static thing or should it provide some other functionality?
  • How should we approach asynchronous events, when updates in the Data Layer need to be synchronized with a notification to the TMS that something should be measured? In other words – who is in control, the Data Layer itself or some other logic that just reads it?
  • How about caching the previously captured values?
  • How about recursive merge? Sometimes it helps a lot, sometimes it causes confusion.
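The recursive merge question is easiest to see in a small example. The deepMerge helper below is a hypothetical sketch (plain objects are merged recursively, everything else is replaced), not the behaviour of any particular TMS.

```javascript
// Hypothetical helpers for illustration only.
function isMergeable(value) {
  return value !== null && typeof value === "object" && !Array.isArray(value);
}

function deepMerge(target, source) {
  Object.keys(source).forEach(function (key) {
    if (isMergeable(source[key]) && isMergeable(target[key])) {
      deepMerge(target[key], source[key]); // descend into nested objects
    } else {
      target[key] = source[key]; // primitives and arrays are replaced
    }
  });
  return target;
}

var state = { page: { name: "product", type: "detail" }, user: { id: "42" } };

// Helpful: only the changed attribute has to be sent.
deepMerge(state, { page: { name: "cart" } });

// Confusing: page.type still says "detail", although we are on the cart page now.
console.log(state.page.name, state.page.type); // cart detail
```

A shallow merge would have dropped the stale page.type, but it would also force every update to resend the whole page object. That trade-off is why I am not sure which behaviour should be the default.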

As you can see, there are more questions and ideas than solutions or recommendations. That is because I wanted to share my opinions and provoke some discussion. So I welcome any feedback.
