The “Data Mess” is almost as old as the installation of a second database within a single organization. Or maybe even older and paper-based.
Many companies, all over the world, have tried to solve the data mess problem for decades with varying degrees of success.
Which is a nice way to say: in many cases with limited or no success.
No matter the many promises of technical silver bullets that were made over the years, like the MPP databases earlier or the Hadoop-based datalakes later, the task of integrating data is still far from being a trivial one.
About 6 months ago I had a chat with a friend, and former Teradata colleague, and he told me he had to discuss the data mesh with the CIO of a large Italian company that was extremely excited about the subject.
Unsurprisingly, given the ripples that this post of Zhamak Dehghani had in the market, in the preceding weeks I had several conversations about the data mesh with my team mates and we are still debating the subject.
I’m writing today because I’m concerned by the fact that the data mesh is perceived, in almost all the conversations I have, as the (new&improved) silver bullet that will finally kill the data mess monster for good.
I think this might be the case. But only as long as the data mesh is not reduced to the technology/architecture part of the solution.
The “data mess” is generated by a combination of shortcomings in 3 key areas:
The data mesh discussions I’ve had so far focus mostly, if not only, on the technical solutions with an unexpressed assumption (or hope) that removing the technical obstacles will be enough to magically fix also people and processes shortcomings.
I guess it might be because a lot of people in IT is more comfortable dealing with technologies than with processes and other people.
Or maybe I am just perceived as too much of a geek for my counterparts to discuss the non-technical aspects of the data mesh with me.
Frankly I hope it’s the latter scenario and the people and processes pillars are being addressed in other streams I’m not part of.
I say this because what the experience in the software quality space taught me is that technologies can facilitate processes, but don’t change them (with a few notable exceptions when packaged ERPs replaced custom solutions ahead of Y2K and many organization in a hurry just had to adapt to processes supported by the ERP they picked).
I also learned that people with enough motivation to do so can ignore, or even hijack, the best processes.
Both the first and the second post of Zhamak Dehghani touch multiple times the process aspects.
Are processes prominently missing only from the conversations I am having and hearing about or is a common pattern?
I tend to think that the people pillar (a.k.a. incentives to embrace a new way of doing things) is still not sorted out, or maybe is even perceived as too hard to approach, in many organizations and for this reason is simply removed from the debate.
I believe that solving the people part of the problem is strongly tied to a real transformation of data into a product rather than just a dump of JSON by-products, that the potential consumer has to figure out how to use, of the organization’s processes.
What incentive is given to the marketing team (or the e-commerce one, or the customer service, or the production lines…) to invest part of their limited budget to produce high-quality. easy to use, data available in the mesh and, maybe, also increase the data value over time?
No ROI, no party.
In the end my answer to the question I asked in the title is:
“Building a data mesh infrastructure without creating effective processes (and the right incentives for individuals and organizations to embrace the new processes) is not going to remove the data mess from the map.”
I am made the same observations. Upstream parts of the value chain often have no direct ROI/incentive to “produce” data good enough for downstream parts of the value chain that have an interest in the data. Hence the teams owning the downstream part of the value chain will always face the challenge regardless of what technology or architectural approach is used to solve the problem.
Pingback: Data Mesh defined | James Serra's Blog
Pingback: Replacing the word “source” with the word “product” is not enough to change the reality of your data. | marcoullasci