Metadata hub prepares company for digital transformation

Knowledge workers spend 30 percent of their workweek looking for and recreating information that they know already exists, according to analyst firm Gartner.

The leadership of Bacon Tree's (BTC) client has invested in managing its content, notably with a digital asset management (DAM) system, to mitigate the considerable cost of not being able to find information. The DAM uses metadata to help point users to the content they need.

Metadata is a type of data that adds context that computers (and people) use to find things. It can also improve workflows, decrease errors, increase productivity, centralize data administration and be a foundation for innovation.

BTC's client has a multiplicity of data stores, and their connections evolved in an organic, narrowly focused manner. This is common across organizations because organic growth tends to be economical and solves problems quickly and efficiently.

Organic growth without a larger vision in mind — or without the time to implement a larger vision — often leads to a web of data stores with many connections that support existing functionality. As growth continues, it becomes more difficult to make those connections work together efficiently.

For BTC's client, this means complicated workflows that include pushing data from system to system, data mistakes that can be corrected only by database administrators, duplicating tasks in several systems, narrow views into business processes, and an inability to serve customers with desired functionality.

The answer is managing data in a more efficient way. This is especially important because, according to Gartner, “by 2020 the greatest source of competitive advantage for 30 percent of organizations will come from the workforce’s ability to creatively exploit digital technology.”

To solve this issue, BTC is implementing a metadata management hub with a search and discovery tool that will save employees time, increase productivity, and be the backbone of digital products.

Daisy-chained data is an evolutionary mess

BTC's client’s systems often don't talk to one another. Instead, they pass information through one or two systems so a third system can use it. Think of passing a note in class where it goes from the person on your left to the person on their left until it reaches the intended recipient. Then it traverses the same path to get back to you.

Adding more metadata to this setup lengthens the daisy chain, so we specifically did not want to add more data stores.

Such data issues are not unique to BTC's client, but how others have addressed the issue is instructive.

The advent of big data, brought about by advances in data storage, has taxed the traditional relational database management system (DBMS) setup, which struggles to handle large volumes and varieties of data with speed.

As a consequence, we've adjusted how we handle relational data – making components of the system independent and interchangeable. This includes storing data in a lake, the purpose of which is to hold a variety of data that is usable on demand. This differs from traditional DBMSs, which store and describe data very precisely – and make it very difficult to ask questions that you did not anticipate when setting up the system.

Data lakes, by contrast, let you define the data when you need to ask questions. The idea is that you can ask more and different questions than you can in a traditional DBMS.
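To make "define the data when you need to ask questions" concrete, here is a minimal sketch of reading raw records with a schema chosen at query time. The records, field names, and the read_with_schema helper are hypothetical and invented for illustration; they do not describe the client's systems.

```python
import json

# Hypothetical raw records, kept exactly as they arrived (as a lake would).
RAW_RECORDS = [
    '{"id": "t1", "title": "Algebra I", "format": "print", "price": "89.50"}',
    '{"id": "t2", "title": "Biology", "format": "ebook"}',
    '{"id": "t3", "title": "Chemistry", "format": "ebook", "price": "42.00"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select fields and convert types on demand.

    `schema` maps a field name to a converter; missing fields become None.
    """
    for line in raw_lines:
        record = json.loads(line)
        yield {
            field: (convert(record[field]) if field in record else None)
            for field, convert in schema.items()
        }


# Question 1: which formats do we hold? Only title and format are needed.
formats = list(read_with_schema(RAW_RECORDS, {"title": str, "format": str}))

# Question 2, not planned for when the data was stored: average listed price.
prices = [
    row["price"]
    for row in read_with_schema(RAW_RECORDS, {"price": float})
    if row["price"] is not None
]
average_price = sum(prices) / len(prices)
```

Nothing about the stored records changes between the two questions; only the schema passed in at read time does. The trade-off described next follows directly: the raw records carry no information about where they came from or who uses them.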

But that comes at a cost. In data lakes, we have almost no way to understand the origin, scope, and usage of data. "As a result, data lake projects typically lack even the most rudimentary information about the data they contain or how it is being used," write the authors of "Ground: A Data Context Service."

This causes two types of problems for end users: poor productivity and governance issues. Data users will have problems finding data because they don't know what exists – and even when they do, it may not use the same vocabulary. Governance issues arise when we don't know the origin of the data. It's difficult to control access when you don't know what you have.

A central metadata hub can provide context about the data's origin, storage, and usage, and creating one provides an opportunity to rethink how we track and leverage data.

An increasing number of organizations agree with this view and are creating a central way to ask questions of their data while it continues to live in the original/appropriate repository, as if the data “telecommutes” to work. You can call it, ask it questions, and send those answers on to other places – all without it leaving “home”.
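The "telecommuting" idea can be sketched as a hub that stores no data of its own, only a registry of how to reach each repository. This is an illustrative toy, not the design of any of the hubs named in this article; the MetadataHub class, the store names, and the record contents are all assumptions.

```python
class MetadataHub:
    """Toy hub: keeps a registry of sources, never a copy of their data."""

    def __init__(self):
        self._sources = {}

    def register(self, name, lookup):
        # `lookup` is any callable that answers a key from its own store.
        self._sources[name] = lookup

    def ask(self, key):
        """Ask every registered source about `key` and collect the answers."""
        answers = {}
        for name, lookup in self._sources.items():
            result = lookup(key)
            if result is not None:
                answers[name] = result
        return answers


# Hypothetical repositories; each one keeps its data "at home".
dam_store = {"title-001": {"cover_image": "cover.tif"}}
rights_store = {"title-001": {"territory": "worldwide"}}

hub = MetadataHub()
hub.register("dam", dam_store.get)
hub.register("rights", rights_store.get)

answers = hub.ask("title-001")
```

Here answers combines what the DAM and the rights store each know about the same title, and either store can be replaced without touching the callers, because callers only ever talk to the hub.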

Netflix, LinkedIn, Airbnb, Uber, Lyft, and Harvard are using metadata hubs to:

●  improve workflows

●  increase productivity

●  power findability (the ability to find what you’re searching for) and discoverability (the ability to find content associated with what you were looking for and to discover answers to questions you didn’t know you needed to ask)

●  centralize administration of data

●  drive innovation

Digital transformation needs metadata

Digital transformation is the process of adding modern IT capabilities to a business, so that it can respond quickly and effectively to changes in the marketplace.

Digital transformation is so important to the continued health of businesses worldwide that it could be worth $18 trillion globally in added business value, and the analyst firm Gartner projects that it could account for 36 percent of total corporate revenue by 2020, according to ITPro.

McKinsey agrees, saying that the five most important factors in making digital transformation successful are quality leadership, capability building, empowering workers, upgrading tools, and effective communication.

The end goal is agility — the ability to respond quickly and effectively to external and internal forces in order to remain a healthy, successful organization.

This is particularly important given the disruption in textbook publishing. The headlines “Pearson Plunges as U.S. Students Shun Textbooks for Online Resources at Rapid Pace” and “How 174 Year Old Pearson Is Developing The Netflix Of Education” show two sides of the disruption: students aren’t paying for traditional print books, and publishers are trying to respond with different digital offerings.

BTC's client is already looking at market differentiation with an eye on giving customers the content they want, in the format they want, when they want it. This business strategy is vital, but it will likely prompt uncertainty inside the company as employees wonder about job security. Leadership from the C-suite is key to moving the company and its employees to a culture that embraces digital transformation.

Digital transformation requires rethinking how a company employs data and technology. A comprehensive Harvard Business Review article on digital transformation asserts that strategy must come before anything else, saying, “Figure out your business strategy before you invest in anything.”

Getting data to talk

Enter a metadata hub that captures the full context of metadata so that we know when and how data is used over time, understand how applications use data, can ask questions of data, and can use data to drive innovation.

This is a near-term goal that we are implementing in two phases. It will yield three significant advantages: early insight into business processes and how we can create models to best serve those processes; decreased maintenance cost, because there will be fewer data transfer processes to maintain; and fewer errors, since data will not be passed back and forth among systems.

The client's current system of systems is a combination of manual steps and programmatic calls from each client system to each source of truth (SoT) provider system. Eliminating this spider web of data dependencies lets us use, add, modify, or delete metadata at its source, and every system that consumes that data receives the update.
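A rough way to see why this matters is to count the connections that have to be maintained. The sketch below assumes the worst case in which every client system calls every SoT provider directly, and compares it with routing everything through one hub. The system counts are hypothetical and are not taken from the client's environment.

```python
def point_to_point_links(clients, providers):
    """Each client system calls each provider system directly."""
    return clients * providers


def hub_links(clients, providers):
    """Each client and each provider connects once, to the hub."""
    return clients + providers


# Hypothetical counts, for illustration only.
clients, providers = 8, 5

direct = point_to_point_links(clients, providers)
via_hub = hub_links(clients, providers)
saved = direct - via_hub
```

With these assumed numbers, 40 direct connections shrink to 13, and adding a new client system costs one more connection instead of five. Real environments rarely reach the full worst case, so the actual savings depend on how many of those pairings exist today.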

Growth depends on metadata

BTC's client's continued growth requires a strong metadata portal to accommodate evolving metadata needs. Our proposed knowledge management platform will promote data-informed decision making and democratize data by empowering employees to discover data assets, understand their provenance, use them with ease, and find insights into customer behavior.