Opinion

Why we need to rethink our approach to metadata management and data governance

Image courtesy of Andrey_Kuzmin on Shutterstock

As companies increasingly leverage data to power digital products, drive decision making, and fuel innovation, understanding the health and reliability of their most critical assets is fundamental. For decades, organizations have relied on data catalogs to power data governance. But is that enough?

Debashis Saha, VP of Engineering at AppZen, formerly at eBay and Intuit, and Barr Moses, CEO and Co-founder of Monte Carlo, discuss why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach — data discovery — is needed to better facilitate metadata management and data reliability.

It’s no secret: knowing where your data lives and who has access to it is fundamental to understanding its impact on your business. In fact, when it comes to building a successful data platform, it’s critical that your data is organized and centralized while also being easily discoverable. …


Image courtesy of Monte Carlo

Monte Carlo, the data observability company, today announced the launch of the Monte Carlo Data Observability Platform, the first end-to-end solution to prevent broken data pipelines. Monte Carlo’s solution delivers the power of data observability, giving data engineering and analytics teams the ability to solve the costly problem of data downtime.

As businesses increasingly rely on data to drive better decision making and maintain their competitive edge, it’s mission-critical that this data is accurate and trustworthy. Today, companies spend upwards of $15 million annually tackling data downtime (periods when data is missing, broken, or otherwise erroneous), and 1 in 5 companies have lost a customer due to incomplete or inaccurate data. …


3 tactics you should be thinking about

Image courtesy of RT Images on Shutterstock.

For data teams, broken data pipelines, stale dashboards, and 5 a.m. fire drills are par for the course, particularly as data workflows ingest more and more data from disparate sources. Drawing inspiration from software development, we call this phenomenon data downtime — but how can we proactively prevent bad data from striking in the first place?

Recently, one of my customers posed this question:

“How do you prevent data downtime?”

As the Head of Data for a 3,000-person media company, he led a team responsible for serving terabytes of data to hundreds of stakeholders per day. Given the scale and speed at which they were moving, data downtime (periods when data is fully or partially missing, erroneous, or otherwise inaccurate) was an all-too-common occurrence. …
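To make that definition concrete, here is a minimal, hypothetical sketch of the simplest possible downtime detector: it flags a table whose latest daily row count deviates sharply from its recent history. The counts and threshold below are invented for illustration, not drawn from the story above.

```python
from statistics import mean, stdev

def is_anomalous(daily_counts: list[int], threshold: float = 3.0) -> bool:
    """Flag the latest day when its row count sits more than
    `threshold` standard deviations from the historical mean."""
    *history, today = daily_counts
    mu, sigma = mean(history), stdev(history)
    return sigma > 0 and abs(today - mu) / sigma > threshold

# A sudden drop to near-zero rows suggests partially missing data.
print(is_anomalous([980, 1010, 995, 1002, 990, 12]))  # True
```

Production observability tools watch many such signals (freshness, volume, schema, distribution) automatically, but even this toy check illustrates the shift from reacting to broken dashboards to catching anomalies as they land.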


Introducing a new approach to preventing broken analytics dashboards and increasing trust in your data.

Image courtesy of Shutterstock.

Companies spend upwards of $15 million annually tackling data downtime (periods when data is missing, broken, or otherwise erroneous), and 1 in 5 companies have lost a customer due to incomplete or inaccurate data.

Fortunately, there’s hope in the next frontier of data: observability. Here’s how data engineers and BI analysts at Yotpo, a global eCommerce company, increase cost savings, collaboration, and productivity with data observability at scale.

Yotpo works with eCommerce companies across the world to help them accelerate online revenue growth through reviews, visual marketing, loyalty and referral programs, and SMS marketing.

For Yoav Kamin, Director of Business Performance, and Doron Porat, Data Engineering Team Leader, having consistently accurate and reliable data is foundational to the success of this mission. …


Data is a dynamic, often unstructured entity — when it comes to ensuring data quality, testing alone won’t save your pipelines.

Image courtesy of Michael Shiffer on Unsplash.

In 2021, data testing alone isn’t sufficient for ensuring accurate and reliable data. Just as software engineering teams leverage solutions like New Relic, DataDog, and AppDynamics to monitor the health of their applications, modern data teams require a similar approach to monitoring and observability. Here’s how you can leverage both testing and monitoring to prevent broken data pipelines and achieve high data reliability.
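As a rough sketch of that division of labor (not Monte Carlo’s implementation), the snippet below pairs a test, which runs once per pipeline execution against fixed expectations, with a monitor, which runs on a schedule and checks the table’s freshness. The warehouse, the `orders` table, its `loaded_at` column, and the six-hour threshold are all assumptions for illustration.

```python
import sqlite3
from datetime import datetime, timedelta

conn = sqlite3.connect("warehouse.db")  # stand-in for a real warehouse

def test_no_null_ids() -> None:
    """Test: fail the pipeline run if any order is missing its id."""
    nulls = conn.execute(
        "SELECT COUNT(*) FROM orders WHERE order_id IS NULL"
    ).fetchone()[0]
    assert nulls == 0, f"{nulls} rows with NULL order_id"

def orders_are_fresh(max_lag: timedelta = timedelta(hours=6)) -> bool:
    """Monitor: return False (i.e., alert) when new rows stop arriving."""
    (last_loaded,) = conn.execute(
        "SELECT MAX(loaded_at) FROM orders"  # assumes ISO-8601 text timestamps
    ).fetchone()
    lag = datetime.utcnow() - datetime.fromisoformat(last_loaded)
    return lag <= max_lag
```

The test catches the failure modes you can predict at deploy time; the monitor catches the ones you can’t, which is why the two approaches complement rather than replace each other.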

For most companies, data is the new software.

Like software, data is fundamental to the success of your business. It needs to be “always-on”, with data downtime treated as diligently as application downtime (five nines, anyone?). …


Opinion

Here’s why context is key when it comes to unlocking the value of your metadata.

Image courtesy of Shutterstock.

Last week, I participated in a panel at the Coalesce conference, led by the Fishtown Analytics team (creators of dbt), to discuss the role of metadata in the modern data stack. One of the points we discussed was that metadata, on its own, is useless. In this blog post, I’ll explain why.

Over the last decade, data teams have become increasingly proficient at collecting large quantities of data. While this has the potential to drive digital innovation and more intelligent decision making, it has also inundated companies with data they don’t understand or can’t use.

All too often organizations hungering to become data-driven can’t see the forest for the trees: data without a clear application or use case is nothing more than a file in a database or a column in a spreadsheet. …


A cautionary tale for data teams this holiday season

Image courtesy of Barr Moses.

What happens when a freshness anomaly threatens to ruin Christmas? Turns out, even Santa Claus and his elves aren’t immune to broken data pipelines. With a little help from a classic holiday poem, here’s the story of how data downtime nearly derailed Santa’s workshop and the wise elf who saved the day.

’Twas three nights before Christmas, and in the North Pole

The elves were all frantic, they feared for their roles;

Their prep meeting with Santa was a total disaster,

“This gift list looks wrong!” He chanted louder and faster;

The elves Slacked the data team, “Santa’s freaking out! …


Opinion

So how did we get here? Here are the three main ways data governance is failing us, and how we can fix them.

Image Courtesy of Julia Joppien on Unsplash

Over the past several years, data governance has emerged as more than just a trendy buzzword. With the passage of GDPR, CCPA, and other industry, government, and healthcare compliance measures, data governance has become a corporate necessity, yet many Chief Data Officers cite data governance as a major hurdle for their organization.

So how did we get here? Here are the three main reasons why data governance is failing us:

1. A manual approach is no longer practical.

While we’ve made great advancements in areas such as self-service analytics, cloud computing, and data visualization, we’re not there yet when it comes to governance. Many companies continue to enforce data governance through manual, outdated, and ad hoc tooling. Data teams spend days manually vetting reports, setting up custom rules, and comparing numbers side by side. …
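To illustrate the kind of manual tooling described above (every table and rule name here is hypothetical), each check is a hand-written query, and each new dataset means another rule to write, vet, and maintain:

```python
import sqlite3

conn = sqlite3.connect("warehouse.db")

# Each rule is bespoke: a query that must return zero rows to pass.
RULES = {
    "revenue_non_negative": "SELECT 1 FROM payments WHERE amount < 0",
    "email_present": "SELECT 1 FROM users WHERE email IS NULL",
}

for name, sql in RULES.items():
    violations = len(conn.execute(sql).fetchall())
    status = "PASS" if violations == 0 else f"FAIL ({violations} rows)"
    print(f"{name}: {status}")
```

A handful of these rules is manageable; hundreds of tables times dozens of columns is not, and that scaling wall is what pushes teams toward automated approaches.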


How the $6.4b unicorn prevents broken data pipelines

Image courtesy of Avi Waxman on Unsplash.

Compass, a $6.4b real estate unicorn, works with real estate agents, buyers, and sellers to support the entire buying and selling workflow through their real estate technology platform.

Compass uses data to help fulfill their mission “to help everyone find their place in the world” and to fuel their remarkable growth by creating advanced technology products for these three audiences. The Compass user experience is powered by a complex set of technologies, including a powerful mobile app, an agent-facing Marketing Center, intelligent pricing tools, and a proprietary CRM.

Behind the scenes, Compass employs a data infrastructure and intelligence team of 10 people. …


5 essential steps for accelerating the adoption of data at your company

Image courtesy of Natali_Mis on Shutterstock.

In the last decade, we’ve figured out how to track, collect, store, and query data, but we haven’t yet mastered how to ensure that data can actually be trusted and used.

Dror Engel, Head of Product, Data Services, at eBay, and Barr Moses, CEO and co-founder of Monte Carlo, share five essential steps for establishing a successful data quality strategy.

Recently, a Chief Data Officer (CDO) at a leading financial services company told us that his data organization processed and stored hundreds of thousands of jobs (up to one terabyte of data) per day. …

About

Barr Moses

Co-Founder and CEO, Monte Carlo (www.montecarlodata.com). @BM_DataDowntime #datadowntime
