Opinion

How to set expectations around data quality and reliability for your company

Image courtesy of Yevgenij_D on Shutterstock, available for use by author with Standard License.

For today’s data engineering teams, the demand for real-time, accurate data has never been higher, yet broken pipelines and stale dashboards are an all-too-common reality. So, how can we break this vicious cycle and achieve reliable data?

Just like our software engineering counterparts 20 years ago, data teams in the early 2020s are facing a significant challenge: reliability.

Companies are ingesting more operational and third-party data than ever before. Employees from across the business, including those on non-data teams, are interacting with data at all stages of its lifecycle. …


Opinion

Why we need to rethink our approach to metadata management and data governance

Image courtesy of Andrey_Kuzmin on Shutterstock

As companies increasingly leverage data to power digital products, drive decision making, and fuel innovation, understanding the health and reliability of these most critical assets is fundamental. For decades, organizations have relied on data catalogs to power data governance. But is that enough?

Debashis Saha, VP of Engineering at AppZen, formerly at eBay and Intuit, and Barr Moses, CEO and Co-founder of Monte Carlo, discuss why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach — data discovery — is needed to better facilitate metadata management and data reliability.

It’s no secret: knowing…


Image courtesy of Monte Carlo

Monte Carlo, the data observability company, today announced the launch of the Monte Carlo Data Observability Platform, the first end-to-end solution to prevent broken data pipelines. Monte Carlo’s solution delivers the power of data observability, giving data engineering and analytics teams the ability to solve the costly problem of data downtime.

As businesses increasingly rely on data to drive better decision making and maintain their competitive edge, it’s mission-critical that this data is accurate and trustworthy. Today, companies spend upwards of $15 million annually tackling data downtime, that is, periods of time when data is missing, broken, or otherwise…


Introducing a better metric for calculating the cost of bad data at your company

Image courtesy of Barr Moses.

To quote a friend, “Building your data stack without factoring in data quality is like buying a Ferrari but keeping it in the garage.”

In this article, guest columnist Francisco Alberini, Product Manager at Monte Carlo, introduces a better way to measure the cost of bad data at your company.

Last week, I was on a Zoom call with Lina, a Data Product Manager at one of our larger customers who oversees their data quality program.

Her team is responsible for maintaining thousands of data pipelines that populate many of the company’s most business-critical tables. Reliable and trustworthy data is…


The Definitive Guide

5 essential steps for troubleshooting data quality issues in your pipelines

Image courtesy of Monte Carlo.

This guest post was written by Francisco Alberini, Product Manager at Monte Carlo and former Product Manager at Segment.

Data pipelines can break for a million different reasons, and there isn’t a one-size-fits-all approach to understanding how or why. Here are five critical steps data engineers must take to conduct root cause analysis for data quality issues.

While I can’t know for sure, I’m confident many of us have been there.

I’m talking about the frantic late afternoon Slack message that looks like:


Data Downtime Interview

A conversation with Cindi Howson on what it takes to achieve data democratization at scale.

We sat down with Cindi Howson, Chief Data Strategy Officer at ThoughtSpot, the leading search and AI-driven analytics platform, for a wide-ranging conversation about her daily work, common challenges organizations face on the road to data democratization, and diversity in data science.

Over the past few decades, the world of data analytics has undergone a transformation from a siloed entity to a cross-functional powerhouse. Now, in 2021, in this decade of data, the time is ripe for yet another sea change, this time in the form of data democratization and accessibility.

Paving the way forward for this new movement towards actionable…


Why we need a distributed approach to data governance and metadata management

Image courtesy of Jason Leung on Unsplash.

Over the past few years, data lakes have emerged as a must-have for the modern data stack. But while the technologies powering our access and analysis of data have matured, the mechanics behind understanding and trusting this data in distributed environments have lagged behind.

Here’s where data discovery can help ensure your data lake doesn’t turn into a data swamp.

One of the first decisions data teams must make when building a data platform (second only perhaps to “why are we building this?”) is whether to choose a data warehouse or lake to power storage and compute for their analytics.


How to trust your data workflows, one pipeline at a time

Image courtesy of Paul Skorupskas on Unsplash.

As a new or aspiring data engineer, there are some essential technologies and frameworks you should know. How to build a data pipeline? Check. How to clean, transform, and model your data? Check. How to prevent broken data workflows before you get that frantic call from your CEO about her missing data? Maybe not.

By leveraging best practices from our friends in software engineering and developer operations (DevOps), we can think more strategically about tackling the “good pipelines, bad data” problem. For many, this approach incorporates observability, too.

Jesse Anderson, managing director of Big Data Institute and author of Data…


Introducing a new approach to building secure and scalable data products

Image courtesy of authors.

By Lior Gavish, co-founder and CTO, Monte Carlo, Kevin Stumpf, co-founder and CTO, Tecton, and Barr Moses, co-founder and CEO, Monte Carlo.

As founders of companies that build solutions designed to help teams deliver on the promise of data, we knew we wanted to build great products that are easy to deploy and manage for our customers.

We also knew that since we would be integrating with our customers’ data stacks, we would need to offer the highest level of security and compliance. The question was: how are we going to build them? SaaS? On-prem? Something else?

To meet these…


How to turn your company’s data analytics into more than just a corporate buzzword

Image courtesy of Ashley Jurius on Unsplash.

This article was co-written by Barr Moses, CEO and co-founder of Monte Carlo, and Jit Papneja, a global insights & analytics leader at several Fortune 100 companies, including The Coca-Cola Company, Johnson & Johnson, and Nestlé.

You’re on Snowflake and Looker? Great. But for most companies, having a cloud data stack is just the tip of the iceberg when it comes to operationalizing their data and analytics at scale. We share five non-obvious roadblocks businesses face when becoming data driven and highlight what some of the industry’s leading data engineering and analytics teams are doing to overcome them.

In 2021…

Barr Moses

Co-Founder and CEO, Monte Carlo (www.montecarlodata.com). @BM_DataDowntime #datadowntime
