Opinion

How to set expectations around data quality and reliability for your company

Image courtesy of Yevgenij_D on Shutterstock, available for use by author with Standard License.

For today’s data engineering teams, the demand for real-time, accurate data has never been higher, yet broken pipelines and stale dashboards are an all-too-common reality. So, how can we break this vicious cycle and achieve reliable data?

Just like our software engineering counterparts 20 years ago, data teams in the early 2020s are facing a significant challenge: reliability.

Companies are ingesting more operational and third-party data than ever before. Employees from across the business are interacting with data at all stages of its lifecycle, including those on non-data teams. …


Opinion

Why we need to rethink our approach to metadata management and data governance

Image courtesy of Andrey_Kuzmin on Shutterstock

As companies increasingly leverage data to power digital products, drive decision making, and fuel innovation, understanding the health and reliability of these most critical assets is fundamental. For decades, organizations have relied on data catalogs to power data governance. But is that enough?

Debashis Saha, VP of Engineering at AppZen, formerly at eBay and Intuit, and Barr Moses, CEO and Co-founder of Monte Carlo, discuss why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach — data discovery — is needed to better facilitate metadata management and data reliability.

It’s no secret: knowing…


Image courtesy of Monte Carlo

Monte Carlo, the data observability company, today announced the launch of the Monte Carlo Data Observability Platform, the first end-to-end solution to prevent broken data pipelines. Monte Carlo’s solution delivers the power of data observability, giving data engineering and analytics teams the ability to solve the costly problem of data downtime.

As businesses increasingly rely on data to drive better decision making and maintain their competitive edge, it’s mission-critical that this data is accurate and trustworthy. Today, companies spend upwards of $15 million annually tackling data downtime, in other words, periods of time when data is missing, broken, or otherwise…
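To make the idea of data downtime concrete, here is a minimal sketch of the kind of freshness and volume check that observability tooling automates across every table. It is not Monte Carlo’s implementation; the orders table, the loaded_at column, the six-hour freshness SLA, and the 1,000-row daily volume floor are all illustrative assumptions.

# Minimal, hypothetical sketch of a "data downtime" check: is the table fresh,
# and did the expected volume of rows arrive? Table name, column name, and
# thresholds are assumptions for illustration, not any product's defaults.
import sqlite3
from datetime import datetime, timedelta, timezone

FRESHNESS_SLA = timedelta(hours=6)   # assumed SLA: new rows at least every 6 hours
MIN_DAILY_ROWS = 1_000               # assumed volume floor for a healthy day

def check_table_health(conn: sqlite3.Connection, table: str) -> list:
    """Return human-readable data downtime alerts (empty list if healthy).

    Assumes `table` has a loaded_at column storing UTC ISO-8601 timestamps,
    and that the table name comes from trusted configuration.
    """
    alerts = []
    now = datetime.now(timezone.utc)

    # Freshness: how long since the most recent record landed?
    (last_loaded,) = conn.execute(f"SELECT MAX(loaded_at) FROM {table}").fetchone()
    if last_loaded is None:
        alerts.append(f"{table}: no rows at all")
    elif now - datetime.fromisoformat(last_loaded) > FRESHNESS_SLA:
        alerts.append(f"{table}: stale, last row loaded at {last_loaded}")

    # Volume: did enough rows arrive in the last 24 hours?
    (recent_rows,) = conn.execute(
        f"SELECT COUNT(*) FROM {table} WHERE loaded_at >= ?",
        ((now - timedelta(days=1)).isoformat(),),
    ).fetchone()
    if recent_rows < MIN_DAILY_ROWS:
        alerts.append(f"{table}: only {recent_rows} rows in the last 24h")

    return alerts

if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (id INTEGER, loaded_at TEXT)")
    print(check_table_health(conn, "orders"))  # both freshness and volume alerts fire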


Data Mesh Case Study

One startup’s lessons learned building a data mesh — and data-driven culture — from scratch.

Image courtesy of Dan Geurts on Unsplash.

Berlin-based Kolibri Games has had a wild ride, rocketing from a student housing-based startup in 2016 to a headline-making acquisition by Ubisoft in 2020.

While a lot has changed in five years, one thing has always remained the same: the company’s commitment to building an insights-driven culture. With a new release almost every week, their mobile games are constantly changing and producing enormous amounts of data — handling 100 million events per day across 40 different event types, some with hundreds of triggers.

Along the…


Opinion

Data catalogs are having an identity crisis. Here’s why.

Image courtesy of Erica Zhou on Unsplash.

It seems like every time I refresh my Twitter feed, a new startup launches “the world’s greatest data catalog ever.” And that’s exciting!

If a company is able to build the best catalog since sliced bread, the data world will surely breathe a collective sigh of relief. And don’t get me wrong: lots of innovation is happening here and clear advancements are being made. Integrations to support data engineers and software developers working directly in data governance reports and dashboards — check. Data science workbooks to foster greater collaboration — check. ML to support automatic data profiling — check.
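For readers wondering what “automatic data profiling” looks like in its simplest form, here is a minimal sketch using pandas against a made-up DataFrame: it computes per-column null rates, distinct counts, and an example value, the kind of column-level statistics a catalog might surface. The column names and data are assumptions for illustration; production tools compute these at warehouse scale and layer ML on top to learn expected ranges over time.

# Minimal sketch of data profiling: summarize each column so a catalog
# (or a human) can spot completeness and cardinality issues at a glance.
# The example DataFrame is made up for illustration.
import pandas as pd

def profile(df: pd.DataFrame) -> pd.DataFrame:
    """Return per-column type, completeness, and cardinality statistics."""
    return pd.DataFrame({
        "dtype": df.dtypes.astype(str),
        "null_rate": df.isna().mean().round(3),       # share of missing values
        "distinct_values": df.nunique(dropna=True),   # cardinality
        "example": df.apply(
            lambda col: col.dropna().iloc[0] if col.notna().any() else None
        ),
    })

if __name__ == "__main__":
    df = pd.DataFrame({
        "order_id": [1, 2, 3, 4],
        "country": ["DE", "DE", None, "FR"],
        "amount": [10.0, 12.5, 9.9, None],
    })
    print(profile(df))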


How to determine whether a data quality solution is right for you

Image courtesy of Artem Sapegin on Unsplash.

As data pipelines grow more complex, investing in a data quality solution is becoming an increasingly important priority for modern data teams. But should you build it — or buy it?

In this guest post, Stephen Guerguy and Scott O’Leary, Solutions Engineers at Monte Carlo, discuss 4 key challenges, opportunities, and trade-offs with both options.

As companies ingest more and more data and data ecosystems become increasingly complex — from storing unstructured data in data lakes to democratizing access to a greater number of internal consumers — the stakes for data quality have never been higher.

After all, it doesn’t…


There are a lot of technologies you could use to build a data platform — but what do you really need?

Image courtesy of Max Duzij on Unsplash.

One of the most frequent questions we get from customers is “how do I build my data platform?”

For most organizations, building a data platform is no longer a nice-to-have but a need-to-have, with many companies distinguishing themselves from the competition based on their ability to glean actionable insights from their data.

Still, justifying the budget, resources, and timelines required to build a data platform from scratch is easier said than done. Every company is at a different stage in their data journey, making it harder to prioritize what parts of the platform to invest in first. …


One data engineering team’s approach to balancing the needs for a self-service platform with end-to-end data trust.

Image courtesy of Max on Unsplash.

Data platforms have made data more accessible and actionable than ever before — assuming you can trust it. Here’s how the Data Engineering team at Auto Trader built a data platform with both decentralized data ownership and reliability in mind.

Manchester-based Auto Trader is the largest digital automotive marketplace in the United Kingdom and Ireland. For Auto Trader, connecting millions of buyers with thousands of sellers involves an awful lot of data.

The company sees 235 million advertising views and 50 million cross-platform visitors per month, with thousands of interactions per minute — all data points the Auto Trader team…


Case Study

Logistics company Optoro saves 44 hours per week with a self-service approach to data quality and ownership. Here’s how.

Image courtesy of Marcin Jozwiak on Unsplash.

When your customers are the first to know about data gone wrong, their trust in your data — and your company — is damaged. Learn how the data engineering team at logistics company Optoro faced this challenge head-on, reclaiming trust and valuable time with data observability at scale.

Washington, DC-based Optoro has an admirable mission: to make retail more sustainable by eliminating all waste from returns. They provide return technology and logistics for leading retailers like IKEA, Target, and Best Buy, helping increase profitability while reducing environmental waste through recommerce, or finding the “next best home” for returned items.

I…


Opinion

Modern data and machine learning systems need both monitoring and observability. Here’s why.

Image courtesy of the authors.

Duplicate data sets or stale models can cause unintended (but severe) consequences that data monitoring alone can’t catch or prevent. The solution? Observability.

Barr Moses, CEO and co-founder of Monte Carlo, and Aparna Dhinakaran, CPO and co-founder of Arize AI, discuss how observability differs from traditional monitoring and why it’s necessary for building more trustworthy and reliable data products.

Garbage in, garbage out. It’s a common saying among data and ML teams for good reason — but in 2021, it’s no longer sufficient.

Data (and the models it powers) can break at any point in the pipeline, and it’s not…

Barr Moses

Co-Founder and CEO, Monte Carlo (www.montecarlodata.com). @BM_DataDowntime #datadowntime
