For today’s data engineering teams, the demand for real-time, accurate data has never been higher, yet broken pipelines and stale dashboards are an all-too-common reality. So, how can we break this vicious cycle and achieve reliable data?
Just like our software engineering counterparts 20 years ago, data teams in the early 2020s are facing a significant challenge: reliability.
Companies are ingesting more operational and third-party data than ever before. Employees from across the business, including those on non-data teams, are interacting with data at all stages of its lifecycle. …
As companies increasingly leverage data to power digital products, drive decision making, and fuel innovation, understanding the health and reliability of these most critical assets is fundamental. For decades, organizations have relied on data catalogs to power data governance. But is that enough?
Debashis Saha, VP of Engineering at AppZen, formerly at eBay and Intuit, and Barr Moses, CEO and Co-founder of Monte Carlo, discuss why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach — data discovery — is needed to better facilitate metadata management and data reliability.
Monte Carlo, the data observability company, today announced the launch of the Monte Carlo Data Observability Platform, the first end-to-end solution to prevent broken data pipelines. Monte Carlo’s solution delivers the power of data observability, giving data engineering and analytics teams the ability to solve the costly problem of data downtime.
As businesses increasingly rely on data to drive better decision making and maintain their competitive edge, it’s mission-critical that this data is accurate and trustworthy. Today, companies spend upwards of $15 million annually tackling data downtime, in other words, periods of time where data is missing, broken, or otherwise…
Data platforms have made data more accessible and actionable than ever before — assuming you can trust it. Here’s how the Data Engineering team at Auto Trader built a data platform with both decentralized data ownership and reliability in mind.
Manchester-based Auto Trader is the largest digital automotive marketplace in the United Kingdom and Ireland. For Auto Trader, connecting millions of buyers with thousands of sellers involves an awful lot of data.
The company sees 235 million advertising views and 50 million cross-platform visitors per month, with thousands of interactions per minute — all data points the Auto Trader team…
When your customers are the first to know about data gone wrong, their trust in your data — and your company — is damaged. Learn how the data engineering team at logistics company Optoro faced this challenge head-on, reclaiming trust and valuable time with data observability at scale.
Washington, DC-based Optoro has an admirable mission: to make retail more sustainable by eliminating all waste from returns. They provide return technology and logistics for leading retailers like IKEA, Target, and Best Buy, helping increase profitability while reducing environmental waste through recommerce, or finding the “next best home” for returned items.
Duplicate data sets or stale models can cause unintended (but severe) consequences that data monitoring alone can’t catch or prevent. The solution? Observability.
Barr Moses, CEO and co-founder of Monte Carlo, and Aparna Dhinakaran, CPO and co-founder of Arize AI, discuss how observability differs from traditional monitoring and why it’s necessary for building more trustworthy and reliable data products.
Garbage in, garbage out. It’s a common saying among data and ML teams for good reason — but in 2021, it’s no longer sufficient.
Data (and the models it powers) can break at any point in the pipeline, and it’s not…
Your team just migrated to Snowflake. Your CTO is all in on this “modern data stack,” or as she calls it: “The Enterprise Data Discovery.” But as any data engineer will tell you, not even the best tools will save you from broken pipelines.
In fact, you’ve probably been on the receiving end of schema changes gone bad, duplicate tables, and one too many null values on more occasions than you wish to remember.
The good news? When it comes to managing data quality in your Snowflake environment, there are a few steps data teams can take to understand the health of your…
As data systems become increasingly distributed and companies ingest more and more data, the opportunity for error (and incidents) only increases. For decades, software engineering teams have relied on a multi-step process to identify, triage, resolve, and prevent issues from taking down their applications.
As data operations mature, it’s time we treat data downtime (periods of time when data is missing, inaccurate, or otherwise erroneous) with the same diligence, particularly when it comes to building more reliable and resilient data pipelines.
Many data leaders tell us that their data scientists and engineers spend 40 percent or more of their time tackling data issues instead of working on projects that actually move the needle.
It doesn’t have to be this way. Here’s how the data engineering team at Resident, a house of direct-to-consumer furnishings brands, reduced their data incidents by 90% with data observability at scale.
Direct-to-consumer mattress brands may not be the first category that comes to mind when discussing data-driven companies. But Daniel Rimon, Head of Data Engineering at Resident, credits their investment in technology, data, and marketing with their…
To quote a friend, “Building your data stack without factoring in data quality is like buying a Ferrari but keeping it in the garage.”
Last week, I was on a Zoom call with Lina, a Data Product Manager at one of our larger customers who oversees their data quality program.
Her team is responsible for maintaining thousands of data pipelines that populate many of the company’s most business-critical tables. Reliable and trustworthy data is…