For today’s data engineering teams, the demand for real-time, accurate data has never been higher, yet broken pipelines and stale dashboards are an all-too-common reality. So, how can we break this vicious cycle and achieve reliable data?
Just like our software engineering counterparts 20 years ago, data teams in the early 2020s are facing a significant challenge: reliability.
Companies are ingesting more and more operational and third-party data than ever before. Employees from across the business are interacting with data at all stages of its lifecycle, including those on non-data teams. …
As companies increasingly leverage data to power digital products, drive decision making, and fuel innovation, understanding the health and reliability of these most critical assets is fundamental. For decades, organizations have relied on data catalogs to power data governance. But is that enough?
Debashis Saha, VP of Engineering at AppZen, formerly at eBay and Intuit, and Barr Moses, CEO and Co-founder of Monte Carlo, discuss why data catalogs aren’t meeting the needs of the modern data stack, and how a new approach — data discovery — is needed to better facilitate metadata management and data reliability.
Monte Carlo, the data observability company, today announced the launch of the Monte Carlo Data Observability Platform, the first end-to-end solution to prevent broken data pipelines. Monte Carlo’s solution delivers the power of data observability, giving data engineering and analytics teams the ability to solve the costly problem of data downtime.
As businesses increasingly rely on data to drive better decision making and maintain their competitive edge, it’s mission-critical that this data is accurate and trustworthy. Today, companies spend upwards of $15 million annually tackling data downtime, in other words, periods of time where data is missing, broken, or otherwise…
Companies spend upwards of $15 million annually tackling data downtime, in other words, periods of time where data is missing, broken, or otherwise erroneous, and over 88 percent of U.S. businesses have lost money as a result of data quality issues.
Fortunately, there’s hope in the next frontier of data engineering: data observability. Here’s how the data engineering team at Blinkist, a book-summarizing subscription service, increases cost savings, collaboration, and productivity with data observability at scale.
With over 16 million users worldwide, Blinkist helps time-strapped readers fit learning into their lives through their ebook subscription service.
Gopi Krishnamurthy, Director of…
Your company wants to build a data mesh. Great! Now what? Here’s a quick primer to get you started — and prevent your data infrastructure from turning into a hot mesh.
Since the early 2010s, microservice architectures have been adopted by companies far and wide (think: Uber, Netflix, and Airbnb, among others) as the software paradigm du jour, sparking discussion among engineering teams about the pros and cons of domain-oriented design.
Now, in 2021, you’d be hard-pressed to find a data engineer whose team isn’t debating whether or not to migrate from a monolithic architecture to a decentralized data mesh.
In this article series, we walk through how you can create your own data observability monitors from scratch, mapping to five key pillars of data health. Part I can be found here.
Part II of this series was adapted from Barr Moses and Ryan Kearns’ O’Reilly training, Managing Data Downtime: Applying Observability to Your Data Pipelines, the industry’s first-ever course on data observability. The associated exercises are available here, and the adapted code shown in this article is available here.
As the world’s appetite for data increases, robust data pipelines are all the more imperative. When data breaks — whether…
Meet the data leaders charting a path forward for reliable data in the New Year
Your Marketing Analytics team uses data to inform their email campaigns; your product managers leverage insights about user behavior to prioritize the development of new features; and even your Operations team relies on data to develop growth strategies.
Unfortunately, most companies fail to realize its full potential due to an all-too-common reality for most data teams: data downtime. Data downtime, in other words, periods of time when data is inaccurate, unreliable, or otherwise erroneous, spares no one. It manifests in broken pipelines, stale dashboards, and…
How we’re charting a new path forward for data trust and reliability
In 2021, data is your company’s most critical asset.
As data pipelines become increasingly complex and companies ingest more and more data, it’s paramount that this data is reliable. After talking to hundreds of data teams over the past few years, I was struck by the fact that organizations were investing millions of dollars and strategic energy in data, but decision makers and others on the frontlines couldn’t use it or didn’t trust it. There had to be a better way.
In 2021, it’s not just about having the “modern data stack.” It’s about having a modern approach to working with your data. Here’s how we get there.
Over the past few weeks, I’ve been having lots of conversations with some of the world’s best data teams about their 2021 priorities. With many teams are focused on upgrading or scaling out existing infrastructure, two “resolutions” have really stuck out to me:
Unlike so many others, these two priorities are decidedly not technical, speaking to the need not…
By Ryan Kearns and Barr Moses
In this article series, we walk through how you can create your own data observability monitors from scratch, mapping to five key pillars of data health. Part 1 of this series was adapted from Barr Moses and Ryan Kearns’ O’Reilly training, Managing Data Downtime: Applying Observability to Your Data Pipelines, the industry’s first-ever course on data observability. The associated exercises are available here, and the adapted code shown in this article is available here.
From null values and duplicate rows, to modeling errors and schema changes, data can break for many reasons. Data testing…