Citigroup Fined $136M for Bad Data. What Can We Learn?

Barr Moses
2 min read · Jul 31, 2024

Bad data isn’t just a headache — it’s a huge financial risk.

As data powers more of the world's mission-critical services — and the data and systems surrounding it become more complex in the process — data quality becomes non-negotiable.

Note: I didn't say "nice to have." In 2024, data quality isn't open for discussion; it's a clear and present risk demanding our attention.

Citigroup learned this lesson last week when regulators fined the company $136M for failing to make sufficient progress on a critical data quality initiative. And that's before you consider the impact on share price.

So, the obvious question now is…how do you avoid the same fate?

It's no revelation that incentives and KPIs drive good behavior. Sales compensation plans are scrutinized so closely that they often become board-meeting topics. What if we gave the same attention to data quality scorecards?

Even in their heyday, traditional data quality scorecards from the Hadoop era were rarely wildly successful. I know this because prior to starting Monte Carlo, I spent years as an operations VP trying to create data quality standards that drove trust and adoption.

Here are four key lessons for building data quality scorecards that I've seen make the difference between success and failure:

  1. Know what data matters: the best way to determine this is to talk to the business. Get close to stakeholders early and often to understand what matters most to them.
  2. Measure the machine: measure the components in the production and delivery of data that generally result in high quality. This often includes the six dimensions of data quality (validity, completeness, consistency, timeliness, uniqueness, accuracy), as well as usability, documentation, lineage, usage, system reliability, schema, and average time to fix. (See the sketch after this list for what scoring a few of these dimensions can look like.)
  3. Gather your carrots and sticks: the best approach I've seen is a minimum set of requirements for data to be onboarded onto the platform (stick) and a much more stringent set of requirements for data to be certified at each level (carrot).
  4. Automate evaluation and discovery: almost nothing in data management succeeds without some degree of automation and self-service. The most common ways I've seen this done are with data observability and quality solutions, and with data catalogs.
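
To make lessons 2 and 3 concrete, here's a minimal sketch of what scoring and certifying a single table might look like in Python. The dimension checks, the 0.90 onboarding minimum, the 0.99 certification bar, and the tier names are all illustrative assumptions on my part — not a standard and not any particular product's API. A real scorecard would cover far more dimensions and run continuously.

```python
# Illustrative scorecard sketch. The dimensions, checks, and tier
# thresholds below are assumptions for demonstration purposes only.
import pandas as pd

def score_table(df: pd.DataFrame, key_cols: list[str],
                freshness_col: str, max_age_days: int = 1) -> dict[str, float]:
    """Return a 0-1 score for a few data quality dimensions of one table."""
    scores = {}
    # Completeness: fraction of non-null cells across the table.
    scores["completeness"] = float(df.notna().mean().mean())
    # Uniqueness: fraction of rows that are not duplicates on the key columns.
    scores["uniqueness"] = float(1.0 - df.duplicated(subset=key_cols).mean())
    # Timeliness: fraction of rows updated within the freshness window.
    age = pd.Timestamp.now(tz="UTC") - pd.to_datetime(df[freshness_col], utc=True)
    scores["timeliness"] = float((age <= pd.Timedelta(days=max_age_days)).mean())
    return scores

def certify(scores: dict[str, float]) -> str:
    """Map dimension scores to a tier: the stick is a minimum onboarding
    bar, the carrot is a much stricter certification bar."""
    if min(scores.values()) < 0.90:                   # stick: below the minimum
        return "rejected"
    if sum(scores.values()) / len(scores) >= 0.99:    # carrot: stringent bar
        return "certified"
    return "onboarded"

# Example: one missing value keeps this table below the certification
# bar but above the onboarding minimum, so it lands in the middle tier.
df = pd.DataFrame({
    "order_id": [1, 2, 3, 4],
    "amount": [10.0, None, 5.0, 7.5],
    "updated_at": [pd.Timestamp.now(tz="UTC")] * 4,
})
print(certify(score_table(df, key_cols=["order_id"], freshness_col="updated_at")))
# -> "onboarded"
```

The point of the tiered return value is the incentive structure from lesson 3: teams can get data onto the platform with modest effort, but earning the certified badge that stakeholders trust requires clearing a much higher bar.
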

Want to dive deeper? Check out my full breakdown for more detail and real-world examples.


Stay reliable,

Barr Moses
