Discussions
Anyone else struggling to keep data quality consistent across teams?
I’m kinda losing my mind over this and figured this would be a good place to ask, since I can’t be the only one dealing with it…
I’m doing a summer internship (remote) with a fintech company, and part of my role is helping the data team build out some observability workflows using Sifflet. I’ve been learning a TON (really cool stuff around metadata monitoring, lineage, anomaly detection, etc.), but man, coordinating across different teams to keep data quality consistent is a whole other beast.
Here’s what’s happening: We’ve got like three different teams touching the same datasets. The product team makes changes to tracking events, the analysts are creating their own dashboards directly on top of those tables, and the data engineers (who are great but totally swamped) are trying to keep the pipelines stable. But no one’s really documenting when changes happen, and there’s no standard process for flagging when something breaks.
We tried using Slack channels for alerts, but they just get buried. Someone mentioned setting up automated incident reports in Sifflet, which sounds smart, but I haven’t done that before. Anyone here actually implemented those successfully? Would love to know what worked or what didn’t.
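For context, the kind of thing I was imagining (totally a sketch, not actual Sifflet functionality — just plain Python with made-up table/check names) is a little dedup layer in front of the webhook, so the same failing check doesn’t spam the channel every five minutes and bury everything else:

```python
import time

# Sketch of an alert de-duplicator: suppress repeats of the same
# (table, check) alert within a cooldown window so the channel
# doesn't get buried. All names here are invented for illustration.
class AlertDeduper:
    def __init__(self, cooldown_seconds=3600, clock=time.time):
        self.cooldown = cooldown_seconds
        self.clock = clock        # injectable for testing
        self._last_sent = {}      # (table, check) -> last alert timestamp

    def should_send(self, table, check):
        key = (table, check)
        now = self.clock()
        last = self._last_sent.get(key)
        if last is not None and now - last < self.cooldown:
            return False          # still in cooldown: suppress the repeat
        self._last_sent[key] = now
        return True

dedup = AlertDeduper(cooldown_seconds=3600)
print(dedup.should_send("events.signups", "null_rate"))  # True: first alert goes out
print(dedup.should_send("events.signups", "null_rate"))  # False: repeat suppressed
```

No idea if that’s the right pattern for Sifflet’s built-in incident reports, which is exactly why I’m asking — but even something this dumb would be better than the raw firehose we have now.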
Also, how do you guys handle schema changes? Like, when a column gets renamed or dropped, how do you make sure that doesn’t silently break five dashboards downstream? I keep thinking about setting up monitors for each table, but it’s honestly a lot of work and I’m not sure if it’s overkill. I’d love to hear if there’s a more efficient way to approach this.
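To make the schema question concrete, here’s the kind of cheap check I’ve been prototyping (plain Python, column names invented; in practice the “current” list would come from information_schema or the catalog): snapshot the expected columns per table and diff them on a schedule, so a rename or drop gets flagged before five dashboards silently die.

```python
# Sketch: detect dropped/added columns by diffing a stored baseline
# against a table's current columns. Column names are invented;
# in real life "current_cols" would be queried from information_schema.
def diff_schema(baseline_cols, current_cols):
    baseline, current = set(baseline_cols), set(current_cols)
    return {
        "dropped": sorted(baseline - current),  # these likely break dashboards
        "added": sorted(current - baseline),    # usually benign, still worth logging
    }

baseline = ["user_id", "event_name", "created_at"]
current = ["user_id", "event_type", "created_at"]  # event_name got renamed

print(diff_schema(baseline, current))
# {'dropped': ['event_name'], 'added': ['event_type']}
```

A rename shows up as one drop plus one add, which is at least enough to page someone. Curious whether people just let the observability tool do this per-table or roll their own like this.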
Random side note (but kinda related): I had a friend who outsourced some of his data science coursework to one of those UAE assignment-writing services, and he got totally wrecked ’cause the code didn’t even run lol. Not saying I’d do that, but it reminded me that quality matters way more than just getting something delivered. Same vibe here: I can’t just push out monitors and call it a day if the data itself is a mess.
Anyway, just needed to vent a bit, but I’m also super open to ideas. Has anyone had luck creating a more centralized system for managing data quality issues across teams? Like, not just observability, but actual processes that get people to talk to each other? Would love some inspiration (or even horror stories, tbh).
Thanks in advance if you made it this far lol.