Why linking engineering data sources is a must (and why current methods don’t work)
The dev team just missed a critical delivery date. Where does an engineering manager or team lead turn for insight into what happened and why – and how to avoid missed deadlines moving forward?
One natural starting point is the team’s own engineering data. For many teams, this will include tools like GitHub and Jira: the market leaders in source control and issue tracking, respectively.
Engineering leaders are right that their own data likely contain the answers to many of the challenges (how to improve velocity, predictability, or code quality) that their dev teams are facing.
But analyzing these data sources in isolation or relying on existing integration approaches leaves massive value on the table. Here’s why.
The only way to get full context is to link source control and task management systems. With GitHub data alone, teams can identify mechanical patterns: cycle time, PR status, commit timeline, and commit size. But if GitHub is the definitive source of what was pushed, Jira explains the why – the product initiative, user need, or triggering event (e.g., bug, outage) that led to the work being done. And pivoting off of git analytics alone, without understanding the why, can lead to short-sighted decision making. For example, a team might observe that commit size is declining, leading to improvements in cycle time. We described that pattern here as a virtuous cycle. But it’s not a good thing if the team is focusing on the wrong initiatives (pushing updates, for example, related to a product or feature set that isn’t prioritized).
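As a simplified illustration of these mechanical patterns, here is a minimal Python sketch computing average cycle time and PR size from hypothetical PR records (in practice, these would come from the GitHub API rather than hard-coded data):

```python
from datetime import datetime

# Hypothetical PR records; a real pipeline would pull these from the GitHub API.
prs = [
    {"opened": "2024-03-01T09:00", "merged": "2024-03-03T17:00", "lines_changed": 420},
    {"opened": "2024-03-04T10:00", "merged": "2024-03-05T12:00", "lines_changed": 95},
]

def cycle_time_hours(pr):
    """Hours from PR opened to PR merged."""
    fmt = "%Y-%m-%dT%H:%M"
    delta = datetime.strptime(pr["merged"], fmt) - datetime.strptime(pr["opened"], fmt)
    return delta.total_seconds() / 3600

avg_cycle = sum(cycle_time_hours(p) for p in prs) / len(prs)
avg_size = sum(p["lines_changed"] for p in prs) / len(prs)
print(f"avg cycle time: {avg_cycle:.1f}h, avg PR size: {avg_size:.1f} lines")
```

Metrics like these are easy to compute, which is exactly the point: they describe *what* happened mechanically, but nothing here says *why* the work was done.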
That context is critical for continuous improvement. When the team sits down to conduct a retrospective, the most important questions concern what drove engineering outcomes. For example: did we miss our deadlines for this sprint because we spent too much time paying down technical debt, or because of changes in product scope? Commit messages and Jira analytics (issue volume/type) can only help answer those questions in broad, subjective strokes. Getting to the “ground truth” of how the team spent its time requires a strong linkage between source control and task management.
Current approaches to linking systems impose an overhead tax on devs. The most common approach is to use GitHub’s app in the Atlassian Marketplace. The idea makes sense in theory: a workflow that allows devs to update Jira tickets (e.g., close an issue or add a comment) from within their development tools, and enables Jira references to be pushed to GitHub commits. But there’s a price to pay with this method. It introduces a new layer of overhead for devs, who must master a whole new syntax to properly annotate Jira tickets. When the process of linking commits to issues itself requires additional steps and explicit references to an issue key (see documentation snapshot), it’s not surprising to observe scattershot adoption.
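To make the overhead concrete: the convention boils down to devs embedding an explicit issue key (e.g., PROJ-123) in every commit message, and any commit without a key simply never links. A rough sketch of that key-based matching (illustrative only, not the app’s actual implementation):

```python
import re

# Issue-key linking depends on devs writing a key like "PROJ-123"
# into each commit message. No key, no link.
ISSUE_KEY = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

commits = [
    "PROJ-42 fix race condition in job scheduler",
    "refactor retry logic",              # no key: this commit never links
    "address review comments #time 2h",  # extra command syntax, still no key
]

for msg in commits:
    keys = ISSUE_KEY.findall(msg)
    print(msg, "->", keys or "UNLINKED")
```

The second and third messages are perfectly normal commits, yet they fall through the cracks, which is why adoption that depends on per-commit discipline ends up scattershot.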
Marketplace reviews tell the whole story of the integration’s shortcomings. The GitHub for Jira app currently has an underwhelming 2.5-star rating, and the comments make clear the degree of developer frustration around the integration. Comments are riddled with examples of corner cases where the integration breaks down (e.g., if PRs are automated via CI/CD, if the organization has an IP allow list on GitHub) or challenges with user adoption (“no idea how to link a project”). Developers overwhelmingly report that the integration itself is brittle, challenging to use, and poorly supported.
Teams need a new, ML-driven approach to connecting the dots. If we accept the above premise – that teams need the context that comes from linking source control and project management systems – and the fact that current approaches don’t deliver on this goal, it becomes clear that teams need a new approach. There are a few requirements for a source control-project management linkage: a) it can’t impose any additional workflow steps on devs; b) it has to be robust to a variety of implementation environments and dev toolsets; and c) it must be able to stitch together PRs with relevant issues and tickets to marry the “what” with the “why.” Machine learning is an ideal solution to this challenge. Natural language processing (NLP) algorithms can be trained to recognize and match entities across systems – e.g., a developer or project in GitHub with its corresponding person or ticket in Jira – without introducing any additional steps on the part of the dev herself. And having a separate ML “intelligence layer” outside of the GitHub and Jira instances themselves ensures resilience against implementation idiosyncrasies.
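As a toy illustration of the idea (not Acumen’s actual models), the sketch below matches PRs to Jira issues by text similarity alone, using only Python’s standard library; all record names are hypothetical, and a production system would use trained NLP models and many more signals (authors, timestamps, branch names, file paths):

```python
from difflib import SequenceMatcher

# Hypothetical PR and issue records pulled from GitHub and Jira, respectively.
prs = [
    {"id": 101, "title": "Add retry logic to payment webhook handler"},
    {"id": 102, "title": "Bump lodash to 4.17.21"},
]
issues = [
    {"key": "PAY-7",  "summary": "Payment webhook fails without retries"},
    {"key": "SEC-12", "summary": "Upgrade lodash to patch prototype pollution"},
]

def similarity(a, b):
    """Crude text similarity; a stand-in for a trained matching model."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

def best_match(pr, issues, threshold=0.3):
    """Return the key of the most similar issue, or None below the threshold."""
    best = max(issues, key=lambda i: similarity(pr["title"], i["summary"]))
    score = similarity(pr["title"], best["summary"])
    return best["key"] if score >= threshold else None

for pr in prs:
    print(pr["id"], "->", best_match(pr, issues))
```

Note what the sketch does *not* require: no commit-message syntax, no issue keys, no extra steps by the dev. The matching happens in a separate layer that only reads data from both systems, which is what makes the approach robust to implementation idiosyncrasies.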
Ultimately, engineering data sources do contain many of the answers to the dev team’s most pressing questions and challenges. But getting useful insights from those data requires a multi-dimensional approach: integrating across source control and issue tracking systems, and systematically mining for the signals and patterns that matter most.
How Acumen helps
We built Acumen to solve exactly these types of challenges around limited visibility, prioritization, and root causing of engineering issues.
The platform uses machine learning to stitch together and interpret fragmented, incomplete data from across engineering tools. Acumen then mines the data to surface patterns and identify team-level blockers and risks.
The result is a crystal clear picture of what’s going on day-to-day and how to unblock the team. For teams, this means fewer fire drills, less “work about work,” and greater productivity. And for organizations, it means the data-driven alignment that can only come from a truly complete view of the team’s activities and patterns.
Learn more about our approach to unifying and analyzing engineering data here.