Why I Started Asking About Traceability
At DevNexus, once we got past the “What AI tools are you using?” question, I started asking “How do you trace intent all the way from the ticket to the pull request to QA?”
Engineers, managers, and QA practitioners all talked about AI coding agents, AI review, and new workflows. But they're struggling to see where things go wrong when issues slip through in development.
That missing thread - from requirements to implementation to validation - is a workflow gap in agentic engineering.
Here’s an example of an E2E AI workflow for a developer:
A JIRA ticket with clear requirements and acceptance criteria.
A coding agent used for implementation against that ticket.
A PR that explicitly links back to the ticket.
Code review that checks the changes against those acceptance criteria.
QA validating behavior, regressions, and whether or not the new experience fulfills what was intended.
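Even the "PR links back to the ticket" step can be enforced mechanically. Here's a minimal sketch of a CI-style check, assuming JIRA-style ticket keys like PROJ-123; the function names are mine, not from any particular tool:

```python
import re

# JIRA-style ticket keys look like "PROJ-123". The exact key format
# is an assumption; adjust the pattern for your tracker.
TICKET_PATTERN = re.compile(r"\b[A-Z][A-Z0-9]+-\d+\b")

def linked_tickets(pr_description: str) -> list[str]:
    """Return every ticket key mentioned in a PR description."""
    return TICKET_PATTERN.findall(pr_description)

def check_pr_traceability(pr_description: str, expected_ticket: str) -> bool:
    """True only if the PR description explicitly names the expected ticket."""
    return expected_ticket in linked_tickets(pr_description)
```

A check like this doesn't prove the code fulfills the ticket's intent, but it guarantees the thread exists so a human (or an agent) can follow it later.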
This is about traceability of intent. A way to look at a code change and see:
This is what we said we would do.
This is the code that claims to do it.
This is how it was reviewed.
This is how it was tested.
Think of it as semantic conventions that follow AI-assisted changes through the development pipeline: the ticket explaining the issue, every agent prompt written to generate the solution, and so on.
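As a sketch of what those conventions could look like, here is a hypothetical set of OpenTelemetry-style attributes attached to a single AI-assisted change. The attribute names are illustrative inventions, not a published convention:

```python
def trace_attributes(ticket_id: str, prompt_ids: list[str],
                     pr_number: int, reviewer: str, qa_run_id: str) -> dict:
    """Bundle one change's intent-to-validation trail as flat attributes,
    in the spirit of OpenTelemetry semantic conventions."""
    return {
        "change.ticket_id": ticket_id,          # what we said we would do
        "change.agent_prompt_ids": prompt_ids,  # every prompt behind the diff
        "change.pr_number": pr_number,          # the code that claims to do it
        "change.review.reviewer": reviewer,     # how it was reviewed
        "change.qa.run_id": qa_run_id,          # how it was tested
    }
```

Once every change carries attributes like these, "where did alignment break down?" becomes a query instead of an archaeology project.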
Right now, if you're on a large dev team and can't go full AI power-user mode (because company guidelines restrict which tools employees can use), you might still be able to trace the intent and completion of code changes, but the E2E workflow is likely only partially automated.
In practice, that means when discrepancies surface toward the tail end of development, people are left wondering at which point the intent-to-completion gap originated.
Where Do Discrepancies in Intent-to-Completion Surface?
When you don't have visibility into the entire flow, every defect or incident turns into a search for someone to blame (team culture shapes how harsh that search gets):
Was the requirement unclear?
Did the engineer misinterpret the intent?
Did the developer reviewing the code skim the changes?
Did QA test the wrong thing or not test it thoroughly enough?
Without a trail of evidence, you could spend a lot of time making assumptions and optimizing the wrong stage.
A QA engineer I spoke with wanted exactly this: a way to see precisely where alignment broke down. The rise of AI tools makes that need more urgent than ever.
On one hand, agentic development gives you more code, with more people and tools touching the flow. On the other hand, you can now instrument and inspect that flow in ways that were painful to do manually.
But only if you treat traceability as an intentional design choice.
Another interesting question is “Does this change actually fulfill the intent expressed in the ticket?”
That is a different standard. It requires context.
A lot of teams are feeling the gap between what AI tools are doing today and what they wish those tools could do: connect the dots between requirements, implementation, merge, QA, deployment, and any potential future production issue that needs triaging.
The “Trail of Evidence” as a Governance Control
When I describe a seamless, well-connected E2E AI workflow, where intent and output are collected as data throughout the pipeline, it resonates because it addresses problems across multiple dimensions of engineering and the critical handoff checkpoints between product/design, software engineering, and QA.
Quality → it lets teams see where their process is breaking, so they can actually fix the right thing.
Accountability → it makes it easier to answer “how did this get through?” without defaulting to blame.
Teachability → it gives newer engineers and teams a concrete mental model for what good delivery looks like in their org.
This is what I call code governance in practice: a designed feedback loop that makes quality enforceable and explainable.
Governance, in this sense, is the combination of standards, controls, and visibility that lets an engineering organization confidently say, “We know how this code came to be, and we can defend that process.”
Agentic Engineering Without the Thread Is Risky
Agentic workflows are powerful. A ticket can become a working implementation in minutes. A coding agent can refactor large surfaces quickly. AI code review can flag things that people might not have accounted for in implementation.
But when the thread from ticket to PR to QA is missing, that same power can leave room for a new kind of opacity.
And when something breaks, you are left reconstructing what happened from dispersed knowledge and partial context.
In that world, adding “more AI” does not inherently make you safer. It may just make you faster at generating changes you do not fully understand. And I talk about those dangers in my article on preserving understanding while 10x-ing AI output.
To move toward maturity in agentic engineering, we must account for these questions as our software quality guardrails. As practitioners and owners of services and apps at companies, we must be clear about the “how” behind the work.
How do we design our workflows so intent, implementation, and validation stay connected?
How do we give reviewers and QA relevant context?
How do we build a system where quality problems are visible as patterns, rather than dismissed as one-offs?
My Takeaway
In agentic engineering, traceability will become as important as speed.
Teams that build and manage software need systems that help them understand how code moved from idea to production, where risk entered, and where their process is failing them, even subtly.
That’s why I believe the missing thread running from ticket to deployment is one of the main ways companies will distinguish between “we ship fast” and “we ship fast and can still stand behind what we ship.”
This is the second entry in Field Notes in Agentic Engineering. If your team is experimenting with E2E AI-native workflows or if you have stories about where that thread snapped, I’d love to hear more about it!