Two years ago, I started studying AI-generated misinformation. I wanted to understand how large language models produce convincing false content, how quickly that content spreads, and whether humans can even tell the difference anymore. That work led to four papers at The Web Conference 2026, tools like JudgeGPT and RogueGPT, and a growing concern I could not shake: the problem is bigger than text.
Because while I was studying how AI breaks trust through information, something else was happening. AI was gaining the ability to act.
The Shift I Noticed
Misinformation research forced me to think carefully about trust. Can you trust that a piece of content is what it claims to be? Did a human write it? Is the source legitimate? Can verification tools keep pace with generation quality?
These are information integrity questions. But they turn out to be a specific instance of a broader challenge: how do you maintain trust and oversight when AI systems operate with increasing autonomy?
The same questions that apply to AI-generated text apply — with more urgency — to AI agents taking actions on your behalf. When an agent sends an email, edits a document, or executes a command in your environment, how do you know it did what you intended? How do you verify it did not do something extra? How do you calibrate how much autonomy to grant?
Misinformation is a trust problem in the information layer. Agentic AI is a trust problem in the action layer. They are related, and researchers who have worked on one are well-positioned to contribute to the other.
Where I Am Going
My research is shifting toward agentic AI systems — specifically the questions of trust, oversight, and safety that arise when AI agents act in the world rather than just generating content.
The questions I find most interesting right now:
- Verification of agent actions. When an agent completes a task, how does a human confirm it did the right thing? Log files are one answer, but logs can be gamed. Structured output formats help, but interpretation requires effort. What are the right primitives for human oversight of agent behavior?
- Trust calibration. Humans calibrate trust in other humans over time, based on track record, transparency, and social cues. What are the equivalent mechanisms for AI agents? How should trust accumulate (or decay) based on observed behavior? (A toy sketch of one possible mechanism follows this list.)
- Agent-tool interface design. The interface between an agent and its tools shapes what the agent can do and what it cannot. Good interface design can make unsafe actions harder, observable actions clearer, and correct actions easier. This is under-theorized.
- Multi-agent oversight. As agents orchestrate other agents, oversight becomes recursive. Who watches the watcher? How do accountability chains work in agentic pipelines?
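To make the trust-calibration item concrete, here is a toy sketch of one possible mechanism: tracking trust as a discounted success rate over verified agent actions, so that older observations fade and trust has to be re-earned. This is an illustration of the question, not a proposal; the decay factor and starting value are arbitrary assumptions.

// Toy trust score: a discounted success rate over verified agent actions.
// Older observations count less, so trust decays unless it is re-earned.
function updateTrust(prior: number, verifiedSuccess: boolean, decay = 0.9): number {
  const observation = verifiedSuccess ? 1 : 0;
  // Exponentially weighted average: recent behavior dominates.
  return decay * prior + (1 - decay) * observation;
}

// Example: trust after three successes and one failure, starting neutral at 0.5.
let trust = 0.5;
for (const ok of [true, true, false, true]) {
  trust = updateTrust(trust, ok);
}
console.log(trust.toFixed(3));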
PowerSkills as a Practical Case Study
One of the projects I have been building is PowerSkills, a set of Windows automation skills for AI agents. It gives agents structured access to Outlook (email and calendar), the Edge browser via the Chrome DevTools Protocol, desktop automation, and shell commands.
PowerSkills is open source (MIT license) and installable via AgentSkills:
npx skills add aloth/PowerSkills
Every command returns a consistent JSON envelope:
{
  "status": "success",
  "exit_code": 0,
  "data": { ... },
  "timestamp": "2026-03-06T16:00:00+01:00"
}
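To make the oversight angle concrete, here is a minimal sketch (not part of PowerSkills itself) of how a supervising process might validate that envelope before trusting a result. The field names follow the example above; the "error" status value and everything else are my assumptions for illustration.

// Minimal envelope check a supervisor might run before acting on a result.
// Field names follow the example above; the "error" value is an assumption.
interface SkillEnvelope {
  status: "success" | "error";
  exit_code: number;
  data: unknown;
  timestamp: string;
}

function parseEnvelope(raw: string): SkillEnvelope | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return null; // not valid JSON: reject outright
  }
  const e = parsed as Partial<SkillEnvelope>;
  if (
    (e.status !== "success" && e.status !== "error") ||
    typeof e.exit_code !== "number" ||
    typeof e.timestamp !== "string" ||
    Number.isNaN(Date.parse(e.timestamp))
  ) {
    return null; // malformed envelope: escalate to a human
  }
  return e as SkillEnvelope;
}

The point is less the code than the design choice: a consistent envelope gives the human (or the supervising process) one fixed place to look before accepting an agent's claim that a task succeeded.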
Building PowerSkills clarified something for me: the agent-tool interface is a design problem, not just an engineering one. Decisions about what to expose, what to restrict, and how to structure output all affect how safely and predictably an agent can operate. A well-designed tool surface makes agent behavior more auditable. A poorly designed one makes it opaque.
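To illustrate what that design space looks like, here is a hedged sketch of a restricted shell tool surface. The allowlist, command names, and structure are assumptions made for the example; they are not how PowerSkills implements its shell skill.

// Illustrative only: one way a tool surface could make unsafe actions harder
// and observable actions clearer. Not the actual PowerSkills design.
import { execFileSync } from "node:child_process";

const ALLOWED_COMMANDS = new Set(["git", "dir", "type"]); // explicit allowlist (assumed)

function runShellSkill(command: string, args: string[]) {
  const timestamp = new Date().toISOString();
  if (!ALLOWED_COMMANDS.has(command)) {
    // Restricted by design: the agent never gets an unstructured shell.
    return { status: "error", exit_code: 1, data: { reason: "command not allowed" }, timestamp };
  }
  try {
    const output = execFileSync(command, args, { encoding: "utf8" });
    // Every call returns the same envelope, so behavior stays auditable.
    return { status: "success", exit_code: 0, data: { output }, timestamp };
  } catch (err) {
    return { status: "error", exit_code: 1, data: { reason: String(err) }, timestamp };
  }
}

The narrower the exposed surface, the easier it is to reason afterwards about what the agent could possibly have done.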
PowerSkills is also a testbed. I plan to use it to study how agents actually use structured tool interfaces, what errors arise, and where human oversight is most needed.
Connection to Prior Work
The Verification Crisis paper I co-authored surveyed experts in journalism and fact-checking about GenAI disinformation. The most consistent finding: traditional verification methods are breaking down faster than new tools can be developed. Provenance — knowing where content came from and who created it — emerged as the most reliable path forward.
The same logic applies to agent actions. Provenance for actions means knowing what an agent did, when, and why. C2PA and similar standards address content provenance; we need equivalent frameworks for agent action provenance.
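As a sketch of what an action provenance record could look like (the field names and hash-chain construction are my assumptions, not an existing standard), each agent action might be logged as an append-only, tamper-evident entry that captures the what, when, and why:

// Sketch of a hash-chained action log: each entry commits to the previous one,
// so tampering with "what the agent did" is detectable after the fact.
import { createHash } from "node:crypto";

interface ActionRecord {
  action: string;    // what the agent did, e.g. a skill name (hypothetical)
  params: unknown;   // the arguments it used
  reason: string;    // why: the agent's stated intent
  timestamp: string; // when
  prev_hash: string; // hash of the previous record (chain link)
  hash: string;      // hash of this record
}

function appendRecord(log: ActionRecord[], action: string, params: unknown, reason: string): ActionRecord {
  const prev_hash = log.length ? log[log.length - 1].hash : "genesis";
  const body = { action, params, reason, timestamp: new Date().toISOString(), prev_hash };
  const hash = createHash("sha256").update(JSON.stringify(body)).digest("hex");
  const record = { ...body, hash };
  log.push(record);
  return record;
}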
My earlier work on Origin Lens — a mobile app for C2PA image verification — showed that provenance tooling can be made accessible to non-expert users. That same design challenge exists for agent oversight tooling.
Open Questions for the Community
I am at the beginning of this research direction, and I have more questions than answers. A few I am actively thinking about:
- What does “minimal footprint” mean in practice for an AI agent? How do you operationalize it?
- How should agents communicate uncertainty about their own actions to human supervisors?
- What failure modes emerge in long-horizon agentic tasks that do not appear in single-turn interactions?
- How do we design evaluation frameworks for agent trustworthiness, not just task performance?
If you are working on any of these — or adjacent problems — I would like to hear from you.
What Is Next
In the near term: more empirical work with PowerSkills, further research on agent-tool interface design, and continued work connecting the information integrity thread to the agentic AI thread. I am also looking for collaborators, so if your work overlaps, reach out.
The trust questions do not go away just because the AI got more capable. They get harder.
