Gemini Spark and the End of the Session: How AI Is Shifting From Passive to Persistent

By Addy · May 25, 2026

Every AI product you have used until now has had the same fundamental property. It existed when you opened it and stopped existing when you closed it. Claude, ChatGPT, Gemini: all were tools you reached for, used, and put down. The session was the unit. Open a tab, get a response, close the tab. The intelligence was available. It was not present.

That property is so foundational to how AI products have been designed that most people have never thought to question it. The session model felt natural because it mapped to how we use software: you open an application when you need it, you close it when you do not. The AI was a more capable version of a search box. You asked. It answered. You left.

Gemini Spark, announced at Google I/O on May 19, is built on a different premise entirely. It does not wait for you to open it. It runs on Google Cloud infrastructure, continuously, whether your phone is on or off, whether your laptop is open or closed, whether you are awake or asleep. You do not open Spark. You delegate to it. The difference between those two verbs is the difference between every AI product that has existed and what is being built now.

What Spark Actually Is

The architecture is worth understanding before the implications.

Spark keeps a cloud execution context tied to your account that can respond to recurring tasks and triggers, work across connected Google apps, execute tool calls through Workspace and MCP connections, and surface results through Android Halo, a small communication layer at the top of the phone screen showing what background agents are doing.

That means it can execute recurring tasks, maintain persistent state, and eventually spawn custom sub-agents in the background. The comparison that several developers have made is accurate: Spark is less like a chatbot and more like an employee who has a desk, checks their inbox, and does things while you are in meetings. They do not need you to be present for them to work. They surface what they have done when you return.

Users can email Spark directly through a dedicated Gmail address, and the agent can interact with the web directly through Chrome. On mobile, progress is visible through the Android Halo system. The interface surfaces have multiplied. The underlying process is one.

Google says Spark runs on Gemini 3.5 and uses the Antigravity harness. The MCP integration is the detail that makes Spark more than a Google Workspace automation. Google's first announced MCP connections include Canva, OpenTable, and Instacart, with more partners integrating. That matters because MCP turns third-party tools into action surfaces for AI agents. An agent that can read your email, check your calendar, draft a document, and coordinate with external services is not just a Workspace tool. It is infrastructure for everything.

Gemini Spark is currently being tested inside Google and with trusted testers, with a beta planned for U.S. Google AI Ultra subscribers. Google also launched a new $100 AI Ultra tier and cut its top Ultra plan from $250 to $200. The $100 tier includes five times the usage limits of the existing AI Pro plan in the Gemini app and Google Antigravity, 20 terabytes of cloud storage, and a YouTube Premium individual plan. The new price point is not just a discount. It is a redistribution of the value proposition. At $250, Ultra was a premium tier for heavy users. At $100, it becomes a platform subscription for anyone building their digital life around persistent AI.

The Shift Nobody Has Named Clearly

The history of human-computer interaction has moved through three distinct phases, each defined by the distance between the human's intention and the machine's response.

The first phase was command-line computing. You typed an instruction. The machine executed it exactly. The distance between intention and response was zero: the machine did precisely what you told it and nothing else. It was powerful for experts and unusable for everyone else.

The second phase was graphical interfaces. You clicked on things. The machine responded to gestures. The distance grew slightly. Clicking a button might trigger a sequence of operations you did not specify individually, but the relationship remained fundamentally transactional. You acted. The machine reacted. You were in control of every step.

The third phase is the one we are entering now. You set an intention. The machine pursues it. The distance between your input and the machine's action becomes large enough that what happens in between is not visible to you unless you ask. You tell Spark to monitor incoming invoices, flag any above $10,000 for review, and draft responses to the rest. Spark does this while you sleep, while you travel, while you are in meetings. You did not specify each step. You specified the outcome.
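The invoice example above can be written down as a small rule to make "specifying the outcome" concrete. This is a minimal sketch in Python, not Spark's actual interface: the Invoice type, the triage functions, and the action names are all hypothetical, and only the $10,000 threshold comes from the example.

```python
from dataclasses import dataclass

# Threshold taken from the invoice example in the text.
REVIEW_THRESHOLD_USD = 10_000


@dataclass
class Invoice:
    """Hypothetical invoice record; not a real Spark or Workspace type."""
    sender: str
    amount_usd: float


def triage(invoice: Invoice) -> str:
    """Return the action a delegated agent would take for one invoice."""
    # "Above $10,000" is read as strictly greater than the threshold.
    if invoice.amount_usd > REVIEW_THRESHOLD_USD:
        return "flag_for_review"
    return "draft_response"


def triage_all(invoices: list[Invoice]) -> dict[str, list[Invoice]]:
    """Group a batch of invoices by the action chosen for each."""
    grouped: dict[str, list[Invoice]] = {"flag_for_review": [], "draft_response": []}
    for invoice in invoices:
        grouped[triage(invoice)].append(invoice)
    return grouped
```

The user states the rule once. The loop over incoming invoices is the part that would run without them.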

This is not a difference of degree from the previous phase. It is a difference of kind. The machine is no longer reacting to what you do. It is acting on what you care about. The interface is not a screen you look at. It is a process that runs alongside your life.

The transition from passive to persistent AI is the same transition computing made when it moved from batch processing to interactive sessions in the 1970s, and from desktop applications to always-on internet services in the 1990s. Each transition felt incremental at the time and looked structural in retrospect. This one will too.

Why Google Is Positioned Differently Than Anyone Else

In the race to build compelling personal AI agents, Google may have an underrated advantage: it already has all your emails.

That observation from TechCrunch is the most precise description of Google's competitive position in one sentence. A persistent AI agent is only as useful as the data it has access to and the actions it can take. The agent that knows what is in your inbox, your calendar, your documents, your search history, and your location at any given moment is categorically more useful than the agent that has to be given that context manually every session.

Every AI company is trying to build the persistent agent layer. OpenAI has ChatGPT Agent, Codex, and enterprise agent infrastructure moving toward long-running work. Anthropic has Claude's memory, computer use, and Conway-shaped always-on agent experiments. Microsoft has Copilot attempting to run across Outlook, Teams, and the Microsoft 365 ecosystem.

Where Google believes it can differentiate is depth of integration. Because it controls the operating system, the browser, the email client, and the cloud infrastructure, Spark can operate across all of those layers without the friction that standalone agents face when stitching together disparate tools.

A persistent agent built on top of Gmail, Google Calendar, Google Docs, Google Search, Chrome, and Android is not just using APIs to access your data. It is running inside the systems that generate your data. The agent does not need to be given context. It has been present for the context as it was created.

This is the same competitive advantage that made Google Search dominant over every alternative: not that the algorithm was better in isolation, but that it had seen more of the web than anyone else because it had been crawling it longer. Spark's advantage over every standalone AI agent is not necessarily that Gemini 3.5 is smarter than every rival model. It is that Spark can know more about your digital context than any agent you would have to authorize and configure from scratch.

Google is also previewing Android Halo, effectively turning the operating system into a dashboard for persistent AI agents. With the operating system itself serving as the visibility layer for what background agents are doing, Halo is the on-device complement to the cloud runtime. You do not check in on Spark by opening an app. You glance at the top of your phone screen the way you glance at a notification count.

The Daily Brief and What It Signals

Alongside Spark, Google announced Daily Brief, an automated morning digest agent that assembles a personalized summary of overnight developments: emails that arrived, calendar events for the day, relevant follow-up details, and tasks that need attention.

Daily Brief is not technically impressive. It is a summarization pipeline running on a schedule. What it signals is more important than what it does: Google believes the right interaction model for a persistent AI is not "ask me anything" but "I will tell you what matters before you ask."

The difference is the direction of initiation. Every AI product to date has been pull-based. You pull information from the AI by asking. Daily Brief is push-based. The AI pushes information to you based on what it has determined is relevant. This requires the AI to have a model of what you care about, which is exactly the model that Gmail, Calendar, and Search have been building for years.
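The difference in who initiates can be shown in a few lines. The sketch below is illustrative only and does not describe how Daily Brief is implemented: the Item type, the relevance score, and the 0.5 cutoff are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Item:
    """Hypothetical unit of information the assistant knows about."""
    source: str       # e.g. "email", "calendar", "task"
    summary: str
    relevance: float  # assumed score in [0, 1]; how it is computed is out of scope


def pull(items: list[Item], query: str) -> list[str]:
    """Pull: nothing is returned until the user asks a question."""
    return [i.summary for i in items if query.lower() in i.summary.lower()]


def push_brief(items: list[Item], min_relevance: float = 0.5) -> list[str]:
    """Push: the system chooses what to surface, most relevant first."""
    chosen = sorted(
        (i for i in items if i.relevance >= min_relevance),
        key=lambda i: i.relevance,
        reverse=True,
    )
    return [f"[{i.source}] {i.summary}" for i in chosen]
```

In the pull function the user supplies the query. In the push function the only input is the system's own estimate of relevance, which is why a push model depends on knowing what the user cares about.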

The distinction between pull and push is not cosmetic. It is the difference between a tool and an assistant. A tool waits to be used. An assistant anticipates what you need before you ask. Every human assistant relationship eventually shifts from pull to push as the assistant learns enough about their principal to anticipate rather than respond. Spark and Daily Brief are Google's implementation of that shift at AI scale.

Sam Altman's framing, which this publication covered in its April piece on AI moving from your screen to your world, described the shift as setting intentions rather than issuing commands: telling your AI what you want to accomplish and what you are worried about, then having it operate in the background and batch updates rather than demand attention. Google's description of Spark at I/O was strikingly similar: an agent you direct to work in the background, find what you need, and help you take action. Two different companies, same framing, six weeks apart. The convergence is not a coincidence. It is the industry reaching the same conclusion about what the next interaction model looks like.

The Governance Problem Nobody Has Solved Yet

A 2025 AI Builder Summit survey cited by ECI Research found that 44% of enterprise AI leaders have only moderate confidence that AI agents can act autonomously without human intervention.

That confidence gap is not irrational. A passive AI that gives you a wrong answer wastes the time you spent asking. A persistent AI that acts on a wrong inference wastes the time it spent acting, and potentially causes damage that has to be undone.

Google's framing of agent permissions suggests even the platform builder is still working through the boundaries. Spark can draft a response. It is designed to ask before high-stakes actions like sending email or spending money. The "I drafted this, approve?" model that the developer community has converged on as the right UX for early agentic AI systems is present in Spark's design. The agent does the work. The human reviews the output. Autonomy is earned as trust is established.
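The "I drafted this, approve?" pattern can be sketched as a gate that runs low-stakes actions immediately and holds high-stakes ones for review. This is a minimal illustration under stated assumptions, not Spark's permission system: the Action and ApprovalGate classes and the action names are hypothetical, chosen to match the two examples in the text (sending email, spending money).

```python
from dataclasses import dataclass, field

# Hypothetical action kinds treated as high-stakes, mirroring the text's examples.
HIGH_STAKES = {"send_email", "spend_money"}


@dataclass
class Action:
    kind: str
    description: str


@dataclass
class ApprovalGate:
    """Runs low-stakes actions at once; queues high-stakes ones for a human."""
    pending: list[Action] = field(default_factory=list)
    executed: list[Action] = field(default_factory=list)

    def submit(self, action: Action) -> str:
        if action.kind in HIGH_STAKES:
            self.pending.append(action)   # wait for explicit approval
            return "awaiting_approval"
        self.executed.append(action)      # e.g. drafting is reversible
        return "executed"

    def approve(self, action: Action) -> None:
        """Human sign-off: move a pending action to executed."""
        self.pending.remove(action)
        self.executed.append(action)
```

Under this sketch, widening autonomy over time would amount to shrinking the HIGH_STAKES set as trust is established.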

The governance architecture that works for a single agent making conservative decisions about email drafts does not automatically scale to a network of agents making consequential decisions across an organization's data. The identity, access, and permission policies that govern what Spark can read, write, and act on are being designed now, before production deployments, which is the right order. Whether the designs hold under the pressure of real organizational complexity is the question that the closed beta is there to answer.

The Codex mobile article this publication ran earlier this month described the same governance arc in a different context: a coding agent running on a developer's machine or remote environment, with mobile supervision, approvals, screenshots, test results, and diffs flowing back to the user. The pattern is consistent across every persistent AI system shipping in May 2026: maximum capability in the forward direction, conservative defaults on authorization, human review as the gate before irreversible actions.

Both Codex and Spark are answers to the same architectural question. Where does the persistent agent run, and who controls what it can do? Codex runs locally or in configured remote environments, scoped to developer tooling, with relay-based remote steering. Spark runs on Google Cloud infrastructure, scoped to Workspace and MCP-connected tools, with Android Halo as the visibility surface.

Different runtime environments. Same design philosophy. The agent works while you are away. You review what it did. You redirect if it went wrong. Irreversible actions require explicit authorization.

What This Means for How You Use AI Starting Now

The framing that has governed AI product design since 2022, AI as a faster, more capable search box, is not wrong. It is incomplete.

The search box framing captured one valid mode of interaction: you have a specific question, the AI has a specific answer, the transaction completes. That mode will not disappear. It will become one mode among several rather than the defining one.

The persistent agent mode is different in every dimension. The interaction is ongoing rather than transactional. The agent has context accumulated over time rather than context provided in a single prompt. The output is actions taken rather than answers given. The human's role is direction and review rather than query and consumption.

Spark is the first major consumer AI product from a frontier lab built explicitly around a persistent cloud agent rather than a chat window. The architectural shift from session to process is the same leap email made over fax.

That comparison is precise. Fax was a better way to send a document than postal mail. Email was a different category: not faster document delivery, but persistent asynchronous communication that changed how organizations coordinated. The productivity gains from email were not from sending messages faster. They were from being reachable and coordinated continuously, across time zones, across schedules, without requiring synchronous presence.

Spark is not a faster chatbot. It is a different category: not better answers on demand, but continuous work on your behalf that does not require your presence. The gains will not be from getting responses faster. They will be from work that happens while you are somewhere else, from tasks that complete themselves overnight, from information that arrives before you think to ask for it.

The passive AI asked you to come to it. The persistent AI comes to you, whether you asked or not, because you already told it what you care about.

That shift is what Gemini Spark represents. Not a product launch. A change in the relationship between human attention and machine capability, and the beginning of an era in which AI is most useful precisely when you are not using it.

Sources:

The Gemini app becomes more agentic, delivering proactive, 24/7 help - Google
Google introduces Gemini Spark, a 24/7 agentic assistant with Gmail integration - TechCrunch
Everything new in our Google AI subscriptions, fresh from I/O 2026 - Google
Stay in sync with your agent with Android Halo - Google
AI Maturity in CX: Why Hybrid Models Are Winning in 2026 - Efficiently Connected

Previously on TheQuery: "The Next Layer: How AI Is Moving From Your Screen to Your World" and "Codex Mobile Is Here. Claude Had It First. OpenClaw Had It Before Either of Them."