Developer turns OpenAI’s Codex into a long-running “employee” as the tool gets remote, Goal and screenshot features

Jason Liu, author of the 13k-star Instructor repository, has reportedly been recruited to OpenAI’s Codex team, and he is demonstrating a new way to use the model: not as a single-shot assistant but as a persistent, continuously working agent. Liu reportedly runs dozens of threads that each span months (one for scheduling, one for open-source projects, one for social monitoring), with every thread keeping its full history, preferences and decisions so the agent can pick up where it left off. Codex itself has reportedly received a recent batch of updates: Appshots (direct screenshot input), a formal Goal mode and the ability to continue working while a machine is locked.

How he rigs Codex to "clock in"

Liu’s setup combines Codex features such as Heartbeats and @computer with a local Obsidian vault that stores TODOs, people, projects, agents and notes. The trick is persistence: instead of relying on Codex’s built-in memory, he version-controls core state locally, so an agent can consult and update project files, roll back if needed, and preserve context across weeks. He reportedly uses Codex to scan Slack and Gmail, draft replies (but not send them), check Amazon refund queues while he showers, and even automate file uploads by driving a browser through @computer, pushing rendered video files into review threads without his involvement. Users set goals and acceptance criteria; Codex then advances tasks autonomously for hours or days, pausing at checkpoints for human review.
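To make the persistence idea concrete, here is a minimal Python sketch of locally versioned agent state with rollback. It is an illustration only, not Liu's actual setup: the article does not say which version-control tool he uses, so numbered snapshot copies stand in for something like git, and the `VaultState` class, the `state.json` file and the `.history` folder are all hypothetical names.

```python
import json
import shutil
import tempfile
from pathlib import Path


class VaultState:
    """Hypothetical store: one JSON state file plus numbered snapshot copies."""

    def __init__(self, root: Path) -> None:
        self.state_file = root / "state.json"
        self.history_dir = root / ".history"
        self.history_dir.mkdir(parents=True, exist_ok=True)
        if not self.state_file.exists():
            # Version 0 is the empty starting state.
            self._write({"todos": [], "notes": []})
            self._snapshot()

    def _write(self, state: dict) -> None:
        self.state_file.write_text(json.dumps(state, indent=2), encoding="utf-8")

    def _snapshot(self) -> int:
        # The next version number equals the count of snapshots taken so far.
        version = len(list(self.history_dir.glob("*.json")))
        shutil.copyfile(self.state_file, self.history_dir / f"{version:06d}.json")
        return version

    def read(self) -> dict:
        return json.loads(self.state_file.read_text(encoding="utf-8"))

    def update(self, **changes) -> int:
        """Merge top-level keys into the state and record a new version."""
        state = self.read()
        state.update(changes)
        self._write(state)
        return self._snapshot()

    def rollback(self, version: int) -> int:
        """Restore an earlier version; the restore is itself recorded as a new version."""
        shutil.copyfile(self.history_dir / f"{version:06d}.json", self.state_file)
        return self._snapshot()


def demo() -> dict:
    with tempfile.TemporaryDirectory() as tmp:
        vault = VaultState(Path(tmp))
        good = vault.update(todos=["draft reply to review thread"])
        vault.update(todos=[])  # a change we later decide was wrong
        vault.rollback(good)
        return vault.read()


if __name__ == "__main__":
    print(demo())
```

Because the state lives in ordinary files outside the model's own memory, an agent can reread it at the start of each session, and a bad update can be undone without losing the record that it happened.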

Why this matters — and what could go wrong

This pattern turns an LLM from a tool into something closer to an on-payroll assistant. What do we gain? Continuous context, fewer repetitive prompts, and automation that spans multiple services: Google Docs comments, GitHub PRs, Slack threads and more. What do we risk? Data privacy, permission creep and security exposure if agents continue to operate while devices are locked. Liu’s local vault approach mitigates some privacy concerns, and Chronicle, Codex’s experimental screen-capture memory, remains gated by permissions and throughput limits. These features are reportedly still immature.

The wider backdrop matters too. This work sits at the intersection of an AI arms race and evolving cross-border data and export controls: Western models powering the work of Chinese developers raise regulatory questions about data flows and dependency. For now the story is simple: smarter agents are arriving, and developers are finding ways to let them work through the night. Who controls the long-running agent (the user, the platform, or the employer) is the next big question.