DUNIN7 · LOOMWORKS · RECORD
record.dunin7.com
Status Archived (read-only)
Path queued-directions/archive/loomworks-queued-directions-and-deferred-work-v0_5.md

Loomworks — Queued Future Directions and Deferred Work

Version. 0.5

Date. 2026-05-09

Author. Marvin Percival (DUNIN7 Operator) with Claude.ai

Provenance. Memory archive consolidation.

v0.1 (2026-05-04) supersedes prior memory edits and absorbs the obsolete #16.

v0.2 (2026-05-07) adds Section 5 (Agent capability development) capturing two entries from the Phase 47 scoping arc: Claude Dreaming as a Companion behavior pattern, and Claude Managed Agents conceptual primitives. The metadata-driven-runtime section in particular preserves the conversational trajectory that the 500-character memory limit had to compress out.

v0.3 (2026-05-09) adds Section 6 (Mobile presence and quick-capture engagement pair) capturing the architectural framing produced by the two paired investigations — loomworks-quick-capture-engagement-investigation-v0_2.md and loomworks-mobile-presence-investigation-v0_1.md. Two entries: the substrate intent-class engagement (quick-capture as bidirectional Memory traffic on a fast path) and the capture-device engagement (mobile presence as a layered composition of OS-mediated, foreground, tap-to-speak, and conversational surfaces). Plus a small mobile-engineering-capacity note that's pacing-readiness, not architecture.

v0.4 (2026-05-09) adds Section 7 (Multi-instance fleet and jurisdiction governance) absorbing the substance of two memory edits whose source documents now exist in project knowledge — loomworks-deployment-strategy-v0_2.md, loomworks-hosting-cost-analysis-v0_1.md, and loomworks-white-label-multilanguage-analysis-v0_1.md. Two entries: the multi-instance fleet via DevOps engagements as the deployment topology, and the protocol triangle (Loom + FORAY + OVA) as the methodology framing that makes per-instance regulatory jurisdiction a first-class product feature.

v0.5 (2026-05-09) adds Section 8 (External information seams) capturing two entries from a chat session that surfaced the github.com/public-apis catalog and sharpened it into a load-bearing architectural move. Two entries: the general-purpose external API gateway specialist (one specialist for all public APIs, FORAY-attested and OVA-scoped, configuration-driven via Phase 38 declare-and-register), and the public-apis catalog as the reference universe for the gateway. The trajectory note on 8.1 preserves the Operator's reframe from "ship 10-15 first-party adapters" to "build one general-purpose gateway and let the API catalog live as supplied domain in engagement Memory" — a reframe that converts an N-adapter build direction into a build-once-then-configure pattern.

Purpose. Queued future directions and deferred work that don't need to load into every fresh chat but should be retrievable when the relevant topic comes up. Foundational fresh-chat orientation stays in memory; topical futures live here.

Discoverability keywords. engagement creation assistance, Discovery-to-seed, blank-page problem, code health audit, semantic duplication, Phase 30 induction, Phase 32 marketing, deferred phases, external rendering services, render specialist SDK, Novi AI, Steve AI, Story.com, image generation, Goosey, Nano Banana, Gemini Flash Image, storybook art, upload facility extension, file type extensibility, contribution skill, metadata-driven runtime, configuration-extensible, code-extensible, multi-stage extraction, multi-output extraction, non-text assertions, assertion content kind, Claude Dreaming, AutoDream, agent memory consolidation, memory hygiene, cross-engagement pattern detection, observation to Shape, Accounting reconciliation, Claude Managed Agents, outcomes, grader agent, multi-agent orchestration, lead and sub-specialist composition, sessions and webhooks, agent fabric, agent capability development, quick-capture, dictation, mobile presence, Hey Companion, Hey Freda, App Intents, App Actions, Siri Shortcut, foreground listener, Picovoice Porcupine, wake word, watch complication, tap-to-speak, share sheet, parking, parked on level 10, voice register family, single-valued slot, multi-valued slot, temporal coherence, supersession, held layer, Companion-noticed retraction, asynchronous commit window, name-agnostic router, Operator Layer mobile, federation, offline queue, mobile engineering capacity, multi-instance fleet, DevOps engagement, control plane, Authority instance, Instance A, Instance B, Instance C, Instance D, Hetzner, Railway, AWS, Cloudflare Pages, fleet management, cross-instance traffic, email registry, abuse boundary, grant flow, public marketing website, credit request form, jurisdiction governance, GDPR, EU data residency, Schrems II, data sovereignty, regulatory framework, retention policy, white-label, managed hosting, product tier, protocol triangle, Loom remembers, FORAY proves, OVA scopes, prove data never left, external information seams, external API gateway, general-purpose API specialist, public-apis catalog, github.com/public-apis, REST API integration, FORAY-attested external calls, OVA-scoped credentials, API gateway tiers, API-key auth, OAuth pagination, response transformation, semantic mapping across services, declarative API configuration, API specifications as Memory, build-once-configure pattern, ship-N-adapters reframe, weather API, FX rate API, holiday API, geocoding API, current-state-from-external-services.


How to use this file

Each entry below carries four slots: Trigger, Substance, Current state, and Trajectory.

Entries are organized topically, not chronologically. New queued directions arriving in future conversations should be added under the right section (or a new section) with the same four-slot structure.


Section 1 — Engagement creation arc

The two entries in this section are complementary parts of the same arc: assistance during Discovery (helping an Operator reach the point of having a Discovery record), then transformation of the Discovery record into a candidate seed.

1.1 — Engagement creation assistance pattern (Discovery)

Trigger. Any conversation about how a new Operator gets started; the blank-page problem; conversational creation of an engagement; Operator Layer Arc 2 (Companion brain) discussions touching engagement onboarding; the Operator-supplied vs Operator-elicited information distinction; or an actual instance of Claude helping an Operator articulate a new engagement (e.g. the marketing engagement creation that prompted this entry).

Substance. The process of soliciting information from an Operator to build a seed and its six fields is itself a product capability, not just a thing Claude does for Marvin in chat. Every new Operator faces the blank-page problem: they have a domain, they have intentions, they have material in their head — but they don't have it in the form a candidate seed needs. The assistance Claude provides to help Marvin create new engagements is the template for what the Loomworks product should offer built-in to every new Operator.

Current state. Pattern is being practiced in chat-based Discovery sessions but is not captured as a product feature. Capture the process as it happens — the questions Claude asks, the order it asks them in, the way it surfaces tensions, the way it helps the Operator see the shape of their own engagement. That captured process is the input to building the in-product Discovery assistance.

Trajectory. Originally framed as "Claude does this for Marvin in chats" — a manual operator-side activity. Reframed as "this is itself a product capability" because every Operator who ever uses Loomworks will face the same problem and the manual chat process is the prototype for the in-product assistance. The reframe matters because it changes what's worth capturing during Discovery sessions: not just the output (the Discovery document), but the process (the sequence of questions and elicitations).

1.2 — Discovery-to-seed skill

Trigger. Any conversation about transforming a Discovery document into a candidate seed; bridging Discovery to engagement creation; the seed-induction loop; or the gap between "Operator has talked through what their engagement is" and "engagement exists in Loomworks with seed v0.1 ready for induction."

Substance. A skill (in the Loomworks sense — bounded structural transformation, no registered actor, no instruction set) that creates a candidate seed conforming to R-A5 through R-A11 from a Discovery document landed in Memory. The skill takes the Discovery record as input and produces a draft seed with the six fields filled in. The candidate seed then enters the seed-induction loop normally, where any gaps or ambiguities surface as findings the Operator addresses.

Current state. Not built. Bridges 1.1's output (a Discovery record produced through assistance) to the existing seed-induction mechanism. The induction loop already exists and works; what's missing is the structural transformation from Discovery's freer shape to the seed's constrained shape.

Trajectory. Originally considered building this as an agent (LLM call with prompt). Set aside in favor of the skill framing because (a) the transformation is bounded — there's a known input shape and a known output shape, (b) an agent's flexibility is not what's needed; the value is precisely the constraint, (c) the skill pattern composes with the rest of Loomworks (extraction skills, materializer registry, etc.) and an agent here would be the only agent in the upload-side pipeline.


Section 2 — Deferred build phases

These are phases that are scoped well enough to draft CRs against but explicitly held back. They will resume when their preconditions are met or when Marvin asks to revisit.

2.1 — Phase 30: induction of existing engagements

Trigger. Any conversation about Enrollium, ExpenseDesk, or FarmGuard entering Loomworks; about how the three pre-existing engagements get on-boarded; about deriving seeds from prior REQ-format material; or about the one-off induction solution that lives separately from Loomworks itself.

Substance. The three existing engagements (Enrollium, ExpenseDesk, FarmGuard) need to enter Loomworks. They have prior material — REQ-format requirements specifications, conversational history, specification artifacts — that should drive their seed creation. The induction solution is a one-off: it derives seeds from prior material rather than from Discovery sessions, and it lives outside Loomworks (it's a migration tool, not a Loomworks feature).

Current state. Deferred until Marvin asks to revisit. Not blocking any current work. The architectural separation (induction solution is not part of Loomworks) is settled; what remains is execution.

2.2 — Phase 32: marketing engagement through conversational creation

Trigger. Any conversation about creating the Loomworks marketing engagement; about practicing the conversational engagement creation pattern (Section 1.1) on a real engagement; about marketing content authored through the methodology; about Loomworks marketing in general.

Substance. Create a Loomworks marketing engagement through the conversational creation process. This is both (a) the marketing engagement itself (real product output) and (b) the first practice of the engagement-creation-assistance pattern on a non-trivial engagement, captured as the prototype for the in-product Discovery assistance feature.

Current state. Deferred until Marvin asks to revisit. The loomworks-marketing-creation-flow-content-v0_3.md and related drafts hold the working material from earlier sessions.

2.3 — Code health audit phase

Trigger. Any conversation about codebase consolidation, semantic duplication, shared utilities, refactoring, or "is the codebase getting messy"; or any concrete observation of duplicated logic across components; or whenever the current bug-fix wave stabilizes and a consolidation phase becomes natural.

Substance. A lightweight consolidation phase similar to Phase 24. Scan the codebase for semantic duplication — repeated business logic across components, similar helper functions in multiple files, parallel implementations of the same pattern. Extract shared utilities. Establish conventions to prevent future duplication. This is maintenance work, not feature work; the goal is to reduce the cost of every subsequent build phase rather than to deliver new capability.

Current state. Scheduled after the current bug-fix wave stabilizes. Not blocking any current work. Worth doing eventually because semantic duplication compounds — a duplication tolerated for one phase becomes three duplications tolerated for the next phase, and so on.


Section 3 — Rendering and image generation

These two entries cover Render-side work that depends on engine phases not yet shipped.

3.1 — External rendering service integration as render specialists

Trigger. Any conversation about Novi AI, Steve AI, Story.com, or other external rendering services; about integrating LLM-as-a-service render providers; about the render specialist pattern; about post-Phase-D adapter work; or about the question "how does Loomworks render content via external services."

Substance. External rendering services (video, animation, complex generated content) integrate as render specialists, not as enumerated built-in integrations. The pattern: a specialist SDK that providers (or the Operator on their behalf) implement against; rendering rules that carry service-specific parameters; the existing render-specialist registry mechanism dispatches by render type to the right specialist. The specialist pattern is the integration surface — extensibility-first.
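As an illustration of dispatch by render type, here is a minimal sketch. Every name in it (RenderRequest, RenderSpecialistRegistry, the "video" render type, the "style" rule, the external-ref URI scheme) is a hypothetical stand-in; this record does not show the real registry's API, and no external service is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


@dataclass
class RenderRequest:
    render_type: str  # e.g. "video" (hypothetical value)
    shape_ref: str    # reference to the Shape being rendered
    rules: Dict[str, str] = field(default_factory=dict)  # service-specific parameters


# A specialist is any callable that turns a request into a result reference.
RenderSpecialist = Callable[[RenderRequest], str]


class RenderSpecialistRegistry:
    """Hypothetical registry: maps a render type to one specialist."""

    def __init__(self) -> None:
        self._specialists: Dict[str, RenderSpecialist] = {}

    def register(self, render_type: str, specialist: RenderSpecialist) -> None:
        self._specialists[render_type] = specialist

    def dispatch(self, request: RenderRequest) -> str:
        specialist = self._specialists.get(request.render_type)
        if specialist is None:
            raise LookupError(f"no specialist registered for {request.render_type!r}")
        return specialist(request)


def example_video_specialist(request: RenderRequest) -> str:
    # Stand-in for a provider-implemented specialist; returns a fake reference.
    style = request.rules.get("style", "default")
    return f"external-ref://video/{request.shape_ref}?style={style}"


registry = RenderSpecialistRegistry()
registry.register("video", example_video_specialist)
result = registry.dispatch(RenderRequest("video", "shape-1", {"style": "storybook"}))
```

Under this shape, adding a provider means registering one more specialist; the dispatch code itself does not change, which is the extensibility-first property the entry describes.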

Current state. Prerequisites for this are now partly shipped: Phase 34 (external-service polling for specialists) landed; Phase 35 (render content kinds — binary blob, external reference, multi-file) is the next engine phase and brings binary content support. After Phase 35, the substrate carries what's needed to support specialists that produce binary or external-referenced content. The specialist SDK design itself is post-Phase-D adapter work.

Trajectory. Originally considered building enumerated integrations (one for Novi, one for Steve, one for Story.com) similar to how API integrations sometimes get built. Set aside in favor of the specialist SDK pattern because (a) the rate of new external services arriving is high and growing, (b) enumerated integrations would force engine releases for every new provider, (c) the specialist pattern already works for the existing render side and composes naturally.

3.2 — Image generation for Goosey storybook art

Trigger. Render comes up in a Goosey context (specifically: post-Phase-D adapter work; depends on engine Phases 35 and 38); or any conversation about children's storybook art generation; or any other engagement that needs character-consistent illustrated content; or general image-generation cost/quality assessment.

Substance. Operator wants help identifying a cost-effective image-generation solution for Goosey storybook art and similar children's storybook engagements. Key requirements: character consistency across pages (the moose should look like the same moose in scene 1 and scene 12), ability to refine conversationally (the workflow is iterative — first draft is rarely final), reasonable cost per image (storybooks have many pages; per-image cost compounds).

Current state. Initial assessment 2026-05-04: Nano Banana 2 (Gemini 3.1 Flash Image) likely fits — character consistency across pages is a stated capability, conversational refinement is a stated capability, cost is in the $0.02–0.15/image range. Worth comparing against Midjourney, Flux, DALL-E successors at decision time; the field moves fast and the assessment will be stale within months.

Trajectory. Image generation has been on the radar since Goosey's first storybook render in early 2026 but was deferred because the substrate couldn't carry binary content. With Phase 35 in flight, the substrate constraint is being removed and the image-generation question becomes near-term. The Nano Banana 2 assessment is intended as a starting point for the actual evaluation, not a recommendation.


Section 4 — Upload facility extension architecture

This section is more substantial than the others because the underlying conversation covered more ground. Three sub-sections: what's already extensible and how, three categories of future data type that don't fit the current shape, and the metadata-driven runtime as a future direction.

4.1 — What's already extensible

The current upload facility (Phase 16 amendments) is code-extensible. Adding a new file type is:

  1. Write a Python skill function implementing the ExtractionSkill protocol (async def __call__(*, file_path, engagement_id, db) -> ExtractionResult).
  2. Register it in app lifespan against a content-type pattern (registry.register("image/", describe_image_skill, label=..., mode=...)).
  3. If the skill needs a new library, add it to dependencies.

This is not zero-code-change extensibility, but it's small-code-change extensibility. The contribution endpoint, the frontend's mode-button logic, and the supported-types discovery all work without modification — a new file type adds one Python file, one registration line, and (sometimes) one dependency. The architecture is set up to make per-file-type cost low.

This handles every variation of existing categories cleanly: new audio formats, new image formats, new document formats, spreadsheet contents flattened to text, code files, email files. The ExtractionSkillRegistry's content-type-keyed dispatch is the right shape for these.
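The three steps above can be sketched as follows. This is a stand-in under stated assumptions, not the Loomworks code: the registry internals, the longest-prefix matching rule, the plain_text_skill function, and the label and mode values are all hypothetical. Only the keyword-only call shape and the register(pattern, skill, label=..., mode=...) form come from the text above.

```python
import asyncio
import os
import tempfile
from dataclasses import dataclass
from typing import Any, Awaitable, Callable, List, Optional, Tuple


@dataclass
class ExtractionResult:
    text: str


# Keyword-only call shape from step 1: (*, file_path, engagement_id, db).
ExtractionSkill = Callable[..., Awaitable[ExtractionResult]]


class ExtractionSkillRegistry:
    """Stand-in registry: dispatches on content-type prefix (assumed behavior)."""

    def __init__(self) -> None:
        self._entries: List[Tuple[str, ExtractionSkill, str, str]] = []

    def register(self, pattern: str, skill: ExtractionSkill, *, label: str, mode: str) -> None:
        self._entries.append((pattern, skill, label, mode))

    def resolve(self, content_type: str) -> Optional[ExtractionSkill]:
        # Longest matching prefix wins, so "text/csv" would beat "text/".
        matches = [entry for entry in self._entries if content_type.startswith(entry[0])]
        if not matches:
            return None
        return max(matches, key=lambda entry: len(entry[0]))[1]


async def plain_text_skill(*, file_path: str, engagement_id: str, db: Any) -> ExtractionResult:
    # Hypothetical skill: reads the file as UTF-8 text (blocking read, kept simple).
    with open(file_path, encoding="utf-8") as handle:
        return ExtractionResult(text=handle.read().strip())


def demo() -> str:
    registry = ExtractionSkillRegistry()
    registry.register("text/", plain_text_skill, label="Plain text", mode="extract")
    fd, path = tempfile.mkstemp(suffix=".txt")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            handle.write("Parked on level 10.\n")
        skill = registry.resolve("text/plain")
        assert skill is not None
        result = asyncio.run(skill(file_path=path, engagement_id="eng-1", db=None))
    finally:
        os.unlink(path)
    return result.text
```

The new file type costs one skill function and one register call, which is the "small-code-change" property described above.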

4.2 — Three categories of future data type that strain the current architecture

These are categories where "file → text → assertion" doesn't fit cleanly. Each one is real future work; each one has a known architectural response.

Category 1 — Multi-stage extraction (file → intermediate → text). Some data types need intermediate processing before text extraction makes sense: a video file needs frame extraction → per-frame Vision description → assembled narrative; a long audio recording needs speaker diarization → per-speaker transcription → structured transcript; a 3D model file needs orthographic projection → vision-based description; a compressed archive needs enumeration → recursive extraction of each member. Today's ExtractionSkill protocol can technically handle these (a video skill could do all the intermediate work inside its __call__), but the per-call duration goes from "seconds" to "many minutes," which the synchronous contribution endpoint can't accommodate. Architectural response: apply Phase 34's external-service polling pattern. Extend ExtractionSkill to optionally return a handle (ExtractionInProgress) the way RenderSpecialist now optionally returns ExternalProductionHandle; the contribution endpoint surfaces a "processing" state; a polling loop drives completion. This is real work but a known pattern.
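A minimal sketch of the handle-and-poll response, assuming a skill may return either a finished ExtractionResult or an ExtractionInProgress handle. The job_id field, the drive_to_completion helper, its interval and poll-limit parameters, and FakeVideoService are hypothetical; the record names only the ExtractionInProgress concept.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Union


@dataclass
class ExtractionResult:
    text: str


@dataclass
class ExtractionInProgress:
    job_id: str  # opaque handle for the long-running extraction (assumed field)


Outcome = Union[ExtractionResult, ExtractionInProgress]
PollFn = Callable[[str], Awaitable[Outcome]]


async def drive_to_completion(
    first: Outcome, poll: PollFn, *, interval_s: float = 0.01, max_polls: int = 100
) -> ExtractionResult:
    """Poll until the handle resolves to a result, or give up after max_polls."""
    current = first
    polls = 0
    while isinstance(current, ExtractionInProgress):
        if polls >= max_polls:
            raise TimeoutError(f"extraction {current.job_id} still in progress")
        await asyncio.sleep(interval_s)
        current = await poll(current.job_id)
        polls += 1
    return current


class FakeVideoService:
    """Stand-in for a slow extractor: finishes on the third poll."""

    def __init__(self) -> None:
        self.polls = 0

    async def poll(self, job_id: str) -> Outcome:
        self.polls += 1
        if self.polls < 3:
            return ExtractionInProgress(job_id)
        return ExtractionResult(text=f"narrative for {job_id}")


service = FakeVideoService()
final = asyncio.run(drive_to_completion(ExtractionInProgress("job-42"), service.poll))
```

While the loop is running, the contribution endpoint would report the "processing" state; a skill that finishes quickly returns an ExtractionResult directly and the loop body never executes.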

Category 2 — Multi-output extraction (file → many assertions). The current architecture commits each contribution to a single assertion. Some data types naturally produce many: a long meeting transcript should yield one assertion per topic, not one massive assertion; a spreadsheet of measurements should yield one assertion per row; an email thread should yield one assertion per message; a multi-page PDF where each page is a distinct topic should yield one assertion per page. Today the skill returns ExtractionResult.text: str (singular), and the user gets one held assertion. The cleanest extension: ExtractionResult becomes list[ExtractionResult] (or the protocol gains an extract_many variant); the contribution endpoint creates N held assertions, each with its own slice of provenance metadata pointing back to the same source file; the frontend (already capable of displaying multiple held assertions per Phase 17's redirect work) handles the UI side without change. Architectural response: small CR extending ExtractionResult to support a list, plus the endpoint create-loop.
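The list-returning extension and the endpoint create-loop might look like the sketch below. The provenance field, the HeldAssertion type, and the extract_many_rows function are illustrative assumptions; the text above specifies only that each held assertion carries its own provenance slice pointing back to the same source file.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ExtractionResult:
    text: str
    provenance: Dict[str, str] = field(default_factory=dict)  # assumed field


@dataclass
class HeldAssertion:
    engagement_id: str
    text: str
    provenance: Dict[str, str]


def extract_many_rows(source_file: str, rows: List[str]) -> List[ExtractionResult]:
    """Hypothetical extract_many variant: one result per non-empty row."""
    return [
        ExtractionResult(
            text=row.strip(),
            provenance={"source_file": source_file, "row": str(index)},
        )
        for index, row in enumerate(rows, start=1)
        if row.strip()
    ]


def create_held_assertions(
    engagement_id: str, results: List[ExtractionResult]
) -> List[HeldAssertion]:
    """The endpoint's create-loop: N results become N held assertions."""
    return [
        HeldAssertion(engagement_id, result.text, dict(result.provenance))
        for result in results
    ]


held = create_held_assertions(
    "eng-1", extract_many_rows("measurements.csv", ["pH 6.8", "", "pH 7.1"])
)
```

Row numbers are taken from the original file position, so the blank second row is skipped but the third row still records row "3" in its provenance.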

Category 3 — Non-text-shaped knowledge. Some knowledge isn't usefully shaped as text. A photograph as evidence (the assertion is the image, not "this image shows X"); a spreadsheet of structured data (the assertion is the data); a recorded conversation where audio carries information transcription would lose; a diagram where structure matters. This is the deepest strain, because the methodology assumes assertions are propositions in language. Two viable directions:

Direction A handles every case that's actually arrived. Direction B is queued for whenever non-text knowledge becomes a real engagement requirement.

Architectural asymmetry worth noting. The upload boundary (the contribution endpoint and the registry) is well-shaped for extension. The downstream consumers (Manifestation, Shaping, Render) currently assume text-shaped assertions. If non-text knowledge becomes important, the bigger work isn't extending upload — it's extending what comes downstream. Direction B's full cost is at the consumer side, not the producer side.

4.3 — Metadata-driven runtime as a future direction

Trigger. Any conversation about whether the upload facility can be extended without modifying the codebase; about Operator-defined extraction; about plugin systems for Loomworks; about cross-surface DSLs (extraction, rendering, shaping all describable in the same metadata language); about the gap between "code-extensible" and "configuration-extensible"; about how a non-developer Operator could add a new file type.

Substance. A metadata-driven runtime would replace today's code-extensible pattern (write a Python skill, register it) with a configuration-extensible pattern (author a data record describing the extraction; a runtime interpreter executes it; new file types added without writing Python). Three layers exist within this direction:

Three trigger conditions justify reaching for a metadata-driven runtime (any one of which would make the build worth it):

  1. Operator Layer commits to operator-defined extraction. If a clinical-trial Operator wants to ingest a particular FHIR bundle format, or a legal Operator wants to parse a particular court-filing PDF schema, having them describe the extraction in metadata rather than wait for a developer is exactly what the Operator role implies. This is the most likely first trigger.
  2. Per-skill code authoring becomes the dominant cost. The current pace (new skill arrives every several months) is far below the break-even point. If new skills started arriving every week or two, the per-skill code cost would compound and the metadata DSL would pay back its build cost.
  3. Cross-surface amortization. Render specialists, shaping skills, extraction skills, engagement-creation flows, and the operator-supplied API integrations all have similar shape — describe an operation, parameterize it, run it. If Loomworks ever builds the metadata-driven runtime once, it amortizes across all these surfaces, not just upload extraction. The cost of building it once for one surface is high; the cost of building it once for five surfaces is roughly the same and the per-surface payback is much higher.
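To make the contrast between code-extensible and configuration-extensible concrete, here is a small sketch of a data record interpreted by a runtime. The record schema, the operation names, and the "text/x-log" content type are invented for illustration; no such DSL exists in Loomworks today, and a real design would need far more than three string operations.

```python
from typing import Callable, Dict

# Operations the interpreter knows about. A new file type reuses these through
# configuration; only a genuinely new operation would need new code.
OPERATIONS: Dict[str, Callable[[str, Dict[str, str]], str]] = {
    "strip": lambda text, args: text.strip(),
    "drop_lines_starting_with": lambda text, args: "\n".join(
        line for line in text.splitlines() if not line.startswith(args["prefix"])
    ),
    "replace": lambda text, args: text.replace(args["old"], args["new"]),
}

# A data record describing an extraction (hypothetical schema).
LOG_FILE_EXTRACTION = {
    "content_type": "text/x-log",
    "steps": [
        {"op": "drop_lines_starting_with", "prefix": "#"},
        {"op": "replace", "old": "\t", "new": " "},
        {"op": "strip"},
    ],
}


def run_extraction(record: dict, raw_text: str) -> str:
    """Interpret the record: apply each declared step in order."""
    text = raw_text
    for step in record["steps"]:
        operation = OPERATIONS.get(step["op"])
        if operation is None:
            raise ValueError(f"unknown operation: {step['op']}")
        text = operation(text, step)
    return text


extracted = run_extraction(LOG_FILE_EXTRACTION, "# header\nstart\tok\n# note\nstop\tok\n")
```

The same interpreter shape would serve rendering or shaping records, which is where the cross-surface amortization argument comes from.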

Current state. Not built. Today's code-extensible pattern is right-sized for current loads, current Operator model (one Operator + Claude, not many self-serving Operators), and current new-skill arrival rate. The path is well-understood; the build is queued, not pursued.

Trajectory (load-bearing). This is the entry where preserving the conversational trajectory matters most.

The first response in the conversation called the metadata-driven runtime "wrong-shaped for the product" — a permanent judgment, implying the door was closed. Marvin pushed back: "But does the extensibility accommodate new file types without modifying the code base?" — surfacing that the original answer had been too quick to dismiss. The corrected framing was "premature given today's loads" — a queued judgment, implying the door is open and the conditions for opening it just haven't arrived.

The distinction matters because:

The error in the first response was treating a current-state cost-benefit judgment as if it were a permanent architectural judgment. The two are different things, and conflating them is the kind of mistake that closes doors that didn't need to be closed.

The corrected position: the metadata-driven runtime is the right architecture for a future Loomworks where Operators define their own extraction, where the new-skill rate is high, and where the same runtime composes across multiple surfaces. None of those conditions bind today. All of them are plausible in the medium term. Reach for it when at least one of them starts to bind, not before.


Section 5 — Agent capability development

The two entries in this section emerged from the Phase 47 scoping arc (2026-05-07) when external-platform agent developments at Anthropic — Claude Dreaming and the broader Managed Agents primitives — surfaced as relevant to Loomworks' Companion and specialist trajectory. Neither is Loomworks-native; both are external work that maps usefully onto Loomworks elements. Captured here so the mapping survives across sessions and so the relevant Loomworks attachment points are explicit.

5.1 — Claude Dreaming as a Companion behavior pattern

Trigger. Any conversation about agent memory consolidation; about pruning, merging, or reorganizing accumulated knowledge; about "is the engagement Memory getting messy" or similar concerns about long-running engagements accumulating low-signal assertions; about Claude Dreaming, AutoDream, or /dream (the Claude Code variant); about cross-session pattern detection; about whether the Companion should observe and act on patterns across an engagement's history; about Accounting engagement reconciliation as a recurring evaluator task; about observation-to-Shape transitions where accumulated Memory becomes a structured specification proposal; about whether Loomworks should incorporate dreaming-style behaviors as agent capabilities mature.

Substance. Anthropic shipped Claude Dreaming for Managed Agents in research preview on 2026-05-06: a scheduled process that reviews past sessions and memory stores, extracts patterns, prunes stale entries, merges duplicates, and resolves contradictions. In Claude Code specifically, Auto Dream consolidates CLAUDE.md and MEMORY.md files between sessions. The capability addresses a real problem: as agent memory accumulates, it becomes redundant, contradictory, and stale; the agent's effective context degrades.

The same problem applies to Loomworks engagement Memory at scale. As an engagement runs over months, assertions accumulate. Some become redundant (different assertions saying overlapping things). Some contradict newer assertions. Some become stale. The Companion's context window fills with low-signal material.

The governance model is what differs. Claude Dreaming modifies memory autonomously by default (with optional human review). Loomworks is built on Operator authority — the machine surfaces and signals, the Operator approves. Autonomous memory modification is a category error in Loomworks. Loom Protocol's no-erasure guarantee is structural.

The Loomworks-native version of dreaming routes through the engagement pipeline. The Companion proactively reviews Memory and proposes consolidations or retractions. "Assertions #12 and #23 say overlapping things — want me to retract #12 and amend #23 to cover both?" That proposal enters Shaping. The Operator approves. The retraction is FORAY-attested. The provenance survives.
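The governed version can be sketched as a function that only emits proposals and never touches Memory. The word-overlap (Jaccard) test, the 0.6 threshold, the rule of retracting the lower-numbered assertion, and the ConsolidationProposal type are all illustrative assumptions; a real evaluator would more plausibly use an LLM judgment than token overlap.

```python
from dataclasses import dataclass
from itertools import combinations
from typing import Dict, List, Set


@dataclass(frozen=True)
class ConsolidationProposal:
    retract_id: int
    amend_id: int
    overlap: float
    status: str = "pending_operator_approval"  # nothing happens until approved


def _tokens(text: str) -> Set[str]:
    return {word.strip(".,;:!?").lower() for word in text.split()} - {""}


def propose_consolidations(
    assertions: Dict[int, str], threshold: float = 0.6
) -> List[ConsolidationProposal]:
    """Surface overlapping assertions as proposals; Memory itself is never modified."""
    proposals = []
    for (id_a, text_a), (id_b, text_b) in combinations(sorted(assertions.items()), 2):
        tokens_a, tokens_b = _tokens(text_a), _tokens(text_b)
        if not tokens_a or not tokens_b:
            continue
        overlap = len(tokens_a & tokens_b) / len(tokens_a | tokens_b)
        if overlap >= threshold:
            proposals.append(
                ConsolidationProposal(retract_id=id_a, amend_id=id_b, overlap=round(overlap, 2))
            )
    return proposals


memory = {
    12: "The car is parked on level 10.",
    23: "The car is parked on level 10 near the lift.",
    31: "Invoice 4 was paid in March.",
}
proposals = propose_consolidations(memory)
```

Each proposal would then enter Shaping for Operator approval; the function's only output is the list of proposals, so the no-erasure guarantee is untouched by the review step itself.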

Specific applications across Loomworks:

Current state. The infrastructure exists. Phase 44 trigger evaluator + proactive observation pathway + four-room pipeline give the Companion everything it needs to "dream" — the missing pieces are specific evaluator implementations and prompt instructions for each application above. None are urgent today. They become valuable when:

Trajectory. Originally framed as "should Loomworks adopt dreaming?" The corrected framing: dreaming is not a capability to adopt — it's a use case for capabilities Loomworks already has. The substantive question is which evaluator tasks and prompt instructions to build, not whether to graft external machinery onto the substrate. The methodology and governance differences (autonomous vs. pipeline-governed) are what they are; they're not in tension because Loomworks isn't trying to compete with Managed Agents on agent infrastructure — they're at different altitudes.

The relationship: Anthropic is publicly working out the conceptual primitives for long-running agent memory. Loomworks borrows the vocabulary (consolidation, hygiene, pattern extraction) and applies it through Loomworks' governance model. The borrowing is honest because both are addressing the same underlying reality.

5.2 — Claude Managed Agents conceptual primitives

Trigger. Any conversation about agent infrastructure platforms; about Anthropic's Managed Agents service; about lead/sub-agent orchestration patterns; about grader agents or outcome-based evaluation; about session lifecycle, vault credentials, or webhooks for agent platforms; about whether the Companion should be implemented on Managed Agents or on Loomworks' own substrate; about specialist composition extending to multi-agent decomposition; about how external systems should subscribe to engagement lifecycle events (post-Arc-2 external connectors); about render review or quality assessment for composed Render outputs; about the relationship between Loomworks-as-methodology and external agent platforms.

Substance. Anthropic's Managed Agents (GA April 2026; expanded 2026-05-06 with Outcomes, multi-agent orchestration, and webhooks alongside the Dreaming research preview) is a managed agent infrastructure platform. The platform's vocabulary gives Loomworks four conceptual primitives worth tracking as the Companion and specialists develop into full agents.

Primitive 1 — Outcomes and grader agents → Shape-as-acceptance-criteria + review specialist. Managed Agents lets users define criteria for what "good" looks like; a grader agent evaluates output against those criteria. Loomworks already has the criteria — the Shape is a complete specification (Phase 38 grammar declaration formalizes this). What's missing is the explicit grader. A review specialist that reads the Shape and the Render and produces an acceptance assessment is a natural extension of Loomworks' specialist pattern. Becomes important when (a) render specialists compose other specialists' work (Phase 37 territory) and "is this Render faithful to the Shape?" needs to be answerable without the Operator personally reading every output, (b) automated quality gates become valuable at platform scale.
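A deliberately simple sketch of the review-specialist idea: criteria derived from a Shape are checked against a Render, producing an acceptance assessment. The Criterion and AcceptanceAssessment types and the phrase-matching check are hypothetical; a real grader would evaluate faithfulness with far richer logic than substring tests.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Criterion:
    name: str
    required_phrase: str  # simple stand-in for a real acceptance check


@dataclass
class AcceptanceAssessment:
    accepted: bool
    failed: List[str]


def review_render(shape_criteria: List[Criterion], render_text: str) -> AcceptanceAssessment:
    """Check every criterion; the Render is accepted only if none fail."""
    lowered = render_text.lower()
    failed = [
        criterion.name
        for criterion in shape_criteria
        if criterion.required_phrase.lower() not in lowered
    ]
    return AcceptanceAssessment(accepted=not failed, failed=failed)


shape = [
    Criterion("names the moose", "moose"),
    Criterion("mentions the lake", "lake"),
]
assessment = review_render(shape, "The Moose walked into the forest.")
```

The assessment lists which criteria failed, so the Operator can look only at flagged Renders instead of reading every output.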

Primitive 2 — Multi-agent orchestration → lead/sub-specialist composition with engagement Memory as shared substrate. Managed Agents supports lead agents that decompose work and assign to sub-agents working in parallel on a shared filesystem, with results consolidated through the lead agent's context. Loomworks has render specialists, and Phase 37 introduced adapter chaining and composition — partway there. The Managed Agents pattern points where this extends: a lead specialist that orchestrates sub-specialists, with the engagement's Memory as the shared substrate (instead of a generic filesystem). The shape-grammar declaration work (Phase 38) makes this more tractable because sub-specialists can declare what fragments they produce and the lead can compose them. As specialists become richer, this orchestration vocabulary becomes the way to describe them.

Primitive 3 — Memory + Dreaming → engagement Memory + governed consolidation. See 5.1. The conceptual primitive: persistent agent memory needs active maintenance. Dreaming validates the problem; Loomworks governs the solution differently.

Primitive 4 — Sessions and webhooks → engagement participation periods + lifecycle events. Managed Agents treats sessions as first-class objects with lifecycle events that external systems can subscribe to. Loomworks has engagement participation but no equivalent first-class session boundary or webhook surface. When Arc 2's external service connectors come online (Stripe link, messaging, calendar, third-party render services), the question "how do external systems know what's happening in an engagement?" needs answering. Webhooks on engagement state transitions, render completions, approval events — that's the integration surface. The Managed Agents shape gives the primitive without prescribing the implementation.
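
One possible shape for that integration surface, as a hedged sketch: a lifecycle event envelope serialized to canonical JSON and signed with HMAC-SHA256 so subscribers can verify origin. The event type names, the LifecycleEvent fields, and the signing scheme are all assumptions made for illustration; nothing here is prescribed by Managed Agents or already built in Loomworks.

```python
import hashlib
import hmac
import json
from dataclasses import asdict, dataclass

# Hypothetical event names covering the three cases named in the text.
EVENT_TYPES = {"engagement.state_changed", "render.completed", "approval.recorded"}


@dataclass(frozen=True)
class LifecycleEvent:
    event_type: str
    engagement_id: str
    occurred_at: str   # ISO-8601 timestamp
    detail: dict


def serialize(event):
    """Canonical JSON bytes, so sender and receiver sign identical input."""
    if event.event_type not in EVENT_TYPES:
        raise ValueError(f"unknown event type: {event.event_type}")
    return json.dumps(asdict(event), sort_keys=True, separators=(",", ":")).encode("utf-8")


def sign(body, secret):
    return hmac.new(secret, body, hashlib.sha256).hexdigest()


def verify(body, secret, signature):
    # compare_digest avoids leaking match position through timing.
    return hmac.compare_digest(sign(body, secret), signature)
```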

The relationship between Loomworks and Managed Agents. Managed Agents is an agent infrastructure layer. Loomworks is a methodology and governance environment for agent-mediated work. They're at different altitudes. Loomworks could run on Managed Agents under the hood, or run its own substrate; the methodology doesn't change either way. What matters is that Anthropic is publicly working out conceptual primitives — outcomes, orchestration, memory, dreaming, sessions — and those primitives compose well with what Loomworks is building. Borrowing the vocabulary is cheaper than inventing parallel terms, and the mapping is honest because both are addressing the same underlying reality of long-running agent work.

Current state. None of the four primitives are built into Loomworks today. Each maps to a specific future direction:

Trajectory. Considered framing as "should Loomworks build on Managed Agents?" Set aside because the question conflates two layers. Managed Agents is infrastructure; Loomworks is methodology. The implementation choice (own substrate vs. Managed Agents) is independent of the conceptual borrowing. The conceptual borrowing is the load-bearing insight: vocabulary that Anthropic is making canonical in the broader ecosystem can be adopted in Loomworks without coupling Loomworks' substrate to Anthropic's platform.

The corrected position: Loomworks builds its own substrate and borrows external vocabulary where it composes well. The Managed Agents primitives are the most useful current example — outcomes/grader, orchestration, memory/dreaming, sessions/webhooks all map cleanly onto Loomworks elements (Shape, specialist composition, engagement Memory, engagement lifecycle). Track new primitives as they emerge from external work; absorb the ones that map cleanly; don't import implementations.


Section 6 — Mobile presence and quick-capture engagement pair

The two entries in this section are the substrate side and the capture-device side of the Operator's mobile experience of the Companion. They emerged from the conversation initiated by the question "Would it be possible to have the Companion present and listening on my mobile (iPhone or Android) similar to how Siri is always present on my iPhone? I would like to be able to say 'Hey Companion, I parked on level 10.'" The conversation surfaced two engagements rather than one, and produced a pair of investigations that should be read together: loomworks-quick-capture-engagement-investigation-v0_2.md and loomworks-mobile-presence-investigation-v0_1.md.

The pair is paced by three constraints worth flagging up front: (i) the substrate side is prerequisite to the capture-device side (mobile is the highest-value surface but cannot ship before /quick-capture exists); (ii) persona-emergence dependency (the voice register fires at much higher frequency than the conversational surface; persona instability is felt sharply at this volume); (iii) mobile-engineering-capacity is the larger pacing constraint (native iOS + native Android is a substantial build, larger than any single substrate phase Loomworks has shipped, and warrants engineering capacity beyond the current DUNIN7-M4 + CC cadence). All three are pacing-readiness concerns, not architecture concerns; the architecture is settled enough at investigation level for scoping to begin when capacity allows.

6.1 — Quick-capture engagement (substrate)

Trigger. Any conversation about dictation surfaces, fast-path Memory contribution, "Hey Companion, ..." utterance shapes, the parking / grocery-list / "I just parked" use case, append-only Memory contribution with terminal acknowledgment, single-valued vs. multi-valued slot semantics, Companion-noticed supersession, the held-layer-as-supersession-resolution-surface, or any conversation about when an utterance should bypass the full Phase 42 converse pipeline because it's structurally not a conversation. Also: any conversation about queries against quick-captured Memory ("Where did I park?"), or about disambiguation between same-day same-slot facts ("I parked at level 7" after earlier "I parked at level 10").

Substance. Quick-capture is one class of bidirectional Memory traffic on a fast path with three modes: write-clean, write-with-resolution, and read. Capture is dictation-class (capture, route, author held assertion, brief acknowledgment, stop). Routing is engagement-determined via the Companion's three-layer router (Operator-declared rules → engagement-context default → light heuristic). Writes that conflict with structurally similar prior held assertions surface either Companion-noticed supersession (clean cases; auto with naming acknowledgment) or one focused disambiguation question (ambiguous cases). Reads against quick-captured Memory are a sibling mode on the same surface.
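
The three-layer routing order can be sketched as below. Only the layer order (Operator-declared rules, then engagement-context default, then light heuristic) comes from the text. Keyword matching, the Rule type, and the final "inbox" fallback are assumptions added so the example is complete.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    """An Operator-declared rule: utterances containing keyword go to engagement."""
    keyword: str
    engagement: str


def route(utterance, rules, context_default, heuristic_keywords, fallback="inbox"):
    """Return (engagement, layer) for the first layer that produces an answer."""
    text = utterance.lower()
    # Layer 1: Operator-declared rules win outright.
    for rule in rules:
        if rule.keyword in text:
            return rule.engagement, "operator-rule"
    # Layer 2: the engagement the Operator is currently working in, if any.
    if context_default is not None:
        return context_default, "context-default"
    # Layer 3: light heuristic over per-engagement keyword hints.
    for engagement, words in heuristic_keywords.items():
        if any(word in text for word in words):
            return engagement, "heuristic"
    # Assumption: unrouted captures land somewhere rather than being dropped.
    return fallback, "fallback"
```

Returning the layer alongside the destination keeps the routing decision explainable, which matters if the acknowledgment ever needs to say why a capture went where it did.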

The architectural question — intent-class extension on the existing classifier path vs. pre-classifier fast path vs. hybrid — is open. Most likely hybrid: fast path for dedicated capture surfaces (mobile shortcuts, watch complications, keyboard shortcuts), intent extension for chat-pathway invocation. Same router, two entry points.

The methodology-finding-grade observation: single-valued vs. multi-valued slots as an engagement-declared property; Companion-noticed supersession on the held layer as distinct from Operator-explicit retraction. Single-valued slots ("current parking location") supersede on update; multi-valued slots ("items on the grocery list") accumulate. The engagement declares which is which. The held layer is the cheap-supersession layer because supersession before commit doesn't require retraction ceremony. The asynchronous-commit window doubles as the supersession window. Phase 38's declare-and-register pattern is a natural extension surface for slot-semantics declaration.
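
The slot semantics can be illustrated with a small sketch. It covers only the clean case, where a new value for a single-valued slot supersedes the prior held assertion automatically; the ambiguous case that triggers a disambiguation question is not modeled. SlotKind, HeldAssertion, and HeldLayer are hypothetical names.

```python
from dataclasses import dataclass, field
from enum import Enum


class SlotKind(Enum):
    SINGLE = "single-valued"   # e.g. current parking location
    MULTI = "multi-valued"     # e.g. items on the grocery list


@dataclass
class HeldAssertion:
    slot: str
    value: str
    superseded: bool = False


@dataclass
class HeldLayer:
    slot_kinds: dict                       # engagement-declared: slot -> SlotKind
    assertions: list = field(default_factory=list)

    def capture(self, slot, value):
        """Record a held assertion; return the assertions it superseded."""
        replaced = []
        if self.slot_kinds[slot] is SlotKind.SINGLE:
            for prior in self.assertions:
                if prior.slot == slot and not prior.superseded:
                    # Superseded before commit, so no retraction is needed.
                    prior.superseded = True
                    replaced.append(prior)
        self.assertions.append(HeldAssertion(slot, value))
        return replaced

    def current(self, slot):
        return [a.value for a in self.assertions if a.slot == slot and not a.superseded]
```

The returned list of superseded assertions is what would let an acknowledgment name the value being replaced.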

The voice register expands from one template to four: clean-ack ("Got it. Level 10."), query-response ("Level 10 — at 9:13 this morning."), disambiguation-question ("Got it — earlier today you said level 10. Replace it?"), resolution-ack ("Done. Level 7 it is."). All short; all dictation-class; resistant to the conversational pipeline's tendency toward warmth and elaboration. Likely ships through a voice loader at loomworks/orchestration/quick_capture_voice.py analogous to the credit_voice loader from Phases 49 and 50.

The router is name-agnostic: wake-word recognition is upstream of the substrate; the router receives an utterance and a destination, not a wake-word-keyed dispatch. Companion-name authority lives in the companion_name column (Phase 41); every surface reads from there.

Current state. Not built. Investigation v0.2 is the architectural framing. A scoping note is the next deliverable, paired with the mobile-presence scoping (6.2). The CR follows after scoping.

Trajectory. v0.1 of the investigation framed quick-capture as a third class of turn in a methodology trinity (Companion-proposes/commits, Companion-converses, Companion-routes-and-records). v0.2 reverted that framing: quick-capture is one class of bidirectional Memory traffic with sub-modes, not a third write-class peer. The methodology distinctions live on the write-vs-read and clean-vs-conflicting axes within quick-capture, not at the class-of-turn level. The reversion happened because a three-utterance example ("I parked on level 10" / "where did I park" / "I parked at level 7") surfaced that reads (the second utterance) don't fit "routes-and-records" — they're a sibling mode on the same surface, not a separate class. The trinity framing was making a class distinction at too coarse a level; v0.2 lands it at the right level (sub-modes within quick-capture). Preservation of the v0.1 framing matters because someone reading "third class in a trinity" without the trajectory would not know to look for the sub-mode framing — they'd assume the original answer was the right one.

A second trajectory note: the held-vs-committed posture for alpha was framed in v0.1 as Option 2 (held + asynchronous batch commit), recommended on methodologically-conservative grounds. v0.2 sharpens — Option 2 is more than conservative; it's the architecturally-required layer for the temporal-coherence pattern. Companion-noticed supersession needs the held layer to operate cheaply; without it, every supersession would require explicit Operator-side retraction. The alpha posture has a methodologically-positive reason, not just a defensive one. This is worth preserving because future scoping might be tempted to skip the held layer for "performance reasons" or similar, and the methodology cost of doing so would be high.

6.2 — Mobile presence engagement (capture-device)

Trigger. Any conversation about iPhone / Android Companion presence, "Hey Siri" / "Hey Google" routing to Loomworks, App Intents, App Actions, Siri Shortcuts, foreground wake-word detection, on-device wake-word libraries (Picovoice Porcupine, Snowboy), tap-to-speak surfaces (widgets, watch complications), the Operator Layer extending to mobile, federation between web/desktop and mobile surfaces, the mobile chat surface, push notifications for proactive Companion behavior, offline quick-capture queueing, or the parking-garage example use case (which is often spoken while offline, inside a parking garage, before the substrate is reachable).

Substance. Mobile presence is multi-surface, not single-surface. The literal "always listening like Siri" interpretation is closed for third parties on iOS (Apple reserves system-level wake-word for Siri) and effectively closed on Android (foreground services with persistent notifications are the closest available; battery and OEM-killer dynamics make them unreliable for always-on). The right framing is composition of surfaces, each appropriate for a different posture:

The federation framing is methodology-finding-grade: mobile is an Operator Layer surface, not a forked substrate. Same companion_name, same engagements, same Memory, same converse pipeline, same approval cards. The web Operator Layer is "Operator Layer (web)"; mobile is "Operator Layer (mobile)." Same family, different form factor. The principle is naming-neutral and form-factor-neutral; the same composition would apply to a future TV surface, car surface, or kiosk surface.

Offline posture is a first-class concern, not a footnote. The Operator's opening example, "I parked on level 10," is often spoken in a parking garage, where mobile data is unreliable. Local timestamping with replay-time idempotency is the cleanest answer: captures land in a local queue with their original timestamps; the queue drains when connectivity returns; the voice acknowledgment speaks immediately on capture (the voice template is loaded on-device, so the ack isn't network-gated). Substrate idempotency on (utterance, time-window, person) tolerates offline-replay edge cases, such as a queue that is retried after a dropped connection and delivers the same capture twice.
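
A minimal sketch of the queue-and-replay idea, assuming the idempotency key is a hash of (person, time bucket, normalized utterance). The 60-second window, the Capture and Substrate types, and the in-memory store are illustrative assumptions rather than the planned implementation.

```python
import hashlib
from dataclasses import dataclass

WINDOW_SECONDS = 60  # assumption: width of the idempotency time window


@dataclass(frozen=True)
class Capture:
    person: str
    utterance: str
    captured_at: float   # epoch seconds, stamped on the device at capture time


def idempotency_key(capture, window=WINDOW_SECONDS):
    bucket = int(capture.captured_at // window)
    normalized = " ".join(capture.utterance.lower().split())
    raw = f"{capture.person}|{bucket}|{normalized}"
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


class Substrate:
    """Stand-in for the substrate side: accepts each key at most once."""

    def __init__(self):
        self.seen = {}

    def accept(self, capture):
        key = idempotency_key(capture)
        if key in self.seen:
            return False   # replayed duplicate, ignored
        self.seen[key] = capture
        return True


def drain(queue, substrate):
    """Replay queued captures in original order; return how many were new."""
    accepted = 0
    for capture in sorted(queue, key=lambda c: c.captured_at):
        if substrate.accept(capture):
            accepted += 1
    queue.clear()
    return accepted
```

Because a replay carries the original device timestamp, it hashes to the same key as the first delivery. One limitation of fixed buckets is worth noting: two genuinely duplicate utterances stamped on either side of a bucket boundary would not be collapsed.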

The recommended primary form is native (iOS Swift/SwiftUI, Android Kotlin/Jetpack Compose). PWA loses most of mobile-presence's value (App Intents, foreground wake-word, watch complication, widgets all require native). Cross-platform (React Native, Flutter, Kotlin Multiplatform) trades platform-native feel for code reuse; some integrations need native bridges anyway.

Current state. Not built. Investigation v0.1 is the architectural framing. A scoping note is the next deliverable, paired with quick-capture's scoping (6.1). The build follows after scoping and after the substrate-side /quick-capture endpoint exists. The build is paced by mobile-engineering-capacity (see 6.3) more than by methodology readiness.

Trajectory. Originally framed as "achieve Siri-equivalence." Set aside because the literal interpretation is closed for third parties on both platforms. Reframed as "compose a set of surfaces such that the Operator experiences the Companion as present and reachable, with as little friction as the platforms permit." The composition turns out to be richer than a single wake-word surface would have been: different surfaces serve different postures, and that's a feature. Worth preserving the trajectory because future sessions might revisit the question "why don't we just make the app always-listen like Siri does," and the answer is that Apple and Google reserve that capability for themselves; the App Intents Siri-Shortcut pattern is the closest a third party gets, and it is good in its own right.

A second trajectory note: the watch complication observation. Initial framing treated the watch as a thin extension of the iPhone/Android app, mostly an afterthought. On reflection, the watch is the highest-value form factor for quick-capture — phone-in-pocket, wrist-accessible posture is exactly when "I parked on level 10" wants to be said. The watch surface eliminates the friction of getting the phone out, which is the highest friction of any surface in the composition. Worth preserving because scoping might otherwise treat the watch as "we'll do that later" when the actual usage pattern says it's the peak case.

6.3 — Mobile engineering capacity (pacing readiness, not architecture)

Trigger. Any conversation about when to begin Loomworks mobile-app construction; about whether DUNIN7's current build cadence (one substrate phase at a time on DUNIN7-M4 + CC) is appropriate for a native iOS + native Android build; about specialist contractors for mobile work; about cross-platform frameworks (React Native, Flutter, Kotlin Multiplatform) as a way to compress effort; about the relationship between methodology readiness and engineering readiness; or about the pacing of mobile vs. desktop / substrate work generally.

Substance. Native iOS + native Android is a substantial build, plausibly larger than any single substrate phase Loomworks has shipped. The current build cadence is well-tuned for substrate work but is unlikely to be the right cadence for mobile. Three options surface:

The decision is organization-level, not methodology-level. The methodology is settled enough at investigation level for any of the three to work. What changes between them is timing, cost, and platform-native feel.

Current state. No decision; flagged for the next manifest pass and for whichever scoping pass elevates 6.1 / 6.2 from queued direction to active scope.

Trajectory. Originally not surfaced as a separate concern — the assumption was that mobile would proceed through DUNIN7-M4 + CC like substrate phases. The mobile-presence investigation surfaced the build-volume estimate explicitly: native iOS + native Android is comparable in effort to the Operator Layer's web frontend, which has been a multi-phase undertaking. Treating mobile like a single substrate phase would underestimate the effort by an order of magnitude. Worth preserving because future scoping needs to know that mobile is not a small build, even if the methodology investigations make the architecture look tractable. Architecture being tractable doesn't make execution small.


Section 7 — Multi-instance fleet and jurisdiction governance

The two entries in this section emerged from the deployment-strategy work that paralleled Phase 47 credit-substrate scoping. Together they describe the deployment topology that lets Loomworks ship as a fleet of instances, and the methodology framing (Loom + FORAY + OVA composition) that turns per-instance regulatory jurisdiction into a first-class product feature rather than a custom build for each customer. Captured here so the topology and the methodology survive together — neither is useful without the other.

7.1 — Multi-instance fleet via DevOps engagements

Trigger. Any conversation about Loomworks deployment topology; about hosting jurisdictions or regulatory residency; about white-label and managed-hosting product tiers; about the Authority instance and how grant issuance crosses the fleet; about cross-instance traffic and resilience to instance outages; about how a partner runs their own Loomworks deployment under their own brand; about Hetzner / Railway / AWS / Cloudflare Pages or other hosting choices for production instances; about the public marketing website and how it relates to the Authority; or about fleet management as a Companion task.

Substance. Each Loomworks deployment is a DevOps engagement on DUNIN7's own Loomworks (the control plane). Loomworks manages its own distribution through its own pipeline — fleet management is not a separate tool, it's engagements on Loomworks managing other Loomworks deployments. The pattern carries five load-bearing properties that hang together:

Current state. Substrate built and tagged through Phase 49. The credit schema is ready (phase-47-credit-substrate-foundation, phase-48-credit-completion-and-operator-signin). Phase 50 brings the public credit-request form and the conversion-credit asset_id override — completing the Authority-side surfaces the website needs to talk to. Fleet deployment itself is post-Phase-50 work: standing up Instance A on Hetzner CPX22, then Instance B on Railway, then Instance C on Hetzner Frankfurt for GDPR, plus the marketing website on Cloudflare Pages. Minimum viable test (website + A + B + C, skipping D) costs roughly $77/month plus Anthropic credits — small enough to prove the pattern before AWS-class spend. Source documents: loomworks-deployment-strategy-v0_2.md, loomworks-hosting-cost-analysis-v0_1.md, loomworks-white-label-multilanguage-analysis-v0_1.md.

Trajectory. v0.1 of the deployment strategy framed credit data as a separate database on each instance and centralized the credit store. v0.2 corrected on five points after credit-scoping v0.7 surfaced the underlying architecture: credit data co-locates in the engine DB under the credit schema (trigger atomicity and transactional signup require this); two-engagement governance with Credit Management as Authority and Accounting as state-keeper (replacing the single "Credit Management" framing); email registry on Instance A only (replacing distributed registries with their abuse-boundary holes); grant flow replacing invitation-code flow (the website is the entry point; codes aren't the abstraction); and the marketing website added as a first-class deployment artifact (the form needs somewhere to live). Worth preserving because the v0.1 framing would have produced a fleet that looked superficially similar but failed at exactly the points where the corrections matter — abuse, atomicity, and the form's home.

7.2 — Protocol triangle for jurisdiction governance

Trigger. Any conversation about regulatory jurisdiction as a product feature; about GDPR or EU data residency; about Schrems II or "prove data never left X soil"; about per-instance regulatory framework or retention policy configuration; about how Loomworks handles data sovereignty for regulated customers; about why an Operator might choose Hetzner Frankfurt over Railway US; or about the role each protocol (Loom, FORAY, OVA) plays in compliance posture.

Substance. Jurisdiction is a first-class deployment parameter, and the three protocols compose to make it so:

The composition is what makes jurisdiction selectable. Operators choose hosting based on regulatory need (their domain requires GDPR-bound residency) or privacy preference (they prefer EU jurisdiction even when not strictly required). DUNIN7 doesn't custom-build jurisdiction support per customer — the protocols already compose to give it.

Current state. Conceptually intact because all three protocols already exist: FORAY is live, OVA seam shipped in Phase 45 (per-instance authorization broadened by Phase 49's bimodal dispatch), Loom remembers per-engagement. The per-instance jurisdiction config is implementation work not yet done — Loom-side rule storage, FORAY-side instance-tagged attestations, OVA-side jurisdiction-scoped credentials. Instance C (Hetzner Frankfurt) is the planned EU/GDPR instance per the deployment strategy; the protocol-triangle implementation lands when that instance is stood up.

Trajectory. Original framing (memory-edit era) treated FORAY as carrying the entire compliance story — "FORAY proves it." The conversation sharpened to the triangle once the question "how does an EU operator know their data isn't readable by a US instance?" surfaced — FORAY proves what did happen, but it doesn't prevent unauthorized reads in the first place. Naming the triangle distributes the load: Loom holds the rules, FORAY proves observance after the fact, OVA enforces ahead of the fact. Worth preserving because future implementation might be tempted to lean on FORAY alone (the audit-trail framing is more familiar to compliance-minded customers), and missing the OVA leg would leave the system reactively compliant rather than proactively secure.
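
The division of labor among the three protocols can be sketched in a few lines. This is a toy model under stated assumptions: JurisdictionRule stands in for a Loom-held rule, Credential for an OVA jurisdiction-scoped credential, and Attestation for a FORAY instance-tagged record. None of these types or fields reflect the actual protocol interfaces.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JurisdictionRule:
    """Loom leg: the rule the engagement remembers."""
    engagement: str
    allowed_jurisdictions: frozenset


@dataclass(frozen=True)
class Credential:
    """OVA leg: a credential scoped to one instance and jurisdiction."""
    instance: str
    jurisdiction: str


@dataclass(frozen=True)
class Attestation:
    """FORAY leg: an instance-tagged record of what was attempted."""
    instance: str
    engagement: str
    action: str
    allowed: bool


def governed_read(rule, credential, log):
    # Enforce ahead of the fact: decide before any data is touched.
    allowed = credential.jurisdiction in rule.allowed_jurisdictions
    # Prove after the fact: denied attempts are recorded as well.
    log.append(Attestation(credential.instance, rule.engagement, "read", allowed))
    return allowed
```

Dropping the enforcement check leaves only the log, which is the reactively compliant posture the trajectory note warns against.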


Section 8 — External information seams

Two entries covering how Loomworks engagements pull current state from external services. The architectural move is to build one general-purpose specialist rather than N service-specific adapters; the public-apis catalog is the reference universe of services that specialist can be configured against.

8.1 — General-purpose external API gateway specialist

Trigger. Any conversation about Loomworks accessing external APIs; about per-service adapter sprawl; about pulling current state into engagement working state from external sources (weather, FX, holidays, geo, market data, news, image services, etc.); about how the protocol triangle wraps external data; about the seam-and-stub pattern; about whether to ship N specialists or one generic adapter; about MCP versus REST integration patterns; or about the github.com/public-apis catalog and its relationship to Loomworks.

Substance. Build one general-purpose external API specialist that handles all public APIs through a uniform FORAY-attested, OVA-authorized interface, rather than building N service-specific specialists. The specialist takes a declarative API description (URL, method, authentication pattern, parameters, expected response shape) — held as Memory in the engagement that uses it — and produces calls wrapped with full protocol-triangle discipline:

The architectural payoff is that constraint-creep is bounded. Adding the 200th external service is the same shape as adding the second; the set of API descriptions grows linearly while the specialist count stays at one. The Specialist SDK consumption pattern (Population 1 / 2 / 3) gets a load-bearing first instance: the gateway is itself a Population-1 DUNIN7-built specialist that handles a class.
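
To make the declarative API description concrete, here is a hedged sketch of one possible form. The field list (URL, method, authentication pattern, parameters, expected response shape) follows the text. The ApiDescription type, the auth_header default, the example.com endpoint in the usage note, and the helper names are assumptions. The sketch builds a request without sending it and leaves FORAY and OVA wrapping out entirely.

```python
from dataclasses import dataclass, field
from urllib.parse import urlencode
from urllib.request import Request


@dataclass(frozen=True)
class ApiDescription:
    name: str
    url: str
    method: str = "GET"
    auth: str = "none"                 # "none" | "apiKey" | "OAuth"
    auth_header: str = "X-Api-Key"     # assumption: key travels in a header
    parameters: tuple = ()             # allowed query parameter names
    response_shape: dict = field(default_factory=dict)   # key -> expected type


def build_request(desc, params, api_key=None):
    """Turn a description plus arguments into an unsent urllib Request."""
    unknown = set(params) - set(desc.parameters)
    if unknown:
        raise ValueError(f"undeclared parameters: {sorted(unknown)}")
    headers = {}
    if desc.auth == "apiKey":
        if not api_key:
            raise ValueError("this API requires an api key")
        headers[desc.auth_header] = api_key
    query = urlencode(sorted(params.items()))
    full_url = f"{desc.url}?{query}" if query else desc.url
    return Request(full_url, method=desc.method, headers=headers)


def matches_shape(desc, payload):
    """Check that every declared key is present with the declared type."""
    return all(isinstance(payload.get(k), t) for k, t in desc.response_shape.items())
```

A usage example with a made-up endpoint: an ApiDescription named "fx-rates" at https://api.example.com/v1/rates with parameters ("base", "quote") and response_shape {"rate": float} would be the whole per-service configuration. Adding another service means adding another description, not another specialist.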

The pattern has tiers, each progressively more capable but each independently shippable:

Tier 1 covers most cases. Tiers 2-4 are progressive enhancements driven by what real engagements need.

Current state. Not built. Architecturally compatible with existing pieces — Phase 34 shipped external-service polling for specialists, Phase 38 shipped declare-and-register grammar, OVA credential storage exists, FORAY attestation surface is live. The trigger to build is the second engagement that requires live external data on the hot path: Credit Management didn't need it (its operational data is engine-internal), but FarmGuard (weather, sensor data) and ExpenseDesk (FX rates, vendor lookup) both will, and Goosey image generation is a render-side analogue of the same pattern. When two of those land in close succession, the gateway specialist becomes the right shared substrate to build first rather than building per-engagement.

Trajectory. Originally framed as "ship 10–15 first-party specialist adapters for high-leverage services" — weather, geo, FX, holidays, identity verification, etc. Reframed by the Operator on 2026-05-09 to "build one general-purpose API gateway with FORAY/OVA wrapping; let the catalog of API definitions live as engagement Memory." The reframing is load-bearing because it converts a build-direction (write more specialists per service) into a build-once-then-configure pattern (write the gateway, configure per-engagement). It also surfaces a sharper observation: external API specifications are themselves a class of supplied domain knowledge — the public-apis catalog is structured external knowledge that becomes more valuable to Loomworks when held as Memory than when retrieved ad-hoc. This composes with the supplied-domain thesis from the methodology engagement: the Operator (or DUNIN7 as a default seed) supplies the catalog of "what external services exist and how to call them"; the gateway acts on it. The reframing also addresses the constraint-creep concern raised in the earlier MCP discussion — instead of N specialists or N MCP servers proliferating, one configurable gateway scales.

8.2 — Public-apis catalog as reference universe

Trigger. Any conversation about which external APIs Loomworks should support; about the universe of free/public APIs available to draw from; about github.com/public-apis as a resource; about discovery of external services for new engagements; about populating the gateway specialist's known endpoints; or about external information seams in general.

Substance. The public-apis repository (github.com/public-apis/public-apis) is a community-curated catalog of free public APIs across roughly 50 categories: weather, finance and FX, geocoding and geo data, holidays, government data, news, sports, identity verification, image services, vehicles, animals, anime, blockchain, and more. Each entry follows a standardized format: name, description, authentication requirement (none / apiKey / OAuth), HTTPS support, CORS support, optional Postman collection link. The catalog is actively maintained, sponsored by APILayer, and contains more than 1,500 entries.
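
Because entries follow a standardized format, ingestion could start from a small row parser. The sketch below assumes each entry is a Markdown table row of the form `| [Name](link) | Description | Auth | HTTPS | CORS |`; that layout, the CatalogEntry type, and the example rows in the usage note are assumptions for illustration, and the optional Postman column is ignored.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class CatalogEntry:
    name: str
    link: str
    description: str
    auth: str      # "none", "apiKey", "OAuth", ...
    https: bool
    cors: str      # kept as text because the value may be "unknown"


def parse_row(row):
    """Parse one assumed-format table row into a CatalogEntry."""
    cells = [cell.strip() for cell in row.strip().strip("|").split("|")]
    if len(cells) < 5:
        raise ValueError("expected at least five cells")
    match = re.fullmatch(r"\[(.+?)\]\((.+?)\)", cells[0])
    name, link = (match.group(1), match.group(2)) if match else (cells[0], "")
    auth = cells[2].strip("`")
    if auth.lower() in ("", "no", "none"):
        auth = "none"
    return CatalogEntry(
        name=name,
        link=link,
        description=cells[1],
        auth=auth,
        https=cells[3].lower() == "yes",
        cors=cells[4].lower(),
    )
```

For example, a hypothetical row such as "| [Example Weather](https://api.example.com/docs) | Hypothetical forecast service | `apiKey` | Yes | Unknown |" would parse into an entry with auth "apiKey" and https True. A description containing a literal pipe character would break this simple split, so a real ingester would need a proper Markdown table parser.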

For Loomworks, the catalog is the reference universe for the general-purpose API gateway specialist (8.1). Three ways it composes:

Current state. Not ingested. The catalog is a resource to draw on; ingestion is gated on the gateway specialist (8.1) being built. The catalog should be treated as a living external resource — community-maintained, will continue to evolve — not as a fixed snapshot. Ingestion patterns should accommodate periodic refresh.

Trajectory. Originally surfaced as "this might be useful" by the Operator in a chat session. Sharpened in the same session into the gateway specialist pattern (8.1) once the recognition landed that one wrapper handles the class. The catalog without the gateway is just a list; the gateway without the catalog still works (engagements can declare any endpoint); the two compose into a coherent external-information-seam architecture in which the catalog supplies the universe and the gateway supplies the discipline.


What's NOT in this file

A few things worth being explicit about, so this file's scope is clear:

This file is for: queued directions that will be reached for someday and shouldn't be lost in the meantime; deferred work whose preconditions haven't been met; and explicit trajectory preservation for cases (like 4.3, 5.1/5.2, 6.1/6.2, 7.1/7.2, and 8.1) where the path that got us here matters as much as the destination.


Changelog

v0.1 (2026-05-04). Initial consolidation. Sections 1–4. Memory archive consolidation supersedes prior memory edits #1, #18, #19, #20, #21, #22, #23 (and absorbs the obsolete #16).

v0.2 (2026-05-07). Adds Section 5 (Agent capability development):

v0.3 (2026-05-09). Adds Section 6 (Mobile presence and quick-capture engagement pair):

v0.4 (2026-05-09). Adds Section 7 (Multi-instance fleet and jurisdiction governance) absorbing the substance of two memory edits (multi-instance fleet via DevOps engagements; protocol triangle for jurisdiction governance) into a discoverable section paired with their source documents. Two entries:

v0.5 (2026-05-09). Adds Section 8 (External information seams) capturing two entries from a chat session in which the github.com/public-apis catalog was flagged as a future-useful resource and the Operator's follow-up reframe sharpened it into a load-bearing architectural move. Two entries:


DUNIN7 — Done In Seven LLC — Miami, Florida Loomworks — Queued Future Directions and Deferred Work — v0.5 — 2026-05-09