Voice listening silence-submit change-request

Version. 0.2
Date. 2026-05-20
Author. Claude.ai (this conversation)
Working machine. The Mac Mini, with Claude Code as the executor
Repository. DUNIN7/loomworks-engine (substrate, small) and DUNIN7/loomworks (frontend, primary)
Substrate baseline. loomworks-engine main at tag voice-provenance-v0_1 (assumed; the voice-provenance ship lands first).
Frontend baseline. loomworks branch phase-60-operator-layer-upload-pathway-v1 at tag voice-provenance-v0_1 (assumed).
Grounded in.

The voice listening infrastructure at voice-listening-v0_1 (engagement-navigation) and in-engagement-voice-listening-v0_1 (in-engagement).
The voice-listening-timeout-reset follow-on filed during the engagement-list scoping ship — names this gap: Web Speech API auto-times-out and the panel handles it as a cancel rather than as a clean stop.
The Companion-tunable-settings-as-Companion-said-not-menu-clicked methodology candidate from today.

Plain-language summary

This change-request adds a silence-detection submit path to the voice listening hook, in addition to the existing explicit "say 'send' to submit" path.

Today, voice listening on the in-engagement surface (and, in parallel, on the engagement-navigation surface) requires the Operator to say "send" at the end of an utterance to commit the transcript. Without "send", the Web Speech API's built-in silence timeout fires and the transcript is discarded (handled as a cancel).

This works well for deliberate dictation — long thoughts, multiple sentences, pauses for thinking, then explicit close with "send". It fails for short conversational utterances like "Companion, where did I park?" where the Operator's natural speech rhythm has them stop talking when they're done. Today they have to remember to say "send" or they lose the transcript.

After this ships:

The voice listening hook detects silence after the Operator stops talking. If silence persists past a threshold (default 2.5 seconds, Companion-tunable), the transcript commits — same effect as if the Operator had said "send".
The "send" verb still works as before — explicit override, useful for the dictation rhythm where the Operator may pause within their utterance.
The Operator can adjust the silence threshold via the Companion: "wait longer before sending", "wait shorter before sending", "reset wait time", or a specific numeric value such as "wait three seconds before sending".
A new setting key voice.listening.silence_submit_seconds lives in person_settings. Default 2.5. Special value 0 disables silence-submit (revert to explicit-send-only behavior — for Operators who prefer the explicit-only model).
The listening panel's footer text updates to reflect the active mode. With silence-submit active: "say 'send' or pause to submit · esc to cancel". With silence-submit disabled: "say 'send' to submit · esc to cancel" (current text).

What you (the Operator) see after this ships.

Open an engagement. Click the microphone. Say "Rico, where did I park?" (or "Companion, where did I park?" if you haven't renamed your Companion, or just "where did I park?" without addressing it at all) — stop talking. After ~2.5 seconds of silence, the panel closes, the transcript commits as your turn, the Companion replies. No need to say "send".

The voice infrastructure doesn't listen for the Companion's name — it captures whatever you say and routes it as a turn. Whether you address the Companion explicitly (by default name "Companion", or by whatever custom name you've given it like "Rico"), or just speak the question directly without an addressee, the transcript flows the same way.

For a longer dictation, say a paragraph, pause mid-thought (1 second is fine), continue, pause again, finish. Say "send" to explicitly close — that submits immediately, no waiting for the silence threshold. The "send" override still works.

For Operators who prefer the explicit-only model, say "don't wait, only send on 'send'" to the Companion — the silence_submit_seconds setting goes to 0, behavior reverts to today's.

For shorter or longer thresholds, say "wait three seconds before sending" (or any value in seconds between 0.5 and 10) — the setting tunes accordingly.

What this change-request does not do.

Does not change the explicit "send" verb behavior. That path stays exactly as it works today.
Does not change the Cancel button or Escape key behavior. Cancel still discards the transcript.
Does not introduce a separate "pause" indicator to the listening panel UI (no visual countdown of "submitting in 2.5 seconds…"). The panel just closes cleanly when silence-submit fires. The Operator who wants to know when submission will happen can use "send" explicitly.
Does not affect the engagement-navigation voice listening path's navigational recognizer behavior. The silence-submit path applies to both surfaces, but the routing of the resulting transcript still differs per surface (engagement-navigation routes through the recognizer; in-engagement routes as a conversation turn).
Does not add a per-engagement silence threshold. The setting is per-Operator only.

Estimated effort. Two checkpoints — Section A small substrate work (new setting key + tune_setting vocabulary extension), Section C primary frontend work (extend the useVoiceListening hook with silence detection + wire the threshold + update panel footer text).

Methodology inheritance. Companion-tunable-settings-as-Companion-said-not-menu-clicked — the silence threshold is adjusted by speaking to the Companion. Loomworks-pattern-over-imported-convention — voice as a thinking-surface interaction supports both deliberate-with-explicit-close and conversational-with-natural-close rhythms; the choice isn't imported from a specific assistant convention. Thinking-surface-not-messaging-surface — voice on a thinking surface needs to handle both rhythms because thinking spans them.

Step 0 — Codebase inspection block

Step 0.1 — Pre-flight


ls -la /Users/dunin7/loomworks-engine
ls -la /Users/dunin7/loomworks
cd /Users/dunin7/loomworks-engine && git status
cd /Users/dunin7/loomworks && git status
git -C /Users/dunin7/loomworks-engine log --oneline -3
git -C /Users/dunin7/loomworks log --oneline -3

Both working trees clean. Substrate at voice-provenance-v0_1 or later. Frontend at the same.

Step 0.2 — Existing voice listening hook shape

Confirm the current shape of the voice listening hook:


cd /Users/dunin7/loomworks
head -80 src/app/operator/engagement-navigation/useVoiceListening.ts

Report:

How the hook currently detects "send" in the transcript.
How it currently handles the Web Speech API's onend / onerror events.
Whether it already has interim-transcript handling (each partial transcript update fires a callback or updates state).
Whether it has any internal timer mechanism today.

The silence-submit work hooks into the interim-transcript-update path — each interim transcript event resets a silence timer; when the timer fires without further interim events, silence-submit triggers.

Step 0.3 — tune_setting handler current shape

Confirm the handler's executor dispatch and existing setting keys:


cd /Users/dunin7/loomworks-engine
grep -n "_execute_enum_setting\|_execute_numeric_setting\|_execute_stepped_int_setting\|KNOWN_SETTINGS" src/loomworks/orchestration/tune_setting.py src/loomworks/preferences/person_settings.py | head -20

The handler currently has three executors:

_execute_enum_setting (for conversation.message_order)
_execute_numeric_setting (for voice.listening.blur_intensity)
_execute_stepped_int_setting (introduced and then retired in the FORAY-audit ship; the executor may or may not still exist in code — Claude Code verifies)

The new setting voice.listening.silence_submit_seconds is a numeric float (similar shape to blur_intensity but with different bounds and a meaningful 0-value). Either reuse _execute_numeric_setting with appropriate spec parameters, or add a small dispatch branch. Claude Code's call at execution.

Step 0.4 — VoiceListeningPanel footer text

Confirm where the listening panel's footer text is rendered:


cd /Users/dunin7/loomworks
grep -n "say.*send.*submit\|esc to cancel" src/app/operator/engagement-navigation/VoiceListeningPanel.tsx

The footer text is a hardcoded string today. This ship makes it conditional on whether silence_submit_seconds > 0.

Step 0.5 — Halt-or-proceed

If any check surfaces drift (especially around the _execute_stepped_int_setting retire — if it was retired, this ship may need it re-introduced for numeric float settings with floor/ceiling like silence_submit_seconds), halt and report. If everything matches, proceed to Section A.

Section A — Substrate

A.1 — New setting key

KNOWN_SETTINGS in person_settings.py gains voice.listening.silence_submit_seconds:

Type: float
Default: 2.5
Bounds: [0.0, 10.0], with 0 meaning disabled. Active values run from 0.5 (minimum, to avoid trigger-happy submission) to 10.0 (maximum, to avoid effectively-never); values strictly between 0 and 0.5 are out of range.
Validator: _validate_silence_submit_seconds enforces the bounds and the float type; rejects bool, NaN, negative, out-of-range.

A.2 — tune_setting handler vocabulary

_LABEL_TO_KEY gains:

"silence", "silence submit", "wait time", "wait", "pause", "pause time" → voice.listening.silence_submit_seconds

Direction vocabulary on this setting:

"wait longer", "longer wait", "longer pause", "longer" → step up by 0.5
"wait shorter", "shorter wait", "shorter pause", "shorter", "faster" → step down by 0.5
"don't wait", "no wait", "disable silence", "only on send", "send-only" → set to 0
"reset wait time", "default wait", "reset pause" → set to default (2.5)
Numeric phrasings like "wait three seconds" / "wait 1.5 seconds" / "pause four seconds" → parsed numeric value, accepted only if it is within bounds (out-of-bounds values are rejected, not clamped)
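
The named direction tokens above could resolve to a new value roughly as follows. This is a sketch only: `apply_direction` and the constant names are hypothetical, the real dispatch lives in tune_setting.py, and snapping a step-down result below 0.5 to 0 (disabled) is an assumption about how the floor and the disabled value interact.

```python
from typing import Optional

DEFAULT_SECONDS = 2.5
STEP_SECONDS = 0.5
MIN_ACTIVE_SECONDS = 0.5
MAX_SECONDS = 10.0

# Token sets copied from the direction vocabulary in this section.
_STEP_UP = {"wait longer", "longer wait", "longer pause", "longer"}
_STEP_DOWN = {"wait shorter", "shorter wait", "shorter pause", "shorter", "faster"}
_DISABLE = {"don't wait", "no wait", "disable silence", "only on send", "send-only"}
_RESET = {"reset wait time", "default wait", "reset pause"}


def apply_direction(current: float, direction: str) -> Optional[float]:
    """Return the new threshold for a named token, or None if unrecognised."""
    token = direction.strip().lower()
    if token in _DISABLE:
        return 0.0
    if token in _RESET:
        return DEFAULT_SECONDS
    if token in _STEP_UP:
        # Clamp at the ceiling; stepping up from 0 (disabled) lands on 0.5.
        return min(MAX_SECONDS, round(current + STEP_SECONDS, 2))
    if token in _STEP_DOWN:
        stepped = round(current - STEP_SECONDS, 2)
        # Assumption: anything below the smallest active value becomes 0.
        return stepped if stepped >= MIN_ACTIVE_SECONDS else 0.0
    return None
```

Under these assumptions, "wait shorter" at 0.5 disables silence-submit instead of producing a sub-half-second threshold, and "wait longer" at 10 is a no-op.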

The numeric-parsing branch is the new shape — prior settings (blur intensity, message order, banner duration) didn't accept arbitrary numeric values from the Operator; they used named direction tokens. Voice silence threshold benefits from explicit numeric input because Operators often have a specific feel for what duration works for them. The parser accepts patterns like "N seconds", "N second", "N.N seconds", "N", "N.N" — extracted from the operation_data direction string. If parsing fails or value is out of bounds, fall back to no-change with a Companion-spoken explanation.
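
One way the numeric-parsing branch might be written, as a hedged sketch. `parse_wait_seconds`, the spoken-number table, and the convention of returning None for "no change" are all assumptions; the caller would be responsible for the Companion-spoken explanation. Spoken words are handled because the example phrasing "wait three seconds" has to yield 3.0.

```python
import re
from typing import Optional

MIN_ACTIVE_SECONDS = 0.5
MAX_SECONDS = 10.0

# Hypothetical spoken-number table; extend as needed.
_NUMBER_WORDS = {
    "one": 1, "two": 2, "three": 3, "four": 4, "five": 5,
    "six": 6, "seven": 7, "eight": 8, "nine": 9, "ten": 10,
}
_DIGITS = re.compile(r"-?\d+(?:\.\d+)?")
_WORDS = re.compile(r"\b(" + "|".join(_NUMBER_WORDS) + r")\b")


def parse_wait_seconds(direction: str) -> Optional[float]:
    """Extract a threshold from the direction string.

    Returns None when no number is found or the number is out of bounds,
    which the caller treats as "no change".
    """
    text = direction.strip().lower()
    digits = _DIGITS.search(text)
    if digits:
        value = float(digits.group())
    else:
        word = _WORDS.search(text)
        if word is None:
            return None  # e.g. "wait banana seconds"
        value = float(_NUMBER_WORDS[word.group(1)])
    # 0 is the special "disabled" value; otherwise enforce the active range.
    if value == 0.0 or MIN_ACTIVE_SECONDS <= value <= MAX_SECONDS:
        return value
    return None  # e.g. "wait 100 seconds"
```

A production version would probably distinguish "could not parse" from "out of bounds" so the Companion can explain each case differently; this sketch collapses both into None for brevity.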

A.3 — _value_in_plain_words extension

The plain-words renderer handles the new key:

0 → "disabled — voice only submits when you say 'send'"
From 0.5 up to (but not including) 10 → "{N} seconds of silence before submitting" (1 second special-cased to singular)
10 → "10 seconds of silence before submitting" (maximum)
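
The rendering rules above can be sketched as a small helper. The function name `silence_submit_in_plain_words` is hypothetical (the real branch would sit inside `_value_in_plain_words`), and formatting whole numbers without a trailing ".0" is an assumption; the output strings are the ones listed in this section.

```python
def silence_submit_in_plain_words(seconds: float) -> str:
    """Render a silence_submit_seconds value the way the Companion would say it."""
    if seconds == 0:
        return "disabled — voice only submits when you say 'send'"
    # The "g" format drops a trailing ".0", so 3.0 renders as "3" and 2.5 as "2.5".
    amount = f"{seconds:g}"
    unit = "second" if seconds == 1 else "seconds"
    return f"{amount} {unit} of silence before submitting"
```

For example, `silence_submit_in_plain_words(2.5)` yields "2.5 seconds of silence before submitting", and a value of 1 takes the singular form.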

A.4 — Tests

Per-write tests:

"wait longer" steps up by 0.5 (clamped at 10)
"wait shorter" steps down by 0.5 (clamped at 0; doesn't go negative)
"don't wait" sets to 0
"wait three seconds" parses and sets to 3.0
"wait 1.5 seconds" parses and sets to 1.5
"reset wait time" sets to default 2.5
Out-of-bounds numeric ("wait 100 seconds") rejected with Companion explanation
Invalid numeric ("wait banana seconds") rejected with Companion explanation
Setting persists across handler invocations
FORAY audit row written for every successful change

Approximately ten to fifteen new pytest tests.

A.5 — Intent classifier taxonomy update

intent_classifier.md gains:

New taxonomy entry under tune_setting for silence-submit phrasings
Example phrasings: "wait longer before sending", "don't wait, only send on send", "wait 3 seconds", "shorter pause"
Note that numeric values are extractable from the utterance

A.6 — Responder template tightening

tune_setting.md adds the new setting's deterministic message templates (per the existing pattern):

voice.listening.silence_submit_seconds:
When tuning from N → M: "Silence-submit set to {M} seconds." (or "disabled" for 0)
On reset: "Silence-submit back to default — 2.5 seconds."

Section B — Companion handler

No new Companion handler. The existing tune_setting handler's vocabulary extends in Section A. Section deliberately empty.

Section C — Frontend

C.1 — useVoiceListening hook silence detection

The hook extends with a silence-detection mechanism:

On hook mount, read voice.listening.silence_submit_seconds from person_settings (via the existing fetchMeSetting adapter). Hold the value in a ref so the timer logic can read it without re-running effects on every state change.
On each interim transcript update event from the Web Speech API, clear any existing silence timer and start a new one with the configured duration.
When the timer fires without further interim events, the hook signals a clean stop — same effect as the Operator saying "send".
If silence_submit_seconds === 0 (disabled), no timer is started; behavior reverts to today's explicit-send-only model.
The "send" verb path is unchanged — explicit detection of "send" in the final transcript still triggers immediate close.
The Web Speech API's native onend event (which fires on its own internal timeout) is handled per current logic — if no transcript was collected, it's a cancel; if transcript exists but neither "send" nor silence-submit fired (edge case where the API closes faster than the configured silence threshold), the existing behavior is preserved.
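
The timer behaviour in the list above is essentially a debounce. The sketch below models only that state logic, with explicit timestamps instead of real timers, and is written in Python purely to make the transitions easy to check; it is not the hook's TypeScript, and the class and method names are hypothetical.

```python
from typing import Optional


class SilenceSubmitModel:
    """Deterministic model of the silence timer: callers pass the clock in."""

    def __init__(self, threshold_seconds: float) -> None:
        self.threshold = threshold_seconds
        self.deadline: Optional[float] = None  # None means no timer is running
        self.submitted = False

    def on_interim(self, now: float) -> None:
        """Each interim transcript event clears and restarts the timer."""
        if self.submitted or self.threshold == 0:
            return  # threshold 0: silence-submit disabled, never start a timer
        self.deadline = now + self.threshold

    def on_send(self) -> None:
        """The explicit "send" verb submits immediately, whatever the timer state."""
        self.deadline = None
        self.submitted = True

    def tick(self, now: float) -> bool:
        """Return True once, when the deadline passes with no further interim events."""
        if self.deadline is not None and now >= self.deadline:
            self.deadline = None
            self.submitted = True
            return True
        return False
```

Because the deadline is set only in `on_interim`, the threshold is measured from the last interim event and not from listen-open, which matches the short-utterance edge case in the test list.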

C.2 — Setting re-read on listen open

Each time the Operator opens the listening panel (microphone click → start listening), the hook re-reads the silence_submit_seconds setting. This matches the blur intensity pattern — the setting is read fresh on each open so Companion-spoken changes take effect on the next listening session.

C.3 — Panel footer text

VoiceListeningPanel.tsx's footer text becomes conditional:

When silence_submit_seconds > 0: "say 'send' or pause to submit · esc to cancel"
When silence_submit_seconds === 0: "say 'send' to submit · esc to cancel" (current)

The footer reads the current setting (passed as a prop from the hook) and renders accordingly.

C.4 — No visible countdown

The panel does not show a visible countdown of "submitting in 2.5s…". The silence-submit fires when the threshold elapses; the panel just closes. Operators who want explicit control use "send"; those who let silence resolve get a clean implicit close.

(Rationale: a visible countdown would clutter the panel and might invite the Operator to keep talking just to "reset the timer" mid-thought. The submit happens or it doesn't; if the Operator wants explicit control, "send" is always available.)

C.5 — Tests

Vitest unit tests cover:

Silence-submit fires after configured duration of no interim events.
Silence-submit timer resets on each interim event.
"send" verb still triggers immediate close regardless of silence-timer state.
silence_submit_seconds === 0 disables silence-submit entirely; no timer starts.
Setting re-read on each listening session (mock the fetchMeSetting and verify it's called on listen-open).
Panel footer text updates based on current setting.
Edge case: very short utterance, immediate silence — fires after threshold from the last interim event, not from listen-open.
Edge case: Web Speech API ends before silence threshold (browser internal timeout). Existing logic preserved.

Approximately ten to fifteen new vitest tests.

Section D — Verification

Checkpoint A — substrate setting key + handler extension.


cd /Users/dunin7/loomworks-engine
pytest tests/test_voice_tune_setting* tests/test_voice_silence* -v
pytest

New tests pass; full suite shows no regressions beyond the two pre-existing baseline failures.

Halt for Operator review.

Checkpoint C — frontend hook + panel.


cd /Users/dunin7/loomworks
npm test
npm run build

All existing vitest tests plus the new tests pass. Build succeeds.

Smoke test. Operator opens an engagement. Confirms:

Default silence-submit behavior. Click the microphone, say "Companion, where did I park?" (or "Rico, where did I park?" if you've renamed the Companion, or just "where did I park?" without addressing it), then stop talking. After about 2.5 seconds of silence, the panel closes, the transcript appears as a turn, and the Companion replies. No need to say "send". The behavior is the same whether or not the utterance addresses the Companion by name.

Explicit "send" still works. Click the microphone, say a longer dictation with deliberate pauses, say "send" to close. Panel closes immediately on "send"; transcript routes; turn appears with voice glyph (from prior voice-provenance ship).

Tune via Companion — disable. Say "don't wait, only send on send" to the Companion. Click microphone, speak, stop talking — wait 5+ seconds. Confirm the panel does NOT auto-close; only Cancel/Escape/explicit-"send" closes it.

Tune via Companion — adjust. Say "wait five seconds before sending". Click microphone, speak, stop talking — confirm the silence-submit fires at ~5 seconds, not ~2.5.

Reset. Say "reset wait time". Back to default 2.5 seconds.

Audit ledger captures. Run:

```
psql -h localhost -U playground -d playground_dev -c "
  SELECT timestamp,
         payload->>'setting_key' AS key,
         payload->>'previous_value' AS prev,
         payload->>'new_value' AS new
  FROM audit.foray_events
  WHERE payload->>'setting_key' LIKE 'voice.listening.silence%'
  ORDER BY timestamp DESC
  LIMIT 5;"
```

Press q to exit pager. Recent silence-threshold changes should appear as audit rows.

Halt for Operator visual confirmation.

Section E — Acceptance criteria

The voice listening hook detects silence after the Operator stops talking and submits the transcript when the configured threshold elapses.
The "send" verb path remains unchanged — explicit submission still works immediately on "send".
The Operator can adjust the silence threshold via Companion utterances (longer / shorter / specific seconds / disable / reset).
Setting voice.listening.silence_submit_seconds persists in person_settings; default 2.5, special value 0 disables silence-submit.
The listening panel's footer text reflects the active mode.
Each setting change writes a FORAY audit row.
All existing tests continue to pass; new tests cover the silence-submit mechanism and the threshold tuning.
The Operator confirms by eye that a natural-ended utterance (whether addressed to the Companion by default name "Companion", by custom name like "Rico", or with no addressee at all) submits naturally without saying "send".

Section F — Out of scope

Visible countdown of pending submission.
Per-engagement silence threshold.
Different thresholds per surface (engagement-navigation vs in-engagement).
Cancel-on-no-speech behavior change (the Web Speech API still cancels if no transcript at all is collected; this ship doesn't touch that path).
Voice activity detection beyond interim events. The hook relies on the Web Speech API's interim-event firing as the proxy for "voice is happening." More sophisticated VAD is out of scope.
Mobile platform-specific silence behavior. Desktop only at v0.1.

Section G — Methodology inheritance

G.1 — Companion-tunable-settings-as-Companion-said-not-menu-clicked

Honored. The silence threshold is adjusted by speaking to the Companion ("wait longer", "wait 3 seconds", etc.). No menu surface is added.

G.2 — Loomworks-pattern-over-imported-convention

Honored. Voice as a thinking-surface interaction supports both deliberate (explicit-send) and conversational (natural-silence) rhythms. The choice isn't imported from a specific assistant convention.

G.3 — Thinking-surface-not-messaging-surface

Honored. Voice on a thinking surface needs to handle both rhythms because thinking spans them — quick conversational asks ("Companion, where did I park?") and longer deliberate compositions both deserve clean submission paths.

G.4 — Operator-tunable defaults with sensible starting point

Honored. The default 2.5 seconds is a reasonable starting point that most Operators will accept; those who prefer different timing or the explicit-only model can tune via Companion.

G.5 — FORAY-audit-for-narrative-events

Honored. Changes to the new setting write FORAY audit rows via the existing _audit_setting_change mechanism. This adds another narrative event type to the substrate's audit ledger.

Operator approval


Approved for execution.
- Marvin Percival
- [timestamp]

Once approval is recorded, Claude Code executes end-to-end, halting at Checkpoints A and C.

Document trailer

DUNIN7 — Done In Seven LLC — Miami, Florida
Voice listening silence-submit change-request — v0.2 — 2026-05-20
Grounded in the voice-listening-timeout-reset follow-on filed during the engagement-list scoping ship and the Companion-tunable-settings methodology candidate
Builds on voice-provenance-v0_1 (assumed prior ship)
Requires Operator approval before Claude Code begins execution.