Switches the voice proxy to the OpenAI audio API (/v1/audio/transcriptions and
/v1/audio/speech) so it works with OpenAI, Groq, or a local server. Adds a
Settings -> Voice tab (base URL, API key, models, voice) plus a Quick Settings
toggle, and removes the bundled Python sidecar.
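
A minimal sketch of how the proxy might build a text-to-speech request for any OpenAI-compatible server. The `VoiceSettings` shape and `buildSpeechRequest` are hypothetical names for illustration, not the code in this change. The sketch assumes the configured base URL already ends in `/v1`, as it does for OpenAI and Groq.

```typescript
// Hypothetical settings shape; field names are assumptions, not the app's real config.
interface VoiceSettings {
  baseUrl: string; // assumed to include the /v1 prefix, e.g. "https://api.openai.com/v1"
  apiKey?: string; // may be empty for a local server
  ttsModel: string;
  voice: string;
}

interface SpeechRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request for POST {baseUrl}/audio/speech without sending it.
function buildSpeechRequest(settings: VoiceSettings, text: string): SpeechRequest {
  // Strip trailing slashes so the joined URL has exactly one separator.
  const base = settings.baseUrl.replace(/\/+$/, "");
  const headers: Record<string, string> = { "Content-Type": "application/json" };
  // Only send a bearer token when a key is configured.
  if (settings.apiKey) {
    headers["Authorization"] = `Bearer ${settings.apiKey}`;
  }
  return {
    url: `${base}/audio/speech`,
    method: "POST",
    headers,
    body: JSON.stringify({ model: settings.ttsModel, input: text, voice: settings.voice }),
  };
}
```

Keeping the builder free of network calls lets the same settings object target a hosted provider or a local server by changing only the base URL.
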
Review fixes: stop mic tracks on unmount, clear the global TTS stop handler and
revoke leaked blob URLs, add fetch timeouts in the proxy, surface mic errors in
the button, trim before appending transcripts, and drop the repo-wide wav ignore.
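
As an illustration of the fetch-timeout fix, one common approach is to abort the request with an `AbortController` after a deadline. The helper name `fetchWithTimeout`, the `FetchLike` type, and the timeout value are assumptions; the actual proxy code may differ.

```typescript
// Minimal fetch-shaped function type so the helper can be exercised with a stub.
type FetchLike<T> = (url: string, init: { signal: AbortSignal }) => Promise<T>;

// Hypothetical helper: aborts the request if it has not settled within timeoutMs.
async function fetchWithTimeout<T>(
  doFetch: FetchLike<T>,
  url: string,
  timeoutMs: number,
): Promise<T> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    return await doFetch(url, { signal: controller.signal });
  } finally {
    // Always clear the timer so a fast response does not leave it pending.
    clearTimeout(timer);
  }
}
```

In the proxy, the global `fetch` could be passed as `doFetch`. A stalled backend then surfaces as a rejected promise instead of a hung request.
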
Users need a visible upload path in the explorer itself, not just drag-and-drop
with no progress feedback. Routing picker and drop uploads
through one XHR-backed hook keeps progress, validation, refresh, and success
counts consistent for every upload source.
The 200 MB limit is mirrored in the client, multer, and the nginx template so
oversized uploads fail the same way no matter which layer sees the request
first. The server also returns explicit requested and uploaded counts
so partial or multi-file batches can render accurate status text.
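
A rough sketch of the client side of this contract: a shared size constant, a pre-upload check, and status text derived from the server's counts. The names `MAX_UPLOAD_BYTES`, `oversizedFiles`, `formatUploadStatus`, the `requested`/`uploaded` field names, and the message wording are all assumptions. So is the reading of 200 MB as 200 * 1024 * 1024 bytes.

```typescript
// Assumption: the limit is 200 MiB; the real constant may use decimal megabytes.
const MAX_UPLOAD_BYTES = 200 * 1024 * 1024;

// Hypothetical response shape carrying the counts described above.
interface UploadCounts {
  requested: number;
  uploaded: number;
}

// Returns the names of files that exceed the limit, so the client can reject
// them before the request ever reaches multer or nginx.
function oversizedFiles(
  files: { name: string; size: number }[],
  limit: number = MAX_UPLOAD_BYTES,
): string[] {
  return files.filter((f) => f.size > limit).map((f) => f.name);
}

// Turns the server's counts into status text for full, partial, and failed batches.
function formatUploadStatus({ requested, uploaded }: UploadCounts): string {
  if (uploaded === 0) return `Upload failed (0 of ${requested} files)`;
  if (uploaded < requested) return `Uploaded ${uploaded} of ${requested} files`;
  return uploaded === 1 ? "Uploaded 1 file" : `Uploaded ${uploaded} files`;
}
```
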
Adds a push-to-talk mic button in the composer and a read-aloud button on
assistant messages. Both are opt-in and hidden unless a voice backend is
configured via VOICE_SIDECAR_URL.
The auth-gated /api/voice proxy forwards to a configurable backend exposing
/transcribe and /tts (provider-agnostic); the frontend probes /api/voice/health
and hides the controls when disabled. Adds i18n keys and docs/voice.md.
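
To make the opt-in behavior concrete, here is a small sketch of how the proxy could map its routes onto the configured backend and report whether voice is enabled. `resolveVoiceTarget`, `healthPayload`, and the `{ enabled }` response shape are hypothetical; only `VOICE_SIDECAR_URL`, `/transcribe`, `/tts`, and `/api/voice/health` come from the description above.

```typescript
type VoiceRoute = "transcribe" | "tts";

// Returns the backend URL for a proxied route, or null when no backend is
// configured (i.e. VOICE_SIDECAR_URL is unset or blank).
function resolveVoiceTarget(sidecarUrl: string | undefined, route: VoiceRoute): string | null {
  const base = sidecarUrl?.trim();
  if (!base) return null; // voice is disabled
  // Strip trailing slashes so the joined URL has exactly one separator.
  return `${base.replace(/\/+$/, "")}/${route}`;
}

// Assumed body for GET /api/voice/health; the frontend would hide the mic and
// read-aloud buttons when enabled is false.
function healthPayload(sidecarUrl: string | undefined): { enabled: boolean } {
  return { enabled: Boolean(sidecarUrl?.trim()) };
}
```
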
Includes a local, no-API-key reference backend in voice-sidecar/ (faster-whisper
for STT, Kokoro-82M for TTS, both CPU-capable).
Users deploying behind a reverse proxy need a config they can adapt.
The template documents each proxy block and centralizes upstream/subpath values.
It also notes that Nginx location matchers still require literal subpath edits.