mirror of https://github.com/siteboon/claudecodeui.git synced 2026-06-27 06:05:54 +08:00


Haile 591e8e7642 fix: voice tts format settings (#919)

* feat(voice): add optional speech-to-text input and read-aloud TTS

Adds a push-to-talk mic button in the composer and a read-aloud button on
assistant messages. Both are opt-in and hidden unless a voice backend is
configured via VOICE_SIDECAR_URL.

The auth-gated /api/voice proxy forwards to a configurable backend exposing
/transcribe and /tts (provider-agnostic); the frontend probes /api/voice/health
and hides the controls when disabled. Adds i18n keys and docs/voice.md.

Includes a local, no-API-key reference backend in voice-sidecar/ (faster-whisper
for STT, Kokoro-82M for TTS, both CPU-capable).

* refactor(voice): provider-agnostic backend and in-app config

Switches the voice proxy to the OpenAI audio API (/v1/audio/transcriptions and
/v1/audio/speech) so it works with OpenAI, Groq, or a local server. Adds a
Settings -> Voice tab (base URL, API key, models, voice) plus a Quick Settings
toggle, and removes the bundled Python sidecar.

Review fixes: stop mic tracks on unmount, clear the global TTS stop handler and
revoke leaked blob URLs, add fetch timeouts in the proxy, surface mic errors in
the button, trim before appending transcripts, and drop the repo-wide wav ignore.

* fix(voice): relax backend timeout and surface timeout errors

Bumps the proxy timeout to 5 minutes (VOICE_TIMEOUT_MS) since local TTS can
synthesize long messages at roughly real-time, and returns a clear timed-out
message (504) instead of failing silently. The read-aloud button now shows
backend errors.

* fix(voice): play read-aloud through an app-level player to stop cutoffs

Read-aloud now runs in a single module-level player outside the React tree instead
of per-message component state. Switching chats or re-rendering a message no longer
revokes the blob URL mid-play (the 'Invalid URI' cutoff). Adds content-keyed caching so
re-listening doesn't regenerate, and reuses one audio element (also unlocks iOS once).

* fix(voice): address review (SSRF guard, auth mapping, client timeout)

Validates the user-supplied backend URL (http/https only, blocks the link-local
metadata range) to prevent SSRF; remaps upstream 401/403 so a bad voice API key
isn't read as the app's own auth failing; adds a client-side AbortController timeout
on the read-aloud request so the button can't sit in loading if a request stalls.
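
The URL check described above can be sketched as follows. This is a minimal illustration, not the project's actual code: the function name `isAllowedBackendUrl` is hypothetical, and it assumes the "link-local metadata range" means IPv4 169.254.0.0/16. It only inspects the literal host, so it does not cover DNS names that resolve to that range or redirects.

```typescript
// Hypothetical helper; the name and exact rules are assumptions for illustration.
function isAllowedBackendUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw);
  } catch {
    return false; // not parseable as an absolute URL
  }
  // Only plain web schemes are accepted.
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }
  // Reject literal hosts in the IPv4 link-local range 169.254.0.0/16,
  // where cloud metadata endpoints are commonly exposed.
  if (/^169\.254\.\d{1,3}\.\d{1,3}$/.test(url.hostname)) {
    return false;
  }
  return true;
}
```

A caller would run this on the user-supplied base URL before any request is made and reject the setting when it returns `false`.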

* docs(voice): provider-agnostic wording and jsdoc on proxy functions

drop leftover sidecar/faster-whisper references now that the backend is any
openai-compatible voice api, and add jsdoc to the voice-proxy functions so the
docstring coverage check passes.

* fix(voice): harden timeout parsing, tts input check, and player abort

- fall back to the default when VOICE_TIMEOUT_MS is non-numeric or <= 0, so a
bad override can't make the abort fire immediately
- type-check the tts `text` before calling .trim() so a non-string body returns
400 instead of throwing
- abort the in-flight TTS fetch on stop() and on a superseding play, so tapping
read-aloud repeatedly doesn't leave orphaned requests generating audio
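
The first two points can be sketched as below. The function names are hypothetical, the 5-minute default is taken from the earlier timeout fix, and rejecting an empty string with 400 is an assumption (the commit only mentions non-string bodies).

```typescript
// Default taken from the earlier "relax backend timeout" fix: 5 minutes.
const DEFAULT_VOICE_TIMEOUT_MS = 5 * 60 * 1000;

// Hypothetical helper: parse the VOICE_TIMEOUT_MS override, falling back to
// the default when the value is missing, non-numeric, or <= 0.
function parseTimeoutMs(raw: string | undefined): number {
  const parsed = Number(raw);
  return Number.isFinite(parsed) && parsed > 0 ? parsed : DEFAULT_VOICE_TIMEOUT_MS;
}

type TtsTextResult =
  | { ok: true; text: string }
  | { ok: false; status: 400; error: string };

// Hypothetical helper: type-check `text` before calling .trim(), so a
// non-string body yields a 400 result instead of throwing.
function validateTtsText(text: unknown): TtsTextResult {
  if (typeof text !== "string" || text.trim() === "") {
    return { ok: false, status: 400, error: "text must be a non-empty string" };
  }
  return { ok: true, text: text.trim() };
}
```

In a request handler, `parseTimeoutMs` would be fed the environment value once at startup, and `validateTtsText` would be fed the `text` field of the request body.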

* feat(voice): send transcript with the main send button while recording

while dictating, the main send button stops recording, transcribes, and sends
in one tap, matching the codex-style flow. the mic button still stops and drops
the transcript into the input box to edit before sending. voice recording state
is lifted into the composer so both buttons share it, and the send button is
enabled (not grayed) while recording. also fix a pre-existing type error: the
quick-settings preferences map was missing voiceEnabled.

* fix(voice): make stop() idempotent so a double tap can't throw

guard on the recorder's own state instead of react state, so a double tap or
the mic and send buttons both firing won't call stop() on an already-inactive
MediaRecorder.
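
A minimal sketch of that guard, assuming a hypothetical `safeStop` helper. `RecorderLike` is a stand-in type covering only the `state` and `stop()` members of the browser's `MediaRecorder`, so the logic can be shown without a DOM.

```typescript
// Stand-in for the subset of MediaRecorder used here (hypothetical type name).
interface RecorderLike {
  state: "inactive" | "recording" | "paused";
  stop(): void;
}

// Stops the recorder only if it is still active. Returns true when stop()
// was actually called, false when there was nothing to stop.
function safeStop(recorder: RecorderLike | null): boolean {
  if (!recorder || recorder.state === "inactive") {
    return false; // already stopped, or never started
  }
  recorder.stop();
  return true;
}
```

Because the check reads the recorder's own `state` rather than React state, a second call arriving before the next render is a no-op.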

* fix(voice): expose TTS format in user settings

* fix(voice): harden recording and backend behavior

Redirects could bypass the backend URL guard, and TTS playback waited for full buffering.

Recording could overlap or finish after teardown. Controls also ignored backend readiness.

Explicit formats and config-aware cache keys prevent stale audio after settings change.
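
One way to picture a config-aware cache key is shown below. The `TtsConfig` fields and the `ttsCacheKey` name are assumptions for illustration; the actual key layout is not given in the commit.

```typescript
// Hypothetical shape of the voice settings that affect generated audio.
interface TtsConfig {
  baseUrl: string;
  model: string;
  voice: string;
  format: string;
}

// Builds a cache key from the message text plus every setting that changes
// the audio, so changing a setting never returns audio cached under the old one.
function ttsCacheKey(text: string, cfg: TtsConfig): string {
  // JSON-encoding an array keeps field boundaries unambiguous.
  return JSON.stringify([cfg.baseUrl, cfg.model, cfg.voice, cfg.format, text]);
}
```

With a key like this, switching the format from `mp3` to `wav` produces a different key, so the player regenerates the audio instead of replaying the old blob.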

* fix(voice): validate config and request boundaries

Malformed stored settings could break voice requests instead of using safe defaults.

Health results could outlive auth changes. URL checks also did not guard the fetch sink.

Remove constant recorder branches so lifecycle cancellation stays clear.

* fix(voice): separate client and server backends

User-selected backend URLs must remain usable without letting clients control server requests.

Call custom providers from the browser while keeping the server proxy bound to its configured host.

This restores voice controls for frontend settings without reopening the SSRF path.

* fix: hide voice options until enabled

---------

Co-authored-by: newsbubbles <nathaniel.gibson@gmail.com>
Co-authored-by: Simos Mikelatos <simosmik@gmail.com>

2026-06-26 16:06:40 +02:00

.github

chore: add docker sandbox action

2026-04-21 16:41:40 +00:00

.husky

Feat/improve husky git hook (#517)

2026-03-10 17:44:11 +01:00

docker

fix: add latest tag to docker npx command and change the detach mode to work without spawn

2026-04-14 17:37:20 +00:00

docs

feat: add file tree upload progress

2026-06-08 14:52:09 +03:00

plugins

feat(refactor): move plugins to typescript (#557)

2026-03-18 16:44:07 +03:00

public

chore: remove unused modelConstants from the project

2026-06-09 20:51:19 +00:00

redirect-package

chore: remove unused modelConstants from the project

2026-06-09 20:51:19 +00:00

scripts

fix(macos): fix node-pty posix_spawnp error with postinstall script (#347)

2026-02-18 12:21:00 +01:00

server

fix: voice tts format settings (#919)

2026-06-26 16:06:40 +02:00

shared

feat: add opencode support (#762)

2026-05-28 10:50:41 +02:00

src

fix: voice tts format settings (#919)

2026-06-26 16:06:40 +02:00

.env.example

Improve dev host handling and clarify backend port configuration (#532)

2026-03-16 12:40:01 +01:00

.gitignore

feat(i18n): add French (fr) locale (#878)

2026-06-15 15:02:50 +03:00

.gitmodules

feat: new plugin system (#489)

2026-03-09 13:00:52 +03:00

.npmignore

adding npmignore

2025-12-30 17:49:30 +00:00

.nvmrc

Refactor/app content main content and chat interface (#374)

2026-02-13 20:26:47 +01:00

.release-it.json

chore: changing package name to @cloudcli-ai/cloudcli

2026-04-03 15:37:49 +00:00

CHANGELOG.md

chore(release): v1.34.0

2026-06-09 20:34:48 +00:00

commitlint.config.js

Refactor/shared and tasks components (#473)

2026-03-05 23:47:58 +01:00

CONTRIBUTING.md

chore: relicense to AGPL-3.0-or-later

2026-03-29 00:57:09 +00:00

eslint.config.js

Surface provider skills in the slash command menu (#759)

2026-05-12 21:33:12 +03:00

index.html

Update index.html with manifest crossorigin

2026-03-05 23:19:42 +01:00

LICENSE

chore: relicense to AGPL-3.0-or-later

2026-03-29 00:57:09 +00:00

NOTICE

chore: relicense to AGPL-3.0-or-later

2026-03-29 00:57:09 +00:00

package-lock.json

Add browser use as MCP to providers (#889)

2026-06-17 22:06:17 +02:00

package.json

Add browser use as MCP to providers (#889)

2026-06-17 22:06:17 +02:00

postcss.config.js

- Upgrading to Vite 7

2025-07-11 10:29:36 +00:00

README.de.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.ja.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.ko.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.ru.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.tr.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.zh-CN.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

README.zh-TW.md

chore: add github issues board plugin

2026-06-11 14:00:41 +03:00

release.sh

modified: .gitignore

2025-10-31 09:45:35 +01:00

tailwind.config.js

refactor: add primitives, plan mode display, and new session model selector

2026-04-20 12:47:55 +00:00

tsconfig.json

Feature/backend ts support and unification of auth settings on frontend (#654)

2026-04-15 13:26:12 +02:00

vite.config.js

fix(vite): proxy /plugin-ws WebSocket requests to the backend in dev (#757)

2026-06-04 20:57:24 +03:00

README.md

Cloud CLI (aka Claude Code UI)

A desktop and mobile UI for Claude Code, Cursor CLI, Codex, and Gemini-CLI.
Use it locally or remotely to view your active projects and sessions from anywhere.

CloudCLI Cloud · Documentation · Discord · Bug Reports · Contributing

English · Русский · Deutsch · 한국어 · 简体中文 · 繁體中文 · 日本語 · Türkçe

Screenshots

Desktop View - Main interface showing project overview and chat
Mobile Experience - Responsive mobile design with touch navigation
CLI Selection - Select between Claude Code, Gemini, Cursor CLI and Codex

Features

Responsive Design - Works across desktop, tablet, and mobile, so you can also use Agents from your phone
Interactive Chat Interface - Built-in chat interface for communicating with the Agents
Integrated Shell Terminal - Direct access to each Agent's CLI through the built-in shell
File Explorer - Interactive file tree with syntax highlighting and live editing
Git Explorer - View, stage and commit your changes. You can also switch branches
Session Management - Resume conversations, manage multiple sessions, and track history
Plugin System - Extend CloudCLI with custom plugins — add new tabs, backend services, and integrations. Build your own →
TaskMaster AI Integration (Optional) - Advanced project management with AI-powered task planning, PRD parsing, and workflow automation
Model Compatibility - Works with Claude, GPT, and Gemini model families (the full list of supported models is available at runtime via GET /api/providers/:provider/models)
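
As a rough sketch of querying the model list endpoint mentioned above: the base URL `http://localhost:3001` comes from the Quick Start, while the provider name, the bearer-token header, and the response shape are assumptions, so the result is left typed as `unknown`.

```typescript
// Builds the URL for GET /api/providers/:provider/models (hypothetical helper).
function modelsUrl(baseUrl: string, provider: string): string {
  const path = `/api/providers/${encodeURIComponent(provider)}/models`;
  return new URL(path, baseUrl).toString();
}

// Fetches the model list. How the API authenticates is not stated in the
// README; the optional bearer token here is an assumption.
async function listModels(baseUrl: string, provider: string, token?: string): Promise<unknown> {
  const headers: Record<string, string> = token ? { Authorization: `Bearer ${token}` } : {};
  const res = await fetch(modelsUrl(baseUrl, provider), { headers });
  if (!res.ok) {
    throw new Error(`model list request failed: ${res.status}`);
  }
  return res.json(); // response shape is not documented here
}
```

For example, `listModels("http://localhost:3001", "claude")` would request the models for a provider named `claude`, assuming that is a valid provider identifier on your instance.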

Quick Start

CloudCLI Cloud (Recommended)

The fastest way to get started — no local setup required. Get a fully managed, containerized development environment accessible from the web, mobile app, API, or your favorite IDE.

Get started with CloudCLI Cloud

Self-Hosted (Open source)

npm

Try CloudCLI UI instantly with npx (requires Node.js v22+):

npx @cloudcli-ai/cloudcli

Or install globally for regular use:

npm install -g @cloudcli-ai/cloudcli
cloudcli

Open http://localhost:3001 — all your existing sessions are discovered automatically.

Visit the documentation for full configuration options, PM2, remote server setup, and more.

Docker Sandboxes (Experimental)

Run agents in isolated sandboxes with hypervisor-level isolation. Starts Claude Code by default. Requires the sbx CLI.

npx @cloudcli-ai/cloudcli@latest sandbox ~/my-project

Supports Claude Code, Codex, and Gemini CLI. See the sandbox docs for setup and advanced options.

Which option is right for you?

CloudCLI UI is the open source UI layer that powers CloudCLI Cloud. You can self-host it on your own machine, run it in a Docker sandbox for isolation, or use CloudCLI Cloud for a fully managed environment.

	Self-Hosted (npm)	Self-Hosted (Docker Sandbox) (Experimental)	CloudCLI Cloud
Best for	Local agent sessions on your own machine	Isolated agents with web/mobile IDE	Teams who want agents in the cloud
How you access it	Browser via `[yourip]:port`	Browser via `localhost:port`	Browser, any IDE, REST API, n8n
Setup	`npx @cloudcli-ai/cloudcli`	`npx @cloudcli-ai/cloudcli@latest sandbox ~/project`	No setup required
Isolation	Runs on your host	Hypervisor-level sandbox (microVM)	Full cloud isolation
Machine needs to stay on	Yes	Yes	No
Mobile access	Any browser on your network	Any browser on your network	Any device, native app coming
Agents supported	Claude Code, Cursor CLI, Codex, Gemini CLI	Claude Code, Codex, Gemini CLI	Claude Code, Cursor CLI, Codex, Gemini CLI
File explorer and Git	Yes	Yes	Yes
MCP configuration	Synced with `~/.claude`	Managed via UI	Managed via UI
REST API	Yes	Yes	Yes
Team sharing	No	No	Yes
Platform cost	Free, open source	Free, open source	Starts at $7/month

All options use your own AI subscriptions (Claude, Cursor, etc.) — CloudCLI provides the environment, not the AI.

Security & Tools Configuration

🔒 Important Notice: All Claude Code tools are disabled by default. This prevents potentially harmful operations from running automatically.

Enabling Tools

To use Claude Code's full functionality, you'll need to manually enable tools:

Open Tools Settings - Click the gear icon in the sidebar
Enable Selectively - Turn on only the tools you need
Apply Settings - Your preferences are saved locally

Tools Settings interface - enable only what you need

Recommended approach: Start with basic tools enabled and add more as needed. You can always adjust these settings later.

Plugins

CloudCLI has a plugin system that lets you add custom tabs with their own frontend UI and optional Node.js backend. Install plugins from git repos directly in Settings > Plugins, or build your own.

Available Plugins

Plugin	Description
Project Stats	Shows file counts, lines of code, file-type breakdown, largest files, and recently modified files for your current project
Web Terminal	Full xterm.js terminal with multi-tab support
Claude Watch	Watches long-running Claude Code sessions for hangs and exposes process controls
CloudCLI Scheduler	Create workspace-scoped scheduled prompts and execute them through a local CLI such as Codex, Claude Code, or Gemini CLI
PRISM CloudCLI	Session intelligence for Claude Code inside CloudCLI, including token burn visibility
Sessions	View, manage, and kill active Claude Code sessions
Token Cost Calculator	Calculate API costs from model prices and token usage, with preset model pricing support
Task Queue	Task queue dashboard to view, filter, and launch agent tasks
GitHub Issues Board	Kanban board for GitHub Issues with bidirectional TaskMaster sync and /github-task CLI skill auto-install

Build Your Own

Plugin Starter Template - Fork this repo to create your own plugin. It includes a working example with frontend rendering, live context updates, and RPC communication to a backend server.

Plugin Documentation - Full guide to the plugin API, manifest format, security model, and more.

FAQ

How is this different from Claude Code Remote Control?

Claude Code Remote Control lets you send messages to a session already running in your local terminal. Your machine has to stay on, your terminal has to stay open, and sessions time out after roughly 10 minutes without a network connection.

CloudCLI UI and CloudCLI Cloud extend Claude Code rather than sit alongside it — your MCP servers, permissions, settings, and sessions are the exact same ones Claude Code uses natively. Nothing is duplicated or managed separately.

Here's what that means in practice:

All your sessions, not just one — CloudCLI UI auto-discovers every session from your ~/.claude folder. Remote Control only exposes the single active session to make it available in the Claude mobile app.
Your settings are your settings — MCP servers, tool permissions, and project config you change in CloudCLI UI are written directly to your Claude Code config and take effect immediately, and vice versa.
Works with more agents — Claude Code, Cursor CLI, Codex, and Gemini CLI, not just Claude Code.
Full UI, not just a chat window — file explorer, Git integration, MCP management, and a shell terminal are all built in.
CloudCLI Cloud runs in the cloud — close your laptop, the agent keeps running. No terminal to babysit, no machine to keep awake.

Do I need to pay for an AI subscription separately?

Yes. CloudCLI provides the environment, not the AI. You bring your own Claude, Cursor, Codex, or Gemini subscription. CloudCLI Cloud starts at $7/month for the hosted environment on top of that.

Can I use CloudCLI UI on my phone?

Yes. For self-hosted, run the server on your machine and open [yourip]:port in any browser on your network. For CloudCLI Cloud, open it from any device — no VPN, no port forwarding, no setup. A native app is also in the works.

Will changes I make in the UI affect my local Claude Code setup?

Yes, for self-hosted. CloudCLI UI reads from and writes to the same ~/.claude config that Claude Code uses natively. MCP servers you add via the UI show up in Claude Code immediately and vice versa.

Community & Support

Documentation — installation, configuration, features, and troubleshooting
Discord — get help and connect with other users
GitHub Issues — bug reports and feature requests
Contributing Guide — how to contribute to the project

License

GNU Affero General Public License v3.0 or later (AGPL-3.0-or-later) — see LICENSE for the full text, including additional terms under Section 7.

This project is open source and free to use, modify, and distribute under the AGPL-3.0-or-later license. If you modify this software and run it as a network service, you must make your modified source code available to users of that service.

CloudCLI UI (https://cloudcli.ai)

Acknowledgments

Built With

Claude Code - Anthropic's official CLI
Cursor CLI - Cursor's official CLI
Codex - OpenAI Codex
Gemini-CLI - Google Gemini CLI
React - User interface library
Vite - Fast build tool and dev server
Tailwind CSS - Utility-first CSS framework
CodeMirror - Advanced code editor
TaskMaster AI (Optional) - AI-powered project management and task planning
