271 Commits

Author SHA1 Message Date
Simos Mikelatos
a0899a252e Merge branch 'main' into electron-app 2026-06-26 16:09:19 +02:00
Haile
591e8e7642 fix: voice tts format settings (#919)
* feat(voice): add optional speech-to-text input and read-aloud TTS

Adds a push-to-talk mic button in the composer and a read-aloud button on
assistant messages. Both are opt-in and hidden unless a voice backend is
configured via VOICE_SIDECAR_URL.

The auth-gated /api/voice proxy forwards to a configurable backend exposing
/transcribe and /tts (provider-agnostic); the frontend probes /api/voice/health
and hides the controls when disabled. Adds i18n keys and docs/voice.md.

Includes a local, no-API-key reference backend in voice-sidecar/ (faster-whisper
for STT, Kokoro-82M for TTS, both CPU-capable).

* refactor(voice): provider-agnostic backend and in-app config

Switches the voice proxy to the OpenAI audio API (/v1/audio/transcriptions and
/v1/audio/speech) so it works with OpenAI, Groq, or a local server. Adds a
Settings -> Voice tab (base URL, API key, models, voice) plus a Quick Settings
toggle, and removes the bundled Python sidecar.

Review fixes: stop mic tracks on unmount, clear the global TTS stop handler and
revoke leaked blob URLs, add fetch timeouts in the proxy, surface mic errors in
the button, trim before appending transcripts, and drop the repo-wide wav ignore.

* fix(voice): relax backend timeout and surface timeout errors

Bumps the proxy timeout to 5 minutes (VOICE_TIMEOUT_MS) since local TTS can
synthesize long messages at roughly real-time, and returns a clear timed-out
message (504) instead of failing silently. The read-aloud button now shows
backend errors.

* fix(voice): play read-aloud through an app-level player to stop cutoffs

Read-aloud now runs in a single module-level player outside the React tree instead
of per-message component state. Switching chats or re-rendering a message no longer
revokes the blob URL mid-play (the 'Invalid URI' cutoff). Adds content-keyed caching so
re-listening doesn't regenerate, and reuses one audio element (also unlocks iOS once).

* fix(voice): address review (SSRF guard, auth mapping, client timeout)

Validates the user-supplied backend URL (http/https only, blocks the link-local
metadata range) to prevent SSRF; remaps upstream 401/403 so a bad voice API key
isn't read as the app's own auth failing; adds a client-side AbortController timeout
on the read-aloud request so the button can't sit in loading if a request stalls.
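
A minimal sketch of the URL check described above, assuming a hypothetical helper named `isAllowedBackendUrl` (the name and the exact rules are illustrative, not taken from the project). It only covers the two rules the message names: http/https schemes, and the IPv4 link-local range 169.254.0.0/16 that contains the cloud metadata address. It does not resolve DNS names, so a hostname that points at a link-local address would still pass.

```typescript
// Hypothetical helper: name and rules are assumptions for illustration.
function isAllowedBackendUrl(raw: string): boolean {
  let url: URL;
  try {
    url = new URL(raw); // throws on anything that is not an absolute URL
  } catch {
    return false;
  }

  // Only plain web schemes are accepted (no file:, ftp:, data:, ...).
  if (url.protocol !== "http:" && url.protocol !== "https:") {
    return false;
  }

  // Block the IPv4 link-local range 169.254.0.0/16, which includes the
  // 169.254.169.254 metadata endpoint. The URL parser has already
  // normalized numeric host forms to dotted-decimal at this point.
  if (/^169\.254\.\d{1,3}\.\d{1,3}$/.test(url.hostname)) {
    return false;
  }

  return true;
}
```

A later commit in this same entry notes that redirects could bypass a guard like this one, so a check on the initial URL alone is probably not sufficient.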

* docs(voice): provider-agnostic wording and jsdoc on proxy functions

drop leftover sidecar/faster-whisper references now that the backend is any
openai-compatible voice api, and add jsdoc to the voice-proxy functions so the
docstring coverage check passes.

* fix(voice): harden timeout parsing, tts input check, and player abort

- fall back to the default when VOICE_TIMEOUT_MS is non-numeric or <= 0, so a
  bad override can't make the abort fire immediately
- type-check the tts `text` before calling .trim() so a non-string body returns
  400 instead of throwing
- abort the in-flight TTS fetch on stop() and on a superseding play, so tapping
  read-aloud repeatedly doesn't leave orphaned requests generating audio
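
The first two bullets can be sketched roughly as follows. `parseTimeoutMs` and `readTtsText` are hypothetical names, and rejecting an empty string after trimming is an added assumption; the message itself only says a non-string body should yield a 400.

```typescript
// 5 minutes, matching the default described in the earlier timeout commit.
const DEFAULT_VOICE_TIMEOUT_MS = 5 * 60 * 1000;

// Hypothetical helper: falls back to the default unless the override is a
// finite number greater than zero.
function parseTimeoutMs(
  raw: string | undefined,
  fallback: number = DEFAULT_VOICE_TIMEOUT_MS,
): number {
  const parsed = Number(raw); // undefined and "abc" become NaN, "" becomes 0
  return Number.isFinite(parsed) && parsed > 0 ? parsed : fallback;
}

// Hypothetical helper: returns the trimmed text, or null when the caller
// should answer with a 400. The type check runs before .trim() is called.
function readTtsText(body: unknown): string | null {
  if (typeof body !== "object" || body === null) return null;
  const text = (body as { text?: unknown }).text;
  if (typeof text !== "string") return null;
  const trimmed = text.trim();
  return trimmed.length > 0 ? trimmed : null; // empty-string rule is an assumption
}
```

In a request handler, a `null` result from `readTtsText` would map to a 400 response instead of an uncaught exception.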

* feat(voice): send transcript with the main send button while recording

while dictating, the main send button stops recording, transcribes, and sends
in one tap, matching the codex-style flow. the mic button still stops and drops
the transcript into the input box to edit before sending. voice recording state
is lifted into the composer so both buttons share it, and the send button is
enabled (not grayed) while recording. also fix a pre-existing type error: the
quick-settings preferences map was missing voiceEnabled.

* fix(voice): make stop() idempotent so a double tap can't throw

guard on the recorder's own state instead of react state, so a double tap or
the mic and send buttons both firing won't call stop() on an already-inactive
MediaRecorder.

* fix(voice): expose TTS format in user settings

* fix(voice): harden recording and backend behavior

Redirects could bypass the backend URL guard, and TTS playback waited for full buffering.

Recording could overlap or finish after teardown. Controls also ignored backend readiness.

Explicit formats and config-aware cache keys prevent stale audio after settings change.

* fix(voice): validate config and request boundaries

Malformed stored settings could break voice requests instead of using safe defaults.

Health results could outlive auth changes. URL checks also did not guard the fetch sink.

Remove constant recorder branches so lifecycle cancellation stays clear.

* fix(voice): separate client and server backends

User-selected backend URLs must remain usable without letting clients control server requests.

Call custom providers from the browser while keeping the server proxy bound to its configured host.

This restores voice controls for frontend settings without reopening the SSRF path.

* fix: hide voice options until enabled

---------

Co-authored-by: newsbubbles <nathaniel.gibson@gmail.com>
Co-authored-by: Simos Mikelatos <simosmik@gmail.com>
2026-06-26 16:06:40 +02:00
Simos Mikelatos
63f3c3941d feat: add desktop notifications and skills updates 2026-06-26 10:25:47 +00:00
Simos Mikelatos
e6c6f89dda Merge branch 'main' into electron-app 2026-06-26 10:02:48 +02:00
Haile
4a503b1dc8 fix(shell): prioritize user npm binaries (#913)
Interactive shells could resolve bundled or system CLIs before user-installed npm binaries.

Move existing user npm global directories to the front of PATH while preserving all other entries.
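
One way to read this reordering, as a sketch: `prioritizePathEntries` is a hypothetical function, and it interprets "existing" as "already present in PATH" (the actual change may instead check which directories exist on disk). Entries that are not preferred keep their relative order.

```typescript
// Hypothetical helper: moves preferred directories that already appear in
// PATH to the front, leaving every other entry in its original order.
function prioritizePathEntries(
  pathValue: string,
  preferred: string[],
  delimiter: string = ":",
): string {
  const entries = pathValue.split(delimiter);
  const preferredSet = new Set(preferred);

  const front = entries.filter((entry) => preferredSet.has(entry));
  const rest = entries.filter((entry) => !preferredSet.has(entry));

  return [...front, ...rest].join(delimiter);
}
```

Because preferred directories that are absent from PATH are never added, the result always contains exactly the original entries.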

Co-authored-by: Simos Mikelatos <simosmik@gmail.com>
2026-06-24 20:15:52 +02:00
Koya Kikuchi
f6326c8082 feat(version): warn when the server was updated but not restarted (#898)
When the package is updated on disk but the long-lived server process is
not restarted, the new frontend bundle (served from disk) talks to the
old running backend. New DB-backed features then fail silently — e.g.
deleting/archiving a session appears to do nothing — because the new
schema/routes only take effect on restart.

Nothing currently detects this skew: useVersionCheck only compares the
frontend's build-time version against the latest GitHub release.

This exposes the running server's version (captured once at startup) via
/health, compares it to the frontend's build-time version in
useVersionCheck, and shows a "restart required" banner in the sidebar
(and a small indicator in the collapsed sidebar) when they differ.

- server: add `version` (RUNNING_VERSION, read once at startup) to /health
- useVersionCheck: return `restartRequired` / `runningVersion`
- SidebarFooter / SidebarCollapsed: surface a restart-required banner
- i18n: add `version.restartRequired` to all 10 sidebar locales
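
The comparison itself might look like the sketch below. `checkRestartRequired` is a hypothetical function; the field names `restartRequired` and `runningVersion` come from the message. Treating a /health response without a `version` field as "unknown" instead of "restart required" is an assumption.

```typescript
interface VersionStatus {
  runningVersion: string | null;
  restartRequired: boolean;
}

// Hypothetical helper: compares the frontend's build-time version with the
// version reported by the running server's /health payload.
function checkRestartRequired(
  buildVersion: string,
  health: { version?: unknown },
): VersionStatus {
  const runningVersion =
    typeof health.version === "string" ? health.version : null;

  return {
    runningVersion,
    // Assumption: a missing version is treated as unknown, not as skew.
    restartRequired: runningVersion !== null && runningVersion !== buildVersion,
  };
}
```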

Verified with `tsc --noEmit` (client + server) and eslint.

Co-authored-by: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Co-authored-by: Simos Mikelatos <simosmik@gmail.com>
2026-06-22 22:49:57 +02:00
Haile
c5fe127958 feat(skills): add provider skill management (#909)
* feat(skills): add provider skill management

Users need one settings surface to discover and install skills without manually navigating provider-specific directories.

Add provider-backed global skill installation for Claude, Codex, Gemini, and Cursor, while keeping OpenCode read-only because it reuses other providers' skill locations.

Add a responsive Skills settings tab with scoped discovery, search, refresh controls, markdown and folder uploads, upload feedback, and overflow-safe layouts.

Validate bundled skill files and paths before writing them, preserve scripts and assets, and cover provider discovery and installation behavior with tests.

* fix(skills): preserve uploaded skill folders

Folder drops discarded supporting scripts and assets.

Keep relative paths and upload every file from the selected skill folder.

Use the selected folder name for installation and cover it in provider tests.

* fix(skills): restrict standalone skill uploads

Only show Markdown files when selecting standalone skills.

Normalize browser file paths so SKILL.md is not mistaken for a folder named dot.

* fix(skills): validate installs before writing

Preserve bundled files and normalize fallback names across skill installation paths.

Validate complete batches before writing and reject existing targets to avoid partial installs.

Keep project metadata and make folder selection tolerant of casing and cancelled dialogs.

* fix(skills): overwrite existing installations

Replace an existing skill directory instead of rejecting a duplicate installation.

Remove stale supporting files so the installed directory exactly matches the new upload.
2026-06-22 22:45:27 +02:00
Simos Mikelatos
1c05fe0905 fix: stabilize cloud computer use mcp 2026-06-19 20:47:53 +00:00
Simos Mikelatos
077baee5f2 fix: authenticate desktop agent websocket 2026-06-19 15:52:49 +00:00
Simos Mikelatos
9f8cee8919 fix: restore macos semantic helper cast 2026-06-19 15:05:47 +00:00
Simos Mikelatos
bb323fc566 fix: respect cloud computer use setting 2026-06-19 15:02:07 +00:00
Simos Mikelatos
5ef40be2d3 fix: macos release 2026-06-19 14:46:58 +00:00
Simos Mikelatos
cf4b28273e fix: compile macos semantic helper 2026-06-19 14:22:47 +00:00
Simos Mikelatos
f4c68942a5 fix: repair desktop launcher local view 2026-06-19 14:20:23 +00:00
Simos Mikelatos
4d70a2588c feat: improve Computer Use linking status 2026-06-19 13:47:16 +00:00
Simos Mikelatos
53c3c4c27a Fix long-running desktop resource leaks 2026-06-19 13:07:08 +00:00
Simos Mikelatos
278fe4f7b1 Fix semantic review issues and release action runtime 2026-06-19 12:46:40 +00:00
Simos Mikelatos
d7f4d4c342 Fix desktop release review findings 2026-06-19 12:29:46 +00:00
Simos Mikelatos
d1930fecdb fix: build semantic helpers on macos and windows 2026-06-19 12:17:32 +00:00
Simos Mikelatos
1726705459 feat: add CloudCLI computer use semantics, desktop helper packaging, and permission onboarding 2026-06-19 12:09:55 +00:00
Simos Mikelatos
a35200f340 Harden computer use MCP handling 2026-06-19 08:06:26 +00:00
Simos Mikelatos
9f24f80f33 Fix computer use session error status 2026-06-19 07:47:56 +00:00
Simos Mikelatos
2af3d38afe Harden desktop workflows and computer use handling 2026-06-19 06:21:13 +00:00
Simos Mikelatos
531833bc87 Merge branch 'main' into electron-app 2026-06-19 08:19:36 +02:00
Simos Mikelatos
f75ae385dd Add on-demand desktop server bundle 2026-06-18 21:08:29 +00:00
Karel Bourgois
a12ca8eed3 fix(claude-sync): skip subagent transcripts to prevent main session corruption (#854)
The session indexer scans ~/.claude/projects recursively via
findFilesRecursivelyCreatedAfter, which descends into per-session
subagents/ directories. Claude writes subagent transcripts at:

  ~/.claude/projects/<encoded-cwd>/<session-id>/subagents/agent-<id>.jsonl

These files repeat the parent session's sessionId. When indexed as
standalone sessions they upsert over the parent row and overwrite its
jsonl_path with the subagent path, corrupting the main session record
(the sidebar then points at, and renders, the subagent transcript).

Add a single isSubagentTranscript() guard (path segment named
"subagents") and apply it in both the recursive scan and the
single-file watcher path.
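
A minimal sketch of such a guard, assuming it only needs to look for a path segment named exactly "subagents". The function name comes from the message; the body is an illustration, not the project's implementation.

```typescript
// Returns true when any path segment is exactly "subagents".
// Splits on both separators so Windows-style paths are handled too.
function isSubagentTranscript(filePath: string): boolean {
  return filePath.split(/[\\/]+/).includes("subagents");
}
```

Matching whole segments avoids false positives on directories that merely contain the word, such as a project folder named `my-subagents-notes`.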

Co-authored-by: Haile <118998054+blackmammoth@users.noreply.github.com>
2026-06-18 15:37:37 +03:00
Simos Mikelatos
bf50d29c20 Merge remote-tracking branch 'origin/browser-use' into electron-app
# Conflicts:
#	src/i18n/locales/en/settings.json
2026-06-17 20:17:38 +00:00
Simos Mikelatos
e88539170e Add browser use as MCP to providers (#889) 2026-06-17 22:06:17 +02:00
Simos Mikelatos
ffc0cd7501 Improve Browser settings load and managed MCP display 2026-06-17 20:04:44 +00:00
Simos Mikelatos
59194d1502 Refine Browser naming and managed MCP UX
- Rename Browser Use surfaces to Browser
- Register Browser MCP under the new server name
- Mark CloudCLI-managed MCP servers read-only
- Adjust MCP stdio framing and sidebar footer sizing
2026-06-17 19:18:23 +00:00
Simos Mikelatos
7e6028b113 feat: add desktop computer use runtime 2026-06-17 19:01:15 +00:00
Simos Mikelatos
9881e5e366 feat(browser-use): improve mobile monitoring ux 2026-06-17 18:19:12 +00:00
Simos Mikelatos
086df034b4 feat(browser-use): simplify agent session monitoring 2026-06-17 17:04:11 +00:00
Simos Mikelatos
fc71fc7d2b Merge branch 'pr889-fixes' into electron-app
# Conflicts:
#	server/index.js
2026-06-17 15:45:07 +00:00
Simos Mikelatos
a0d56429a7 fix browser use 2026-06-17 15:43:21 +00:00
Simos Mikelatos
6af4afe6f2 Merge branch 'main' into browser-use 2026-06-16 19:02:36 +02:00
Haileyesus
d7a38a567a chore: move tests to appropriate folder 2026-06-16 17:54:48 +03:00
Haileyesus
c6c153e7f2 chore: move tests to appropriate folder 2026-06-16 17:47:52 +03:00
Simos Mikelatos
7aeca52669 Merge branch 'browser-use' into electron-app
# Conflicts:
#	src/components/main-content/view/subcomponents/MainContentTabSwitcher.tsx
2026-06-16 06:51:35 +00:00
Simos Mikelatos
9438a365f2 feat: improve browser use session controls 2026-06-15 21:14:10 +00:00
Simos Mikelatos
0426522406 feat: expose browser use to agents via MCP 2026-06-15 19:47:58 +00:00
Simos Mikelatos
6e7e2ff4c1 feat: make browser use opt-in 2026-06-15 18:12:27 +00:00
Simos Mikelatos
e6263dbd1f refactor: store browser use settings in database 2026-06-15 17:57:00 +00:00
Simos Mikelatos
260070bae0 feat: add browser use runtime setup settings 2026-06-15 17:52:27 +00:00
Simos Mikelatos
861cfecbaa feat: add electron app support 2026-06-15 16:21:05 +00:00
Simos Mikelatos
828d1a2302 Merge remote-tracking branch 'origin/feat/unify-websocket-2' into browser-use-independent 2026-06-15 16:12:10 +00:00
Haileyesus
9fb2d91b26 fix: resolve session provider on backend reads
Session history and token usage reads already have a stable app session id.
Passing provider and project hints from the frontend kept those reads coupled
with provider-specific state that the backend can resolve from the session row.

Resolve token usage provider server-side and narrow the session store read API
to session id plus pagination. This keeps provider-specific storage decisions
behind the backend boundary and makes reconnect, pagination, and load-all use
the same session-owned contract.
2026-06-15 14:04:50 +03:00
Haileyesus
d0adddbbda fix: normalize project session payloads
The sidebar had to understand cursorSessions, codexSessions,
and other provider buckets because /api/projects exposed
provider-shaped arrays.

That leaked backend adapter storage into project state and made
frontend behavior drift each time a provider needed another bucket
or exception.

Return one sessions list with provider metadata instead. Project
state, search, and running-session filtering now share one contract,
while provider-specific storage remains behind the backend boundary.
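
To illustrate the shape change, here is a sketch that flattens provider buckets into one list. The bucket names `cursorSessions` and `codexSessions` come from the message; `normalizeSessions`, the `BUCKET_PROVIDERS` map, and the session fields are hypothetical.

```typescript
interface RawSession {
  id: string;
}

interface Session extends RawSession {
  provider: string;
}

// Hypothetical mapping from the old bucket names to provider labels.
const BUCKET_PROVIDERS: Record<string, string> = {
  cursorSessions: "cursor",
  codexSessions: "codex",
};

// Merges provider-shaped arrays into a single list where each session
// carries its provider as metadata.
function normalizeSessions(
  project: Record<string, RawSession[] | undefined>,
): Session[] {
  const sessions: Session[] = [];
  for (const [bucket, provider] of Object.entries(BUCKET_PROVIDERS)) {
    for (const raw of project[bucket] ?? []) {
      sessions.push({ ...raw, provider });
    }
  }
  return sessions;
}
```

With one list, consumers such as search or running-session filtering can filter on `provider` instead of knowing each bucket name.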
2026-06-15 13:43:18 +03:00
Simos Mikelatos
243e6cecd5 Add browser use workspace panel 2026-06-14 20:34:16 +00:00
Haileyesus
5b9adbbdee fix(opencode): bind watcher sessions to app rows early 2026-06-12 23:22:11 +03:00