Session history and token usage reads already have a stable app session id.
Passing provider and project hints from the frontend kept those reads coupled
with provider-specific state that the backend can resolve from the session row.
Resolve token usage provider server-side and narrow the session store read API
to session id plus pagination. This keeps provider-specific storage decisions
behind the backend boundary and makes reconnect, pagination, and load-all use
the same session-owned contract.
The sidebar had to understand cursorSessions, codexSessions,
and other provider buckets because /api/projects exposed
provider-shaped arrays.
That leaked backend adapter storage into project state and made
frontend behavior drift each time a provider needed another bucket
or exception.
Return one sessions list with provider metadata instead. Project
state, search, and running-session filtering now share one contract,
while provider-specific storage remains behind the backend boundary.
The remote environment could start OpenCode runs under /opt/claudecodeui.
That happened even when the selected project path was correct.
The integration relied on child-process cwd alone.
OpenCode run resolves its workspace through the explicit --dir contract.
Pass --dir with the resolved working directory.
Assert in the CLI test that launch args include the workspace dir.
The sidebar could keep a provider-native id after backend remapping.
That left a duplicate non-working session visible until refresh.
Fresh sessions could also appear hours old.
SQLite CURRENT_TIMESTAMP is UTC without a timezone suffix.
Browser parsing then treated those values like local time.
Broadcast a canonical session_upserted event when the provider id is mapped.
Collapse provider-id aliases onto the stable app session id in the client.
Normalize session-row timestamps to ISO UTC when reading from the repository.
Add a running-session view to the sidebar, including header controls, running counts, empty states, and row-level processing indicators so active provider work is visible outside the current chat.
Hydrate running state after refresh through a status-only /api/providers/sessions/running endpoint backed by chatRunRegistry.listRunningRuns, then sync and poll the frontend processingSessions map from AppContent without attaching to chat streams or replaying messages.
Preserve fresh local processing entries during sync so newly sent messages are not cleared before the backend registry catches up, and clear completed sessions once the status endpoint no longer reports them.
Thread active session state through sidebar project/session components, show rotating loaders for processing sessions, and keep the running search mode expanded and filterable.
Fix optimistic local user-message dedupe so repeated prompts are only collapsed when a matching server echo appears from the same send window, preventing sent messages from disappearing until assistant completion.
Add registry test coverage for listing currently running app sessions.
Tests: npx eslint on changed files; npx tsc --noEmit -p tsconfig.json; npx tsc --noEmit -p server/tsconfig.json; npx tsx --tsconfig server/tsconfig.json --test server/modules/websocket/tests/chat-run-registry.test.ts.
The frontend previously juggled placeholder IDs, provider-native IDs, and session_created handoffs, which caused race conditions and provider-specific branching. This introduces app-allocated session IDs, a chat run registry with event replay, delta sidebar updates, and one kind-based websocket contract so the UI can treat every provider the same while JSONL remains the source of truth.
Replace the chat processing banner with a minimal activity indicator and
rebuild the state model underneath it. The old banner was driven by five
overlapping pieces of state (isLoading, canAbortSession, claudeStatus in the
chat, plus two app-level Sets updated in lockstep through four callbacks)
that had to be kept in sync imperatively. Because completion and status
events mutated the *viewed* session's flags regardless of which session they
belonged to, a background session finishing could hide the indicator for a
still-running session, returning to a finished session could briefly show a
stale banner, and a late status reply could override a newer request.
The fix is structural rather than patch-by-patch: a single
Map<sessionId, {statusText, canInterrupt, startedAt}> in useSessionProtection
is now the only source of truth for "this session is working". The indicator,
stop button, composer streaming state, and session protection are all derived
from the viewed session's entry on render, so there is no stale local copy to
restore or reset when switching sessions. A PENDING_SESSION_ID sentinel
covers the window before a new conversation receives its real session id.
Terminal events delete the entry atomically, which is why the indicator
disappears the instant the final chunk arrives. Stale check-session-status
replies are discarded via an ifStartedBefore guard (an idle reply older than
the entry's startedAt describes a previous request, not the current one).
The second half unifies the provider lifecycle contract, because the frontend
could not be made race-free while each provider terminated differently:
- cursor emitted complete twice per run (result line + process close), which
double-played the completion sound and let a late close-complete clear a
newer request's indicator
- aborts produced two completes (the abort-session reply plus the provider's
own non-aborted one), so cancelling a run played the celebration sound
- codex omitted exitCode; others attached ad-hoc fields (resultText, isError,
isNewSession) the client had to know about
- claude/codex failures ended with only an error event while gemini/cursor
also emit kind:'error' for mid-run stderr noise, so 'error' was ambiguous
between "the run died" and "a process wrote to stderr"
Every run now ends with exactly one complete built by createCompleteMessage()
({sessionId, actualSessionId, exitCode, success, aborted}); abort-session
sends it on behalf of cancelled runs and providers detect the abort and skip
their own. error is demoted to an informational row, so stderr noise no
longer kills the indicator mid-run, and the client celebrates only
success: true completes.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Users can miss chat completions while the app is in the background.
They can also miss completions when their attention is elsewhere.
Add opt-out sound notifications and a temporary title marker.
This makes completion noticeable without external audio assets or persistent browser notifications.
Plugin servers are started with a deliberately minimal env (PATH, HOME,
NODE_ENV, PLUGIN_NAME). On Windows that drops system variables that child
processes need to bootstrap. The one that bit me: without APPDATA, CPython
cannot find the per-user site-packages, so a plugin that shells out to a
pip install --user CLI launches the tool but it dies with ModuleNotFoundError.
SystemRoot, PATHEXT and TEMP cause similar failures for other tools.
On win32, pass through a small allowlist of non-secret system variables
(SystemRoot, windir, SystemDrive, USERPROFILE, APPDATA, LOCALAPPDATA, TEMP,
TMP, PATHEXT) when they are set. No change off Windows, and no host secrets
are exposed.
Replace bare background operator with nohup+disown so the cloudcli
server process survives after the sbx exec session terminates.
Also redirects stdout/stderr to /tmp/cloudcli-ui.log for debugging
via `cloudcli sandbox logs`.
Fixes#791
Co-authored-by: NoahHahm <noah@naverz-corp.com>
Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
Co-authored-by: Haile <118998054+blackmammoth@users.noreply.github.com>
Users need a visible upload path from the explorer itself, not only drag and
drop behavior with no progress feedback. Routing picker and drop uploads
through one XHR-backed hook keeps progress, validation, refresh, and success
counts consistent for every upload source.
The 200MB limit is mirrored in the client, multer, and nginx template so large
uploads fail predictably instead of being blocked by whichever layer sees the
request first. The server also returns explicit requested and uploaded counts
so partial or multi-file batches can render accurate status text.
* perf(file-tree): parallelize directory traversal and widen default ignore list
The project file-tree endpoint walked children sequentially with
`await fsPromises.stat()` inside a for-loop plus a separate
`fsPromises.access()` probe before recursing. On high-latency
filesystems (NFS/SMB) every one of those round-trips was serialized,
so a 120k-file SMB-mounted project took ~2 minutes to load.
This change:
* Runs stat() and recursive getFileTree() calls in parallel via
`Promise.all` — pipelines round-trips and lets subtree traversals
overlap.
* Drops the redundant access() probe; any EACCES now surfaces from
readdir's own try/catch in the recursive call, saving one RTT per
directory.
* Extracts the hardcoded skip list into an IGNORED_DIRS Set and
extends it to cover common Python / Rust / JVM / IDE build
artefacts (.next, __pycache__, .pytest_cache, .tox, .venv,
target, .gradle, .idea, coverage, etc).
No API shape change; existing consumers get the same tree structure,
only much faster on large or remote-mounted projects.
* fix(file-tree): bound filesystem traversal concurrency
Prevent large file-tree scans from launching unbounded stat and readdir work.
Keep the parallel traversal benefit on high-latency mounts with a bounded queue.
Ignore skipped names only for directories so same-named files stay visible.
* fix(file-tree): inspect entries with lstat
Use lstat for file-tree metadata so symlink entries are identified without following targets.
---------
Co-authored-by: leonkong via Claude <leonkong.claude@users.noreply.github.com>
Claude's model catalog changes quickly enough that a shared three-day cache can
leave users selecting stale defaults or missing newly available model aliases.
Route Claude model lookups through the provider every time so the UI and slash
commands reflect the current provider result instead of an old disk snapshot.
Keep the static fallback catalog aligned with the latest Claude defaults so the
provider still has a sensible response when live discovery is unavailable.
The WebSocket gateway never sent ping frames, so any reverse proxy with
an idle timeout (Cloudflare Tunnel ~100s, AWS ALB 60s, nginx 60s, etc.)
would silently tear down /shell, /ws and /plugin-ws/* connections after
the idle window. The UI reconnects automatically but users see a
"Connecting to shell" toast every 1–3 minutes during normal use and any
in-flight PTY/chat traffic can race the reconnect.
Schedule a 30s ws.ping() per connection at the gateway level, cleared on
close/error. ping/pong counts as protocol activity for all proxies that
implement WebSocket correctly, so this single change covers every
deployment topology without per-proxy tuning.
Fixes#769
Co-authored-by: Haile <118998054+blackmammoth@users.noreply.github.com>
* fix: preserve WebSocket frame type in plugin proxy
The plugin WebSocket proxy relays all messages as binary frames
regardless of the original frame type. This causes text-based ready
messages to be forwarded as binary, so the browser never processes
them and plugin UIs (like web-terminal) show a spinner indefinitely.
Pass the isBinary flag through in both relay directions so the
original frame type is preserved.
FixesCoderLuii/HolyClaude#11
* fix(plugins): preserve websocket frame type in proxy
Ultraworked with [Sisyphus](https://github.com/code-yeongyu/oh-my-openagent)
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
---------
Co-authored-by: Sisyphus <clio-agent@sisyphuslabs.ai>
* fix: refresh Claude auth status after login flow
* fix: rely on refreshed auth status after login
---------
Co-authored-by: HolyCode User <noreply@holycode.local>
* fix: remove the hide cursor on windows logic
* feat(cursor): update fallback models
* fix(claude): force fallback models and disable supportedModels lookup
* feat: add opencode support
* fix: stabilize opencode session startup
* fix: /models
* fix: improveUI for commands
* fix: format commands.js
* feat: load models through provider adapters
Provider model selection had outgrown a single hardcoded service.
The old service mixed shared caching with provider catalogs and CLI lookup details.
That made stale model lists more likely as providers changed on separate schedules.
Move model discovery behind each provider so lookup lives next to the integration.
The shared service now focuses on provider resolution, caching, persistence, and dedupe.
Return cache metadata and add bypassCache because model availability changes outside the app.
The UI and /models command can show freshness and let users force a provider refresh.
Surface model descriptions while keeping fallback catalogs for unavailable CLIs or SDKs.
* feat(models): resolve active session models through provider adapters
The model inventory command was showing a mix of catalog defaults and
composer-local state instead of the model that is actually active for a
real provider session. That made /models, /cost, and /status
misleading once a session had already started, especially for providers
whose effective runtime model can differ from the optimistic model value
held in the UI.
Introduce an explicit getCurrentActiveModel() contract on
IProviderModels so model resolution lives next to each provider's
catalog logic and uses the provider-native source of truth:
- Claude reads the init event from a resumed stream-json run
- Codex reads model from ~/.codex/config.toml
- Cursor reads lastUsedModel from the chat store.db
- OpenCode reads the persisted session model from opencode.db
- Gemini intentionally returns its default because the CLI does not
provide a reliable active-session lookup
Keep the returned shape intentionally minimal ({ model }). The goal is
to expose only what downstream command consumers need and avoid leaking
provider-specific metadata into a shared transport shape that would
create extra UI coupling and future cleanup cost.
Also make command behavior session-aware: when there is no concrete
session id, do not spawn provider processes or inspect provider session
storage just to answer /models, /cost, or /status. In a new-session
view the correct answer is simply the provider default, and doing more
work there adds latency and unnecessary side effects for no user value.
As part of this, centralize two supporting concerns:
- add a shared helper for building the default current-model result from
a provider catalog so fallbacks stay aligned with DEFAULT
- move leaf-directory validation into shared utils so Cursor session
readers and model lookup code enforce the same path-safety rule
Tests were expanded to cover both the new service delegation path and
the sessionless command behavior, while keeping cache-sensitive tests
isolated from persisted host cache state.
Why this change:
- command output should reflect the model actually driving a session
- new-session views should stay fast and side-effect free
- provider-specific active-model lookup should not be scattered across
routes or UI code
- fallback behavior should be explicit, consistent, and limited to the
provider default when no true active model can be resolved
* feat: support session-scoped model overrides
Model selection was acting like a provider-level preference.
That made resumed sessions drift back to a default or request-time model.
Users expect /models changes made inside a conversation to affect that session.
Store explicit session choices in app-owned ~/.cloudcli state.
This avoids editing provider transcripts or native provider config.
Resolve the effective model before launching each provider runtime.
Claude, Cursor, Codex, Gemini, and OpenCode now honor stored resume choices.
Expose a backend active-model change endpoint for existing sessions.
The models modal can now distinguish default changes from session overrides.
It also shows when a selected model will apply on the next response.
For Claude, stop probing active model state by resuming with a dummy prompt.
Read the indexed JSONL transcript from the end instead.
This preserves provider history while honoring /model stdout or model fields.
Add service tests for adapter delegation and resume-model precedence.
The tests keep cache state, override state, and requested fallback separate.
* feat: make command modal more compact
* fix: preserve opencode session creation events
OpenCode emits the real session id asynchronously on its first JSON output. The runner
registered that id from a helper that could not see the spawned process because
the process reference was scoped inside the model-resolution callback. That
ReferenceError was swallowed by the generic JSON parse fallback, so the client
never received session_created. Without that event, a new OpenCode chat stayed
on / and the assistant stream was not attached to the new session view.
Keep the process reference in the outer spawn scope so registration can update
the active-process map and websocket writer as soon as OpenCode announces the
session id. Split JSON parsing from event processing so malformed non-JSON
output can still stream as raw text, while registration or adapter failures are
surfaced as real errors instead of being hidden as assistant content.
Add a fake opencode executable regression test to lock in the expected lifecycle
ordering: session_created must be sent before live assistant messages, and the
same session id must carry through stream_end and complete.
* fix: clarify model refresh and onboarding providers
OpenCode is now a supported chat provider, but first-run onboarding still only offered
Claude, Cursor, Codex, and Gemini. That made OpenCode harder to discover and
forced users to finish setup before finding the provider in settings or chat.
Adding it to onboarding keeps first-run setup aligned with the providers the
application already supports elsewhere.
The model refresh control was also doing too much visual work. In the new chat
model picker, the previous Hard Refresh label looked like the dialog heading,
which made the primary task unclear. Users open that dialog to choose a model;
refreshing catalogs is only a secondary maintenance action for stale cached
provider model lists.
Rename and reposition the refresh affordance so the model picker reads as a
model picker first. The copy now explains why catalogs are cached, when a refresh
is useful, and that the refresh checks every provider. The /models modal gets the
same clarification so both model-selection surfaces describe the cache behavior
consistently.
* fix: format opencode model catalog labels
OpenCode returns provider-prefixed ids directly from the CLI. Passing those ids through as
labels made the model picker hard to scan: users saw values like
anthropic/claude-3-5-sonnet-20241022 or lowercased, hyphen-split text instead
of readable model names.
Keep the exact OpenCode id as the option value because that is what the CLI
expects, but derive a presentation label for the frontend. The formatter is
intentionally generic rather than a catalog of known providers. It handles common
identifier structure such as provider/model, hyphen-delimited words, v-prefixed
versions, adjacent numeric version tokens, and 8-digit date suffixes.
This keeps OpenCode usable as its model list expands across many upstream
providers without requiring code changes for every new provider or model family.
The description keeps the raw provider-prefixed id visible so users can still
confirm the precise model being selected.
* feat: add more fallback models for cursor
* docs: move model catalog out of shared
The model catalog is no longer a frontend/backend runtime contract.
Keeping it under shared made ownership misleading. It implied the catalog was
application code shared by runtime consumers, even though it now only supports
README links and public API documentation.
Move the catalog into public so it lives beside the docs surfaces that need it.
This gives the API docs a stable, served module and gives README readers a
linkable source without suggesting frontend or backend runtime dependency.
Render the API docs model list from the exported provider registry instead of a
hardcoded Claude/Cursor/Codex subset. That keeps Gemini and OpenCode visible and
makes future provider documentation changes flow through one docs-specific file.
Update README links, provider maintenance notes, and package files so published
artifacts include the standalone docs page and model catalog without relying on
the old shared path.
* fix: simplify empty-state model selector
Keep the provider empty state focused on the setup action users need there:
choosing a model.
The refresh control, cache timestamp, and refresh explanation made the dialog feel
like a cache-management surface.
That extra action is out of place in the empty state, where the goal is to start
a chat with the selected provider and model.
Remove the refresh-specific UI from ProviderSelectionEmptyState and drop the
now-unused refresh/cache props from the ChatMessagesPane pass-through.
Refresh behavior remains available in the dedicated command result flow.
* feat(providers): surface skills in slash command menu
Provider skills were hidden behind provider-specific filesystem rules.
That made the backend and UI unable to offer one discovery path for skills.
Add a normalized skills contract, provider service, and provider skills API.
Keep provider-specific lookup rules inside adapters so routes and UI stay generic.
Claude needs plugin handling because enabled plugins resolve through installed_plugins.json.
Plugin folders can expose commands or skills, so Claude scans both forms.
Claude plugin commands are namespaced to avoid collisions with user and project skills.
Codex, Gemini, and Cursor adapters map their expected skill roots into the same contract.
The slash menu now shows skills beside built-in and custom commands for discovery.
The menu avoids mid-message activation, duplicate rows, loose namespace matches, and input overlap.
Provider tests cover discovery locations and Claude plugin edge cases.
* fix(providers): guard invalid skill command namespaces
Claude plugin ids come from local settings and installed plugin metadata.
Invalid ids such as empty strings or @ should not become command namespaces.
Skip plugin folders when no safe plugin name can be derived.
This prevents malformed slash commands like /:command from reaching the UI.
Add regression coverage for empty and @ plugin ids.
Keyboard selection in the slash menu should match mouse selection.
Only skills are inserted into the composer because they are provider invocations.
Built-in and custom commands execute directly and close the menu on success or failure.
* fix(security): centralize safe frontmatter parsing
Move frontmatter parsing into server/shared/frontmatter.ts so every backend caller
uses the same gray-matter configuration instead of importing gray-matter directly.
The goal is to keep executable JS and JSON frontmatter engines disabled for
all markdown discovered from the filesystem, not only command routes.
Provider skills and shared skill metadata now go through parseFrontMatter too.
That closes the gap where plugin or provider markdown could regain default
gray-matter behavior simply because it lived outside the original command path.
Classify the new parser in backend boundaries so modules can depend on the
safe shared API without reaching into legacy utility paths.
* feat(providers): add comprehensive guide for provider module setup and usage
* fix: reset-state-on-new-session-click
* fix(chat): preserve continuity while session ids settle
New conversations were crossing a short but important consistency gap.
The route could already point at a newly created session id while the
projects payload had not refreshed yet, and realtime/optimistic messages
could still be keyed under a provisional id. In that window the UI could
stop reading the active session store, briefly render the conversation as
missing, and then repopulate it a moment later.
That same gap also made duplication more likely. Optimistic local user
messages could survive long enough to appear beside the persisted copy,
and finalized assistant streaming rows could sit directly next to the
server-backed assistant message with the same content before realtime state
was cleared. The result was a chat view that felt unstable exactly when a
new session was being created.
This commit makes session-id reconciliation a first-class part of the chat
flow instead of assuming every layer will agree immediately. The session
store now understands canonical session aliases and can migrate one
conversation from a provisional id to the real id without dropping its
in-memory state. The route navigation path can replace the provisional URL
entry instead of stacking it in history, and the project/session selection
logic keeps a synthetic selected session alive long enough for the sidebar
and project payloads to catch up.
The practical goal is to keep one visible conversation throughout the whole
creation lifecycle: no dead window between websocket events and project
refresh, no stale provisional URL after the real id is known, and no extra
optimistic/local bubbles when server history catches up.
* fix(cli): resolve executable path for Claude CLI on Windows
* fix(session-synchronizer): improve session name extraction for Claude and Codex