Debugging Claude Code in RStudio's Terminal

1 — Setup & Initial Symptoms

The Problem Statement

Joe is running Claude Code (Anthropic's AI coding CLI, built on ink / React for terminals) inside RStudio's terminal pane. Something is clearly wrong with the rendering — overprinting, missing content, garbled UI — but the root cause is unknown. The question: how do we systematically find it?

Environment: RStudio 2026.04.0+526 · macOS Darwin 25.5.0 (Apple Silicon) · zsh · TERM=xterm-256color · 150×35 terminal pane

First Look: Environment Scan

Ran a quick battery of diagnostic commands to understand what the terminal reports about itself before writing any test scripts.

echo "TERM=$TERM"; echo "COLORTERM=$COLORTERM"
tput cols; tput lines; tput colors
infocmp -1 | head -60

Notable findings:
• COLORTERM is empty — truecolor not advertised to apps
• TERM_PROGRAM is empty — no terminal self-identification
• strikethrough (smxx) missing from terminfo
• Tc (truecolor flag) missing from terminfo
• Python and Node.js isatty() return false (expected — piped through Claude Code's Bash tool)
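
As a rough sketch, the same scan can be gathered into one dictionary from Python. The function names environment_snapshot and missing_hints are illustrative and were not part of the original session.

```python
import os
import shutil
import sys

def environment_snapshot(env=None):
    """Collect the facts the scan above looks at, as a plain dict."""
    env = os.environ if env is None else env
    # shutil falls back to 80x24 when no terminal size can be determined
    size = shutil.get_terminal_size()
    return {
        "TERM": env.get("TERM", ""),
        "COLORTERM": env.get("COLORTERM", ""),
        "TERM_PROGRAM": env.get("TERM_PROGRAM", ""),
        "stdout_isatty": sys.stdout.isatty(),
        "size": (size.columns, size.lines),
    }

def missing_hints(snapshot):
    """Names of the self-identification variables that are empty."""
    return [k for k in ("COLORTERM", "TERM_PROGRAM") if not snapshot[k]]

if __name__ == "__main__":
    snap = environment_snapshot()
    for key, value in snap.items():
        print(f"{key}={value!r}")
    print("empty:", missing_hints(snap))
```

Run directly in the terminal pane (not through a pipe), stdout_isatty should report True; through a tool that captures output it reports False, as noted above.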

2 — Terminal Capability Test Script

Writing `terminal-test.sh`

Wrote a comprehensive standalone script covering both programmatic and visual tests — the key insight being that color rendering and attribute correctness require human eyes, while things like cursor position reports can be automated.

Sections covered:

Environment snapshot (TERM, COLORTERM, window size)
Terminfo capability probes via tput
CPR round-trip (ESC[6n → ESC[row;colR)
Device Attributes query (ESC[c)
SGR attributes (bold, dim, italic, underline, blink, strikethrough)
16-color, 256-color, and truecolor gradients
Unicode & emoji rendering
OSC 8 hyperlinks
Cursor shape sequences (DECSCUSR)
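
The CPR and Device Attributes probes listed above are the parts that can be automated. A minimal Python sketch of such a probe follows, assuming a POSIX terminal; the helper names read_reply, query, and parse_cpr are hypothetical, and the original script was written in shell. The important detail is reading until the reply's final byte, so that no reply bytes are left behind in the input queue.

```python
import os
import re
import select
import sys
import termios
import tty

def read_reply(fd, terminator, timeout=1.0):
    """Read byte by byte until `terminator` arrives or no byte shows up within `timeout`."""
    buf = b""
    while not buf.endswith(terminator):
        ready, _, _ = select.select([fd], [], [], timeout)
        if not ready:          # terminal did not answer, or the answer was cut short
            break
        byte = os.read(fd, 1)
        if not byte:           # end of file
            break
        buf += byte
    return buf

def query(seq, terminator, fd_in=0, fd_out=1):
    """Send an escape-sequence query and consume the terminal's full reply."""
    saved = termios.tcgetattr(fd_in)
    try:
        tty.setcbreak(fd_in)   # no echo, no line buffering while reading the reply
        os.write(fd_out, seq)
        return read_reply(fd_in, terminator)
    finally:
        termios.tcsetattr(fd_in, termios.TCSADRAIN, saved)

def parse_cpr(reply):
    """Parse ESC[row;colR into (row, col); return None if it does not match."""
    match = re.fullmatch(rb"\x1b\[(\d+);(\d+)R", reply)
    return (int(match.group(1)), int(match.group(2))) if match else None

if __name__ == "__main__" and sys.stdin.isatty() and sys.stdout.isatty():
    print("DA1 reply:", query(b"\x1b[c", b"c"))
    print("CPR reply:", parse_cpr(query(b"\x1b[6n", b"R")))
```

The terminal settings are restored in a finally block so that an interrupted probe does not leave the shell in cbreak mode.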

Communication Problem: The Terminal Is Too Broken to Read

⚠ At this point it became clear that Claude Code's own output in RStudio was unreadable — overprinting, garbled characters, and escape sequences rendering as literal text. Joe couldn't read Claude's responses in the chat window.

Solution: route all instructions through a plain text file (assistant.txt) in the project root. All subsequent diagnostics were written there and Joe would copy-paste commands from it. This workaround was used for the rest of the session.

cat > /Users/jcheng/Development/posit-dev/assistant/assistant.txt

Running the Test — Key Results

Joe ran bash /tmp/terminal-test.sh directly in the RStudio terminal and shared a screenshot of the results.

Issues found:
• Overprinting in cursor-shape test — using \r without ESC[2K (erase line) left remnants. (This was a bug in the test script itself, not the terminal.)
• /1;2c appeared after the script — the DA1 terminal response (ESC[?1;2c) leaked into shell stdin because the script didn't fully consume it. Required Ctrl+C to clear.
• COLORTERM unset and Tc missing confirmed
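
The overprinting in the first bullet is easy to reproduce in a toy model of a single terminal row. The sketch below is only an illustration: render_line is a hypothetical helper that understands just \r, ESC[2K, and printable characters, and the label strings are made up.

```python
def render_line(data):
    """Model one terminal row: handles carriage return, ESC[2K, and printable characters."""
    cells, col, i = [], 0, 0
    while i < len(data):
        if data.startswith("\x1b[2K", i):
            cells = []                 # erase the whole line; the cursor column is unchanged
            i += 4
        elif data[i] == "\r":
            col = 0                    # carriage return: back to column 0, nothing erased
            i += 1
        else:
            while len(cells) <= col:   # pad with blanks up to the cursor column
                cells.append(" ")
            cells[col] = data[i]       # overwrite the cell under the cursor
            col += 1
            i += 1
    return "".join(cells).rstrip()

# A shorter label written after a bare \r leaves the tail of the longer one behind.
without_erase = render_line("steady underline\rbar")
with_erase = render_line("steady underline\r\x1b[2Kbar")

if __name__ == "__main__":
    print(repr(without_erase))
    print(repr(with_erase))
```

With the erase sequence the row contains only the new label; without it, the remainder of the previous label is still on screen.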

The leaked /1;2c was actually a useful data point: the terminal does respond to Device Attribute queries and identifies as VT100+AVO.

3 — Color Fix (Partial Win)

Fix: `export COLORTERM=truecolor`

The most actionable finding was the missing COLORTERM. Added to ~/.zshrc. This would tell truecolor-aware apps (including Claude Code) that 24-bit color is available. Joe tested it.

Result: colors improved but the fundamental overprinting problem persisted. The rendering was still a mess.

Width Hypothesis (First Red Herring)

Joe shared a screenshot of the garbled Claude Code UI — it showed diff output and chat elements overlapping vertically. Initial hypothesis: a terminal width mismatch between the PTY and the renderer.

stty size   # → 35 150
tput cols   # → 150
echo $COLUMNS  # → 150

All three sources agreed: 150×35. Width mismatch theory eliminated.
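
The same cross-check can be sketched in Python. This is an illustration under assumptions: width_sources and disagreeing are hypothetical names, and COLUMNS is normally a shell variable that is not exported, so a child process may not see it at all.

```python
import os

def width_sources(fd=1, env=None):
    """Return the terminal size as (columns, rows) from each source, or None if unavailable."""
    env = os.environ if env is None else env
    sources = {}
    try:
        size = os.get_terminal_size(fd)        # asks the kernel for the PTY window size
        sources["ioctl"] = (size.columns, size.lines)
    except (OSError, ValueError):
        sources["ioctl"] = None                # fd is not attached to a terminal
    cols, rows = env.get("COLUMNS"), env.get("LINES")
    sources["env"] = (int(cols), int(rows)) if cols and rows else None
    return sources

def disagreeing(sources):
    """True when at least two available sources report different sizes."""
    values = {v for v in sources.values() if v is not None}
    return len(values) > 1

if __name__ == "__main__":
    found = width_sources()
    print(found, "MISMATCH" if disagreeing(found) else "consistent")
```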

4 — Systematic Cursor Movement Tests

With width ruled out, focus shifted to cursor positioning — the mechanism ink uses for redraws.

Scrolling Hypothesis (Second Red Herring)

Theory: a 35-row terminal is short. When ink prints more than 35 rows, the terminal scrolls. Cursor-up can't reach into the scrollback buffer, so ink redraws at the wrong position. Designed a targeted test:

python3 -c "
import sys, time, shutil
rows = shutil.get_terminal_size().lines
for i in range(rows + 5):
    sys.stdout.write('Filler %02d\n' % i)
for i in range(5):
    sys.stdout.write('TARGET %d\n' % i)
sys.stdout.flush(); time.sleep(0.5)
sys.stdout.write('\033[5A')
for i in range(5):
    sys.stdout.write('\033[2KReplaced %d\n' % i)
sys.stdout.flush()
"

Prediction

If the hypothesis holds, the TARGET lines remain visible after the cursor-up, because the cursor is clamped at row 1 and cannot reach lines that have scrolled off the top.

Result

PASS — no TARGET lines visible. Cursor-up after scrolling works correctly.

Full Cursor Test Battery (All Passed)

Five more targeted tests, labeled E through I, each isolating one possible failure mode:

Test	What it checked	Result
E	Cursor-up over ANSI-colored lines	✓ PASS
F	Line growing past terminal width mid-stream (simulates ink streaming)	✓ PASS
G	OSC 8 hyperlinks in lines	✓ PASS
H	DECSC/DECRC save-restore cursor	✓ PASS
I	Node.js (not Python) stdout cursor-up	✓ PASS

Wall hit: every synthetic test passed, yet Claude Code still garbled. The bug was not in any escape sequence we could hand-craft. Time for a new strategy.

Switching to Fable 5

At this point Joe ran /model and switched from Claude Sonnet 4.6 to Fable 5, Anthropic's newest model for complex, long-running tasks. The instruction: "This investigation is not going anywhere, please think outside of the current box and come up with a new strategy."

Fable's insight: stop trying to synthesize the failure with test scripts. All synthetic tests pass because they don't emit the actual bytes Claude Code does. Instead, capture the real byte stream with script -q -r, then replay and analyze it programmatically.

5 — Capturing the Real Byte Stream

Strategy Pivot: Record with `script`

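The script command records a terminal session into a file. With -r, BSD script logs every byte passing through the PTY, input and output, in a binary format: each record is a 24-byte header (payload length, timestamp, direction) followed by the payload bytes. The -q flag suppresses the start and stop banners.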

script -q -r /tmp/claude-session.scr
# Inside: run claude, trigger the bug, exit, then type exit

Parsed the file in Python to extract the output stream separately from input:

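import struct

# Each record: u64 len, u64 sec, u32 usec, u32 dir ('i'=0x69 / 'o'=0x6f)
HEADER = struct.Struct('<QQLL')  # 24 bytes, little-endian
data = open('/tmp/claude-session.scr', 'rb').read()
out = bytearray()
pos = 0
while pos + HEADER.size <= len(data):
    ln, sec, usec, d = HEADER.unpack_from(data, pos)
    start = pos + HEADER.size
    if d == ord('o'):  # keep only the bytes written to the screen
        out += data[start:start + ln]
    pos = start + ln   # jump to the next record header
open('/tmp/claude-out.bin', 'wb').write(out)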

Reference Emulator: pyte

Fed the captured output bytes into pyte, a pure-Python VTxxx-compatible terminal emulator, and used it as a reference for what a standards-compliant terminal should display.

import pyte
screen = pyte.Screen(156, 35)
stream = pyte.ByteStream(screen)
stream.feed(open('/tmp/claude-out.bin','rb').read())
# Display the expected screen
for i, line in enumerate(screen.display):
    if line.strip(): print(f"{i:2d}| {line.rstrip()}")

Key finding: pyte rendered the banner, mascot art, separators, and input box perfectly at 156 columns wide. Claude Code's byte output was entirely correct. The problem was somewhere in RStudio's rendering pipeline.

Confirming Width: Claude Code Renders for 156 Columns

The separator lines were 156 ─ characters long, and column-absolute positioning sequences reached column 156. Claude Code's PTY reported 156 columns to it. Meanwhile Joe's ruler test showed 150 columns in the renderer — a 6-column discrepancy was briefly suspected as the cause.

The ruler test showed all three rows rendering correctly. Width mismatch eliminated again. The mystery deepened.

6 — Binary Bisection via Replay

Building a Replay Kit

Claude Code's startup emits several distinct bursts of output (each being a separate PTY read → rsession write → websocket frame → xterm.js render cycle). Split the captured output into 9 cumulative-prefix files, one per write record. Each file = everything Claude Code emitted up through record N.
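
A minimal sketch of the cumulative-prefix split, assuming the output records are already available as a list of byte strings. The function names and the NN.bin file naming are hypothetical; the session's actual file names are not recorded here.

```python
from itertools import accumulate
from pathlib import Path

def cumulative_prefixes(records):
    """Return [r1, r1+r2, r1+r2+r3, ...] for a list of output records."""
    return list(accumulate(records))   # bytes + bytes concatenates

def write_prefix_files(records, directory):
    """Write one file per prefix; the NN.bin naming is an assumption for illustration."""
    directory = Path(directory)
    directory.mkdir(parents=True, exist_ok=True)
    paths = []
    for n, prefix in enumerate(cumulative_prefixes(records), start=1):
        path = directory / f"{n:02d}.bin"
        path.write_bytes(prefix)
        paths.append(path)
    return paths
```

Because prefix N contains everything up through record N, the first prefix at which the display changes points at the record whose bytes caused the change.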

Wrote a replay shell script (/tmp/replay/run.sh) that: sanitizes the terminal with stty sane, disables stale focus-reporting and bracketed-paste modes, cats prefix N, pauses for observation, and advances on Enter.

⚠ First run revealed more stale terminal state: Enter printed ^M (raw mode stuck on from a previous crashed session), and regaining focus printed ^[[I / ^[[O (focus events enabled and not cleaned up). Added stty sane and printf '\e[?1004l\e[?2004l\e[?25h\e[0m' to fix.

Bisection Results

Joe reported which prefixes showed the Claude Code mascot banner:

02 ✓ 03 ✗ 04 ✗ 05 ✗ 06 ✓ 07 ✗ 08 ✗ 09 ✓

The banner toggled on and off. This was compared against what pyte predicted:

Pyte predicted the banner visible in ALL prefixes (02–09). So pyte disagreed with RStudio in 5 out of 8 cases. The bytes were correct; RStudio was mishandling them.
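
The tally can be checked mechanically. The snippet below simply encodes Joe's reported observations and the pyte prediction from above; the variable names are illustrative.

```python
# Banner visibility per prefix as reported from RStudio (True = banner visible).
observed = {2: True, 3: False, 4: False, 5: False, 6: True, 7: False, 8: False, 9: True}

# pyte predicted the banner in every prefix.
predicted = {n: True for n in observed}

mismatches = [n for n in sorted(observed) if observed[n] != predicted[n]]

if __name__ == "__main__":
    print(f"{len(mismatches)} of {len(observed)} prefixes disagree: {mismatches}")
```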

Headless xterm.js 6.0.0 Confirms

RStudio bundles xterm.js 6.0.0 (found in src/gwt/tools/build-xterm in the cloned rstudio repo). Installed @xterm/headless@6.0.0 and ran the same prefix sweep.

npm install --legacy-peer-deps @xterm/headless@6.0.0
node -e "
const {Terminal} = require('@xterm/headless');
// feed each prefix, check buffer for '█'
..."

Same result as pyte: headless xterm.js showed the banner in every prefix, across a sweep of 100–200 columns and 6–45 rows. xterm.js handles these bytes correctly when it receives them intact, so the fault had to lie upstream of xterm.js.

7 — Reading RStudio Source Code

Cloning RStudio and Tracing the Terminal Pipeline

Cloned the RStudio source at the exact matching tag and traced how terminal output flows from PTY to screen:

git clone --depth 1 --branch v2026.04.0+526 \
  https://github.com/rstudio/rstudio \
  ~/Development/_gh/rstudio/rstudio

The pipeline:

PTY → rsession (C++) → ConsoleProcess::onStdout()
    → enqueOutputEvent()
    → s_terminalSocket.sendText()  [WebSocket TEXT frame]
    → xterm.js in the browser

The Smoking Gun: `enqueOutputEvent()`

In src/cpp/session/SessionConsoleProcess.cpp, line 645 (at tag v2026.04.0+526):

if (procInfo_->getChannelMode() == Websocket)
{
    s_terminalSocket.sendText(procInfo_->getHandle(), output);
    return;  // <-- Error returned by sendText is silently discarded
}

sendText() calls websocketpp::server::send() with opcode::text. WebSocket text frames must be valid UTF-8. websocketpp validates outgoing text frames — if the payload isn't valid UTF-8, the send fails and returns an error.

Raw PTY reads are arbitrary byte windows over a UTF-8 stream. A read can end in the middle of a multi-byte character, making the chunk individually invalid UTF-8. That's exactly what happened in our capture: a 1024-byte output record ended with the first byte of ─ (U+2500, bytes E2 94 80), and the next record began with the remaining two bytes.

websocketpp refuses to send the frame → error is silently discarded → the entire ~1KB chunk is dropped. Any valid ASCII in the same chunk goes down with it. Claude Code's separator lines are 156 × 3-byte box-drawing characters, so a large fraction of its output constantly hits this.
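
To see how easily fixed-size reads cut through multi-byte characters, here is a small Python sketch. It assumes a synthetic stream of twenty 156-character separator lines and a 1024-byte read size; the real capture contained other output as well, so this only illustrates the mechanism.

```python
def is_valid_utf8(chunk):
    """True if the chunk decodes as UTF-8 on its own."""
    try:
        chunk.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False

def split_reads(stream, read_size=1024):
    """Cut a byte stream into fixed-size windows, the way successive PTY reads might."""
    return [stream[i:i + read_size] for i in range(0, len(stream), read_size)]

# One separator line: 156 box-drawing characters (3 bytes each) plus a newline.
separator = ("─" * 156 + "\n").encode("utf-8")
stream = separator * 20

chunks = split_reads(stream)
invalid = [i for i, chunk in enumerate(chunks) if not is_valid_utf8(chunk)]

if __name__ == "__main__":
    print(f"{len(invalid)} of {len(chunks)} chunks are invalid UTF-8 on their own")
    print("first chunk ends with:", chunks[0][-2:].hex())
```

The stream as a whole is valid UTF-8; only the individual windows are not, which is the distinction the text-frame validation misses.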

8 — Confirmation Tests

First Confirmation Test

Designed a test with a split multibyte character and predicted exactly what RStudio vs. iTerm2 should show:

python3 -c "
import os, time
os.write(1, b'LINE 1: should be VISIBLE\n');  time.sleep(0.5)
os.write(1, b'LINE 2: invisible \xe2');        time.sleep(0.5)
os.write(1, b'\x94\x80 LINE 3: invisible\n');   time.sleep(0.5)
os.write(1, b'LINE 4: should be VISIBLE\n')
"

Prediction (RStudio)

Lines 1 & 4 visible; lines 2 & 3 missing (they are the invalid UTF-8 chunks)

Actual (RStudio)

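ALL FOUR lines invisible

"All four invisible" is even more consistent with the theory. The first run had no sleep calls between the writes (the listing above shows the later version, with sleeps added), so the writes coalesced into fewer PTY reads. The two invalid fragments then took the innocent ASCII lines down with them.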

Second Confirmation (Sleeps Prevent Coalescing)

Re-ran with explicit sleeps between each write to ensure each OS write becomes a separate PTY read chunk:

Prediction

Lines 1 & 4 visible; lines 2 & 3 invisible

Result

CONFIRMED. Lines 1 & 4 visible, lines 2 & 3 completely gone.
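
Both runs can be reproduced with a toy model of the transport in which any chunk that is not valid UTF-8 on its own is dropped whole. This is a sketch of the observed behaviour, not RStudio's code, and the particular coalesced grouping is an assumption (one grouping that is consistent with the first result).

```python
def deliver(chunks):
    """Model of the observed behaviour: a chunk that is not valid UTF-8 is dropped whole."""
    shown = []
    for chunk in chunks:
        try:
            shown.append(chunk.decode("utf-8"))
        except UnicodeDecodeError:
            pass                      # the entire chunk is lost, ASCII included
    return "".join(shown)

writes = [
    b"LINE 1: should be VISIBLE\n",
    b"LINE 2: invisible \xe2",
    b"\x94\x80 LINE 3: invisible\n",
    b"LINE 4: should be VISIBLE\n",
]

# Second run: sleeps keep each write in its own PTY read.
separate = deliver(writes)

# First run (assumed grouping): writes 1+2 and 3+4 arrive as two reads.
coalesced = deliver([writes[0] + writes[1], writes[2] + writes[3]])

if __name__ == "__main__":
    print("separate :", repr(separate))
    print("coalesced:", repr(coalesced))
```

In this model, if all four writes had landed in a single read, the chunk would have been valid and every line would have appeared, so the first result implies the split still fell between the two halves of the character.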

9 — Workaround & Fix

Workaround: Disable WebSockets

RStudio has an alternate output channel for its terminal: RPC mode, which doesn't use WebSocket text frames and therefore doesn't validate UTF-8.

Tools → Global Options → Terminal → uncheck "Connect with WebSockets"

Joe restarted the terminal and ran Claude Code again.

"It works way, way better!" — Claude Code's UI rendered correctly, matching iTerm2 output. The only residual artifact: occasional �� where a chunk boundary splits a multibyte character (the RPC path still decodes each chunk independently, so the bytes arrive but the character renders as three replacement chars rather than being assembled).

The �� characters are the bytes of ─ (U+2500, E2 94 80) each rendered as U+FFFD. Not an em-dash — a box-drawing light horizontal line. The proper fix would carry incomplete trailing UTF-8 bytes to the next chunk before sending, eliminating both the drops and the ��.
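
The carry-over idea can be sketched in a few lines of Python. This only illustrates the technique; an actual fix would live in RStudio's C++ session code, and the names split_incomplete_tail and Utf8Reassembler are hypothetical.

```python
def split_incomplete_tail(chunk):
    """Split chunk into (complete, tail), where tail is an unfinished trailing UTF-8 sequence."""
    for back in range(1, min(3, len(chunk)) + 1):
        byte = chunk[-back]
        if byte & 0xC0 == 0x80:        # continuation byte: keep looking for the lead byte
            continue
        if byte >= 0xF0:
            need = 4                   # lead byte of a 4-byte sequence
        elif byte >= 0xE0:
            need = 3
        elif byte >= 0xC0:
            need = 2
        else:
            need = 1                   # plain ASCII
        if need > back:                # sequence is cut off by the chunk boundary
            return chunk[:-back], chunk[-back:]
        return chunk, b""
    return chunk, b""

class Utf8Reassembler:
    """Holds back an unfinished trailing character and prepends it to the next chunk."""
    def __init__(self):
        self._carry = b""

    def feed(self, chunk):
        complete, self._carry = split_incomplete_tail(self._carry + chunk)
        return complete
```

One instance would be kept per terminal, since the carried bytes belong to a specific output stream. Fed the two halves of the split from the confirmation test, it emits the text before the character first and the reassembled character with the second chunk, so each emitted chunk is valid UTF-8.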

Filing the GitHub Issue

Drafted and filed a detailed bug report with minimal repro, root-cause analysis, code references, and suggested fix.

gh issue create -R rstudio/rstudio \
  --title "Terminal silently drops output chunks with invalid UTF-8 (WebSockets)" \
  --body-file issue-body.md

rstudio/rstudio #17941

Terminal silently drops entire output chunks containing invalid UTF-8 (split multibyte characters) when using WebSockets

🔍 Root Cause Summary

Location: SessionConsoleProcess.cpp::enqueOutputEvent(), line 645 — the call to s_terminalSocket.sendText()
Mechanism: Raw PTY chunks are sent as WebSocket text frames. websocketpp validates outgoing text frames for UTF-8 correctness. PTY reads are arbitrary byte windows and can split multi-byte characters across reads, making individual chunks invalid UTF-8. The send fails; the returned Error is silently discarded; the entire chunk (~1KB) is lost.
Why Claude Code: Its UI draws separator lines of 156 × 3-byte box-drawing characters. A large fraction of its output chunks end mid-character, constantly triggering the bug.
Why only RStudio: iTerm2 and Terminal.app read raw PTY bytes directly; they don't route through a WebSocket transport and have no UTF-8 frame validation.
Proper fix: Maintain per-terminal-handle state of incomplete trailing UTF-8 bytes; prepend them to the next chunk before sending. Also log / surface sendText errors instead of silently discarding.
Workaround: Tools → Global Options → Terminal → uncheck "Connect with WebSockets". The RPC channel doesn't validate UTF-8, so drops become occasional �� artifacts instead.

10 — What Made This Work

Key Diagnostic Techniques

1. Routing around the broken medium. When the terminal itself was too garbled to read responses, using a plain assistant.txt file as the communication channel let the investigation continue without switching tools.

2. The model switch that unstuck things. Switching to Fable 5 mid-investigation introduced the critical insight: synthetic tests will always pass because they don't reproduce the real failure conditions. Capture the actual byte stream instead.

3. script -q -r as a byte-perfect capture tool. Recording a real session and parsing the binary format separated "what bytes were emitted" from "what the terminal did with them."

4. Reference emulator comparison. Feeding the same bytes into pyte (Python) and @xterm/headless (the same xterm.js version RStudio bundles) proved the data was correct before looking at the transport.

5. Reading the source. Once we knew the fault was in the transport layer, cloning the exact RStudio version and reading enqueOutputEvent() took about 10 minutes to find the discarded error and connect it to WebSocket text frame validation.

fin

Investigated June 11, 2026 · Claude Sonnet 4.6 + Fable 5 · Bug filed as rstudio/rstudio #17941