A blow-by-blow account of how a "something's wrong with ANSI codes" symptom turned into a root-cause patch for a WebSocket UTF-8 framing bug in rsession.
Joe is running Claude Code (Anthropic's AI coding CLI, built on ink / React for terminals) inside RStudio's terminal pane. Something is clearly wrong with the rendering (overprinting, missing content, garbled UI), but the root cause is unknown. The question: how do we systematically find it?
Ran a quick battery of diagnostic commands to understand what the terminal reports about itself before writing any test scripts.
echo "TERM=$TERM"; echo "COLORTERM=$COLORTERM"
tput cols; tput lines; tput colors
infocmp -1 | head -60
- COLORTERM is empty → truecolor not advertised to apps
- TERM_PROGRAM is empty → no terminal self-identification
- strikethrough (smxx) missing from terminfo
- Tc (truecolor flag) missing from terminfo
- isatty() returns false (expected: piped through Claude Code's Bash tool)
terminal-test.sh
Wrote a comprehensive standalone script covering both programmatic and visual tests. The key insight: color rendering and attribute correctness require human eyes, while things like cursor position reports can be automated.
Sections covered:
- Terminfo capabilities queried with tput
- Cursor position report (ESC[6n → ESC[row;colR)
- Primary device attributes, DA1 (ESC[c)
Solution: route all instructions through a plain text file (assistant.txt)
in the project root. All subsequent diagnostics were written there and Joe would
copy-paste commands from it. This workaround was used for the rest of the session.
cat > /Users/jcheng/Development/posit-dev/assistant/assistant.txt
Joe ran bash /tmp/terminal-test.sh directly in the RStudio terminal
and shared a screenshot of the results.
- \r without ESC[2K (erase line) left remnants. (This was a bug in the test script itself, not the terminal.)
- /1;2c appeared after the script: the DA1 terminal response (ESC[?1;2c) leaked into shell stdin because the script didn't fully consume it. Required Ctrl+C to clear.
- COLORTERM unset and Tc missing confirmed.
The leaked /1;2c was actually a useful data point:
the terminal does respond to Device Attribute queries and identifies as VT100+AVO.
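To make the leaked reply easier to read, here is a minimal Python sketch that decodes a DA1 response. The function name `parse_da1` and the lookup table are illustrative and not part of the original test script; only the parameter pair seen in this session is named, and anything else is reported as unrecognized.

```python
import re

# Primary Device Attributes (DA1) reply has the shape: ESC [ ? Ps ; Ps ... c
DA1_REPLY = re.compile(rb"\x1b\[\?([0-9;]*)c")

# Only the combination observed in this session is named here (assumption:
# other parameter sets are simply left undecoded by this sketch).
KNOWN_DA1 = {(1, 2): "VT100 with Advanced Video Option (AVO)"}


def parse_da1(reply: bytes):
    """Return (params, description) for a DA1 reply, or None if none is found."""
    match = DA1_REPLY.search(reply)
    if match is None:
        return None
    params = tuple(int(p) for p in match.group(1).split(b";") if p)
    return params, KNOWN_DA1.get(params, "unrecognized")


print(parse_da1(b"\x1b[?1;2c"))
```

A test script that reads the reply this way, instead of leaving it in the input queue, would also avoid the stray text at the shell prompt.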
export COLORTERM=truecolor
The most actionable finding was the missing COLORTERM. Added to
~/.zshrc. This would tell truecolor-aware apps (including Claude Code)
that 24-bit color is available. Joe tested it.
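As a rough illustration of why the variable matters, the sketch below shows the kind of check a truecolor-aware program might perform. It is an assumption about common practice, not Claude Code's actual detection logic; `advertises_truecolor` is a hypothetical helper, and `truecolor` / `24bit` are the conventional values for COLORTERM.

```python
import os


def advertises_truecolor(env=None) -> bool:
    """True if the environment advertises 24-bit color via COLORTERM."""
    env = os.environ if env is None else env
    return env.get("COLORTERM", "").lower() in ("truecolor", "24bit")


# Before the ~/.zshrc change the variable was empty; afterwards it is set.
print(advertises_truecolor({"COLORTERM": ""}))
print(advertises_truecolor({"COLORTERM": "truecolor"}))
```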
Joe shared a screenshot of the garbled Claude Code UI: it showed diff output and chat elements overlapping vertically. Initial hypothesis: a terminal width mismatch between the PTY and the renderer.
stty size      # → 35 150
tput cols      # → 150
echo $COLUMNS  # → 150
With width ruled out, focus shifted to cursor positioning, the mechanism ink uses for redraws.
Theory: a 35-row terminal is short. When ink prints more than 35 rows, the terminal scrolls. Cursor-up can't reach into the scrollback buffer, so ink redraws at the wrong position. Designed a targeted test:
python3 -c "
import sys, time, shutil
rows = shutil.get_terminal_size().lines
for i in range(rows + 5):
    sys.stdout.write('Filler %02d\n' % i)
for i in range(5):
    sys.stdout.write('TARGET %d\n' % i)
sys.stdout.flush(); time.sleep(0.5)
sys.stdout.write('\033[5A')
for i in range(5):
    sys.stdout.write('\033[2KReplaced %d\n' % i)
sys.stdout.flush()
"
Eight more targeted tests, each isolating one possible failure mode:
| Test | What it checked | Result |
|---|---|---|
| E | Cursor-up over ANSI-colored lines | ✓ PASS |
| F | Line growing past terminal width mid-stream (simulates ink streaming) | ✓ PASS |
| G | OSC 8 hyperlinks in lines | ✓ PASS |
| H | DECSC/DECRC save-restore cursor | ✓ PASS |
| I | Node.js (not Python) stdout cursor-up | ✓ PASS |
At this point Joe ran /model and switched from Claude Sonnet 4.6
to Fable 5, Anthropic's newest model for complex, long-running tasks.
The instruction: "This investigation is not going anywhere, please think outside of the
current box and come up with a new strategy."
New strategy: capture the real session's byte stream with script -q -r, then replay and
analyze it programmatically.
script
The script command records a terminal session, every byte the PTY
emits, into a binary file (BSD script -r / -q format:
24-byte record headers with timestamps, followed by payload bytes).
script -q -r /tmp/claude-session.scr
# Inside: run claude, trigger the bug, exit, then type exit
Parsed the file in Python to extract the output stream separately from input:
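import struct

data = open('/tmp/claude-session.scr', 'rb').read()
out = bytearray()
# Each record: u64 len, u64 sec, u32 usec, u32 dir ('i'=0x69 / 'o'=0x6f)
pos = 0
while pos + 24 <= len(data):
    ln, sec, usec, d = struct.unpack_from('<QQII', data, pos)
    payload = data[pos + 24 : pos + 24 + ln]
    if d == ord('o'):  # keep only what the PTY emitted
        out += payload
    pos += 24 + ln  # advance past header and payload
open('/tmp/claude-out.bin', 'wb').write(out)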
Fed the captured output bytes into pyte, a pure-Python terminal emulator that correctly implements VT100/xterm. This told us what a standards-compliant terminal should display.
import pyte
screen = pyte.Screen(156, 35)
stream = pyte.ByteStream(screen)
stream.feed(open('/tmp/claude-out.bin','rb').read())
# Display the expected screen
for i, line in enumerate(screen.display):
    if line.strip():
        print(f"{i:2d}| {line.rstrip()}")
The separator lines were 156 ─ characters long, and column-absolute
positioning sequences reached column 156. Claude Code's PTY reported 156 columns
to it. Meanwhile Joe's ruler test showed 150 columns in the renderer, so a 6-column
discrepancy was briefly suspected as the cause.
Claude Code's startup emits several distinct bursts of output (each being a separate PTY read → rsession write → websocket frame → xterm.js render cycle). Split the captured output into 9 cumulative-prefix files, one per write record. Each file = everything Claude Code emitted up through record N.
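The prefix construction can be sketched in a few lines of Python. This is an illustrative reconstruction, not the script used in the session: the helper names and the `prefix-NN.bin` file naming are hypothetical, and the demo records are stand-ins for the nine captured ones.

```python
from itertools import accumulate
from pathlib import Path


def cumulative_prefixes(records):
    """Prefix N is the concatenation of output records 1..N."""
    return list(accumulate(records))


def write_prefixes(records, out_dir):
    """Write one file per prefix (the prefix-NN.bin naming is hypothetical)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for n, blob in enumerate(cumulative_prefixes(records), start=1):
        path = out / f"prefix-{n:02d}.bin"
        path.write_bytes(blob)
        paths.append(path)
    return paths


# Tiny stand-in for the captured output records.
demo = [b"\x1b[2J", b"banner ", b"\xe2", b"\x94\x80"]
for n, blob in enumerate(cumulative_prefixes(demo), start=1):
    print(n, len(blob))
```

Because each file contains everything up to record N, replaying them in order shows the first record at which the display diverges from the expected screen.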
Wrote a replay shell script (/tmp/replay/run.sh) that: sanitizes
the terminal with stty sane, disables stale focus-reporting and
bracketed-paste modes, cats prefix N, pauses for observation, and
advances on Enter.
Enter printed ^M (raw mode stuck on from a previous crashed session), and
focus changes printed ^[[I / ^[[O (focus events
enabled and not cleaned up). Added stty sane and
printf '\e[?1004l\e[?2004l\e[?25h\e[0m' to fix.
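For reference, the reset string breaks down as follows. The Python below only assembles and annotates the same four sequences; the names `RESET_STEPS` and `reset_sequence` are illustrative, and the stty sane part (which resets the tty line discipline, not the emulator) is not covered.

```python
# Each entry: (escape sequence, what it undoes)
RESET_STEPS = [
    ("\x1b[?1004l", "disable focus in/out reporting (stops ^[[I / ^[[O)"),
    ("\x1b[?2004l", "disable bracketed paste mode"),
    ("\x1b[?25h", "make the cursor visible again"),
    ("\x1b[0m", "reset colors and text attributes (SGR 0)"),
]


def reset_sequence() -> str:
    """Concatenate the sequences in the same order as the printf call."""
    return "".join(seq for seq, _ in RESET_STEPS)


for seq, meaning in RESET_STEPS:
    print(repr(seq), "->", meaning)
```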
Joe reported which prefixes showed the Claude Code mascot banner:
The banner toggled on and off. This was compared against what pyte predicted:
RStudio bundles xterm.js 6.0.0 (found in
src/gwt/tools/build-xterm in the cloned rstudio repo).
Installed @xterm/headless@6.0.0 and ran the same prefix sweep.
npm install --legacy-peer-deps @xterm/headless@6.0.0
node -e "
const {Terminal} = require('@xterm/headless');
// feed each prefix, check buffer for a banner glyph
..."
Cloned the RStudio source at the exact matching tag and traced how terminal output flows from PTY to screen:
git clone --depth 1 --branch v2026.04.0+526 \
  https://github.com/rstudio/rstudio \
  ~/Development/_gh/rstudio/rstudio
The pipeline:
PTY → rsession (C++) → ConsoleProcess::onStdout()
    → enqueOutputEvent()
    → s_terminalSocket.sendText() [WebSocket TEXT frame]
    → xterm.js in the browser
enqueOutputEvent()
In src/cpp/session/SessionConsoleProcess.cpp, line 645
(at tag v2026.04.0+526):
if (procInfo_->getChannelMode() == Websocket)
{
   s_terminalSocket.sendText(procInfo_->getHandle(), output);
   return; // <-- Error returned by sendText is silently discarded
}
sendText() calls websocketpp::server::send() with
opcode::text. WebSocket text frames must be valid UTF-8.
websocketpp validates outgoing text frames: if the payload isn't valid UTF-8,
the send fails and returns an error.
Raw PTY reads are arbitrary byte windows over a UTF-8 stream. A read can end
in the middle of a multi-byte character, making the chunk individually
invalid UTF-8. That's exactly what happened in our capture: a 1024-byte output
record ended with the first byte of ─ (U+2500, bytes
E2 94 80), and the next record began with the remaining two bytes.
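The failure mode can be modeled in a few lines of Python. This is a simulation under stated assumptions, not RStudio's code: `send_text` stands in for a text-frame send that rejects invalid UTF-8, and `forward_chunks` stands in for a caller that ignores the result, as in the snippet quoted above.

```python
def send_text(payload: bytes, wire: list) -> bool:
    """Stand-in for a text-frame send: refuse payloads that are not valid UTF-8."""
    try:
        payload.decode("utf-8")  # strict validation of this chunk alone
    except UnicodeDecodeError:
        return False  # the real code path returns an Error here
    wire.append(payload)
    return True


def forward_chunks(chunks):
    """Mimic a caller that discards the send result."""
    wire = []
    for chunk in chunks:
        send_text(chunk, wire)  # return value deliberately ignored
    return b"".join(wire)


# The second chunk ends with E2; the third starts with 94 80.
chunks = [b"before ", b"line \xe2", b"\x94\x80 more ", b"after"]
print(forward_chunks(chunks))
```

The concatenated stream is valid UTF-8, yet both chunks touching the split character are rejected, and each takes its surrounding ASCII with it.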
Designed a test with a split multibyte character and predicted exactly what RStudio vs. iTerm2 should show:
python3 -c "
import os, time
os.write(1, b'LINE 1: should be VISIBLE\n'); time.sleep(0.5)
os.write(1, b'LINE 2: invisible \xe2'); time.sleep(0.5)
os.write(1, b'\x94\x80 LINE 3: invisible\n'); time.sleep(0.5)
os.write(1, b'LINE 4: should be VISIBLE\n')
"
Without sleep calls, the writes coalesce into fewer PTY reads. The two invalid
chunks swallowed innocent ASCII with them.
Re-ran with explicit sleeps between each write to ensure each OS write becomes a separate PTY read chunk:
RStudio has an alternate output channel for its terminal: RPC mode, which doesn't use WebSocket text frames and therefore doesn't validate UTF-8.
Tools → Global Options → Terminal → uncheck "Connect with WebSockets"
Joe restarted the terminal and ran Claude Code again.
๏ฟฝ๏ฟฝ๏ฟฝ where a chunk boundary splits a multibyte character (the RPC path
still decodes each chunk independently, so the bytes arrive but the
character renders as three replacement chars rather than being assembled).
The ๏ฟฝ๏ฟฝ๏ฟฝ characters are the bytes of โ (U+2500,
E2 94 80) each rendered as U+FFFD. Not an em-dash โ a box-drawing
light horizontal line. The proper fix would carry incomplete trailing UTF-8 bytes
to the next chunk before sending, eliminating both the drops and the
๏ฟฝ๏ฟฝ๏ฟฝ.
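A minimal Python sketch of that carry-over idea follows. It illustrates the technique only and is not the patch itself (the real fix would live in rsession's C++); `split_complete_utf8` and `Utf8Carry` are hypothetical names. The helper holds back an unfinished trailing sequence and does not otherwise validate the data.

```python
def split_complete_utf8(buf: bytes):
    """Split buf into (sendable, carry); carry is an unfinished trailing sequence."""
    # An unfinished sequence is at most 3 bytes long, so look back that far.
    for back in range(1, min(3, len(buf)) + 1):
        byte = buf[-back]
        if byte & 0xC0 == 0x80:
            continue  # continuation byte: keep scanning for its lead byte
        if byte >= 0xF0:
            need = 4
        elif byte >= 0xE0:
            need = 3
        elif byte >= 0xC0:
            need = 2
        else:
            need = 1  # plain ASCII
        if need > back:
            return buf[:-back], buf[-back:]  # sequence is cut off: hold it back
        break
    return buf, b""


class Utf8Carry:
    """Prepends the held-back bytes to the next chunk before splitting again."""

    def __init__(self):
        self._carry = b""

    def feed(self, chunk: bytes) -> bytes:
        sendable, self._carry = split_complete_utf8(self._carry + chunk)
        return sendable


carry = Utf8Carry()
for piece in (b"line \xe2", b"\x94\x80 more"):
    sent = carry.feed(piece)
    print(sent, sent.decode("utf-8"))  # every piece now decodes on its own
```

With this in front of the send, the E2 byte waits for its two continuation bytes, so neither transport sees a chunk that ends mid-character.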
Drafted and filed a detailed bug report with minimal repro, root-cause analysis, code references, and suggested fix.
gh issue create -R rstudio/rstudio \
  --title "Terminal silently drops output chunks with invalid UTF-8 (WebSockets)" \
  --body-file issue-body.md
Terminal silently drops entire output chunks containing invalid UTF-8 (split multibyte characters) when using WebSockets
SessionConsoleProcess.cpp::enqueOutputEvent(),
line 645: the call to s_terminalSocket.sendText()
Error is silently discarded; the entire chunk (~1KB) is lost.
sendText errors instead of silently discarding.
๏ฟฝ๏ฟฝ๏ฟฝ artifacts instead.
1. Routing around the broken medium.
When the terminal itself was too garbled to read responses, using a plain
assistant.txt file as the communication channel let the investigation
continue without switching tools.
2. The model switch that unstuck things. Switching to Fable 5 mid-investigation introduced the critical insight: synthetic tests will always pass because they don't reproduce the real failure conditions. Capture the actual byte stream instead.
3. script -q -r as a byte-perfect capture tool.
Recording a real session and parsing the binary format separated "what bytes were
emitted" from "what the terminal did with them."
4. Reference emulator comparison.
Feeding the same bytes into pyte (Python) and @xterm/headless (the
same xterm.js version RStudio bundles) proved the data was correct before
looking at the transport.
5. Reading the source.
Once we knew the fault was in the transport layer, cloning the exact RStudio
version and reading enqueOutputEvent() took about 10 minutes to
find the discarded error and connect it to WebSocket text frame validation.
Investigated June 11, 2026 · Claude Sonnet 4.6 + Fable 5 · Bug filed as rstudio/rstudio #17941