281 caption switcher fix #282

Closed
maboa wants to merge 4 commits into main from 281-caption-switcher-fix


@maboa maboa commented May 22, 2026

Summary

Three related caption-editor bugs surfaced while testing the Deepgram flow end-to-end.

1. Stale captions after a new transcript (9e1f849)

populateCaptionEditor short-circuits on a non-null captionCache, which is the right behavior for preserving user caption edits across transcript edits — but
the cache was never invalidated when a fresh transcript was produced. After running Deepgram or Whisper, or importing JSON/SRT/VTT, switching to the caption
page showed captions from the previous transcript.

Fix: invalidate captionCache on the hyperaudioInit event, which all three "new transcript" code paths already dispatch. The localStorage load path manages
its own cache (it loads captions alongside the transcript), so hyperaudioTranscriptLoaded is intentionally not used as a trigger. The Regenerate button is
unchanged.
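
The invalidation described above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `bus` is a hypothetical stand-in for whatever target the app dispatches `hyperaudioInit` on (in the browser, probably `document`), `buildCaptions` is a hypothetical placeholder for the real caption generation, and the `transcript` parameter is assumed for the sake of the example.

```javascript
// Hypothetical stand-in for the real event target; a plain EventTarget is
// used so the sketch also runs outside a page.
const bus = new EventTarget();

// Assumed shape: the cached caption editor HTML, or null when nothing is cached.
let captionCache = null;

// Hypothetical placeholder for the real caption generation.
function buildCaptions(transcript) {
  return `<captions for="${transcript}">`;
}

// Mirrors the short-circuit described above: reuse the cache whenever it is set.
function populateCaptionEditor(transcript) {
  if (captionCache !== null) {
    return captionCache;
  }
  captionCache = buildCaptions(transcript);
  return captionCache;
}

// The fix: any "new transcript" path dispatches hyperaudioInit, which drops the cache.
bus.addEventListener("hyperaudioInit", () => {
  captionCache = null;
});

const first = populateCaptionEditor("transcript A");
const stale = populateCaptionEditor("transcript B"); // cache hit: still A's captions
bus.dispatchEvent(new Event("hyperaudioInit"));
const fresh = populateCaptionEditor("transcript B"); // cache was cleared, so rebuilt
```

Without the listener, the second call keeps returning the first transcript's captions, which is the stale-caption behavior reported above.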

2. Transcribing from caption mode produced no visible loader and no result (1a4a7c7)

Both getData() (Deepgram) and handleFormSubmission() (Whisper) write the "Transcribing…" loader and the final transcript to #hypertranscript. While the
user is in caption mode, #hypertranscript is detached from the DOM — so the loader never showed and the result never landed.

Fix: programmatically click #transcript-editor-btn at the start of each transcribe flow. The button is disabled whenever the user is already in transcript
view (initial HTML state + set explicitly on switch-back), so the click is a no-op in transcript mode and a clean mode-switch in caption mode.
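
Why the programmatic click is safe in both modes can be modeled as below. This is a sketch under stated assumptions: `makeButton` is a hypothetical model of a button whose `click()` does nothing while `disabled` is true (as `HTMLElement.click()` does for disabled form controls), and `ensureTranscriptView`, `mode`, and `switches` are illustrative names that do not come from the PR.

```javascript
// Minimal model of a button: click() is ignored while the button is disabled.
function makeButton(onClick) {
  return {
    disabled: false,
    click() {
      if (this.disabled) return;
      onClick();
    },
  };
}

let mode = "caption"; // the user starts in caption view
let switches = 0;     // counts real mode switches

// Models #transcript-editor-btn: switching to transcript view disables the button.
const transcriptEditorBtn = makeButton(() => {
  mode = "transcript";
  switches += 1;
  transcriptEditorBtn.disabled = true;
});

// Hypothetical helper called at the start of each transcribe flow.
function ensureTranscriptView() {
  transcriptEditorBtn.click();
}

ensureTranscriptView(); // caption mode: performs the switch
ensureTranscriptView(); // transcript mode: button is disabled, so nothing happens
```

The design relies on the disabled state as the "already in transcript view" flag, so no separate mode check is needed before clicking.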

3. Short-caption stop times were too early (aab08a2)

In caption-modified.js, the short-caption path (segment.chars < maxLineLength) used segment.start + segment.duration for the caption's stop time, where
segment.duration is the sum of word durations. When a segment contained silent gaps between words, the sum was shorter than the wall-clock span, and the
caption text visibly outran its time range — e.g., "So group one is the first one." with a real end at 5.04s was emitted with an Out of 3.28s.

Fix: use the last word's actual end (lastWord.start + lastWord.duration) for the stop time, with fallbacks for missing word durations. The long-segment path
already used per-word end times via lastOutTime and was unaffected.
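
The difference between the two stop-time calculations can be illustrated with a small sketch. It is not the code in caption-modified.js: `shortCaptionStop` and the `segment.words` array are hypothetical names, and the two fallbacks shown are assumptions, since the description does not spell out how missing word durations are handled.

```javascript
// Assumed data shape: segment.words is a hypothetical array of { start, duration }.
function shortCaptionStop(segment) {
  const words = segment.words ?? [];
  const lastWord = words[words.length - 1];

  // Assumed fallback: no usable last word, keep the old sum-based estimate.
  if (!lastWord || !Number.isFinite(lastWord.start)) {
    return segment.start + segment.duration;
  }
  // Assumed fallback: the last word has no usable duration, stop at its start.
  if (!Number.isFinite(lastWord.duration)) {
    return lastWord.start;
  }
  // The fix: use the last word's actual end.
  return lastWord.start + lastWord.duration;
}

// Two words with a silent gap between them: 0 to 0.64s, then 1.12 to 1.6s.
const words = [
  { start: 0, duration: 0.64 },
  { start: 1.12, duration: 0.48 },
];
const segment = {
  start: 0,
  // segment.duration is the sum of word durations, not the wall-clock span.
  duration: words.reduce((sum, w) => sum + w.duration, 0),
  words,
};

const oldStop = segment.start + segment.duration; // about 1.12: ends before the last word does
const newStop = shortCaptionStop(segment);        // about 1.6: the last word's real end
```

The 0.48s silent gap is exactly the amount by which the old stop time falls short.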

This bug was likely latent before #277, which raised the paragraph-break gap from 0.5s to 4.0s. Until then, short paragraphs rarely contained large internal
silent gaps, so sum-of-durations ≈ span.

Test plan

  • Transcribe a file → switch to caption view → confirm captions reflect the new transcript.
  • Edit a caption in caption view → switch to transcript view → switch back → confirm caption edits persist (cache not over-invalidated).
  • Click Regenerate → confirm captions rebuild from current transcript and discard caption edits.
  • In caption view, start a new transcription (Deepgram and Whisper) → confirm the view switches to transcript and the "Transcribing…" loader is visible →
    confirm the resulting transcript renders.
  • Already in transcript view, start a transcription → confirm behavior unchanged (loader visible, result lands).
  • Load a session from localStorage that contains captions → confirm captions are not wiped on next tab switch.
  • Transcribe content with notable silent gaps within a paragraph → switch to caption view → confirm each caption's Out time matches the actual last-word end
    (not the truncated sum-of-durations).

maboa added 4 commits May 22, 2026 18:23
populateCaptionEditor short-circuits and reuses the cached caption
editor HTML whenever captionCache is non-null, which is the right
behavior for preserving caption edits across transcript edits. The
bug was that the cache was never invalidated when a fresh
transcript was created (Deepgram/Whisper transcription, JSON/SRT/
VTT import), so switching to the caption page after transcribing
a new file showed captions from the previous transcript.

Listen for hyperaudioInit — which all three new-transcript paths
already dispatch — and null captionCache. The localStorage load
path keeps its own explicit invalidation since it loads its own
captions, and the Regenerate button is unchanged.

Both Deepgram getData() and Whisper handleFormSubmission() write
the "Transcribing…" loader to #hypertranscript and later write the
final transcript to the same element. While the user is in caption
mode, #hypertranscript is detached from the DOM, so the loader
never shows and the result never lands.

Programmatically click #transcript-editor-btn at the start of each
transcribe flow. The button is disabled when the user is already
in transcript view (set in index.html and on the transcript-editor
click handler), so this is a no-op in transcript mode and switches
back in caption mode.

The short-caption path (segment.chars < maxLineLength) computed
its stop time as segment.start + segment.duration, but
segment.duration is the sum of word durations — not the
wall-clock span. When a segment contained silent gaps between
words (e.g., 0–0.64s, then 1.12–1.6s, etc.), the stop time fell
short of where the last word actually ends, and the caption text
visibly outran its time range.

Compute the stop time from the last word's start + duration
instead, with fallbacks for missing word durations.

The long-segment path already used per-word end times via
lastOutTime and is unaffected.

Restore the descriptive comments and surrounding whitespace in
populateCaptionEditor that were inadvertently removed alongside
the cache-invalidation fix in 9e1f849. Leaves out the commented-
out duplicate insertAdjacentElement line that was true dead code.
@maboa maboa closed this May 22, 2026
