Getting Started
Cuebird is a real-time AI copilot that streams intelligent suggestions as the other person speaks — not after. It works for job interviews, sales calls, negotiations, customer support, language practice, and live translation.
1. Create an account
Visit the landing page and click Sign In → Create account. You can sign up with email/password or continue with Google. After creating your account, you'll be taken to the Dashboard.
2. Open the Interview Copilot
From the Dashboard, click Interview Copilot. This opens the main session interface with a left sidebar for configuration and a right panel for AI suggestion output.
3. Set your context
In the left sidebar, enter your target Role and Company. Paste the job description or drop in a job URL to auto-fill it. This context is injected into every AI prompt so suggestions are highly relevant.
4. Choose your audio source
Click one of the capture buttons:
- Mic — captures your microphone only (useful if the interviewer's audio comes through speakers)
- Screen — captures system audio from your screen share (picks up everything in the video call)
- Screen + Mic — both simultaneously, merged into one stream
The status indicator in the top bar turns green when audio is flowing. Transcription starts immediately.
5. Get suggestions
As the interviewer speaks, their words appear in the transcript panel. When a question or prompt is detected, the AI suggestion panel begins streaming the response token-by-token. The first words typically appear within 1–2 seconds of the interviewer finishing their sentence.
How it Works
Understanding the full pipeline helps you get the most out of Cuebird and troubleshoot if something isn't working.
The real-time pipeline
Your browser captures raw PCM audio from the chosen source (mic, screen, or both). An AudioWorklet processes the stream at 16kHz mono and sends 250ms chunks to the server over WebSocket.
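The stated stream format implies fixed chunk sizes. A quick sketch of the arithmetic, assuming 16-bit samples (the docs specify only the 16kHz rate and mono channel, so the sample width here is an assumption):

```python
# Illustrative chunk-size math for the stream format described above:
# 16 kHz mono PCM, sent in 250 ms chunks.
SAMPLE_RATE_HZ = 16_000
CHUNK_MS = 250
BYTES_PER_SAMPLE = 2  # assumed: 16-bit PCM

samples_per_chunk = SAMPLE_RATE_HZ * CHUNK_MS // 1000   # 4000 samples
bytes_per_chunk = samples_per_chunk * BYTES_PER_SAMPLE  # 8000 bytes
chunks_per_second = 1000 // CHUNK_MS                    # 4 chunks per second

print(samples_per_chunk, bytes_per_chunk, chunks_per_second)
```

So the browser uploads roughly 32 KB of raw audio per second, four WebSocket messages at a time.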
The raw audio arrives at Cuebird's Flask backend and is forwarded to the configured STT provider (AssemblyAI, Deepgram, or Google Cloud Speech). The provider returns partial and final transcription events in real time.
The backend maintains separate stable, partial, and live text for each speaker. When a turn finalizes, it's added to the conversation history (up to 8 recent turns).
On each new turn, the suggestion engine debounces (0ms for finals, 250ms for partials), builds a prompt from your context + conversation history, and streams a completion from OpenAI. Completed sentences are extracted for TTS.
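The turn buffer and debounce policy above can be sketched in a few lines. Only the 8-turn cap and the 0ms/250ms debounce values come from the docs; the function and field names are illustrative:

```python
from collections import deque

# Sketch of the conversation buffer and debounce policy described above.
MAX_TURNS = 8
DEBOUNCE_MS = {"final": 0, "partial": 250}

history = deque(maxlen=MAX_TURNS)  # oldest turns fall off automatically

def on_turn(speaker: str, text: str, kind: str) -> int:
    """Record a finalized turn and return the debounce delay (ms)
    to wait before requesting a suggestion."""
    if kind == "final":
        history.append((speaker, text))
    return DEBOUNCE_MS[kind]

# Nine finals: only the most recent eight turns are kept.
for i in range(9):
    on_turn("A", f"turn {i}", "final")
print(len(history), on_turn("A", "still speaking", "partial"))
```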
The streamed response is broadcast to your browser, the Electron overlay (if connected), and any viewer WebSocket clients. TTS audio is queued sentence-by-sentence to minimize latency.
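Sentence-by-sentence TTS queueing depends on pulling only *completed* sentences out of the token stream while holding back the trailing fragment. A minimal splitter of that kind, as an illustrative stand-in rather than Cuebird's actual implementation:

```python
import re

def extract_complete_sentences(buffer: str) -> tuple[list[str], str]:
    """Return (completed sentences, unfinished remainder).
    The remainder stays buffered until more tokens arrive."""
    parts = re.split(r"(?<=[.!?])\s+", buffer)
    if buffer and buffer[-1] in ".!?":
        return [p for p in parts if p], ""
    return [p for p in parts[:-1] if p], parts[-1]

done, rest = extract_complete_sentences("First point. Second point. And finall")
print(done, repr(rest))
```

Here the first two sentences would be queued for TTS immediately, while "And finall" waits for the stream to finish the word.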
WebSocket session
The entire session runs over a single persistent WebSocket connection (GET /ws). The connection carries audio upload, transcript events, suggestion chunks, and control messages. If the connection drops, the UI shows a disconnection indicator and the session state is preserved in memory for a brief window. Close the tab to end the session and discard all in-memory data.
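Since one connection multiplexes audio, transcripts, suggestion chunks, and control messages, each message needs a type tag. A hedged sketch of that envelope pattern (the "type" values and field names are assumptions for illustration; the docs only state which kinds of traffic share the connection):

```python
import json

def make_event(kind: str, payload: dict) -> str:
    # Illustrative event kinds; not Cuebird's actual wire format.
    assert kind in {"audio", "transcript", "suggestion", "control"}
    return json.dumps({"type": kind, **payload})

def dispatch(raw: str, handlers: dict) -> None:
    event = json.loads(raw)
    handlers[event["type"]](event)

seen = []
handlers = {"transcript": lambda e: seen.append(e["text"]),
            "suggestion": lambda e: seen.append(e["chunk"])}
dispatch(make_event("transcript", {"text": "Tell me about yourself."}), handlers)
dispatch(make_event("suggestion", {"chunk": "I'd start with..."}), handlers)
print(seen)
```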
Speaker detection
When using AssemblyAI Standard backend, speaker diarization is enabled. Each transcript word is tagged with a speaker label (Speaker A, Speaker B, etc.). In the Context panel, use the Speaker dropdown to tell Cuebird which speaker label is you. Cuebird then knows to generate suggestions only in response to the other speaker's words.
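The gating this enables is simple: once you've told Cuebird your label, only turns from other speakers should trigger suggestions. A minimal sketch (the label format follows the docs; the function itself is illustrative):

```python
def should_suggest(turn_speaker: str, my_label: str) -> bool:
    """Generate a suggestion only for the other speaker's turns."""
    return turn_speaker != my_label

turns = [("A", "Tell me about a conflict you resolved."),
         ("B", "Sure, at my last job..."),
         ("A", "How did it end?")]
# With "B" selected as you, only Speaker A's turns trigger suggestions.
triggers = [text for spk, text in turns if should_suggest(spk, my_label="B")]
print(len(triggers))
```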
Interview Copilot
The Interview Copilot page at /interview is the primary tool for job interviews. It combines real-time AI suggestions with context-loading, interview mode controls, and session analytics.
Left sidebar
- Audio capture buttons
- Mic / Screen / Screen+Mic / Stop. Clicking a capture button starts audio. The status indicator in the top bar shows green when the connection is live.
- Job URL auto-fill
- Paste a job posting URL and click the auto-fill button. Cuebird scrapes the page and extracts the company name, role title, and job description automatically.
- Role & Company
- Shown in every AI prompt. The more specific, the more relevant the suggestions.
- Job Description
- Pasted or auto-filled JD. Excerpts are included in the prompt so the AI can reference required skills and responsibilities.
- Interview Mode
- Switches the behavioral framing of the AI. Options: General · Behavioral · Technical · System Design · Coding. Behavioral strongly favors STAR-format answers; Technical emphasizes code, architecture, or data answers; Coding adds pseudocode-friendly formatting.
- Response Style
- Controls the surface format: Standard · Bullets · Short · STAR · Smart Q's. Smart Q's makes the AI suggest a clarifying follow-up question instead of a direct answer — useful when the question is ambiguous.
- Tone & Seniority
- Professional / Casual / Enthusiastic tone and Intern through Director+ seniority level. These affect word choice and depth.
More options (collapsible)
- Notes
- Free-form notes injected into the prompt. Use this for key points you want the AI to reference.
- AI Model
- Select from GPT-4o mini (default), GPT-4o, GPT-4.1 mini, GPT-4.1, o3-mini, o4-mini. Heavier models are slower but more thorough.
- Q&A Prep pairs
- Add specific question–answer pairs. When the AI detects a close match to a prep question, it incorporates your prepared answer. Add pairs with the + Add Pair button.
- Active Stories
- Shows how many Knowledge Base stories are currently injected. Click the indicator to open the Knowledge Base and manage your stories.
- Resume
- Upload or select a saved resume. The AI reads your background and uses it to personalize suggestions.
- Speaker
- Select your speaker label from the diarization output.
Main panel controls
- Manual prompt
- Type a question into the input at the bottom and press Enter or click Send. This injects the text directly into the suggestion engine as if the interviewer said it — useful for testing or when audio capture isn't working.
- Suggest button
- Forces the AI to generate a suggestion from the most recent interviewer turn, even if no new transcript has arrived. Use this when the AI skipped a turn or you want to regenerate.
- Image attach
- Drag-and-drop or paste a screenshot into the prompt area. The image is sent as a base64 data URL to the AI alongside the text question — useful for whiteboard problems or UI screenshots.
- Scorecard
- Opens a quick in-session scorecard generated from the current transcript turns. Shows a score and brief feedback for each dimension.
- Generate question
- Generates a single practice question based on your session context (role, JD, mode).
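The image-attach flow above can be sketched as a small helper that wraps the screenshot bytes in a base64 data URL, which is what gets sent alongside the text question. The MIME handling and function name are illustrative assumptions:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Package raw image bytes as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

url = to_data_url(b"\x89PNG...")  # placeholder bytes, not a real image
print(url.startswith("data:image/png;base64,"))
```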
Audio Setup
Getting the right audio source is the most critical setup step. Here's how each option works and when to use it.
Microphone capture
Captures audio from your device's microphone only. Use this when:
- The interviewer's audio is loud enough to be picked up by your mic (in-person, or via speakers near your mic)
- You want to minimize system latency
The browser will prompt you to grant microphone permission on first use.
Screen audio capture
Captures system audio from your screen. When you click this button, a browser dialog asks which screen or window to share — select your screen and check the Share audio checkbox. This captures everything the interviewer says through your video call software.
Screen + Mic (combined)
Captures both system audio and your microphone simultaneously, merging them into a single mixed stream. This is the recommended mode for full two-way transcription with speaker diarization.
Tab-specific capture (Chrome extension)
The Chrome extension enables capture of a specific browser tab's audio without displaying any share banners or dialogs. When the extension is installed, a Tab capture button appears. Clicking it opens a tab picker where you can select which tab to capture.
This is useful when you want to avoid the screen-share overlay visible to the interviewer.
STT backend selection
The STT backend determines transcription speed and accuracy. Four backends are available: AssemblyAI Fast · AssemblyAI Standard · Deepgram Nova-3 · Google Cloud Speech.
The Interview Copilot page uses AssemblyAI Fast by default. The General app (/) lets you switch backends mid-session.
AI Suggestions
The suggestion feed is the heart of Cuebird. Here's how to read it and get the most from it.
Reading the feed
The suggestion panel on the right shows streamed AI responses. Each bubble represents one suggestion, showing:
- The trigger transcript (what the interviewer said that prompted this suggestion)
- The AI response, rendered as it streams token-by-token
- A timestamp and turn sequence number
Suggestion gating (__SKIP__)
The AI can return __SKIP__ when it determines no suggestion is needed — for example, when the other person is just acknowledging something or the conversation doesn't require a response. The __SKIP__ token is filtered from the UI and no bubble is shown.
Force Suggest
If the AI skips a turn you wanted a suggestion for, click the Suggest button. This bypasses the gating logic and forces a response from the most recent interviewer turn. Force Suggest also suppresses the "don't repeat" instruction, so the AI is free to answer the same topic again.
Manual prompt injection
Use the input box at the bottom of the suggestion panel to type a question directly and press Send. This is equivalent to the interviewer saying those words. You can also attach an image by dragging a file or pasting a screenshot (Ctrl+V) into the input.
Suggestion controls
- Suggestions on/off
- Toggle whether new suggestions are generated. Does not clear the existing feed.
- TTS toggle
- Enables/disables audio playback of suggestions as they stream.
- Font size
- Cycles through three sizes for readability.
- Clear feed
- Clears all suggestion bubbles from the current session view.
TTS Playback
Cuebird can read suggestions aloud as they stream, using one of three TTS providers.
Providers
Three TTS providers are supported: OpenAI · ElevenLabs · Smallest.AI. Select None to disable spoken playback.
Speed control
Adjust TTS playback speed from 0.75× (slower, clearer) to 3.0× (very fast). The default is 1.0×. Speed is applied per-sentence as audio is generated.
Output language
Set a different output language to have the AI translate suggestions into another language before speaking them. This is independent of the session's input language and the AI's suggestion language. Useful for real-time spoken translation.
Input audio toggle
When enabled, Cuebird plays back the transcribed input speech (in the input language) before speaking the AI's suggestion. This is used in Translation Mode so you hear the original, then the translation.
Knowledge Base
The Knowledge Base at /knowledge-base is your personal library of STAR stories and Q&A pairs. Stories you activate are automatically injected into every interview session's AI context.
What is a Story?
A story is a structured answer — typically in STAR format (Situation, Task, Action, Result) — that you've written in advance. When the AI detects a behavioral question that relates to a story you've saved, it can incorporate the story naturally into its suggestion.
Creating a story
- Open Knowledge Base from the Dashboard sidebar.
- Click + New Story.
- Enter a title (e.g., "Led a high-stakes product launch under pressure").
- Write your story in the text area, or click AI Generate to have Cuebird draft a full STAR story from the title alone.
- Review, edit, and save.
Activating stories
Each story has an Active toggle. Only active stories are injected into interview sessions. Activate the 3–5 most relevant stories for each job application. The Interview Copilot sidebar shows an Active Stories indicator with the current count.
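Conceptually, the injection step filters your library down to active stories and concatenates them into the prompt context. A sketch under assumed field names (the real record structure isn't documented here):

```python
stories = [
    {"title": "Led a high-stakes launch", "active": True,  "body": "S/T/A/R..."},
    {"title": "Debugged a prod outage",   "active": False, "body": "S/T/A/R..."},
]

def active_story_context(stories: list[dict]) -> str:
    """Join only active stories into one prompt-context block."""
    blocks = [f"{s['title']}:\n{s['body']}" for s in stories if s["active"]]
    return "\n\n".join(blocks)

# The Active Stories indicator would show this count.
print(len([s for s in stories if s["active"]]))
```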
AI story generation
Click the AI Generate button on any story editor. Cuebird will create a complete, coherent STAR story from just your title. The generated story uses placeholder metrics — replace them with your real numbers and specifics for maximum impact.
Job Tracker
The Job Tracker at /jobs organizes your job applications in one place and connects them to interview sessions.
Creating a job
Click + New Job in the left list pane. Fill in the job title, company, and status. Paste the job description into the JD field, or paste the job URL and use Auto-fill from URL to have Cuebird scrape and parse the posting automatically.
Job fields
- Title & Company
- Used to pre-fill the Interview Copilot context when you launch from this job.
- Job Description
- Excerpts are included in AI prompts. The more complete the JD, the more relevant the suggestions.
- Resume
- Attach a specific saved resume to this job. It will be used automatically when you open an interview session from this listing.
- Status & Stage
- Track where you are in the process: Applied → Phone Screen → Technical → Offer → Closed.
Session transcripts & reports
After each interview session, you can generate a Post-session Report from the transcript. The report includes a summary, overall score, key strengths, improvement areas, and a breakdown by question. Reports are stored with the job listing for future reference.
Plan limits
- Free: up to 10 job listings
- Pro: up to 50 job listings
Mock Interview
Mock Interview mode lets you practice with AI-generated questions before the real interview. Questions are tailored to your target role and job description, and each answer is AI-evaluated.
Starting a mock session
- Open Mock Interview from the Dashboard.
- Select an interview domain (behavioral, software engineering, data analytics, product management, finance, consulting, system design, or general).
- Choose how many questions (1–10).
- Click Generate Questions. The AI tailors questions to your session context (role, JD, profile) if provided.
Answering questions
For each question, type or speak your answer. When you're done, click Evaluate. The AI evaluates your answer and returns:
- Score (1–10 overall)
- Relevance — how well you addressed what was actually asked
- Clarity — how clear and organized the response is
- Depth — amount of insight, specificity, and evidence provided
- Feedback — concrete improvement suggestions
- Strengths — what you did well
Supported domains
Behavioral · Software Engineering · Data Analytics · Product Management · Finance · Consulting · System Design · General
Resume Upload
Uploading your resume gives the AI real context about your background, making suggestions specific to your actual experience rather than generic examples.
Supported formats
PDF, DOCX, and TXT. Files are parsed server-side (PDF via pypdf, DOCX via python-docx) and the extracted text is stored. Maximum file size: 5 MB.
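The upload checks above (allowed extensions, 5 MB cap, format-specific parsing) can be sketched as a small dispatcher. Only the TXT branch is implemented here with the standard library; in the real pipeline PDF goes through pypdf and DOCX through python-docx, per the docs:

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024
ALLOWED = {".pdf", ".docx", ".txt"}

def validate_resume(name: str, data: bytes) -> str:
    """Check format and size, then extract text (TXT only in this sketch)."""
    ext = Path(name).suffix.lower()
    if ext not in ALLOWED:
        raise ValueError(f"unsupported format: {ext}")
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds 5 MB limit")
    if ext == ".txt":
        return data.decode("utf-8", errors="replace")
    return f"<parse {ext} server-side>"  # placeholder for pypdf / python-docx

print(validate_resume("resume.txt", b"10 years of Python"))
```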
Uploading a resume
In the Interview Copilot sidebar (under More Options), click Upload Resume or drag a file onto the upload area. The resume is parsed and saved to your account. You can save multiple resumes and switch between them per job listing.
Plan limits
- Free: 1 saved resume
- Pro: up to 10 saved resumes
Session Analytics
Cuebird provides two types of AI-generated analytics from your session transcript: in-session scorecards and post-session reports.
Quick scorecard
Click the Scorecard button in the Interview Copilot panel at any point during a session. Cuebird sends the current transcript turns to the AI and returns a rapid scorecard covering:
- Overall score (1–10)
- Communication clarity
- Answer structure (STAR adherence)
- Relevance to the question asked
- Brief feedback and suggestions
Post-session report
After a session ends, open the Job Tracker, find the linked session, and click Generate Report. The full transcript is analyzed and produces a comprehensive report with:
- Executive summary of the session
- Turn-by-turn question and answer review
- Aggregate scores across all dimensions
- Top 3 strengths
- Top 3 areas for improvement
- Recommended follow-up preparation
Provider usage dashboard
The Usage panel (accessible from the top bar in the General app) shows your quota and session cost across all AI providers: AssemblyAI hours, ElevenLabs characters, Deepgram requests, and OpenAI tokens. Session cost is calculated in real time using the current pricing table for each model.
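The cost mechanism is usage multiplied by a per-unit price, summed across providers. A hedged sketch of that calculation; every price below is a made-up placeholder, not Cuebird's actual rate:

```python
# Placeholder pricing table: $/hour, $/token, $/character respectively.
PRICING = {
    "assemblyai_hours": 0.50,
    "openai_tokens": 0.15 / 1e6,
    "elevenlabs_chars": 0.30 / 1e3,
}

def session_cost(usage: dict) -> float:
    """Sum usage x unit price across all providers in the table."""
    return round(sum(PRICING[k] * v for k, v in usage.items()), 4)

print(session_cost({"assemblyai_hours": 1.0,
                    "openai_tokens": 20_000,
                    "elevenlabs_chars": 5_000}))
```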
Settings & Options
Configuration options are available in the Interview Copilot sidebar and the General app (/). Settings are applied per-session and are not persisted across sessions.
Audio settings
- STT backend
- AssemblyAI Fast · AssemblyAI Standard · Deepgram Nova-3 · Google Cloud Speech
- Input language
- The language being spoken. Used by the STT provider for transcription accuracy.
- Google chunk size
- For Google Cloud Speech, the audio chunk sent per request: 3s / 5s / 8s / 12s / 20s. Smaller chunks have lower latency; larger chunks improve accuracy for complex speech.
TTS settings
- TTS provider
- OpenAI · ElevenLabs · Smallest.AI · None
- Voice
- Provider-specific voice selection. OpenAI voices: alloy, echo, fable, onyx, nova, shimmer.
- Speed
- 0.75× to 3.0×. Applied to each sentence as it's generated.
- Response language
- Language in which TTS speaks. Can differ from input language for translation workflows.
- Input audio
- Play back the original transcribed speech before the AI's suggestion. Used in Translation Mode.
AI settings
- AI model
- GPT-4o mini · GPT-4o · GPT-4.1 mini · GPT-4.1 · o3-mini · o4-mini
- Suggestions on/off
- Disables automatic suggestion generation. You can still trigger suggestions manually via the Suggest button.
Language Support
Cuebird supports over 12 languages for speech-to-text and can generate AI suggestions in any language the model supports.
Supported STT languages
Chinese, Japanese, Korean, and Portuguese automatically route to Google Cloud Speech for higher accuracy.
STT backend auto-selection
When you select Chinese, Japanese, Korean, or Portuguese as the input language, Cuebird automatically switches to the Google Cloud Speech backend regardless of your manual backend selection. This ensures the best transcription accuracy for those language families.
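This routing rule reduces to a simple override. A sketch (the backend identifiers and ISO-style language codes are assumptions for illustration):

```python
# Input languages that force the Google Cloud Speech backend,
# per the routing rule described above.
GOOGLE_ROUTED = {"zh", "ja", "ko", "pt"}  # Chinese, Japanese, Korean, Portuguese

def effective_backend(selected: str, input_language: str) -> str:
    """Override the manually selected STT backend for routed languages."""
    if input_language in GOOGLE_ROUTED:
        return "google-cloud-speech"
    return selected

print(effective_backend("assemblyai-fast", "ja"),
      effective_backend("assemblyai-fast", "en"))
```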
AI suggestion language
AI suggestions are generated in whatever language you're conversing in by default. To force a specific output language, set the Response Language in the TTS settings. This instructs both the AI and the TTS engine to use that language.
AI Models
Cuebird supports multiple OpenAI models: GPT-4o mini (default) · GPT-4o · GPT-4.1 mini · GPT-4.1 · o3-mini · o4-mini. You can switch models mid-session; the change takes effect on the next suggestion.
Desktop Overlay
The Electron overlay is a transparent, always-on-top window that displays suggestions directly on your screen — without switching apps, and without appearing in screen-share captures.
How it works
The overlay connects to your running Cuebird session as a viewer via a local WebSocket at ws://localhost:3001/ws?viewer=1. Suggestions streamed to your main session are also broadcast to the overlay automatically. The overlay has no audio capture of its own — all STT and AI processing happens in the web app.
Installation
- Clone the Cuebird repository and run npm install in electron-overlay/.
- Start the Flask backend and Vite frontend (npm run dev).
- Run cd electron-overlay && npm start to launch the overlay.
The overlay window appears as a small 440×700 px frameless window with rounded corners and a translucent dark background. You can drag it to any screen position.
Overlay controls
- Manual Prompt
- Type a question in the overlay's input and press Enter to inject it directly into the active session — the same as using the web app's manual prompt.
- Suggest button
- Triggers a Force Suggest from the overlay — generates a suggestion from the most recent interviewer turn.
- Opacity slider
- Adjusts the overlay window transparency (20% – 100%).
- Close / Minimize
- Standard window controls accessible via the overlay UI.
Screen-share invisibility
On Windows, the Electron window is created with setContentProtection(true), which prevents it from being captured by screen-sharing software. The window renders normally on your physical display but appears as a black rectangle or is entirely invisible in recordings and screen shares.
Chrome Extension
The Chrome extension provides two capabilities: silent tab audio capture and suggestion feed relay to the Electron overlay.
Tab capture
The extension's background.js uses the chrome.tabCapture API to capture audio from a specific tab without displaying any screen-sharing banners or popups. When you click the Tab capture button in the Interview Copilot, a tab picker opens. Select the tab containing your video call. Audio from that tab flows directly into the STT pipeline.
Overlay relay
The extension's content.js observes the suggestion feed DOM in the Cuebird web app and relays suggestion events (add, update, remove) to the Electron overlay via a local WebSocket server at 127.0.0.1:3334. This allows the overlay to show suggestions even when the browser window is minimized or behind other windows.
Installation
- In Chrome, go to chrome://extensions.
- Enable Developer mode (toggle in the top right).
- Click Load unpacked and select the extension/ folder from the Cuebird repository.
- The Cuebird extension icon appears in the Chrome toolbar.
Account & Billing
Manage your account, plan, and billing from the Account page at /account.
Free plan
- Interview & Conversation agent modes
- Real-time AI suggestions (GPT-4o mini)
- 1 saved resume
- Up to 10 job listings
- Knowledge Base (unlimited stories)
Pro plan (coming soon)
- All six agent modes
- Unlimited STT minutes
- TTS playback (OpenAI, ElevenLabs, Smallest.AI)
- All AI models (GPT-4o, GPT-4.1, o4-mini)
- Up to 10 saved resumes
- Up to 50 job listings
- Mock interview mode
- Session scorecards & reports
- Desktop overlay
- Multi-language STT
Billing
Cuebird uses Stripe for payment processing. Manage your subscription, view invoices, and update payment methods from the Billing button on the Account page. This opens the Stripe Customer Portal in a new tab.
Authentication
Cuebird uses Supabase for authentication. Supported methods: email/password and Google OAuth. Passwords are never stored by Cuebird — all credential handling is done by Supabase. You can reset your password via the "Forgot password" flow in the sign-in modal.
Privacy & Data
Cuebird is designed to be privacy-preserving by default. Here's exactly what is and isn't stored.
What is NOT stored
- Audio — Raw audio is streamed directly to the STT provider (AssemblyAI, Deepgram, or Google). Cuebird's servers never store audio files.
- Transcripts — Live transcripts exist only in server memory for the duration of the WebSocket session. Closing the tab discards all transcript data.
- AI suggestions — Suggestions are generated on-demand and discarded when the session ends.
What IS stored
- Your profile — Email, name, plan status (stored in Supabase).
- Resumes — Uploaded resume text (parsed from PDF/DOCX) is stored in Supabase and accessible only to your account.
- Knowledge Base stories — Your STAR stories are stored in Supabase.
- Job listings — Job titles, companies, JD text, status, and any generated session reports are stored in Supabase.
Third-party providers
Cuebird passes audio to STT providers and text to OpenAI for AI completions. Each provider's privacy policy governs how they handle that data. Cuebird does not share your profile data with any third party other than Supabase (auth/storage) and Stripe (billing).