Getting Started
Cuebird is a real-time AI copilot that streams intelligent suggestions as the other person speaks — not after. It works for job interviews, sales calls, negotiations, customer support, language practice, and live translation.
1. Create an account
Visit the landing page and click Sign In → Create account. You can sign up with email/password or continue with Google. After creating your account, you'll be taken to the Dashboard.
2. Open the Interview Copilot
From the Dashboard, click Interview Copilot. This opens the main session interface with a left sidebar for configuration and a right panel for AI suggestion output.
3. Set your context
In the left sidebar, enter your target Role and Company. Paste the job description or drop in a job URL to auto-fill it. This context is injected into every AI prompt so suggestions are highly relevant.
4. Choose your audio source
Click one of the capture buttons:
- Mic — captures your microphone only (useful if the interviewer's audio comes through speakers)
- Screen — captures system audio from your screen share (picks up everything in the video call)
- Screen + Mic — both simultaneously, merged into one stream
The status indicator in the top bar turns green when audio is flowing. Transcription starts immediately.
5. Get suggestions
As the interviewer speaks, their words appear in the transcript panel. When a question or prompt is detected, the AI suggestion panel begins streaming the response token-by-token. The first words typically appear within 1–2 seconds of the interviewer finishing their sentence.
How it Works
Understanding the full pipeline helps you get the most out of Cuebird and troubleshoot if something isn't working.
The real-time pipeline
Your browser captures raw PCM audio from the chosen source (mic, screen, or both). An AudioWorklet processes the stream at 16kHz mono and sends 250ms chunks to the server over WebSocket.
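The stated stream format implies fixed chunk sizes. A quick sketch of the arithmetic, assuming 16-bit samples (the docs specify only the 16kHz rate and mono channel, so the sample width here is an assumption):

```python
# Illustrative chunk-size math for the stream format described above:
# 16 kHz mono PCM, sent in 250 ms chunks.
SAMPLE_RATE_HZ = 16_000
CHUNK_MS = 250
BYTES_PER_SAMPLE = 2  # assumed: 16-bit PCM

samples_per_chunk = SAMPLE_RATE_HZ * CHUNK_MS // 1000   # 4000 samples
bytes_per_chunk = samples_per_chunk * BYTES_PER_SAMPLE  # 8000 bytes
chunks_per_second = 1000 // CHUNK_MS                    # 4 chunks per second

print(samples_per_chunk, bytes_per_chunk, chunks_per_second)
```

So the browser uploads roughly 32 KB of raw audio per second, four WebSocket messages at a time.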
The raw audio arrives at Cuebird's Flask backend and is forwarded to the configured STT provider (AssemblyAI, Deepgram, or Google Cloud Speech). The provider returns partial and final transcription events in real time.
The backend maintains separate stable, partial, and live text for each speaker. When a turn finalizes, it's added to the conversation history (up to 8 recent turns).
On each new turn, the suggestion engine debounces (0ms for finals, 250ms for partials), builds a prompt from your context + conversation history, and streams a completion from OpenAI. Completed sentences are extracted for TTS.
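The turn buffer and debounce policy above can be sketched in a few lines. Only the 8-turn cap and the 0ms/250ms debounce values come from the docs; the function and field names are illustrative:

```python
from collections import deque

# Sketch of the conversation buffer and debounce policy described above.
MAX_TURNS = 8
DEBOUNCE_MS = {"final": 0, "partial": 250}

history = deque(maxlen=MAX_TURNS)  # oldest turns fall off automatically

def on_turn(speaker: str, text: str, kind: str) -> int:
    """Record a finalized turn and return the debounce delay (ms)
    to wait before requesting a suggestion."""
    if kind == "final":
        history.append((speaker, text))
    return DEBOUNCE_MS[kind]

# Nine finals: only the most recent eight turns are kept.
for i in range(9):
    on_turn("A", f"turn {i}", "final")
print(len(history), on_turn("A", "still speaking", "partial"))
```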
The streamed response is broadcast to your browser, the Electron overlay (if connected), and any viewer WebSocket clients. TTS audio is queued sentence-by-sentence to minimize latency.
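Sentence-by-sentence TTS queueing depends on pulling only *completed* sentences out of the token stream while holding back the trailing fragment. A minimal splitter of that kind, as an illustrative stand-in rather than Cuebird's actual implementation:

```python
import re

def extract_complete_sentences(buffer: str) -> tuple[list[str], str]:
    """Return (completed sentences, unfinished remainder).
    The remainder stays buffered until more tokens arrive."""
    parts = re.split(r"(?<=[.!?])\s+", buffer)
    if buffer and buffer[-1] in ".!?":
        return [p for p in parts if p], ""
    return [p for p in parts[:-1] if p], parts[-1]

done, rest = extract_complete_sentences("First point. Second point. And finall")
print(done, repr(rest))
```

Here the first two sentences would be queued for TTS immediately, while "And finall" waits for the stream to finish the word.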
WebSocket session
The entire session runs over a single persistent WebSocket connection (GET /ws). The connection carries audio upload, transcript events, suggestion chunks, and control messages. If the connection drops, the UI shows a disconnection indicator and the session state is preserved in memory for a brief window. Close the tab to end the session and discard all in-memory data.
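Since one connection multiplexes audio, transcripts, suggestion chunks, and control messages, each message needs a type tag. A hedged sketch of that envelope pattern (the "type" values and field names are assumptions for illustration; the docs only state which kinds of traffic share the connection):

```python
import json

def make_event(kind: str, payload: dict) -> str:
    # Illustrative event kinds; not Cuebird's actual wire format.
    assert kind in {"audio", "transcript", "suggestion", "control"}
    return json.dumps({"type": kind, **payload})

def dispatch(raw: str, handlers: dict) -> None:
    event = json.loads(raw)
    handlers[event["type"]](event)

seen = []
handlers = {"transcript": lambda e: seen.append(e["text"]),
            "suggestion": lambda e: seen.append(e["chunk"])}
dispatch(make_event("transcript", {"text": "Tell me about yourself."}), handlers)
dispatch(make_event("suggestion", {"chunk": "I'd start with..."}), handlers)
print(seen)
```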
Speaker detection
When using AssemblyAI Standard backend, speaker diarization is enabled. Each transcript word is tagged with a speaker label (Speaker A, Speaker B, etc.). In the Context panel, use the Speaker dropdown to tell Cuebird which speaker label is you. Cuebird then knows to generate suggestions only in response to the other speaker's words.
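The gating this enables is simple: once you've told Cuebird your label, only turns from other speakers should trigger suggestions. A minimal sketch (the label format follows the docs; the function itself is illustrative):

```python
def should_suggest(turn_speaker: str, my_label: str) -> bool:
    """Generate a suggestion only for the other speaker's turns."""
    return turn_speaker != my_label

turns = [("A", "Tell me about a conflict you resolved."),
         ("B", "Sure, at my last job..."),
         ("A", "How did it end?")]
# With "B" selected as you, only Speaker A's turns trigger suggestions.
triggers = [text for spk, text in turns if should_suggest(spk, my_label="B")]
print(len(triggers))
```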
Interview Copilot
The Interview Copilot page at /interview is the primary tool for job interviews. It combines real-time AI suggestions with context-loading, interview mode controls, and session analytics.
Left sidebar
- Audio capture buttons
- Mic / Screen / Screen+Mic / Stop. Clicking a capture button starts audio. The status indicator in the top bar shows green when the connection is live.
- Job URL auto-fill
- Paste a job posting URL and click the auto-fill button. Cuebird scrapes the page and extracts the company name, role title, and job description automatically.
- Role & Company
- Shown in every AI prompt. The more specific, the more relevant the suggestions.
- Job Description
- Pasted or auto-filled JD. Excerpts are included in the prompt so the AI can reference required skills and responsibilities.
- Interview Mode
- Switches the behavioral framing of the AI. Options: General · Behavioral · Technical · System Design · Coding. Behavioral strongly favors STAR-format answers; Technical emphasizes code, architecture, or data answers; Coding adds pseudocode-friendly formatting.
- Response Style
- Controls the surface format: Standard · Bullets · Short · STAR · Smart Q's. Smart Q's makes the AI suggest a clarifying follow-up question instead of a direct answer — useful when the question is ambiguous.
- Tone & Seniority
- Professional / Casual / Enthusiastic tone and Intern through Director+ seniority level. These affect word choice and depth.
More options (collapsible)
- Notes
- Free-form notes injected into the prompt. Use this for key points you want the AI to reference.
- AI Model
- Select from GPT-4o mini (default), GPT-4o, GPT-4.1 mini, GPT-4.1, o3-mini, o4-mini. Heavier models are slower but more thorough.
- Q&A Prep pairs
- Add specific question–answer pairs. When the AI detects a close match to a prep question, it incorporates your prepared answer. Add pairs with the + Add Pair button.
- Active Stories
- Shows how many Knowledge Base stories are currently injected. Click the indicator to open the Knowledge Base and manage your stories.
- Resume
- Upload or select a saved resume. The AI reads your background and uses it to personalize suggestions.
- Speaker
- Select your speaker label from the diarization output.
Main panel controls
- Manual prompt
- Type a question into the input at the bottom and press Enter or click Send. This injects the text directly into the suggestion engine as if the interviewer said it — useful for testing or when audio capture isn't working.
- Suggest button
- Forces the AI to generate a suggestion from the most recent interviewer turn, even if no new transcript has arrived. Use this when the AI skipped a turn or you want to regenerate.
- Image attach
- Drag-and-drop or paste a screenshot into the prompt area. The image is sent as a base64 data URL to the AI alongside the text question — useful for whiteboard problems or UI screenshots.
- Scorecard
- Opens a quick in-session scorecard generated from the current transcript turns. Shows a score and brief feedback for each dimension.
- Generate question
- Generates a single practice question based on your session context (role, JD, mode).
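The image-attach flow above can be sketched as a small helper that wraps the screenshot bytes in a base64 data URL, which is what gets sent alongside the text question. The MIME handling and function name are illustrative assumptions:

```python
import base64

def to_data_url(image_bytes: bytes, mime: str = "image/png") -> str:
    """Package raw image bytes as a base64 data URL."""
    b64 = base64.b64encode(image_bytes).decode("ascii")
    return f"data:{mime};base64,{b64}"

url = to_data_url(b"\x89PNG...")  # placeholder bytes, not a real image
print(url.startswith("data:image/png;base64,"))
```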
Audio Setup
Getting the right audio source is the most critical setup step. Here's how each option works and when to use it.
Microphone capture
Captures audio from your device's microphone only. Use this when:
- The interviewer's audio is loud enough to be picked up by your mic (in-person, or via speakers near your mic)
- You want to minimize system latency
The browser will prompt you to grant microphone permission on first use.
Screen audio capture
Captures system audio from your screen. When you click this button, a browser dialog asks which screen or window to share — select your screen and check the Share audio checkbox. This captures everything the interviewer says through your video call software.
Screen + Mic (combined)
Captures both system audio and your microphone simultaneously, merging them into a single mixed stream. This is the recommended mode for full two-way transcription with speaker diarization.
Tab-specific capture (Chrome extension)
The Chrome extension enables capture of a specific browser tab's audio without displaying any share banners or dialogs. When the extension is installed, a Tab capture button appears. Clicking it opens a tab picker where you can select which tab to capture.
This is useful when you want to avoid the screen-share overlay visible to the interviewer.
STT backend selection
The STT backend determines transcription speed and accuracy. Four backends are available: AssemblyAI Fast · AssemblyAI Standard · Deepgram Nova-3 · Google Cloud Speech.
The Interview Copilot page uses AssemblyAI Fast by default. The General app (/) lets you switch backends mid-session.
AI Suggestions
The suggestion feed is the heart of Cuebird. Here's how to read it and get the most from it.
Reading the feed
The suggestion panel on the right shows streamed AI responses. Each bubble represents one suggestion, showing:
- The trigger transcript (what the interviewer said that prompted this suggestion)
- The AI response, rendered as it streams token-by-token
- A timestamp and turn sequence number
Suggestion gating (__SKIP__)
The AI can return __SKIP__ when it determines no suggestion is needed — for example, when the other person is just acknowledging something or the conversation doesn't require a response. The __SKIP__ token is filtered from the UI and no bubble is shown.
Force Suggest
If the AI skips a turn you wanted a suggestion for, click the Suggest button. This bypasses the gating logic and forces a response from the most recent interviewer turn. Force Suggest also suppresses the "don't repeat" instruction, so the AI is free to answer the same topic again.
Manual prompt injection
Use the input box at the bottom of the suggestion panel to type a question directly and press Send. This is equivalent to the interviewer saying those words. You can also attach an image by dragging a file or pasting a screenshot (Ctrl+V) into the input.
Suggestion controls
- Suggestions on/off
- Toggle whether new suggestions are generated. Does not clear the existing feed.
- TTS toggle
- Enables/disables audio playback of suggestions as they stream.
- Font size
- Cycles through three sizes for readability.
- Clear feed
- Clears all suggestion bubbles from the current session view.
TTS Playback
Cuebird can read suggestions aloud as they stream, using one of three TTS providers.
Providers
Three TTS providers are supported: OpenAI · ElevenLabs · Smallest.AI. Select None to disable spoken playback.
Speed control
Adjust TTS playback speed from 0.75× (slower, clearer) to 3.0× (very fast). The default is 1.0×. Speed is applied per-sentence as audio is generated.
Output language
Set a different output language to have the AI translate suggestions into another language before speaking them. This is independent of the session's input language and the AI's suggestion language. Useful for real-time spoken translation.
Input audio toggle
When enabled, Cuebird plays back the transcribed input speech (in the input language) before speaking the AI's suggestion. This is used in Translation Mode so you hear the original, then the translation.
Knowledge Base
The Knowledge Base at /knowledge-base is your personal library of STAR stories and Q&A pairs. Stories you activate are automatically injected into every interview session's AI context.
What is a Story?
A story is a structured answer — typically in STAR format (Situation, Task, Action, Result) — that you've written in advance. When the AI detects a behavioral question that relates to a story you've saved, it can incorporate the story naturally into its suggestion.
Creating a story
- Open Knowledge Base from the Dashboard sidebar.
- Click + New Story.
- Enter a title (e.g., "Led a high-stakes product launch under pressure").
- Write your story in the text area, or click AI Generate to have Cuebird draft a full STAR story from the title alone.
- Review, edit, and save.
Activating stories
Each story has an Active toggle. Only active stories are injected into interview sessions. Activate the 3–5 most relevant stories for each job application. The Interview Copilot sidebar shows an Active Stories indicator with the current count.
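Conceptually, the injection step filters your library down to active stories and concatenates them into the prompt context. A sketch under assumed field names (the real record structure isn't documented here):

```python
stories = [
    {"title": "Led a high-stakes launch", "active": True,  "body": "S/T/A/R..."},
    {"title": "Debugged a prod outage",   "active": False, "body": "S/T/A/R..."},
]

def active_story_context(stories: list[dict]) -> str:
    """Join only active stories into one prompt-context block."""
    blocks = [f"{s['title']}:\n{s['body']}" for s in stories if s["active"]]
    return "\n\n".join(blocks)

# The Active Stories indicator would show this count.
print(len([s for s in stories if s["active"]]))
```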
AI story generation
Click the AI Generate button on any story editor. Cuebird will create a complete, coherent STAR story from just your title. The generated story uses placeholder metrics — replace them with your real numbers and specifics for maximum impact.
Job Tracker
The Job Tracker at /jobs organizes your job applications in one place and connects them to interview sessions.
Creating a job
Click + New Job in the left list pane. Fill in the job title, company, and status. Paste the job description into the JD field, or paste the job URL and use Auto-fill from URL to have Cuebird scrape and parse the posting automatically.
Job fields
- Title & Company
- Used to pre-fill the Interview Copilot context when you launch from this job.
- Job Description
- Excerpts are included in AI prompts. The more complete the JD, the more relevant the suggestions.
- Resume
- Attach a specific saved resume to this job. It will be used automatically when you open an interview session from this listing.
- Status & Stage
- Track where you are in the process: Applied → Phone Screen → Technical → Offer → Closed.
Session transcripts & reports
After each interview session, you can generate a Post-session Report from the transcript. The report includes a summary, overall score, key strengths, improvement areas, and a breakdown by question. Reports are stored with the job listing for future reference.
Plan limits
- Free: up to 10 job listings
- Pro: up to 50 job listings
Mock Interview
Mock Interview mode lets you practice with AI-generated questions before the real interview. Questions are tailored to your target role and job description, and each answer is AI-evaluated.
Starting a mock session
- Open Mock Interview from the Dashboard.
- Select an interview domain (behavioral, software engineering, data analytics, product management, finance, consulting, system design, or general).
- Choose how many questions (1–10).
- Click Generate Questions. The AI tailors questions to your session context (role, JD, profile) if provided.
Answering questions
For each question, type or speak your answer. When you're done, click Evaluate. The AI evaluates your answer and returns:
- Score (1–10 overall)
- Relevance — how well you addressed what was actually asked
- Clarity — how clear and organized the response is
- Depth — amount of insight, specificity, and evidence provided
- Feedback — concrete improvement suggestions
- Strengths — what you did well
Supported domains
Behavioral · Software Engineering · Data Analytics · Product Management · Finance · Consulting · System Design · General
Resume Upload
Uploading your resume gives the AI real context about your background, making suggestions specific to your actual experience rather than generic examples.
Supported formats
PDF, DOCX, and TXT. Files are parsed server-side (PDF via pypdf, DOCX via python-docx) and the extracted text is stored. Maximum file size: 5 MB.
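The upload checks above (allowed extensions, 5 MB cap, format-specific parsing) can be sketched as a small dispatcher. Only the TXT branch is implemented here with the standard library; in the real pipeline PDF goes through pypdf and DOCX through python-docx, per the docs:

```python
from pathlib import Path

MAX_BYTES = 5 * 1024 * 1024
ALLOWED = {".pdf", ".docx", ".txt"}

def validate_resume(name: str, data: bytes) -> str:
    """Check format and size, then extract text (TXT only in this sketch)."""
    ext = Path(name).suffix.lower()
    if ext not in ALLOWED:
        raise ValueError(f"unsupported format: {ext}")
    if len(data) > MAX_BYTES:
        raise ValueError("file exceeds 5 MB limit")
    if ext == ".txt":
        return data.decode("utf-8", errors="replace")
    return f"<parse {ext} server-side>"  # placeholder for pypdf / python-docx

print(validate_resume("resume.txt", b"10 years of Python"))
```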
Uploading a resume
In the Interview Copilot sidebar (under More Options), click Upload Resume or drag a file onto the upload area. The resume is parsed and saved to your account. You can save multiple resumes and switch between them per job listing.
Plan limits
- Free: 1 saved resume
- Pro: up to 10 saved resumes
Session Analytics
Cuebird provides two types of AI-generated analytics from your session transcript: in-session scorecards and post-session reports.
Quick scorecard
Click the Scorecard button in the Interview Copilot panel at any point during a session. Cuebird sends the current transcript turns to the AI and returns a rapid scorecard covering:
- Overall score (1–10)
- Communication clarity
- Answer structure (STAR adherence)
- Relevance to the question asked
- Brief feedback and suggestions
Post-session report
After a session ends, open the Job Tracker, find the linked session, and click Generate Report. The full transcript is analyzed and produces a comprehensive report with:
- Executive summary of the session
- Turn-by-turn question and answer review
- Aggregate scores across all dimensions
- Top 3 strengths
- Top 3 areas for improvement
- Recommended follow-up preparation
Provider usage dashboard
The Usage panel (accessible from the top bar in the General app) shows your quota and session cost across all AI providers: AssemblyAI hours, ElevenLabs characters, Deepgram requests, and OpenAI tokens. Session cost is calculated in real time using the current pricing table for each model.
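The cost mechanism is usage multiplied by a per-unit price, summed across providers. A hedged sketch of that calculation; every price below is a made-up placeholder, not Cuebird's actual rate:

```python
# Placeholder pricing table: $/hour, $/token, $/character respectively.
PRICING = {
    "assemblyai_hours": 0.50,
    "openai_tokens": 0.15 / 1e6,
    "elevenlabs_chars": 0.30 / 1e3,
}

def session_cost(usage: dict) -> float:
    """Sum usage x unit price across all providers in the table."""
    return round(sum(PRICING[k] * v for k, v in usage.items()), 4)

print(session_cost({"assemblyai_hours": 1.0,
                    "openai_tokens": 20_000,
                    "elevenlabs_chars": 5_000}))
```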
Settings & Options
Configuration options are available in the Interview Copilot sidebar and the General app (/). Settings are applied per-session and are not persisted across sessions.
Audio settings
- STT backend
- AssemblyAI Fast · AssemblyAI Standard · Deepgram Nova-3 · Google Cloud Speech
- Input language
- The language being spoken. Used by the STT provider for transcription accuracy.
- Google chunk size
- For Google Cloud Speech, the audio chunk sent per request: 3s / 5s / 8s / 12s / 20s. Smaller chunks have lower latency; larger chunks improve accuracy for complex speech.
TTS settings
- TTS provider
- OpenAI · ElevenLabs · Smallest.AI · None
- Voice
- Provider-specific voice selection. OpenAI voices: alloy, echo, fable, onyx, nova, shimmer.
- Speed
- 0.75× to 3.0×. Applied to each sentence as it's generated.
- Response language
- Language in which TTS speaks. Can differ from input language for translation workflows.
- Input audio
- Play back the original transcribed speech before the AI's suggestion. Used in Translation Mode.
AI settings
- AI model
- GPT-4o mini · GPT-4o · GPT-4.1 mini · GPT-4.1 · o3-mini · o4-mini
- Suggestions on/off
- Disables automatic suggestion generation. You can still trigger suggestions manually via the Suggest button.
Language Support
Cuebird supports over 12 languages for speech-to-text and can generate AI suggestions in any language the model supports.
Supported STT languages
Chinese, Japanese, Korean, and Portuguese automatically route to Google Cloud Speech for higher accuracy.
STT backend auto-selection
When you select Chinese, Japanese, Korean, or Portuguese as the input language, Cuebird automatically switches to the Google Cloud Speech backend regardless of your manual backend selection. This ensures the best transcription accuracy for those language families.
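This routing rule reduces to a simple override. A sketch (the backend identifiers and ISO-style language codes are assumptions for illustration):

```python
# Input languages that force the Google Cloud Speech backend,
# per the routing rule described above.
GOOGLE_ROUTED = {"zh", "ja", "ko", "pt"}  # Chinese, Japanese, Korean, Portuguese

def effective_backend(selected: str, input_language: str) -> str:
    """Override the manually selected STT backend for routed languages."""
    if input_language in GOOGLE_ROUTED:
        return "google-cloud-speech"
    return selected

print(effective_backend("assemblyai-fast", "ja"),
      effective_backend("assemblyai-fast", "en"))
```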
AI suggestion language
AI suggestions are generated in whatever language you're conversing in by default. To force a specific output language, set the Response Language in the TTS settings. This instructs both the AI and the TTS engine to use that language.
AI Models
Cuebird supports multiple OpenAI models: GPT-4o mini (default) · GPT-4o · GPT-4.1 mini · GPT-4.1 · o3-mini · o4-mini. You can switch models mid-session; the change takes effect on the next suggestion.
Desktop Overlay
The Electron overlay is a transparent, always-on-top window that displays suggestions directly on your screen — without switching apps, and without appearing in screen-share captures.
How it works
The overlay connects to your running Cuebird session as a viewer via a local WebSocket at ws://localhost:3001/ws?viewer=1. Suggestions streamed to your main session are also broadcast to the overlay automatically. The overlay has no audio capture of its own — all STT and AI processing happens in the web app.
Installation
- Clone the Cuebird repository and run npm install in electron-overlay/.
- Start the Flask backend and Vite frontend (npm run dev).
- Run cd electron-overlay && npm start to launch the overlay.
The overlay window appears as a small 440×700 px frameless window with rounded corners and a translucent dark background. You can drag it to any screen position.
Overlay controls
- Manual Prompt
- Type a question in the overlay's input and press Enter to inject it directly into the active session — the same as using the web app's manual prompt.
- Suggest button
- Triggers a Force Suggest from the overlay — generates a suggestion from the most recent interviewer turn.
- Opacity slider
- Adjusts the overlay window transparency (20% – 100%).
- Close / Minimize
- Standard window controls accessible via the overlay UI.
Screen-share invisibility
On Windows, the Electron window is created with setContentProtection(true), which prevents it from being captured by screen-sharing software. The window renders normally on your physical display but appears as a black rectangle or is entirely invisible in recordings and screen shares.
Chrome Extension
The Chrome extension provides two capabilities: silent tab audio capture and suggestion feed relay to the Electron overlay.
Tab capture
The extension's background.js uses the chrome.tabCapture API to capture audio from a specific tab without displaying any screen-sharing banners or popups. When you click the Tab capture button in the Interview Copilot, a tab picker opens. Select the tab containing your video call. Audio from that tab flows directly into the STT pipeline.
Overlay relay
The extension's content.js observes the suggestion feed DOM in the Cuebird web app and relays suggestion events (add, update, remove) to the Electron overlay via a local WebSocket server at 127.0.0.1:3334. This allows the overlay to show suggestions even when the browser window is minimized or behind other windows.
Installation
- In Chrome, go to chrome://extensions.
- Enable Developer mode (toggle in the top right).
- Click Load unpacked and select the extension/ folder from the Cuebird repository.
- The Cuebird extension icon appears in the Chrome toolbar.
Account & Billing
Manage your account, plan, and billing from the Account page at /account.
Free plan
- Interview & Conversation agent modes
- Real-time AI suggestions (GPT-4o mini)
- 1 saved resume
- Up to 10 job listings
- Knowledge Base (unlimited stories)
Pro plan (coming soon)
- All six agent modes
- Unlimited STT minutes
- TTS playback (OpenAI, ElevenLabs, Smallest.AI)
- All AI models (GPT-4o, GPT-4.1, o4-mini)
- Up to 10 saved resumes
- Up to 50 job listings
- Mock interview mode
- Session scorecards & reports
- Desktop overlay
- Multi-language STT
Billing
Cuebird uses Stripe for payment processing. Manage your subscription, view invoices, and update payment methods from the Billing button on the Account page. This opens the Stripe Customer Portal in a new tab.
Authentication
Cuebird uses Supabase for authentication. Supported methods: email/password and Google OAuth. Passwords are never stored by Cuebird — all credential handling is done by Supabase. You can reset your password via the "Forgot password" flow in the sign-in modal.
Privacy & Data
Cuebird is designed to be privacy-preserving by default. Here's exactly what is and isn't stored.
What is NOT stored
- Audio — Raw audio is streamed directly to the STT provider (AssemblyAI, Deepgram, or Google). Cuebird's servers never store audio files.
- Transcripts — Live transcripts exist only in server memory for the duration of the WebSocket session. Closing the tab discards all transcript data.
- AI suggestions — Suggestions are generated on-demand and discarded when the session ends.
What IS stored
- Your profile — Email, name, plan status (stored in Supabase).
- Resumes — Uploaded resume text (parsed from PDF/DOCX) is stored in Supabase and accessible only to your account.
- Knowledge Base stories — Your STAR stories are stored in Supabase.
- Job listings — Job titles, companies, JD text, status, and any generated session reports are stored in Supabase.
Third-party providers
Cuebird passes audio to STT providers and text to OpenAI for AI completions. Each provider's privacy policy governs how they handle that data. Cuebird does not share your profile data with any third party other than Supabase (auth/storage) and Stripe (billing).