An AI content engine that learns from a creator's own YouTube channel — what works, what resonates, and why — then generates niche-specific content tuned to their proven audience.
Before any wireframe, before any feature spec — I spent three weeks embedded in the creator workflow. User interviews, diary studies, channel audits, and a quantitative survey. This chapter is everything I found.
VeeFly had built a strong YouTube promotion platform with 56,000+ active creators. But there was a gap no one had addressed: the hours between "I have an idea" and "I'm ready to promote."
The task was to design an AI-powered content creation layer inside VeeFly — not just another GPT wrapper, but a system that understood each creator's unique channel, niche, and audience well enough to generate content that actually worked for them.
"I can amplify a video the moment it's ready. But getting to 'ready' still takes my creators 4 days. That's where we're losing them."— VeeFly Product Lead, Initial Brief
38 creator interviews · 6 marketing teams · 64-person survey · 14 channel analytics audits · Competitive benchmarking of 12 tools
Semi-structured 45-minute interviews. I followed creators through their actual production workflow — screen-sharing, thinking aloud, showing me where things fell apart.
Recruited across four segments: solo creators under 100K, growing channels 100K–500K, brand content teams, and educator-creators. Each session was recorded, transcribed, and coded against a taxonomy of friction points, workaround behaviours, and unmet needs.
"I have ChatGPT, Descript, TubeBuddy, and Google Docs open at the same time. By the time I've stitched them together, I've forgotten the energy I had when the idea hit. The AI doesn't know who I am — it gives me content that sounds like it was written for someone else's channel."
Tool fragmentation kills creative momentum. Generic AI outputs don't match his channel voice or audience, forcing full rewrites that negate any time saved.
"Every video sounds different. We have 2 years of channel data proving what works for our audience — but every AI tool ignores it completely. I have to manually brief the AI like it's a new intern every single time."
Lacks channel-context persistence. Each AI session starts from zero — wasting the institutional knowledge in 2 years of channel performance data.
"I know what I want to say. I just don't know how to make people watch past 2 minutes. I have 3 years of analytics sitting there but I have no idea what they're telling me. I feel like I'm leaving performance on the table."
Has deep subject expertise and rich channel history, but lacks the analytical capability to translate retention data into actionable content decisions.
"I've tried every AI tool. They all write like a generic content farm. My audience is really specific — they want a certain tone, certain depth. No tool has ever produced something I could publish without spending 2 hours fixing it first."
High niche specificity means generic outputs are unusable. The cost of editing AI output exceeds the cost of writing from scratch for her technically oriented audience.
"I had a great idea on Monday. The video wasn't out until Thursday. I spent Monday writing a script that wasn't good enough, Tuesday re-scripting, Wednesday recording, Thursday editing. The idea was stale by the time it published."
Idea-to-publish latency of 3–4 days. Scripting is the bottleneck — not recording or editing. First draft quality determines the entire downstream timeline.
64 creators across segments. 4-week diary study combined with a structured survey covering workflow friction, tool usage, AI adoption, and unmet needs.
The survey validated qualitative signals at scale. The standout finding: 83% of creators who had tried AI tools reported that the outputs required so much revision they questioned whether AI saved any time at all. The problem wasn't the AI's writing ability — it was that the AI had no idea who it was writing for.
83% of creators who tried AI tools said outputs required full rewrites: the AI had no context about their niche, voice, or audience expectations.
Said they needed YouTube-specific AI that understood narrative structure — hooks, retention loops, CTAs — not just generic text generation.
~50% of total production time was consumed by scripting alone, the single highest friction point in the entire video production workflow.
Average number of separate tools creators used per video. Every tool switch costs time, context, and the creative momentum built in the previous session.
After coding 38 interview transcripts and 64 survey responses, I mapped every friction point to six categories. These became the design brief.
The pattern was clear: creators weren't struggling with any single tool. They were struggling with a system that had no memory, no niche awareness, and no continuity across production phases. Each tool was an island.
Every AI session starts from zero. Years of channel data — what performed, what flopped, what their audience rewards — sits untouched in YouTube Studio.
3–4 days average video production time, with scripting accounting for ~50%. Most of that time is spent wrestling tools, not creating.
Separate tools for scripting, voice, and SEO. Each switch resets context, loses momentum, and introduces inconsistency.
Creators rejected AI outputs because they lacked niche specificity, channel voice, and understanding of what their specific audience responds to.
Creators have years of performance data but no system to translate it into content decisions. The data exists — the intelligence layer doesn't.
Time spent revising and stitching AI outputs was 3× the time it took to generate them. The hidden cost of poor context made AI slower than manual writing.
After synthesis, I converted every key pain point into a "How Might We" question. These became the design brief that guided every subsequent decision.
The framing process was deliberate. Bad HMWs lead to feature solutions. Good HMWs lead to system solutions. I rewrote each one until it was broad enough to invite creative approaches but specific enough to stay grounded in real creator behaviour.
How might we give the AI a memory of the creator's channel — so it already knows their niche, voice, and what works for their audience before a prompt is typed?
How might we surface the right content idea at the right moment — so creators never start from a blank canvas, but from a suggestion grounded in their own performance data?
How might we unify script, voice, and SEO into a single workflow — so the context built in one phase carries invisibly into the next, with no tool-switching friction?
How might we translate a creator's analytics into plain-language content strategy — so 3 years of performance data becomes actionable, not just informational?
How might we compress idea-to-first-draft to under 60 seconds — so the creative energy of a good idea isn't lost to production friction?
How might we make the AI feel like a collaborator who knows this creator — not a generic text generator that could be talking to anyone?
14 weeks. 3 structural concept explorations. 5 usability testing rounds. Every major decision below is shown alongside the alternative that was rejected — and the evidence that drove the choice.
Defining principles before ideating forces rigour. Every subsequent concept was evaluated against these. If a design couldn't justify itself against at least 3 of these, it went back to the drawing board.
The AI's first job is to ingest channel data. Every output must reflect what the system already knows — not what the creator had to explain.
Creative director, not vending machine. Every output step is a collaborative refinement — not a single-shot generation that the creator has to fix.
Channel niche, audience intent, and proven narrative patterns must travel through the entire workflow without the creator re-explaining anything.
Every output is shaped by YouTube's mechanics — hooks, retention curves, title psychology — calibrated to the creator's specific niche data.
All AI outputs are editable drafts. The UI must make editing faster and lower-effort than regenerating from scratch.
Personalised suggestions eliminate the blank-canvas problem. Creators always start from something — never from nothing.
"How do we build an AI that already knows the creator before they type a word?" — every phase was structured to answer a part of that question.
38 creator interviews, 6 team walkthroughs, competitive audit of 12 tools, 64-person survey, and 14 channel analytics audits. Synthesised into 6 pain point categories and 6 HMW questions that became the design brief.
Defined the intelligence layer's data model: what to ingest from the YouTube API, what signals to extract, and how those signals translate into personalised suggestions and content generation context. Wrote 6 design principles before touching wireframes. A sketch of what this data model might look like follows the phase timeline below.
Explored wizard-based, dashboard-based, and conversational UI models. A rapid prototype test (n=18) validated the conversational model with channel-aware suggestion chips as the highest-converting, lowest-friction option. Suggestion-to-session conversion was 3× higher than in the dashboard model.
Three rounds of usability testing. Each round surfaced a distinct class of problem: Round 1 — input structure; Round 2 — output editing mechanics; Round 3 — voiceover selection and context persistence. Major iterations documented below.
Detailed interaction specs covering suggestion ranking edge cases, new-channel onboarding (limited data handling), AI response streaming, and error state design. Post-launch 90-day cohort measurement.
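The case study doesn't publish Phase 2's schema, but the shape is worth pinning down. Below is a minimal Python sketch of what the intelligence layer's data model might look like; every name and field here is hypothetical, inferred from the signals described in this chapter rather than taken from VeeFly's production code.

```python
from dataclasses import dataclass, field

@dataclass
class VideoSignal:
    """Per-video signals extracted from channel history (hypothetical fields)."""
    video_id: str
    title: str
    topic_cluster: int            # label assigned by NLP title clustering
    views: int
    watch_time_minutes: float
    ctr: float                    # impressions click-through rate, 0-1
    avg_view_duration_pct: float  # fraction of the video watched on average, 0-1

@dataclass
class ChannelProfile:
    """The persistent channel 'memory' every AI session starts from."""
    channel_id: str
    niche_clusters: list[str] = field(default_factory=list)    # top topic labels
    tone_descriptors: list[str] = field(default_factory=list)  # mined from top scripts
    proven_hooks: list[str] = field(default_factory=list)      # hook formats that retain
    videos: list[VideoSignal] = field(default_factory=list)
```

The important design property is the second dataclass: it is computed once at channel-connect time and then rides along with every prompt, which is what makes the context persistence asked for in HMW 1 and HMW 3 possible.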
Below are the five most consequential design pivots. Each one shows the initial direction, the decision made, and the specific evidence that drove the change.
"The first version felt like using a very fast Google Form. We want it to feel like working with someone who gets you."— Usability Test Participant, Round 1
Show creators platform-wide trending YouTube topics — the same approach every competitor used. Low engineering cost, easy to justify with "people want popular ideas."
Connect to each creator's YouTube channel via API. Analyse their top-performing content, identify topic clusters with proven engagement, and rank suggestions by predicted performance for their specific audience.
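To make "ranked by predicted performance" concrete: a minimal sketch of one way such ranking could work, reusing the hypothetical ChannelProfile from the earlier sketch. The scoring formula and its weights are assumptions for illustration; the production ranking model was engineering's.

```python
from statistics import mean

def rank_suggestions(suggestions, profile):
    """Order candidate topics by the historical performance of the
    topic cluster each one falls into. Illustrative scoring only."""
    def cluster_score(cluster_id):
        vids = [v for v in profile.videos if v.topic_cluster == cluster_id]
        if not vids:
            return 0.0  # cold cluster: no history for this audience, rank last
        # Blend CTR and retention (both 0-1); the 60/40 weighting is a guess.
        return (0.6 * mean(v.ctr for v in vids)
                + 0.4 * mean(v.avg_view_duration_pct for v in vids))

    return sorted(suggestions,
                  key=lambda s: cluster_score(s.topic_cluster),
                  reverse=True)
```

The cold-cluster branch also hints at the new-channel onboarding problem noted in the delivery phase: with little or no history, every cluster scores zero and the system has to fall back to something less personalised.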
A traditional form with fields for Topic, Audience, Tone, and Keywords — mirroring what ChatGPT custom instructions and competitor tools used. Felt organised. Stakeholders liked the visual structure.
A single natural-language input field, with personalised chip suggestions below it. The channel intelligence layer means the AI already has the "form" filled in — tone, niche, audience. The creator just describes the idea.
Engineering proposed generating the complete 600–800 word script in a single API call. Simpler architecture, lower latency variation, cleaner state management.
Stream the script section by section — Hook → Setup → Part 1 → Part 2 → CTA — with each section editable before the next appears. Matches how creators actually think about and review scripts.
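A Python generator captures the interaction contract of the chosen design: each section arrives, the creator can edit it, and the accepted text feeds the context for the next section. The generate_section callable stands in for the model call; its signature is an assumption, not the production API.

```python
SECTIONS = ["Hook", "Setup", "Part 1", "Part 2", "CTA"]

def stream_script(idea, profile, generate_section):
    """Yield the script one section at a time; the caller sends back the
    (possibly edited) text before the next section is generated, so
    accepted edits shape everything downstream."""
    context = {"idea": idea, "profile": profile, "accepted": []}
    for name in SECTIONS:
        draft = generate_section(name, context)
        edited = yield name, draft                 # UI renders draft, collects edits
        context["accepted"].append(draft if edited is None else edited)
    return context["accepted"]

# Sketch of the calling convention (call_model is assumed):
#   gen = stream_script("heat pump myths", profile, call_model)
#   name, draft = next(gen)            # "Hook" arrives first
#   name, draft = gen.send(edited)     # accept/edit it, then the next section streams
```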
A tabbed navigation structure to organise the three output types. Clean, predictable, easy to build. Product team initially preferred this for its visual clarity.
All outputs — title, description, hashtags, script, voiceover player — in one continuous document. The mental model shifted from "three tools in one UI" to "one content package being built progressively."
A dropdown list of voice names (Rosey, Miley, David, Frank…). Engineering wanted this for its simplicity; the PM felt the names were memorable enough to become familiar on their own.
A modal grid showing avatar, name, personality trait (Friendly / Cheerful / Matured / Professional / Excited / Natural), and an inline play button to preview the voice before selecting.
This is the core system — the thing that makes VeeFly AI different from every other AI content tool. Before a creator types anything, the system has already ingested their channel, extracted performance signals, and built a model of what works for their specific audience.
The intelligence layer has three stages: ingest, analyse, surface. Each stage is invisible to the creator — but its output shapes every part of the experience they see.
The design challenge wasn't building the system — that was engineering's job. The design challenge was making the intelligence feel natural and trustworthy: showing creators that the AI knows them without making them feel surveilled, and surfacing confidence signals without overwhelming the interface.
Ingestion pulls upload history, titles, descriptions, tags, engagement metrics, and retention data from the YouTube API across the full channel history.
NLP clustering identifies topic niches. Retention analysis extracts hook formats and narrative structures. Engagement patterns reveal what this audience rewards.
Suggestions ranked by predicted performance. Generated content tuned to the creator's proven topic niche, narrative format, and audience tone.
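For the ingest stage specifically, the public plumbing is well known even though VeeFly's implementation isn't. A minimal sketch using the YouTube Data API v3 via google-api-python-client follows; note that retention curves and per-video CTR live in the owner-authorised YouTube Analytics API, which this sketch deliberately omits.

```python
from googleapiclient.discovery import build

def ingest_channel(api_key, channel_id, max_videos=200):
    """Pull upload history plus public metadata via the YouTube Data API v3.
    A sketch of the 'ingest' stage only; Analytics API calls are omitted."""
    yt = build("youtube", "v3", developerKey=api_key)

    # The channel's uploads playlist holds its full upload history.
    ch = yt.channels().list(part="contentDetails", id=channel_id).execute()
    uploads = ch["items"][0]["contentDetails"]["relatedPlaylists"]["uploads"]

    video_ids, page = [], None
    while len(video_ids) < max_videos:
        resp = yt.playlistItems().list(
            part="contentDetails", playlistId=uploads,
            maxResults=50, pageToken=page).execute()
        video_ids += [i["contentDetails"]["videoId"] for i in resp["items"]]
        page = resp.get("nextPageToken")
        if not page:
            break

    # Titles, descriptions, tags, and engagement counts, 50 ids per call.
    videos = []
    for i in range(0, len(video_ids), 50):
        batch = yt.videos().list(
            part="snippet,statistics",
            id=",".join(video_ids[i:i + 50])).execute()
        videos += batch["items"]
    return videos
```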
Views, watch time, CTR and engagement rate per topic cluster reveal the content formula that works for this specific audience.
NLP clustering of titles surfaces the recurring themes that define the creator's niche territory and topic authority with their audience.
Narrative structure analysis — hook formats, section pacing, CTA placement — extracts the storytelling patterns this audience rewards with watch time.
Retention curve analysis combined with comment sentiment mining reveals the emotional triggers and content qualities driving loyal repeat viewing.
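Of the four signal extractors, title clustering is the most self-contained to illustrate. A plausible stand-in below uses TF-IDF over titles plus k-means (scikit-learn); the production NLP approach wasn't published, so treat the method choice and cluster count as assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.cluster import KMeans

def cluster_titles(titles, n_clusters=6):
    """Group video titles into topic clusters and name each cluster by
    its highest-weight terms. Illustrative stand-in for the NLP step."""
    vec = TfidfVectorizer(stop_words="english", ngram_range=(1, 2))
    X = vec.fit_transform(titles)
    km = KMeans(n_clusters=n_clusters, n_init=10, random_state=0).fit(X)

    # Label each cluster with its top terms: the channel's "niche territory".
    terms = vec.get_feature_names_out()
    labels = {}
    for c in range(n_clusters):
        top = km.cluster_centers_[c].argsort()[::-1][:3]
        labels[c] = ", ".join(terms[t] for t in top)
    return km.labels_, labels
```

The per-video cluster labels this produces are exactly what the ranking sketch earlier keys on via topic_cluster.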
This is the data layer that makes every suggestion and every generated output personal. The analytics dashboard shows creators the same intelligence the AI is using — building transparency and trust in the system.
One of the key design decisions was making this data visible to creators — not hiding it behind the "magic." Showing creators their own performance data validated the AI's suggestions and increased trust in outputs by giving creators a reason to believe the suggestions were grounded in evidence, not guesswork.
Every screen — Onboarding, Channel Scanning, Home Dashboard, Content Output, Analytics — fully interactive. Use the navigation at the bottom to move between screens.
This prototype reflects the final design system: light-mode canvas, editorial typography (Fraunces for output, Inter for UI), a structured sidebar, and the full content generation flow from prompt to publish-ready package.
90-day post-launch measurement across a cohort of active creators, plus the four things this project permanently changed about how I approach designing AI products.
"This is the first AI tool that actually understood I was making a YouTube video for my audience — not just generating content for anyone."
"The suggestions alone cut our planning time in half. We went from 3-day turnarounds to publishing the same week we have the idea."
"VeeFly AI users with personalised suggestions active showed 2.4× higher weekly engagement compared to those without — it became our strongest retention lever."
Early stakeholder pressure pushed for video editing, thumbnail generation, and social scheduling. We pushed back — backed by research showing channel intelligence + script + voice + SEO covered 80% of creator friction.
"The temptation is always to add more. The discipline is knowing that adding thumbnail generation would have split engineering focus and delayed the intelligence layer by 6 weeks — and the intelligence layer was the whole product."— Post-launch retrospective note
Not lessons I read in a book. Lessons extracted from specific moments in this project where the evidence forced a belief update.
"Designing with AI is different from designing for AI. The AI is a collaborator, not a feature — and the interface has to reflect that relationship."— Personal reflection, post-launch retrospective
Asking creators to describe their niche every time was friction disguised as "personalisation." When the AI already knew from channel data, the experience transformed from "briefing a new intern" to "working with someone who genuinely gets your audience." The input box became a direction, not a specification. This was the single biggest UX unlock of the project — and it came from research, not from a design idea.
We couldn't eliminate AI response latency. But we could design around it. Progressive section streaming, skeleton states with thoughtful loading copy, and section-by-section reveals turned wait time into "watching the AI think" — which paradoxically increased perceived quality. Creators who saw section streaming reported higher satisfaction with the same underlying output quality as those who received it all at once.
Creators don't evaluate AI output holistically — they have trust checkpoints. A suggestion that matches their proven topic cluster. A hook that sounds like something they'd actually say. A title that reflects their audience's search behaviour. Designing for trust means identifying and optimising those specific validation moments — not just improving average output quality. This changed how I think about AI product metrics.
Stakeholder pressure to expand scope is normal. What's not normal is having the research evidence to push back credibly. Every feature deferred to V2 was deferred because the research showed it didn't address the core friction. "Our survey of 64 creators showed that scripting accounts for 50% of production friction — thumbnail generation doesn't appear in the top 5 pain points" is a conversation-ending answer. Research doesn't just inform design — it protects it.