Commit 5cda134

Video quality iteration: Alex's v2 feedback fixes
Three PRs: infographic-first SceneRouter (keep CodeMorphScene), Gemini generateContent model detection, longer scripts (2-4 min), purple #7c3aed, black backgrounds, aggressive Ken Burns, no-text hook frame.
2 parents e6c1e83 + 1910358 commit 5cda134

3 files changed

Lines changed: 91 additions & 17 deletions

app/api/cron/check-research/route.ts

Lines changed: 3 additions & 2 deletions
@@ -686,7 +686,7 @@ Return ONLY a JSON object:
 "list": { "items": ["Item 1", "Item 2"], "icon": "🚀" },
 "comparison": { "leftLabel": "A", "rightLabel": "B", "rows": [{ "left": "...", "right": "..." }] },
 "mockup": { "deviceType": "browser | phone | terminal", "screenContent": "..." },
-"imagePrompts": ["Infographic 2D architecture style, black background. [specific visual for this scene]. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations."]
+"imagePrompts": ["Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. [specific visual for this scene]. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable labels."]
 }
 ],
 "cta": "string - call to action"
@@ -697,8 +697,9 @@ Return ONLY a JSON object:
 Requirements:
 - 3-5 scenes totaling 60-90 seconds
 - Use at least 2 different scene types
-- Each scene MUST include 2-5 imagePrompts following this exact template: "Infographic 2D architecture style, black background. [specific visual]. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations."
+- Each scene MUST include 2-5 imagePrompts following this exact template: "Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. [specific visual]. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable labels."
 - imagePrompts should describe specific 2D infographic visuals that illustrate the narration content
+- The FIRST scene's imagePrompts must be purely visual and eye-catching — no text labels, no annotations, no words. This is the thumbnail/hook frame.
 - Do NOT include any script text, titles, or word overlays in the video. The narration audio carries all words.
 - Think of each imagePrompt as a frame that will be shown for 3-5 seconds while the narration plays
 - Include REAL code snippets from the research where applicable
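The template above still asks for "white text annotations", while the new first-scene rule asks for no text at all. One way to reconcile the two is to build the prompt string in code and switch the annotation sentence for the hook frame. The following is a minimal sketch under that assumption; `buildImagePrompt`, `BACKGROUND`, `HIGHLIGHT`, and the `hookFrame` flag are hypothetical names and do not appear in the commit.

```typescript
// Hypothetical helper: none of these names appear in the commit.
const BACKGROUND = "#000000";
const HIGHLIGHT = "#7c3aed";

/**
 * Build one image prompt from the template used in check-research/route.ts.
 * When `hookFrame` is true (assumed to mean "first scene"), the sentences that
 * request text annotations and labels are swapped for a no-text instruction.
 */
function buildImagePrompt(visual: string, hookFrame = false): string {
  const style =
    `Infographic 2D architecture style. STRICTLY black (${BACKGROUND}) ` +
    "background only \u2014 no gradients, no blue.";
  const highlight = `Highlighted elements filled with vivid purple (${HIGHLIGHT}) only.`;
  const annotations = hookFrame
    ? "White lines connecting components. No text labels, no annotations, no words."
    : "White lines connecting components and white text annotations. Large, readable labels.";
  // Normalize the scene-specific description so it ends with exactly one period.
  const scene = `${visual.trim().replace(/\.$/, "")}.`;
  return [style, scene, highlight, annotations].join(" ");
}
```

With `hookFrame` left at its default, the result is the commit's template with the bracketed placeholder filled in. Whether the hook frame should drop the annotation sentence entirely is a judgment call the commit itself leaves to the model.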

app/api/cron/ingest/route.ts

Lines changed: 8 additions & 6 deletions
@@ -258,7 +258,7 @@ Your style is inspired by Cleo Abram's "Huge If True" — you make complex techn
 - End with a clear takeaway that makes the viewer feel smarter
 - Target audience: developers who want to stay current but don't have time to read everything

-Script format: 60-90 second explainer videos. Think TikTok/YouTube Shorts energy with real educational depth.
+Script format: 2-4 minute explainer videos for horizontal YouTube, 60-90 seconds for Shorts. Think Cleo Abram energy with real educational depth.

 CodingCat.dev covers: React, Next.js, TypeScript, Svelte, web APIs, CSS, Node.js, cloud services, AI/ML for developers, and web platform updates.`;

@@ -311,7 +311,7 @@ function buildPrompt(trends: TrendResult[], research?: ResearchPayload): string

 ${topicList}${researchContext}

-Pick the MOST interesting and timely topic for a short explainer video (60-90 seconds). Then generate a complete video script as JSON.
+Pick the MOST interesting and timely topic for an explainer video (2-4 minutes for horizontal YouTube). Then generate a complete video script as JSON.

 ## Scene Types

@@ -337,7 +337,7 @@ CRITICAL: This video will be a visual infographic explainer. There will be NO te

 For EACH scene, generate an "imagePrompts" array with 2-5 image generation prompts. Each prompt should follow this exact template:

-"Infographic 2D architecture style, black background. [SPECIFIC VISUAL FOR THIS SCENE]. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations."
+"Infographic 2D architecture style, pure black background. [SPECIFIC VISUAL FOR THIS SCENE]. Highlighted elements filled with vivid purple (#7c3aed). White lines connecting components and white text annotations. Color palette: ONLY black, purple (#7c3aed), and white — no blue, no green, no gradients."

 Replace [SPECIFIC VISUAL FOR THIS SCENE] with a detailed description of what the infographic should show for that particular scene. Be specific — reference the actual technical concepts, comparisons, or workflows being discussed.

@@ -349,6 +349,8 @@ Guidelines for image prompts:
 - For comparison scenes: show side-by-side comparison charts or feature matrices
 - For list scenes: show each item as a distinct visual element in the infographic
 - Make prompts visually varied — don't repeat the same layout
+- STRICT color palette: pure black background (#000000), vivid purple (#7c3aed) for highlighted elements, white for lines and text annotations. Do NOT use blue, green, orange, red, or gradient backgrounds
+- FIRST SCENE image prompts must be purely visual and eye-catching — NO text labels, NO annotations, NO words. This is the thumbnail/hook frame that needs to stop the scroll. Show a striking visual metaphor for the topic

 ## JSON Schema

@@ -369,7 +371,7 @@ Return ONLY a JSON object matching this exact schema:
 "visualDescription": "string - what to show on screen (fallback for all types)",
 "bRollKeywords": ["keyword1", "keyword2"],
 "durationEstimate": 15,
-"imagePrompts": ["Infographic 2D architecture style, black background. [specific visual]. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations."],
+"imagePrompts": ["Infographic 2D architecture style, pure black background. [specific visual]. Highlighted elements filled with vivid purple (#7c3aed). White lines and text annotations. ONLY black, purple, white colors."],
 "code": {
 "snippet": "string - actual code to display (only for sceneType: code)",
 "language": "typescript | javascript | jsx | tsx | css | html | json | bash",
@@ -398,14 +400,14 @@ Return ONLY a JSON object matching this exact schema:
 }

 Requirements:
-- The script should have 3-5 scenes totaling 60-90 seconds
+- The script should have 8-15 scenes totaling 2-4 minutes (120-240 seconds)
 - The hook should be punchy and curiosity-driven
 - Use at least 2 different scene types for visual variety
 - Only include the type-specific field that matches the sceneType (e.g., only include "code" when sceneType is "code")
 - For "code" scenes, provide real, syntactically correct code
 - The qualityScore should be your honest self-assessment (0-100)
 - Each scene MUST include an "imagePrompts" array with 2-5 image generation prompts
-- Image prompts must follow the template: "Infographic 2D architecture style, black background. [specific]. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations."
+- Image prompts must follow the template: "Infographic 2D architecture style, pure black background. [specific]. Highlighted elements filled with vivid purple (#7c3aed). White lines connecting components and white text annotations. Color palette: ONLY black, purple (#7c3aed), and white."
 - Do NOT include any text overlays, titles, or script words in the video — narration audio carries all words
 - Calculate prompt count per scene: Math.ceil(durationEstimate / 4)
 - Return ONLY the JSON object, no markdown or extra text`;
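The requirements above give the model two numeric rules: a prompt count of `Math.ceil(durationEstimate / 4)` per scene, and 8-15 scenes totaling 120-240 seconds. These could also be checked on the returned JSON. The sketch below shows one possible post-generation check; `SceneLike`, `promptCountForScene`, and `checkScriptLength` are hypothetical names, and clamping the count to 2-5 is an assumption made here to reconcile the formula with the "2-5 image generation prompts" rule (a 30-second scene would otherwise yield 8).

```typescript
// Hypothetical validation helpers; not part of the commit.
interface SceneLike {
  durationEstimate: number; // seconds, as in the JSON schema above
}

/** Prompt count from the prompt's formula, clamped to the 2-5 range (assumption). */
function promptCountForScene(durationEstimate: number): number {
  const raw = Math.ceil(durationEstimate / 4);
  return Math.min(5, Math.max(2, raw));
}

/** Return human-readable problems; an empty array means the script fits the limits. */
function checkScriptLength(scenes: SceneLike[]): string[] {
  const problems: string[] = [];
  const total = scenes.reduce((sum, scene) => sum + scene.durationEstimate, 0);
  if (scenes.length < 8 || scenes.length > 15) {
    problems.push(`expected 8-15 scenes, got ${scenes.length}`);
  }
  if (total < 120 || total > 240) {
    problems.push(`expected 120-240 seconds, got ${total}`);
  }
  return problems;
}
```

For the schema's example value `durationEstimate: 15`, the formula gives `Math.ceil(15 / 4) = 4` prompts, which sits inside the 2-5 range without clamping.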

lib/services/gemini-infographics.ts

Lines changed: 80 additions & 9 deletions
@@ -85,25 +85,97 @@ function getAI(): GoogleGenAI {
 // ---------------------------------------------------------------------------

 /**
- * Generate a single infographic image using Imagen 4 Fast.
+ * Detect if a model uses the Gemini generateContent API (for image generation)
+ * vs the Imagen generateImages API.
+ *
+ * Gemini image models: gemini-*-image-*, gemini-*-flash-image-*
+ * Imagen models: imagen-*
+ */
+function isGeminiImageModel(model: string): boolean {
+  return model.startsWith('gemini-');
+}
+
+/**
+ * Generate a single image using Gemini's generateContent API.
+ * Used for models like gemini-3.1-flash-image-preview.
+ */
+async function generateWithGeminiContent(
+  prompt: string,
+  model: string,
+  aspectRatio: string,
+): Promise<{ imageBase64: string; mimeType: string }> {
+  const ai = getAI();
+
+  const response = await ai.models.generateContent({
+    model,
+    contents: prompt,
+    config: {
+      responseModalities: ['IMAGE'],
+      imageConfig: {
+        aspectRatio: aspectRatio as '1:1' | '3:4' | '4:3' | '9:16' | '16:9',
+      },
+    },
+  });
+
+  // Extract image from response
+  const parts = response.candidates?.[0]?.content?.parts;
+  if (!parts || parts.length === 0) {
+    throw new Error(`Gemini returned no parts for prompt "${prompt.slice(0, 80)}…"`);
+  }
+
+  // Find the image part
+  for (const part of parts) {
+    if (part.inlineData?.data) {
+      return {
+        imageBase64: part.inlineData.data,
+        mimeType: part.inlineData.mimeType || 'image/png',
+      };
+    }
+  }
+
+  throw new Error(`Gemini returned no image data for prompt "${prompt.slice(0, 80)}…"`);
+}
+
+/**
+ * Generate a single infographic image.
+ *
+ * Automatically detects whether to use Gemini generateContent (for gemini-*
+ * models) or Imagen generateImages (for imagen-* models).
  *
  * @param request - Prompt and generation options.
- * @param model - Imagen model ID (e.g. "imagen-4.0-fast-generate-001").
+ * @param model - Model ID (e.g. "gemini-3.1-flash-image-preview" or "imagen-4.0-fast-generate-001").
  * @returns InfographicResult with base64 image bytes.
  * @throws If the API call fails or no image is returned.
  */
 export async function generateInfographic(
   request: InfographicRequest,
   model: string = "imagen-4.0-fast-generate-001",
 ): Promise<InfographicResult> {
+  const aspectRatio = request.aspectRatio ?? "16:9";
+
+  if (isGeminiImageModel(model)) {
+    // Gemini path: generateContent with responseModalities: ["IMAGE"]
+    const result = await generateWithGeminiContent(
+      request.prompt,
+      model,
+      aspectRatio,
+    );
+    return {
+      imageBase64: result.imageBase64,
+      mimeType: result.mimeType,
+      prompt: request.prompt,
+    };
+  }
+
+  // Imagen path: generateImages (existing code)
   const ai = getAI();

   const response = await ai.models.generateImages({
     model,
     prompt: request.prompt,
     config: {
       numberOfImages: 1,
-      aspectRatio: request.aspectRatio ?? "16:9",
+      aspectRatio: aspectRatio,
       ...(request.negativePrompt && { negativePrompt: request.negativePrompt }),
     },
   });
@@ -117,7 +189,6 @@ export async function generateInfographic(
   }

   const imageBytes = generated.image.imageBytes;
-  // imageBytes may be a Uint8Array or base64 string depending on SDK version
   const imageBase64 =
     typeof imageBytes === "string"
       ? imageBytes
@@ -210,11 +281,11 @@ export function buildInfographicPrompt(

 /** Default infographic instructions if Sanity contentConfig is not set up */
 const DEFAULT_INSTRUCTIONS: string[] = [
-  'Infographic 2D architecture style, black background. A high-level technical architecture overview showing system components and data flow. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations.',
-  'Infographic 2D architecture style, black background. A comparison chart showing key features and alternatives side by side. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations.',
-  'Infographic 2D architecture style, black background. A step-by-step workflow diagram showing the process from start to finish. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations.',
-  'Infographic 2D architecture style, black background. A timeline of key developments, milestones, and version releases. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations.',
-  'Infographic 2D architecture style, black background. A pros and cons visual summary with clear icons and labels. Highlighted elements filled with #15b27b. White lines connecting components and white text annotations.',
+  'Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. A high-level technical architecture overview showing system components and data flow. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable text labels suitable for mobile viewing at 360px width. No watermarks.',
+  'Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. A comparison chart showing key features and alternatives side by side. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable text labels suitable for mobile viewing at 360px width. No watermarks.',
+  'Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. A step-by-step workflow diagram showing the process from start to finish. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable text labels suitable for mobile viewing at 360px width. No watermarks.',
+  'Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. A timeline of key developments, milestones, and version releases. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable text labels suitable for mobile viewing at 360px width. No watermarks.',
+  'Infographic 2D architecture style. STRICTLY black (#000000) background only \u2014 no gradients, no blue. A pros and cons visual summary with clear icons and labels. Highlighted elements filled with vivid purple (#7c3aed) only. White lines connecting components and white text annotations. Large, readable text labels suitable for mobile viewing at 360px width. No watermarks.',
 ];

 // ---------------------------------------------------------------------------
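In this file's diff, `isGeminiImageModel` returns true for any ID starting with `gemini-`, which also covers text-only Gemini models even though the doc comment describes `gemini-*-image-*` IDs. The sketch below shows a stricter routing function and a pure version of the "find the first inline image part" loop that can be tested without calling the API. `pickImageBackend`, `extractFirstImage`, and `PartLike` are hypothetical names; `PartLike` models only the fields the commit reads and is not the SDK's own type.

```typescript
// Hypothetical sketch; the commit itself routes on startsWith('gemini-') only.
type ImageBackend = "gemini-generateContent" | "imagen-generateImages";

/** Route a model ID to a backend, rejecting IDs that match neither pattern. */
function pickImageBackend(model: string): ImageBackend {
  if (model.startsWith("imagen-")) return "imagen-generateImages";
  if (model.startsWith("gemini-") && model.includes("-image")) {
    return "gemini-generateContent";
  }
  throw new Error(`Unsupported image model: ${model}`);
}

/** Minimal stand-in for a response part: only the fields the commit reads. */
interface PartLike {
  text?: string;
  inlineData?: { data?: string; mimeType?: string };
}

/** Return the first part carrying inline image data, defaulting the MIME type to PNG. */
function extractFirstImage(
  parts: PartLike[] | undefined,
): { imageBase64: string; mimeType: string } {
  for (const part of parts ?? []) {
    const data = part.inlineData?.data;
    if (data) {
      return { imageBase64: data, mimeType: part.inlineData?.mimeType || "image/png" };
    }
  }
  throw new Error("No inline image data found in response parts");
}
```

Failing fast on an unknown ID trades flexibility for earlier errors: a typo such as a text-only Gemini model would be rejected before any request is sent, rather than surfacing later as "no image data".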
