# Gemini API Integration

UniPulse uses Google Gemini 2.5 Flash as its primary AI model via the `@google/genai` SDK for all text generation, and `text-embedding-004` for vector embeddings.
## SDK Setup

```typescript
import { GoogleGenAI } from '@google/genai';

const genai = new GoogleGenAI({ apiKey: process.env.GEMINI_API_KEY });
```

| Model | Purpose | Configuration |
|---|---|---|
| `gemini-2.5-flash` | Text generation (captions, replies, analysis) | Temperature varies by task |
| `text-embedding-004` | Vector embeddings (similarity, classification) | Fixed dimensions |
## Core AI Functions

All AI functions are in `apps/api/src/services/ai.service.ts`:

| Function | Purpose | Input | Output |
|---|---|---|---|
| `callGemini()` | Base function for all Gemini API calls | Prompt, system prompt, temperature | Generated text |
| `generateCaption()` | Generate post captions | Topic, platform, brand voice, language, count | Array of caption options |
| `rewriteCaption()` | Rewrite an existing caption | Caption, instruction, platform | Rewritten caption |
| `generateHashtags()` | Generate relevant hashtags | Caption, platform, count | Array of hashtags |
| `generateCTA()` | Generate call-to-action text | Context, goal, platform | CTA options |
| `translateCaption()` | Translate a caption into another language | Caption, target language | Translated text |
| `generateImage()` | Generate an image from a prompt | Description, style, dimensions | Image URL |
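To illustrate how these functions compose, here is a hedged sketch of `callGemini()` and `generateCaption()`. The `GeminiClient` interface, prompt wording, and newline-separated output format are assumptions made so the sketch stays self-contained; in the real service the client role is played by the `GoogleGenAI` instance from the SDK setup.

```typescript
// Minimal client interface standing in for the GoogleGenAI SDK instance
// (assumption: the real service calls the SDK directly).
interface GeminiClient {
  generateText(req: {
    model: string;
    contents: string;
    config: { systemInstruction?: string; temperature: number };
  }): Promise<{ text?: string }>;
}

// Base wrapper: every other AI function funnels through here.
async function callGemini(
  client: GeminiClient,
  prompt: string,
  systemPrompt: string | undefined,
  temperature: number,
): Promise<string> {
  const response = await client.generateText({
    model: 'gemini-2.5-flash',
    contents: prompt,
    config: { systemInstruction: systemPrompt, temperature },
  });
  return response.text ?? '';
}

// generateCaption() built on the base wrapper. The prompt text and the
// one-caption-per-line parsing are illustrative assumptions.
async function generateCaption(
  client: GeminiClient,
  topic: string,
  platform: string,
  count: number,
): Promise<string[]> {
  const prompt =
    `Write ${count} ${platform} captions about "${topic}". ` +
    `Return one caption per line.`;
  const raw = await callGemini(
    client,
    prompt,
    'You are a social media copywriter.',
    0.8, // creative temperature, per the table above
  );
  return raw.split('\n').map((l) => l.trim()).filter(Boolean).slice(0, count);
}
```

Injecting the client keeps the wrapper unit-testable with a fake that returns canned text, without hitting the Gemini API.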
## API Endpoints

All AI routes are under `/api/v1/ai`:

| Endpoint | Method | Function Called | Rate Limited |
|---|---|---|---|
| `/api/v1/ai/caption/generate` | POST | `generateCaption()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/caption/rewrite` | POST | `rewriteCaption()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/hashtags/generate` | POST | `generateHashtags()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/cta/generate` | POST | `generateCTA()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/translate` | POST | `translateCaption()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/image/generate` | POST | `generateImage()` | `aiLimiter` + `requireQuota` |
| `/api/v1/ai/chat` | POST | AI chat | `aiLimiter` |
| `/api/v1/ai/suggestions` | GET | Background suggestions | `requireFeature` |
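A caption-generation call might look like the following. The exact body fields and response shape are assumptions inferred from the inputs and outputs listed for `generateCaption()`, not a verified API contract:

```
POST /api/v1/ai/caption/generate
Authorization: Bearer <token>
Content-Type: application/json

{
  "topic": "summer sale",
  "platform": "instagram",
  "brandVoice": "playful",
  "language": "en",
  "count": 3
}

// 200 OK (response shape is an assumption)
{
  "captions": ["...", "...", "..."]
}
```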
## Where AI Is Used Across the Platform
| Feature | Service | Description | Temperature |
|---|---|---|---|
| Caption generation | ai.service | Generate social media captions | 0.8 (creative) |
| Caption rewriting | ai.service | Rewrite with specific instructions | 0.7 |
| Hashtag generation | ai.service | Platform-optimized hashtags | 0.5 |
| CTA generation | ai.service | Call-to-action text | 0.7 |
| Translation | ai.service | Multi-language caption translation | 0.3 (precise) |
| Brand voice application | brand-voice.service | Apply trained brand voice style | 0.6 |
| Content calendar | content-calendar.service | Plan content schedules with AI | 0.8 |
| Content repurposing | content-repurpose.service | Transform content across formats | 0.7 |
| Trend analysis | trend-scanner.service | Detect and analyze trending topics | 0.5 |
| Performance advisor | ai-advisor.service | Generate performance recommendations | 0.4 |
| Conversation replies | conversation-brain.service | Generate context-aware replies | 0.6 |
| Intent classification | intent-classifier.service | Classify message intent and sentiment | 0.2 (deterministic) |
| Post classification | classification.service | Categorize post content | 0.2 |
| A/B test evaluation | ab-test.service | Analyze test results with AI | 0.3 |
| Gap analysis | competitor.service | Competitive content gap analysis | 0.5 |
| Predictions | prediction.service | Engagement and revenue forecasting | 0.3 |
## Usage Tracking

All AI calls are tracked per workspace via `incrementUsage()`:

```typescript
// After each successful AI call
await incrementUsage(workspaceId, 'ai_generations');
```

Usage is stored in the `WorkspaceUsage` model and checked by the `requireQuota` middleware before each AI request. The `AiPromptLog` model records individual API calls for debugging and cost analysis:

| Field | Purpose |
|---|---|
| `workspaceId` | Workspace that made the call |
| `promptType` | Type of prompt (caption, hashtag, reply, etc.) |
| `tokens` | Token count (input + output) |
| `model` | Model used (`gemini-2.5-flash`) |
| `latencyMs` | Response time |
| `createdAt` | Timestamp |
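The quota flow (check in `requireQuota` before the call, increment only after success) can be sketched with an in-memory store. The `Map` below is a stand-in for the Prisma-backed `WorkspaceUsage` model, and the limit parameter is an assumption (in practice it would come from the workspace's plan):

```typescript
// In-memory stand-in for the WorkspaceUsage model (assumption: the real
// implementation reads/writes these counters via Prisma).
const usage = new Map<string, number>();

function getUsage(workspaceId: string, metric: string): number {
  return usage.get(`${workspaceId}:${metric}`) ?? 0;
}

// requireQuota-style check: runs BEFORE the AI call and rejects the
// request if the workspace is at its plan limit.
function checkQuota(workspaceId: string, metric: string, limit: number): boolean {
  return getUsage(workspaceId, metric) < limit;
}

// incrementUsage(): called only AFTER a successful AI call, so failed
// calls never consume quota.
function incrementUsage(workspaceId: string, metric: string): void {
  const key = `${workspaceId}:${metric}`;
  usage.set(key, (usage.get(key) ?? 0) + 1);
}
```

The ordering is the important part mirrored from the docs: check first, call Gemini, then increment on success.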
## Prompt Type Index (PROMPT_CONFIGS)

All prompt types are indexed in `PROMPT_CONFIGS` for consistent configuration:

```typescript
const PROMPT_CONFIGS = {
  caption_generate: { model: 'gemini-2.5-flash', temperature: 0.8, maxTokens: 2000 },
  caption_rewrite:  { model: 'gemini-2.5-flash', temperature: 0.7, maxTokens: 2000 },
  hashtag_generate: { model: 'gemini-2.5-flash', temperature: 0.5, maxTokens: 500 },
  cta_generate:     { model: 'gemini-2.5-flash', temperature: 0.7, maxTokens: 500 },
  translate:        { model: 'gemini-2.5-flash', temperature: 0.3, maxTokens: 2000 },
  intent_classify:  { model: 'gemini-2.5-flash', temperature: 0.2, maxTokens: 200 },
  conversation:     { model: 'gemini-2.5-flash', temperature: 0.6, maxTokens: 1000 },
  // ... more prompt types
};
```
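A small lookup helper keeps call sites from hard-coding model and temperature values. This is a sketch, not the service's actual code; the fallback defaults for unregistered prompt types are assumptions:

```typescript
interface PromptConfig {
  model: string;
  temperature: number;
  maxTokens: number;
}

// Subset of PROMPT_CONFIGS, typed for the example.
const PROMPT_CONFIGS: Record<string, PromptConfig> = {
  caption_generate: { model: 'gemini-2.5-flash', temperature: 0.8, maxTokens: 2000 },
  intent_classify:  { model: 'gemini-2.5-flash', temperature: 0.2, maxTokens: 200 },
};

// Resolve a prompt type to its config, falling back to a conservative
// default for unregistered types (the default values are assumptions).
function getPromptConfig(type: string): PromptConfig {
  return (
    PROMPT_CONFIGS[type] ?? {
      model: 'gemini-2.5-flash',
      temperature: 0.5,
      maxTokens: 1000,
    }
  );
}
```

Centralizing the lookup also gives one place to log the `promptType` and `model` fields that `AiPromptLog` records.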
## Rate Limiting & Cost Control
| Control | Implementation |
|---|---|
| Per-IP rate limiting | aiLimiter in rateLimiter.ts |
| Per-workspace quotas | requireQuota('ai_generations') middleware |
| Plan-based limits | PlanFeature defines max AI calls per billing period |
| Usage dashboard | Admin panel shows AI consumption per workspace |
| Cost tracking | AiPromptLog records tokens for cost analysis |
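The per-IP layer (`aiLimiter`) is presumably an express-rate-limit style middleware; its counting logic can be sketched as a fixed-window limiter. The window size and limit below are assumptions for illustration:

```typescript
// Minimal fixed-window rate limiter keyed by IP (or any string key).
// Only the counting logic is shown; the real aiLimiter wraps this kind
// of check in Express middleware.
class FixedWindowLimiter {
  private counts = new Map<string, { windowStart: number; count: number }>();

  constructor(
    private readonly limit: number,
    private readonly windowMs: number,
  ) {}

  allow(key: string, now: number = Date.now()): boolean {
    const entry = this.counts.get(key);
    // Start a fresh window if none exists or the current one has expired.
    if (!entry || now - entry.windowStart >= this.windowMs) {
      this.counts.set(key, { windowStart: now, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}
```

Note how this layer is independent of `requireQuota`: the limiter throttles bursts per IP, while quotas cap total per-workspace consumption per billing period.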
## Environment Variables

| Variable | Required | Description |
|---|---|---|
| `GEMINI_API_KEY` | Yes (for AI features) | Google AI API key from ai.google.dev |

`GEMINI_API_KEY` must never be exposed to the frontend. All AI calls go through the backend API, which adds authentication, rate limiting, and usage tracking before calling Gemini.
## Related Documentation

- Conversation Engine -- ICE 3-step LLM pipeline
- Prompt Templates -- prompt engineering patterns
- Embeddings -- vector embedding details
- Memory System -- conversation memory tiers
- Services -- AI service layer
- Queue System -- async AI queues