
Embedding Providers

Configure embedding providers for semantic search — OpenAI, Ollama, and Google — and understand how embeddings power agent memory.

How Embeddings Work

Embeddings convert text into dense numerical vectors that capture semantic meaning. Two pieces of text about the same topic will have vectors that are close together in high-dimensional space, even if they use different words.
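"Close together" is usually measured with cosine similarity: 1 means the vectors point the same way, 0 means they are unrelated. As an illustration (this helper is not part of the Cogitator API), a minimal implementation:

```typescript
// Cosine similarity between two vectors of equal length.
// 1 = identical direction, 0 = orthogonal (semantically unrelated).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('dimension mismatch');
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0])); // 1
console.log(cosineSimilarity([1, 0], [0, 1])); // 0
```

Vector stores apply the same measure at scale, but the math is no more than this.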

In Cogitator, embeddings enable:

  • Semantic search over stored documents and past conversations
  • Hybrid search combining vector similarity with keyword matching
  • Knowledge graph node search by meaning rather than exact name
  • Context building that retrieves the most relevant memories for the current query

All embedding providers implement the EmbeddingService interface:

interface EmbeddingService {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  readonly dimensions: number;
  readonly model: string;
}
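Because the interface is small, it is easy to satisfy with a deterministic fake for unit tests or offline development. A sketch, assuming nothing beyond the interface above (the character-code hashing scheme is invented for illustration and produces stable vectors, not semantically meaningful ones):

```typescript
interface EmbeddingService {
  embed(text: string): Promise<number[]>;
  embedBatch(texts: string[]): Promise<number[][]>;
  readonly dimensions: number;
  readonly model: string;
}

// Deterministic fake: the same text always maps to the same vector.
class FakeEmbeddingService implements EmbeddingService {
  readonly dimensions = 8;
  readonly model = 'fake-embedding';

  vectorFor(text: string): number[] {
    const vec: number[] = new Array(this.dimensions).fill(0);
    for (let i = 0; i < text.length; i++) {
      vec[i % this.dimensions] += text.charCodeAt(i);
    }
    const norm = Math.sqrt(vec.reduce((s, x) => s + x * x, 0)) || 1;
    return vec.map((x) => x / norm); // unit-length, like real embeddings
  }

  async embed(text: string): Promise<number[]> {
    return this.vectorFor(text);
  }

  async embedBatch(texts: string[]): Promise<number[][]> {
    return texts.map((t) => this.vectorFor(t));
  }
}
```

Anything implementing this shape can be dropped into code that expects an EmbeddingService, which keeps tests fast and free of API keys.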

Factory Function

Use createEmbeddingService to instantiate a provider from a config object:

import { createEmbeddingService } from '@cogitator-ai/memory';

const embeddings = createEmbeddingService({
  provider: 'openai',
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
});

const vector = await embeddings.embed('What is Cogitator?');
console.log(vector.length); // 1536

OpenAI

Uses the OpenAI embeddings API. Supports text-embedding-3-small (1536 dimensions) and text-embedding-3-large (3072 dimensions).

import { OpenAIEmbeddingService } from '@cogitator-ai/memory';

const embeddings = new OpenAIEmbeddingService({
  apiKey: process.env.OPENAI_API_KEY!,
  model: 'text-embedding-3-small',
});

const vector = await embeddings.embed('Agent memory architecture');
console.log(embeddings.dimensions); // 1536
console.log(embeddings.model); // 'text-embedding-3-small'

Batch embedding sends all texts in a single API call:

const vectors = await embeddings.embedBatch([
  'First document about agents',
  'Second document about tools',
  'Third document about memory',
]);
// vectors.length === 3, each vector.length === 1536
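Provider APIs cap how many inputs one request may carry (OpenAI's embeddings endpoint documents a limit of 2048 inputs per call), so large corpora need client-side splitting. A sketch of the splitting logic; the `chunk` helper is not part of the library:

```typescript
// Split an array into chunks of at most `size` elements.
function chunk<T>(items: T[], size: number): T[][] {
  const out: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    out.push(items.slice(i, i + size));
  }
  return out;
}

// Hypothetical usage with any EmbeddingService:
// const batches = chunk(documents, 2048);
// const vectors = (await Promise.all(batches.map((b) => embeddings.embedBatch(b)))).flat();
```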

Option  | Type   | Default                    | Description
apiKey  | string | (required)                 | OpenAI API key
model   | string | text-embedding-3-small     | Embedding model name
baseUrl | string | https://api.openai.com/v1  | API base URL (for proxies)

Dimensions by model:

Model                  | Dimensions
text-embedding-3-small | 1536
text-embedding-3-large | 3072

Ollama

Run embedding models locally with zero API costs. Ollama's /api/embed endpoint supports both single and batch embedding natively.

import { OllamaEmbeddingService } from '@cogitator-ai/memory';

const embeddings = new OllamaEmbeddingService({
  model: 'nomic-embed-text',
  baseUrl: 'http://localhost:11434',
});

const vector = await embeddings.embed('Local embedding example');
console.log(embeddings.dimensions); // 768

Pull a model first:

ollama pull nomic-embed-text

Option  | Type   | Default                | Description
model   | string | nomic-embed-text       | Ollama embedding model
baseUrl | string | http://localhost:11434 | Ollama server URL

Supported models and dimensions:

Model                   | Dimensions
nomic-embed-text        | 768
nomic-embed-text-v2-moe | 768
mxbai-embed-large       | 1024
all-minilm              | 384
snowflake-arctic-embed  | 1024

For models not in this list, the service defaults to 768 dimensions. You can use any Ollama-compatible embedding model.
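Under the hood, the /api/embed endpoint accepts either a single string or an array as its input field. A sketch of the raw HTTP exchange, assuming a local Ollama server (the wrapper below is illustrative, not the library's implementation; field names follow Ollama's API):

```typescript
// Build the /api/embed request body. Ollama accepts a string or string[] as `input`.
function buildEmbedRequest(model: string, input: string | string[]) {
  return { model, input };
}

// Illustrative raw call (requires Ollama running and the model pulled):
async function embedViaOllama(
  baseUrl: string,
  model: string,
  texts: string[],
): Promise<number[][]> {
  const res = await fetch(`${baseUrl}/api/embed`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildEmbedRequest(model, texts)),
  });
  if (!res.ok) throw new Error(`Ollama error: ${res.status}`);
  const data = await res.json();
  return data.embeddings; // response field per Ollama's /api/embed
}
```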

Google

Uses the Gemini text-embedding-004 model via the Google AI API.

import { GoogleEmbeddingService } from '@cogitator-ai/memory';

const embeddings = new GoogleEmbeddingService({
  apiKey: process.env.GOOGLE_API_KEY!,
  model: 'text-embedding-004',
});

const vector = await embeddings.embed('Google embedding example');
console.log(embeddings.dimensions); // 768

Batch embedding uses Google's batchEmbedContents endpoint:

const vectors = await embeddings.embedBatch(['First passage', 'Second passage']);

Option | Type   | Default            | Description
apiKey | string | (required)         | Google AI API key
model  | string | text-embedding-004 | Embedding model name

Dimensions: text-embedding-004 produces 768-dimensional vectors.
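For reference, batchEmbedContents wraps each text in a content/parts structure and repeats the model name per entry. A sketch of the request body, based on Google's REST API shape (the builder itself is hypothetical, not part of the library):

```typescript
// Build the batchEmbedContents request body. Each entry names the model
// (prefixed with "models/") and wraps the text in content.parts.
function buildBatchEmbedBody(model: string, texts: string[]) {
  return {
    requests: texts.map((text) => ({
      model: `models/${model}`,
      content: { parts: [{ text }] },
    })),
  };
}
```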

Using with Memory Adapters

Embeddings are stored via EmbeddingAdapter (implemented by PostgresAdapter, QdrantAdapter, and InMemoryEmbeddingAdapter).

import { PostgresAdapter, OpenAIEmbeddingService } from '@cogitator-ai/memory';

const db = new PostgresAdapter({
  provider: 'postgres',
  connectionString: process.env.DATABASE_URL!,
});
await db.connect();

const embeddings = new OpenAIEmbeddingService({
  apiKey: process.env.OPENAI_API_KEY!,
});

const text = 'Cogitator agents can use tools to interact with external systems.';
const vector = await embeddings.embed(text);

await db.addEmbedding({
  sourceId: 'doc-1',
  sourceType: 'document',
  vector,
  content: text,
});

const queryVector = await embeddings.embed('How do agents use tools?');
const { data: results } = await db.search({
  vector: queryVector,
  limit: 5,
  threshold: 0.7,
});

for (const result of results) {
  console.log(`${result.content} (score: ${result.score.toFixed(3)})`);
}

Dimension Matching

The vector dimensions of your embedding model must match the adapter configuration:

  • PostgresAdapter defaults to 768. Call setVectorDimensions(n) before connect() if using a different model.
  • QdrantAdapter requires dimensions in the constructor. The collection is auto-created with the specified size.
  • InMemoryEmbeddingAdapter handles any dimension since it uses in-process cosine similarity.
For example, to match text-embedding-3-small (1536 dimensions):

const db = new PostgresAdapter({
  provider: 'postgres',
  connectionString: process.env.DATABASE_URL!,
});
db.setVectorDimensions(1536);
await db.connect();
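A mismatch here typically surfaces only when the first vector is written, so a guard at startup is cheap insurance. The helper below is hypothetical, not part of the library:

```typescript
// Fail fast if the embedding model and vector store disagree on dimensions.
function assertDimensionsMatch(serviceDimensions: number, adapterDimensions: number): void {
  if (serviceDimensions !== adapterDimensions) {
    throw new Error(
      `Embedding dimension mismatch: model produces ${serviceDimensions}-d vectors, ` +
        `but the store is configured for ${adapterDimensions}`,
    );
  }
}

// Hypothetical usage, before any writes:
// assertDimensionsMatch(embeddings.dimensions, 1536);
```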

Provider Comparison

Provider | Dimensions  | Latency | Cost                | Privacy
OpenAI   | 1536 / 3072 | ~100ms  | Pay per token       | Data sent to API
Ollama   | 384–1024    | ~50ms   | Free                | Fully local
Google   | 768         | ~100ms  | Free tier available | Data sent to API

Choose Ollama for local-first deployments with zero data egress, OpenAI for the highest-quality embeddings at scale, and Google for a good balance with a generous free tier.
