pinecone

Store and retrieve vectors from the Pinecone vector database.

The Pinecone module provides functions for vector operations including upsert, query, fetch, and delete. This enables semantic search, similarity matching, and RAG (Retrieval Augmented Generation) capabilities in workflows.

Overview

Pinecone is a managed vector database optimized for similarity search. The module automatically:

Organizes vectors by organization/environment namespace
Handles authentication and API communication
Supports both entity argument and environment variable configuration

Namespace Format:

{namespacePrefix}/{organizationId}/{environmentId}/{customNamespace}
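
To make the format concrete, the sketch below joins the four segments into one namespace string. `buildNamespace` is a hypothetical helper, not a function exported by the module, and skipping empty segments (for example when no prefix or custom namespace is set) is an assumption about how the module behaves.

```javascript
// Hypothetical helper: illustrates the namespace format only.
// Assumption: empty or missing segments are skipped instead of leaving "//".
function buildNamespace({ prefix, organizationId, environmentId, customNamespace }) {
  return [prefix, organizationId, environmentId, customNamespace]
    .filter((segment) => typeof segment === 'string' && segment.length > 0)
    .join('/');
}

const full = buildNamespace({
  prefix: 'prod',
  organizationId: 'org-1',
  environmentId: 'env-2',
  customNamespace: 'events',
});
console.log(full); // prod/org-1/env-2/events
```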

Configuration

Pinecone can be configured via entity arguments (recommended) or environment variables.

Entity Arguments (Preferred)

Set these as arguments on your workflow entity:

Argument Name	Description
`pineconeApiKey`	Your Pinecone API key
`pineconeIndexHost`	Index host URL (e.g., `my-index-abc123.svc.us-east1-gcp.pinecone.io`)
`pineconeNamespacePrefix`	Optional prefix for namespaces

Example Entity Configuration:

{
  "arguments": [
    {
      "argumentName": "pineconeApiKey",
      "argumentValue": "{{PINECONE_API_KEY}}",
      "argumentDescription": "Pinecone API key from environment"
    },
    {
      "argumentName": "pineconeIndexHost",
      "argumentValue": "{{PINECONE_INDEX_HOST}}",
      "argumentDescription": "Pinecone index host"
    }
  ]
}

Environment Variables (Fallback)

Variable	Description
`PINECONE_API_KEY`	Your Pinecone API key
`PINECONE_INDEX_HOST`	Index host URL
`PINECONE_NAMESPACE_PREFIX`	Optional namespace prefix

Entity arguments take precedence over environment variables.
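
The precedence rule can be pictured with a small lookup sketch. `resolvePineconeConfig` is a hypothetical function written for illustration; the module's actual internals may differ, and treating an empty string as "not set" is an assumption.

```javascript
// Hypothetical sketch of the lookup order: entity argument first, then env var.
function resolvePineconeConfig(entityArgs, env) {
  // Entity arguments arrive as [{ argumentName, argumentValue }, ...].
  const args = Object.fromEntries(
    (entityArgs || []).map((a) => [a.argumentName, a.argumentValue])
  );
  return {
    apiKey: args.pineconeApiKey || env.PINECONE_API_KEY,
    indexHost: args.pineconeIndexHost || env.PINECONE_INDEX_HOST,
    // The prefix is optional, so it falls back to an empty string.
    namespacePrefix: args.pineconeNamespacePrefix || env.PINECONE_NAMESPACE_PREFIX || '',
  };
}

const config = resolvePineconeConfig(
  [{ argumentName: 'pineconeApiKey', argumentValue: 'key-from-entity' }],
  { PINECONE_API_KEY: 'key-from-env', PINECONE_INDEX_HOST: 'my-index.example' }
);
console.log(config.apiKey);    // key-from-entity
console.log(config.indexHost); // my-index.example
```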

pineconeUpsert

Insert or update vectors in Pinecone.

Signature

await pineconeUpsert(vectors, namespace)

Description

Upserts one or more vectors to Pinecone. Each vector must have an id and values (the embedding); metadata is optional.

Parameters

Parameter	Type	Description
`vectors`	object[]	Array of vector objects
`namespace`	string	Optional custom namespace (appended to org/env)

Vector Object Structure:

{
  id: string;           // Unique identifier
  values: number[];     // Embedding vector (float array)
  metadata?: object;    // Optional metadata for filtering
}

Returns

Promise<{upsertedCount: number}> — Number of vectors upserted.

Example

// Generate embedding for content
const embedding = await createEmbedding(message.content);

// Upsert single vector
await pineconeUpsert([{
  id: `event-${message.eventId}`,
  values: embedding,
  metadata: {
    type: message.type,
    timestamp: Date.now(),
    content: message.content.substring(0, 500) // Store preview
  }
}]);

// Upsert multiple vectors
const vectors = messages.map(msg => ({
  id: `msg-${msg.id}`,
  values: msg.embedding,
  metadata: { source: msg.source }
}));
await pineconeUpsert(vectors, 'messages');
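
When the vector array is large, it may help to send it in several smaller calls. The following is a minimal sketch under stated assumptions: `chunk` and `upsertInBatches` are hypothetical helpers, the batch size of 100 is an arbitrary starting point and not a documented limit of this module, and the upsert function is passed in so the sketch can run outside a workflow.

```javascript
// Split an array into consecutive batches of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const batches = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// Upsert batch by batch and sum the reported counts.
// `upsert` is injected; in a workflow it would be pineconeUpsert.
async function upsertInBatches(upsert, vectors, namespace, batchSize = 100) {
  let total = 0;
  for (const batch of chunk(vectors, batchSize)) {
    const { upsertedCount } = await upsert(batch, namespace);
    total += upsertedCount;
  }
  return total;
}

console.log(JSON.stringify(chunk([1, 2, 3, 4, 5], 2))); // [[1,2],[3,4],[5]]
```

In a workflow this would be called as `await upsertInBatches(pineconeUpsert, vectors, 'messages')`.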

pineconeQuery

Query Pinecone for similar vectors.

Signature

await pineconeQuery(vector, topK, namespace, filter)

Description

Finds the vectors most similar to the query vector and returns scored matches with metadata. Scores are cosine similarities, which assumes the Pinecone index was created with the cosine metric.
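
For readers unfamiliar with the metric, here is a small sketch of how a cosine similarity score is computed. Pinecone performs this on the server; `cosineSimilarity` is a hypothetical local helper shown only to explain what the `score` value means, and returning 0 for zero-length vectors is an assumption.

```javascript
// Cosine similarity: dot(a, b) / (|a| * |b|). 1 means same direction.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have the same dimension');
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  // Assumption: a zero vector has no direction, so report no similarity.
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

console.log(cosineSimilarity([1, 0], [1, 0]));  // 1
console.log(cosineSimilarity([1, 0], [0, 1]));  // 0
console.log(cosineSimilarity([1, 0], [-1, 0])); // -1
```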

Parameters

Parameter	Type	Description
`vector`	number[]	Query embedding vector
`topK`	number	Number of results to return (default: 10)
`namespace`	string	Optional custom namespace (appended to org/env)
`filter`	object	Optional metadata filter conditions

Returns

Promise<{matches: Array<{id: string, score: number, metadata?: object}>}> — Scored matches.

Example

// Simple similarity search
const queryEmbedding = await createEmbedding('touchdown celebration');
const results = await pineconeQuery(queryEmbedding, 5);

for (const match of results.matches) {
  print(`${match.id}: ${match.score.toFixed(3)}`);
  print(`  Content: ${match.metadata?.content}`);
}

// Filtered search
const filtered = await pineconeQuery(queryEmbedding, 10, '', {
  type: { $eq: 'touchdown' },
  timestamp: { $gt: Date.now() - 86400000 } // Last 24 hours
});

// Search specific namespace
const archived = await pineconeQuery(queryEmbedding, 3, 'archive-2024');

pineconeFetch

Fetch vectors by their IDs.

Signature

await pineconeFetch(ids, namespace)

Description

Retrieves specific vectors by their IDs. More efficient than querying when you know the exact IDs.

Parameters

Parameter	Type	Description
`ids`	string[]	Array of vector IDs to fetch
`namespace`	string	Optional custom namespace

Returns

Promise<{vectors: Record<string, {id: string, values: number[], metadata?: object}>}> — Map of ID to vector data.

Example

// Fetch specific vectors
const result = await pineconeFetch([
  'event-12345',
  'event-12346',
  'event-12347'
]);

for (const [id, vector] of Object.entries(result.vectors)) {
  print(`${id}: ${vector.metadata?.content}`);
}

// Check if vector exists
const existing = await pineconeFetch([`event-${message.eventId}`]);
if (Object.keys(existing.vectors).length > 0) {
  print('Vector already exists');
}
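
Building on the existence check, the sketch below reports which of several requested IDs were not returned. `findMissingIds` is a hypothetical helper, and it assumes (as the existence check above does) that IDs with no stored vector are simply absent from `vectors`.

```javascript
// Return the requested IDs that are absent from a fetch result.
function findMissingIds(requestedIds, fetchResult) {
  const found = new Set(Object.keys(fetchResult.vectors || {}));
  return requestedIds.filter((id) => !found.has(id));
}

// Sample result shaped like the documented return value of pineconeFetch.
const sampleResult = {
  vectors: { 'event-12345': { id: 'event-12345', values: [0.1, 0.2] } },
};
console.log(findMissingIds(['event-12345', 'event-12346'], sampleResult)); // [ 'event-12346' ]
```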

pineconeDelete

Delete vectors from Pinecone by ID.

Signature

await pineconeDelete(ids, namespace)

Description

Deletes specific vectors by their IDs.

Parameters

Parameter	Type	Description
`ids`	string[]	Array of vector IDs to delete
`namespace`	string	Optional custom namespace

Returns

Promise<{}> — Empty object on success.

Example

// Delete specific vectors
await pineconeDelete(['event-12345', 'event-12346']);

// Delete from specific namespace
await pineconeDelete(['event-99999'], 'temp-data');

Use Cases

1. Semantic Search for Context

Find relevant historical content for prompt enrichment:

// Generate embedding for current event
const queryEmbedding = await createEmbedding(
  `${message.payload.event.description} ${message.payload.event.play_type}`
);

// Find similar past events
const similar = await pineconeQuery(queryEmbedding, 3);

// Build context for prompt
let context = 'Similar past events:\n';
for (const match of similar.matches) {
  context += `- ${match.metadata?.content}\n`;
}

// Use context in prompt
var prompt = `
Based on these similar past events:
${context}

Generate a social media post for: ${message.payload.event.description}
`;

2. Deduplication with Similarity

Prevent near-duplicate content:

const embedding = await createEmbedding(message.content);

// Check for similar existing content
const similar = await pineconeQuery(embedding, 1);

if (similar.matches.length > 0 && similar.matches[0].score > 0.95) {
  print('Content too similar to existing:', similar.matches[0].id);
  return; // Skip processing
}

// Content is unique enough, proceed
await pineconeUpsert([{
  id: `content-${message.id}`,
  values: embedding,
  metadata: { content: message.content }
}]);

3. RAG (Retrieval Augmented Generation)

Enhance AI prompts with relevant documentation:

// User's question
const question = message.payload.question;
const questionEmbedding = await createEmbedding(question);

// Search documentation
const docs = await pineconeQuery(questionEmbedding, 5, 'documentation');

// Build RAG context
let context = '';
for (const doc of docs.matches) {
  context += `${doc.metadata?.title}:\n${doc.metadata?.content}\n\n`;
}

// Generate answer with context
var prompt = `
Answer the following question using only the provided context.

Context:
${context}

Question: ${question}

Answer:
`;

await executePromptWithModel();
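
One detail worth guarding against: when a match has no `title` or `content` in its metadata, the template string above writes the literal text "undefined" into the prompt. The sketch below is one possible way to build the context more defensively. `buildContext` and its `maxChars` budget are hypothetical, and falling back to the match ID as a title is an assumption.

```javascript
// Build a context string from matches, stopping at the first entry
// that would push the total length past maxChars.
function buildContext(matches, maxChars = 4000) {
  let context = '';
  for (const match of matches) {
    const content = match.metadata?.content;
    if (!content) continue; // skip matches stored without content
    const title = match.metadata?.title ?? match.id; // assumption: ID as fallback title
    const entry = `${title}:\n${content}\n\n`;
    if (context.length + entry.length > maxChars) break;
    context += entry;
  }
  return context;
}

const sampleMatches = [
  { id: 'a', score: 0.9, metadata: { title: 'Setup', content: 'Install the CLI.' } },
  { id: 'b', score: 0.8 }, // no metadata: skipped
];
console.log(buildContext(sampleMatches));
```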

4. Content Indexing Pipeline

Index new content as it arrives:

// Only index certain event types
if (!['touchdown', 'field_goal', 'interception'].includes(message.payload.event.play_type)) {
  return;
}

// Generate embedding
const content = `${message.payload.event.description}. ${message.payload.event.player?.name || ''} ${message.payload.event.team?.name || ''}`;
const embedding = await createEmbedding(content);

// Upsert to Pinecone
await pineconeUpsert([{
  id: `event-${message.payload.event.id}`,
  values: embedding,
  metadata: {
    playType: message.payload.event.play_type,
    gameId: message.payload.game.id,
    timestamp: message.payload.timestamp,
    content: content
  }
}], 'events');

print('Indexed event:', message.payload.event.id);

Filter Syntax

Pinecone supports metadata filtering with these operators:

Operator	Description	Example
`$eq`	Equal	`{ type: { $eq: 'touchdown' } }`
`$ne`	Not equal	`{ type: { $ne: 'timeout' } }`
`$gt`	Greater than	`{ score: { $gt: 7 } }`
`$gte`	Greater than or equal	`{ quarter: { $gte: 3 } }`
`$lt`	Less than	`{ timestamp: { $lt: 1234567890 } }`
`$lte`	Less than or equal	`{ quarter: { $lte: 2 } }`
`$in`	In array	`{ type: { $in: ['touchdown', 'field_goal'] } }`
`$nin`	Not in array	`{ type: { $nin: ['timeout', 'penalty'] } }`

Combining Filters:

// Arguments are positional: vector, topK, namespace, filter
const results = await pineconeQuery(embedding, 10, '', {
  $and: [
    { type: { $eq: 'touchdown' } },
    { quarter: { $gte: 3 } },
    { team: { $in: ['Chiefs', 'Eagles'] } }
  ]
});
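
Filters are plain objects, so they can be built by ordinary functions and reused. As a sketch, the hypothetical helper `recentEventsFilter` below combines `$in` and `$gt` under `$and`. The field names `type` and `timestamp` follow the examples in this document and are assumptions about your metadata.

```javascript
// Build a filter matching any of `types` newer than `windowMs` milliseconds.
// `now` is a parameter so the result is reproducible in tests.
function recentEventsFilter(types, windowMs, now = Date.now()) {
  return {
    $and: [
      { type: { $in: types } },
      { timestamp: { $gt: now - windowMs } },
    ],
  };
}

const dayMs = 24 * 60 * 60 * 1000;
const recentFilter = recentEventsFilter(['touchdown', 'field_goal'], dayMs, 1700000000000);
console.log(JSON.stringify(recentFilter));
// {"$and":[{"type":{"$in":["touchdown","field_goal"]}},{"timestamp":{"$gt":1699913600000}}]}
```

The result would be passed as the fourth argument, for example `await pineconeQuery(embedding, 10, '', recentFilter)`.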

Best Practices

Embedding Dimensions

Ensure your embeddings match the Pinecone index dimension:

Amazon Titan Text Embeddings V2: 1024 dimensions (default)
OpenAI text-embedding-3-small: 1536 dimensions
OpenAI text-embedding-ada-002: 1536 dimensions
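
A dimension mismatch otherwise only shows up as a rejected request, so a local check before upserting can save a round trip. This is a minimal sketch: `assertDimensions` is a hypothetical helper, and the expected dimension must be supplied by the caller because this document does not describe a way to read it from the index.

```javascript
// Throw if any vector's embedding length differs from the index dimension.
function assertDimensions(vectors, expectedDimension) {
  for (const v of vectors) {
    const actual = Array.isArray(v.values) ? v.values.length : 'none';
    if (actual !== expectedDimension) {
      throw new Error(
        `Vector ${v.id} has dimension ${actual}, expected ${expectedDimension}`
      );
    }
  }
}

// Passes silently: both vectors have 3 values.
assertDimensions(
  [
    { id: 'a', values: [0.1, 0.2, 0.3] },
    { id: 'b', values: [0.4, 0.5, 0.6] },
  ],
  3
);
```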

Metadata Guidelines

Store searchable fields — Fields you'll filter on
Include content previews — For debugging and display
Add timestamps — For time-based queries
Keep metadata small — Pinecone has metadata size limits
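
To keep an eye on metadata size, the serialized byte length can be estimated locally. The helpers below are hypothetical, the JSON byte count is only an approximation of how Pinecone measures metadata, and the limit is a caller-supplied parameter because the exact value should be taken from Pinecone's current documentation.

```javascript
// Approximate metadata size as the UTF-8 byte length of its JSON form.
function metadataSizeBytes(metadata) {
  return new TextEncoder().encode(JSON.stringify(metadata)).length;
}

// True when the metadata fits within the caller-supplied byte limit.
function fitsMetadataLimit(metadata, maxBytes) {
  return metadataSizeBytes(metadata) <= maxBytes;
}

const sampleMetadata = { type: 'touchdown', content: 'x'.repeat(500) };
console.log(metadataSizeBytes(sampleMetadata));
console.log(fitsMetadataLimit(sampleMetadata, 1024)); // true
```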

Namespace Organization

{prefix}/
  {org}/{env}/
    events/           # Event embeddings
    documentation/    # RAG documents
    responses/        # Generated content

Error Handling

try {
  await pineconeUpsert(vectors);
} catch (error) {
  print('Pinecone error:', error.message);
  // Common errors:
  // - "Pinecone API key and index host are required"
  // - "message.organizationId and message.environmentId are required"
  // - HTTP errors (401 unauthorized, 400 bad request)
}
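
The common errors listed above can be sorted into categories so a workflow can decide whether to stop or skip. This is a sketch only: `classifyPineconeError` is a hypothetical helper, it matches on the message strings quoted above, and it assumes HTTP status codes appear in the error message, which may not hold for every failure.

```javascript
// Map an error to a coarse category based on its message text.
function classifyPineconeError(error) {
  const message = String(error?.message ?? error);
  if (message.includes('API key and index host are required')) {
    return 'missing-config';
  }
  if (message.includes('organizationId and message.environmentId are required')) {
    return 'missing-message-context';
  }
  // Assumption: HTTP failures include the status code in the message.
  if (/\b401\b/.test(message)) return 'unauthorized';
  if (/\b400\b/.test(message)) return 'bad-request';
  return 'unknown';
}

console.log(classifyPineconeError(new Error('Pinecone API key and index host are required'))); // missing-config
console.log(classifyPineconeError(new Error('Request failed with status 401'))); // unauthorized
```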

Related Topics

prompt - createEmbedding: Generate embeddings
STM: Short-term memory storage
Consumer Evaluator: Workflow execution
