Time-Weighted Retrieval Integration

Abstract

A LangChain JavaScript reference for the TimeWeightedVectorStoreRetriever — a retriever that scores documents by combining vector similarity with temporal recency. The scoring formula is score = (1 - decayRate)^hoursPassed + vectorRelevance, where hoursPassed is the time since the document was last accessed (not created). This access-tracking semantics means frequently retrieved documents remain fresh regardless of age, implementing an LRU-like memory that rewards active knowledge over stale-but-relevant knowledge. The configurable decayRate parameter (0–1) controls the memory horizon: values near 0 retain long memory, values near 1 strongly favor recently-accessed items. Setting decayRate to exactly 0 or 1 degenerates the retriever to standard vector similarity. A key constraint is that documents must be added via retriever.addDocuments() — not directly to the backing vector store — to populate the access history metadata required for scoring.


Key Concepts

  • Time-Weighted Scoring: score = (1 - decayRate)^hoursPassed + vectorRelevance — an additive combination of a decaying recency bonus and static vector similarity
  • Last-Accessed vs. Created Timestamp: The recency component uses time since last retrieval, not time since insertion. Frequently accessed documents stay fresh, while documents that are never retrieved lose their recency bonus no matter how recently they were added
  • Decay Rate: Configurable float in [0, 1]. Near 0 → long memory horizon (documents stay relevant for a long time). Near 1 → short memory horizon (only recently accessed documents receive a recency bonus)
  • Boundary Conditions: decayRate = 0 → score = 1 + vectorRelevance (constant recency bonus, effectively pure vector search). decayRate = 1 → score = 0 + vectorRelevance (no recency bonus, pure vector search)
  • addDocuments via Retriever: Documents must be added through the retriever instance, not the underlying vector store, to ensure the retriever’s access history metadata (memoryStream) is initialized correctly
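
The scoring formula and its boundary conditions can be checked with a few lines of plain JavaScript. This is a minimal sketch that re-implements the documented formula for illustration; it is not LangChain's source code, and the function name timeWeightedScore and the sample relevance value are assumptions made for this example.

```javascript
// Illustrative re-implementation of the documented formula (not LangChain's code).
// score = (1 - decayRate)^hoursPassed + vectorRelevance
function timeWeightedScore(decayRate, hoursPassed, vectorRelevance) {
  const recencyBonus = Math.pow(1 - decayRate, hoursPassed);
  return recencyBonus + vectorRelevance;
}

const relevance = 0.8; // hypothetical similarity score for one document

// Compare a document accessed 1 hour ago with one accessed 48 hours ago.
for (const decayRate of [0, 0.01, 0.5, 1]) {
  const fresh = timeWeightedScore(decayRate, 1, relevance);
  const stale = timeWeightedScore(decayRate, 48, relevance);
  console.log(
    `decayRate=${decayRate}: 1h=${fresh.toFixed(3)}, 48h=${stale.toFixed(3)}`
  );
}
```

At decayRate = 0 and decayRate = 1 the two columns are identical, which is the sense in which the boundary values reduce to pure vector search: the recency term no longer distinguishes between documents.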

Key Claims and Findings

  • Time-weighted retrieval rewards access frequency, not age alone — this is a maintenance-friendly pattern for agent working memory where active context naturally persists
  • Exact boundary values (0 or 1) collapse the retriever to standard vector similarity, making the time-weighting component optional and backward-compatible
  • Access metadata initialization is non-obvious: using the vector store’s addDocuments directly bypasses the retriever’s metadata setup, silently producing incorrect scores
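
The access-tracking behaviour described above can be illustrated with a toy in-memory model. This sketch does not use LangChain at all: the rank function, the lastAccessedAt and relevance fields, and the sample documents are all hypothetical, and relevance stands in for a fixed vector similarity to a single query.

```javascript
// Toy model of last-accessed tracking; all names here are hypothetical.
const HOUR_MS = 60 * 60 * 1000;

function rank(memory, now, decayRate, k) {
  const scored = memory.map((doc) => {
    const hoursPassed = (now - doc.lastAccessedAt) / HOUR_MS;
    const score = Math.pow(1 - decayRate, hoursPassed) + doc.relevance;
    return { doc, score };
  });
  scored.sort((a, b) => b.score - a.score);
  const top = scored.slice(0, k);
  // Retrieval refreshes the access timestamp of every returned document.
  for (const { doc } of top) doc.lastAccessedAt = now;
  return top.map(({ doc }) => doc.id);
}

// Both documents were added at hour 0.
const memory = [
  { id: "old-but-used", relevance: 0.6, lastAccessedAt: 0 },
  { id: "old-and-idle", relevance: 0.7, lastAccessedAt: 0 },
];

// Hour 10: only "old-but-used" is accessed (simulated by refreshing it directly).
memory[0].lastAccessedAt = 10 * HOUR_MS;

// Hour 12: the recently accessed document wins despite its lower relevance.
const winner = rank(memory, 12 * HOUR_MS, 0.1, 1);
console.log(winner);
```

Both documents have the same age, so the ranking difference comes entirely from when each was last accessed. A document whose access timestamp is never initialized has no meaningful hoursPassed, which is why bypassing the retriever's addDocuments() leads to incorrect scores.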

Implementation Reference (JavaScript)

import { TimeWeightedVectorStoreRetriever } from "@langchain/classic/retrievers/time_weighted";
import { MemoryVectorStore } from "@langchain/classic/vectorstores/memory";
import { OpenAIEmbeddings } from "@langchain/openai";
 
const retriever = new TimeWeightedVectorStoreRetriever({
  vectorStore: new MemoryVectorStore(new OpenAIEmbeddings()),
  memoryStream: [],
  searchKwargs: 2,   // top-K from vector store before time-weighting
  decayRate: 0.999,  // short memory; set lower for longer retention
});
 
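// Placeholder documents for illustration; substitute your own content.
const documents = [
  { pageContent: "My name is John.", metadata: {} },
  { pageContent: "My favourite food is pizza.", metadata: {} },
];

// Must use retriever.addDocuments(), not vectorStore.addDocuments()
await retriever.addDocuments(documents);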
 
const results = await retriever.invoke("query text");

Terminology

  • TimeWeightedVectorStoreRetriever: LangChain retriever class implementing additive time-weighted vector scoring
  • memoryStream: Array tracking document access history; passed to the constructor (typically empty) and populated by addDocuments() on the retriever
  • searchKwargs: Number of top candidates to fetch from the underlying vector store before applying time-weighting reranking
  • decayRate: Per-hour fractional decay applied to the recency bonus; controls how fast documents lose their temporal advantage
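
One way to build intuition for decayRate is to convert it into a half-life: the number of hours after which the recency bonus (1 - decayRate)^hoursPassed has dropped to 0.5. Solving for hoursPassed gives ln(0.5) / ln(1 - decayRate). The helper below is a sketch derived from the formula on this page; recencyHalfLifeHours is a hypothetical name, not a LangChain API.

```javascript
// Hours until the recency bonus (1 - decayRate)^t falls to 0.5.
// Derived from the documented formula; not part of LangChain.
function recencyHalfLifeHours(decayRate) {
  if (decayRate <= 0) return Infinity; // bonus never decays
  if (decayRate >= 1) return 0; // bonus vanishes immediately
  return Math.log(0.5) / Math.log(1 - decayRate);
}

for (const decayRate of [0.01, 0.5, 0.999]) {
  const hours = recencyHalfLifeHours(decayRate);
  console.log(`decayRate=${decayRate}: half-life ≈ ${hours.toFixed(2)} h`);
}
```

Under this reading, decayRate = 0.01 gives a half-life of roughly 69 hours, while the 0.999 used in the implementation reference halves the bonus in about six minutes.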

Connections to Existing Wiki Pages