POST /v1/rerank

Rerank

Improve search relevance by reranking documents based on their semantic similarity to a query. Perfect for RAG pipelines and search result optimization.

Request

Example Request
curl https://api.assisters.dev/v1/rerank \
  -H "Authorization: Bearer YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "rerank-v1",
    "query": "What is machine learning?",
    "documents": [
      "Machine learning is a subset of AI that enables systems to learn from data.",
      "The weather today is sunny with clear skies.",
      "Deep learning uses neural networks with multiple layers."
    ],
    "top_n": 2
  }'

Parameters

model (string, required)

ID of the reranking model to use.

Available models:

  • rerank-v1 - Fast, general-purpose reranking
  • rerank-multilingual - Supports 100+ languages

query (string, required)

The search query to rank documents against.

documents (array, required)

Array of documents to rerank. Each document can be a string or an object with a text field. Maximum 1000 documents per request.

top_n (integer, optional)

Number of most relevant documents to return. Defaults to returning all documents.

return_documents (boolean, optional, default: false)

If true, include the document text in the response.
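The parameters above can be assembled client-side before calling the endpoint. As a minimal sketch (the helper name and normalization choice are illustrative, not part of the SDK), this builds the JSON body for POST /v1/rerank and enforces the documented 1000-document limit:

```python
def build_rerank_payload(model, query, documents, top_n=None, return_documents=False):
    """Build the JSON body for POST /v1/rerank (illustrative helper)."""
    if len(documents) > 1000:
        raise ValueError("Maximum 1000 documents per request")
    # Documents may be plain strings or objects with a "text" field;
    # normalize strings to the object form for consistency.
    docs = [{"text": d} if isinstance(d, str) else d for d in documents]
    payload = {
        "model": model,
        "query": query,
        "documents": docs,
        "return_documents": return_documents,
    }
    if top_n is not None:
        payload["top_n"] = top_n
    return payload

payload = build_rerank_payload(
    "rerank-v1",
    "What is machine learning?",
    ["Machine learning is a subset of AI.", {"text": "The weather today is sunny."}],
    top_n=2,
)
```

Omitting top_n from the payload (rather than sending null) leaves the default behavior of returning all documents.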

Response

Example Response
{
  "id": "rerank-abc123",
  "results": [
    {
      "index": 0,
      "relevance_score": 0.92,
      "document": {
        "text": "Machine learning is a subset of AI..."
      }
    },
    {
      "index": 2,
      "relevance_score": 0.78,
      "document": {
        "text": "Deep learning uses neural networks..."
      }
    }
  ],
  "meta": {
    "billed_units": {
      "search_units": 1
    }
  }
}
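Each result's index field points into the documents array from the request, so scores can be mapped back to the original texts even when return_documents is false. A short sketch using the example response above (the response is shown here as a plain dict rather than an SDK object):

```python
# Example response body, as a dict (see the example response above)
response = {
    "id": "rerank-abc123",
    "results": [
        {"index": 0, "relevance_score": 0.92},
        {"index": 2, "relevance_score": 0.78},
    ],
}

# The documents array sent in the original request
documents = [
    "Machine learning is a subset of AI that enables systems to learn from data.",
    "The weather today is sunny with clear skies.",
    "Deep learning uses neural networks with multiple layers.",
]

# index points into the original documents array, so the text can be
# recovered without return_documents=true
top_docs = [(documents[r["index"]], r["relevance_score"]) for r in response["results"]]
```

Results are returned in descending order of relevance_score, so top_docs is already sorted from most to least relevant.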

Use Cases

RAG Pipeline Optimization

Improve context quality by reranking retrieved chunks before passing them to the LLM.

Search Result Improvement

Enhance search relevance beyond keyword matching with semantic reranking.

Code Examples

Python - RAG Pipeline

rag_rerank.py
from assisters import Assisters

client = Assisters(api_key="YOUR_API_KEY")

# Initial retrieval (e.g. from your vector database; `vector_db` is a
# placeholder for whatever retrieval client you use)
query = "How do neural networks learn?"
retrieved_docs = vector_db.search(query, top_k=20)

# Rerank for better relevance
reranked = client.rerank.create(
    model="rerank-v1",
    query=query,
    documents=[doc["text"] for doc in retrieved_docs],
    top_n=5,
    return_documents=True
)

# Use top reranked docs as context
context = "\n".join([r.document.text for r in reranked.results])

# Generate answer with LLM
response = client.chat.completions.create(
    model="assisters-chat-v1",
    messages=[
        {"role": "system", "content": f"Context:\n{context}"},
        {"role": "user", "content": query}
    ]
)