Semantic AI Search Across Tenants

AI Search

Keyword search finds records when you know the exact words. AI Search finds them when you only know what you mean.

AI Search understands the intent behind a query, not just its tokens. A query like "users created recently" returns records from PostgreSQL, MySQL, MongoDB, and Redis in a single call, even when none of those records contain the words "users", "created", or "recently". The engine interprets what the words mean and returns the semantically closest data.

Like regular Search, AI Search is zero-config, always tenant-isolated, and queries all databases at once. The difference is the matching engine underneath.

Shell

# Keyword search - only finds exact word matches
$ tdb search --tenant demo --query "users created recently"
0 results in 48ms

# AI Search - understands meaning, returns semantically relevant records
$ tdb ai-search --tenant demo --query "users created recently"

DATABASE    COLLECTION  ID   SCORE  CREATED       CONTENT
PostgreSQL  users       52   0.379  Mar 05 10:04  email: [email protected] | name: CrossDB PG
MySQL       users       -    0.374  -             email: [email protected] | name: MySQL User
PostgreSQL  users       -    0.371  -             email: [email protected] | name: UniqueID PG
Redis       users       -    0.371  -             email: [email protected] | name: HashUser
20 results in 326ms

Keyword vs AI Search

Both search all databases. The difference is what counts as a match.

Keyword Search

Matches exact words and tokens
Query must appear in the data
Score based on term frequency
Fast, deterministic
Best for IDs, names, codes, emails

AI Search

Matches meaning and intent
Query can be in natural language
Score based on semantic distance
Slightly slower, probabilistic
Best for natural language, concepts

Query	Keyword Search	AI Search
`NativeCheck1`	✓ Exact match	✓ Also finds it
`users created recently`	✗ No match	✓ Returns recent records
`email addresses`	Partial	✓ All records with email fields
`overdue invoice`	Exact word only	✓ Also finds "late payment"

AI Search also runs a keyword phase internally. An exact word match will still score highly even when using AI Search.

How It Works

Every AI Search query runs two phases in parallel, then blends the scores.

When a query arrives, it is first converted into a 768-dimension vector - a numerical representation of its meaning. That vector is used to find semantically similar documents in the index. Simultaneously, a standard keyword phase runs BM25 term matching on the same index. Both phases produce independent ranked lists. The final results are a weighted blend of both scores, controlled by the weights parameter.

Query pipeline

Query ("overdue invoice") → Vectorize (768 dimensions) → BM25 phase (keyword matching) and kNN phase (vector similarity) → Blend (weighted score) → Results (ranked, sorted)

Both phases run in parallel. The blend step applies your weights. Default is equal weight on both phases.
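
To make the vector phase more concrete, here is a minimal sketch of a nearest-neighbour lookup. It is an illustration only: the 3-dimension toy vectors, the `cosine_similarity` and `knn` helpers, and the choice of cosine similarity as the measure are all assumptions, not a description of the engine's internals (real query vectors have 768 dimensions).

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def knn(query_vec, doc_vecs, k=2):
    """Return the k (doc_id, similarity) pairs closest to the query, best first."""
    scored = [(doc_id, cosine_similarity(query_vec, vec))
              for doc_id, vec in doc_vecs.items()]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy vectors standing in for real embeddings (hypothetical values).
docs = {
    "late payment": [0.9, 0.1, 0.0],
    "new signup": [0.0, 0.2, 0.9],
    "unpaid bill": [0.8, 0.3, 0.1],
}
query = [1.0, 0.2, 0.0]  # stands in for the vector of "overdue invoice"

top = knn(query, docs, k=2)
print([doc_id for doc_id, _ in top])
```

In this toy data the two payment-related documents rank above the unrelated one even though they share no words with the query, which is the behaviour the vector phase contributes to the blend.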

Indexing is identical to regular Search. Data is already indexed by the time you call AI Search. There is no additional indexing step.

Multilingual

Your query language does not need to match the language of your data.

AI Search operates on semantic vectors, not on surface text. Because the embedding model maps meaning across languages into the same vector space, a query written in one language will find semantically relevant records written in another. The same tenant data is queryable regardless of what language the query arrives in.

Shell

# Query in a different language - returns English-language records
$ tdb ai-search --tenant demo \
    --query "query written in any language" \
    --weights keyword=0.0,semantic=1.0

DATABASE    COLLECTION  ID   SCORE  CONTENT
PostgreSQL  users       54   0.375  email: [email protected] | name: Pedro Garcia
PostgreSQL  users       52   0.363  email: [email protected] | name: CrossDB PG
Redis       users       -    0.361  email: [email protected] | name: CrossDB Redis
PostgreSQL  users       53   0.358  email: [email protected] | name: Final4 PG
20 results in 287ms

Pure semantic mode (keyword=0.0, semantic=1.0) gives the best cross-language results since BM25 keyword matching is language-specific.

Weights Tuning

Weights control how much keyword matching versus semantic understanding contributes to the final score.

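The weights field takes two values that must sum to 1.0: keyword, the weight of the BM25 phase, and semantic, the weight of the vector phase. Where weights are written as a pair, such as 0.8, 0.2, the keyword weight comes first. The final score for each result is computed as: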

Formula

final_score = (bm25_score * keyword_weight) + (vector_score * semantic_weight)

A higher keyword weight makes exact word presence matter more. A higher semantic weight makes conceptual closeness matter more. The default 0.5/0.5 is a balanced starting point.
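
The formula can be tried out with a few lines of Python. This is a sketch of the arithmetic only: the `blend` helper and the two phase scores below are hypothetical values chosen for illustration, not output from the engine.

```python
def blend(bm25_score, vector_score, keyword_weight=0.5, semantic_weight=0.5):
    """Apply the documented formula to one result's two phase scores."""
    if abs(keyword_weight + semantic_weight - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    return bm25_score * keyword_weight + vector_score * semantic_weight


# Hypothetical phase scores for a record with no exact keyword match.
bm25_score, vector_score = 0.0, 0.74

for kw, sem in [(0.8, 0.2), (0.5, 0.5), (0.3, 0.7), (0.0, 1.0)]:
    final = blend(bm25_score, vector_score, kw, sem)
    print(f"keyword={kw} semantic={sem} -> {final:.3f}")
```

For the same record, the final score rises as the semantic weight rises, simply because more of the vector score is carried into the blend.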

keyword 0.8, semantic 0.2

Keyword dominant

Use when the exact term must appear. Best for IDs, codes, email addresses, product SKUs.

scores ~0.22

keyword 0.5, semantic 0.5

Balanced (default)

Good starting point. Exact matches score high, but semantically relevant records still surface.

scores ~0.37

keyword 0.3, semantic 0.7

Semantic dominant

Intent and meaning matter more than exact words. Best for natural language search bars and concept queries.

scores ~0.52

keyword 0.0, semantic 1.0

Pure semantic

Exact words have no influence. Best for cross-language queries and exploratory concept search.

scores ~0.74

Recommended by Use Case

Use Case	Weights	Why
Support lookup by ID or email	`0.8, 0.2`	The exact identifier must appear in the result.
In-app natural language search	`0.3, 0.7`	Users describe what they want, not its exact field value.
Compliance or audit scan	`0.5, 0.5`	Balanced - exact keywords matter but related concepts should surface too.
Cross-language queries	`0.0, 1.0`	BM25 is language-specific. Pure semantic works across languages.
General purpose default	`0.5, 0.5`	Start here, tune from live results.

Scores are not comparable across weight settings. A pure semantic score of 0.74 and a keyword-heavy score of 0.22 can both represent strong matches, because the absolute value depends on the weights used. Compare scores only between results that were produced with the same weights.
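
If a client needs to show relevance consistently across different weight settings, one possible approach is to rescale each response against its own top score. The sketch below assumes that approach. `relative_scores` is a hypothetical client-side helper, not part of the API, and the two sample result lists are made up.

```python
def relative_scores(results):
    """Add a `relative` field so the top result in this response is 1.0."""
    if not results:
        return []
    top = max(r["score"] for r in results)
    if top <= 0:
        return [dict(r, relative=0.0) for r in results]
    return [dict(r, relative=r["score"] / top) for r in results]


# Hypothetical responses for the same query under two weight settings.
semantic_run = [{"doc_id": "52", "score": 0.74}, {"doc_id": "53", "score": 0.37}]
keyword_run = [{"doc_id": "52", "score": 0.22}, {"doc_id": "53", "score": 0.11}]

for run in (semantic_run, keyword_run):
    print([(r["doc_id"], round(r["relative"], 2)) for r in relative_scores(run)])
```

The raw scores differ by more than a factor of three between the two runs, but the relative view shows the same ordering and spacing in both.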

Available on All Plans

AI Search runs on every tier, including free. Monthly quotas scale with your plan.

Free tier includes 1,000 AI searches per month at 10 requests per minute. Higher tiers offer larger monthly quotas and faster per-minute rates. See app.tenantsdb.com/billing for plan details.

Rate Limits and Quotas

AI Search has two limits: a per-minute rate limit and a monthly quota. Both scale with your plan. Exceeding either returns an error.
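
A client can avoid most rate-limit errors by throttling itself before sending. The sketch below is one way to do that, assuming a sliding 60-second window and the free-tier rate of 10 requests per minute. How the server actually counts its window is not documented here, and `SlidingWindowLimiter` is a hypothetical helper.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_calls` calls in any `period`-second window."""

    def __init__(self, max_calls=10, period=60.0, clock=time.monotonic):
        self.max_calls = max_calls
        self.period = period
        self.clock = clock  # injectable so the limiter can be tested
        self.calls = deque()

    def try_acquire(self):
        """Return True and record the call if it fits in the window."""
        now = self.clock()
        # Drop timestamps that have left the window.
        while self.calls and now - self.calls[0] >= self.period:
            self.calls.popleft()
        if len(self.calls) < self.max_calls:
            self.calls.append(now)
            return True
        return False


limiter = SlidingWindowLimiter(max_calls=10, period=60.0)
if limiter.try_acquire():
    print("send the search request")
else:
    print("wait before sending")
```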

Per-Minute Rate Limit

HTTP 429

{
  "success": false,
  "http_status": 429,
  "code": "rate_limited",
  "error": "AI Search rate limit exceeded. Retry after 60 seconds."
}

Monthly Quota

HTTP 402

{
  "success": false,
  "http_status": 402,
  "code": "quota_exceeded",
  "error": "Monthly AI search quota exceeded. Upgrade at app.tenantsdb.com/billing"
}
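
The two errors call for different handling: a rate-limit error is worth retrying after a pause, while a quota error will not clear until the plan changes or the month rolls over. A minimal sketch of that decision follows. The `next_action` helper and its return values are assumptions for illustration. Only the status codes and `code` strings come from the responses above.

```python
import json


def next_action(status, body_text):
    """Decide what a client should do with an AI Search response.

    Returns "ok", "retry" (wait, then resend) or "stop" (quota exhausted).
    """
    if status == 200:
        return "ok"
    body = json.loads(body_text)
    code = body.get("code")
    if status == 429 and code == "rate_limited":
        return "retry"
    if status == 402 and code == "quota_exceeded":
        return "stop"
    raise RuntimeError(f"unexpected response {status}: {body.get('error')}")


rate_limited = '{"success": false, "http_status": 429, "code": "rate_limited", "error": "AI Search rate limit exceeded. Retry after 60 seconds."}'
quota_exceeded = '{"success": false, "http_status": 402, "code": "quota_exceeded", "error": "Monthly AI search quota exceeded. Upgrade at app.tenantsdb.com/billing"}'

print(next_action(429, rate_limited))
print(next_action(402, quota_exceeded))
```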

API Reference

Same endpoint structure as keyword search, with one additional field: weights.

POST /tenants/{tenantID}/ai-search

Hybrid AI search across all databases for a single tenant. Queries only the namespace belonging to this tenant.

Request Body

Field	Type	Required	Description
query	string	required	Natural language or keyword query string.
weights	object	optional	Blend weights. `{"keyword": 0.5, "semantic": 0.5}`. Values must sum to 1.0. Default: equal weight.
databases	string[]	optional	Limit to specific database types: `PostgreSQL`, `MySQL`, `MongoDB`, `Redis`.
collections	string[]	optional	Limit to specific tables or collections.
limit	int	optional	Maximum results to return. Default: `20`.
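
Validating the body on the client side catches mistakes before a request is spent. The sketch below builds a request body from the fields in the table. `build_request_body` is a hypothetical helper, and the checks it performs (weights summing to 1.0, known database types) mirror the table rather than any documented server-side validation.

```python
ALLOWED_DATABASES = {"PostgreSQL", "MySQL", "MongoDB", "Redis"}


def build_request_body(query, keyword=0.5, semantic=0.5,
                       databases=None, collections=None, limit=20):
    """Return a dict ready to be JSON-encoded as the request body."""
    if not query:
        raise ValueError("query is required")
    if abs(keyword + semantic - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1.0")
    body = {
        "query": query,
        "weights": {"keyword": keyword, "semantic": semantic},
        "limit": limit,
    }
    if databases is not None:
        unknown = set(databases) - ALLOWED_DATABASES
        if unknown:
            raise ValueError(f"unknown database types: {sorted(unknown)}")
        body["databases"] = list(databases)
    if collections is not None:
        body["collections"] = list(collections)
    return body


print(build_request_body("overdue invoice", 0.4, 0.6, collections=["invoices"]))
```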

Example Request

Shell

curl -X POST https://api.tenantsdb.com/tenants/wayne/ai-search \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "users created recently",
    "weights": {"keyword": 0.3, "semantic": 0.7}
  }'
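
The same request can be built from Python with only the standard library. This is a sketch under assumptions: `make_request` is a hypothetical helper, the API key is a placeholder, and the request is constructed but not sent.

```python
import json
import urllib.request

API_BASE = "https://api.tenantsdb.com"


def make_request(tenant_id, api_key, body):
    """Build (but do not send) the POST request for one tenant."""
    return urllib.request.Request(
        url=f"{API_BASE}/tenants/{tenant_id}/ai-search",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


req = make_request(
    "wayne",
    "YOUR_API_KEY",  # placeholder, read the real key from your environment
    {"query": "users created recently",
     "weights": {"keyword": 0.3, "semantic": 0.7}},
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req) returns the HTTP response.
```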

Response

HTTP 200

{
  "success": true,
  "total": 20,
  "took_ms": 326,
  "results": [
    {
      "source_db": "PostgreSQL",
      "collection": "users",
      "doc_id": "52",
      "score": 0.379,
      "content": {
        "id": 52,
        "name": "CrossDB PG",
        "email": "[email protected]",
        "created_at": "2026-03-05T10:04:54Z"
      }
    }
  ]
}

Response Fields

Field	Type	Description
total	int	Total matching results.
took_ms	int	Search execution time in milliseconds. Slightly higher than keyword search due to vectorization.
results[].source_db	string	Database type: `PostgreSQL`, `MySQL`, `MongoDB`, or `Redis`.
results[].collection	string	Table or collection name.
results[].doc_id	string	Document ID, returned as a string.
results[].score	float	Blended relevance score. Absolute value depends on weights used. Sort order is always descending.
results[].content	object	Full indexed document.
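
A client will usually turn the response into typed objects before using it. The sketch below shows one way to do that. The `SearchResult` dataclass and `parse_results` function are hypothetical names, and the sample payload is a trimmed copy of the example response above.

```python
import json
from dataclasses import dataclass


@dataclass
class SearchResult:
    source_db: str
    collection: str
    doc_id: str
    score: float
    content: dict


def parse_results(response_text):
    """Parse a successful response body into a list of SearchResult."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise RuntimeError(payload.get("error", "search failed"))
    return [
        SearchResult(r["source_db"], r["collection"], r["doc_id"],
                     r["score"], r["content"])
        for r in payload["results"]
    ]


sample = json.dumps({
    "success": True,
    "total": 20,
    "took_ms": 326,
    "results": [{
        "source_db": "PostgreSQL",
        "collection": "users",
        "doc_id": "52",
        "score": 0.379,
        "content": {"id": 52, "name": "CrossDB PG",
                    "created_at": "2026-03-05T10:04:54Z"},
    }],
})

for result in parse_results(sample):
    print(result.source_db, result.collection, result.doc_id, result.score)
```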

POST /tenants/_all/ai-search

Hybrid AI search across all active tenants simultaneously. Soft-deleted tenants are excluded. The request body is identical to the single-tenant endpoint. The response has the same shape, except that each result also includes tenant_id.

Example Request

Shell

curl -X POST https://api.tenantsdb.com/tenants/_all/ai-search \
  -H "Authorization: Bearer $API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "query": "accounts with overdue payments",
    "collections": ["invoices"],
    "weights": {"keyword": 0.4, "semantic": 0.6}
  }'

Use _all only for your own internal admin and compliance use. Always use POST /tenants/{id}/ai-search to serve search results to your customers.
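
For admin or compliance reports, results from an `_all` search are often easier to read once grouped per tenant. A minimal sketch follows, assuming `tenant_id` sits directly on each result object as described above. `group_by_tenant` and the sample results are hypothetical.

```python
from collections import defaultdict


def group_by_tenant(results):
    """Group result dicts by tenant_id, keeping the ranked order within each tenant."""
    grouped = defaultdict(list)
    for result in results:
        grouped[result["tenant_id"]].append(result)
    return dict(grouped)


# Hypothetical results from a cross-tenant search, already sorted by score.
results = [
    {"tenant_id": "wayne", "collection": "invoices", "doc_id": "7", "score": 0.41},
    {"tenant_id": "demo", "collection": "invoices", "doc_id": "3", "score": 0.39},
    {"tenant_id": "wayne", "collection": "invoices", "doc_id": "9", "score": 0.35},
]

for tenant_id, hits in group_by_tenant(results).items():
    print(tenant_id, [hit["doc_id"] for hit in hits])
```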

CLI Reference

tdb ai-search

Hybrid AI search across tenant data.

Usage

tdb ai-search --tenant <id|_all> --query <text> [--weights keyword=0.5,semantic=0.5] [--databases <list>] [--collections <list>] [--limit <n>]

Flag	Required	Default	Description
--tenant	required	-	Tenant ID to search, or `_all` for all active tenants.
--query	required	-	Natural language or keyword query string.
--weights	optional	keyword=0.5,semantic=0.5	Blend weights as `keyword=N,semantic=N`. Values must sum to 1.0.
--databases	optional	all	Comma-separated database types.
--collections	optional	all	Comma-separated table or collection names.
--limit	optional	20	Maximum results to return.
--json	optional	false	Output full raw JSON.
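
When the CLI is driven from a script, building the argument list programmatically avoids shell-quoting mistakes. The sketch below assembles the flags from the table. `build_argv` is a hypothetical helper, and the command is printed rather than executed.

```python
def build_argv(tenant, query, weights=None, databases=None,
               collections=None, limit=None, as_json=False):
    """Build the argument list for one `tdb ai-search` call."""
    argv = ["tdb", "ai-search", "--tenant", tenant, "--query", query]
    if weights is not None:
        keyword, semantic = weights
        if abs(keyword + semantic - 1.0) > 1e-9:
            raise ValueError("weights must sum to 1.0")
        argv += ["--weights", f"keyword={keyword},semantic={semantic}"]
    if databases:
        argv += ["--databases", ",".join(databases)]
    if collections:
        argv += ["--collections", ",".join(collections)]
    if limit is not None:
        argv += ["--limit", str(limit)]
    if as_json:
        argv.append("--json")
    return argv


argv = build_argv("wayne", "overdue invoices",
                  weights=(0.4, 0.6), collections=["invoices"], as_json=True)
print(argv)
# To run it: subprocess.run(argv, capture_output=True, text=True, check=True)
```

Passing the list straight to `subprocess.run` keeps a multi-word query as a single argument without any manual quoting.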

Examples

Shell

# Natural language query, default weights
$ tdb ai-search --tenant demo --query "users created recently"

DATABASE    COLLECTION  ID   SCORE  CREATED       CONTENT
PostgreSQL  users       52   0.379  Mar 05 10:04  email: [email protected] | name: CrossDB PG
MySQL       users       -    0.374  -             email: [email protected] | name: MySQL User
PostgreSQL  users       -    0.371  -             email: [email protected] | name: UniqueID PG
Redis       users       -    0.371  -             email: [email protected] | name: HashUser
20 results in 326ms

# Semantic dominant - intent over exact words
$ tdb ai-search --tenant wayne --query "accounts with late payments" \
    --weights keyword=0.3,semantic=0.7

# Pure semantic - best for cross-language queries
$ tdb ai-search --tenant wayne --query "query written in any language" \
    --weights keyword=0.0,semantic=1.0

# Keyword dominant - exact term must appear
$ tdb ai-search --tenant wayne --query "NativeCheck1" \
    --weights keyword=0.8,semantic=0.2

# Scope to a collection
$ tdb ai-search --tenant wayne --query "overdue invoices" \
    --collections invoices --weights keyword=0.4,semantic=0.6

# All tenants
$ tdb ai-search --tenant _all --query "accounts with overdue payments" \
    --collections invoices

# Raw JSON
$ tdb ai-search --tenant wayne --query "recent users" --json
