Open
65 commits
0dc589e
Initial plan
Copilot Feb 3, 2026
d3187bd
Add EmbeddingsOptions and EmbeddingProviderType configuration models
Copilot Feb 3, 2026
6064826
Add CLI configure options for embeddings and register embedding service
Copilot Feb 3, 2026
0653f15
Add unit tests for embeddings and update JSON schema with embeddings …
Copilot Feb 3, 2026
21e81b9
Simplify HttpClient registration for embedding service
Copilot Feb 3, 2026
0cd8e53
Plan for embedding service enhancements
Copilot Feb 3, 2026
7fa1c49
Refactor embedding code into dedicated namespaces
Copilot Feb 3, 2026
c3f6937
Fix property renames and update tests
Copilot Feb 3, 2026
1e18c25
Add EmbeddingsOptionsConverter and fix all tests
Copilot Feb 3, 2026
857203a
Address code review feedback
Copilot Feb 3, 2026
89cb2d9
Address PR feedback: Add Azure OpenAI validation, cache key security,…
Copilot Feb 3, 2026
64b592e
Optimize ProviderName to avoid repeated string allocations
Copilot Feb 3, 2026
e8d7238
Fix schema mismatch, remove unused field, add enabled handling, valid…
Copilot Feb 3, 2026
d9c8a29
Add embedding health check execution and update JSON schema with endp…
Copilot Feb 4, 2026
3e02c0f
Add EmbeddingController for /embed REST endpoint with role-based auth…
Copilot Feb 4, 2026
5c05464
Address code review feedback for EmbeddingController
Copilot Feb 4, 2026
c9eba20
fix: manually deserialize EmbeddingsEndpointOptions and EmbeddingsHea…
robertopc1 Feb 6, 2026
d3a5209
feat: add embeddings config validation and unit tests
robertopc1 Feb 6, 2026
203e232
Update src/Core/Services/Embeddings/EmbeddingService.cs
robertopc1 Feb 13, 2026
2cb3999
Update src/Config/Converters/EmbeddingsOptionsConverterFactory.cs
robertopc1 Feb 13, 2026
b432e4f
Update src/Service/Startup.cs
robertopc1 Feb 13, 2026
a8442f4
Taking care of feedback
robertopc1 Feb 18, 2026
a301290
Merge branch 'main' into add-internal-text-embedding-system
robertopc1 Feb 18, 2026
5f12542
Taking care of copilot feedback, adding embedding service tests, embe…
robertopc1 Feb 26, 2026
bd84c84
Adding default return of application/json and only text/plain if expl…
robertopc1 Feb 26, 2026
fb59389
Remove endpoint.path configuration from embeddings feature. The
robertopc1 Mar 6, 2026
7100b8a
feat(embeddings): require L2 cache and include provider/model in cach…
robertopc1 Mar 9, 2026
73658de
fix(embeddings): keep L1 allow distributed cache when configured/opti…
robertopc1 Mar 10, 2026
ece97ae
Adding max text-count validation for embedding requests
robertopc1 Mar 19, 2026
d065f4b
Phase 1 Added Embedding Support with Chunking
ajtiwari07 Apr 9, 2026
c33c643
Resolve merge conflicts
ajtiwari07 Apr 13, 2026
a3f0b1d
Fix PR comments.
ajtiwari07 Apr 20, 2026
1d6f9ea
Revert "Fix PR comments."
ajtiwari07 Apr 21, 2026
8a00bad
Pull latest Merge
ajtiwari07 Apr 21, 2026
f2261f6
Address review comments
ajtiwari07 Apr 21, 2026
1ece78a
Standardize API Error response
ajtiwari07 Apr 22, 2026
896a111
Post review commit
ajtiwari07 Apr 22, 2026
d3e0413
Parameter Validation Check
ajtiwari07 Apr 22, 2026
4c073aa
Uniform API response with tests
ajtiwari07 Apr 23, 2026
17414c6
Fix tests
ajtiwari07 Apr 23, 2026
d8f66b0
Update the config defaults
ajtiwari07 Apr 23, 2026
17fba18
Merge branch 'main' into add-internal-text-embedding-system
souvikghosh04 Apr 24, 2026
d61b340
Post review test refactoring
ajtiwari07 Apr 24, 2026
0d97b0b
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 Apr 24, 2026
da2bc83
Ensure consistency across config defaults
ajtiwari07 Apr 24, 2026
b8f84bc
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 Apr 27, 2026
b00e9ef
Post review commit
ajtiwari07 Apr 28, 2026
f83e119
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 Apr 28, 2026
c220bbd
Fix all existing tests
ajtiwari07 Apr 29, 2026
215873a
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 Apr 29, 2026
8095db5
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 Apr 29, 2026
a4a1b6d
Extract validations into a method
ajtiwari07 Apr 29, 2026
1c44f45
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 Apr 29, 2026
6cd5fe8
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 Apr 30, 2026
437bad5
Avoid duplicate texts in embedding creation API request
ajtiwari07 Apr 30, 2026
48b1020
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 Apr 30, 2026
f6b830a
Fix test failures
ajtiwari07 May 1, 2026
2ffd643
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 May 1, 2026
fac1c38
Fix formatting
ajtiwari07 May 1, 2026
02b72c9
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 May 1, 2026
85326de
Set default dev mode role and use G9 string format for embeddings
ajtiwari07 May 1, 2026
202aed4
Merge branch 'add-internal-text-embedding-system' of https://github.c…
ajtiwari07 May 1, 2026
d79c80e
Test fix
ajtiwari07 May 1, 2026
32448bc
Merge branch 'main' into add-internal-text-embedding-system
ajtiwari07 May 1, 2026
fc26f9e
Merge branch 'main' into add-internal-text-embedding-system
Aniruddh25 May 2, 2026
189 changes: 189 additions & 0 deletions schemas/dab.draft.schema.json
@@ -770,6 +770,195 @@
"default": 4
}
}
},
"embeddings": {
"type": "object",
"description": "Configuration for text embedding/vectorization service. Supports OpenAI and Azure OpenAI providers.",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"description": "Whether the embedding service is enabled. Defaults to true.",
"default": true
},
"provider": {
"type": "string",
"description": "The embedding provider type.",
"enum": ["azure-openai", "openai"]
},
"base-url": {
"type": "string",
"description": "The provider base URL. For Azure OpenAI, use the Azure resource endpoint. For OpenAI, use https://api.openai.com."
},
"api-key": {
"type": "string",
"description": "The API key for authentication. Supports environment variable substitution with @env('VAR_NAME')."
},
"model": {
"type": "string",
"description": "The model or deployment name. Required for Azure OpenAI (deployment name). For OpenAI, defaults to 'text-embedding-3-small' if not specified."
},
"api-version": {
"type": "string",
"description": "Azure API version. Only used for Azure OpenAI provider.",
"default": "2023-05-15"
},
"dimensions": {
"type": "integer",
"description": "Output vector dimensions. Defaults to 1536 if not specified. Useful for Redis schema alignment.",
"default": 1536,
"minimum": 1
},
"timeout-ms": {
"type": "integer",
"description": "Request timeout in milliseconds.",
"default": 30000,
"minimum": 1,
"maximum": 300000
},
"endpoint": {
"type": "object",
"description": "REST endpoint configuration for the embedding service.",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"description": "Whether the /embed REST endpoint is enabled. Defaults to false.",
"default": false
},
"path": {
"type": "string",
"description": "The URL path for the embedding endpoint. Defaults to '/embed'.",
"default": "/embed"
},
"roles": {
"type": "array",
"description": "The roles allowed to access the embedding endpoint. Defaults to ['authenticated'].",
"default": ["authenticated"],
"items": {
"type": "string"
}
}
}
},
"health": {
"type": "object",
"description": "Health check configuration for the embedding service.",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"description": "Whether health checks are enabled for embeddings. Defaults to false.",
"default": false
},
"threshold-ms": {
"type": "integer",
"description": "The maximum response time in milliseconds to be considered healthy.",
"default": 1000,
"minimum": 1,
"maximum": 300000
},
"test-text": {
"type": "string",
"description": "The text to use for health check validation.",
"default": "health check"
},
"expected-dimensions": {
"type": "integer",
"description": "The expected number of dimensions in the embedding result. If specified, dimension validation is performed.",
"minimum": 1
}
}
},
"cache": {
"type": "object",
"description": "Cache configuration for embedding results.",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"description": "Whether caching is enabled for embeddings. Defaults to true.",
"default": true
},
"level": {
"type": "string",
"description": "Cache level (L1 for in-memory only, L1L2 for in-memory + distributed). Defaults to L1.",
"enum": ["L1", "L1L2"],
"default": "L1"
Review comment (Collaborator): Shouldn't the default cache level be L1L2, since we would like the embeddings to be stored in the Redis cache, and Redis is L2?

Reply (Author): In this phase we only support L1 via FusionCache. In phase 2 we will enable L2 support with Redis, and the default would then be L1 or L1L2, as you suggested.

},
"ttl-seconds": {
"type": "integer",
"description": "Time-to-live for cached embeddings in seconds. Defaults to 86400 (24 hours).",
"default": 86400,
"minimum": 1
}
}
},
"chunking": {
"type": "object",
"description": "Chunking configuration for text processing before embedding. Used to split large text inputs into smaller chunks.",
"additionalProperties": false,
"properties": {
"enabled": {
"type": "boolean",
"description": "Whether chunking is enabled. Defaults to true.",
"default": true
},
"size-chars": {
"type": "integer",
"description": "The size of each chunk in characters.",
"default": 800,
Review comment (Collaborator): The PR description says the default is 1000. What is the correct default?

Reply (Author): Updated the PR description. Refer to the PRD as the source of truth (SOT): #3331

"minimum": 1
},
"overlap-chars": {
"type": "integer",
"description": "The number of characters to overlap between consecutive chunks. Overlap helps maintain context across chunk boundaries.",
"default": 100,
"minimum": 0
}
}
}
},
"required": ["provider", "base-url", "api-key"],
"allOf": [
{
"$comment": "Azure OpenAI requires the model (deployment name) to be specified.",
"if": {
"properties": {
"provider": {
"const": "azure-openai"
}
},
"required": ["provider"]
},
"then": {
"required": ["model"],
"properties": {
"api-version": {
"type": "string",
"description": "Azure API version. Required for Azure OpenAI provider.",
"default": "2023-05-15"
}
}
}
},
{
"$comment": "OpenAI does not require model (defaults to text-embedding-3-small) and does not use api-version.",
"if": {
"properties": {
"provider": {
"const": "openai"
}
},
"required": ["provider"]
},
"then": {
"properties": {
"api-version": false
}
}
}
]
}
}
},
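
For reference, here is a rough sketch of an `embeddings` section that should satisfy the schema above for the `azure-openai` provider. It is an illustration, not taken from the PR: the angle-bracket values and the `EMBEDDINGS_API_KEY` variable name are placeholders, and the parent object that holds `embeddings` is not visible in this hunk, so the fragment shows the section on its own. All other values are the defaults stated in the schema.

```json
{
  "embeddings": {
    "enabled": true,
    "provider": "azure-openai",
    "base-url": "https://<your-resource>.openai.azure.com",
    "api-key": "@env('EMBEDDINGS_API_KEY')",
    "model": "<your-deployment-name>",
    "api-version": "2023-05-15",
    "dimensions": 1536,
    "timeout-ms": 30000,
    "endpoint": {
      "enabled": true,
      "roles": ["authenticated"]
    },
    "health": {
      "enabled": true,
      "threshold-ms": 1000,
      "test-text": "health check",
      "expected-dimensions": 1536
    },
    "cache": {
      "enabled": true,
      "level": "L1",
      "ttl-seconds": 86400
    },
    "chunking": {
      "enabled": true,
      "size-chars": 800,
      "overlap-chars": 100
    }
  }
}
```

`model` is included because the first `allOf` branch requires it when `provider` is `azure-openai`. `endpoint.path` is left out on purpose: the hunk still lists it, while commit fb59389 describes removing that setting, so whether it is accepted at runtime is unclear from this page alone.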
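
As a contrast for the `openai` provider, a minimal sketch of a section that should validate against the schema in this hunk with only the three required properties. The `OPENAI_API_KEY` variable name is a placeholder chosen for illustration; the base URL is the one given in the `base-url` description.

```json
{
  "embeddings": {
    "provider": "openai",
    "base-url": "https://api.openai.com",
    "api-key": "@env('OPENAI_API_KEY')"
  }
}
```

`model` can be omitted here because the schema description says it defaults to `text-embedding-3-small` for OpenAI. `api-version` must not appear at all: the second `allOf` branch sets `"api-version": false` when `provider` is `openai`, which rejects any value for that property.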