
Observability

Spring AI builds upon the observability features in the Spring ecosystem to provide insights into AI-related operations. It provides metrics and tracing capabilities for its core components: ChatClient (including Advisor), ChatModel, EmbeddingModel, ImageModel, and VectorStore.

Low cardinality keys will be added to metrics and traces, while high cardinality keys will only be added to traces.

Chat Client

The spring.ai.chat.client observations are recorded when the ChatClient call() or stream() operations are invoked. They measure the time spent performing the invocation and propagate the related tracing information.
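
For example, a minimal sketch like the following (assuming a ChatClient.Builder auto-configured by Spring Boot; the JokeService class is hypothetical) records one spring.ai.chat.client observation for the blocking call() and one for the streaming stream() invocation:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.stereotype.Service;
import reactor.core.publisher.Flux;

@Service
class JokeService {

    private final ChatClient chatClient;

    JokeService(ChatClient.Builder builder) {
        this.chatClient = builder.build();
    }

    String tellJoke() {
        // Recorded as a spring.ai.chat.client observation with spring.ai.chat.client.stream=false
        return this.chatClient.prompt()
                .user("Tell me a joke")
                .call()
                .content();
    }

    Flux<String> streamJoke() {
        // Recorded as a spring.ai.chat.client observation with spring.ai.chat.client.stream=true
        return this.chatClient.prompt()
                .user("Tell me a joke")
                .stream()
                .content();
    }
}
```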

Table 1. Low Cardinality Keys
Name | Description
gen_ai.operation.name | Always framework.
gen_ai.system | Always spring_ai.
spring.ai.chat.client.stream | Whether the chat model response is a stream (true or false).
spring.ai.kind | The kind of framework API in Spring AI: chat_client.

Table 2. High Cardinality Keys
Name | Description
spring.ai.chat.client.advisor.params | Map of advisor parameters.
spring.ai.chat.client.advisors | List of configured chat client advisors.
spring.ai.chat.client.system.params | Chat client system parameters. Optional.
spring.ai.chat.client.system.text | Chat client system text. Optional.
spring.ai.chat.client.tool.function.names | Enabled tool function names.
spring.ai.chat.client.tool.function.callbacks | List of configured chat client function callbacks.
spring.ai.chat.client.user.params | Chat client user parameters. Optional.
spring.ai.chat.client.user.text | Chat client user text. Optional.

Input Data

The ChatClient input data is typically large and may contain sensitive information. For these reasons, it is not exported by default.

Spring AI supports exporting input data as span attributes across all tracing backends. The behavior is controlled by the following property; see the configuration sketch below.

Property | Description | Default
spring.ai.chat.client.observations.include-input | Whether to include the input content in the observations. | false

If you enable the inclusion of the input content in the observations, there’s a risk of exposing sensitive or private information. Please be careful!
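
A minimal configuration sketch, here using SpringApplication default properties on a hypothetical application class (a regular application.properties or YAML entry works just as well):

```java
import java.util.Map;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ObservabilityDemoApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(ObservabilityDemoApplication.class);
        // Opt in to exporting ChatClient input data as span attributes.
        // Be aware that this may expose sensitive or private information.
        app.setDefaultProperties(Map.of("spring.ai.chat.client.observations.include-input", "true"));
        app.run(args);
    }
}
```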

Chat Client Advisors

The spring.ai.advisor observations are recorded when a call or stream around advisors is performed. They measure the time spent in the advisor (including the time spent on the inner advisors) and propagate the related tracing information.
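
As a sketch, registering an advisor on the ChatClient is enough for a spring.ai.advisor observation to be recorded around it, nested inside the parent spring.ai.chat.client observation. SimpleLoggerAdvisor is used here purely as an example advisor shipped with Spring AI at the time of writing:

```java
import org.springframework.ai.chat.client.ChatClient;
import org.springframework.ai.chat.client.advisor.SimpleLoggerAdvisor;
import org.springframework.ai.chat.model.ChatModel;
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;

@Configuration
class AdvisedChatClientConfig {

    // Each configured advisor is wrapped in its own spring.ai.advisor observation.
    @Bean
    ChatClient advisedChatClient(ChatModel chatModel) {
        return ChatClient.builder(chatModel)
                .defaultAdvisors(new SimpleLoggerAdvisor())
                .build();
    }
}
```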

Table 3. Low Cardinality Keys
Name | Description
gen_ai.operation.name | Always framework.
gen_ai.system | Always spring_ai.
spring.ai.advisor.type | Where the advisor applies its logic in the request processing: one of BEFORE, AFTER, or AROUND.
spring.ai.kind | The kind of framework API in Spring AI: advisor.

Table 4. High Cardinality Keys
Name | Description
spring.ai.advisor.name | Name of the advisor.
spring.ai.advisor.order | Advisor order in the advisor chain.

Chat Model

Observability features are currently supported only for ChatModel implementations from the following AI model providers: Anthropic, Azure OpenAI, Mistral AI, Ollama, OpenAI, Vertex AI, MiniMax, Moonshot, QianFan, ZhiPu AI. Additional AI model providers will be supported in a future release.

The gen_ai.client.operation observations are recorded when calling the ChatModel call or stream methods. They measure the time spent on method completion and propagate the related tracing information.

The gen_ai.client.token.usage metric measures the number of input and output tokens used by a single model call.
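
For instance, a direct ChatModel call such as the following sketch (assuming a ChatModel bean auto-configured by the chosen model provider starter; the helper class is hypothetical) produces one gen_ai.client.operation observation and contributes to the gen_ai.client.token.usage metric:

```java
import org.springframework.ai.chat.model.ChatModel;

class ChatModelExample {

    private final ChatModel chatModel;

    ChatModelExample(ChatModel chatModel) {
        this.chatModel = chatModel;
    }

    String summarize(String text) {
        // Recorded as a gen_ai.client.operation observation;
        // token counts contribute to the gen_ai.client.token.usage metric.
        return this.chatModel.call("Summarize the following text: " + text);
    }
}
```
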
Table 5. Low Cardinality Keys
Name | Description
gen_ai.operation.name | The name of the operation being performed.
gen_ai.system | The model provider as identified by the client instrumentation.
gen_ai.request.model | The name of the model a request is being made to.
gen_ai.response.model | The name of the model that generated the response.

Table 6. High Cardinality Keys
Name | Description
gen_ai.request.frequency_penalty | The frequency penalty setting for the model request.
gen_ai.request.max_tokens | The maximum number of tokens the model generates for a request.
gen_ai.request.presence_penalty | The presence penalty setting for the model request.
gen_ai.request.stop_sequences | List of sequences that the model will use to stop generating further tokens.
gen_ai.request.temperature | The temperature setting for the model request.
gen_ai.request.top_k | The top_k sampling setting for the model request.
gen_ai.request.top_p | The top_p sampling setting for the model request.
gen_ai.response.finish_reasons | Reasons the model stopped generating tokens, corresponding to each generation received.
gen_ai.response.id | The unique identifier for the AI response.
gen_ai.usage.input_tokens | The number of tokens used in the model input (prompt).
gen_ai.usage.output_tokens | The number of tokens used in the model output (completion).
gen_ai.usage.total_tokens | The total number of tokens used in the model exchange.
gen_ai.prompt | The full prompt sent to the model. Optional.
gen_ai.completion | The full response received from the model. Optional.

The previous table lists the token-usage values available in an observation trace. To measure token usage as a metric, use the gen_ai.client.token.usage metric provided by the ChatModel instrumentation.
Table 7. Events
Name | Description
gen_ai.content.prompt | Event including the content of the chat prompt. Optional.
gen_ai.content.completion | Event including the content of the chat completion. Optional.

Chat Prompt and Completion Data

The chat prompt and completion data is typically large and may contain sensitive information. For these reasons, it is not exported by default.

Spring AI supports exporting chat prompt and completion data as span events if you use an OpenTelemetry tracing backend, whereas the data is exported as span attributes if you use an OpenZipkin tracing backend.

Furthermore, Spring AI supports logging chat prompt and completion data, which is useful for troubleshooting scenarios. The behavior is controlled by the following properties; see the configuration sketch below.

Property | Description | Default
spring.ai.chat.observations.include-prompt | Whether to include the prompt content in the observations (true or false). | false
spring.ai.chat.observations.include-completion | Whether to include the completion content in the observations (true or false). | false
spring.ai.chat.observations.include-error-logging | Whether to include error logging in the observations (true or false). | false

If you enable the inclusion of the chat prompt and completion data in the observations, there’s a risk of exposing sensitive or private information. Please be careful!
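
A minimal configuration sketch, passing the properties as command-line arguments to a hypothetical application class (regular application.properties or YAML entries work the same way):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ChatObservabilityApplication {

    public static void main(String[] args) {
        // Command-line arguments are a standard Spring Boot property source.
        // Enabling these exports prompt and completion content: handle with care.
        SpringApplication.run(ChatObservabilityApplication.class,
                "--spring.ai.chat.observations.include-prompt=true",
                "--spring.ai.chat.observations.include-completion=true",
                "--spring.ai.chat.observations.include-error-logging=true");
    }
}
```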

EmbeddingModel

Observability features are currently supported only for EmbeddingModel implementations from the following AI model providers: Azure OpenAI, Mistral AI, Ollama, and OpenAI. Additional AI model providers will be supported in a future release.

The gen_ai.client.operation observations are recorded on embedding model method calls. They measure the time spent on method completion and propagate the related tracing information.

The gen_ai.client.token.usage metric measures the number of input and output tokens used by a single model call.
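
For example, a call like the following sketch (assuming an EmbeddingModel bean auto-configured by the chosen model provider starter; the helper class is hypothetical) is recorded as a gen_ai.client.operation observation:

```java
import org.springframework.ai.embedding.EmbeddingModel;

class EmbeddingExample {

    private final EmbeddingModel embeddingModel;

    EmbeddingExample(EmbeddingModel embeddingModel) {
        this.embeddingModel = embeddingModel;
    }

    int embeddingDimensions(String text) {
        // Recorded as a gen_ai.client.operation observation;
        // input token counts contribute to the gen_ai.client.token.usage metric.
        // Assumes recent Spring AI versions, where embed(String) returns a float[] vector.
        var embedding = embeddingModel.embed(text);
        return embedding.length;
    }
}
```
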
Table 8. Low Cardinality Keys
Name | Description
gen_ai.operation.name | The name of the operation being performed.
gen_ai.system | The model provider as identified by the client instrumentation.
gen_ai.request.model | The name of the model a request is being made to.
gen_ai.response.model | The name of the model that generated the response.

Table 9. High Cardinality Keys
Name | Description
gen_ai.request.embedding.dimensions | The number of dimensions the resulting output embeddings have.
gen_ai.usage.input_tokens | The number of tokens used in the model input.
gen_ai.usage.total_tokens | The total number of tokens used in the model exchange.

The previous table lists the token-usage values available in an observation trace. To measure token usage as a metric, use the gen_ai.client.token.usage metric provided by the EmbeddingModel instrumentation.

Image Model

Observability features are currently supported only for ImageModel implementations from the following AI model provider: OpenAI. Additional AI model providers will be supported in a future release.

The gen_ai.client.operation observations are recorded on image model method calls. They measure the time spent on method completion and propagate the related tracing information.

The gen_ai.client.token.usage metric measures the number of input and output tokens used by a single model call.
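
For example, the following sketch (assuming an auto-configured OpenAI ImageModel bean; the helper class is hypothetical) produces a gen_ai.client.operation observation:

```java
import org.springframework.ai.image.ImageModel;
import org.springframework.ai.image.ImagePrompt;
import org.springframework.ai.image.ImageResponse;

class ImageExample {

    private final ImageModel imageModel;

    ImageExample(ImageModel imageModel) {
        this.imageModel = imageModel;
    }

    String generateImageUrl(String description) {
        // Recorded as a gen_ai.client.operation observation.
        ImageResponse response = imageModel.call(new ImagePrompt(description));
        return response.getResult().getOutput().getUrl();
    }
}
```
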
Table 10. Low Cardinality Keys
Name | Description
gen_ai.operation.name | The name of the operation being performed.
gen_ai.system | The model provider as identified by the client instrumentation.
gen_ai.request.model | The name of the model a request is being made to.

Table 11. High Cardinality Keys
Name | Description
gen_ai.request.image.response_format | The format in which the generated image is returned.
gen_ai.request.image.size | The size of the image to generate.
gen_ai.request.image.style | The style of the image to generate.
gen_ai.response.id | The unique identifier for the AI response.
gen_ai.response.model | The name of the model that generated the response.
gen_ai.usage.input_tokens | The number of tokens used in the model input (prompt).
gen_ai.usage.output_tokens | The number of tokens used in the model output (generation).
gen_ai.usage.total_tokens | The total number of tokens used in the model exchange.
gen_ai.prompt | The full prompt sent to the model. Optional.

The previous table lists the token-usage values available in an observation trace. To measure token usage as a metric, use the gen_ai.client.token.usage metric provided by the ImageModel instrumentation.
Table 12. Events
Name | Description
gen_ai.content.prompt | Event including the content of the image prompt. Optional.

Image Prompt Data

The image prompt data is typically large and may contain sensitive information. For these reasons, it is not exported by default.

Spring AI supports exporting image prompt data as span events if you use an OpenTelemetry tracing backend, whereas the data is exported as span attributes if you use an OpenZipkin tracing backend. The behavior is controlled by the following property; see the configuration sketch below.

Property | Description | Default
spring.ai.image.observations.include-prompt | Whether to include the prompt content in the observations (true or false). | false

If you enable the inclusion of the image prompt data in the observations, there’s a risk of exposing sensitive or private information. Please be careful!
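
A minimal configuration sketch, here using SpringApplication default properties on a hypothetical application class (an application.properties entry works equally well):

```java
import java.util.Map;

import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class ImageObservabilityApplication {

    public static void main(String[] args) {
        SpringApplication app = new SpringApplication(ImageObservabilityApplication.class);
        // Opt in to exporting the image prompt content; it may contain sensitive data.
        app.setDefaultProperties(Map.of("spring.ai.image.observations.include-prompt", "true"));
        app.run(args);
    }
}
```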

Vector Stores

All vector store implementations in Spring AI are instrumented to provide metrics and distributed tracing data through Micrometer.

The db.vector.client.operation observations are recorded when interacting with the vector store. They measure the time spent on the query, add, and delete operations and propagate the related tracing information.
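
For example, the following sketch (assuming any auto-configured VectorStore implementation; the helper class is hypothetical) records one db.vector.client.operation observation for the add call and one for the query:

```java
import java.util.List;

import org.springframework.ai.document.Document;
import org.springframework.ai.vectorstore.VectorStore;

class VectorStoreExample {

    private final VectorStore vectorStore;

    VectorStoreExample(VectorStore vectorStore) {
        this.vectorStore = vectorStore;
    }

    List<Document> addAndSearch() {
        // Recorded as a db.vector.client.operation observation with db.operation.name=add.
        vectorStore.add(List.of(new Document("Spring AI provides observability support.")));

        // Recorded as a db.vector.client.operation observation with db.operation.name=query.
        return vectorStore.similaritySearch("What does Spring AI provide?");
    }
}
```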

Table 13. Low Cardinality Keys
Name | Description
db.operation.name | The name of the operation or command being executed. One of add, delete, or query.
db.system | The database management system (DBMS) product as identified by the client instrumentation. One of pg_vector, azure, cassandra, chroma, elasticsearch, milvus, neo4j, opensearch, qdrant, redis, typesense, weaviate, pinecone, oracle, mongodb, gemfire, hana, simple.
spring.ai.kind | The kind of framework API in Spring AI: vector_store.

Table 14. High Cardinality Keys
Name | Description
db.collection.name | The name of a collection (table, container) within the database.
db.namespace | The name of the database, fully qualified within the server address and port.
db.record.id | The record identifier, if present.
db.search.similarity_metric | The metric used in similarity search.
db.vector.dimension_count | The dimension of the vector.
db.vector.field_name | The name of the field where the vector is stored (e.g. a field name).
db.vector.query.content | The content of the search query being executed.
db.vector.query.filter | The metadata filters used in the search query.
db.vector.query.response.documents | Returned documents from a similarity search query. Optional.
db.vector.query.similarity_threshold | The similarity threshold used to filter search results. A threshold value of 0.0 means any similarity is accepted (the threshold filtering is effectively disabled); a threshold value of 1.0 means an exact match is required.
db.vector.query.top_k | The top-k most similar vectors returned by a query.

Table 15. Events
Name | Description
db.vector.content.query.response | Event including the vector search response data. Optional.

Response Data

The vector search response data is typically large and may contain sensitive information. For these reasons, it is not exported by default.

Spring AI supports exporting vector search response data as span events if you use an OpenTelemetry tracing backend, whereas the data is exported as span attributes if you use an OpenZipkin tracing backend. The behavior is controlled by the following property; see the configuration sketch below.

Property | Description | Default
spring.ai.vectorstore.observations.include-query-response | Whether to include the query response content in the observations (true or false). | false

If you enable the inclusion of the vector search response data in the observations, there’s a risk of exposing sensitive or private information. Please be careful!
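
A minimal configuration sketch, passing the property as a command-line argument to a hypothetical application class (an application.properties entry works equally well):

```java
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
public class VectorStoreObservabilityApplication {

    public static void main(String[] args) {
        // Enabling this exports the returned documents with the observation: handle with care.
        SpringApplication.run(VectorStoreObservabilityApplication.class,
                "--spring.ai.vectorstore.observations.include-query-response=true");
    }
}
```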