@robypag/sap-ai-core-plugin v0.1.13
AI Core Plugin
This package is a CDS Plugin that provides easy access to SAP AI Core Generative AI Hub functionalities. It aims to enable a configuration-based access to Completions and Embeddings in a CAP project, with minimal implementation overhead.
It is freely inspired by the original CAP LLM Plugin and extends from there.
Please note that THIS IS NOT official SAP software
This plugin is a personal spare-time project and still requires a lot of work. It has bugs and lacks some key features, so contributions and collaborations are welcome and would be greatly appreciated. But...it currently has only a few commits and it basically works, so there is potential :blush:
Introduction
This plugin offers a simplified way to setup an application to include AI-based conversations. It completely handles system prompts, completions and chat context so that the caller application needs only to provide new user messages and act on responses.
Similarly, it handles a simplified way to generate and store embeddings in a HANA database - assuming that its Vector Engine is enabled.
Please read the documentation carefully, especially the section that describes the managed and un-managed modes.
Installing
Install the package via npm install @robypag/sap-ai-core-plugin. Be sure to satisfy peerDependencies:
"peerDependencies": {
"@sap/cds": ">= 7.9"
}
Setup
The plugin uses CAP configuration to set itself up, so it requires a cds configuration entry either in package.json or .cdsrc.json. It comes with a preconfigured schema that helps with value input.
You can define a configuration for both the completion capability and the embeddings capability.
At a minimum, you must provide the completion configuration section of the plugin:
{
  "cds": {
    "requires": {
      "ai-core": {
        "kind": "ai-core",
        "completions": {
          "destination": "<NAME_OF_BTP_DESTINATION>",
          "resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
          "deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_COMPLETION_MODEL>",
          "apiVersion": "<AI_CORE_COMPLETION_MODEL_API_VERSION>",
          "temperature": "<COMPLETION_MODEL_TEMPERATURE>"
        }
      }
    }
  }
}
Similarly, you can configure the embeddings section:
{
  "cds": {
    "requires": {
      "ai-core": {
        ...
        "embeddings": {
          "destination": "<NAME_OF_BTP_DESTINATION>",
          "resourceGroup": "<AI_CORE_RESOURCE_GROUP>",
          "deploymentId": "<AI_CORE_DEPLOYMENT_ID_OF_EMBEDDING_MODEL>",
          "apiVersion": "<AI_CORE_EMBEDDING_MODEL_API_VERSION>"
        }
      }
    }
  }
}
Here is a breakdown of each property:
- destination: the name of the BTP destination that points to the AI Core Service Instance. The plugin uses SAP Cloud SDK Connectivity to look this up.
- resourceGroup: the name of the AI Core resource group under which Configurations and Deployments are created - see Resource Groups.
- deploymentId: the ID of the model deployment.
- apiVersion: the API version of the model. Find the available values here.
- temperature: (only valid for Completions) allows you to influence the predictability of the generated text. Accepts values from 0 to 1, where 0 is the most deterministic (more predictable and prone to repetitions) and 1 is the least deterministic (less predictable but more prone to hallucinations).
Embeddings can only be used when running on a HANA database. SQLite does not support vectors and therefore cannot process similarity searches.
The plugin can be configured in a managed and an un-managed way.
This simply tells the plugin runtime whether you only want to use Core API functionalities like completions and embeddings, or you want a completely managed solution that includes database operations and context handling.
Un-Managed Configuration
To just use API functions, set up the corresponding configuration object as follows:
"ai-core": {
"completions": {
"managed": false,
// Other properties of the completions object
}
}
The same applies to embeddings. With this configuration, the plugin does not check your database model nor your service actions. It acts as a simple proxy between your code and AI Core, using the configuration provided.
This is the default configuration.
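For example, in un-managed mode you can consume the plugin from a custom service handler and keep full control over the chat context. The following is a minimal sketch (the ask action, the question parameter and the system prompt are illustrative; genericCompletion is described in the API section below):

const cds = require('@sap/cds');

module.exports = async function () {
  // The plugin service acts as a plain proxy towards AI Core
  const aiCore = await cds.connect.to('ai-core');

  this.on('ask', async (req) => {
    // In un-managed mode the plugin does not persist conversations or messages,
    // so the full chat context has to be built by the application
    const messages = [
      { role: 'system', content: 'You are a helpful assistant for a bookshop.' },
      { role: 'user', content: req.data.question }
    ];
    // Returns the AI response in the same { role, content } format
    return aiCore.genericCompletion(messages);
  });
};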
Managed Configuration
The managed configuration uses all the embedded functionalities of the plugin, which are described in the following paragraphs.
Use AI Artifacts
The plugin offers a set of aspects to simplify database modeling when using AI capabilities.
You can define entities at database level, include the corresponding aspect, and you are good to go.
Each aspect comes with specific custom annotations that allow the plugin to determine which entity is used to do what.
You can flexibly decide whether to use the aspects or to model your database yourself and add the AI annotations to your entities.
There are currently 4 available aspects and 9 available annotations.
Annotations
Annotations allow you to "mark" specific entities, properties and functions so that the plugin knows how to behave:
| Annotation Name | For | Description |
|---|---|---|
| @AIConversations | Entity | Sets the annotated entity as the "Conversations" entity |
| @AIMessages | Entity | Sets the annotated entity as the "Messages" entity |
| @AISystemPrompts | Entity | Sets the annotated entity as the source for static System prompts |
| @AIEmbeddingsStorage | Entity | Sets the annotated entity as the repository for vectorized texts |
| @AISummarize | Property | Marks the property as summarized: by default it is the title of the @AIConversations entity. The value of this property will be generated by a completion |
| @AIEmbedding | Property | Marks the property as the container for vector values in the @AIEmbeddingsStorage entity. Must be assigned to a field of type Vector |
| @AITextChunk | Property | Marks the property as the container for text values in the @AIEmbeddingsStorage entity |
| @AICompletion | Action | Annotates an action to act as a Completion endpoint |
| @AIEmbeddingGenerator | Action | Annotates an action to act as an embedding vector generator |
Entity and property annotations are used at runtime to determine how to properly handle the persistence of Messages, Conversations and Embeddings.
Action annotations are used at runtime - specifically at the cds.once('served') event - to attach custom handlers to actions and automatically handle the processing of Completions and Embedding generation.
More on this later.
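To give an idea of the mechanism, here is a simplified sketch of how such a handler registration works in CAP - not the plugin's actual implementation; ChatService and askAI are hypothetical names:

const cds = require('@sap/cds');

cds.once('served', async () => {
  // Look up a service and one of its actions once everything is bootstrapped
  const srv = cds.services['ChatService'];
  const def = srv?.operations?.askAI; // reflected action definition

  // If the action carries the marker annotation, attach a custom handler
  if (def?.['@AICompletion']) {
    srv.on('askAI', async (req) => {
      const { conversationID, content, useRag } = req.data;
      // ...rebuild the chat context, call the AI Core completion endpoint,
      // persist the exchanged messages and return the AI response...
    });
  }
});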
Aspects
The annotations above are automatically assigned if you decide to use the predefined aspects defined in index.cds:
- AIConversations: Represents the base entity that contains a list of conversations between a User and the AI. It comes with a predefined @AIConversations annotation and includes a single title property annotated with @AISummarize.
- AIMessages: Represents the base entity that contains the messages exchanged between a User and the AI for a given Conversation. Includes the following properties:
| Property | Type | Description |
|---|---|---|
| content | LargeString | Content of the message sent by either the user or the AI |
| role | String enum 'user'/'system'/'assistant'/'tool' | The role of the message sender |
- AISystemPrompt: Allows you to define static texts to be used as context during a conversation. They represent the value of the system message role in a conversation. There are currently two available types: SIMPLE and CONTEXT_AWARE. As the name implies, the first is used during simple conversations, whereas the latter is considered during RAG-aware chats.
- AIDocumentChunks: This is the base entity that contains vector embeddings. It comes with 3 properties:
| Property | Type | Description |
|---|---|---|
| embedding | Vector(1536) | The vector representation of a text chunk. Comes annotated with @AIEmbedding |
| text | LargeString | The original text chunk from which vectors are generated. Comes annotated with @AITextChunk |
| source | LargeString | The reference to the original text or document from which vectors and text chunks are determined |
Entity Modeling
As described, the above artifacts allow you to design a simple database model that satisfies the minimal configuration to perform conversations and embeddings.
Since aspects do not allow you to manage compositions or associations, developers must add the corresponding properties to the entities annotated with @AIConversations and @AIMessages (regardless of whether the entities include the provided aspects or not). Specifically:
entity Chats: AIConversations {
  ...
  Messages: Composition of many Messages on Messages.Chat = $self;
}
...
entity Messages: AIMessages {
  ...
  key Chat: Association to one Chat;
}
This way, the plugin knows how to deal with relationships between the two entities. In future enhancements, the plugin will automatically add the missing relationships between these base entities.
Completions
Completions are the most basic functionality of an AI chat. They allow message exchange between a User and the AI. It's very easy using AI Core to perform a "completion": given a deployment ID for a completion model, one POST call to the completion endpoint will provide an AI response.
The plugin simplifies the consumption of the completion model by attaching to an arbitrary OData action that is annotated with @AICompletion.
Please note that the annotated action must satisfy the following parameter signature:
{ conversationID: uuid | null, content: string, useRag: boolean }
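For example, with a hypothetical ChatService that exposes an action ask annotated with @AICompletion, a client could drive a conversation like this (the service path and the shape of the return value depend on your own action definition):

// Start a new conversation: no conversationID yet
let response = await fetch('/odata/v4/chat/ask', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    conversationID: null,
    content: 'Which books about space travel do you have?',
    useRag: false
  })
});
let result = await response.json();

// Follow-up message: reuse the conversation ID so the plugin can rebuild the chat context
response = await fetch('/odata/v4/chat/ask', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    conversationID: result.conversationID, // assuming your action returns it
    content: 'Only the ones published after 2010, please.',
    useRag: false // set to true to activate RAG-aware completions (see below)
  })
});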
To be effective, Completions must take care of two key points: system prompt and chat context.
System Prompt
The system prompt defines the basic behavior of the AI during a chat session: it basically provides "the instructions" to the AI, so that its answers are generated around a specific topic (or persona) and are not completely non-deterministic. See prompt engineering.
It is usually sent once per conversation and is hidden from the User's perspective: however, there are instances in which the system prompt can change dynamically during a conversation - for example during RAG-aware chats.
During "normal", non RAG conversations, the system context is calculated once: at the creation of a new Conversation, that is, when a Message that has no relationship with an existing Conversation is sent.
Chat Context
Chat context represents the entire history of messages exchanged during a conversation: an effective AI chat "remembers" previous messages, so that it does not repeat itself and keeps a true sense of conversing. To keep a chat context, LLMs usually require the entire history of messages to be sent whenever a new one is added: OpenAI defined a common standard in which messages are sent as a JSON array, where each element is an object like this:
{
  role: 'system' | 'user' | 'assistant',
  content: 'an arbitrary string that represents a message'
}
The AI Core plugin automatically manages the chat context by storing the Messages of a Conversation in the entities annotated with @AIConversations and @AIMessages: on each message, the context is rebuilt and sent to the completion endpoint.
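For illustration, when the user sends a third message, the context rebuilt from the stored Messages and sent to the LLM looks roughly like this:

const context = [
  { role: 'system', content: 'You are a helpful assistant for a bookshop.' },          // system prompt
  { role: 'user', content: 'Which books about space travel do you have?' },            // first user message
  { role: 'assistant', content: 'We currently stock two titles about space travel.' }, // previous AI answer
  { role: 'user', content: 'Tell me more about the first one.' }                       // the new message
];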
Since AI Core Generative AI Hub supports multiple completion models, there is no unified endpoint that can generically serve all LLMs. For example, OpenAI models like GPT-4o or GPT-4o-mini respond to a URL like /chat/completions?api-version=xxx, whereas Anthropic models like claude-3.5-sonnet respond to a URL like /invoke. This plugin does its best to automatically determine the correct endpoint: however, it is currently a static mapping between the model name and the corresponding completion URL. You can find it here.
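Conceptually, that mapping is just a lookup from model name to URL template, along these lines (an illustrative sketch only - the actual model names and map live in the plugin's source linked above):

// Illustrative only: the real mapping is maintained inside the plugin
const completionEndpoints = {
  'gpt-4o': (apiVersion) => '/chat/completions?api-version=' + apiVersion,
  'gpt-4o-mini': (apiVersion) => '/chat/completions?api-version=' + apiVersion,
  'claude-3.5-sonnet': () => '/invoke'
};

const buildEndpoint = (modelName, apiVersion) =>
  completionEndpoints[modelName]?.(apiVersion);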
Embeddings
As an LLM would say:
Embedding is a way to represent data, like words or images, as numerical vectors to capture relationships and meaning. Embeddings allow machines to understand, compare, and process data more effectively by transforming complex information into numerical forms that highlight patterns, similarities, and differences.
The plugin provides an easy way to produce embeddings from an arbitrary text or piece of data. There are currently two ways in which you can get embeddings:
- Using an action annotated with @AIEmbeddingGenerator: whichever text is sent to the action will be returned as a numerical vector.
- Connecting via cds.connect.to('ai-core') and calling the getEmbeddings() API function (see the sketch below).
WARNING: To use embeddings, the corresponding cds configuration MUST be set. See setup.
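A minimal sketch of the second option; note that the API section below documents this function as createEmbeddings(text), which is what the sketch uses (the input text is arbitrary):

const cds = require('@sap/cds');

// Requires the 'embeddings' configuration section (see Setup)
const aiCore = await cds.connect.to('ai-core');

const text = 'CAP is a framework for building enterprise-grade services and applications.';
const vector = await aiCore.createEmbeddings(text);

// The result is a plain array of numbers, which can be stored in a field
// like the Vector(1536) one provided by the AIDocumentChunks aspect
console.log(vector.length);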
RAG-Aware Completions
RAG-aware completions combine Retrieval-Augmented Generation (RAG) with conversational AI, enhancing responses by retrieving relevant external information, leading to more accurate, informed, and contextually appropriate dialogue in real-time.
The plugin uses the HANA Vector Engine to perform similarity searches and provide additional, specific context to the LLM.
During RAG-aware conversations, the system prompt is recalculated on every new message: using the user query, a similarity search is performed on the entity annotated with @AIEmbeddingsStorage and the resulting context is used as the system prompt.
This allows AI answers to be more tailored to application needs, avoiding a broader context and limiting answers to a specific topic.
RAG-aware conversations are activated by providing a truthy value for the useRag parameter of the completion action annotated with @AICompletion.
API
NOTE: API calls still require the minimal configuration for embeddings and completions. See Setup.
You can always call the Core API functions, regardless of the managed aspects and actions. There are 3 main functions:
| Function | Parameters | Description |
|---|---|---|
| genericCompletion(messages) | Array<{ role: string, content: string }> | Performs a completion call to the LLM deployed in the configured deploymentId. It expects a full chat context, including the system role. Returns the AI response in the same format. |
| createEmbeddings(text) | text: string | Generates a Vector of embeddings, using the LLM deployed in the configured deploymentId. Returns an array of numbers. |
| vectorSearch(params) | See below | Allows the execution of a generic similarity search on HANA |
The third function, vectorSearch, allows you to perform vector-based searches on a user-specified table. It accepts the following parameters:
| Name | Type | Description |
|---|---|---|
| query | string | The text to search for |
| tableName | string | The table name, in HANA format, that contains embeddings and texts, e.g. SAP_DEMO_EMBEDDINGS |
| embeddingColumnName | string | The name of the table field that contains the vectorized representation of the data. Must be of type REAL_VECTOR (cds.Vector(1536) in CDS) |
| textColumnName | string | The name of the table field that contains the textual representation of the data |
| searchAlgorithm | string | The name of the similarity algorithm. HANA currently supports COSINE_SIMILARITY and L2DISTANCE |
| minScore | number | A value between 0 and 1, used to filter out elements with a score lower than the specified value |
| candidates | number | Number of candidates to read from HANA |
Returns an object with the found content and a metrics object that includes similarity scores and the table entry that generated each result:
{
  content: ['I am one result in textual representation', 'I am number two'],
  metrics: [{
    score: 0.945424895818,
    textContent: 'I am one result in textual representation',
    tableEntry: {
      foo: 'bar'
    }
  }, {...}]
}
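For example, a similarity search against a table shaped like the AIDocumentChunks aspect could look like this (a sketch - the table and column names must match your own deployed model; the parameters are the ones listed above):

const cds = require('@sap/cds');

const aiCore = await cds.connect.to('ai-core');

const { content, metrics } = await aiCore.vectorSearch({
  query: 'How do I return a damaged book?',
  tableName: 'SAP_DEMO_EMBEDDINGS',     // HANA table holding texts and embeddings
  embeddingColumnName: 'EMBEDDING',     // REAL_VECTOR column (cds.Vector(1536) in CDS)
  textColumnName: 'TEXT',               // column holding the original text chunks
  searchAlgorithm: 'COSINE_SIMILARITY', // or 'L2DISTANCE'
  minScore: 0.7,                        // drop results scoring below 0.7
  candidates: 5                         // number of candidates to read from HANA
});

// 'content' holds the matching text chunks, 'metrics' carries scores and table entries
console.log(content, metrics.map((m) => m.score));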
Local Testing
The plugin can only be tested if associated with a CAP application: you can quickly spin up a basic bookshop application and add the plugin usage.
Testing with SQLite only allows the usage of simple Completions: vectors are not supported in SQLite, so Embeddings and RAG-aware Completions will not work.
There are currently no checks performed by the plugin on this: if you try to deploy a model that uses Vectors to SQLite, the database driver will throw an error.
You can, however, perform hybrid testing: bind your application to an SAP HANA service and still run locally.
To simplify development, bind to a destination service instance as well, in order to easily consume the required destination that points to the AI Core deployments.
Testing with Jest
Under development
Contributing
If you'd like to contribute, please fork the repository and use a feature branch. Pull requests are warmly welcome.
Please have a look at CONTRIBUTING.md for additional info.
Code of Conduct
Licensing
See LICENSE