chat-about-video v2.5.0

Chat about a video clip using the powerful OpenAI GPT-4 Vision or GPT-4o.

chat-about-video is an open-source NPM package designed to accelerate the development of conversation applications about video content. Harnessing the capabilities of OpenAI GPT-4 Vision or GPT-4o services from Microsoft Azure or OpenAI, this package opens up a range of usage scenarios with minimal effort.
Usage scenarios
There are two approaches for feeding video content into GPT-4 Vision. chat-about-video supports both of them.
Frame image extraction:
- Integrate GPT-4 Vision or GPT-4o from Microsoft Azure or OpenAI effortlessly.
- Utilize ffmpeg integration provided by this package for frame image extraction or opt for a DIY approach.
- Store frame images with ease, supporting Azure Blob Storage and AWS S3.
- GPT-4 Vision hosted in Azure allows analysis of up to 10 frame images.
- GPT-4 Vision or GPT-4o hosted in OpenAI allows analysis of more than 10 frame images.
Video indexing with Microsoft Azure:
- Exclusively supported by GPT-4 Vision from Microsoft Azure.
- Ingest videos seamlessly into Microsoft Azure's Video Retrieval Index.
- Automatic extraction of up to 20 frame images using Video Retrieval Indexer.
- Default integration of speech transcription for enhanced comprehension.
- Flexible storage options with support for Azure Blob Storage and AWS S3.
Usage
Installation
Add chat-about-video as a dependency to your Node.js application using the following command:

```shell
npm i chat-about-video
```
Dependencies
If you intend to utilize ffmpeg for extracting video frame images, ensure it is installed on your system. You can install it using either a system package manager or a helper NPM package:

```shell
sudo apt install ffmpeg
# or
npm i @ffmpeg-installer/ffmpeg
```

If you plan to use Azure Blob Storage, include the following dependency:

```shell
npm i @azure/storage-blob
```

For using AWS S3, install the following dependencies:

```shell
npm i @handy-common-utils/aws-utils @aws-sdk/s3-request-presigner @aws-sdk/client-s3
```
Usage in code
To integrate chat-about-video into your Node.js application, follow these simple steps:
- Create an instance of the ChatAboutVideo class. The constructor allows you to pass in configuration options. Most configuration options come with sensible default values, but you can specify your own for further customization.
- Use the startConversation(videoFilePath) function to initiate a conversation about a video clip. This function returns a Conversation object. The video file or its frame images are uploaded to Azure Blob Storage or AWS S3 during this step.
- Interact with GPT by calling the say(question, { maxTokens: 2000 }) function on the conversation. You pass in a question and receive an answer. Message history is automatically kept during the conversation, providing context for a more coherent dialogue. The second parameter of say(...) allows you to specify options, such as maxTokens, for further customization.
- Wrap up the conversation using the end() function. This ensures proper clean-up and resource management.
Examples
Below is an example chat application, which
- uses a GPT deployment (in this example named 'gpt4vision') hosted in Microsoft Azure;
- uses ffmpeg to extract video frame images;
- stores video frame images in Azure Blob Storage
  - container name: 'vision-experiment-input'
  - object path prefix: 'video-frames/'
- reads credentials from environment variables;
- reads the input video file path from the environment variable 'DEMO_VIDEO'.
```typescript
import readline from 'node:readline';
import { ChatAboutVideo } from 'chat-about-video';

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));

async function demo() {
  const chat = new ChatAboutVideo({
    openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!, // This line is not needed if you are using GPT provided by OpenAI rather than by Microsoft Azure.
    openAiApiKey: process.env.OPENAI_API_KEY!, // This is the API key.
    azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!, // This line is not needed if you'd like to use AWS S3.
    openAiDeploymentName: 'gpt4vision', // For GPT provided by OpenAI, this is the model name. For GPT provided by Microsoft Azure, this is the deployment name.
    storageContainerName: 'vision-experiment-input', // Blob container name in Azure or S3 bucket name in AWS
    storagePathPrefix: 'video-frames/',
  });
  const conversation = await chat.startConversation(process.env.DEMO_VIDEO!);

  while (true) {
    const question = await prompt('\nUser: ');
    if (!question) {
      continue;
    }
    if (['exit', 'quit'].includes(question.toLowerCase().trim())) {
      break;
    }
    const answer = await conversation.say(question, { maxTokens: 2000 });
    console.log('\nAI: ' + answer);
  }
  rl.close(); // release stdin so the process can exit
}
demo().catch((error) => console.error(error));
```
Below is an example showing how to create an instance of ChatAboutVideo that
- uses GPT provided by OpenAI;
- uses ffmpeg to extract video frame images;
- stores video frame images in AWS S3
  - bucket name: 'my-s3-bucket'
  - object path prefix: 'video-frames/'
- reads the API key from the environment variable 'OPENAI_API_KEY'.

```typescript
const chat = new ChatAboutVideo({
  openAiApiKey: process.env.OPENAI_API_KEY!,
  openAiDeploymentName: 'gpt-4-vision-preview', // or 'gpt-4o'
  storageContainerName: 'my-s3-bucket',
  storagePathPrefix: 'video-frames/',
  extractVideoFrames: {
    limit: 30, // override the default value 10
    interval: 2, // override the default value 5
  },
} as any);
```
Below is an example showing how to create an instance of ChatAboutVideo that
- uses a GPT deployment (in this example named 'gpt4vision') hosted in Microsoft Azure;
- uses the Microsoft Video Retrieval Index to extract frames and analyse the video
  - a randomly named index is created automatically;
  - the index is also deleted automatically when the conversation ends;
- stores the video file in Azure Blob Storage
  - container name: 'vision-experiment-input'
  - object path prefix: 'videos/'
- reads credentials from environment variables.

```typescript
const chat = new ChatAboutVideo({
  openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!,
  openAiApiKey: process.env.AZURE_OPENAI_API_KEY!,
  azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
  openAiDeploymentName: 'gpt4vision',
  storageContainerName: 'vision-experiment-input',
  storagePathPrefix: 'videos/',
  videoRetrievalIndex: {
    endpoint: process.env.AZURE_CV_API_ENDPOINT!,
    apiKey: process.env.AZURE_CV_API_KEY!,
    createIndexIfNotExists: true,
    deleteIndexWhenConversationEnds: true,
  },
});
```
API
Modules
- aws
- azure
- azure/video-retrieval-api-client
- chat
- client-hack
- index
- storage
- storage/types
- video
- video/ffmpeg
- video/types
Classes
Class: VideoRetrievalApiClient
azure/video-retrieval-api-client.VideoRetrievalApiClient
Constructors
constructor
• new VideoRetrievalApiClient(endpointBaseUrl, apiKey, apiVersion?)
Parameters
Name | Type | Default value |
---|---|---|
endpointBaseUrl | string | undefined |
apiKey | string | undefined |
apiVersion | string | '2023-05-01-preview' |
Methods

createIndex

▸ createIndex(indexName, indexOptions?): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |
indexOptions? | CreateIndexOptions |

Returns

Promise<void>

createIndexIfNotExist

▸ createIndexIfNotExist(indexName, indexOptions?): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |
indexOptions? | CreateIndexOptions |

Returns

Promise<void>

createIngestion

▸ createIngestion(indexName, ingestionName, ingestion): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |
ingestionName | string |
ingestion | IngestionRequest |

Returns

Promise<void>

deleteDocument

▸ deleteDocument(indexName, documentUrl): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |
documentUrl | string |

Returns

Promise<void>

deleteIndex

▸ deleteIndex(indexName): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |

Returns

Promise<void>

getIndex

▸ getIndex(indexName): Promise<undefined | IndexSummary>

Parameters

Name | Type |
---|---|
indexName | string |

Returns

Promise<undefined | IndexSummary>

getIngestion

▸ getIngestion(indexName, ingestionName): Promise<IngestionSummary>

Parameters

Name | Type |
---|---|
indexName | string |
ingestionName | string |

Returns

Promise<IngestionSummary>

ingest

▸ ingest(indexName, ingestionName, ingestion, backoff?): Promise<void>

Parameters

Name | Type |
---|---|
indexName | string |
ingestionName | string |
ingestion | IngestionRequest |
backoff | number[] |

Returns

Promise<void>

listDocuments

▸ listDocuments(indexName): Promise<DocumentSummary[]>

Parameters

Name | Type |
---|---|
indexName | string |

Returns

Promise<DocumentSummary[]>

listIndexes

▸ listIndexes(): Promise<IndexSummary[]>

Returns

Promise<IndexSummary[]>
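The methods above compose into a simple lifecycle: make sure an index exists, ingest a video document, chat about it, then delete the index. The sketch below illustrates that flow against IndexLifecycleClient, a hypothetical structural subset of VideoRetrievalApiClient (method names taken from the tables above), so the flow can be followed without real Azure credentials; the StubClient is purely illustrative.

```typescript
// A hypothetical structural subset of VideoRetrievalApiClient, for illustration.
interface IndexLifecycleClient {
  createIndexIfNotExist(indexName: string): Promise<void>;
  ingest(indexName: string, ingestionName: string, ingestion: { videos: { mode: 'add'; documentUrl: string }[] }): Promise<void>;
  deleteIndex(indexName: string): Promise<void>;
}

// The typical lifecycle: ensure index -> ingest -> (chat) -> delete index.
async function analyseVideo(client: IndexLifecycleClient, videoUrl: string): Promise<void> {
  const indexName = `tmp-${Date.now()}`; // a throwaway, randomly named index
  await client.createIndexIfNotExist(indexName);
  try {
    await client.ingest(indexName, 'ingestion-1', { videos: [{ mode: 'add', documentUrl: videoUrl }] });
    // ... chat about the video here ...
  } finally {
    // Clean up, as deleteIndexWhenConversationEnds would do.
    await client.deleteIndex(indexName);
  }
}

// A recording stub standing in for a real client, for illustration only.
class StubClient implements IndexLifecycleClient {
  calls: string[] = [];
  async createIndexIfNotExist(): Promise<void> { this.calls.push('createIndexIfNotExist'); }
  async ingest(): Promise<void> { this.calls.push('ingest'); }
  async deleteIndex(): Promise<void> { this.calls.push('deleteIndex'); }
}
```

The try/finally mirrors what deleteIndexWhenConversationEnds does for you: the index is removed even if ingestion or the conversation fails.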
Class: ChatAboutVideo
chat.ChatAboutVideo
Constructors
constructor
• new ChatAboutVideo(options, log?)
Parameters
Name | Type |
---|---|
options | ChatAboutVideoConstructorOptions |
log | LineLogger \<(message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void > |
Properties
Property | Description |
---|---|
Protected client: OpenAIClient | |
Protected log: LineLogger \<(message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void > | |
Protected options: ChatAboutVideoOptions |
Methods
prepareVideoFrames
▸ Protected prepareVideoFrames(conversationId, videoFile): Promise<PreparationResult>
Parameters
Name | Type |
---|---|
conversationId | string |
videoFile | string |
Returns
Promise<PreparationResult>
prepareVideoRetrievalIndex
▸ Protected prepareVideoRetrievalIndex(conversationId, videoFile): Promise<PreparationResult>
Parameters
Name | Type |
---|---|
conversationId | string |
videoFile | string |
Returns
Promise<PreparationResult>
startConversation
▸ startConversation(videoFile): Promise<Conversation>
Start a conversation about a video.
Parameters
Name | Type | Description |
---|---|---|
videoFile | string | Path to a video file in local file system. |
Returns
Promise<Conversation>
The conversation.
Class: Conversation
chat.Conversation
Constructors
constructor
• new Conversation(client, deploymentName, conversationId, messages, options?, cleanup?, log?)
Parameters
Name | Type |
---|---|
client | OpenAIClient |
deploymentName | string |
conversationId | string |
messages | ChatRequestMessage [] |
options? | GetChatCompletionsOptions |
cleanup? | () => Promise \<void > |
log | LineLogger \<(message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void > |
Properties
Property | Description |
---|---|
Protected Optional cleanup: () => Promise \<void > | |
Protected client: OpenAIClient | |
Protected conversationId: string | |
Protected deploymentName: string | |
Protected log: LineLogger \<(message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void , (message? : any , ...optionalParams : any []) => void > | |
Protected messages: ChatRequestMessage [] | |
Protected Optional options: GetChatCompletionsOptions |
Methods
end
▸ end(): Promise<void>
Returns
Promise<void>
say
▸ say(message, options?): Promise<undefined | string>

Say something in the conversation, and get the response from the AI.
Parameters
Name | Type | Description |
---|---|---|
message | string | The message to say in the conversation. |
options? | GetChatCompletionsOptions | Options for fine control. |
Returns
Promise<undefined | string>
The response/completion
Interfaces
Interface: CreateIndexOptions
azure/video-retrieval-api-client.CreateIndexOptions
Properties
Property | Description |
---|---|
Optional features: IndexFeature [] | |
Optional metadataSchema: IndexMetadataSchema | |
Optional userData: object |
Interface: DocumentSummary
azure/video-retrieval-api-client.DocumentSummary
Properties
Property | Description |
---|---|
createdDateTime: string | |
documentId: string | |
Optional documentUrl: string | |
lastModifiedDateTime: string | |
Optional metadata: object | |
Optional userData: object |
Interface: IndexFeature
azure/video-retrieval-api-client.IndexFeature
Properties
Property | Description |
---|---|
Optional domain: "surveillance" | "generic" | |
Optional modelVersion: string | |
name: "vision" | "speech" |
Interface: IndexMetadataSchema
azure/video-retrieval-api-client.IndexMetadataSchema
Properties
Property | Description |
---|---|
fields: IndexMetadataSchemaField [] | |
Optional language: string |
Interface: IndexMetadataSchemaField
azure/video-retrieval-api-client.IndexMetadataSchemaField
Properties
Property | Description |
---|---|
filterable: boolean | |
name: string | |
searchable: boolean | |
type: "string" | "datetime" |
Interface: IndexSummary
azure/video-retrieval-api-client.IndexSummary
Properties
Property | Description |
---|---|
createdDateTime: string | |
eTag: string | |
Optional features: IndexFeature [] | |
lastModifiedDateTime: string | |
name: string | |
Optional userData: object |
Interface: IngestionRequest
azure/video-retrieval-api-client.IngestionRequest
Properties
Property | Description |
---|---|
Optional filterDefectedFrames: boolean | |
Optional generateInsightIntervals: boolean | |
Optional includeSpeechTranscript: boolean | |
Optional moderation: boolean | |
videos: VideoIngestion [] |
Interface: IngestionStatusDetail
azure/video-retrieval-api-client.IngestionStatusDetail
Properties
Property | Description |
---|---|
documentId: string | |
documentUrl: string | |
lastUpdatedTime: string | |
succeeded: boolean |
Interface: IngestionSummary
azure/video-retrieval-api-client.IngestionSummary
Properties
Property | Description |
---|---|
Optional batchName: string | |
createdDateTime: string | |
Optional fileStatusDetails: IngestionStatusDetail [] | |
lastModifiedDateTime: string | |
name: string | |
state: "NotStarted" | "Running" | "Completed" | "Failed" | "PartiallySucceeded" |
Interface: VideoIngestion
azure/video-retrieval-api-client.VideoIngestion
Properties
Property | Description |
---|---|
Optional documentId: string | |
documentUrl: string | |
Optional metadata: object | |
mode: "update" | "remove" | "add" | |
Optional userData: object |
Interface: ChatAboutVideoOptions
chat.ChatAboutVideoOptions
Option settings for ChatAboutVideo
Properties
Property | Description |
---|---|
Optional extractVideoFrames: Object | Options controlling frame extraction; see the type declaration below. |
fileBatchUploader: FileBatchUploader | Function for uploading files |
Optional initialPrompts: ChatRequestMessage[] | Initial prompts to be added to the chat history before frame images. |
openAiDeploymentName: string | Name/ID of the deployment |
Optional startPrompts: ChatRequestMessage[] | Prompts to be added to the chat history right after frame images. |
storageContainerName: string | Storage container for storing frame images of the video. |
storagePathPrefix: string | Path prefix to be prepended for storing frame images of the video. |
tmpDir: string | Temporary directory for storing temporary files. If not specified, the temporary directory of the OS will be used. |
Optional videoRetrievalIndex: Object | Options for using a Video Retrieval Index; see the type declaration below. |

Type declaration of extractVideoFrames

Name | Type | Description |
---|---|---|
extractor | VideoFramesExtractor | Function for extracting frames from the video. If not specified, a default function using ffmpeg will be used. |
height | undefined \| number | Video frame height; default is undefined, which means the scaling will be determined by the videoFrameWidth option. If both videoFrameWidth and videoFrameHeight are not specified, then the frames will not be resized/scaled. |
interval | number | Interval between frames to be extracted, in seconds. Default value is 5. |
limit | number | Maximum number of frames to be extracted. Default value is 10, which is the current per-request limitation of ChatGPT Vision. |
width | undefined \| number | Video frame width; default is 200. If both videoFrameWidth and videoFrameHeight are not specified, then the frames will not be resized/scaled. |

Type declaration of videoRetrievalIndex

Name | Type |
---|---|
apiKey | string |
createIndexIfNotExists? | boolean |
deleteDocumentWhenConversationEnds? | boolean |
deleteIndexWhenConversationEnds? | boolean |
endpoint | string |
indexName? | string |

Modules
Module: aws
Functions
createAwsS3FileBatchUploader
▸ createAwsS3FileBatchUploader(s3Client, expirationSeconds, parallelism?): FileBatchUploader
Parameters
Name | Type | Default value |
---|---|---|
s3Client | S3Client | undefined |
expirationSeconds | number | undefined |
parallelism | number | 3 |
Returns

FileBatchUploader
Module: azure
References
CreateIndexOptions
Re-exports CreateIndexOptions
DocumentSummary
Re-exports DocumentSummary
IndexFeature
Re-exports IndexFeature
IndexMetadataSchema
Re-exports IndexMetadataSchema
IndexMetadataSchemaField
Re-exports IndexMetadataSchemaField
IndexSummary
Re-exports IndexSummary
IngestionRequest
Re-exports IngestionRequest
IngestionStatusDetail
Re-exports IngestionStatusDetail
IngestionSummary
Re-exports IngestionSummary
PaginatedWithNextLink
Re-exports PaginatedWithNextLink
VideoIngestion
Re-exports VideoIngestion
VideoRetrievalApiClient
Re-exports VideoRetrievalApiClient
Functions
createAzureBlobStorageFileBatchUploader
▸ createAzureBlobStorageFileBatchUploader(blobServiceClient, expirationSeconds, parallelism?): FileBatchUploader
Parameters
Name | Type | Default value |
---|---|---|
blobServiceClient | BlobServiceClient | undefined |
expirationSeconds | number | undefined |
parallelism | number | 3 |
Returns

FileBatchUploader
Module: azure/video-retrieval-api-client
Classes
Interfaces
- CreateIndexOptions
- DocumentSummary
- IndexFeature
- IndexMetadataSchema
- IndexMetadataSchemaField
- IndexSummary
- IngestionRequest
- IngestionStatusDetail
- IngestionSummary
- VideoIngestion
Type Aliases
PaginatedWithNextLink
Ƭ PaginatedWithNextLink<T>: Object
Type parameters
Name |
---|
T |
Type declaration
Name | Type |
---|---|
nextLink? | string |
value | T [] |
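A type of this shape is typically consumed by following nextLink until it is absent. Below is a minimal sketch of such a loop; the fetchPage function is a hypothetical stand-in for whatever HTTP call returns each page, and the type is re-declared locally so the sketch is self-contained.

```typescript
// Local stand-in for the PaginatedWithNextLink type described above.
type PaginatedWithNextLink<T> = { nextLink?: string; value: T[] };

// Collect all items by following nextLink until the last page.
async function fetchAll<T>(
  firstLink: string,
  fetchPage: (link: string) => Promise<PaginatedWithNextLink<T>>, // hypothetical page fetcher
): Promise<T[]> {
  const all: T[] = [];
  let link: string | undefined = firstLink;
  while (link) {
    const page = await fetchPage(link);
    all.push(...page.value);
    link = page.nextLink; // undefined on the last page, which ends the loop
  }
  return all;
}
```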
Module: chat
Classes
Interfaces
Type Aliases
ChatAboutVideoConstructorOptions
Ƭ ChatAboutVideoConstructorOptions: Partial<Omit<ChatAboutVideoOptions, "videoRetrievalIndex" | "extractVideoFrames">> & Required<Pick<ChatAboutVideoOptions, "openAiDeploymentName" | "storageContainerName">> & { extractVideoFrames?: Partial<Exclude<ChatAboutVideoOptions["extractVideoFrames"], undefined>>; videoRetrievalIndex?: Partial<ChatAboutVideoOptions["videoRetrievalIndex"]> & Pick<Exclude<ChatAboutVideoOptions["videoRetrievalIndex"], undefined>, "endpoint" | "apiKey"> } & { azureStorageConnectionString?: string; downloadUrlExpirationSeconds?: number; openAiApiKey: string; openAiEndpoint?: string }
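The effect of the Partial/Required composition above is that most fields become optional while a few stay mandatory. The sketch below demonstrates the pattern with simplified local stand-in types (not the package's actual definitions), showing which fields the compiler would require.

```typescript
// Simplified local stand-ins, for illustration only.
interface ChatAboutVideoOptions {
  openAiDeploymentName: string;
  storageContainerName: string;
  storagePathPrefix: string;
  tmpDir: string;
}

// Everything optional, except the picked fields and the API key.
type ConstructorOptionsSketch =
  Partial<ChatAboutVideoOptions> &
  Required<Pick<ChatAboutVideoOptions, 'openAiDeploymentName' | 'storageContainerName'>> & {
    openAiApiKey: string;
    openAiEndpoint?: string;
  };

// This object typechecks: only the mandatory fields are provided.
const options: ConstructorOptionsSketch = {
  openAiApiKey: 'sk-...',
  openAiDeploymentName: 'gpt4vision',
  storageContainerName: 'my-container',
};
```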
Module: client-hack
Functions
fixClient
▸ fixClient(openAIClient): void
Parameters
Name | Type |
---|---|
openAIClient | any |
Returns
void
Module: index
References
ChatAboutVideo
Re-exports ChatAboutVideo
ChatAboutVideoConstructorOptions
Re-exports ChatAboutVideoConstructorOptions
ChatAboutVideoOptions
Re-exports ChatAboutVideoOptions
Conversation
Re-exports Conversation
FileBatchUploader
Re-exports FileBatchUploader
VideoFramesExtractor
Re-exports VideoFramesExtractor
extractVideoFramesWithFfmpeg
Re-exports extractVideoFramesWithFfmpeg
lazyCreatedFileBatchUploader
Re-exports lazyCreatedFileBatchUploader
lazyCreatedVideoFramesExtractor
Re-exports lazyCreatedVideoFramesExtractor
Module: storage
References
FileBatchUploader
Re-exports FileBatchUploader
Functions
lazyCreatedFileBatchUploader
▸ lazyCreatedFileBatchUploader(creator): FileBatchUploader
Parameters
Name | Type |
---|---|
creator | Promise \<FileBatchUploader > |
Returns

FileBatchUploader
Module: storage/types
Type Aliases
FileBatchUploader
Ƭ FileBatchUploader: (dir: string, fileNames: string[], containerName: string, blobPathPrefix: string) => Promise<string[]>

Type declaration

▸ (dir, fileNames, containerName, blobPathPrefix): Promise<string[]>

Parameters

Name | Type |
---|---|
dir | string |
fileNames | string[] |
containerName | string |
blobPathPrefix | string |

Returns

Promise<string[]>
Module: video
References
VideoFramesExtractor
Re-exports VideoFramesExtractor
extractVideoFramesWithFfmpeg
Re-exports extractVideoFramesWithFfmpeg
Functions
lazyCreatedVideoFramesExtractor
▸ lazyCreatedVideoFramesExtractor(creator): VideoFramesExtractor
Parameters
Name | Type |
---|---|
creator | Promise \<VideoFramesExtractor > |
Returns

VideoFramesExtractor
Module: video/ffmpeg
Functions
extractVideoFramesWithFfmpeg
▸ extractVideoFramesWithFfmpeg(inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise<string[]>
Parameters
Name | Type |
---|---|
inputFile | string |
outputDir | string |
intervalSec | number |
format? | string |
width? | number |
height? | number |
startSec? | number |
endSec? | number |
Returns
Promise
\<string
[]>
Module: video/types
Type Aliases
VideoFramesExtractor
Ƭ VideoFramesExtractor: (inputFile: string, outputDir: string, intervalSec: number, format?: string, width?: number, height?: number, startSec?: number, endSec?: number) => Promise<string[]>

Type declaration

▸ (inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise<string[]>

Parameters

Name | Type |
---|---|
inputFile | string |
outputDir | string |
intervalSec | number |
format? | string |
width? | number |
height? | number |
startSec? | number |
endSec? | number |

Returns

Promise<string[]>
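To make these parameters concrete, here is one plausible way they could map onto an ffmpeg command line; the exact filters the package's built-in extractor uses may differ. The function only builds the argument list (using the common fps and scale filters), so it can be inspected without ffmpeg installed; a real extractor would spawn ffmpeg with these arguments and return the paths of the frame images written to outputDir.

```typescript
// Hypothetical mapping of VideoFramesExtractor parameters to ffmpeg arguments.
function buildFfmpegArgs(
  inputFile: string,
  outputDir: string,
  intervalSec: number,
  format = 'jpg',
  width?: number,
  height?: number,
  startSec?: number,
  endSec?: number,
): string[] {
  const filters = [`fps=1/${intervalSec}`]; // one frame every intervalSec seconds
  if (width || height) {
    // -1 keeps the aspect ratio for whichever dimension is unspecified
    filters.push(`scale=${width ?? -1}:${height ?? -1}`);
  }
  const args: string[] = [];
  if (startSec !== undefined) args.push('-ss', String(startSec)); // seek to start
  if (endSec !== undefined) args.push('-to', String(endSec)); // stop at end
  args.push('-i', inputFile, '-vf', filters.join(','), `${outputDir}/%d.${format}`);
  return args;
}
```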