
chat-about-video

Chat about a video clip using the powerful OpenAI GPT-4 Vision or GPT-4o.


chat-about-video is an open-source NPM package designed to accelerate the development of conversation applications about video content. Harnessing the capabilities of OpenAI GPT-4 Vision or GPT-4o services from Microsoft Azure or OpenAI, this package opens up a range of usage scenarios with minimal effort.

Usage scenarios

There are two approaches for feeding video content into GPT-4 Vision. chat-about-video supports both of them.

Frame image extraction:

  • Integrate GPT-4 Vision or GPT-4o from Microsoft Azure or OpenAI effortlessly.
  • Utilize ffmpeg integration provided by this package for frame image extraction or opt for a DIY approach.
  • Store frame images with ease, supporting Azure Blob Storage and AWS S3.
  • GPT-4 Vision hosted in Azure allows analysis of up to 10 frame images.
  • GPT-4 Vision or GPT-4o hosted in OpenAI allows analysis of more than 10 frame images.

Video indexing with Microsoft Azure:

  • Exclusively supported by GPT-4 Vision from Microsoft Azure.
  • Ingest videos seamlessly into Microsoft Azure's Video Retrieval Index.
  • Automatic extraction of up to 20 frame images using Video Retrieval Indexer.
  • Default integration of speech transcription for enhanced comprehension.
  • Flexible storage options with support for Azure Blob Storage and AWS S3.

Usage

Installation

Add chat-about-video as a dependency to your Node.js application using the following command:

npm i chat-about-video

Dependencies

If you intend to utilize ffmpeg for extracting video frame images, ensure it is installed on your system. You can install it using either a system package manager or a helper NPM package:

sudo apt install ffmpeg
# or
npm i @ffmpeg-installer/ffmpeg

If you plan to use Azure Blob Storage, include the following dependency:

npm i @azure/storage-blob

For using AWS S3, install the following dependencies:

npm i @handy-common-utils/aws-utils @aws-sdk/s3-request-presigner @aws-sdk/client-s3

Usage in code

To integrate chat-about-video into your Node.js application, follow these simple steps:

  1. Instantiate the ChatAboutVideo class. The constructor accepts configuration options.
  • Most configuration options come with sensible default values, but you can specify your own for further customization.
  2. Use the startConversation(videoFilePath) function to initiate a conversation about a video clip. This function returns a Conversation object. The video file or its frame images are uploaded to Azure Blob Storage or AWS S3 during this step.
  3. Interact with GPT by using the say(question, { maxTokens: 2000 }) function within the conversation. You pass in a question and receive an answer.
  • Message history is automatically kept during the conversation, providing context for a more coherent dialogue.
  • The second parameter of the say(...) function allows you to pass additional options for further customization.
  4. Wrap up the conversation using the end() function. This ensures proper clean-up and resource management.

Examples

Below is an example chat application, which

  • uses GPT deployment (in this example, it is named 'gpt4vision') hosted in Microsoft Azure;
  • uses ffmpeg to extract video frame images;
  • stores video frame images in Azure Blob Storage;
    • container name: 'vision-experiment-input'
    • object path prefix: 'video-frames/'
  • reads credentials from environment variables
  • reads input video file path from environment variable 'DEMO_VIDEO'
import readline from 'node:readline';
import { ChatAboutVideo } from 'chat-about-video';

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });
const prompt = (question: string) => new Promise<string>((resolve) => rl.question(question, resolve));

async function demo() {
  const chat = new ChatAboutVideo({
    openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!, // This line is not needed if you are using GPT provided by OpenAI rather than by Microsoft Azure.
    openAiApiKey: process.env.OPENAI_API_KEY!, // This is the API key.
    azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!, // This line is not needed if you'd like to use AWS S3.
    openAiDeploymentName: 'gpt4vision', // For GPT provided by OpenAI, this is the model name. For GPT provided by Microsoft Azure, this is the deployment name.
    storageContainerName: 'vision-experiment-input', // Blob container name in Azure or S3 bucket name in AWS
    storagePathPrefix: 'video-frames/',
  });

  const conversation = await chat.startConversation(process.env.DEMO_VIDEO!);
  
  while(true) {
    const question = await prompt('\nUser: ');
    if (!question) {
      continue;
    }
    if (['exit', 'quit'].includes(question.toLowerCase().trim())) {
      break;
    }
    const answer = await conversation.say(question, { maxTokens: 2000 });
    console.log('\nAI: ' + answer);
  }
}

demo().catch((error) => console.error(error));

Below is an example showing how to create an instance of ChatAboutVideo that

  • uses GPT provided by OpenAI;
  • uses ffmpeg to extract video frame images;
  • stores video frame images in AWS S3;
    • bucket name: 'my-s3-bucket'
    • object path prefix: 'video-frames/'
  • reads API key from environment variable 'OPENAI_API_KEY'
  const chat = new ChatAboutVideo({
    openAiApiKey: process.env.OPENAI_API_KEY!,
    openAiDeploymentName: 'gpt-4-vision-preview', // or 'gpt-4o'
    storageContainerName: 'my-s3-bucket',
    storagePathPrefix: 'video-frames/',
    extractVideoFrames: {
      limit: 30,    // override default value 10
      interval: 2,  // override default value 5
    },
  } as any);

Below is an example showing how to create an instance of ChatAboutVideo that

  • uses GPT deployment (in this example, it is named 'gpt4vision') hosted in Microsoft Azure;
  • uses Microsoft Video Retrieval Index to extract frames and analyse the video
    • A randomly named index is created automatically.
    • The index is also deleted automatically when the conversation ends.
  • stores video file in Azure Blob Storage;
    • container name: 'vision-experiment-input'
    • object path prefix: 'videos/'
  • reads credentials from environment variables
  const chat = new ChatAboutVideo({
    openAiEndpoint: process.env.AZURE_OPENAI_API_ENDPOINT!,
    openAiApiKey: process.env.AZURE_OPENAI_API_KEY!,
    azureStorageConnectionString: process.env.AZURE_STORAGE_CONNECTION_STRING!,
    openAiDeploymentName: 'gpt4vision',
    storageContainerName: 'vision-experiment-input',
    storagePathPrefix: 'videos/',
    videoRetrievalIndex: {
      endpoint: process.env.AZURE_CV_API_ENDPOINT!,
      apiKey: process.env.AZURE_CV_API_KEY!,
      createIndexIfNotExists: true,
      deleteIndexWhenConversationEnds: true,
    },
  });

API

chat-about-video

Modules

Classes

Class: VideoRetrievalApiClient

azure/video-retrieval-api-client.VideoRetrievalApiClient

Constructors

constructor

new VideoRetrievalApiClient(endpointBaseUrl, apiKey, apiVersion?)

Parameters
  • endpointBaseUrl: string
  • apiKey: string
  • apiVersion: string (default value: '2023-05-01-preview')

Methods

createIndex

createIndex(indexName, indexOptions?): Promise\<void>

Parameters
  • indexName: string
  • indexOptions: CreateIndexOptions
Returns

Promise\<void>


createIndexIfNotExist

createIndexIfNotExist(indexName, indexOptions?): Promise\<void>

Parameters
  • indexName: string
  • indexOptions?: CreateIndexOptions
Returns

Promise\<void>


createIngestion

createIngestion(indexName, ingestionName, ingestion): Promise\<void>

Parameters
  • indexName: string
  • ingestionName: string
  • ingestion: IngestionRequest
Returns

Promise\<void>


deleteDocument

deleteDocument(indexName, documentUrl): Promise\<void>

Parameters
  • indexName: string
  • documentUrl: string
Returns

Promise\<void>


deleteIndex

deleteIndex(indexName): Promise\<void>

Parameters
  • indexName: string
Returns

Promise\<void>


getIndex

getIndex(indexName): Promise\<undefined | IndexSummary>

Parameters
  • indexName: string
Returns

Promise\<undefined | IndexSummary>


getIngestion

getIngestion(indexName, ingestionName): Promise\<IngestionSummary>

Parameters
  • indexName: string
  • ingestionName: string
Returns

Promise\<IngestionSummary>


ingest

ingest(indexName, ingestionName, ingestion, backoff?): Promise\<void>

Parameters
  • indexName: string
  • ingestionName: string
  • ingestion: IngestionRequest
  • backoff: number[]
Returns

Promise\<void>


listDocuments

listDocuments(indexName): Promise\<DocumentSummary[]>

Parameters
  • indexName: string
Returns

Promise\<DocumentSummary[]>


listIndexes

listIndexes(): Promise\<IndexSummary[]>

Returns

Promise\<IndexSummary[]>
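
Below is a minimal sketch (not taken from the package documentation) of driving this client directly; normally ChatAboutVideo manages it for you. The 'chat-about-video/azure' import subpath, the index and ingestion names, the environment variable names, and the video URL are assumptions for illustration.

import { VideoRetrievalApiClient } from 'chat-about-video/azure'; // assumed subpath export

async function indexOneVideo() {
  // Endpoint and API key of an Azure Computer Vision resource (placeholder environment variable names).
  const client = new VideoRetrievalApiClient(
    process.env.AZURE_CV_API_ENDPOINT!,
    process.env.AZURE_CV_API_KEY!,
  );

  // Create the index only if it does not exist yet, with both vision and speech features.
  await client.createIndexIfNotExist('my-demo-index', {
    features: [{ name: 'vision' }, { name: 'speech' }],
  });

  // Ingest a single video by URL. ingest(...) accepts an optional backoff schedule,
  // which suggests it waits for the ingestion to complete.
  await client.ingest('my-demo-index', 'my-demo-ingestion', {
    videos: [{ mode: 'add', documentUrl: 'https://example.com/sample-video.mp4' }],
    includeSpeechTranscript: true,
  });

  console.log(await client.listDocuments('my-demo-index'));

  // Clean up when the index is no longer needed.
  await client.deleteIndex('my-demo-index');
}

indexOneVideo().catch((error) => console.error(error));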

Class: ChatAboutVideo

chat.ChatAboutVideo

Constructors

constructor

new ChatAboutVideo(options, log?)

Parameters
  • options: ChatAboutVideoConstructorOptions
  • log: LineLogger\<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>

Properties

  • Protected client: OpenAIClient
  • Protected log: LineLogger\<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>
  • Protected options: ChatAboutVideoOptions

Methods

prepareVideoFrames

Protected prepareVideoFrames(conversationId, videoFile): Promise\<PreparationResult>

Parameters
  • conversationId: string
  • videoFile: string
Returns

Promise\<PreparationResult>


prepareVideoRetrievalIndex

Protected prepareVideoRetrievalIndex(conversationId, videoFile): Promise\<PreparationResult>

Parameters
  • conversationId: string
  • videoFile: string
Returns

Promise\<PreparationResult>


startConversation

startConversation(videoFile): Promise\<Conversation>

Start a conversation about a video.

Parameters
  • videoFile: string - Path to a video file in local file system.
Returns

Promise\<Conversation>

The conversation.

Class: Conversation

chat.Conversation

Constructors

constructor

new Conversation(client, deploymentName, conversationId, messages, options?, cleanup?, log?)

Parameters
  • client: OpenAIClient
  • deploymentName: string
  • conversationId: string
  • messages: ChatRequestMessage[]
  • options?: GetChatCompletionsOptions
  • cleanup?: () => Promise\<void>
  • log: LineLogger\<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>

Properties

  • Protected Optional cleanup: () => Promise\<void>
  • Protected client: OpenAIClient
  • Protected conversationId: string
  • Protected deploymentName: string
  • Protected log: LineLogger\<(message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void, (message?: any, ...optionalParams: any[]) => void>
  • Protected messages: ChatRequestMessage[]
  • Protected Optional options: GetChatCompletionsOptions

Methods

end

end(): Promise\<void>

Returns

Promise\<void>


say

say(message, options?): Promise\<undefined | string>

Say something in the conversation, and get the response from AI

Parameters
  • message: string - The message to say in the conversation.
  • options?: GetChatCompletionsOptions - Options for fine control.
Returns

Promise\<undefined | string>

The response/completion

Interfaces

Interface: CreateIndexOptions

azure/video-retrieval-api-client.CreateIndexOptions

Properties

  • Optional features: IndexFeature[]
  • Optional metadataSchema: IndexMetadataSchema
  • Optional userData: object

Interface: DocumentSummary

azure/video-retrieval-api-client.DocumentSummary

Properties

  • createdDateTime: string
  • documentId: string
  • Optional documentUrl: string
  • lastModifiedDateTime: string
  • Optional metadata: object
  • Optional userData: object

Interface: IndexFeature

azure/video-retrieval-api-client.IndexFeature

Properties

  • Optional domain: "surveillance" | "generic"
  • Optional modelVersion: string
  • name: "vision" | "speech"

Interface: IndexMetadataSchema

azure/video-retrieval-api-client.IndexMetadataSchema

Properties

  • fields: IndexMetadataSchemaField[]
  • Optional language: string

Interface: IndexMetadataSchemaField

azure/video-retrieval-api-client.IndexMetadataSchemaField

Properties

  • filterable: boolean
  • name: string
  • searchable: boolean
  • type: "string" | "datetime"

Interface: IndexSummary

azure/video-retrieval-api-client.IndexSummary

Properties

  • createdDateTime: string
  • eTag: string
  • Optional features: IndexFeature[]
  • lastModifiedDateTime: string
  • name: string
  • Optional userData: object

Interface: IngestionRequest

azure/video-retrieval-api-client.IngestionRequest

Properties

  • Optional filterDefectedFrames: boolean
  • Optional generateInsightIntervals: boolean
  • Optional includeSpeechTranscript: boolean
  • Optional moderation: boolean
  • videos: VideoIngestion[]

Interface: IngestionStatusDetail

azure/video-retrieval-api-client.IngestionStatusDetail

Properties

  • documentId: string
  • documentUrl: string
  • lastUpdatedTime: string
  • succeeded: boolean

Interface: IngestionSummary

azure/video-retrieval-api-client.IngestionSummary

Properties

  • Optional batchName: string
  • createdDateTime: string
  • Optional fileStatusDetails: IngestionStatusDetail[]
  • lastModifiedDateTime: string
  • name: string
  • state: "NotStarted" | "Running" | "Completed" | "Failed" | "PartiallySucceeded"

Interface: VideoIngestion

azure/video-retrieval-api-client.VideoIngestion

Properties

  • Optional documentId: string
  • documentUrl: string
  • Optional metadata: object
  • mode: "update" | "remove" | "add"
  • Optional userData: object

Interface: ChatAboutVideoOptions

chat.ChatAboutVideoOptions

Option settings for ChatAboutVideo

Properties

  • Optional extractVideoFrames: Object
    • extractor: VideoFramesExtractor - Function for extracting frames from the video. If not specified, a default function using ffmpeg will be used.
    • height: undefined | number - Video frame height, default is undefined which means the scaling will be determined by the videoFrameWidth option. If both videoFrameWidth and videoFrameHeight are not specified, then the frames will not be resized/scaled.
    • interval: number - Intervals between frames to be extracted. The unit is second. Default value is 5.
    • limit: number - Maximum number of frames to be extracted. Default value is 10, which is the current per-request limitation of ChatGPT Vision.
    • width: undefined | number - Video frame width, default is 200. If both videoFrameWidth and videoFrameHeight are not specified, then the frames will not be resized/scaled.
  • fileBatchUploader: FileBatchUploader - Function for uploading files.
  • Optional initialPrompts: ChatRequestMessage[] - Initial prompts to be added to the chat history before frame images.
  • openAiDeploymentName: string - Name/ID of the deployment.
  • Optional startPrompts: ChatRequestMessage[] - Prompts to be added to the chat history right after frame images.
  • storageContainerName: string - Storage container for storing frame images of the video.
  • storagePathPrefix: string - Path prefix to be prepended for storing frame images of the video.
  • tmpDir: string - Temporary directory for storing temporary files. If not specified, the temporary directory of the OS will be used.
  • Optional videoRetrievalIndex: Object
    • apiKey: string
    • createIndexIfNotExists?: boolean
    • deleteDocumentWhenConversationEnds?: boolean
    • deleteIndexWhenConversationEnds?: boolean
    • endpoint: string
    • indexName?: string

Modules

Module: aws

Functions

createAwsS3FileBatchUploader

createAwsS3FileBatchUploader(s3Client, expirationSeconds, parallelism?): FileBatchUploader

Parameters
  • s3Client: S3Client
  • expirationSeconds: number
  • parallelism: number (default value: 3)
Returns

FileBatchUploader
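
As a sketch of wiring this into ChatAboutVideo (the fileBatchUploader option is documented above; the 'chat-about-video/aws' import subpath and the one-hour expiration are assumptions):

import { S3Client } from '@aws-sdk/client-s3';
import { createAwsS3FileBatchUploader } from 'chat-about-video/aws'; // assumed subpath export

// The S3 client picks up region and credentials from the environment;
// pre-signed download URLs expire after 1 hour (3600 seconds).
const s3Uploader = createAwsS3FileBatchUploader(new S3Client({}), 3600);

// The resulting FileBatchUploader can be passed to ChatAboutVideo as the fileBatchUploader option.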

Module: azure

References

CreateIndexOptions

Re-exports CreateIndexOptions


DocumentSummary

Re-exports DocumentSummary


IndexFeature

Re-exports IndexFeature


IndexMetadataSchema

Re-exports IndexMetadataSchema


IndexMetadataSchemaField

Re-exports IndexMetadataSchemaField


IndexSummary

Re-exports IndexSummary


IngestionRequest

Re-exports IngestionRequest


IngestionStatusDetail

Re-exports IngestionStatusDetail


IngestionSummary

Re-exports IngestionSummary


PaginatedWithNextLink

Re-exports PaginatedWithNextLink


VideoIngestion

Re-exports VideoIngestion


VideoRetrievalApiClient

Re-exports VideoRetrievalApiClient

Functions

createAzureBlobStorageFileBatchUploader

createAzureBlobStorageFileBatchUploader(blobServiceClient, expirationSeconds, parallelism?): FileBatchUploader

Parameters
  • blobServiceClient: BlobServiceClient
  • expirationSeconds: number
  • parallelism: number (default value: 3)
Returns

FileBatchUploader
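
A similar sketch for Azure Blob Storage (the 'chat-about-video/azure' import subpath, the environment variable name, and the one-hour expiration are assumptions):

import { BlobServiceClient } from '@azure/storage-blob';
import { createAzureBlobStorageFileBatchUploader } from 'chat-about-video/azure'; // assumed subpath export

// Build the client from a connection string; SAS download URLs expire after 1 hour,
// and up to 3 files are uploaded in parallel (the default).
const blobUploader = createAzureBlobStorageFileBatchUploader(
  BlobServiceClient.fromConnectionString(process.env.AZURE_STORAGE_CONNECTION_STRING!),
  3600,
);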

Module: azure/video-retrieval-api-client

Classes

Interfaces

Type Aliases

PaginatedWithNextLink

Ƭ PaginatedWithNextLink\<T>: Object

Type parameters
  • T

Type declaration

  • nextLink?: string
  • value: T[]

Module: chat

Classes

Interfaces

Type Aliases

ChatAboutVideoConstructorOptions

Ƭ ChatAboutVideoConstructorOptions: Partial\<Omit\<ChatAboutVideoOptions, "videoRetrievalIndex" | "extractVideoFrames">> & Required\<Pick\<ChatAboutVideoOptions, "openAiDeploymentName" | "storageContainerName">> & { extractVideoFrames?: Partial\<Exclude\<ChatAboutVideoOptions["extractVideoFrames"], undefined>> ; videoRetrievalIndex?: Partial\<ChatAboutVideoOptions["videoRetrievalIndex"]> & Pick\<Exclude\<ChatAboutVideoOptions["videoRetrievalIndex"], undefined>, "endpoint" | "apiKey"> } & { azureStorageConnectionString?: string ; downloadUrlExpirationSeconds?: number ; openAiApiKey: string ; openAiEndpoint?: string }

Module: client-hack

Functions

fixClient

fixClient(openAIClient): void

Parameters
  • openAIClient: any
Returns

void

Module: index

References

ChatAboutVideo

Re-exports ChatAboutVideo


ChatAboutVideoConstructorOptions

Re-exports ChatAboutVideoConstructorOptions


ChatAboutVideoOptions

Re-exports ChatAboutVideoOptions


Conversation

Re-exports Conversation


FileBatchUploader

Re-exports FileBatchUploader


VideoFramesExtractor

Re-exports VideoFramesExtractor


extractVideoFramesWithFfmpeg

Re-exports extractVideoFramesWithFfmpeg


lazyCreatedFileBatchUploader

Re-exports lazyCreatedFileBatchUploader


lazyCreatedVideoFramesExtractor

Re-exports lazyCreatedVideoFramesExtractor

Module: storage

References

FileBatchUploader

Re-exports FileBatchUploader

Functions

lazyCreatedFileBatchUploader

lazyCreatedFileBatchUploader(creator): FileBatchUploader

Parameters
  • creator: Promise\<FileBatchUploader>
Returns

FileBatchUploader

Module: storage/types

Type Aliases

FileBatchUploader

Ƭ FileBatchUploader: (dir: string, fileNames: string[], containerName: string, blobPathPrefix: string) => Promise\<string[]>

Type declaration

▸ (dir, fileNames, containerName, blobPathPrefix): Promise\<string[]>

Parameters

  • dir: string
  • fileNames: string[]
  • containerName: string
  • blobPathPrefix: string

Returns

Promise\<string[]>
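
Any function matching this signature can be supplied as the fileBatchUploader option of ChatAboutVideo. Purely as an illustrative sketch (assuming the returned strings are the download URLs of the uploaded files; nothing is actually uploaded here):

import type { FileBatchUploader } from 'chat-about-video';

// Hypothetical stub: pretends each file is already reachable under a fixed base URL
// instead of actually uploading it to a storage service.
const stubUploader: FileBatchUploader = async (dir, fileNames, containerName, blobPathPrefix) =>
  fileNames.map((fileName) => `https://files.example.com/${containerName}/${blobPathPrefix}${fileName}`);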

Module: video

References

VideoFramesExtractor

Re-exports VideoFramesExtractor


extractVideoFramesWithFfmpeg

Re-exports extractVideoFramesWithFfmpeg

Functions

lazyCreatedVideoFramesExtractor

lazyCreatedVideoFramesExtractor(creator): VideoFramesExtractor

Parameters
  • creator: Promise\<VideoFramesExtractor>
Returns

VideoFramesExtractor

Module: video/ffmpeg

Functions

extractVideoFramesWithFfmpeg

extractVideoFramesWithFfmpeg(inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise\<string[]>

Parameters
  • inputFile: string
  • outputDir: string
  • intervalSec: number
  • format?: string
  • width?: number
  • height?: number
  • startSec?: number
  • endSec?: number
Returns

Promise\<string[]>
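
For example, the following sketch (the input file path and the 'jpg' format value are assumptions) extracts one frame every 5 seconds, scaled to 200 pixels wide, into the OS temporary directory:

import os from 'node:os';
import { extractVideoFramesWithFfmpeg } from 'chat-about-video';

async function demoExtraction() {
  // One frame every 5 seconds, 'jpg' format, 200 pixels wide (height follows the aspect ratio).
  const frameFiles = await extractVideoFramesWithFfmpeg('./sample-video.mp4', os.tmpdir(), 5, 'jpg', 200);
  console.log(frameFiles); // names of the generated frame image files
}

demoExtraction().catch((error) => console.error(error));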

Module: video/types

Type Aliases

VideoFramesExtractor

Ƭ VideoFramesExtractor: (inputFile: string, outputDir: string, intervalSec: number, format?: string, width?: number, height?: number, startSec?: number, endSec?: number) => Promise\<string[]>

Type declaration

▸ (inputFile, outputDir, intervalSec, format?, width?, height?, startSec?, endSec?): Promise\<string[]>

Parameters

  • inputFile: string
  • outputDir: string
  • intervalSec: number
  • format?: string
  • width?: number
  • height?: number
  • startSec?: number
  • endSec?: number

Returns

Promise\<string[]>
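
A custom extractor for the extractVideoFrames.extractor option only needs to match this signature. As an illustrative sketch (a hypothetical wrapper, not part of the package), one that delegates to the built-in ffmpeg extractor while enforcing a minimum sampling interval of 2 seconds could be:

import { extractVideoFramesWithFfmpeg } from 'chat-about-video';
import type { VideoFramesExtractor } from 'chat-about-video';

// Hypothetical wrapper: never samples more frequently than once every 2 seconds.
const throttledExtractor: VideoFramesExtractor = (inputFile, outputDir, intervalSec, format, width, height, startSec, endSec) =>
  extractVideoFramesWithFfmpeg(inputFile, outputDir, Math.max(intervalSec, 2), format, width, height, startSec, endSec);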
