humanloop v0.6.19
Humanloop
Table of Contents
- Installation
- Streaming Support
- Getting Started
- Reference
humanloop.chat
humanloop.chatDeployed
humanloop.chatExperiment
humanloop.chatModelConfiguration
humanloop.complete
humanloop.completeDeployed
humanloop.completeExperiment
humanloop.completeModelConfiguration
humanloop.datapoints.delete
humanloop.datapoints.get
humanloop.datapoints.update
humanloop.datasets.create
humanloop.datasets.createDatapoint
humanloop.datasets.delete
humanloop.datasets.get
humanloop.datasets.list
humanloop.datasets.listAllForProject
humanloop.datasets.listDatapoints
humanloop.datasets.update
humanloop.evaluations.addEvaluators
humanloop.evaluations.create
humanloop.evaluations.get
humanloop.evaluations.list
humanloop.evaluations.listAllForProject
humanloop.evaluations.listDatapoints
humanloop.evaluations.log
humanloop.evaluations.result
humanloop.evaluations.updateStatus
humanloop.evaluators.create
humanloop.evaluators.delete
humanloop.evaluators.get
humanloop.evaluators.list
humanloop.evaluators.update
humanloop.experiments.create
humanloop.experiments.delete
humanloop.experiments.list
humanloop.experiments.sample
humanloop.experiments.update
humanloop.feedback
humanloop.logs.delete
humanloop.logs.get
humanloop.logs.list
humanloop.log
humanloop.logs.update
humanloop.logs.updateByRef
humanloop.modelConfigs.deserialize
humanloop.modelConfigs.export
humanloop.modelConfigs.get
humanloop.modelConfigs.register
humanloop.modelConfigs.serialize
humanloop.projects.create
humanloop.projects.createFeedbackType
humanloop.projects.deactivateConfig
humanloop.projects.deactivateExperiment
humanloop.projects.delete
humanloop.projects.deleteDeployedConfig
humanloop.projects.deployConfig
humanloop.projects.export
humanloop.projects.get
humanloop.projects.getActiveConfig
humanloop.projects.list
humanloop.projects.listConfigs
humanloop.projects.listDeployedConfigs
humanloop.projects.update
humanloop.projects.updateFeedbackTypes
humanloop.sessions.create
humanloop.sessions.get
humanloop.sessions.list
Installation
npm i humanloop
pnpm i humanloop
yarn add humanloop
Streaming Support
This SDK supports streaming. A short sketch follows; the repository also links to example usage in a NextJS application.
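As the stream parameter in the Reference notes, tokens are sent back as data-only server-sent events. A minimal sketch of consuming the stream against the REST endpoint directly (Node 18+ fetch; the auth header and chunk handling are assumptions, not the SDK's own streaming helper):
const res = await fetch("https://api.humanloop.com/v4/chat", {
method: "POST",
headers: { "Content-Type": "application/json", "X-API-KEY": "API_KEY" },
body: JSON.stringify({
project: "sdk-example",
messages: [{ role: "user", content: "Explain asynchronous programming." }],
model_config: { model: "gpt-3.5-turbo", max_tokens: -1, temperature: 0.7 },
stream: true,
}),
});
// Read raw SSE chunks ("data: {...}") as they arrive.
const reader = res.body.getReader();
const decoder = new TextDecoder();
for (;;) {
const { done, value } = await reader.read();
if (done) break;
process.stdout.write(decoder.decode(value));
}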
Getting Started
import { Humanloop } from "humanloop";
const humanloop = new Humanloop({
// Defining the base path is optional and defaults to https://api.humanloop.com/v4
// basePath: "https://api.humanloop.com/v4",
openaiApiKey: "openaiApiKey",
anthropicApiKey: "anthropicApiKey",
apiKey: "API_KEY",
});
const chatResponse = await humanloop.chat({
project: "sdk-example",
messages: [
{
role: "user",
content: "Explain asynchronous programming.",
},
],
model_config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
chat_template: [
{
role: "system",
content:
"You are a helpful assistant who replies in the style of {{persona}}.",
},
],
},
inputs: {
persona: "the pirate Blackbeard",
},
stream: false,
});
console.log(chatResponse);
const completeResponse = await humanloop.complete({
project: "sdk-example",
inputs: {
text: "Llamas that are well-socialized and trained to halter and lead after weaning and are very friendly and pleasant to be around. They are extremely curious and most will approach people easily. However, llamas that are bottle-fed or over-socialized and over-handled as youth will become extremely difficult to handle when mature, when they will begin to treat humans as they treat each other, which is characterized by bouts of spitting, kicking and neck wrestling.[33]",
},
model_config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
prompt_template:
"Summarize this for a second-grade student:\n\nText:\n{{text}}\n\nSummary:\n",
},
stream: false,
});
console.log(completeResponse);
const feedbackResponse = await humanloop.feedback({
type: "rating",
value: "good",
data_id: "data_[...]",
user: "user@example.com",
});
console.log(feedbackResponse);
const logResponse = await humanloop.log({
project: "sdk-example",
inputs: {
text: "Llamas that are well-socialized and trained to halter and lead after weaning and are very friendly and pleasant to be around. They are extremely curious and most will approach people easily. However, llamas that are bottle-fed or over-socialized and over-handled as youth will become extremely difficult to handle when mature, when they will begin to treat humans as they treat each other, which is characterized by bouts of spitting, kicking and neck wrestling.[33]",
},
output:
"Llamas can be friendly and curious if they are trained to be around people, but if they are treated too much like pets when they are young, they can become difficult to handle when they grow up. This means they might spit, kick, and wrestle with their necks.",
source: "sdk",
config: {
model: "gpt-3.5-turbo",
max_tokens: -1,
temperature: 0.7,
prompt_template:
"Summarize this for a second-grade student:\n\nText:\n{{text}}\n\nSummary:\n",
type: "model",
},
});
console.log(logResponse);
Reference
humanloop.chat
Get a chat response by providing details of the model configuration in the request.
🛠️ Usage
const createResponse = await humanloop.chat({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
content: "Hello",
},
],
model_config: {
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
},
});
⚙️ Parameters
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
model_config: ModelConfigChatRequest
The model configuration used to create a chat response.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request. (A session sketch follows this parameter list.)
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
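For example, session_reference_id lets separate calls share one session using an ID kept by your own systems; a hedged sketch (the project name and reference ID are illustrative):
const sessionRef = "my-session-0001"; // ID kept by your internal systems
await humanloop.chat({
project: "sdk-example",
session_reference_id: sessionRef,
messages: [{ role: "user", content: "First step of a pipeline." }],
model_config: { model: "gpt-3.5-turbo", max_tokens: -1, temperature: 0.7 },
});
// A later log joins the same session by reusing the reference ID.
await humanloop.log({
project: "sdk-example",
session_reference_id: sessionRef,
inputs: { text: "Second step input." },
output: "Second step output.",
source: "sdk",
});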
🔄 Return
🌐 Endpoint
/chat
POST
humanloop.chatDeployed
Get a chat response using the project's active deployment.
The active deployment can be a specific model configuration or an experiment.
🛠️ Usage
const createDeployedResponse = await humanloop.chatDeployed({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
content: "Hello",
},
],
});
⚙️ Parameters
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
environment: string
The environment name used to create a chat response. If not specified, the default environment will be used.
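For example, to target a specific deployment environment rather than the default one (the environment name is illustrative):
const stagingResponse = await humanloop.chatDeployed({
project: "sdk-example",
environment: "staging", // illustrative environment name
messages: [{ role: "user", content: "Hello from staging." }],
});
console.log(stagingResponse);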
🔄 Return
🌐 Endpoint
/chat-deployed
POST
humanloop.chatExperiment
Get a chat response for a specific experiment.
🛠️ Usage
const createExperimentResponse = await humanloop.chatExperiment({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
content: "Hello",
},
],
experiment_id: "experiment_id_example",
});
⚙️ Parameters
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
experiment_id: string
If an experiment ID is provided, a model configuration will be sampled from the experiment's active model configurations.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of chat responses, where each chat response will use a model configuration sampled from the experiment.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
🔄 Return
🌐 Endpoint
/chat-experiment
POST
humanloop.chatModelConfiguration
Get chat response for a specific model configuration.
🛠️ Usage
const createModelConfigResponse = await humanloop.chatModelConfiguration({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
messages: [
{
role: "user",
content: "Hello",
},
],
model_config_id: "model_config_id_example",
});
⚙️ Parameters
messages: ChatMessageWithToolCall[]
The messages passed to the provider chat endpoint.
model_config_id: string
Identifies the model configuration used to create a chat response.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
tool_choice: ToolChoiceProperty
tool_call: ToolCallProperty
response_format: ResponseFormat
The format of the response. Only type json_object is currently supported for chat.
🔄 Return
🌐 Endpoint
/chat-model-config
POST
humanloop.complete
Create a completion by providing details of the model configuration in the request.
🛠️ Usage
const createResponse = await humanloop.complete({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
model_config: {
provider: "openai",
model: "model_example",
max_tokens: -1,
temperature: 1,
top_p: 1,
presence_penalty: 0,
frequency_penalty: 0,
endpoint: "complete",
prompt_template: "{{question}}",
},
});
⚙️ Parameters
model_config: ModelConfigCompletionRequest
The model configuration used to generate.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
logprobs: number
Include the log probabilities of the top n tokens in the provider_response.
suffix: string
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
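As a sketch of an insert-style completion, the prompt template supplies the text before the gap and suffix the text after it (the model choice and strings are illustrative; OpenAI supports suffix on gpt-3.5-turbo-instruct):
const insertResponse = await humanloop.complete({
project: "sdk-example",
model_config: {
model: "gpt-3.5-turbo-instruct", // illustrative; must be a model that supports suffix
max_tokens: 64,
temperature: 0.7,
prompt_template: "{{before}}",
},
inputs: { before: "Asynchronous programming lets a single thread " },
suffix: " which is why it suits I/O-bound servers.",
stream: false,
});
console.log(insertResponse);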
🔄 Return
🌐 Endpoint
/completion
POST
humanloop.completeDeployed
Create a completion using the project's active deployment.
The active deployment can be a specific model configuration or an experiment.
🛠️ Usage
const createDeployedResponse = await humanloop.completeDeployed({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
});
⚙️ Parameters
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
logprobs: number
Include the log probabilities of the top n tokens in the provider_response.
suffix: string
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
environment: string
The environment name used to create the completion. If not specified, the default environment will be used.
🔄 Return
🌐 Endpoint
/completion-deployed
POST
humanloop.completeExperiment
Create a completion for a specific experiment.
🛠️ Usage
const createExperimentResponse = await humanloop.completeExperiment({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
experiment_id: "experiment_id_example",
});
⚙️ Parameters
experiment_id: string
If an experiment ID is provided, a model configuration will be sampled from the experiment's active model configurations.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of completions, where each completion will use a model configuration sampled from the experiment.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
logprobs: number
Include the log probabilities of the top n tokens in the provider_response.
suffix: string
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
🔄 Return
🌐 Endpoint
/completion-experiment
POST
humanloop.completeModelConfiguration
Create a completion for a specific model configuration.
🛠️ Usage
const createModelConfigResponse = await humanloop.completeModelConfiguration({
save: true,
num_samples: 1,
stream: false,
return_inputs: true,
model_config_id: "model_config_id_example",
});
⚙️ Parameters
model_config_id: string
Identifies the model configuration used to create the completion.
project: string
Unique project name. If no project exists with this name, a new project will be created.
project_id: string
Unique ID of a project to associate to the log. Either this or project must be provided.
session_id: string
ID of the session to associate the datapoint with.
session_reference_id: string
A unique string identifying the session to associate the datapoint with. Allows you to log multiple datapoints to a session (using an ID kept by your internal systems) by passing the same session_reference_id in subsequent log requests. Specify at most one of this or session_id.
parent_id: string
ID associated with the parent datapoint in a session.
parent_reference_id: string
A unique string identifying the previously-logged parent datapoint in a session. Allows you to log nested datapoints with your internal system IDs by passing the same reference ID as parent_id in a prior log request. Specify at most one of this or parent_id. Note that this cannot refer to a datapoint being logged in the same request.
inputs: object
The inputs passed to the prompt template.
source: string
Identifies where the model was called from.
metadata: object
Any additional metadata to record.
save: boolean
Whether the request/response payloads will be stored on Humanloop.
source_datapoint_id: string
ID of the source datapoint if this is a log derived from a datapoint in a dataset.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization.
num_samples: number
The number of generations.
stream: boolean
If true, tokens will be sent as data-only server-sent events. If num_samples > 1, samples are streamed back independently.
user: string
End-user ID passed through to provider call.
seed: number
Deprecated field: the seed is instead set as part of the request.config object.
return_inputs: boolean
Whether to return the inputs in the response. If false, the response will contain an empty dictionary under inputs. This is useful for reducing the size of the response. Defaults to true.
logprobs: number
Include the log probabilities of the top n tokens in the provider_response.
suffix: string
The suffix that comes after a completion of inserted text. Useful for completions that act like inserts.
🔄 Return
🌐 Endpoint
/completion-model-config
POST
humanloop.datapoints.delete
Delete a list of datapoints by their IDs.
WARNING: This endpoint has been decommissioned and no longer works. Please use the v5 datasets API instead.
🛠️ Usage
const deleteResponse = await humanloop.datapoints.delete();
🌐 Endpoint
/datapoints
DELETE
humanloop.datapoints.get
Get a datapoint by ID.
🛠️ Usage
const getResponse = await humanloop.datapoints.get({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of datapoint.
🔄 Return
🌐 Endpoint
/datapoints/{id}
GET
humanloop.datapoints.update
Edit the input, messages and criteria fields of a datapoint.
WARNING: This endpoint has been decommissioned and no longer works. Please use the v5 datasets API instead.
🛠️ Usage
const updateResponse = await humanloop.datapoints.update({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of datapoint.
🔄 Return
🌐 Endpoint
/datapoints/{id}
PATCH
humanloop.datasets.create
Create a new dataset for a project.
🛠️ Usage
const createResponse = await humanloop.datasets.create({
projectId: "projectId_example",
description: "description_example",
name: "name_example",
});
⚙️ Parameters
description: string
The description of the dataset.
name: string
The name of the dataset.
projectId: string
🔄 Return
🌐 Endpoint
/projects/{project_id}/datasets
POST
humanloop.datasets.createDatapoint
Create a new datapoint for a dataset.
Here in the v4 API, this has the following behaviour:
- Retrieve the current latest version of the dataset.
- Construct a new version of the dataset with the new testcases added.
- Store that latest version as a committed version with an autogenerated commit message and return the new datapoints.
🛠️ Usage
const createDatapointResponse = await humanloop.datasets.createDatapoint({
datasetId: "dataset_id_example",
requestBody: {
log_ids: ["log_ids_example"],
},
});
⚙️ Parameters
datasetId: string
String ID of dataset. Starts with evts_.
requestBody: DatasetsCreateDatapointRequest
🔄 Return
🌐 Endpoint
/datasets/{dataset_id}/datapoints
POST
humanloop.datasets.delete
Delete a dataset by ID.
🛠️ Usage
const deleteResponse = await humanloop.datasets.delete({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of dataset. Starts with evts_.
🌐 Endpoint
/datasets/{id}
DELETE
humanloop.datasets.get
Get a single dataset by ID.
🛠️ Usage
const getResponse = await humanloop.datasets.get({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of dataset. Starts with evts_.
🔄 Return
🌐 Endpoint
/datasets/{id}
GET
humanloop.datasets.list
Get all Datasets for an organization.
🛠️ Usage
const listResponse = await humanloop.datasets.list();
🔄 Return
🌐 Endpoint
/datasets
GET
humanloop.datasets.listAllForProject
Get all datasets for a project.
🛠️ Usage
const listAllForProjectResponse = await humanloop.datasets.listAllForProject({
projectId: "projectId_example",
});
⚙️ Parameters
projectId: string
🔄 Return
🌐 Endpoint
/projects/{project_id}/datasets
GET
humanloop.datasets.listDatapoints
Get datapoints for a dataset.
🛠️ Usage
const listDatapointsResponse = await humanloop.datasets.listDatapoints({
datasetId: "datasetId_example",
page: 0,
size: 50,
});
⚙️ Parameters
datasetId: string
String ID of dataset. Starts with evts_.
page: number
size: number
🔄 Return
PaginatedDataDatapointResponse
🌐 Endpoint
/datasets/{dataset_id}/datapoints
GET
humanloop.datasets.update
Update a dataset by ID.
🛠️ Usage
const updateResponse = await humanloop.datasets.update({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of dataset. Starts with evts_.
description: string
The description of the dataset.
name: string
The name of the dataset.
🔄 Return
🌐 Endpoint
/datasets/{id}
PATCH
humanloop.evaluations.addEvaluators
Add evaluators to an existing evaluation run.
🛠️ Usage
const addEvaluatorsResponse = await humanloop.evaluations.addEvaluators({
id: "id_example",
evaluator_ids: ["evaluator_ids_example"],
});
⚙️ Parameters
evaluator_ids: string[]
IDs of evaluators to add to the evaluation run. IDs start with evfn_.
id: string
String ID of evaluation run. Starts with ev_.
🔄 Return
🌐 Endpoint
/evaluations/{id}/evaluators
PATCH
humanloop.evaluations.create
Create an evaluation.
🛠️ Usage
const createResponse = await humanloop.evaluations.create({
projectId: "projectId_example",
config_id: "config_id_example",
evaluator_ids: ["evaluator_ids_example"],
dataset_id: "dataset_id_example",
max_concurrency: 5,
hl_generated: true,
});
⚙️ Parameters
config_id: string
ID of the config to evaluate. Starts with config_.
evaluator_ids: string[]
IDs of evaluators to run on the dataset. IDs start with evfn_.
dataset_id: string
ID of the dataset to use in this evaluation. Starts with evts_.
projectId: string
String ID of project. Starts with pr_.
provider_api_keys: ProviderApiKeys
API keys required by each provider to make API calls. The API keys provided here are not stored by Humanloop. If not specified here, Humanloop will fall back to the key saved to your organization. Ensure you provide an API key for the provider for the model config you are evaluating, or have one saved to your organization.
max_concurrency: number
The maximum number of concurrent generations to run. A higher value will result in faster completion of the evaluation but may place higher load on your provider rate-limits.
hl_generated: boolean
Whether the log generations for this evaluation should be performed by Humanloop. If False, the log generations should be submitted by the user via the API.
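When hl_generated is false, your own system generates and submits the logs. A hedged sketch of that flow (all IDs are illustrative; see evaluations.log, evaluations.result and evaluations.updateStatus below):
// 1. Create the run, telling Humanloop not to generate logs itself.
await humanloop.evaluations.create({
projectId: "pr_example",
config_id: "config_example",
evaluator_ids: ["evfn_example"],
dataset_id: "evts_example",
hl_generated: false,
});
// 2. For each datapoint, submit the generation your system produced.
await humanloop.evaluations.log({
evaluationId: "evrun_example", // run ID returned by the create call
datapoint_id: "datapoint_id_example",
log: { output: "externally generated output", save: true },
});
// 3. Mark the run completed once every datapoint has been logged.
await humanloop.evaluations.updateStatus({
id: "evrun_example",
status: "completed",
});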
🔄 Return
🌐 Endpoint
/projects/{project_id}/evaluations
POST
humanloop.evaluations.get
Get evaluation by ID.
🛠️ Usage
const getResponse = await humanloop.evaluations.get({
id: "id_example",
});
⚙️ Parameters
id: string
String ID of evaluation run. Starts with ev_.
evaluatorAggregates: boolean
Whether to include evaluator aggregates in the response.
🔄 Return
🌐 Endpoint
/evaluations/{id}
GET
humanloop.evaluations.list
Get the evaluations associated with a project.
Sorting and filtering are supported through query params for categorical columns and the created_at timestamp.
Sorting is supported for the dataset, config, status and evaluator-{evaluator_id} columns. Specify sorting with the sort query param, with values {column}.{ordering}. E.g. ?sort=dataset.asc&sort=status.desc will yield a multi-column sort: first by dataset, then by status.
Filtering is supported for the id, dataset, config and status columns. Specify filtering with the id_filter, dataset_filter, config_filter and status_filter query params. E.g. ?dataset_filter=my_dataset&dataset_filter=my_other_dataset&status_filter=running will only show rows where the dataset is "my_dataset" or "my_other_dataset", and where the status is "running".
An additional date range filter is supported for the created_at column. Use the start_date and end_date query parameters to configure this.
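The sort and filter options above are query-string level; a hedged sketch of applying them against the REST endpoint directly (the auth header is an assumption; the SDK Usage below covers the standard parameters):
const params = new URLSearchParams({ project_id: "pr_example" });
params.append("sort", "dataset.asc");
params.append("sort", "status.desc");
params.append("status_filter", "running");
const res = await fetch(`https://api.humanloop.com/v4/evaluations?${params}`, {
headers: { "X-API-KEY": "API_KEY" },
});
console.log(await res.json());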
🛠️ Usage
const listResponse = await humanloop.evaluations.list({
projectId: "projectId_example",
size: 50,
page: 0,
});
⚙️ Parameters
projectId: string
String ID of project. Starts with pr_.
id: string[]
A list of evaluation run IDs to filter on. IDs start with ev_.
startDate: string | Date
Only return evaluations created after this date.
endDate: string | Date
Only return evaluations created before this date.
size: number
page: number
🔄 Return
PaginatedDataEvaluationResponse
🌐 Endpoint
/evaluations
GET
humanloop.evaluations.listAllForProject
Get all the evaluations associated with your project.
Deprecated: This is a legacy unpaginated endpoint. Use /evaluations instead, with appropriate sorting, filtering and pagination options.
🛠️ Usage
const listAllForProjectResponse = await humanloop.evaluations.listAllForProject(
{
projectId: "projectId_example",
}
);
⚙️ Parameters
projectId: string
String ID of project. Starts with pr_.
evaluatorAggregates: boolean
Whether to include evaluator aggregates in the response.
🔄 Return
🌐 Endpoint
/projects/{project_id}/evaluations
GET
humanloop.evaluations.listDatapoints
Get testcases by evaluation ID.
🛠️ Usage
const listDatapointsResponse = await humanloop.evaluations.listDatapoints({
id: "id_example",
page: 1,
size: 10,
});
⚙️ Parameters
id: string
String ID of evaluation. Starts with ev_.
page: number
Page to fetch. Starts from 1.
size: number
Number of evaluation results to retrieve.
🔄 Return
PaginatedDataEvaluationDatapointSnapshotResponse
🌐 Endpoint
/evaluations/{id}/datapoints
GET
humanloop.evaluations.log
Log an external generation to an evaluation run for a datapoint.
The run must have status 'running'.
🛠️ Usage
const logResponse = await humanloop.evaluations.log({
evaluationId: "evaluationId_example",
datapoint_id: "datapoint_id_example",
log: {
save: true,
},
});
⚙️ Parameters
datapoint_id: string
The datapoint for which a log was generated. Must be one of the datapoints in the dataset being evaluated.
log: LogRequest
The log generated for the datapoint.
evaluationId: string
ID of the evaluation run. Starts with evrun_.
🔄 Return
🌐 Endpoint
/evaluations/{evaluation_id}/log
POST
humanloop.evaluations.result
Log an evaluation result to an evaluation run.
The run must have status 'running'. One of result or error must be provided.
🛠️ Usage
const resultResponse = await humanloop.evaluations.result({
evaluationId: "evaluationId_example",
log_id: "log_id_example",
evaluator_id: "evaluator_id_example",
});
⚙️ Parameters
log_id: string
The log that was evaluated. Must have as its source_datapoint_id one of the datapoints in the dataset being evaluated.
evaluator_id: string
ID of the evaluator that evaluated the log. Starts with evfn_. Must be one of the evaluator IDs associated with the evaluation run being logged to.
evaluationId: string
ID of the evaluation run. Starts with evrun_.
result: ValueProperty
error: string
An error that occurred during evaluation.
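For example, posting a boolean result for one log, or an error when the evaluator failed (the IDs and the boolean value are illustrative):
await humanloop.evaluations.result({
evaluationId: "evrun_example",
log_id: "log_id_example",
evaluator_id: "evfn_example",
result: true,
});
// Or, if evaluation failed, report the error instead of a result.
await humanloop.evaluations.result({
evaluationId: "evrun_example",
log_id: "log_id_example",
evaluator_id: "evfn_example",
error: "evaluator timed out",
});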
🔄 Return
🌐 Endpoint
/evaluations/{evaluation_id}/result
POST
humanloop.evaluations.updateStatus
Update the status of an evaluation run.
Can only be used to update the status of an evaluation run that uses external or human evaluators. The evaluation must currently have status 'running' if switching to 'completed', or status 'completed' if switching back to 'running'.
🛠️ Usage
const updateStatusResponse = await humanloop.evaluations.updateStatus({
id: "id_example",
status: "pending",
});
⚙️ Parameters
status: EvaluationStatus
The new status of the evaluation.
id: string
String ID of evaluation run. Starts with ev_.
🔄 Return
🌐 Endpoint
/evaluations/{id}/status
PATCH
humanloop.evaluators.create
Create an evaluator within your organization.
🛠️ Usage
const createResponse = await humanloop.evaluators.create({
description: "description_example",
name: "name_example",
arguments_type: "target_free",
return_type: "boolean",
type: "python",
});
⚙️ Parameters
description: string
The description of the evaluator.
name: string
The name of the evaluator.
arguments_type: EvaluatorArgumentsType
Whether this evaluator is target-free or target-required.
return_type: EvaluatorReturnTypeEnum
The type of the return value of the evaluator.
type: EvaluatorType
The type of the evaluator.
code: string
The code for the evaluator. This code will be executed in a sandboxed environment.
model_config: ModelConfigCompletionRequest
The model configuration used to generate.
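For instance, creating a target-free Python code evaluator that returns a boolean (the evaluator body and its signature are illustrative; consult Humanloop's evaluator docs for the exact contract):
const lengthCheck = await humanloop.evaluators.create({
name: "output-length-check",
description: "Flags outputs longer than 500 characters.",
arguments_type: "target_free",
return_type: "boolean",
type: "python",
// Illustrative sandboxed evaluator body.
code: 'def evaluate(log):\n    return len(log["output"]) <= 500',
});
console.log(lengthCheck);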
🔄 Return
🌐 Endpoint
/evaluators
POST
humanloop.evaluators.delete
Delete an evaluator within your organization.
🛠️ Usage
const deleteResponse = await humanloop.evaluators.delete({
id: "id_example",
});
⚙️ Parameters
id: string
🌐 Endpoint
/evaluators/{id}
DELETE
humanloop.evaluators.get
Get an evaluator within your organization.
🛠️ Usage
const getResponse = await humanloop.evaluators.get({
id: "id_example",
});
⚙️ Parameters
id: string
🔄 Return
🌐 Endpoint
/evaluators/{id}
GET
humanloop.evaluators.list
Get all evaluators within your organization.
🛠️ Usage
const listResponse = await humanloop.evaluators.list();
🔄 Return
🌐 Endpoint
/evaluators
GET
humanloop.evaluators.update
Update an evaluator within your organization.
🛠️ Usage
const updateResponse = await humanloop.evaluators.update({
id: "id_example",
arguments_type: "target_free",
return_type: "boolean",
});
⚙️ Parameters
id: string
description: string
The description of the evaluator.
arguments_type: EvaluatorArgumentsType
Whether this evaluator is target-free or target-required.
return_type: EvaluatorReturnTypeEnum
The type of the return value of the evaluator.
🔄 Return
🌐 Endpoint
/evaluators/{id}
PATCH