models.models

class promptbench.models.models.BLIP2Model(model_name, max_new_tokens, temperature, device, dtype)

Bases: VLMBaseModel

Vision Language model class for the BLIP2 model.

Inherits from VLMBaseModel and sets up the BLIP2 vision language model for use.

Parameters:

model_name: str: The name of the BLIP2 model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

Parameters of predict method:

input_images: list of PIL.Image: The input images.
input_text: str: The input text.

class promptbench.models.models.BaichuanModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the Baichuan model.

Inherits from LMMBaseModel and sets up the Baichuan language model for use.

Parameters:

model_name: str: The name of the Baichuan model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).

Methods:

predict(input_text, **kwargs): Generates a prediction based on the input text.

class promptbench.models.models.GeminiModel(model, max_new_tokens, temperature=0, gemini_key=None)

Bases: LMMBaseModel

Language model class for interfacing with Google’s Gemini models.

Inherits from LMMBaseModel and sets up a model interface for Gemini models.

Parameters:

model: str: The name of the Gemini model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
gemini_key: str, optional: The Gemini API key (default is None).

predict(input_text, **kwargs)

class promptbench.models.models.GeminiVisionModel(model, max_new_tokens, temperature, gemini_key=None)

Bases: VLMBaseModel

Vision Language model class for interfacing with Google’s Gemini models.

Inherits from VLMBaseModel and sets up a model interface for Gemini models.

Parameters:

model: str: The name of the Gemini model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
gemini_key: str, optional: The Gemini API key (default is None).

Parameters of predict method:

input_images: list of PIL.Image: The input images.
input_text: str: The input text.

predict(input_images, input_text, **kwargs)

class promptbench.models.models.InternLMVisionModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: VLMBaseModel

Vision Language model class for interfacing with InternLM’s vision language models.

Inherits from VLMBaseModel and sets up a model interface for InternLM’s vision language models.

Parameters:

model_name: str: The name of the InternLM model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

Parameters of predict method:

input_images: list of str: The URLs or local paths of the input images.
input_text: str: The input text.

predict(input_images, input_text, **kwargs)

class promptbench.models.models.LLaVAModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: VLMBaseModel

Vision Language model class for the LLaVA model.

Inherits from VLMBaseModel and sets up the LLaVA vision language model for use.

Parameters:

model_name: str: The name of the LLaVA model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

Parameters of predict method:

input_images: list of PIL.Image: The input images.
input_text: str: The input text. Use <image> as the placeholder for each image.
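The `<image>` placeholder convention can be illustrated with a small prompt-building helper. This is only a sketch: `build_llava_prompt` is a hypothetical name, not part of promptbench, and the assumption that the text should contain one placeholder per image is ours rather than something this page states.

```python
IMAGE_PLACEHOLDER = "<image>"


def build_llava_prompt(input_text: str, num_images: int) -> str:
    """Return input_text with at least one <image> placeholder per image.

    Hypothetical helper for illustration; not part of promptbench.
    """
    if num_images < 0:
        raise ValueError("num_images must be non-negative")
    # Count the placeholders the caller already wrote into the text.
    missing = num_images - input_text.count(IMAGE_PLACEHOLDER)
    if missing <= 0:
        return input_text
    # Prepend the missing placeholders, one per line, before the text.
    prefix = "\n".join([IMAGE_PLACEHOLDER] * missing)
    return f"{prefix}\n{input_text}"


print(build_llava_prompt("What is shown in the picture?", 1))
```

Text that already contains enough placeholders is returned unchanged, so the helper is safe to apply to prompts written by hand.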

class promptbench.models.models.LMMBaseModel(model_name, max_new_tokens, temperature, device='auto')

Bases: ABC

Abstract base class for language model interfaces.

This class provides a common interface for various language models and includes methods for prediction.

Parameters:

model_name: str: The name of the language model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).

Methods:

predict(input_text, **kwargs): Generates a prediction based on the input text.
__call__(input_text, **kwargs): Shortcut for the predict method.

predict(input_text, **kwargs)
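The relationship between `predict` and `__call__` described above can be sketched with a stand-in base class. Everything here is illustrative: `SketchLMBase` and `EchoModel` are hypothetical names, the stand-in is not the promptbench class, and declaring `predict` as abstract is an assumption.

```python
from abc import ABC, abstractmethod


class SketchLMBase(ABC):
    """Stand-in mirroring the documented interface; not the promptbench class."""

    def __init__(self, model_name, max_new_tokens, temperature, device="auto"):
        self.model_name = model_name
        self.max_new_tokens = max_new_tokens
        self.temperature = temperature
        self.device = device

    @abstractmethod
    def predict(self, input_text, **kwargs):
        """Return generated text for input_text."""

    def __call__(self, input_text, **kwargs):
        # Shortcut: calling the object forwards to predict().
        return self.predict(input_text, **kwargs)


class EchoModel(SketchLMBase):
    """Toy subclass: echoes at most max_new_tokens whitespace-separated words."""

    def predict(self, input_text, **kwargs):
        words = input_text.split()
        return " ".join(words[: self.max_new_tokens])


model = EchoModel("echo", max_new_tokens=3, temperature=0)
print(model("one two three four"))  # prints "one two three"
```

Because `__call__` only forwards its arguments, `model(text)` and `model.predict(text)` return the same result in this sketch.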

class promptbench.models.models.LlamaModel(model_name, max_new_tokens, temperature, device, dtype, system_prompt, model_dir)

Bases: LMMBaseModel

Language model class for the Llama model.

Inherits from LMMBaseModel and sets up the Llama language model for use.

Parameters:

model_name: str: The name of the Llama model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).
system_prompt: str: The system prompt to be used (default is None).
model_dir: str: The directory containing the model files (default is None). If not provided, the model is downloaded from the HuggingFace model hub.

predict(input_text, **kwargs)
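The `model_dir` fallback described above amounts to choosing between a local directory and a hub identifier. The sketch below shows one way to express that choice. `resolve_model_source` is a hypothetical helper, not part of promptbench, and how LlamaModel resolves the source internally is an assumption.

```python
import os
from typing import Optional


def resolve_model_source(model_name: str, model_dir: Optional[str] = None) -> str:
    """Pick where to load weights from.

    Hypothetical helper for illustration; not part of promptbench.
    """
    if model_dir is not None:
        # A local directory was given: require it to exist and prefer it.
        if not os.path.isdir(model_dir):
            raise FileNotFoundError(f"model_dir does not exist: {model_dir}")
        return model_dir
    # No directory given: fall back to the model name, to be fetched from the hub.
    return model_name


print(resolve_model_source("org/model-name"))
```

Returning a single string fits loaders such as the Hugging Face Transformers `from_pretrained` methods, which accept either a local directory or a hub identifier.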

class promptbench.models.models.MistralModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the Mistral model.

Inherits from LMMBaseModel and sets up the Mistral language model for use.

Parameters:

model_name: str: The name of the Mistral model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

class promptbench.models.models.MixtralModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the Mixtral model.

Inherits from LMMBaseModel and sets up the Mixtral language model for use.

Parameters:

model_name: str: The name of the Mixtral model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

class promptbench.models.models.OpenAIModel(model_name, max_new_tokens, temperature, system_prompt, openai_key)

Bases: LMMBaseModel

Language model class for interfacing with OpenAI’s GPT models.

Inherits from LMMBaseModel and sets up a model interface for OpenAI GPT models.

Parameters:

model_name: str: The name of the OpenAI model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
system_prompt: str: The system prompt to be used (default is None).
openai_key: str: The OpenAI API key (default is None).

Methods:

predict(input_text, **kwargs): Predicts the output based on the given input text using the OpenAI model.

predict(input_text, **kwargs)

class promptbench.models.models.OpenAIVisionModel(model_name, max_new_tokens, temperature, system_prompt, openai_key)

Bases: VLMBaseModel

Vision Language model class for interfacing with OpenAI’s GPT models.

Inherits from VLMBaseModel and sets up a model interface for OpenAI GPT models.

Parameters:

model_name: str: The name of the OpenAI model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
system_prompt: str: The system prompt to be used (default is None).
openai_key: str: The OpenAI API key (default is None).

Parameters of predict method:

input_images: list of str: The URLs or local paths of the input images.
input_text: str: The input text.

predict(input_images, input_text, **kwargs)

class promptbench.models.models.PaLMModel(model, max_new_tokens, temperature=0, api_key=None)

Bases: LMMBaseModel

Language model class for interfacing with PaLM models.

Inherits from LMMBaseModel and sets up a model interface for PaLM models.

Parameters:

model: str: The name of the PaLM model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
api_key: str, optional: The PaLM API key (default is None).

predict(input_text, **kwargs)

class promptbench.models.models.PhiModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the Phi model.

Inherits from LMMBaseModel and sets up the Phi language model for use.

Parameters:

model_name: str: The name of the Phi model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

predict(input_text, **kwargs)

class promptbench.models.models.QwenVLModel(model_name, max_new_tokens, temperature, device, dtype, system_prompt, api_key)

Bases: VLMBaseModel

Vision Language model class for the Qwen model.

Inherits from VLMBaseModel and sets up the Qwen vision language model for use.

Parameters:

model_name: str: The name of the Qwen model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).
system_prompt: str: The system prompt to be used (default is None).
api_key: str: The API key for the Qwen model (default is None).

Parameters of predict method:

input_images: list of str: The URLs or local paths of the input images. Add a “file://” prefix to local paths when using ‘qwen-vl-plus’ or ‘qwen-vl-max’.
input_text: str: The input text.

predict(input_images, input_text, **kwargs)
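The “file://” rule for local paths can be applied before calling `predict`. The sketch below is one way to do that. `to_qwen_image_ref` is a hypothetical helper, not part of promptbench, and treating anything that is not an http(s) URL as a local path is an assumption.

```python
from typing import List


def to_qwen_image_ref(path_or_url: str) -> str:
    """Add the file:// prefix to local paths; leave URLs untouched.

    Hypothetical helper for illustration; not part of promptbench.
    """
    # Remote URLs and already-prefixed paths are passed through as-is.
    if path_or_url.startswith(("http://", "https://", "file://")):
        return path_or_url
    return "file://" + path_or_url


def prepare_qwen_images(input_images: List[str]) -> List[str]:
    """Normalize a whole list of image references."""
    return [to_qwen_image_ref(item) for item in input_images]


print(prepare_qwen_images(["/data/cat.png", "https://example.com/dog.png"]))
```

The function is idempotent, so a path that already carries the prefix is not prefixed twice.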

class promptbench.models.models.T5Model(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the T5 model.

Inherits from LMMBaseModel and sets up the T5 language model for use.

Parameters:

model_name: str: The name of the T5 model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

class promptbench.models.models.UL2Model(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the UL2 model.

Inherits from LMMBaseModel and sets up the UL2 language model for use.

Parameters:

model_name: str: The name of the UL2 model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).

class promptbench.models.models.VLMBaseModel(model_name, max_new_tokens, temperature, device='auto')

Bases: ABC

Abstract base class for vision language model interfaces.

This class provides a common interface for various vision language models and includes methods for prediction.

Parameters:

model_name: str: The name of the vision language model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).

Methods:

predict(input_images, input_text, **kwargs): Generates a prediction based on the input images and text.
__call__(input_images, input_text, **kwargs): Shortcut for the predict method.

predict(input_images, input_text, **kwargs)
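The vision interface differs from the text-only one mainly in that `predict` takes a list of images before the text. The stand-in below sketches that calling pattern. `SketchVLMBase` and `ImageCountStub` are hypothetical names, neither is a promptbench class, and the list check is an assumption added for illustration.

```python
from abc import ABC, abstractmethod
from typing import List


class SketchVLMBase(ABC):
    """Stand-in mirroring the documented interface; not the promptbench class."""

    def __init__(self, model_name, max_new_tokens, temperature, device="auto"):
        self.model_name = model_name
        self.max_new_tokens = max_new_tokens
        self.temperature = temperature
        self.device = device

    @abstractmethod
    def predict(self, input_images, input_text, **kwargs):
        """Return generated text for the given images and text."""

    def __call__(self, input_images, input_text, **kwargs):
        # Shortcut: calling the object forwards to predict().
        return self.predict(input_images, input_text, **kwargs)


class ImageCountStub(SketchVLMBase):
    """Toy subclass: reports how many images it received instead of running a model."""

    def predict(self, input_images: List[str], input_text: str, **kwargs) -> str:
        if not isinstance(input_images, list):
            raise TypeError("input_images must be a list, even for a single image")
        return f"{len(input_images)} image(s): {input_text}"


stub = ImageCountStub("stub", max_new_tokens=16, temperature=0)
print(stub(["cat.png", "dog.png"], "Compare the two animals."))
```

A single image is still passed as a one-element list in this sketch, matching the “list of” types given for `input_images` on this page.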

class promptbench.models.models.VicunaModel(model_name, max_new_tokens, temperature, device, dtype, model_dir)

Bases: LMMBaseModel

Language model class for the Vicuna model.

Inherits from LMMBaseModel and sets up the Vicuna language model for use.

Parameters:

model_name: str: The name of the Vicuna model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float, optional: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).
dtype: str: The dtype to use for inference (default is ‘auto’).
model_dir: str, optional: The directory containing the model files (default is None).

predict(input_text, **kwargs)

class promptbench.models.models.YiModel(model_name, max_new_tokens, temperature, device, dtype)

Bases: LMMBaseModel

Language model class for the Yi model.

Inherits from LMMBaseModel and sets up the Yi language model for use.

Parameters:

model_name: str: The name of the Yi model.
max_new_tokens: int: The maximum number of new tokens to be generated.
temperature: float: The temperature for text generation (default is 0).
device: str: The device to use for inference (default is ‘auto’).