dyval.dyval_utils
- promptbench.dyval.dyval_utils.dyval_evaluate(dataset_type, preds, gts)
Evaluates predictions against ground truths for different dataset types.
Parameters:
- dataset_type : str
The type of dataset (e.g., ‘arithmetic’, ‘max_sum_path’).
- preds : list
A list of predictions.
- gts : list
A list of ground truths.
Returns:
: float
The accuracy of predictions as a proportion of correct answers.
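The return value is a plain proportion of correct answers. As a rough sketch of that computation, assuming exact string matching after whitespace stripping, the hypothetical helper `accuracy_sketch` below shows the idea. It is not part of promptbench, and the real function may compare values differently depending on `dataset_type` (for example, numerically for arithmetic tasks).

```python
def accuracy_sketch(preds, gts):
    """Hypothetical helper: fraction of predictions equal to their ground truth.

    Assumption: a prediction is correct when its stripped string form equals
    the stripped string form of the ground truth.
    """
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        # Avoid division by zero on empty input.
        return 0.0
    correct = sum(str(p).strip() == str(g).strip() for p, g in zip(preds, gts))
    return correct / len(preds)
```

Returning 0.0 for empty input is a choice made for this sketch; the library's behavior in that case is not documented here.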
- promptbench.dyval.dyval_utils.process_dyval_inputs(prompt, dataset)
Processes inputs for a DyVal (dynamic evaluation) dataset.
Parameters:
- prompt : str
The prompt template to be formatted.
- dataset : DyValDataset
The dataset containing descriptions and other relevant data.
Returns:
: dict
A dictionary of processed inputs organized by order.
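To illustrate what "organized by order" could look like, here is a minimal sketch that formats a template once per sample and groups the results under an order key. Everything in it is an assumption made for illustration: the helper name `format_inputs_sketch`, the `{description}` placeholder, the plain-dict stand-in for `DyValDataset`, and the order names are hypothetical and may not match the library's actual data layout.

```python
def format_inputs_sketch(prompt, samples_by_order):
    """Hypothetical helper: fill a prompt template for every sample, per order.

    Assumptions: `prompt` contains a `{description}` placeholder, and
    `samples_by_order` maps an order name to a list of dicts that each
    carry a "description" string.
    """
    return {
        order: [prompt.format(description=sample["description"]) for sample in samples]
        for order, samples in samples_by_order.items()
    }


# Usage with made-up data; the order names are illustrative only.
example = format_inputs_sketch(
    "Question: {description} Answer:",
    {"topological": [{"description": "a = 1, b = a + 2"}], "reversed": []},
)
```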
- promptbench.dyval.dyval_utils.process_dyval_preds(raw_pred)
Processes the raw prediction string to extract the predicted value.
Parameters:
- raw_pred : str
The raw prediction string.
Returns:
: str
The extracted prediction.
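Extraction of this kind is commonly done by searching the model output for a delimited answer. The sketch below assumes the answer is wrapped in `<<<` and `>>>` markers and takes the last such span. Both the delimiter convention and the helper name `extract_answer_sketch` are assumptions for illustration, not the documented behavior of `process_dyval_preds`.

```python
import re


def extract_answer_sketch(raw_pred, open_tag="<<<", close_tag=">>>"):
    """Hypothetical helper: return the last delimited answer in raw_pred.

    Assumption: the model was asked to wrap its final answer in
    open_tag ... close_tag. Returns an empty string when no span is found.
    """
    pattern = re.escape(open_tag) + r"(.*?)" + re.escape(close_tag)
    # DOTALL lets an answer span line breaks; the non-greedy group keeps
    # each match confined to one pair of delimiters.
    matches = re.findall(pattern, raw_pred, flags=re.DOTALL)
    return matches[-1].strip() if matches else ""
```

Taking the last match rather than the first is a design choice for this sketch: models often restate intermediate values before giving a final answer.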
- promptbench.dyval.dyval_utils.process_dyval_training_sample(sample, dataset_type)