This repository contains scripts and instructions to run the SLPHelm benchmark.
There are two sub-folders:

- `finetune`: scripts to finetune models with self-generated data.
- `finetune-ultrasuite`: instructions to create the UltraSuite dataset and finetune models with the LLaMA-Factory framework.
- Install HELM:

```bash
git clone https://github.com/martinakaduc/helm/ -b slp_helm
cd helm
pip install -e .
```
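To check that the editable install succeeded, you can print the CLI help (assuming the package exposes the standard `helm-run` entry point):

```bash
helm-run --help
```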
- Run the benchmark:

```bash
# Binary Classification
helm-run --run-entries \
    ultra_suite_classification:model={model_name} \
    --suite binary-suite \
    --output-path {evaluation_dir} \
    --disable-cache \
    --max-eval-instances 1000
```
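For example, a concrete invocation might look like the following; the model identifier and output directory are hypothetical placeholders, not values from this repository:

```bash
# Example: binary classification with a (hypothetical) HELM model id,
# writing results to a local directory
helm-run --run-entries \
    ultra_suite_classification:model=openai/gpt-4o \
    --suite binary-suite \
    --output-path ./benchmark_output \
    --disable-cache \
    --max-eval-instances 1000
```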
```bash
# ASR Classification
helm-run --run-entries \
    ultra_suite_classification:model={model_name} \
    --suite asr-suite \
    --output-path {evaluation_dir} \
    --disable-cache \
    --max-eval-instances 1000
```

```bash
# ASR Transcription
helm-run --run-entries \
    ultra_suite_asr_transcription:model={model_name} \
    --suite trans-suite \
    --output-path {evaluation_dir} \
    --disable-cache \
    --max-eval-instances 1000
```

```bash
# Type Classification
helm-run --run-entries \
    ultra_suite_classification_breakdown:model={model_name} \
    --suite type-suite \
    --output-path {evaluation_dir} \
    --disable-cache \
    --max-eval-instances 1000
```

```bash
# Symptom Classification
helm-run --run-entries \
    ultra_suite_disorder_symptoms:model={model_name} \
    --suite symp-suite \
    --output-path {evaluation_dir} \
    --disable-cache \
    --max-eval-instances 1000
```
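All five tasks follow the same pattern, so a small shell loop can sweep them for one model. This is a convenience sketch, not a script shipped with the repository; `{model_name}` and `{evaluation_dir}` must still be supplied by you:

```bash
#!/usr/bin/env bash
set -euo pipefail

model_name="$1"        # HELM model identifier, passed as the first argument
evaluation_dir="$2"    # output directory, passed as the second argument

# "run-entry suite" pairs, one per SLPHelm task
tasks=(
  "ultra_suite_classification binary-suite"
  "ultra_suite_classification asr-suite"
  "ultra_suite_asr_transcription trans-suite"
  "ultra_suite_classification_breakdown type-suite"
  "ultra_suite_disorder_symptoms symp-suite"
)

for task in "${tasks[@]}"; do
  read -r entry suite <<< "$task"
  helm-run --run-entries "${entry}:model=${model_name}" \
      --suite "$suite" \
      --output-path "$evaluation_dir" \
      --disable-cache \
      --max-eval-instances 1000
done
```

After the runs finish, upstream HELM provides `helm-summarize --suite <suite>` and `helm-server` to aggregate and browse results; we assume the `slp_helm` fork retains these entry points.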