Workflow
- Install required packages
- Make sure a MongoDB instance is set up and running
- Fill in your database credentials in .mongo.conf (see https://gitlab.com/MaxSchambach/mdbh)
- Fill in your desired settings in config.yaml
- Run zeroshot_sacred.py to start the experiment(s)
- Run transform.py to obtain a dataframe containing the experiment results
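The steps above can be sketched as a shell session. The package names and the bare invocations are assumptions (the repository may ship a requirements file or expect additional arguments), so treat this as an illustration rather than the exact CLI:

```shell
# Install dependencies (package list is an assumption; check the repository)
pip install mlmc sacred pymongo

# Start the experiment(s); settings are read from config.yaml
python zeroshot_sacred.py

# Collect the results from the database into a dataframe
python transform.py
```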
Configuration
- name: Name of the experiment
- user: Username of the database
- host: Address of the database
- port: Port of the database
- database: Name of the database
- auth: Name of the authentication database
- pw: Password of the database
- device: GPU to use (see nvidia-smi for device numbers)
- batch_size: Batch size
- representation: A huggingface model (see https://huggingface.co/models)
- multi_threshold: Score threshold to use (see mlmc.thresholds.thresholds_dict.keys()). Single-label datasets always use the maximum prediction.
- formatted: If set to True, each class label is replaced by a more descriptive label. Additionally, if the huggingface method is used, the hypothesis is replaced as well.
- cut_sample: Trims the input text to the maximum input size of the language model.
- method: "huggingface" or "flair"
- whole_dataset: If True, the entire dataset is used for classification.
- datasets: Datasets to use (see mlmc.data.register.keys())
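The options above map one-to-one onto keys in config.yaml. A hypothetical example follows; every value (host, credentials, model id, threshold name, dataset names) is a placeholder, not a default of this repository:

```yaml
name: zeroshot-baseline          # experiment name recorded in the database
user: sacred_user                # MongoDB username
host: localhost                  # MongoDB address
port: 27017                      # MongoDB port
database: experiments            # database that stores the runs
auth: admin                      # authentication database
pw: changeme                     # MongoDB password
device: 0                        # GPU index (see nvidia-smi)
batch_size: 32
representation: bert-base-uncased  # any huggingface model id
multi_threshold: hard              # placeholder; see mlmc.thresholds.thresholds_dict.keys()
formatted: True
cut_sample: True
method: huggingface                # "huggingface" or "flair"
whole_dataset: False
datasets:                          # placeholders; see mlmc.data.register.keys()
  - agnews
  - dbpedia
```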