Workflow
- Install required packages
- Make sure a MongoDB instance is set up and running
- Fill in your database credentials in .mongo.conf (see https://gitlab.com/MaxSchambach/mdbh)
- Fill in your desired settings in config.yaml
- Run zeroshot_sacred.py to start the experiment(s)
- Run transform.py to obtain a dataframe containing the experiment results
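The steps above can be sketched as a shell session. The package names and the bare invocations are assumptions (the repository may ship a requirements file or expect additional arguments), so treat this as an illustration rather than the exact CLI:

```shell
# Install dependencies (package list is an assumption; check the repository)
pip install mlmc sacred pymongo

# Start the experiment(s); settings are read from config.yaml
python zeroshot_sacred.py

# Collect the results from the database into a dataframe
python transform.py
```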
Configuration
- name: Name of the experiment
- user: Username of the database
- host: Address of the database
- port: Port of the database
- database: Name of the database
- auth: Name of the authentication database
- pw: Password of the database
- device: GPU to use (see nvidia-smi for device numbers)
- batch_size: Batch size
- representation: A huggingface model (see https://huggingface.co/models)
- multi_threshold: Score threshold to use (see mlmc.thresholds.thresholds_dict.keys()). Single-label datasets always use the maximum prediction.
- formatted: If set to True, each class label is replaced by a more descriptive label. Additionally, if the huggingface method is used, the hypothesis is replaced as well.
- cut_sample: Trims the input text to the maximum input size of the language model.
- method: "huggingface" or "flair"
- whole_dataset: If True, the entire dataset is used for classification.
- datasets: Datasets to use (see mlmc.data.register.keys())
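The options above map one-to-one onto keys in config.yaml. A hypothetical example follows; every value (host, credentials, model id, threshold name, dataset names) is a placeholder, not a default of this repository:

```yaml
name: zeroshot-baseline          # experiment name recorded in the database
user: sacred_user                # MongoDB username
host: localhost                  # MongoDB address
port: 27017                      # MongoDB port
database: experiments            # database that stores the runs
auth: admin                      # authentication database
pw: changeme                     # MongoDB password
device: 0                        # GPU index (see nvidia-smi)
batch_size: 32
representation: bert-base-uncased  # any huggingface model id
multi_threshold: hard              # placeholder; see mlmc.thresholds.thresholds_dict.keys()
formatted: True
cut_sample: True
method: huggingface                # "huggingface" or "flair"
whole_dataset: False
datasets:                          # placeholders; see mlmc.data.register.keys()
  - agnews
  - dbpedia
```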