Host the dataset on HuggingFace
The previous tutorial, FineTuning_Ludwig, showed that fine-tuning a model on a new task requires a dataset in the right format. This tutorial shows how to push your dataset to HuggingFace so that it can be used later to fine-tune a model.
Note
You need a HuggingFace account to push your dataset. You can create an account here.
A video tutorial is also available for this section.
Pushing the dataset
Once your dataset is in the right format (see the data format tutorial), you can upload it to HuggingFace by following these steps:
Connect to your HuggingFace account.
Visit the following link, give your dataset a name, choose an appropriate license, and choose whether to make the dataset public or private.
Create a new dataset page.
Click on the Files and versions tab.
The Files and versions tab.
Click on Add file, then Upload files and upload your dataset.
Drag and drop your dataset.
The drag and drop area.
Hit Commit changes to main.
The Commit changes to main button.
Congratulations! You have successfully pushed your dataset to HuggingFace. You can now use it to fine-tune a model on a new task.
The preview of the data.
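The same upload can also be scripted instead of done through the web interface. The block below is a minimal sketch, not part of the original tutorial: it assumes the `huggingface_hub` Python package is installed and that you are already authenticated (for example through a saved login token or the `HF_TOKEN` environment variable). The function name `push_dataset`, the file name, and the repository name are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def push_dataset(local_path, repo_id, private=True, api=None):
    """Upload one local dataset file to a HuggingFace dataset repository.

    local_path: path to the dataset file on disk.
    repo_id:    repository name, e.g. "your-username/my-dataset" (hypothetical).
    private:    whether the dataset repository should be private.
    api:        optional client object; defaults to huggingface_hub.HfApi().
    """
    path = Path(local_path)
    if not path.is_file():
        raise FileNotFoundError(f"Dataset file not found: {path}")

    if api is None:
        # Imported lazily so the function can be exercised with a stand-in client.
        from huggingface_hub import HfApi

        api = HfApi()  # picks up a saved login token or HF_TOKEN

    # Create the dataset repository; exist_ok avoids an error if it already exists.
    api.create_repo(
        repo_id=repo_id,
        repo_type="dataset",
        private=private,
        exist_ok=True,
    )

    # Upload the file and commit it to the default branch (main).
    api.upload_file(
        path_or_fileobj=str(path),
        path_in_repo=path.name,
        repo_id=repo_id,
        repo_type="dataset",
        commit_message=f"Add {path.name}",
    )
    return path.name
```

A call such as `push_dataset("train.csv", "your-username/my-dataset")` would then mirror the manual steps: it creates the dataset page if needed, uploads the file, and commits it to main. Both arguments here are placeholders to replace with your own file and repository name.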