Host the dataset on HuggingFace

In the previous tutorial, FineTuning_Ludwig, we saw that fine-tuning a model on a new task requires a dataset in the right format. This tutorial shows how to push your dataset to HuggingFace so that it can be used later to fine-tune a model.

Note

You need a HuggingFace account to push your dataset. You can create an account here.

A video tutorial is available for this section: watch it.

Pushing the dataset

Once your dataset is in the right format (see the data format tutorial), you can upload it to HuggingFace by following these steps:

  1. Log in to your HuggingFace account.

  2. Visit the following link, give your dataset a name, choose an appropriate license, and choose whether to make the dataset public or private.

Create a new dataset page.

  3. Click on the Files and versions tab.

The Files and versions tab.

  4. Click on Add file, then Upload files.

The Add file button in the Files and versions tab.

  5. Drag and drop your dataset.

The drag and drop area.

  6. Click Commit changes to main.

The Commit changes to main button.
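
The web steps above can also be scripted. The following is a minimal sketch, not part of the original tutorial, that assumes the `huggingface_hub` Python package is installed and that you have already authenticated (for example with `huggingface-cli login`). The helper names `build_repo_id` and `push_dataset`, and the example values mentioned afterwards, are hypothetical and chosen only for illustration.

```python
from pathlib import Path


def build_repo_id(username: str, dataset_name: str) -> str:
    """Return the `<username>/<dataset_name>` identifier used for the repository."""
    return f"{username.strip()}/{dataset_name.strip()}"


def push_dataset(local_file: str, repo_id: str, private: bool = True) -> None:
    """Create the dataset repository if needed and upload one file to it."""
    path = Path(local_file)
    # Fail early, before any network call, if the file is missing.
    if not path.is_file():
        raise FileNotFoundError(f"Dataset file not found: {path}")

    # Imported lazily so the checks above work without the package installed.
    # Assumption: `pip install huggingface_hub` has been run.
    from huggingface_hub import HfApi

    api = HfApi()  # picks up the token saved at login
    # Equivalent of the "create a new dataset" page (name and visibility).
    api.create_repo(
        repo_id=repo_id,
        repo_type="dataset",
        private=private,
        exist_ok=True,  # do not fail if the repository already exists
    )
    # Equivalent of "Add file", "Upload files" and "Commit changes to main".
    api.upload_file(
        path_or_fileobj=str(path),
        path_in_repo=path.name,
        repo_id=repo_id,
        repo_type="dataset",
        commit_message="Upload dataset",
    )
```

For example, with a hypothetical account name `alice` and a local file `train.jsonl`, calling `push_dataset("train.jsonl", build_repo_id("alice", "my-dataset"))` should create a private dataset repository and commit the file to its main branch. The license is not set by this sketch; it can still be chosen on the dataset page.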

Congratulations! You have successfully pushed your dataset to HuggingFace. You can now use it to fine-tune a model on a new task.

The preview of the data.
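
To check from code that the uploaded dataset is usable, it can be loaded back from HuggingFace. This is a hedged sketch that assumes the `datasets` Python package is installed, that the uploaded file is in a format the library can detect automatically (such as CSV or JSON Lines), and that the data ends up in the default `train` split. The function name `load_pushed_dataset` is hypothetical.

```python
def load_pushed_dataset(repo_id: str, split: str = "train"):
    """Load one split of a dataset hosted on HuggingFace."""
    # Datasets pushed from a user account are named `<username>/<dataset_name>`.
    if repo_id.count("/") != 1 or not all(repo_id.split("/")):
        raise ValueError(f"Expected '<username>/<dataset_name>', got: {repo_id!r}")

    # Imported lazily so the check above works without the package installed.
    # Assumption: `pip install datasets` has been run.
    from datasets import load_dataset

    # For a private dataset, the token saved at login is used for access.
    return load_dataset(repo_id, split=split)
```

With the hypothetical repository from before, `load_pushed_dataset("alice/my-dataset")` would return the rows that appear in the preview on the dataset page.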