Import from Hugging Face
How to import datasets from Hugging Face Datasets into HoneyHive.
HoneyHive datasets don’t follow a fixed schema, so datasets from Hugging Face Datasets (or any other dataset management tool) can be imported into HoneyHive through an automatic integration.
Upload a dataset through the SDK
At a high level, setting up the integration requires two things:
- a mapping of inputs and outputs
- an import batch size
Prerequisites
- You have already created a project in HoneyHive, as explained here.
- You have an API key for your project, as explained here.
Expected time: a few minutes
Installation
To install our SDK, run the following commands in the shell.
Authentication & Imports
To authenticate your SDK, you need to pass your API key.
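As a minimal sketch of handling the key safely, the helper below reads it from an environment variable instead of hard-coding it in the script. The variable name HH_API_KEY and the function load_api_key are assumptions made for illustration, not part of the HoneyHive SDK; pass the returned string to the SDK client however its reference describes.

```python
import os


def load_api_key(env_var: str = "HH_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is missing.

    The default variable name HH_API_KEY is an assumption for this sketch.
    """
    # Strip whitespace so a stray newline from a copied value does not break auth.
    api_key = os.environ.get(env_var, "").strip()
    if not api_key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your HoneyHive API key."
        )
    return api_key
```

Failing early with a clear message keeps a missing key from surfacing later as an opaque authentication error.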
Create the HoneyHive dataset
Give your new dataset a name and pass the name of the project you want to associate it with.
Keep the generated dataset_id handy for future reference.
Pass your data in batches with a mapping
Now, using the dataset_id, you can pass your data list and provide a mapping of its fields.
We’ll create a unique datapoint for each entry in the JSON list. The datapoint_id on those entries will be used for joining traces in experiment runs in the future.
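The batching step can be sketched in plain Python, independent of any particular SDK call. Everything named here is an assumption for illustration: the mapping keys ("inputs", "ground_truth"), the column names, the batch size, and the send_batch callable, which stands in for whichever SDK method uploads one batch to your dataset_id. Iterating a Hugging Face Dataset yields one dict per row, so to_rows accepts it directly.

```python
from typing import Any, Callable, Dict, Iterable, Iterator, List

Row = Dict[str, Any]

# Hypothetical mapping from datapoint sections to source columns.
# Both the keys and the column names are assumptions for this sketch.
MAPPING: Dict[str, List[str]] = {
    "inputs": ["question", "context"],
    "ground_truth": ["answer"],
}


def to_rows(dataset: Iterable[Any]) -> List[Row]:
    """Materialize any iterable of dict-like rows as a list of plain dicts."""
    return [dict(row) for row in dataset]


def batched(rows: List[Row], batch_size: int) -> Iterator[List[Row]]:
    """Yield consecutive slices of `rows`, each at most `batch_size` long."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def check_mapping(rows: List[Row], mapping: Dict[str, List[str]]) -> None:
    """Raise KeyError if a mapped column is absent from any row."""
    mapped = [col for cols in mapping.values() for col in cols]
    for index, row in enumerate(rows):
        missing = [col for col in mapped if col not in row]
        if missing:
            raise KeyError(f"row {index} is missing mapped columns: {missing}")


def upload_in_batches(
    rows: List[Row],
    mapping: Dict[str, List[str]],
    send_batch: Callable[[List[Row], Dict[str, List[str]]], None],
    batch_size: int = 100,
) -> int:
    """Validate the mapping, send rows batch by batch, return the count sent."""
    check_mapping(rows, mapping)
    sent = 0
    for batch in batched(rows, batch_size):
        # send_batch is a placeholder for the real upload call.
        send_batch(batch, mapping)
        sent += len(batch)
    return sent
```

Validating the mapping before the first upload avoids a partially imported dataset when a column name is misspelled.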
metadata of the datapoint.
You have successfully uploaded your Hugging Face dataset to HoneyHive using the SDK.
You can now view your dataset in the HoneyHive UI.