Off-the-shelf (OTS) Datasets
English (United States) Harmful and harmless prompts and responses **in development**
Dataset ID:
eng_USA_LLM001
Dataset Name:
English (United States) Harmful and harmless prompts and responses **in development**
Common Use Cases:
LLM training, LLM red teaming, chatbots
Language:
English
Country:
United States
Language Code:
eng
Country Code:
USA
Product Type
Text
Detailed Product Type
LLM training
Unit
300 prompts
Recording Device
N/A
Recording Condition
N/A
Contributors
Available upon request
Utterances
300
Unique Words
Available upon request
Sample Rate (kHz):
N/A
Channels
N/A
Data Format
csv
Source
Appen Global
Additional Info:
- Prompts and responses are annotated for Harm category, Intensity, Voice, and Phrasing.
- Data collection is complete and QA is underway. The dataset is expected to be ready in Q1 2025 and can be prioritized upon request.
Year of Collection
2024
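Because the dataset is delivered as csv with per-row annotations for Harm category, Intensity, Voice, and Phrasing, a consumer would typically load the file and slice it by those fields. The following is a minimal sketch of that, not a description of the actual file: the column names (`prompt`, `response`, `harm_category`, `intensity`, `voice`, `phrasing`), the label values, and the placeholder rows are all assumptions made for illustration, and the real header and label set should be confirmed with Appen.

```python
import csv
import io
from collections import Counter

# Placeholder data only. Column names and label values are hypothetical;
# the delivered file's header row and label vocabulary may differ.
SAMPLE_CSV = """prompt,response,harm_category,intensity,voice,phrasing
<prompt text 1>,<response text 1>,category_a,low,active,question
<prompt text 2>,<response text 2>,category_b,high,passive,statement
<prompt text 3>,<response text 3>,category_a,high,active,statement
"""


def load_rows(handle):
    """Read annotated prompt/response rows from an open CSV file handle."""
    return list(csv.DictReader(handle))


def count_by(rows, column):
    """Count how many rows carry each value of the given annotation column."""
    return Counter(row[column] for row in rows)


def filter_rows(rows, **criteria):
    """Keep rows whose annotation columns match every given value."""
    return [
        row for row in rows
        if all(row[key] == value for key, value in criteria.items())
    ]


# In-memory stand-in for the real file; with an actual delivery this would be
# something like open("path/to/file.csv", newline="", encoding="utf-8").
rows = load_rows(io.StringIO(SAMPLE_CSV))
harm_counts = count_by(rows, "harm_category")
high_a = filter_rows(rows, harm_category="category_a", intensity="high")
```

Counting by annotation column first is a quick way to check class balance before using a subset for red teaming or fine-tuning.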
Get Started with Off-the-Shelf AI Training Datasets
Appen’s extensive catalog of off-the-shelf (OTS) datasets spans multiple data types and industries, providing comprehensive coverage for various AI applications. These datasets are crafted to the highest standards of quality and accuracy, ensuring reliable training data for AI models.