Closed
Labels: documentation, enhancement, wontfix
Description
A Light Discussion about Dataset Choices for URL (at least)
Besides a small subset of (m)C4, I would prefer to pick datasets that the metadata (URL at least), promptsource, and evaluation WGs have in common.
- TyDi QA (primary task) is probably the only common dataset
For datasets used by only one of the other two WGs (that is, not by us, the metadata WG):
- From evaluation
  - GEM from the eval WG, specifically
    - MLSum
    - WikiLingua
- From promptsource
  - app_reviews: not really a URL/URI, but it basically provides a namespace and a date
  - CC-News: virtually a subset of C4
  - Probably some more