Open
Description
I got it working, but it was brutal: about 300 lines of code. I feel like I did it the hard way, but even after reading the CSV parser code I wasn't sure whether there was an easier way. What I ended up doing:
- Parsing each String cell into a Long or a Double, or leaving it as a String
- Finding the "worst" (least specific) type in each column and normalizing the whole column to it
- Making lookup tables for each column that needs one (Strings, or a small number of distinct ints)
- Generating a dataset based on the output column name
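For anyone following along, the first three steps can be sketched in plain Java roughly as below. This is only an illustration under my own assumptions, not the 300 lines described here and not anything from the library: the class `ColumnTypeSketch`, the enum `CellType`, and all method names are hypothetical, and a column is simplified to a `List<String>`. The last step (building the actual dataset) is left out because it depends on the library's own classes.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeSet;

// Hypothetical helper for illustration only; not part of the library.
final class ColumnTypeSketch {

    // Ordered from most to least specific, so a later constant is a "worse" type.
    enum CellType { LONG, DOUBLE, STRING }

    // Step 1: find the most specific type a single (non-null) cell parses as.
    // Note: Double.parseDouble also accepts forms such as "NaN" or "1e3".
    static CellType typeOf(String raw) {
        String s = raw.trim();
        try {
            Long.parseLong(s);
            return CellType.LONG;
        } catch (NumberFormatException notALong) {
            // not a long, try double next
        }
        try {
            Double.parseDouble(s);
            return CellType.DOUBLE;
        } catch (NumberFormatException notADouble) {
            return CellType.STRING;
        }
    }

    // Step 2: the column's type is the worst type found among its cells.
    static CellType worstType(List<String> column) {
        CellType worst = CellType.LONG;
        for (String cell : column) {
            CellType t = typeOf(cell);
            if (t.compareTo(worst) > 0) {
                worst = t;
            }
        }
        return worst;
    }

    // Step 3: lookup table giving each distinct value a dense id, in sorted order.
    static Map<String, Integer> buildLookup(List<String> column) {
        Map<String, Integer> lookup = new LinkedHashMap<>();
        for (String value : new TreeSet<>(column)) {
            lookup.put(value, lookup.size());
        }
        return lookup;
    }

    // Replace every cell in the column with its id from the lookup table.
    static int[] encode(List<String> column, Map<String, Integer> lookup) {
        int[] ids = new int[column.size()];
        for (int i = 0; i < ids.length; i++) {
            ids[i] = lookup.get(column.get(i));
        }
        return ids;
    }

    public static void main(String[] args) {
        List<String> prices = Arrays.asList("1", "2.5", "3");
        List<String> colors = Arrays.asList("red", "blue", "red");

        System.out.println(worstType(prices)); // DOUBLE
        System.out.println(worstType(colors)); // STRING

        Map<String, Integer> lookup = buildLookup(colors);
        System.out.println(lookup);                                  // {blue=0, red=1}
        System.out.println(Arrays.toString(encode(colors, lookup))); // [1, 0, 1]
    }
}
```

Relying on the enum's declaration order for "worst" keeps the comparison to a single `compareTo` call; a column with no cells is reported as `LONG` here, which may or may not be what you want.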
Is there an easier way to do this?
Can it be part of the library?
class TableDataLoader
- TableDataLoader(Table<Long, String, String>)
- getDataSet(String)
- tableToDataSet_Classification(ColumnInfo, List, SortedSet, int, int)
- tableToDataSet_Regression(ColumnInfo, List, SortedSet, int, int)
class ColumnInfo
- ColumnInfo(String, Map<Long, String>)
- collectionToSortedUniqueStringList(Collection)
- parseColumn(Map<Long, String>)
- parseToLowestObject(String, Class<?>)
- constructJSATCategoricalData()
- constructLabelLookups()
- getCategoricalData()
- getName()
- getType()
- isLookup()
- getRowValue(Number)
- getKeyFromLookupId(int)
- getAllRowKeys()