Prediction in Weka using Explorer


Once I have trained and generated a model, all of the examples I have seen that use a testing set have put in values for both actual and predicted. Is there a way I can either leave the actual column empty or not use it at all when doing prediction?

If I take an example, with the following training set

@relation supermarket
@attribute 'department1' { t}
@attribute 'department2' { t}
@attribute 'department3' { t}
@attribute value

and I am using a testing set like

@relation supermarket
@attribute 'department1' { t}
@attribute 'department2' { t}
@attribute 'department3' { t}
@attribute value

and the output is like

@relation supermarket
@attribute 'department1' { t}
@attribute 'department2' { t}
@attribute 'department3' { t}
@attribute value
@attribute predicted-value
@attribute predicted-margin

My question is: can I either remove the value attribute or keep it empty in the testing set?

Case 1: both the training and the test set have class labels

Training:

@relation simple-training
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
1, 2, b
2, 4, a
.......

Testing:

@relation simple-testing
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
7, 12, a
8, 14, a
.......

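In this case, whether you are using k-fold CV or a train-test setup, Weka will not take the class labels in the test set into account at all when predicting. It gets the model from the training data, blindly applies it to the test set, and then compares its predictions with the actual class labels in the testing set.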

This is useful if you want to see the performance evaluation of your classifier.
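The comparison step in Case 1 can be illustrated with a small Python sketch. This is not Weka's code: the function name `accuracy` and the two label lists are made up for illustration, and the sketch only shows the idea of scoring predictions against the labels that were held back in the test set.

```python
def accuracy(actual, predicted):
    """Fraction of test instances whose predicted label equals the actual one."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    if not actual:
        raise ValueError("no instances to evaluate")
    correct = sum(1 for a, p in zip(actual, predicted) if a == p)
    return correct / len(actual)


# Hypothetical labels: the actual column of a test set and a model's predictions.
actual = ["a", "a", "b", "b"]
predicted = ["a", "b", "b", "b"]
print(accuracy(actual, predicted))  # 3 of the 4 labels match
```

The actual labels are only read after prediction, which is why leaving them in the test file does not influence the model.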

Case 2: you have class labels for the training data but you don't have class labels for the testing data.

Training:

@relation simple-training
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
1, 2, b
2, 4, a
.......

Testing:

@relation simple-testing
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
7, 12, ?
8, 14, ?
.......

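This is the normal situation, since it is exactly what we need to do: apply the trained model to unseen, unlabeled data and label it! In this case, simply put ? marks in place of the testing class labels. After running Weka on this setup, the ? marks in the output will be replaced by the predicted values (you don't need to create any additional column, as that will give you an error).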
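If the test file already contains labels, or has an empty last column, the ? marks can be written by a short script before loading the file in Weka. The following Python sketch is only one way to do this. It assumes the class is the last attribute, that rows are in dense comma-separated form, and that no value contains a quoted comma; the function name `mask_class_labels` is hypothetical.

```python
def mask_class_labels(arff_text):
    """Return ARFF text with the last value of every data row replaced by '?'."""
    out = []
    in_data = False
    for line in arff_text.splitlines():
        stripped = line.strip()
        if not in_data:
            # Header lines are copied unchanged.
            out.append(line)
            if stripped.lower() == "@data":
                in_data = True
            continue
        # Leave blank lines and ARFF comments (starting with %) untouched.
        if not stripped or stripped.startswith("%"):
            out.append(line)
            continue
        values = [v.strip() for v in stripped.split(",")]
        values[-1] = "?"
        out.append(", ".join(values))
    return "\n".join(out)


labeled = """@relation simple-testing
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
7, 12, a
8, 14, b"""

print(mask_class_labels(labeled))
```

The header is left exactly as it was, so the masked file still declares the class attribute that the trained model expects.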

So, in a nutshell: you need to have compatibility between your training and testing data. In the testing data, if you don't know the value and you want to predict it, put a ? mark in that column.
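A rough way to check this compatibility before opening the files in Weka is to compare the attribute declarations of the two ARFF files. The Python sketch below is a plain textual comparison, not Weka's own check; the function names are hypothetical, and it assumes each attribute is declared on a single line starting with @attribute.

```python
def attribute_declarations(arff_text):
    """Collect the @attribute declarations that appear before @data."""
    decls = []
    for line in arff_text.splitlines():
        stripped = line.strip()
        lowered = stripped.lower()
        if lowered == "@data":
            break
        if lowered.startswith("@attribute"):
            # Drop the keyword and collapse runs of whitespace so that
            # purely cosmetic spacing differences are ignored.
            rest = stripped[len("@attribute"):]
            decls.append(" ".join(rest.split()))
    return decls


def headers_compatible(train_text, test_text):
    """True when both files declare the same attributes in the same order."""
    return attribute_declarations(train_text) == attribute_declarations(test_text)


train = """@relation simple-training
@attribute feature1 numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
1, 2, b"""

test = """@relation simple-testing
@attribute feature1   numeric
@attribute feature2 numeric
@attribute class {a,b}
@data
7, 12, ?"""

print(headers_compatible(train, test))
```

The relation name is deliberately ignored, since the training and testing files in the examples above use different names but the same attributes.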

