Data Science With RapidMiner: Creating test data with attributes that match the training data

Saturday, 29 January 2011

Creating test data with attributes that match the training data

When applying a model to a test set, it is (usually) important that the number and names of attributes match those used to create the model.

When features are extracted from data and the attribute names depend on the data itself, the test data often lacks some of the attributes the model expects and may also contain additional ones.
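To make the mismatch concrete, here is a minimal Python sketch (not a RapidMiner process) that reports which attributes the test data is missing and which ones it has in excess. The function name `compare_attributes` and the attribute names are hypothetical, chosen only for illustration.

```python
def compare_attributes(model_attributes, test_attributes):
    """Return (missing, extra) attribute names of the test data relative to the model."""
    model_set = set(model_attributes)
    test_set = set(test_attributes)
    # Attributes the model expects but the test data does not provide.
    missing = sorted(model_set - test_set)
    # Attributes the test data provides but the model has never seen.
    extra = sorted(test_set - model_set)
    return missing, extra


# Hypothetical attribute names, e.g. words extracted from two sets of documents.
missing, extra = compare_attributes(["cat", "dog", "fish"], ["dog", "bird"])
print("missing:", missing)
print("extra:", extra)
```

Both lists have to be dealt with before the model is applied: the extra attributes are removed and the missing ones are added with missing values.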

The example here shows the following:

Training data with attributes att2 to att10 and a label
Test data with attributes att1 and att2
Attribute att1 is removed from the test data by using weights from the training data
Attributes att2 to att10 and the label are added to the test data by using a join operator
The resulting test data contains attributes att2 to att10 and a label. Only att2 has a value; all the others are missing
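For readers who want to see the same outcome outside RapidMiner, the steps above can be sketched in Python with pandas. This is an analogue under stated assumptions, not the process from this post: it uses `DataFrame.reindex` in place of the weights and join operators, and the function name `align_to_training` and the sample values are made up for illustration.

```python
from typing import Sequence

import pandas as pd


def align_to_training(test: pd.DataFrame, training_columns: Sequence[str]) -> pd.DataFrame:
    """Return the test data with exactly the training columns, in the same order."""
    # reindex drops columns that are not in training_columns (att1 here)
    # and adds the ones that are absent, filled with NaN (missing values).
    return test.reindex(columns=list(training_columns))


# Training data has att2 to att10 and a label; only the column names matter here.
training_columns = [f"att{i}" for i in range(2, 11)] + ["label"]

# Test data has only att1 and att2 (sample values are invented).
test = pd.DataFrame({"att1": [0.1, 0.2], "att2": [1.5, 2.5]})

aligned = align_to_training(test, training_columns)
print(aligned)
```

As in the RapidMiner example, the result has att2 to att10 and a label, att1 is gone, and only att2 carries values.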
