Saturday, 29 January 2011

Creating test data with attributes that match the training data

When applying a model to a test set, it is (usually) important that the number and names of attributes match those used to create the model.

When features are extracted from data and the attribute names depend on the data itself, the test data often lacks some of the attributes the model was built on and contains additional ones the model has never seen.

The example here shows the following:
  1. Training data with attributes att2 to att10 and a label
  2. Test data with attributes att1 and att2
  3. Attribute att1 is removed from the test data by using weights from the training data
  4. Attributes att2 to att10 and the label are added to the test data by using a join operator
  5. The resulting test data contains attributes att2 to att10 and a label. Only att2 has a value; all others are missing
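
The same idea can be sketched outside the process shown in the post. The snippet below is plain Python, not the weights and join operators used above, and only illustrates the end result: attributes unknown to the training data are dropped, and attributes the test data lacks are added as missing values. The function name `align_to_training`, the row-as-dictionary layout, and the sample values are all assumptions made for illustration.

```python
def align_to_training(rows, training_attributes):
    """Give each test row exactly the training attributes, in training order.

    Attributes absent from training (such as att1) are dropped;
    attributes absent from the row are filled with None (missing).
    """
    return [
        {name: row.get(name) for name in training_attributes}
        for row in rows
    ]


# Training data has att2 to att10 plus a label (names taken from the post).
training_attributes = [f"att{i}" for i in range(2, 11)] + ["label"]

# Hypothetical test rows with att1 and att2 only; the values are made up.
test_rows = [
    {"att1": 0.5, "att2": 1.5},
    {"att1": 0.7, "att2": 2.5},
]

aligned = align_to_training(test_rows, training_attributes)

for row in aligned:
    print(row)
```

Each aligned row keeps its att2 value, has no att1, and carries None for att3 to att10 and the label, which mirrors step 5 above.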
