I had some attributes with values that were far out of range and, because I was in a hurry, I couldn't go back to the source data and eliminate them there. What I wanted to do was set a value to missing if it was greater than a certain threshold.
So I used the Generate Attributes operator like so.
In other words, if a1 > 5 then set a1 to missing (by dividing 0 by 0); otherwise leave it unchanged.
It's OK to modify the value of an existing attribute rather than create a new one. I'm sure this used not to work, so an enhancement seems to have sneaked in. Maybe my memory is playing tricks, but I don't mind: it works.
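The rule above can be sketched outside RapidMiner in plain Java. This is only an illustration under assumptions: the helper name toMissingIfAbove and the sample values are made up, and it relies on floating-point 0.0 / 0.0 evaluating to NaN, which is how a missing numeric value is commonly represented. Inside Generate Attributes the equivalent would presumably be an expression along the lines of if(a1 > 5, 0/0, a1), but check the exact syntax against your RapidMiner version.

```java
class MissingSketch {
    // Hypothetical helper: returns NaN (standing in for "missing")
    // when the value is greater than the threshold, else the value itself.
    static double toMissingIfAbove(double value, double threshold) {
        // Floating-point 0.0 / 0.0 yields NaN instead of throwing an exception.
        return value > threshold ? 0.0 / 0.0 : value;
    }

    public static void main(String[] args) {
        // Made-up sample values for an attribute called a1.
        double[] a1 = {1.0, 4.5, 5.0, 7.2, 120.0};
        for (double v : a1) {
            double cleaned = toMissingIfAbove(v, 5.0);
            String shown = Double.isNaN(cleaned) ? "missing" : String.valueOf(cleaned);
            System.out.println(v + " -> " + shown);
        }
    }
}
```

Note that the comparison is strictly greater than, so a value exactly equal to the threshold is kept.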
Tuesday, 25 June 2013
Saturday, 1 June 2013
Using Groovy to make an arbitrary example set
Here's a Groovy process to make an example set with the number of attributes you want as well as the number of examples.
The process uses two macros to dictate the size of the example set, as follows:
- numberOfAttributes
- numberOfExamples
These are set in the process context but can easily be defined in other ways.
// Attribute, AttributeFactory, MemoryExampleTable, DataRowFactory, DataRow and
// ExampleSet are assumed to come from the script operator's default imports.
import com.rapidminer.tools.Ontology;

// Read the two size macros from the process.
Integer numberOfAttributes = operator.getProcess().macroHandler.getMacro("numberOfAttributes").toInteger();
Integer numberOfExamples = operator.getProcess().macroHandler.getMacro("numberOfExamples").toInteger();

// Create the attributes att_0, att_1, ...
Attribute[] attributes = new Attribute[numberOfAttributes];
for (int i = 0; i < numberOfAttributes; i++) {
    String name = "att_" + i;
    attributes[i] = AttributeFactory.createAttribute(name, Ontology.STRING);
}

MemoryExampleTable table = new MemoryExampleTable(attributes);
DataRowFactory ROW_FACTORY = new DataRowFactory(0);

// Fill every cell with the default value "0" (a string, to match String[]).
String[] values = new String[numberOfAttributes];
for (int j = 0; j < numberOfExamples; j++) {
    for (int i = 0; i < numberOfAttributes; i++) {
        values[i] = "0";
    }
    DataRow row = ROW_FACTORY.create(values, attributes);
    table.addDataRow(row);
}

ExampleSet exampleSet = table.createExampleSet();
return exampleSet;
The attribute names are prefixed with "att_" and the default value is 0. A bit of coding can change this.
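As a rough idea of what that bit of coding might look like, here is a plain Java sketch of the same two loops with the prefix and the default value turned into parameters. It deliberately uses no RapidMiner classes so it runs on its own; the class name ExampleGridSketch, the helper names, and the sample prefix "col_" and default "n/a" are all hypothetical and only illustrate the idea.

```java
import java.util.Arrays;

class ExampleGridSketch {
    // Builds names such as prefix0, prefix1, ... (the script hard-codes "att_").
    static String[] attributeNames(String prefix, int numberOfAttributes) {
        String[] names = new String[numberOfAttributes];
        for (int i = 0; i < numberOfAttributes; i++) {
            names[i] = prefix + i;
        }
        return names;
    }

    // Builds numberOfExamples rows, each filled with the same default value
    // (the script hard-codes 0).
    static String[][] filledRows(int numberOfExamples, int numberOfAttributes, String defaultValue) {
        String[][] rows = new String[numberOfExamples][numberOfAttributes];
        for (String[] row : rows) {
            Arrays.fill(row, defaultValue);
        }
        return rows;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(attributeNames("col_", 3)));
        for (String[] row : filledRows(2, 3, "n/a")) {
            System.out.println(Arrays.toString(row));
        }
    }
}
```

In the Groovy script itself, the same change would amount to swapping the "att_" literal and the value assigned to values[i] for whatever you prefer, for example further macros.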
Of course, the operators that are already available can be used to recreate this, but my personal best is 8 operators to recreate the Groovy script above. I figured a 7-click saving was worth investing a bit of time to get.
Edit: I improved my personal best to 4 operators.