Monday, 25 August 2014


Here is a process to plot the Mandelbrot set. It's based on the one that was successful at the recent RapidMiner World conference.

It makes pretty pictures like this.

Various macros control the execution of the process. With the following settings,

yPoints: 80
xPoints: 120
iterations: 200
xmin: -0.95
xmax: -0.855
ymin: 0.2375
ymax: 0.3275

a zoomed-in view like this is produced. How cool.

I noticed that the advanced plotter limits the number of points that get plotted. The limit is a configuration setting found at Tools->Preferences->Gui->rapidminer.gui.plotter.rows.maximum, and it is 5,000 by default. If you want to see all the points for the settings above, set it to at least 9,600 (120 x 80).
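As a quick check of that number, here is the arithmetic as a tiny Python sketch. The variable names are mine and only mirror the macros listed above; nothing here is part of the RapidMiner process.

```python
# Macro values from the settings above (names are illustrative only).
x_points = 120
y_points = 80
default_plotter_limit = 5000  # default of rapidminer.gui.plotter.rows.maximum

# Every x value is paired with every y value, so the row count is the product.
total_points = x_points * y_points
needs_higher_limit = total_points > default_plotter_limit

print(total_points)        # 9600
print(needs_higher_limit)  # True
```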

The process itself is in two main parts.

Firstly, a sub-process creates the x and y axes, which I called x0 and y0. Each axis is built using the operators "Generate Data", "Generate ID", and "Normalize". The two axes are then joined using the "Cartesian Product" operator to produce all possible combinations of x0 and y0. The resulting example set is stored in the process context using the "Remember" operator.
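For readers who think more easily in code, the grid-building step might look roughly like this in Python. This is a sketch of the idea, not a translation of the RapidMiner operators: the helper names `axis` and `make_grid` are made up, and I am assuming each axis is evenly spaced and includes both end points.

```python
from itertools import product


def axis(n, lo, hi):
    """Return n evenly spaced values from lo to hi inclusive (assumed spacing)."""
    if n == 1:
        return [lo]
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def make_grid(x_points, y_points, xmin, xmax, ymin, ymax):
    """Pair every x0 with every y0, like a Cartesian product of the two axes."""
    xs = axis(x_points, xmin, xmax)
    ys = axis(y_points, ymin, ymax)
    return list(product(xs, ys))


# The zoomed-in settings from earlier in the post.
grid = make_grid(120, 80, -0.95, -0.855, 0.2375, 0.3275)
print(len(grid))  # 9600
```

Each element of `grid` is one (x0, y0) pair, which plays the role of one row in the remembered example set.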

Secondly, the "Loop" operator uses the "Recall" operator to get the latest example set to work on and performs the necessary calculations to generate the Mandelbrot set. The result of each iteration is remembered in the process context so the next loop iteration can carry on. There is some cunning filtering to reduce the amount of effort in each loop. Note the "Materialize Data" operator. This is often needed and does no harm if it is included.
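The loop can be sketched in plain Python as follows. This is a minimal illustration under my own assumptions: the function name `mandelbrot_counts` and the record layout are hypothetical, and I am guessing that the filtering amounts to skipping points that have already escaped (using the usual |z| > 2 test), which may not match the process exactly.

```python
def mandelbrot_counts(points, iterations):
    """points is a list of (x0, y0) pairs; returns one iteration count per point."""
    # One record per point: the constant c = (x0, y0), the current z = (x, y),
    # and n, the number of steps the point survived without escaping.
    state = [{"x0": x0, "y0": y0, "x": 0.0, "y": 0.0, "n": 0} for x0, y0 in points]
    active = state  # points that have not escaped yet

    for _ in range(iterations):
        still_active = []
        for p in active:
            x, y = p["x"], p["y"]
            # z -> z^2 + c, written out in real and imaginary parts
            p["x"] = x * x - y * y + p["x0"]
            p["y"] = 2 * x * y + p["y0"]
            if p["x"] ** 2 + p["y"] ** 2 <= 4.0:
                p["n"] += 1
                still_active.append(p)
        # Assumed filtering: escaped points are dropped from later passes.
        active = still_active
        if not active:
            break

    return [p["n"] for p in state]


print(mandelbrot_counts([(0.0, 0.0), (2.0, 2.0), (1.0, 0.0)], 50))  # [50, 0, 2]
```

The point (0, 0) never escapes, so it reaches the full 50 iterations, while the other two drop out early and cost nothing in later passes. Plotting the counts against x0 and y0 gives the picture.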

At the end of the loop operation, nothing is output from the "Loop" operator itself. The output from the main process is simply a "Recall" operator which uses the last example set that was worked on inside the loop operation.

By having nothing output from the loop operation, the memory impact of this process is reduced.

Sunday, 3 August 2014

New videos coming soon

I've created another set of videos. These are slightly more advanced and tend to combine more operators together to tell a story.

Here's a graphic using RapidMiner's advanced plotting capabilities that shows the video names and the main operators explained during the video.

They'll be available on the RapidMinerResources site very soon.

I plan to do some new ones over the next few months and the question is what do I choose?

My current candidate list is:

  • Groovy Dark Arts
  • Text Processing
  • Web Mining
  • Time Series in more detail
  • RapidMiner Server

Each would translate to between 10 and 20 videos.

To help me decide which one I will do next, I'd be happy to get feedback. So please leave a comment and it will certainly help me.

Edit: I took the liberty of doing a mini survey at the RapidMiner World conference. The results are shown here.

I'll take notice of this and give some focus to Text Mining and RapidMiner Server.

Saturday, 5 July 2014

Copy your license before doing a RapidMiner Studio upgrade

Edit: just successfully downloaded 6.0.008 without a problem so I'll delete this in a while.

Before you install the new version (6.0.006), be sure to copy your license key.

This is accessed by going to Help->Manage Licenses->Enter License.

Copy the text there into a suitable safe place.

When you install the latest RapidMiner Studio version, a nag screen will appear. You can get past it by entering the license text you carefully saved.

It turns out that if you run a stupid version of Internet Explorer (for me 8.* - I have no choice in this) then the license key does not show up on the RapidMiner site. This had the brief and tiresome side effect of locking me into a loop of despair.

With luck, this is a "swivel eyed mad feature" aka a "bug". If it is, I'll delete this post.