Wednesday, 8 June 2022

Fetching stock data using a parameterised Execute R operator

I'm currently delivering data science lectures at the University of Chichester and RapidMiner is part of what I use to teach. And very good it is too. I recently found myself helping my students to get some up-to-date stock market data. Rather than downloading this manually, I thought I would use RapidMiner with the tidyquant R package and do it automatically. The Finance and Economics Extension seems to be out of date so isn't an option.

My idea was to define a list of stock symbols such as "AAPL", "BTC-USD" and so on and run the Execute R operator in a loop, once for each symbol.

It turns out there isn't a way to parameterise the Execute R operator so I had to invent one.

Basically, I use the Loop Parameters operator to set multiple values for a macro located inside it. This macro is used to create a one row example set with the value of the macro. This example set is then passed to the Execute R operator where the R script uses it as a parameter to drive the rest of the script. It's clunky but it works.
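To make the data flow concrete, here is a minimal sketch of the same pattern in plain Python rather than R. It is only a stand-in for the RapidMiner process, not the process itself: every function name is hypothetical, the "example set" is just a list of dictionaries, and the download step is a stub that returns placeholder rows instead of calling tidyquant.

```python
# Illustrative stand-in for the RapidMiner process; all names are hypothetical.


def make_parameter_table(symbol):
    """Build the one-row 'example set' that carries the macro value."""
    return [{"symbol": symbol}]


def run_script(parameter_table, fetch):
    """Play the role of the script inside Execute R.

    The script takes no parameters of its own, so it reads the symbol
    from the first row of its input table and uses it to drive the fetch.
    """
    symbol = parameter_table[0]["symbol"]
    return fetch(symbol)


def fake_fetch(symbol):
    """Stub for the real download: returns placeholder rows, not prices."""
    return [{"symbol": symbol, "row": i} for i in range(2)]


def loop_over_symbols(symbols, fetch=fake_fetch):
    """Play the role of Loop Parameters: one script run per macro value."""
    results = []
    for symbol in symbols:
        results.extend(run_script(make_parameter_table(symbol), fetch))
    return results


rows = loop_over_symbols(["AAPL", "BTC-USD"])
print(len(rows))
```

The point of the sketch is that the script never receives the symbol directly. It only ever sees a table, and the first row of that table does the job a parameter would normally do.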

This approach could be adapted to allow R scripts to be run as part of a more complex modelling process. Relatively tough to do but feasible.


Here's a link to the process.

You'll need the R Scripting extension and you will also need to ensure that R is running on your machine with the data.table, tidyverse and tidyquant R packages all installed.

If RapidMiner enhances the Execute R operator to take parameters (which would be a good enhancement), then this workaround will no longer be needed.


Wednesday, 25 May 2022

Parties at 10 Downing Street in 33 words

The Sue Gray report was published today. I made a word cloud of some of the more frequent words to try and summarise what it's about. 

This one uses 33 words and seems to do a reasonable job.

Sunday, 17 April 2022

Ministerial Directions - improving a misleading graphic.

In the UK, the Civil Service has the job of implementing Government policy. They do this in a non-political way. They do have a duty to advise and if they conclude that a policy is unworkable from a number of perspectives, they can request a ministerial direction that transfers the liability from the civil service to the government minister. This recently happened with the UK's new proposal to ship asylum seekers who arrive illegally by boat across the English Channel to Rwanda. There was seemingly some doubt that the policy would save money.

This has led to some media coverage about how frequently these ministerial directions happen. One graphic in particular shows these since the time of John Major.


The original for this is here.

This graphic is confusing because the eye is drawn to a trend which could give the impression that the number of these directions is decreasing.

I spent time counting the number of interventions (yes, I did it manually because I couldn't find the source data). These are the raw numbers for each prime minister.

Major    13

Blair    20

Brown    17

Cameron    10

May    10

Johnson    18

The actual numbers are not interesting by themselves. It's vital to normalise the number of directions by the length of time each prime minister spent in office. If we do this, we get the following interesting graphic.

This is interesting because it shows a recent increasing trend and shows that the Johnson premiership has the highest number of directions per year. Prime Minister Brown would no doubt argue that the global financial crash during his time contributed to the large number for him. Prime Minister May would no doubt point to Brexit and Johnson would point both to Brexit and Covid. It is somewhat concerning, however, that the UK is implementing policies that may not represent value for money.
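The normalisation described above can be sketched in a few lines of Python. This is not the code from the repository linked at the end of this post, just a minimal illustration. The counts are the hand-counted figures listed earlier. The term dates are my assumption: they are the full dates in office for each prime minister, with Johnson's term cut off at the date of this post, and the original graphic may cover a slightly different window.

```python
from datetime import date

# Direction counts per prime minister, as counted by hand in the post.
directions = {
    "Major": 13,
    "Blair": 20,
    "Brown": 17,
    "Cameron": 10,
    "May": 10,
    "Johnson": 18,
}

# ASSUMPTION: full terms of office; Johnson's term is cut off at the
# date of the post (17 April 2022) because he was still in office.
CUTOFF = date(2022, 4, 17)
terms = {
    "Major": (date(1990, 11, 28), date(1997, 5, 2)),
    "Blair": (date(1997, 5, 2), date(2007, 6, 27)),
    "Brown": (date(2007, 6, 27), date(2010, 5, 11)),
    "Cameron": (date(2010, 5, 11), date(2016, 7, 13)),
    "May": (date(2016, 7, 13), date(2019, 7, 24)),
    "Johnson": (date(2019, 7, 24), CUTOFF),
}


def directions_per_year(count, start, end):
    """Divide a count by the length of the term in (average) years."""
    years = (end - start).days / 365.25
    return count / years


rates = {
    pm: directions_per_year(directions[pm], *terms[pm]) for pm in directions
}

for pm, rate in rates.items():
    print(f"{pm:<8} {rate:.2f} directions per year")
```

Under these assumptions Johnson comes out with the highest rate, which agrees with the graphic. A different counting window would change the exact values.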

It's ironic that the ministers who make the directions will be out of office and will not face any sanction if the policy does indeed fail at some distant point in the future. There is no check or balance that could deter such risky decisions or stop ministers ignoring advice.

Reproducible research is important and you can find all the data and code on GitHub here.