In response to a request contained in a comment for this post, I've modified the process to count total words and unique words for multiple files.
It does this by using the Loop Files operator to iterate over all the files in a folder.
The Loop Files operator outputs a collection of example sets, one per file, and the Append operator joins them into a single example set.
Inside the Loop operator, the Read Document operator reads the current file and converts it into a document.
Total words and unique words are counted as before, and the final operator adds an attribute holding the file name, which it takes from the macro provided by the enclosing Loop Files operator.
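For readers who want to see the same logic outside RapidMiner, here is a rough Python sketch of what the process does: loop over the files in a folder, count total and unique words in each, and tag each row with the file name. It is not the RapidMiner process itself. The function names, the tokenisation rule and the glob-style `pattern` argument are all assumptions made for illustration, and the pattern may not behave the same way as the filter parameter of the Loop Files operator.

```python
import re
from pathlib import Path


def count_words(text):
    """Return (total_words, unique_words) for a piece of text."""
    # Assumption: a word is a run of letters, digits or apostrophes,
    # compared case-insensitively. The RapidMiner tokenizer may differ.
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    return len(words), len(set(words))


def count_words_in_folder(directory, pattern="*.txt"):
    """Build one row per matching file, roughly like Loop Files plus Append."""
    rows = []
    for path in sorted(Path(directory).glob(pattern)):
        # Read the current file, as Read Document does inside the loop.
        text = path.read_text(encoding="utf-8", errors="replace")
        total, unique = count_words(text)
        # Add the file name to the row, as the final operator does
        # using the macro from the loop.
        rows.append(
            {"file": path.name, "total_words": total, "unique_words": unique}
        )
    return rows
```

Calling `count_words_in_folder("some/folder")` returns a list with one dictionary per file, which plays the role of the single appended example set.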
An example result looks like this.
Download the process here and set the directory and filter parameters of the Loop Files operator to the location you want.
Monday, 15 April 2013
Thanks Andrew for this new post.
How can we do arithmetic operations on the results?
For example: how to compute (uniquewords / totalwords) and get the result value for each document?
Dev
Hello Dev
You can use the Generate Attributes operator to calculate these. This operator will automatically calculate the value for each example in the example set.
regards
Andrew
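As a rough illustration of the per-example calculation described in the reply above, here is a Python sketch that adds a ratio column to each row. It is not the Generate Attributes operator. The function name `add_unique_ratio` and the column names `total_words`, `unique_words` and `unique_ratio` are hypothetical and chosen only for this example.

```python
def add_unique_ratio(rows):
    """Return new rows with unique_words / total_words added to each one."""
    result = []
    for row in rows:
        total = row["total_words"]
        # Guard against empty documents to avoid dividing by zero.
        ratio = row["unique_words"] / total if total else 0.0
        # Copy the row so the input list is left unchanged.
        result.append({**row, "unique_ratio": ratio})
    return result


# Illustrative input only; these numbers are made up.
sample = [{"file": "a.txt", "total_words": 4, "unique_words": 2}]
print(add_unique_ratio(sample))
```

The calculation is applied once per row, which mirrors how the value is produced for each example in the example set.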
Exactly what I was looking for, thanks Andrew.
Glad it helped...