Wednesday 22 November 2017

Enriching flowfiles in Apache NiFi using Mongo

I use NiFi a lot for scalable processing of large data flows. NiFi is powerful, but there are frustrations with what should be the simplest of activities. The fact that there are such foibles does betray a slight lack of maturity in the product.

Anyway, to help the World, I will describe what I had to do to solve the common problem of enriching JSON flow files containing some form of id with a looked up text value.

For example, if there is a small fragment of JSON like so...

{
    "Id": 7
}

and we want to enrich it by looking up the value 7 and adding a new field Name with the value Me, like so...

{
    "Id": 7, "Name": "Me"
}
   
then we can do this in many ways using various lookup services. The way I had to choose involved reading from a Mongo database containing the Id and Name value pairs, and I found most of what I needed here

The issue I found was that the type of the Id field in the Mongo database must be a string for everything to work correctly. Naively storing it as an integer causes the matching to fail outright (rather than mismatch, which is another issue), and there seems to be no way to work around this in the NiFi operator parameter settings; it always assumes a string when querying.
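To make the type problem concrete, here is a minimal Python sketch of the behaviour described above. It does not use NiFi or a real Mongo database: the in-memory list stands in for the lookup collection, and the function names (find_name, enrich, stringify_ids) are hypothetical ones made up for illustration. The strict type check is a simplification of how Mongo equality treats a string versus a number.

```python
import json


def find_name(lookup_docs, id_value):
    """Return the Name of the first document whose Id matches id_value.

    Simplified stand-in for a Mongo equality query: a string never
    matches a number, so "7" does not match 7.
    """
    for doc in lookup_docs:
        stored = doc["Id"]
        if type(stored) is type(id_value) and stored == id_value:
            return doc["Name"]
    return None


def enrich(flowfile_json, lookup_docs):
    """Add a Name field to the JSON record if the lookup finds one."""
    record = json.loads(flowfile_json)
    # Assumption taken from the text above: the lookup key is always
    # sent as a string, whatever its type in the flow file.
    name = find_name(lookup_docs, str(record["Id"]))
    if name is not None:
        record["Name"] = name
    return record


def stringify_ids(lookup_docs):
    """Return copies of the lookup documents with Id stored as a string."""
    return [{**doc, "Id": str(doc["Id"])} for doc in lookup_docs]


int_docs = [{"Id": 7, "Name": "Me"}]   # Id stored as an integer
str_docs = stringify_ids(int_docs)     # Id stored as a string

print(enrich('{"Id": 7}', int_docs))   # {'Id': 7}  (no match, no enrichment)
print(enrich('{"Id": 7}', str_docs))   # {'Id': 7, 'Name': 'Me'}
```

In other words, the fix sits on the database side: store the Id values as strings before pointing the lookup at the collection.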

So, it's a bit of a rough edge, although in NiFi's defence, the feature I am using is quite new. Hopefully, this mini blog entry will help.

