Details
Type: Improvement
Status: Resolved
Priority: Major
Resolution: Fixed
1.6.0, 1.7.0, 1.7.1
Description
The HashAttribute processor currently has surprising behavior. A user unfamiliar with the processor would expect HashAttribute to generate a hash value over one or more attributes. Instead, the processor as implemented sorts incoming flowfiles into groups based on regular expressions that match attribute values, and then generates a (non-configurable) MD5 hash over the concatenation of the matching attribute keys and values.
In addition:
- the processor throws an error and routes to failure any incoming flowfile that does not have all of the attributes specified in the processor
- MD5 is widely deprecated
- no other hash algorithms are available
I am unaware of community use of this processor, but I do not want to break backward compatibility. I propose the following steps:
- Implement a new CalculateAttributeHash processor (an awkward name, but the existing processor already holds the desired one)
  - This processor will perform the "standard" use case: identify an attribute, calculate the specified hash over its value, and write the result to an output attribute
  - This processor will have a required property descriptor offering a dropdown menu of valid hash algorithms
  - This processor will accept arbitrary dynamic properties, each with the name of the attribute to be hashed as its key and the name of the resulting attribute as its value
  - Example: I want to generate a SHA-512 hash of the attribute username, and a flowfile enters the processor with username value alopresto. I set the algorithm to SHA-512 and add a dynamic property with key username and value username_SHA512. The resulting flowfile will have the attribute username_SHA512 with value 739b4f6722fb5de20125751c7a1a358b2a7eb8f07e530e4bf18561fbff93234908aa9d2577770c876bca9ede5ba784d5ce6081dbbdfe5ddd446678f223b8d632
- Improve the documentation of the existing HashAttribute processor to explain its goal/expected use case
  - Link in its documentation to the new processor for standard use cases
- Remove the error alert when an incoming flowfile does not contain all expected attributes. I propose changing the severity to INFO and still routing to failure
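
The "standard" use case proposed for the new processor can be sketched in plain Java, outside of the NiFi API. This is only an illustration under stated assumptions, not the eventual implementation: the class and method names (AttributeHashSketch, hashValue, hashAttributes) are hypothetical, attribute values are assumed to be hashed as UTF-8 bytes and hex-encoded in lowercase, and attributes missing from the flowfile are assumed to be skipped rather than treated as a failure.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch; not part of the NiFi codebase.
public class AttributeHashSketch {

    // Hash the UTF-8 bytes of the value (encoding is an assumption) and
    // return the digest as a lowercase hex string.
    public static String hashValue(String algorithm, String value)
            throws NoSuchAlgorithmException {
        MessageDigest digest = MessageDigest.getInstance(algorithm);
        byte[] hashed = digest.digest(value.getBytes(StandardCharsets.UTF_8));
        StringBuilder hex = new StringBuilder(hashed.length * 2);
        for (byte b : hashed) {
            hex.append(String.format("%02x", b));
        }
        return hex.toString();
    }

    // dynamicProperties maps "attribute to hash" -> "output attribute name",
    // mirroring the dynamic properties described in the ticket.
    public static Map<String, String> hashAttributes(
            String algorithm,
            Map<String, String> dynamicProperties,
            Map<String, String> attributes) throws NoSuchAlgorithmException {
        Map<String, String> result = new HashMap<>(attributes);
        for (Map.Entry<String, String> property : dynamicProperties.entrySet()) {
            String value = attributes.get(property.getKey());
            if (value != null) {
                // Assumption: a missing attribute is skipped silently.
                result.put(property.getValue(), hashValue(algorithm, value));
            }
        }
        return result;
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> dynamicProperties = new LinkedHashMap<>();
        dynamicProperties.put("username", "username_SHA512");

        Map<String, String> attributes = new HashMap<>();
        attributes.put("username", "alopresto");

        Map<String, String> updated =
                hashAttributes("SHA-512", dynamicProperties, attributes);
        // Prints a 128-character hex string for the username attribute.
        System.out.println(updated.get("username_SHA512"));
    }
}
```

The algorithm is passed as a string here so that it can stand in for the dropdown of valid hash algorithms; any name accepted by `MessageDigest.getInstance` on the running JVM would work in this sketch.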
Issue Links
- is depended upon by NIFI-5582 Integrate legacy behavior of HashAttribute into CryptographicHashAttribute (Resolved)