Friday, February 22, 2019

Apache NiFi - Data Provenance

Apache NiFi logs and stores information about every event that occurs on the data ingested into a flow. The data provenance repository holds this information, and the NiFi UI lets you search it. Data provenance can be viewed for the whole NiFi instance or for an individual processor.
[Image: Data Provenance]
The NiFi Data Provenance event list has the following fields:
S.No. | Field Name | Description
1 | Date/Time | Date and time of the event.
2 | Type | Type of event, such as 'CREATE'.
3 | FlowFileUuid | UUID of the flowfile on which the event was performed.
4 | Size | Size of the flowfile.
5 | Component Name | Name of the component that performed the event.
6 | Component Type | Type of the component.
7 | Show lineage | The last column has the show lineage icon, which is used to see the flowfile lineage, as shown in the image below.
[Image: Lineage Icon]
To get more information about the event, a user can click on the information icon present in the first column of the NiFi Data Provenance UI.
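As a rough illustration of how the fields in the event list fit together, the sketch below models one row of the list as a Python record and collects the events for a single flowfile in time order. The class name `ProvenanceEventRow`, the helper `events_for_flowfile`, and all sample values are hypothetical and are not part of NiFi; this is only a way to picture the data, not how NiFi stores it.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class ProvenanceEventRow:
    """Hypothetical model of one row in the provenance event list."""
    date_time: datetime      # Date/Time
    event_type: str          # Type, e.g. 'CREATE'
    flowfile_uuid: str       # FlowFileUuid
    size_bytes: int          # Size
    component_name: str      # Component Name
    component_type: str      # Component Type


def events_for_flowfile(rows, flowfile_uuid):
    """Return the rows recorded for one flowfile, oldest first."""
    matching = [r for r in rows if r.flowfile_uuid == flowfile_uuid]
    return sorted(matching, key=lambda r: r.date_time)


# Illustrative sample data only; the UUIDs and names are made up.
rows = [
    ProvenanceEventRow(datetime(2019, 2, 22, 10, 0, 5), "SEND",
                       "uuid-a", 1024, "Write output", "PutFile"),
    ProvenanceEventRow(datetime(2019, 2, 22, 10, 0, 1), "CREATE",
                       "uuid-a", 1024, "Make test data", "GenerateFlowFile"),
    ProvenanceEventRow(datetime(2019, 2, 22, 10, 0, 2), "CREATE",
                       "uuid-b", 2048, "Make test data", "GenerateFlowFile"),
]

history = events_for_flowfile(rows, "uuid-a")
for row in history:
    print(row.date_time, row.event_type, row.component_name)
```

Ordering the events of one flowfile by time is roughly what the lineage view presents graphically.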
The nifi.properties file contains several properties that are used to manage the NiFi Data Provenance repository.
S.No. | Property Name | Default Value | Description
1 | nifi.provenance.repository.directory.default | ./provenance_repository | Specifies the default path of the NiFi data provenance repository.
2 | nifi.provenance.repository.max.storage.time | 24 hours | Specifies the maximum retention time of NiFi data provenance.
3 | nifi.provenance.repository.max.storage.size | 1 GB | Specifies the maximum storage of NiFi data provenance.
4 | nifi.provenance.repository.rollover.time | 30 secs | Specifies the rollover time of NiFi data provenance.
5 | nifi.provenance.repository.rollover.size | 100 MB | Specifies the rollover size of NiFi data provenance.
6 | nifi.provenance.repository.indexed.fields | EventType, FlowFileUUID, Filename, ProcessorID, Relationship | Specifies the fields used to search and index NiFi data provenance.
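To show how these settings look in practice, here is a minimal Python sketch that reads the provenance entries from nifi.properties-style text. The sample text uses the default values listed above; the function name `read_provenance_settings` is hypothetical, and the sketch assumes the plain `key=value` layout of the file. It only inspects the settings and does not change how NiFi behaves.

```python
# Sample excerpt in nifi.properties style, using the defaults from the table above.
SAMPLE_PROPERTIES = """\
# Provenance repository settings
nifi.provenance.repository.directory.default=./provenance_repository
nifi.provenance.repository.max.storage.time=24 hours
nifi.provenance.repository.max.storage.size=1 GB
nifi.provenance.repository.rollover.time=30 secs
nifi.provenance.repository.rollover.size=100 MB
nifi.provenance.repository.indexed.fields=EventType, FlowFileUUID, Filename, ProcessorID, Relationship
"""

PREFIX = "nifi.provenance.repository."


def read_provenance_settings(text):
    """Return provenance settings as a dict keyed by the name after the prefix."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        key = key.strip()
        if sep and key.startswith(PREFIX):
            settings[key[len(PREFIX):]] = value.strip()
    return settings


settings = read_provenance_settings(SAMPLE_PROPERTIES)

# The indexed fields are a comma-separated list; split them for inspection.
indexed_fields = [f.strip() for f in settings["indexed.fields"].split(",")]

for name, value in settings.items():
    print(f"{name} = {value}")
```

To check a real installation, the same function could be given the contents of the nifi.properties file from the NiFi conf directory instead of the sample text.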
