Moving the data files from EMS to Apache NiFi

Create a simple NiFi flow that monitors a folder for files and copies them to a different folder. This NiFi flow must be created to transfer data files from EMS to the spool directory in NiFi.

About this task

The data flow uses NiFi processors to transfer data files from EMS to the /spool/packs/<pack_name>/in directory in NiFi. The following processors are needed to create the data flow:
  • GetSFTP
    The GetSFTP processor fetches files from an SFTP server and creates FlowFiles from them. If the source data files are available on another server, fetch them over SFTP and write them to the NiFi server by using the PutFile processor.
    Note: You can also use the GetFTP processor.
  • UpdateAttribute

    The UpdateAttribute processor updates the attributes of a FlowFile by using the properties or rules that are added by the user. It updates the attributes for a FlowFile by using the Attribute Expression Language or deletes the attributes based on a regular expression.

  • PutFile

    The PutFile processor is used to store the file from the data flow to the spool directory of the pack.
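
The roles of the three processors can be pictured with a small Python model. This is only a sketch and does not use any NiFi API: the FlowFile class and the get_files, update_attribute, and put_file functions are hypothetical names, and a local directory stands in for the remote SFTP server.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path


@dataclass
class FlowFile:
    """Toy stand-in for a NiFi FlowFile: content plus string attributes."""
    content: bytes
    attributes: dict = field(default_factory=dict)


def get_files(source_dir: Path) -> list:
    """Plays the GetSFTP role: one FlowFile per file found under source_dir."""
    flowfiles = []
    for f in sorted(source_dir.rglob("*")):
        if f.is_file():
            rel_dir = f.parent.relative_to(source_dir).as_posix()
            attrs = {"filename": f.name, "path": rel_dir}
            flowfiles.append(FlowFile(f.read_bytes(), attrs))
    return flowfiles


def update_attribute(flowfile: FlowFile, **new_attributes) -> FlowFile:
    """Plays the UpdateAttribute role: set or overwrite attributes."""
    flowfile.attributes.update(new_attributes)
    return flowfile


def put_file(flowfile: FlowFile, target_dir: Path) -> Path:
    """Plays the PutFile role: write the content under its filename attribute."""
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / flowfile.attributes["filename"]
    target.write_bytes(flowfile.content)
    return target


def demo() -> list:
    """Runs the three stages against temporary directories (assumed layout)."""
    with tempfile.TemporaryDirectory() as tmp:
        source = Path(tmp) / "remote"
        spool_in = Path(tmp) / "spool" / "in"
        source.mkdir()
        (source / "data.xml").write_text("<rows/>")
        for ff in get_files(source):
            update_attribute(ff, filename="ems_" + ff.attributes["filename"])
            put_file(ff, spool_in)
        return sorted(p.name for p in spool_in.iterdir())


if __name__ == "__main__":
    print(demo())
```

The point of the model is that the file name used by the last stage comes from an attribute, so a stage in the middle can change where and under what name the content is written without touching the content itself.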

Two scenarios are available to transfer the data files from EMS to NiFi:
  • Scenario 1

    For most of the wireline Technology Packs, the remote data files are transferred to the NiFi server location with the GetSFTP and PutFile processors.

  • Scenario 2

    For most of the wireless Technology Packs, the remote data files are arranged in multiple folders and are transferred to the NiFi server location with the GetSFTP, UpdateAttribute, and PutFile processors. An example is the ACME Packet Net-Net 9200 HDR v1.0.0 pack.

Procedure

Moving the data files from one server to another NiFi server in scenario 1

  • Drag the processor icon to the NiFi canvas and select GetFTP or GetSFTP processor from the list.
  • Right-click the processor, select Configure, and on the Properties tab provide values for the following properties:
    Property                     Value
    Hostname                     Hostname of the server where the data files are available.
    Username                     Username to access the host server.
    Password                     Password to access the host server.
    Transfer Mode                ASCII for XML files; Binary for .gz files.
    Remote Path                  Path to the location where the data files are available. Make sure that you can SFTP to this location.
    Properties that are applicable for the GetSFTP processor only:
    Private Key Path             The fully qualified path to the private key file.
    Private Key Passphrase       Password for the private key.
    Host Key File                If you provide this value, the file is used as the host key file. Otherwise, no host key file is used.
    Strict Host Key Checking     Indicates whether strict enforcement of host keys must be applied.
    Send Keep Alive On Timeout   Indicates whether to send a single Keep Alive message when the SSH socket times out.
  • Optional: To prevent large files from being transferred to the NiFi input directory for processing before they are copied completely, configure the following additional properties in the GetSFTP processor:
    Property                     Value
    Polling Interval             Determines how long to wait between fetches of new files from the remote location to the NiFi input location. The default is 60 seconds. You might want to increase the value if you can determine how long it takes to transfer the larger data files from the remote server to the NiFi input location.
    Ignore Dotted Files          Make sure that this property is set to true. Files whose names start with a dot (.) are considered hidden files and are not transferred for processing.
    Note: Rename large files that might take time to transfer from the remote server to the NiFi input location so that they have a dot (.) prefix.
  • Click Apply and go back to canvas.
  • Drag the processor icon to the NiFi canvas and select PutFile processor from the list.
  • Right-click the processor, select Configure, and on the Properties tab set the Directory property to the location of the input directory.
  • Click GetSFTP processor and drag to PutFile processor to connect both of them.
  • Start both the processors.
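
The dot-prefix note in the optional step can be illustrated with a short Python sketch. It assumes one common way of using the convention: the process that writes a large file on the remote server writes it under a dotted name and removes the dot only when the copy is complete. The function names are hypothetical, the directory listing only imitates what Ignore Dotted Files does, and no NiFi code is involved.

```python
import os
import tempfile
from pathlib import Path


def visible_files(directory: Path) -> list:
    """Imitates a fetch with Ignore Dotted Files set to true."""
    return sorted(
        p.name
        for p in directory.iterdir()
        if p.is_file() and not p.name.startswith(".")
    )


def start_copy(directory: Path, name: str, data: bytes) -> Path:
    """Writes the data under a dot-prefixed name so that it stays hidden."""
    hidden = directory / ("." + name)
    hidden.write_bytes(data)
    return hidden


def finish_copy(hidden: Path) -> Path:
    """Removes the dot prefix once the copy is complete."""
    final = hidden.with_name(hidden.name[1:])
    os.replace(hidden, final)
    return final


def demo() -> tuple:
    """Returns what a fetch would see during and after the copy."""
    with tempfile.TemporaryDirectory() as tmp:
        remote = Path(tmp)
        hidden = start_copy(remote, "big.csv.gz", b"payload")
        during = visible_files(remote)
        finish_copy(hidden)
        after = visible_files(remote)
        return during, after


if __name__ == "__main__":
    print(demo())
```

While the file still carries the dot prefix, the listing is empty, so a poll that runs in the middle of a long copy picks up nothing. After the rename, the next poll sees the complete file.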

Moving the data files from one server to another NiFi server in scenario 2

  • Drag the processor icon to the NiFi canvas and select GetSFTP processor from the list.
  • Right-click the processor, select Configure, and on the Properties tab provide values for the following properties:
    Property                     Value
    Hostname                     Hostname of the server where the data files are available.
    Username                     Username to access the host server.
    Password                     Password to access the host server.
    Transfer Mode                ASCII for XML files; Binary for .gz files.
    Remote Path                  Path to the location where the data files are available in many subfolders. Make sure that you can SFTP to this location. For example:
                                 • /ems_output/card/1590662105.csv
                                 • /ems_output/session-realm/1590662105.csv
                                 • /ems_output/system/1590662105.csv
  • Optional: To prevent large files from being transferred to the NiFi input directory for processing before they are copied completely, configure the following additional properties in the GetSFTP processor:
    Property                     Value
    Polling Interval             Determines how long to wait between fetches of new files from the remote location to the NiFi input location. The default is 60 seconds. You might want to increase the value if you can determine how long it takes to transfer the larger data files from the remote server to the NiFi input location.
    Ignore Dotted Files          Make sure that this property is set to true. Files whose names start with a dot (.) are considered hidden files and are not transferred for processing.
    Note: Rename large files that might take time to transfer from the remote server to the NiFi input location so that they have a dot (.) prefix.
  • Click Apply and go back to canvas.
  • Drag the processor icon to the NiFi canvas and select UpdateAttribute processor from the list.
  • Right-click the processor, select Configure, and open the Properties tab. Click the Add Property icon to add the following properties and their values:
    Property                     Value
    filename                     ${filename:prepend(${path:replace('/','_')})}
                                 This expression replaces each / in the data file path with _. For example:
                                 • ems_output_card_1590662105.csv
                                 • ems_output_session-realm_1590662105.csv
                                 • ems_output_system_1590662105.csv
    GetSFTP.remote.source        The remote server where the data files are located. For example, localhost.
    path                         Path where the data files are available. For example:
                                 • /ems_output/card/1590662105.csv
                                 • /ems_output/session-realm/1590662105.csv
                                 • /ems_output/system/1590662105.csv
  • Click Apply and go back to canvas.
  • Click the GetSFTP processor and drag to the UpdateAttribute processor to connect both of them.
  • Drag the processor icon to the NiFi canvas and select PutFile processor from the list.
  • Right-click the processor, select Configure, and on the Properties tab set the Directory property to the location of the input directory.

    For example, /spool/packs/<pack_name>/in. After the data file is processed by NiFi from the /spool/packs/<pack_name>/in directory, the file is moved to /spool/packs/<pack_name>/done directory.

  • Click the UpdateAttribute processor and drag to the PutFile processor to connect both of them.
  • Start all the processors.
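
The effect of the filename expression in the UpdateAttribute step can be approximated in Python. This sketch only mirrors the two string operations in the expression (replace on path, then prepend to filename); it is not how NiFi evaluates the Expression Language. The flatten_name function is a hypothetical name, and the sketch assumes that the path attribute holds a relative directory with a trailing slash, such as ems_output/card/, which is the form that reproduces the example names in the procedure.

```python
def flatten_name(path: str, filename: str) -> str:
    """Mirrors ${filename:prepend(${path:replace('/','_')})} on plain strings."""
    return path.replace("/", "_") + filename


if __name__ == "__main__":
    # Assumed attribute values; the actual value of path depends on the processor.
    for directory in ("ems_output/card/", "ems_output/session-realm/", "ems_output/system/"):
        print(flatten_name(directory, "1590662105.csv"))
```

Because the three source files share the name 1590662105.csv, writing them to one directory without this step would leave only one name for three files. Folding the folder names into the file name keeps them distinct in /spool/packs/<pack_name>/in. Under the same string logic, a path value with a leading slash would produce a name that starts with an underscore, so it is worth checking the actual attribute value in the flow before relying on the exact result.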