Use change data capture to sync flat files from FTP/SFTP servers to any data sink and transform them on the way.
At startup, the connector extracts data from all matching files in the given directory. After this initial sync, it watches the directory for new or updated files and syncs only the relevant changes.
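The initial sync followed by watching for changes can be pictured with a small polling sketch. It is an illustration only, not DataCater's implementation: the functions `snapshot` and `changed_files` are hypothetical, a local directory stands in for the FTP/SFTP server, and a file is assumed to count as updated when its size or modification time differs from the previous snapshot.

```python
import os
import re


def snapshot(directory, name_pattern=".*"):
    """Map each matching file name to a (size, mtime) fingerprint."""
    regex = re.compile(name_pattern)
    state = {}
    for entry in os.scandir(directory):
        if entry.is_file() and regex.fullmatch(entry.name):
            info = entry.stat()
            state[entry.name] = (info.st_size, info.st_mtime_ns)
    return state


def changed_files(previous, current):
    """Return the names that are new or whose fingerprint changed, sorted."""
    return sorted(
        name
        for name, fingerprint in current.items()
        if previous.get(name) != fingerprint
    )
```

Comparing against an empty previous state, as in `changed_files({}, snapshot(path))`, returns every matching file, which corresponds to the initial sync. Later calls with the last snapshot return only new or modified files.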
This source connector supports the following configuration options:
Choose between FTP and SFTP.
Hostname or IP
The hostname or IP address of the FTP/SFTP server.
The port of the FTP/SFTP server. By default, FTP uses port 21 and SFTP uses port 22.
Only available for the protocol SFTP. Choose between password-based and key-based authentication.
The username to use for authenticating with the FTP/SFTP server.
Only available for the protocol FTP. The password to use for authenticating with the FTP server.
Private SSH key
Only available for the protocol SFTP. The SSH key to use for authenticating with the SFTP server.
The SSH key must be provided in the RSA format. OpenSSH keys must first be converted to the RSA format before they are provided to DataCater.
You can choose between two ways of defining when DataCater should extract data from the FTP/SFTP server:
Sync in fixed second intervals: DataCater extracts data every X seconds (X is defined using the configuration option Sync interval).
CRON expression: DataCater extracts data at the times given by the CRON expression specified in the configuration option Sync interval.
Depending on the option Sync mode, you specify either the number of seconds or the CRON expression. By default, DataCater extracts data every hour in both modes; the default CRON expression is 0 */1 * * ?.
The directory on the FTP/SFTP server, from which DataCater should extract files.
File name filter
Regular expression applied to files from the working directory. Only files with a name matching the regular expression will be extracted. Default value: .* (matches all file names).
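To illustrate how such a filter behaves, the sketch below applies a regular expression to a list of file names. It is written in Python and assumes that the whole file name must match. The connector's regular-expression dialect and matching semantics may differ, and the helper `filter_file_names` and the sample file names are hypothetical.

```python
import re


def filter_file_names(names, pattern=".*"):
    """Keep only the names that match the pattern in full."""
    regex = re.compile(pattern)
    return [name for name in names if regex.fullmatch(name)]


names = ["orders_2021.csv", "orders_2021.xml", "readme.txt"]
print(filter_file_names(names))                      # default keeps every name
print(filter_file_names(names, r"orders_.*\.csv"))   # only the CSV export
```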
The format of the extracted files. Choose between
CSV delimiter value
Only available for the file type CSV. The character that delimits different columns (default:
Generate attribute names from CSV header row
Only available for the file type CSV. Whether to use the first row of the CSV file for extracting attribute names or not. If this option is set to false, DataCater will generate attribute names based on the index of the attribute, and name them
XPath root node
Only available for the file type XML. The XPath to the node holding the record nodes. (default:
Name of the attribute holding the record key
Name of the attribute that can act as a primary key. Please make sure that this column is never
DataCater automatically extends the set of attributes with the attribute __datacater_file_name and fills it with the name of the file.
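The following sketch shows how the CSV options and the added file name attribute could fit together. It is a rough illustration in Python and not the connector's actual code: the function `read_csv_records` is hypothetical, and the index-based names of the form `column_1` are an assumption, since the naming scheme DataCater uses is not stated here.

```python
import csv
import io


def read_csv_records(text, file_name, delimiter, use_header=True):
    """Parse CSV text into dicts and tag each record with its source file."""
    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))
    if not rows:
        return []
    if use_header:
        # The first row supplies the attribute names.
        names, data = rows[0], rows[1:]
    else:
        # Hypothetical index-based names; the connector's scheme may differ.
        names = [f"column_{i + 1}" for i in range(len(rows[0]))]
        data = rows
    records = []
    for row in data:
        record = dict(zip(names, row))
        record["__datacater_file_name"] = file_name
        records.append(record)
    return records
```

For example, `read_csv_records("id;name\n1;Ada\n", "people.csv", ";")` yields one record with the attributes `id`, `name`, and `__datacater_file_name`.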