DataCater supports comma-separated values (CSV) files as a data source. Users can upload CSV files using the UI of DataCater. DataCater takes care of parsing the CSV files and publishing the extracted records to a data pipeline.
Once a data pipeline has been built for a certain CSV file, users can upload additional files of the same structure, i.e., files with the same number of values (or columns), which are then processed instantly by the pipeline.
When uploading a CSV file to create a new data source, DataCater tries to automatically detect parser settings, such as the delimiter character used. If needed, the configuration used for parsing the CSV file can be adjusted (see below).
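DataCater's detection logic is internal and not documented here; as an illustration of the idea, a similar auto-detection of the delimiter and of a header row can be sketched with Python's standard-library `csv.Sniffer` (the sample data below is made up):

```python
import csv
import io

# Hypothetical sample of an uploaded file; DataCater's actual
# detection logic is internal to the product.
sample = "id;name;city\n1;Alice;Berlin\n2;Bob;Hamburg\n"

sniffer = csv.Sniffer()

# sniff() inspects the sample and guesses a dialect, including the
# delimiter character; has_header() guesses whether the first row
# holds attribute names rather than data.
dialect = sniffer.sniff(sample, delimiters=",;\t|")
has_header = sniffer.has_header(sample)

print(dialect.delimiter)   # detected delimiter for this sample
print(has_header)          # whether a header row was detected

# The detected settings can then drive the actual parse.
rows = list(csv.reader(io.StringIO(sample), dialect))
```

Detection like this is a heuristic, which is why DataCater also lets users adjust the parser configuration manually.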
This source connector supports the following configuration options:
Column delimiter
The delimiter character that is used to separate different values (or columns) from each other (default:
Row delimiter
The delimiter characters that are used to separate different rows (or records) from each other (default:
Does the first row provide the attribute names?
If the first row of the uploaded CSV files holds the names of the different attributes (or columns), we may skip it when reading in data (default:
Number of rows to strip from the beginning
If the data part starts after multiple rows, we may skip the first n rows when reading in data (default: 0). This is often the case for CSV files that were generated by a spreadsheet application.
Escape character for special characters
The character used to escape special characters in the CSV file (leave empty if none is used, which is the default).
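How these options interact can be sketched with Python's standard `csv` module. The function name, parameter names, and defaults below are illustrative stand-ins for the connector options above, not DataCater's actual API:

```python
import csv
import io

def parse_csv(text, delimiter=",", first_row_is_header=True,
              skip_rows=0, escapechar=None):
    """Parse CSV text into a list of records (dicts).

    Mirrors the connector options described above: column delimiter,
    header-row handling, rows stripped from the beginning, and an
    optional escape character. Illustrative only.
    """
    reader = csv.reader(io.StringIO(text),
                        delimiter=delimiter, escapechar=escapechar)
    # Strip the first n rows before the data part begins.
    rows = list(reader)[skip_rows:]
    if first_row_is_header:
        # Use the first remaining row as attribute names.
        header, rows = rows[0], rows[1:]
    else:
        # Fall back to generated attribute names (naming is assumed).
        header = [f"column_{i}" for i in range(len(rows[0]))]
    return [dict(zip(header, row)) for row in rows]

# A file with one leading metadata row, as a spreadsheet might export.
text = "exported by spreadsheet\nid,name\n1,Alice\n2,Bob\n"
records = parse_csv(text, skip_rows=1)
```

Here `skip_rows=1` removes the metadata line, after which the header row `id,name` names the attributes of the two records.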
DataCater automatically extends the set of attributes with the attribute `__datacater_file_name` and fills it with the name of the uploaded CSV file.
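The effect of this extra attribute can be sketched as follows; the helper function and the sample record are hypothetical, while the attribute name `__datacater_file_name` is the one documented above:

```python
def add_file_name(records, file_name):
    """Attach the name of the uploaded file to every extracted record.

    Sketch only: DataCater does this internally; the documented part
    is the attribute name __datacater_file_name.
    """
    return [{**record, "__datacater_file_name": file_name}
            for record in records]

# Hypothetical record extracted from an uploaded file customers.csv.
records = [{"id": "1", "name": "Alice"}]
enriched = add_file_name(records, "customers.csv")
```

Each record keeps its original attributes and gains one extra attribute holding the file name, which is useful for tracing records back to their source file.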