When parsing a JSON file, DataCater tries to extract an array of JSON objects. Each JSON object is published as a record to the consuming data pipelines. By default, DataCater assumes that the array is located at the root level of the JSON file. Otherwise, you can provide a pointer to the location of the array within the hierarchy of the JSON file (see below).
Once a data pipeline has been built for a certain JSON file, users can upload additional files with the same structure and attributes.
DataCater extracts records from a JSON file by parsing an array of JSON objects. DataCater assumes that each entry of the array represents one record. DataCater automatically generates a data source schema based on the keys of the JSON objects using a predefined mapping of JSON data types to DataCater data types.
DataCater tolerates records that do not define every key (or attribute). However, if different records use different data types for the same key (or attribute), DataCater cannot generate a valid schema and prevents the JSON file from being imported.
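The schema rules described above can be illustrated with a minimal Python sketch. It is not DataCater's implementation: the function `infer_schema`, the sample records, and the fallback for types other than String and Long are assumptions made for illustration. Only the type names String and Long are taken from this page.

```python
import json

# Only the String and Long type names appear in this documentation; any other
# JSON type falls back to the Python type name as a placeholder (assumption).
TYPE_NAMES = {str: "String", int: "Long"}


def infer_schema(records):
    """Return {key: type name}; raise ValueError if a key changes its type."""
    schema = {}
    for index, record in enumerate(records):
        for key, value in record.items():
            type_name = TYPE_NAMES.get(type(value), type(value).__name__)
            if key not in schema:
                # First time this key is seen: remember its type.
                schema[key] = type_name
            elif schema[key] != type_name:
                # Same key, different type: no valid schema can be built.
                raise ValueError(
                    f"record {index}: key {key!r} is {type_name}, "
                    f"but earlier records use {schema[key]}"
                )
    return schema


# Hypothetical sample data: the second record omits "email", which is tolerated.
valid = json.loads("""
[
  {"name": "Ada", "age": 36, "email": "ada@example.com"},
  {"name": "Grace", "age": 45}
]
""")
schema = infer_schema(valid)

# Hypothetical sample data: "age" is a number in one record and a string in another.
invalid = json.loads('[{"age": 36}, {"age": "forty-five"}]')
try:
    infer_schema(invalid)
    conflict = None
except ValueError as err:
    conflict = str(err)
```

In this sketch, a missing key simply contributes nothing to the schema, while a type conflict aborts the whole import, mirroring the behavior described above.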
The following listing shows a valid JSON structure, which will cause DataCater to generate a schema with three fields (name: String, age: Long, email: String):
The following listing shows an invalid JSON structure, which cannot be used with DataCater. Records use different data types for the field
This source connector supports the following configuration options:
Pointer to list in JSON file
The pointer starts with a slash (/) and contains the names of the attributes leading to the location of the array within the hierarchy, delimited by a slash (/).
By default, DataCater assumes that the JSON array is located at the root level of the JSON file, in which case you do not have to provide any pointer:
For the following nested JSON structure, you may use the pointer /records/hits to guide DataCater to the array holding the records:
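As a rough illustration of how such a pointer selects the array, the following Python sketch walks a nested document one attribute name at a time. The function `resolve_pointer` and the sample documents are hypothetical and not part of DataCater; the sketch also assumes that attribute names contain no slashes, since this page does not describe any escaping rules.

```python
import json


def resolve_pointer(document, pointer=""):
    """Follow a slash-delimited pointer such as /records/hits to the array."""
    node = document
    if pointer:
        if not pointer.startswith("/"):
            raise ValueError("pointer must start with a slash")
        # Drop the leading slash, then descend one attribute name at a time.
        for name in pointer[1:].split("/"):
            node = node[name]
    if not isinstance(node, list):
        raise ValueError(f"no JSON array found at {pointer or 'the root level'}")
    return node


# Hypothetical nested document: the records live under records -> hits.
nested = json.loads("""
{
  "records": {
    "hits": [
      {"name": "Ada", "age": 36},
      {"name": "Grace", "age": 45}
    ]
  }
}
""")
records = resolve_pointer(nested, "/records/hits")

# Array at the root level: no pointer is needed.
root_level = json.loads('[{"name": "Ada"}]')
root_records = resolve_pointer(root_level)
```

An empty pointer corresponds to the default case, where the array sits at the root level of the file.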
DataCater automatically extends the set of attributes with the attribute __datacater_file_name and fills it with the name of the uploaded JSON file.
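The effect on each record can be pictured with a short Python sketch. The helper `attach_file_name`, the file name `customers.json`, and the sample records are assumptions for illustration; only the attribute name `__datacater_file_name` comes from this page.

```python
import json


def attach_file_name(records, file_name):
    """Return copies of the records extended with the uploaded file's name."""
    return [{**record, "__datacater_file_name": file_name} for record in records]


# Hypothetical upload: two records read from a file named customers.json.
uploaded = json.loads('[{"name": "Ada"}, {"name": "Grace"}]')
enriched = attach_file_name(uploaded, "customers.json")
```

Because every record carries the file name, downstream pipeline steps can tell apart records that originate from different uploads with the same structure.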