Google Cloud BigQuery
Use change data capture to stream data from Google Cloud BigQuery tables to any data sink and transform it along the way.
Please make sure that you have created a service account in Google Cloud that has been granted the predefined IAM role BigQuery Data Viewer at the dataset level and the predefined IAM roles BigQuery Job User and BigQuery Resource Viewer at the project level.
This source connector supports the following configuration options:
Google Cloud credentials (JSON)
The content of the JSON-based credentials file provided by Google Cloud for the service account. The service account must have been granted the predefined IAM roles BigQuery Data Viewer (dataset level), BigQuery Job User (project level), and BigQuery Resource Viewer (project level).
Service account e-mail address
The e-mail address of the service account. We try to automatically extract the e-mail address from the provided Google Cloud credentials.
The name of the BigQuery project. We try to automatically extract the name of the BigQuery project from the provided Google Cloud credentials.
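As an illustration of the automatic extraction described above, here is a minimal sketch that reads the e-mail address and project from a service account key. It assumes the standard Google Cloud key file fields client_email and project_id; the function name extract_account_details and the sample values are hypothetical, and this is not necessarily how DataCater implements the lookup.

```python
import json


def extract_account_details(credentials_json: str) -> dict:
    """Return the e-mail address and project ID found in a service account key.

    Missing fields come back as None so the caller can fall back to
    manually entered values.
    """
    key = json.loads(credentials_json)
    return {
        "email": key.get("client_email"),
        "project": key.get("project_id"),
    }


# Hypothetical, heavily shortened key file used only for demonstration.
sample_key = json.dumps({
    "type": "service_account",
    "project_id": "my-project",
    "client_email": "reader@my-project.iam.gserviceaccount.com",
})

details = extract_account_details(sample_key)
```

If either field is absent from the key, the corresponding value is None and has to be entered by hand.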
The name of the BigQuery dataset.
The name of the BigQuery table (or view). You may retrieve the list of tables (and views) available in the given BigQuery project and dataset by clicking on Fetch table names.
Change Data Capture mode
You may choose one of the following modes for change data capture:
BULK: Repeatedly load all data from the BigQuery table. This mode performs no actual change data capture.
INCREMENTING: Use the primary key column, specified using the Primary key column configuration option, to repeatedly extract new records. This mode extracts only INSERTs and skips UPDATEs and DELETEs.
TIMESTAMP: Use the timestamp column, specified using the Timestamp column configuration option, to repeatedly extract new and updated records. This mode extracts only INSERTs and UPDATEs and skips DELETEs.
TIMESTAMP/INCREMENTING: Use the primary key column and the timestamp column, specified using the Primary key column and Timestamp column configuration options, to repeatedly extract new and updated records. This mode extracts only INSERTs and UPDATEs and skips DELETEs.
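To make the difference between the four modes more concrete, the following sketch builds the kind of polling query each mode could translate to. It is an assumption about a common polling strategy, not DataCater's actual implementation: the function build_poll_query, the query parameters @last_pk and @last_ts (the largest values seen in the previous synchronization), and the idea that the primary key grows with each new record are all hypothetical.

```python
from typing import Optional


def build_poll_query(mode: str, table: str,
                     pk_column: Optional[str] = None,
                     ts_column: Optional[str] = None) -> str:
    """Build an illustrative polling query for the given change data capture mode."""
    needs_pk = mode in ("INCREMENTING", "TIMESTAMP/INCREMENTING")
    needs_ts = mode in ("TIMESTAMP", "TIMESTAMP/INCREMENTING")
    if needs_pk and not pk_column:
        raise ValueError(f"{mode} requires a primary key column")
    if needs_ts and not ts_column:
        raise ValueError(f"{mode} requires a timestamp column")

    base = f"SELECT * FROM `{table}`"
    if mode == "BULK":
        # Reload the whole table on every synchronization.
        return base
    if mode == "INCREMENTING":
        # Only rows with a key larger than the last one seen (new records).
        return f"{base} WHERE {pk_column} > @last_pk ORDER BY {pk_column}"
    if mode == "TIMESTAMP":
        # Rows inserted or updated since the last synchronization.
        return f"{base} WHERE {ts_column} > @last_ts ORDER BY {ts_column}"
    if mode == "TIMESTAMP/INCREMENTING":
        # The key breaks ties between rows sharing the same timestamp.
        return (
            f"{base} WHERE {ts_column} > @last_ts "
            f"OR ({ts_column} = @last_ts AND {pk_column} > @last_pk) "
            f"ORDER BY {ts_column}, {pk_column}"
        )
    raise ValueError(f"Unknown mode: {mode}")


query = build_poll_query("TIMESTAMP/INCREMENTING", "my-project.sales.orders",
                         pk_column="id", ts_column="updated_at")
```

None of these queries can observe a row that no longer exists, which is why DELETEs are skipped in every mode.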
Primary key column
BigQuery does not natively support the concept of primary keys. Specifying a column that uniquely identifies a row in BigQuery allows DataCater to detect new records. Please make sure that this column is never
Timestamp column
DataCater can use a timestamp column, which stores the time of the most recent update of a record, to detect record updates. Specifying the timestamp column is required when using TIMESTAMP or TIMESTAMP/INCREMENTING as the Change Data Capture mode.
The interval in milliseconds between synchronizations of the BigQuery table and DataCater (default: