Debezium PostgreSQL Source Connector

The Debezium PostgreSQL Connector is a source connector that can obtain a snapshot of the existing data in a PostgreSQL database and then monitor and record all subsequent row-level changes to that data. All of the events for each table are recorded in a separate Apache Kafka® topic, where they can be easily consumed by applications and services.

  • Confluent supports Debezium PostgreSQL connector version 0.9.3 and later.
  • Confluent supports using this connector with PostgreSQL 9.6, 10, and 11.
  • Databases hosted by a service such as Heroku Postgres cannot be monitored with Debezium, since you may be unable to install the logical decoding plugin.

Install the Postgres Connector

You can install this connector by using the Confluent Hub client (recommended) or you can manually download the ZIP file.

confluent-hub install debezium/debezium-connector-postgresql:latest

You can install a specific version by replacing latest with a version number. For example:

confluent-hub install debezium/debezium-connector-postgresql:0.9.4

Setting up PostgreSQL

Before using the Debezium PostgreSQL connector to monitor the changes committed on a PostgreSQL server, first install the logical decoding plugin into the PostgreSQL server. Then enable a replication slot and configure a user with sufficient privileges to perform the replication.

To monitor a PostgreSQL database running in Amazon RDS, refer to the Debezium documentation for PostgreSQL on Amazon RDS.

Enable Logical Decoding and Replication on the PostgreSQL server

The Postgres relational database management system has a feature called logical decoding that allows clients to extract all persistent changes to database tables into a coherent format. This formatted data can be interpreted without detailed knowledge of the internal state of the database. An output plugin transforms the data from the write-ahead log's internal representation into a format the consumer of a replication slot needs.
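
As a rough illustration of how a replication slot and an output plugin fit together, the following sketch prints the SQL statements that create a logical replication slot, peek at the decoded changes, and drop the slot again. It assumes the server is already configured for logical replication and that the wal2json plugin is installed, as covered later in this guide. The helper name slot_sql and the slot name demo_slot are hypothetical.

```shell
# Print one SQL statement for trying out logical decoding by hand.
# $1 = action (create, peek, or drop)
# $2 = slot name (hypothetical, choose your own)
# $3 = output plugin name (defaults to wal2json)
slot_sql() {
  action=$1 slot=$2 plugin=${3:-wal2json}
  case "$action" in
    create) echo "SELECT * FROM pg_create_logical_replication_slot('$slot', '$plugin');" ;;
    peek)   echo "SELECT data FROM pg_logical_slot_peek_changes('$slot', NULL, NULL);" ;;
    drop)   echo "SELECT pg_drop_replication_slot('$slot');" ;;
    *)      echo "usage: slot_sql create|peek|drop SLOT [PLUGIN]" >&2; return 1 ;;
  esac
}

# Usage (requires a running, configured server), for example:
#   slot_sql create demo_slot wal2json | psql -U postgres
#   ... make some changes to a table ...
#   slot_sql peek demo_slot | psql -U postgres
#   slot_sql drop demo_slot | psql -U postgres
```

Dropping the test slot afterwards matters, because the server retains write-ahead log segments for any slot that is never consumed.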

The Debezium PostgreSQL connector works with one of the following supported logical decoding plugins from Debezium:

  • protobuf : To encode changes in Protobuf format (the plugin library is named decoderbufs)
  • wal2json : To encode changes in JSON format
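
The plugin choice is passed to the connector through its plugin.name property. A minimal sketch of a connector registration payload follows. It assumes a Kafka Connect worker listening on localhost:8083, and every host, credential, database name, connector name, and logical server name shown is a placeholder to replace. The helper name connector_json is made up for this example.

```shell
# Print a Kafka Connect registration payload for the Debezium PostgreSQL connector.
# $1 = logical decoding plugin name (defaults to wal2json)
# All connection values below are placeholders, not values from this guide.
connector_json() {
  cat <<EOF
{
  "name": "postgres-source-demo",
  "config": {
    "connector.class": "io.debezium.connector.postgresql.PostgresConnector",
    "plugin.name": "${1:-wal2json}",
    "database.hostname": "localhost",
    "database.port": "5432",
    "database.user": "postgres",
    "database.password": "postgres",
    "database.dbname": "postgres",
    "database.server.name": "dbserver1"
  }
}
EOF
}

# Usage (assumes a Connect worker at localhost:8083):
#   connector_json wal2json | curl -s -X POST -H "Content-Type: application/json" \
#     --data @- http://localhost:8083/connectors
```

With these placeholder values, changes to a table such as public.customers would be expected in a topic named dbserver1.public.customers, following Debezium's serverName.schemaName.tableName convention.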

Installing the wal2json plugin

Before executing the commands, make sure the user has write privileges on the PostgreSQL lib directory, where the wal2json library is installed. In the test environment used here, this directory is /usr/pgsql-9.6/lib/. In that environment, add the PostgreSQL bin directory to the path as shown below:

export PATH="$PATH:/usr/pgsql-9.6/bin"

Enter the wal2json installation commands.

git clone https://github.com/eulerto/wal2json -b master --single-branch \
&& cd wal2json \
&& git checkout d2b7fef021c46e0d429f2c1768de361069e58696 \
&& make && make install \
&& cd .. \
&& rm -rf wal2json
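
After make install completes, it can be worth confirming that the shared library landed in the directory the server loads plugins from. This is a minimal sketch, assuming a Linux build where the plugin is installed as wal2json.so. The helper name plugin_installed is hypothetical.

```shell
# Return 0 if the shared library for a plugin exists in a library directory.
# $1 = PostgreSQL library directory
# $2 = plugin name, without the .so suffix (assumes a Linux-style .so file)
plugin_installed() {
  [ -f "$1/$2.so" ]
}

# Usage on the database host, for example:
#   plugin_installed /usr/pgsql-9.6/lib wal2json && echo "wal2json found"
#   plugin_installed "$(pg_config --pkglibdir)" wal2json && echo "wal2json found"
```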

Enable Replication on the PostgreSQL server

Add the following lines to the end of the /usr/share/postgresql/postgresql.conf PostgreSQL configuration file. These lines load the logical decoding plugin as a shared library and adjust some write-ahead log (WAL) and streaming replication settings.

# LOGGING
log_min_error_statement = fatal
# CONNECTION
listen_addresses = '*'
# MODULES
shared_preload_libraries = 'wal2json'
# REPLICATION
wal_level = logical             # minimal, replica, or logical (change requires restart)
max_wal_senders = 1             # max number of walsender processes (change requires restart)
#wal_keep_segments = 4          # in logfile segments, 16MB each; 0 disables
#wal_sender_timeout = 60s       # in milliseconds; 0 disables
max_replication_slots = 1       # max number of replication slots (change requires restart)
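
Before restarting the server, the edited file can be checked with a small script. The sketch below reads explicit settings from a postgresql.conf-style file and verifies the three replication values used above. The helper names conf_value and check_replication_conf are hypothetical. The check only understands the key = value form and treats a missing setting as a failure, which is stricter than PostgreSQL itself, since server defaults may already be sufficient.

```shell
# Print the effective value of a setting in a postgresql.conf-style file.
# $1 = file, $2 = setting name
# Commented-out lines are skipped; if a setting appears twice, the last one wins.
conf_value() {
  sed -n -E "s/^[[:space:]]*$2[[:space:]]*=[[:space:]]*'?([^'#[:space:]]*)'?.*/\1/p" "$1" | tail -n 1
}

# Return 0 if the file explicitly enables logical decoding and replication.
# $1 = file
check_replication_conf() {
  status=0
  if [ "$(conf_value "$1" wal_level)" != "logical" ]; then
    echo "wal_level must be logical"
    status=1
  fi
  for key in max_wal_senders max_replication_slots; do
    value=$(conf_value "$1" "$key")
    case "$value" in
      # Empty, non-numeric, or zero values all fail the check.
      ''|*[!0-9]*|0) echo "$key must be at least 1"; status=1 ;;
    esac
  done
  return "$status"
}

# Usage, for example:
#   check_replication_conf /usr/share/postgresql/postgresql.conf && echo "replication settings look fine"
```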

Initialize Replication Permissions

Add the following lines to the end of the pg_hba.conf PostgreSQL configuration file. These lines configure the client authentication for the database replication.

############ REPLICATION ##############
local   replication     postgres                          trust
host    replication     postgres  127.0.0.1/32            trust
host    replication     postgres  ::1/128                 trust
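
When this step is scripted, appending the same entries on every run would duplicate them. A minimal sketch of an idempotent append follows. The helper name append_once is hypothetical, and the location of pg_hba.conf (shown here under $PGDATA) is an assumption that depends on the installation.

```shell
# Append a line to a file unless an identical line is already present.
# $1 = file, $2 = line to add
append_once() {
  grep -qxF -- "$2" "$1" 2>/dev/null || printf '%s\n' "$2" >> "$1"
}

# Usage, for example (path is an assumption):
#   append_once "$PGDATA/pg_hba.conf" "local   replication     postgres                          trust"
#   append_once "$PGDATA/pg_hba.conf" "host    replication     postgres  127.0.0.1/32            trust"
#   append_once "$PGDATA/pg_hba.conf" "host    replication     postgres  ::1/128                 trust"
```

Changes to pg_hba.conf take effect after a configuration reload, whereas the postgresql.conf settings marked "change requires restart" need a full server restart.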

License

The Debezium PostgreSQL connector is an open source connector and does not require a Confluent Enterprise License.

Note

Portions of the information provided here derive from documentation originally produced by the Debezium Community. Work produced by Debezium is licensed under Creative Commons 3.0.