Quick Start Guide¶
It is very simple to start using SynchDB to perform data replication from heterogeneous databases to PostgreSQL given that you have the correct connection information to your heterogeneous databases.
Install SynchDB Extension¶
SynchDB extension requires pgcrypto to encrypt certain sensitive credential data. Please make sure it is installed prior to installing SynchDB. Alternatively, you can include CASCADE
clause in CREATE EXTENSION
to automatically install dependencies:
CREATE EXTENSION synchdb CASCADE;
Create a Connector¶
This can be done with utility SQL function synchdb_add_conninfo()
.
synchdb_add_conninfo takes these arguments:
argumet | description |
---|---|
name | a unique identifier that represents this connector info |
hostname | the IP address or hostname of the heterogeneous database. |
port | the port number to connect to the heterogeneous database. |
username | user name to use to authenticate with heterogeneous database. |
password | password to authenticate the username |
source database | this is the name of source database in heterogeneous database that we want to replicate changes from. |
destination database | (deprecated) always defaults to the same database as where synchDB is installed |
table | (optional) - expressed in the form of [database].[table] or [database].[schema].[table] that must exists in heterogeneous database so the engine will only replicate the specified tables. If left empty, all tables are replicated. |
connector | the connector type to use (MySQL, Oracle, SQLServer... etc). |
Examples:
-
Create a MySQL connector called
mysqlconn
to replicate from source databaseinventory
in MySQL to destination databasepostgres
in PostgreSQL:SELECT synchdb_add_conninfo( 'mysqlconn', '127.0.0.1', 3306, 'mysqluser', 'mysqlpwd', 'inventory', 'postgres', '', 'mysql');
-
Create a MySQL connector called
mysqlconn2
to replicate from source databaseinventory
to destination databasepostgres
in PostgreSQL:SELECT synchdb_add_conninfo( 'mysqlconn2', '127.0.0.1', 3306, 'mysqluser', 'mysqlpwd', 'inventory', 'postgres', '', 'mysql');
-
Create a SQLServer connector called 'sqlserverconn' to replicate from source database 'testDB' to destination database 'postgres' in PostgreSQL:
SELECT synchdb_add_conninfo( 'sqlserverconn', '127.0.0.1', 1433, 'sa', 'Password!', 'testDB', 'sqlserverdb', '', 'sqlserver');
-
Create a MySQL connector called
mysqlconn3
to replicate from source databaseinventory
sorders
andcustomers
tabls to destination databasepostgres
in PostgreSQL:SELECT synchdb_add_conninfo( 'mysqlconn3', '127.0.0.1', 3306, 'mysqluser', 'mysqlpwd', 'inventory', 'mysqldb3', 'inventory.orders,inventory.customers', 'mysql');
Things to Note¶
- It is possible to create multiple connectors connecting to the same connector type (ex, MySQL, SQLServer..etc). SynchDB will spawn separate connections to fetch change data.
- User-defined X509 certificate and private key for TLS connection to remote database will be supported in near future. In the meantime, please ensure TLS settings are set to optional.
- If SSL is required to establish the connection, refer to here to learn how to configure SSL settings in the rule file per conenctor
Check Created Connection Info¶
All connection information are created in the table synchdb_conninfo
. We are free to view its content and make modification as required. Please note that the password of a user credential is encrypted by pgcrypto using a key only known to synchdb. So please do not modify the password field or it may be decrypted incorrectly if tempered. See below for an example output:
postgres=# \x
Expanded display is on.
postgres=# select * from synchdb_conninfo;
-[ RECORD 1 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | sqlserverconn
isactive | t
data | {"pwd": "\\xc30d0407030245ca4a983b6304c079d23a0191c6dabc1683e4f66fc538db65b9ab2788257762438961f8201e6bcefafa60460fbf441e55d844e7f27b31745f04e7251c0123a159540676c4", "port": 1433, "user": "sa", "dstdb": "postgres", "srcdb": "testDB", "table": "null", "hostname": "192.168.1.86", "connector": "sqlserver"}
-[ RECORD 2 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | mysqlconn
isactive | t
data | {"pwd": "\\xc30d04070302986aff858065e96b62d23901b418a1f0bfdf874ea9143ec096cd648a1588090ee840de58fb6ba5a04c6430d8fe7f7d466b70a930597d48b8d31e736e77032cb34c86354e", "port": 3306, "user": "mysqluser", "dstdb": "postgres", "srcdb": "inventory", "table": "null", "hostname": "192.168.1.86", "connector": "mysql"}
-[ RECORD 3 ]-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
name | oracleconn
isactive | t
data | {"pwd": "\\xc30d04070302e3baf1293d0d553066d234014f6fc52e6eea425884b1f65f1955bf504b85062dfe538ca2e22bfd6db9916662406fc45a3a530b7bf43ce4cfaa2b049a1c9af8", "port": 1528, "user": "c##dbzuser", "dstdb": "postgres", "srcdb": "FREE", "table": "null", "hostname": "192.168.1.86", "connector": "oracle"}
Start a Connector¶
Use synchdb_start_engine_bgw()
function to start a connector worker. It takes one argument which is the connection name created above. This command will spawn a new background worker to connect to the heterogeneous database with the specified configurations.
For example, the following will spawn 2 background worker in PostgreSQL, one replicating from a MySQL database, the other from SQL Server
select synchdb_start_engine_bgw('mysqlconn');
select synchdb_start_engine_bgw('sqlserverconn');
select synchdb_start_engine_bgw('oracleconn');
Check Connector Running State¶
Use synchdb_state_view()
to examine all connectors' running states.
See below for an example output:
postgres=# select * from synchdb_state_view;
name | connector_type | pid | stage | state | err | last_dbz_offset
---------------+----------------+--------+---------------------+---------+----------+------------------------------------------------------------------------------------------------------
sqlserverconn | sqlserver | 579820 | change data capture | polling | no error | {"commit_lsn":"0000006a:00006608:0003","snapshot":true,"snapshot_completed":false}
mysqlconn | mysql | 579845 | change data capture | polling | no error | {"ts_sec":1741301103,"file":"mysql-bin.000009","pos":574318212,"row":1,"server_id":223344,"event":2}
oracleconn | oracle | 580053 | change data capture | polling | no error | offset file not flushed yet
(3 rows)
Column Details:
fields | description |
---|---|
name | the associated connector info name created by synchdb_add_conninfo() |
connector_type | the type of connector (mysql, oracle, sqlserver...etc) |
pid | the PID of the connector worker process |
stage | the stage of the connector worker process |
state | the state of the connector. Possible states are:
|
err | the last error message encountered by the worker which would have caused it to exit. This error could originated from PostgreSQL while processing a change, or originated from Debezium running engine while accessing data from heterogeneous database. |
last_dbz_offset | the last Debezium offset captured by synchdb. Note that this may not reflect the current and real-time offset value of the connector engine. Rather, this is shown as a checkpoint that we could restart from this offeet point if needed. |
Check Connector Running Statistics¶
Use synchdb_stats_view()
view to examine the statistic information of all connectors. These statistics record cumulative measurements about different types of change events a connector has processed so far. Currently these statistic values are stored in shared memory and not persisted to disk. Persist statistics data is a feature to be added in near future.
See below for an example output:
postgres=# select * from synchdb_stats_view;
connector | ddls | dmls | reads | creates | updates | deletes | bad_events | total_events | batches_done | avg_batch_size
---------------+------+---------+---------+---------+---------+---------+------------+--------------+--------------+----------------
mysqltpccconn | 22 | 3887111 | 3263746 | 208684 | 400241 | 14440 | 14444 | 3901573 | 2441 | 1598
Column Details:
fields | description |
---|---|
name | the associated connector info name created by synchdb_add_conninfo() |
ddls | number of DDLs operations completed |
dmls | number of DMLs operations completed |
reads | number of READ events completed during initial snapshot stage |
creates | number of CREATES events completed during CDC stage |
updates | number of UPDATES events completed during CDC stage |
deletes | number of DELETES events completed during CDC stage |
bad_events | number of bad events ignored (such as empty events, unsupported DDL events..etc) |
total_events | total number of events processed (including bad_events) |
batches_done | number of batches completed |
avg_batch_size | average batch size (total_events / batches_done) |
Stop a Connector¶
Use synchdb_stop_engine_bgw()
SQL function to stop a running or paused connector worker. This function takes conninfo_name
as its only parameter, which can be found from the output of synchdb_get_state()
view.
For example:
select synchdb_stop_engine_bgw('mysqlconn');
synchdb_stop_engine_bgw()
function also marks a connection info as inactive
, which prevents the this worker from automatic-relaunch at server restarts. See below for more details.