SynchDB Configuration¶

SynchDB supports the following GUC variables in postgresql.conf. These are common parameters that apply to all connectors managed by SynchDB:

GUC Variable	Type	Default Value	Description
synchdb.naptime	integer	100	The delay in milliseconds between each data polling from Debezium runner engine
synchdb.dml_use_spi	boolean	false	Option to use SPI to handle DML operations
synchdb.synchdb_auto_launcher	boolean	true	Option to automatically launch active SynchDB connector workers. This option only works when SynchDB is included in the `shared_preload_libraries` GUC option
synchdb.dbz_batch_size	integer	2048	The maximum number of change events produced by Debezium embedded engine for SynchDB to process. This batch of changes is processed within a single transaction by SynchDB
synchdb.dbz_queue_size	integer	8192	The maximum size (measured in number of change events) of Debezium embedded engine's change event queue. It should be set to at least twice `synchdb.dbz_batch_size`
synchdb.dbz_connect_timeout_ms	integer	30000	The timeout value in milliseconds for Debezium embedded engine to establish an initial connection to a remote database
synchdb.dbz_query_timeout_ms	integer	600000	The timeout value in milliseconds for Debezium embedded engine to execute a query on a remote database
synchdb.dbz_skipped_operations	string	"t"	A comma-separated list of operations Debezium should skip when processing change events: "c" for inserts, "u" for updates, "d" for deletes, "t" for truncates
synchdb.jvm_max_heap_size	integer	1024	The maximum heap size in MB to be allocated to Java Virtual Machine (JVM) when starting a connector.
synchdb.dbz_snapshot_thread_num	integer	2	The number of threads Debezium embedded connector should spawn during initial snapshot. Please note that according to Debezium, multi-threaded snapshot is an `incubating feature`
synchdb.dbz_snapshot_fetch_size	integer	0	The number of rows Debezium embedded connector should fetch at a time during initial snapshot. Set it to 0 to let the engine choose automatically
synchdb.dbz_snapshot_min_row_to_stream_results	integer	0	The minimum number of rows a remote table should contain before Debezium embedded engine switches to streaming mode during initial snapshot. Set it to 0 to always use streaming mode
synchdb.dbz_incremental_snapshot_chunk_size	integer	2048	The maximum number of change events produced by Debezium embedded engine for SynchDB to process during incremental snapshot
synchdb.dbz_incremental_snapshot_watermarking_strategy	string	"insert_insert"	The watermarking strategy used by Debezium embedded engine to resolve potential conflicts during incremental snapshot. Possible values are "insert_insert" and "insert_delete"
synchdb.dbz_offset_flush_interval_ms	integer	60000	The interval in milliseconds that Debezium embedded engine flushes offset data to disk
synchdb.dbz_capture_only_selected_table_ddl	boolean	true	Whether Debezium embedded engine should capture the schema of all tables (false) or only selected tables (true) during initial snapshot
synchdb.max_connector_workers	integer	30	The maximum number of connector workers that can be running at a time
synchdb.error_handling_strategy	enum	"exit"	Configures the error handling strategy of a connector worker. Possible values are "exit" for exiting on error, "skip" for continuing on error, "retry" for retrying on error
synchdb.dbz_log_level	enum	"warn"	The log level setting for Debezium Runner. Possible values are "debug", "info", "warn", "error", "all", "fatal", "off", "trace"
synchdb.log_change_on_error	boolean	true	Whether the connector should log the original JSON change event in case of error
synchdb.jvm_max_direct_buffer_size	integer	1024	The maximum direct buffer size in MB to be allocated to hold JSON change events
synchdb.dbz_logminer_stream_mode	enum	"uncommitted"	The streaming mode for the Debezium-based Oracle connector. The default is "uncommitted", which means the changes streamed from Oracle via Debezium are uncommitted, so Debezium has to do extra work to ensure the integrity of transactions and their associated changes. Setting it to "committed" shifts this work to the Oracle side
synchdb.olr_connect_timeout_ms	integer	5000	(Affects OLR connector only) The connect timeout in milliseconds when connecting to the OpenLog Replicator service
synchdb.olr_read_timeout_ms	integer	5000	(Affects OLR connector only) The read timeout in milliseconds when reading from a socket
synchdb.olr_snapshot_engine	enum	"debezium"	(Affects OLR connector only) The underlying engine used to complete the initial snapshot process. Can be "debezium" or "fdw". If "fdw" is selected, ensure "oracle_fdw" is installed beforehand
synchdb.cdc_start_delay_ms	integer	0	The delay in milliseconds to wait after the initial snapshot completes and before CDC streaming begins
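
As an illustration of how several of the less commonly tuned parameters from the table above might be written in postgresql.conf, here is a minimal sketch. The parameter names are taken from the table; every value is an assumption chosen only for illustration, not a recommendation, and the Oracle and OLR lines only matter if those connector types are in use.

```conf
# Illustrative values only; tune for your own environment
synchdb.dbz_connect_timeout_ms = 60000            # allow 60 seconds for the initial connection to the remote database
synchdb.dbz_query_timeout_ms = 1200000            # allow 20 minutes for long-running queries on the remote database
synchdb.jvm_max_direct_buffer_size = 2048         # 2048 MB direct buffer for JSON change events
synchdb.dbz_logminer_stream_mode = 'committed'    # Oracle connector: shift transaction handling to the Oracle side
synchdb.olr_snapshot_engine = 'fdw'               # OLR connector: snapshot through oracle_fdw, which must already be installed
synchdb.cdc_start_delay_ms = 5000                 # wait 5 seconds between snapshot completion and CDC streaming
```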

Technical Notes¶

GUC (Grand Unified Configuration) variables are global configuration parameters in PostgreSQL
Values are set in the postgresql.conf file
Changes require a server restart to take effect
shared_preload_libraries is a critical system configuration that determines which libraries are loaded at server startup; synchdb must be listed there to enable the connector auto launcher
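
A minimal postgresql.conf sketch of the preload requirement described in the notes above. It assumes the library is registered under the name synchdb (as the note states) and that no other libraries are preloaded; if other libraries are already listed, append synchdb to the existing comma-separated value instead of replacing it.

```conf
# Load SynchDB at server start so the auto launcher can resume connectors
shared_preload_libraries = 'synchdb'
synchdb.synchdb_auto_launcher = true    # default value, shown here for clarity
```

A change to the preload list takes effect only after the PostgreSQL server is restarted.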

Configuration Examples¶

# Example configuration in postgresql.conf
synchdb.naptime = 1000                                                  # Increase wait time to 1 second
synchdb.dml_use_spi = true                                              # Enable SPI usage for DML operations
synchdb.synchdb_auto_launcher = true                                    # Enable automatic connector startup
synchdb.dbz_batch_size=4096                                             # Each batch can have at most 4096 change events
synchdb.dbz_queue_size=8192                                             # Debezium will use 8192 change event queue size
synchdb.jvm_max_heap_size=2048                                          # 2GB heap memory to be allocated to a connector
synchdb.dbz_snapshot_fetch_size=0                                       # Let Debezium figure out the optimal number of rows to fetch during initial snapshot
synchdb.dbz_snapshot_min_row_to_stream_results=0                        # Always stream the results during initial snapshot
synchdb.dbz_snapshot_thread_num=1                                       # Single thread during Debezium's initial snapshot
synchdb.dbz_incremental_snapshot_chunk_size=4096                        # Incremental snapshot produces change events in batches of 4096 max
synchdb.dbz_incremental_snapshot_watermarking_strategy='insert_insert'  # Use insert_insert watermarking strategy
synchdb.dbz_offset_flush_interval_ms=60000                              # Flush offset data to disk every minute if needed
synchdb.dbz_capture_only_selected_table_ddl=false                       # Debezium will capture the schema of all tables rather than only selected tables
synchdb.max_connector_workers=10                                        # 10 connector workers can be run at a time
synchdb.error_handling_strategy='retry'                                 # connector should retry on error
synchdb.dbz_log_level='error'                                           # Debezium Runner should log error messages only
synchdb.log_change_on_error=true                                        # log JSON change event on error

Usage Recommendations¶

synchdb.naptime
- Lower values: Higher update frequency but more system load
- Higher values: Lower system load but less frequent updates
- Adjust based on data latency requirements
synchdb.dml_use_spi
- Enable if specific SPI integration is needed
- Keep false for standard DML operations
synchdb.synchdb_auto_launcher
- Recommended to keep true for automatic connector resume upon PostgreSQL restarts
- Change to false only if manual connector control is required
synchdb.dbz_batch_size
- Lower values: Slower processing of change events at lower JVM memory usage
- Higher values: Faster processing of change events at higher JVM memory usage
- Adjust based on resource requirements
synchdb.dbz_queue_size
- Lower values: Smaller Debezium queue to hold change events
- Higher values: Larger Debezium queue to hold change events
- Must be set to at least twice synchdb.dbz_batch_size
synchdb.jvm_max_heap_size
- Lower values: Smaller heap memory allocated to JVM
- Higher values: Larger heap memory allocated to JVM
- Adjust based on system resource and workload requirements
- Increase it when working with a large number of tables
synchdb.dbz_snapshot_fetch_size
- Lower values: Fewer rows fetched from a table at a time during snapshot
- Higher values: More rows fetched from a table at a time during snapshot
- Recommended to keep it 0 to let Debezium figure out an optimal value
synchdb.dbz_snapshot_min_row_to_stream_results
- Lower values: Less JVM memory requirement, slower processing of change events
- Higher values: More JVM memory requirement, faster processing of change events
- Recommended to keep it 0 to let Debezium use streaming mode always to reduce memory usage
synchdb.dbz_snapshot_thread_num
- Lower values: Slower data export to SynchDB for processing
- Higher values: Faster data export to SynchDB for processing
- Recommended to set it to the number of CPU cores
synchdb.dbz_incremental_snapshot_chunk_size
- Lower values: Slower processing of change events at lower JVM memory usage during incremental snapshot
- Higher values: Faster processing of change events at higher JVM memory usage during incremental snapshot
- Recommended to set it to the same value as synchdb.dbz_batch_size, then adjust based on resource requirements
synchdb.dbz_offset_flush_interval_ms
- Lower values: More frequent updates to the offset file, more I/O, fewer old batches to re-process after recovering from a fault
- Higher values: Less frequent updates to the offset file, less I/O, more old batches to re-process after recovering from a fault
- Recommended to set it to 60000, as recommended by Debezium
synchdb.max_connector_workers
- Lower values: Fewer connector workers can run at a time, lower shared memory requirement
- Higher values: More connector workers can run at a time, higher shared memory requirement
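
To show how the recommendations above fit together, here is a minimal sketch for a hypothetical host with 8 CPU cores. The core count and the batch size of 4096 are assumptions made for illustration; the relationships between the values (queue size at least twice the batch size, incremental snapshot chunk size equal to the batch size, one snapshot thread per core) follow the recommendations listed above.

```conf
# Hypothetical 8-core host; values are illustrative, not measured
synchdb.dbz_snapshot_thread_num = 8                   # one snapshot thread per CPU core
synchdb.dbz_batch_size = 4096                         # change events per batch (one transaction each)
synchdb.dbz_queue_size = 8192                         # at least twice dbz_batch_size
synchdb.dbz_incremental_snapshot_chunk_size = 4096    # kept equal to dbz_batch_size
synchdb.dbz_offset_flush_interval_ms = 60000          # flush offsets to disk every minute
```

If the batch size is raised later, the queue size and chunk size should be revisited together so the relationships still hold.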

Performance Considerations¶

Adjust synchdb.naptime based on system load and latency requirements
Increase synchdb.dbz_batch_size and synchdb.dbz_queue_size to raise processing throughput
Adjust synchdb.jvm_max_heap_size based on workload
- Smaller number of tables (10k or less) + large amount of data per table: 512MB ~ 1024MB should suffice
- Larger number of tables (100k or more) + moderate amount of data per table: consider increasing to 2048MB or above
Set synchdb.dbz_snapshot_fetch_size to 0 to let Debezium pick optimal fetch value
Set synchdb.dbz_snapshot_thread_num to match number of CPU cores
Set synchdb.dbz_snapshot_min_row_to_stream_results to 0 to always use streaming mode and reduce memory usage

Common Use Cases¶

High-Throughput Systems¶

synchdb.naptime = 10            # Faster polling for real-time updates
synchdb.dml_use_spi = false     # Standard DML for better performance
synchdb.dbz_batch_size = 16384
synchdb.dbz_queue_size = 32768
synchdb.jvm_max_heap_size = 2048
synchdb.dbz_snapshot_thread_num = 4
synchdb.dbz_snapshot_fetch_size = 0
synchdb.dbz_snapshot_min_row_to_stream_results = 0

Resource-Constrained Systems¶

synchdb.naptime = 1000          # Reduced polling frequency
synchdb.dml_use_spi = false     # Minimize additional overhead
synchdb.dbz_batch_size = 1024
synchdb.dbz_queue_size = 2048
synchdb.jvm_max_heap_size = 512
synchdb.dbz_snapshot_thread_num = 1
synchdb.dbz_snapshot_fetch_size = 0
synchdb.dbz_snapshot_min_row_to_stream_results = 0

Development/Testing¶

synchdb.naptime = 500           # Moderate polling interval (the default is 100)
synchdb.dml_use_spi = true      # Enable advanced features for testing
synchdb.dbz_batch_size = 2048
synchdb.dbz_queue_size = 4096
synchdb.jvm_max_heap_size = 1024
synchdb.dbz_snapshot_thread_num = 2
synchdb.dbz_snapshot_fetch_size = 0
synchdb.dbz_snapshot_min_row_to_stream_results = 0

Troubleshooting¶

High CPU Usage
- Increase synchdb.naptime
- Review DML operation patterns
- Reduce synchdb.dbz_batch_size and synchdb.dbz_queue_size
- Increase synchdb.dbz_snapshot_thread_num
Data Latency Issues
- Decrease synchdb.naptime
- Increase synchdb.dbz_batch_size and synchdb.dbz_queue_size
- Check network connectivity
- Increase shared_buffers
- Split workload to more connectors rather than just one
- Start the connector in no_data mode, which obtains the schema only and then begins CDC, rather than initial mode, which captures both the schema and the initial data before CDC begins
Startup Problems
- Verify shared_preload_libraries configuration
- Check error messages from synchdb_get_state()
- Check connector worker status
Out of Memory Problems
- Increase synchdb.jvm_max_heap_size
- Increase shared_buffers

Best Practices¶

Initial Setup
- Start with default values
- Monitor system performance
- Adjust gradually based on requirements
Production Environment
- Document all configuration changes
- Test changes in staging first
- Maintain backup of working configurations
Monitoring
- Track system resource usage
- Monitor data synchronization latency
- Log configuration changes