Using YugaByte’s Change Data Capture (CDC) API, follow these steps to use YugaByte as a data source streaming changes to a Kafka or console sink:

SETUP YUGABYTE

1. Install Yugabyte.

https://docs.yugabyte.com/latest/quick-start/install/

2. Create a local cluster.

https://docs.yugabyte.com/latest/quick-start/create-local-cluster/
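
If you installed from the quick start above, a minimal sketch of creating a local cluster with the bundled yb-ctl tool (paths and flags may differ by version):

```
./bin/yb-ctl create
```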

3. Create a table on the cluster.
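
For example, a minimal sketch using the bundled ysqlsh shell (the users table and its columns are hypothetical; adapt them to your use case):

```
./bin/ysqlsh -h 127.0.0.1 -c "CREATE TABLE users (id INT PRIMARY KEY, name TEXT);"
```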

SETUP KAFKA (Skip if Logging to Console):

Avro Schemas:

We support the use of Avro schemas to serialize/deserialize tables. Create two Avro schemas: one for the table and one for its primary key. After this step, you should have two files, table_schema_path.avsc and primary_key_schema_path.avsc.

https://www.tutorialspoint.com/avro/avro_schemas.htm
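
As a sketch, here is what the two files might look like for the hypothetical users table above; the record and field names are assumptions, so adapt them to your own table:

```
# Hypothetical Avro schema for the whole table.
cat > table_schema_path.avsc <<'EOF'
{
  "type": "record",
  "name": "users",
  "fields": [
    { "name": "id",   "type": "int" },
    { "name": "name", "type": ["null", "string"], "default": null }
  ]
}
EOF

# Hypothetical Avro schema for the table's primary key.
cat > primary_key_schema_path.avsc <<'EOF'
{
  "type": "record",
  "name": "users_key",
  "fields": [
    { "name": "id", "type": "int" }
  ]
}
EOF
```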

Starting the Kafka services:

1. First, download Confluent.

```
curl -O http://packages.confluent.io/archive/5.3/confluent-5.3.1-2.12.tar.gz
tar -xvzf confluent-5.3.1-2.12.tar.gz
cd confluent-5.3.1/
```

2. Install the Confluent CLI into the bin directory and add it to your PATH.

```
curl -L https://cnfl.io/cli | sh -s -- -b /<path-to-directory>/bin
export PATH=<path-to-confluent>/bin:$PATH
export CONFLUENT_HOME=<path-to-confluent>
```
3. Start the ZooKeeper, Kafka, and Avro Schema Registry services.

```
./bin/confluent local start
```

4. Create a Kafka topic.

```
./bin/kafka-topics --create --partitions 1 --topic <topic_name> --bootstrap-server localhost:9092 --replication-factor 1
```

5. Start the Kafka Avro console consumer.

```
./bin/kafka-avro-console-consumer --bootstrap-server localhost:9092 --topic <topic_name> --key-deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer --value-deserializer=io.confluent.kafka.serializers.KafkaAvroDeserializer
```

SETUP YB CONNECTOR:

1. In a new window, clone YugaByte’s Kafka connector repository.

```
git clone https://github.com/yugabyte/yb-kafka-connector.git
cd yb-kafka-connector/yb-cdc
```

2. Start the Kafka connector app.

Logging to console:

```
java -jar yb_cdc_connector.jar \
--table_name <namespace/database>.<table> \
--master_addrs <yb master addresses> [default 127.0.0.1:7100] \
[--stream_id <optional existing stream id>] \
--log_only # Flag to log to console.
```

Logging to Kafka:

```
java -jar yb_cdc_connector.jar \
--table_name <namespace/database>.<table> \
--master_addrs <yb master addresses> [default 127.0.0.1:7100] \
[--stream_id <optional existing stream id>] \
--kafka_addrs <kafka cluster addresses> [default 127.0.0.1:9092] \
--schema_registry_addrs <schema registry addresses> [default 127.0.0.1:8081] \
--topic_name <topic name to write to> \
--table_schema_path <avro table schema> \
--primary_key_schema_path <avro primary key schema>
```
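
For example, a hypothetical invocation wired to the users table, topic, and schema files from the sketches above (the table, topic, and file names are assumptions):

```
java -jar yb_cdc_connector.jar \
--table_name yugabyte.users \
--master_addrs 127.0.0.1:7100 \
--kafka_addrs 127.0.0.1:9092 \
--schema_registry_addrs 127.0.0.1:8081 \
--topic_name users_topic \
--table_schema_path ./table_schema_path.avsc \
--primary_key_schema_path ./primary_key_schema_path.avsc
```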

3. In another window, write values to the table and observe the change records on your chosen output stream.
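
For example, assuming the hypothetical users table from earlier:

```
./bin/ysqlsh -h 127.0.0.1 -c "INSERT INTO users (id, name) VALUES (1, 'John');"
```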