INITIALIZE VERTICA PROJECT: target-vertica as fork of pipelinewise-postgres-vertica #7
Merged
(cherry picked from commit cad4bca)
- Replace "postgres" with "vertica". - Add the Vertica connector package "vertica-python". - Change the package version to the initial one. (cherry picked from commit 38421f0)
- Replace "postgres" with "vertica". - Change SQL queries. - Modify the methods of dbsync. - A few other changes. (cherry picked from commit 3deebcd)
- Separate all exceptions into one file. - Organise all utility functions of streams and dbsync into one file. (cherry picked from commit fb55004)
(cherry picked from commit 886b6e9)
- Replace "postgres" with "vertica". - Change the length in one of the test cases. - Change the invalid input exception to the Vertica copy error. (cherry picked from commit 28574c6)
(cherry picked from commit 56879e2)
- Change port for vertica. - Change "postgres" with "vertica" - Change datatypes (cherry picked from commit 8a5d727)
(cherry picked from commit 46061b5)
(cherry picked from commit 3a33dcf)
(cherry picked from commit 1bb3a8e)
(cherry picked from commit 54a1f91)
(cherry picked from commit 64f1e30)
(cherry picked from commit 40651db)
(cherry picked from commit fdd301f)
- Remove leftover comments - Remove completed TODOs - Remove add_columns - Remove unnecessary loggers - Fix ssl config option for connection - Fix a few pylint errors - Add set header to false for copy statement (cherry picked from commit ace52b5)
- Remove "add_columns" function - Remove unnecessary import - Fix few pylint errors. - Fix long varchar data type. - Fix datatypes with length issue. - Correct "integer" data type with "int". (cherry picked from commit 16ccb74)
*Note: comment out a few badges, as the PyPI links for vertica don't exist* (cherry picked from commit b63b99b)
(cherry picked from commit 18ab42a)
- Change author - Change URL (cherry picked from commit cc4d516)
(cherry picked from commit 8ffe34c)
(cherry picked from commit 4f0ed2b)
(cherry picked from commit 8490926)
(cherry picked from commit 9c4fa27)
(cherry picked from commit 12cbbf3)
(cherry picked from commit ba813e8)
(cherry picked from commit 16aef8a)
(cherry picked from commit c91b9d0)
Proposed changes
This PR is for initial setup of the pipelinewise-target-vertica project.
Types of changes
Checklist
setup.py is an individual PR and not mixed with feature or bugfix PRs
[AP-NNNN] (if applicable. AP-NNNN = JIRA ID)
AP-NNN (if applicable. AP-NNN = JIRA ID)
GUIDE TO TEST
Pipelinewise Guide
This guide covers the requirements and installation steps needed to get started with PipelineWise.
Table of contents
tap-s3-csv
target-vertica
Requirements
Python packages:
And the singer connector packages for PipelineWise.
Note: All the necessary packages will be installed automatically; the above is just for reference.
Installation
Clone the pipelinewise and target-vertica repos and change directory into the pipelinewise repo.
Edit requirements.txt for the singer target-vertica connector and add the path to the pipelinewise-target-vertica repo so that it is installed with pip.
Clear the requirements.txt for vertica.
$ > pipelinewise/singer-connectors/target-vertica/requirements.txt
Get the current directory path and add /pipelinewise-target-vertica at the end. For example:
$ pwd
/Users/jordanryan/code/f360/pipelinewise
The complete path to the target-vertica repo is then /Users/jordanryan/code/f360/pipelinewise/pipelinewise-target-vertica. Add this path to the below command.
If everything is done correctly, the path to the pipelinewise-target-vertica repo should appear in ~/pipelinewise/singer-connectors/target-vertica/requirements.txt. The path can also be added to the same file manually.
Run the install script, which installs the PipelineWise CLI and every supported singer connector into separate virtual environments. (refer for more)
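The requirements.txt edit that precedes the install script can be sketched in shell as follows. This is a minimal sketch, not the guide's own command: it works in a scratch directory that stands in for the real clones, and the layout (the target-vertica clone sitting inside the pipelinewise repo) is an assumption taken from the example path above.

```shell
# Scratch layout standing in for the real clones (assumption, for illustration only).
workdir="$(mktemp -d)"
mkdir -p "$workdir/pipelinewise/singer-connectors/target-vertica"
mkdir -p "$workdir/pipelinewise/pipelinewise-target-vertica"
cd "$workdir/pipelinewise"

req_file="singer-connectors/target-vertica/requirements.txt"

# Clear the connector's requirements file.
: > "$req_file"

# Write the absolute path of the local target-vertica clone so pip installs from it.
echo "$(pwd)/pipelinewise-target-vertica" > "$req_file"

# Show the result; the file should contain exactly one line, the clone's path.
cat "$req_file"
```

In a real checkout, only the last three commands would be needed, run from the root of the pipelinewise repo.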
Once the install script has finished, activate the virtual environment with the Command Line Tools and set the PIPELINEWISE_HOME environment variable as displayed at the end of the install script's output:
If you see the above output saying that you have 0 pipelines in the system, then the installation is complete and successful.
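As a hedged sketch, the activation step might look like the following. The clone location ($HOME/pipelinewise) and the virtual environment path under .virtualenvs/pipelinewise are assumptions; the install script prints the exact lines to use, and those take precedence over anything shown here.

```shell
# Assumption: the pipelinewise repo was cloned here; override PIPELINEWISE_REPO if not.
PIPELINEWISE_REPO="${PIPELINEWISE_REPO:-$HOME/pipelinewise}"

# Assumption: the install script created the CLI virtual environment at this path.
activate_script="$PIPELINEWISE_REPO/.virtualenvs/pipelinewise/bin/activate"

# PIPELINEWISE_HOME points at the root of the pipelinewise repo.
export PIPELINEWISE_HOME="$PIPELINEWISE_REPO"

if [ -f "$activate_script" ]; then
  # Activate the environment and confirm that the CLI responds.
  . "$activate_script"
  pipelinewise status
else
  echo "No virtual environment at $activate_script; run the install script first."
fi
```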
Creating Pipelines
After the installation, sample YAML files can be created for each of the supported connectors, which then can be adjusted. Create sample YAML files with the following command:
$ cd ..
$ pipelinewise init --name pipelinewise_samples
This will create a pipelinewise_samples directory with samples for each supported component:
To create a new pipeline you need to enable at least one tap and one target by renaming a tap*....yml.sample and a target*...yml.sample file, removing the .sample postfix.
Tap Template for S3 CSV
Target Template for Vertica
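Enabling a tap and a target by dropping the .sample postfix could be scripted roughly as below. The file names tap_s3_csv.yml.sample and target_vertica.yml.sample are hypothetical stand-ins for whatever `pipelinewise init` actually generates, and the sketch runs against a scratch directory rather than a real samples directory.

```shell
# Scratch directory with hypothetical sample files, standing in for pipelinewise_samples.
samples_dir="$(mktemp -d)/pipelinewise_samples"
mkdir -p "$samples_dir"
touch "$samples_dir/tap_s3_csv.yml.sample" \
      "$samples_dir/target_vertica.yml.sample" \
      "$samples_dir/tap_mysql.yml.sample"

# Enable one tap and one target by removing the .sample postfix.
for name in tap_s3_csv target_vertica; do
  mv "$samples_dir/$name.yml.sample" "$samples_dir/$name.yml"
done

# Files that still end in .sample remain disabled.
ls "$samples_dir"
```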
Once you have renamed the files that you need, edit the YAML files with your favourite text editor. Follow the instructions in the files to set database credentials and connection details, select tables to replicate, define the source-to-target schema mapping or add load-time transformations. Check the detailed Example replication here.
Once you've configured the YAML files, activate the pipelines from them with the following command:
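A minimal sketch of the import step, assuming the YAML files live in the pipelinewise_samples directory created earlier and that the CLI's `import` subcommand with `--dir` is the one intended here. The command is only executed when the CLI and the directory are actually present.

```shell
# Directory holding the edited YAML files (created by `pipelinewise init`).
config_dir="pipelinewise_samples"

# Command that imports the YAML configuration into PipelineWise.
import_cmd="pipelinewise import --dir $config_dir"
echo "$import_cmd"

# Run it only if the CLI is on PATH and the directory exists.
if command -v pipelinewise >/dev/null 2>&1 && [ -d "$config_dir" ]; then
  $import_cmd
fi
```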
At this point PipelineWise will connect to and analyse every source database, discovering tables, columns and data types, and will generate the required JSON files for the singer taps and targets into ~/.pipelinewise. PipelineWise will use this directory internally to keep track of the state files for Key Based Incremental and Log Based replications (aka bookmarks), and this is also the directory where the log files will be created. Normally you will need to go into ~/.pipelinewise only when you want to access the log files.
Once the config YAML files are imported, you can see the new pipelines with the status command:
At this point, you have successfully created your first pipeline in PipelineWise and it’s now ready to run.
Running Pipelines
To run a pipeline use the run_tap command with the --tap and --target arguments to specify which pipeline to run by ID.
You can check the status with the following command:
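Running a pipeline and then checking its status might look like the sketch below. The IDs csv_on_s3 and vertica are hypothetical; the real values are the id fields set in the tap and target YAML files. The commands are only executed when the pipelinewise CLI is available.

```shell
# Hypothetical pipeline IDs; replace with the ids from your tap and target YAML files.
tap_id="csv_on_s3"
target_id="vertica"

# Command that runs one pipeline, identified by its tap and target IDs.
run_cmd="pipelinewise run_tap --tap $tap_id --target $target_id"
echo "$run_cmd"

if command -v pipelinewise >/dev/null 2>&1; then
  # Run the pipeline, then list all pipelines with their last run status.
  $run_cmd
  pipelinewise status
else
  echo "pipelinewise CLI not found; activate the virtual environment first."
fi
```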
Issues
Critical
Both of the above issues need to be fixed separately for fastsync tap-s3-csv and singer for pipelinewise-tap-s3-csv.
Moderate