(WIP) Installation Guide
A quick start guide for installing from binaries or compiling from source.
Install and configure the database of choice. MADlib currently supports the following platforms:
- PostgreSQL
- Greenplum Database
- Apache HAWQ (incubating)
 
This guide describes the installation steps for PostgreSQL and Greenplum. (HAWQ installation steps will be added at a later date.)
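PostgreSQL platform notes:
- Ensure that PostgreSQL is installed with the Python extension (plpython) enabled.
- If the PostgreSQL connection environment variables (such as PGUSER, PGHOST, PGPORT, and PGDATABASE) are defined, madpack uses them, so you can omit the connection string and save some typing.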
 
- Download the MADlib binary.
  - Postgres: Get either the OSX or Redhat/CentOS binary from the MADlib download page.
  - Pivotal Greenplum Database: Download the .gppkg binary from Pivotal Network.
- Install the package at the OS level.
  - Postgres:
    - On OSX, double-click the installer package.
    - On Redhat / CentOS, run the following as root:
      yum install <madlib_package> --nogpgcheck
  - Pivotal Greenplum Database:
    - On Redhat / CentOS, run the following as gpadmin:
      gppkg install <madlib_package>
- Ensure that the environment is set up for your database deployment and that the database is up and running.
  - Ensure that psql, postgres, and pg_config are in your path:
    which psql
    which postgres
    which pg_config
  - Ensure that the database is started and running:
    psql -c 'select version()'
    The above may need user/port/password settings, depending on how the database has been configured.
- Run the MADlib deployment utility to install MADlib into each database that you want to use it in.
  - Postgres, if the connection environment variables are defined:
    /usr/local/madlib/bin/madpack -s madlib -p postgres install
    Otherwise, use a fully defined connection string:
    /usr/local/madlib/bin/madpack -s madlib -p postgres -c [user[/password]@][host][:port][/database] install
  - Pivotal Greenplum Database:
    /usr/local/madlib/bin/madpack -p greenplum install
    The above may need user/port/password settings, depending on how the database has been configured.
  - For more information on madpack:
    /usr/local/madlib/bin/madpack --help
- Run the install check to validate the installation.
  - Postgres:
    /usr/local/madlib/bin/madpack -s madlib -p postgres install-check
  - Pivotal Greenplum Database:
    /usr/local/madlib/bin/madpack -p greenplum install-check
    The above may need user/port/password settings, depending on how the database has been configured.
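
The path check in the steps above can also be scripted. The following is a minimal sketch, not part of MADlib: the helper name missing_tools is hypothetical, and it relies only on the POSIX `command -v` builtin to report which of the listed tools cannot be found.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MADlib).
# Prints the name of every listed tool that is NOT found on PATH,
# one per line. Prints nothing when all tools are available.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
    done
}

# Check the three tools this guide asks for.
missing=$(missing_tools psql postgres pg_config)
if [ -n "$missing" ]; then
    # $missing is left unquoted on purpose so the names join on one line.
    echo "Not on PATH:" $missing
else
    echo "psql, postgres, and pg_config are all on PATH"
fi
```

If anything is reported as missing, add the database's bin directory to PATH before running madpack.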
 
Requirements for building and installing MADlib from source:
- gcc (For OSX, Clang will work for compiling the source, but not for documentation.)
 - An installed version of Pivotal HAWQ, Pivotal Greenplum Database 4.2+ or PostgreSQL (64-bit) 9.2+ with plpython support enabled. Note: plpython may not be enabled in PostgreSQL by default.
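
The PostgreSQL version requirement above (9.2+) can be checked before building. The sketch below is an illustration only: version_at_least is a hypothetical helper, and it assumes that `pg_config --version` prints a string such as "PostgreSQL 9.6.24". It does not apply to Greenplum or HAWQ, which use their own version numbering, and it does not check for plpython.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MADlib).
# Succeeds if the first dotted number found in "$1" is at least
# major version "$2", minor version "$3". Returns 2 if no number is found.
version_at_least() {
    # Extract the first dotted number, e.g. "PostgreSQL 9.6.24" -> "9.6.24".
    ver=$(printf '%s\n' "$1" |
        sed -n 's/^[^0-9]*\([0-9][0-9]*\(\.[0-9][0-9]*\)*\).*/\1/p')
    [ -n "$ver" ] || return 2

    major=${ver%%.*}
    case $ver in
        *.*) rest=${ver#*.}; minor=${rest%%.*} ;;
        *)   minor=0 ;;   # e.g. "14beta1" carries no minor number
    esac

    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

# Only run the check when pg_config is actually available.
if command -v pg_config >/dev/null 2>&1; then
    if version_at_least "$(pg_config --version)" 9 2; then
        echo "PostgreSQL version is 9.2 or newer"
    else
        echo "PostgreSQL version is older than 9.2 or could not be parsed"
    fi
fi
```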
 
In $MADLIB_ROOT (the location of the MADlib source), run the following commands:
mkdir build
cd build
cmake ..
make
Above, we built the executables in the build folder. This can, however, be any user-named folder (henceforth called $BUILD_ROOT).
Deploy MADlib into the database with the MADlib package manager, madpack, located under $BUILD_ROOT/src/bin:
- to install, run `$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install`
- to make sure that the installation is successful, run `$BUILD_ROOT/src/bin/madpack -p postgres -c [user[/password]@][host][:port][/database] install-check`
- for more information on the usage of `madpack`, run `$BUILD_ROOT/src/bin/madpack --help`
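
The connection string passed with -c has the shape [user[/password]@][host][:port][/database]. As a minimal sketch, the hypothetical helper below (madpack_conn is not part of MADlib) assembles such a string from its parts and leaves out any part given as an empty string. It assumes that omitted parts can simply be dropped, as the bracket notation suggests.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MADlib).
# Usage: madpack_conn USER PASSWORD HOST PORT DATABASE
# Pass "" for any part you want to leave out.
# Prints a string of the form [user[/password]@][host][:port][/database].
madpack_conn() {
    user=${1-} password=${2-} host=${3-} port=${4-} database=${5-}
    conn=
    if [ -n "$user" ]; then
        conn=$user
        # The password is only meaningful together with a user.
        if [ -n "$password" ]; then
            conn="$conn/$password"
        fi
        conn="$conn@"
    fi
    conn="$conn$host"
    if [ -n "$port" ]; then
        conn="$conn:$port"
    fi
    if [ -n "$database" ]; then
        conn="$conn/$database"
    fi
    printf '%s\n' "$conn"
}

# Example (assumes $BUILD_ROOT points at your build folder):
# "$BUILD_ROOT/src/bin/madpack" -p postgres \
#     -c "$(madpack_conn "" "" 127.0.0.1 5430 madlibtest)" install
```

For example, `madpack_conn "" "" 127.0.0.1 5430 madlibtest` prints `127.0.0.1:5430/madlibtest`. A password given on the command line may be visible to other users in the process list, so the PGPASSWORD environment variable may be preferable.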
The following environment variables are used automatically by the madpack installer if no connection string is provided:
- User: PGUSER or USER (defaults to OS username)
- Password: PGPASSWORD (defaults to empty)
- Host: PGHOST (defaults to 'localhost')
- Database: PGDATABASE (defaults to OS username)
- Port: PGPORT (defaults to 5432)
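
To see which values would apply in your current shell, the fallbacks listed above can be written as shell parameter expansions. This is a sketch of the list only, not madpack's actual lookup code. The function name effective_pg_settings is hypothetical, and using `id -un` for the OS username is an assumption.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MADlib).
# Prints the connection settings implied by the environment,
# applying the defaults from the list above.
effective_pg_settings() {
    # OS username: $USER if set, otherwise ask the system (assumption).
    os_user=${USER:-$(id -un)}

    printf 'user=%s\n' "${PGUSER:-$os_user}"
    # Report only whether a password is set; never print it.
    if [ -n "${PGPASSWORD:-}" ]; then
        printf 'password_set=yes\n'
    else
        printf 'password_set=no\n'
    fi
    printf 'host=%s\n' "${PGHOST:-localhost}"
    printf 'database=%s\n' "${PGDATABASE:-$os_user}"
    printf 'port=%s\n' "${PGPORT:-5432}"
}

# Show the settings for the current environment.
effective_pg_settings
```

Running this before madpack makes it easier to spot a variable that is unset or points at the wrong database.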
An example of deploying MADlib using the environment variables:
export PGPORT=5430
export PGHOST=127.0.0.1
export PGDATABASE=madlibtest
$BUILD_ROOT/src/bin/madpack -p postgres install