This Cloud + Data repo is an interactive workshop for doing world-class analytics in the cloud. In this session we will demonstrate how quickly we can provision, configure, and deploy a a cloud analytics stack.
In this Cloud + Data event, we will build a cloud data analytics infrastructure with best of breed tools. The basic architecture is:
Source -> Pipeline -> Warehouse -> Visualization
The tools we will use for this process are:
- Source: Salesforce Developer Edition
- Source: Google Sheets
- Pipeline: Fivetran
- Warehouse: Snowflake
- Analytics: Tableau Online
In this event, data from Salesforce will be replicated with Fivetran into Snowflake, where an analytical data model will be applied. Data will then be visualized in Tableau dashboards.
Salesforce data -> Fivetran replication -> Snowflake warehouse -> Tableau dashboards
- Snowflake Trial: https://trial.snowflake.com $400 free credits, no credit card: choose Enterprise edition, Amazon Web Services, US West (Oregon), this will take 15 minutes
- Salesforce Developer Edition: https://developer.salesforce.com Free account with sample data & API access, no credit card needed
- Fivetran Trial: Sign up through Snowflake Partner Connect inside Snowflake
- Tableau Online Trial: https://www.tableau.com/partner-trial 14 days, no credit card
To connect & pull data from Salesforce there are two options:
- EXISTING ACCOUNT: Use your active Salesforce account. The Salesforce account plan level must be Enterprise or higher OR have purchased API calls.
- DEVELOPER ACCOUNT: Sign up for a new free Salesforce developer account. Developer accounts come with mock data but have free API calls. Sign up at https://developer.salesforce.com/.
To connect & pull data from Google Sheets:
- Sheet Id: The Google Sheet is public to the web. The Id of the sheet is:
1anwwZf62cF_9oShKdCa188z6oPqBO2U96psCya0bgCY
and it is accessible at this link. - Named Range: The named range is called:
QuotaTable
- Sign up for a Snowflake Trial with $400 credit at https://trial.snowflake.com.
- An email will come within 15 minutes when your Snowflake account is ready.
- Once you receive the email, login to your Snowflake account.
- In Snowflake, click the user menu (top right corner) → Switch Role → Select ACCOUNTADMIN.
- Click the Partner Connect tab.
- Choose the Fivetran option and click Connect. This will initiate an automatic process to create a partner trial account in Fivetran that will connect to Snowflake.
- Click Activate in the final step.
- Look for an email from Fivetran and complete the sign up.
- In the Fivetran Dashboard, click Connectors.
- Click + CONNECTOR and search for Salesforce.
- Click Salesforce and change the schema name to
salesforce_cpd
. It is very important for the Tableau Dashboard section later to have it be named exactlysalesforce_cpd
. - Click AUTHORIZE and complete the login of Fivetran into Salesforce.
- Click SAVE & TEST. When the tests are completed, click < View Connector.
- Fivetran will download a list of objects available for sync from Salesforce.
- Under the Schemas header click the
-
button just to the left of the Sync Table header. We only need a few tables, and this will remove every table from the sync. Then go through the list of objects and add back just the necessary tables, which areAccount
,Lead
,Opportunity
, andUser
. - In the upper right, click the slider to Enable the sync of data from Salesforce into Snowflake.
- Above the list of tables, change Replication Frequency slider over to
24h
. This will kick off an initial sync of historical data.
There are two analytical processes that we will perform to add value to Salesforce data:
- Data Modeling to reshape our normalized Salesforce data into a useful format for analytics.
- Dashboard Design to effectively design appealing dashboards to see and understand data, in a way that drives action.
Tableau often requires a simplified dataset that is in a single table or star schema. In our example, this will take the form of a dimensional model that we can install. A sample dimensional model is available in this repository at /model.sql
or https://github.com/DecisiveData/CloudPlusData/blob/master/model.sql.
- In Snowflake, click Worksheets and open a blank worksheet.
- Copy/paste the contents of
/model.sql
into the blank worksheet in Snowflake. - At the top of the worksheet, check the box for All Queries and then click the Run button.
- A Tableau free trial can be created at https://www.tableau.com/products/online/request-trial.
- Fill out the registration form, then wait briefly for Tableau to provision.
If you need Tableau Desktop, connects to Snowflake using an ODBC driver that must be installed as a pre-requisite.
- In Snowflake, click the Help icon in the upper right and click the Download... option.
- On the left menu, select ODBC Driver and download the appropriate ODBC driver for your operating system and on Windows the appropriate bitness (x86 or x64).
- Once the ODBC driver is downloaded, complete the install process on your local computer.
- Go to Tableau Online https://online.tableau.com/#/signin
- Click on Create Workbook
- Go to the Connectors tab and connect to Snowflake
- Fill in the server, username, and passoword
If you do not want your Fivetran data pipeline to continue to burn Snowflake credits, please remember to log back into Fivetran and Pause your connector.
The password is the same for the below example accounts. Ask a leader for it.
- URL: https://login.salesforce.com
- User:
[email protected]
- User:
[email protected]
- User:
[email protected]
- URL:
- User: ``
- Warehouse: ``
- Database:
PC_FIVETRAN_DB
- Schema:
SALESFORCE_CPD
The Populate Salesforce application can create additional realistic data beyond what the Salesforce Developer Edition installs for you. Install it from here: https://appexchange.salesforce.com/listingDetail?listingId=a0N3A00000EO5smUAD