Query Heap data from your warehouse
Have you ever wanted to join website data with other data such as your customer relationship management (CRM) tool or order management system (OMS)? While Heap provides rich integrations to bring data into Heap, sometimes it’s helpful to bring data out of Heap for analysis. Enter Heap Connect, Heap’s solution for sending data to your warehouse.
In this post, we’ll cover:
- Potential use cases for Heap Connect
- Setting up and getting started with Heap Connect
- How Brooklyn Data’s Heap data is used in the modern data stack
- Determining if Heap Connect is right for you
Use cases
Let’s review a couple use cases for Heap Connect: visualizations and joining Heap to other data:
Visualization
The Heap user interface (UI) provides an excellent, always up-to-date medium for product analytics. However, there are a few use cases that are better handled by a business intelligence (BI) tool:
- Enhanced visuals: Heap provides the following chart types in its user interface: line, bar, stacked, pie, and table. With Heap data in your BI tool of choice, you can create enhanced visualizations, including maps. In addition to maps, BI tools open other possibilities including scatter plots, box plots, histograms, and more.
- Robust filters: BI tools have increased filter flexibility beyond what is available in the Heap UI. Here’s an example of various filter options on date ranges in Looker:
- Dynamic dashboards: Dashboards in the Heap UI have some interactive capabilities, but BI tools add additional functionality, including use cases such as:
- The use of parameters to instantly change the grain of all trend charts from ‘daily’ to ‘monthly’
- Linking to a specific order in your order management system (OMS) that needs action
- Visualizing Heap alongside other platforms: Show Heap data alongside other KPIs in an executive overview report. For example, an order conversion funnel can be visualized alongside actual sales to paint a more complete picture of how user behavior impacts the bottom line.
Joining Heap to other data
A very powerful use case for Heap Connect is to join Heap data with other platforms, such as a CRM. By joining Heap data to a CRM, users can unlock insights by combining product analytics with customer data. In this example, Heap gives a sample query of how to join Heap data to Salesforce, a popular CRM.
Let’s take this one step further, and pretend that we are a B2B company who is a featuring a new product on our website. If one of our current customers is showing interest in the new product on our website, we can use a reverse ETL tool such as Census or Hightouch to send these users to our various marketing platforms. A common use case for reverse ETL is to take data that is modeled and trusted from your database and send it to customer engagement platforms where action can be taken on the data. In the next section, we’ll review how to set up and get started with Heap Connect.
Set up Heap Connect
Connecting Heap Connect
Configuring Heap Connect is fast, and Heap provides a helpful guide for connecting to BigQuery, Snowflake, Redshift, and S3. Once connected, Heap automatically includes some tables such as users
, sessions
, pageviews
, and all_events
.
Syncing events
While no additional action is needed to sync data from users
, sessions
, and pageviews
, there is one additional step needed to sync specific events to the all_events
table. To sync an event to the warehouse, simply navigate to the event in the UI, and toggle the button in the 'Connect' section:
Reviewing Schema
Here’s a (partial) view of the sessions table delivered by Heap. It includes various IDs and other information tied to the session such as device type, landing page, and referral information.
To learn more about the fields Heap includes, take a look at their sessions schema. In addition to tables with product analytics data, Heap provides meta data such as this _sync_history table:
Getting started with common queries
Once your data is connected, it can be daunting to know where to start. Thankfully, Heap provides a list of common queries to get you started. After getting comfortable running queries, you may find there are certain queries that you would like updated daily, and other queries that are really complex and require business logic that has been validated by a key stakeholder. Enter the modern data stack, which can take your usage of Heap Connect to the next level. In the next section, we’ll review how the Heap data from Brooklyn Data’s website fits in the modern data stack.
How Heap Connect fits in Brooklyn Data’s stack
At Brooklyn Data, our data from Heap Connect in this example follows a common workflow seen in the modern data stack: Snowflake → dbt → Looker. In the ‘Reviewing the schema’ section, we shared an image of the sessions
table as the data comes from Heap. Here’s an example of how we use the sessions data downstream.
Transforming data using dbt
In the image below, you’ll see the path of data involved in populating our final sessions
model using dbt, which is visible in Looker. Here’s a little more information about what happens in each stage:
-
The green boxes indicate the raw data in Snowflake.
-
Next, the boxes with the
source_
prefix perform basic transformations on the data, such as standardizing time zones and concatenating fields likeuser_id
andsession_id
, to generate a unique session ID assession_id
is not unique in the raw data. -
Robust transformations of the data occur in the boxes with the
int_
prefix, which include calculating sequences, such as the session number of a user, or the number of pageviews in each session. Data from multiple tables are joined together. -
Finally, we have our
heap_sessions
model, which is what Looker connects to.
Visualizing Data in Looker
After configuring a model in Looker, an Explore is used to visualize data. Here is a blank canvas for our sessions Explore:
In the example below, we look at the landing pages that are driving the most pageviews per session, grouped by device type. In the next section, we’ll give some guidance on determining if Heap Connect is right for you.
Is Heap Connect right for me?
Heap Connect isn’t for all users of Heap. Here are key questions to help determine if Heap Connect is right for you:
Can I already do everything I need to do in the UI?If the Heap UI handles all of your use cases, then you probably don’t need Heap Connect.
What benefit do I get in bringing data out of Heap?Are there strong business use cases to bringing data in to your warehouse? Think about the different ways Heap data can transform your business operations. For example, in the ‘Joining Heap to other data’ section, we discussed an end-to-end use case of combining Heap and Salesforce data to power marketing efforts.
Do I have the resources available to maintain, troubleshoot, and provide guidance on data?While using the Heap UI doesn’t require a technical person, technical people are required to properly install and maintain Heap Connect. If you already have people in your organization that support a modern data stack, they would be ideal for supporting Heap Connect.
Want more Heap? Learn about autocaptured properties and capturing data from the past!