Development of a complex BI solution in Tableau and BigQuery

Viktoria Martianova
travel audience
Published in
9 min readMay 11, 2022

--

This post is intended to share our experience of building an externally facing Business Intelligence solution in Tableau. Although it is specific to our use case, we believe that our approach and practices might be helpful for those who work on similar projects or anyone else developing complex Tableau based analytics products in teams.

Photo by Launde Morel on Unsplash

Introduction

Working with destination marketing organizations (DMOs), who usually need to adapt very quickly to changing circumstances, we focused on taking our reporting and analytics to the next level. The current situation with international travel restrictions has emphasized the necessity to constantly keep our clients up to date even more than ever. To address it we have launched a project to build a standardized BI solution for DMOs called Destination Suite.

Usually, launching this kind of project would mean leveraging embedded analytics as a part of a web application. However, in our case we decided to take another approach. We needed to produce feasible and sharable results very quickly without involving frontend and backend engineers from other teams, until we finalize the solution. To make it happen we joined forces with our product team and UI/UX designer and aimed at developing a set of feature-rich dashboards connected by a single navigation. No dependencies on other technical teams also made it easier for us to follow iterative development cycles and start receiving stakeholders’ feedback as early as possible.

We have decided to build Destination Suite in Tableau, as we have been successfully using this platform to provide internal analytics since 2017.

Solution architecture

Having in mind that our path towards the final standardized solution would require some time and ability to experiment with customer-specific adjustments in the beginning, we have decided to wait with introduction of Row Level Security. Therefore, we have chosen to physically separate each client’s solution implementation first, avoiding business logic duplication at the same time.

You can see how exactly we managed to achieve this in the following picture:

Destination Suite architecture
Destination Suite architecture

At travel audience we have our data warehouse in BigQuery. As described in one of the previous posts, our data engineering team makes all upstream data available in BigQuery, providing other data teams access to the views on a variety of fact and meta tables (fact_view_1, meta_view_1, etc.).

Based on requirements for the data model, we developed a view (core_view) that aggregates all necessary facts over a list of required dimensions, as well as contains additional calculations and meta enrichments. This view does not include a filter on the partition field and, therefore, is not supposed to be queried directly. Its purpose is solely restricted to business logic centralization.

core_view is queried by client specific views (client1_view, client2_view) containing filters that are relevant for those clients, including filters on the partition field (we use date for this), client_id and other ones if needed.

Client specific views are connected to corresponding published Tableau data sources (client1_data, client2_data) with extracts. These data sources are used to feed client deployments of a so-called Master workbook (Client 1 Destination Suite, Client 2 Destination Suite).

Master workbook

Concept

Master workbook represents the latest release of Destination Suite. This workbook contains all approved modules (a set of dashboards grouped together in the navigation) and is client agnostic, what means that there are no attributes related to a particular client, it has a client name placeholder and uses a generic client logo. As mentioned before, we decided to go with 100% Tableau based solution to reduce dependency on other technical teams at this stage of the product evolution.

Frontend complexity and design best practices

Here are some of the requirements we had to satisfy:

  • web-application look and feel
  • branded design, based on a style guide
  • centralized navigation
  • responsive layout
Destination Suite look and feel
Dashboards look and feel

In collaboration with our UI/UX designer, we managed to agree on feasible dashboard mock-ups that follow the style guide. To make dashboards responsive, they are developed with respect to fixed minimal dimensions and any possible maximal size. Dashboard objects are organized using tiled layout containers. In order to ensure proper quality of high-resolution images, names of customer logos and other dashboard images follow @2x naming convention.

To make it easier to respect the style guide, we defined product specific colour palettes in the Preferences.tps file. Following the same principle, custom shapes are organized in folders to enable automatic shapes assignment when possible.

Another useful step was to the introduction of a dedicated design system represented by specifications of standard components and general design guidelines used to build dashboards within Destination Suite. These guidelines must be respected across all dashboards to ensure clean and consistent look and feel.

For example, these are some of the sections described in our design system: dashboard layout (e.g. header, footer, navigation), reusable dashboard objects (including tooltips, charts, legends etc.).

Destination Suite design system
Destination Suite design system: dashboard section and item titles

Layered architecture

If you look at the end-to-end solution, you will notice that it is designed with respect to layered architecture pattern, where each layer serves a dedicated purpose and has a well-defined set of requirements. A typical Destination Suite implementation consists of 3 layers:

  • BigQuery view that has a standard schema and includes client specific filters. Its purpose is to produce an initial dataset for a particular client. This dataset is based on the central data model that is defined in the core_view.
  • Tableau data source with a data extract that is connected to a client specific BigQuery view. This data source consumes data returned by BigQuery view and applies formatting such as number formats and additional settings like default aggregations for measures and field aliases to complete the data source configuration.
  • Destination Suite Master workbook that consumes data from the published Tableau data source and contains all necessary calculated fields and parameters, as well as the dashboards themselves. All colour palettes and custom shapes are set in the Master workbook too.
Destination Suite layers
Destination Suite layered architecture

Specifying layers this way has 3 main advantages:

  • Transparency — it is clear where each setting should be controlled. For example, all measures coming directly from a BigQuery view should have their number formats and default aggregations defined at the Tableau data source level instead of the Master workbook.
  • Maintainability — it is easier to locate the source of possible errors and add modifications.
  • Ease of deployment (described below)

The decision to read data from a published Tableau data source instead of connecting directly to a BigQuery view played a significant role in speeding up Destination Suite deployment process. The very first versions of Destination Suite were connected directly to BigQuery views. However, every time when we needed to add new fields to the view or change existing fields, we had to reconnect the workbook to another BigQuery view as the old connection became irrelevant. As a result, all fields that had been renamed in Tableau lost their aliases. Moreover, some other settings like number formats, custom colours etc. were dropped too. It could take up to an hour to recover everything after just a tiny change in the schema. This is a known problem that has been discussed in Tableau community for a while.

After a deeper look at the connection between what was changed and which settings were dropped, we realized that complex settings were affected in most of the cases. For example, we lost some sorts on dimensions that were renamed inside the workbook. This finding together with a comparison of various Tableau file types and lots of experiments resulted in the layered architecture described above. Thanks to it we eliminated almost all issues related to data source replacement, except for colour palettes and custom shapes reassignment. For now, we added these remaining steps to Destination Suite deployment process.

Shared ownership, development and version control

The Master workbook is usually touched only when there is a new dashboard that is fully approved and needs to be added to the product release or in case of making confirmed changes in the existing dashboards. After introducing the Master workbook concept, one of the biggest challenges for us was to define processes required for its development and maintenance.

Old development process and its challenges

While working on internal projects, be that a self-service data source or a workbook, we have not faced any problems related to development processes and version control. Usually, these tasks have a well-defined scope that can be fully delivered within a couple of weeks or iterations (we use 2-week sprints). Someone from the team works on a task, someone else reviews it. After that the final deliverable is published on our Tableau Server for a stakeholder review. If everything looks properly, it is officially made available for a dedicated audience in one of our production projects on the server. Whenever there is a change that needs to be implemented, a new task is created for it. Anyone from the team takes this task, downloads the latest data source/workbook version from the server and follows the process described above.

With the Master workbook this approach very quickly appeared not to be the right one for a few reasons:

  • Firstly, the workbook had a lot of custom elements, such as dashboard images, shapes and colour palettes. They were all stored locally on data analysts’ machines and went out of sync in a couple of iterations.
  • Secondly, the default Tableau revision control did not give much visibility on the changes implemented in each of the revisions, especially considering additional changes in colour palettes, dashboard images and shapes.

New development process with Git

To overcome these shortcomings, we used Git for managing the Master workbook, as well as other solution components, for instance, default data models. Here is the structure of a GitHub repository that we introduced:

Destination Suite GitHub repository
Destination Suite GitHub repository structure

Additionally, we respect the following rules while working on Destination Suite and its components:

  • Everyone in the team uses the same Tableau desktop version.
  • There is always maximum one person working on the Master workbook.

The second point means that we never open more than one pull request for anything from tableau folder. It is necessary to avoid potential corruptions of the Master workbook. Although we store the Master Tableau workbook file in GitHub, we always modify it using Tableau Desktop. Having multiple pull requests containing different modifications might cause unexpected merge conflicts that would be almost impossible to resolve by looking at the .xml code of the workbook.

To sum up, introduction of Git for version control of the Master Workbook, colour palettes, dashboard images, custom shapes and data models has helped us to overcome the main challenges we faced using the old development process, as well as let us leverage pull requests to review changes.

Conclusion & next steps

It was quite a journey to refine our approach and practices described above. We learned a lot, but the key lessons can be summarized as follows:

  • Invest enough time in solution design. Think about ease of deployment and change management in advance. Ask questions like “Which steps will I have to take if I need to modify existing data model or add a new field to it? How much time will it take? May it introduce any breaking changes?” Same works for deployment process.
  • Use version control for Tableau workbooks, Preferences.tps, etc. This is especially critical if you are planning to work on your product together with other developers. It is hard to explain how much easier our lives became after switching from OneDrive to Git to manage Tableau workbooks and other artefacts.
  • And finally, obvious, but very important reminder — document specifications, design guidelines and processes. Especially processes, as it might be that you will instantly spot possible improvements while noting down the steps.

The solution and best practices described above have been in use for almost a year now. Although there are still some manual steps (like reassigning colour palettes) left in the deployment process, they can be eliminated with some changes in the Master workbook design. With this in mind, we consider further work towards gradual automation of deployment process with the help of Tableau APIs.

Hopefully you found this post relevant and helpful. Should you have any questions or suggestions, feel free to write them in the comments.

--

--

Passionate about Data Analytics, BI development, Requirements Engineering, Data Governance & BI solutions usability.