#012: Don’t Sleep on Open-Source Data Viz Tools

Sep 11, 2022

In this edition, I’m sharing 3 questions to ask before spending another dollar on a reporting tool.

Less than a decade ago, the options for data visualization were limited to a few big players.

But today there are multiple highly functional open-source tools on the market.

Unfortunately, they still get overlooked when compared to the more established products.

But when does it make sense to go open-source over a paid, enterprise option?

Let’s discuss.


What’s your end goal?

The first question is so simple it might seem strange.

But ask yourself: why am I creating these dashboards in the first place?

What questions are being answered and are the results truly useful?

It’s so easy to get caught up in features that we forget the original goal.

I’ve heard this referred to as “shiny object syndrome.”

Here’s one way to think about it:

Anything that isn’t actionable for a business is smoke and mirrors.

Or worse, a waste of time.

Recommendation:

  • Are you focused on a few KPIs that can be understood with simple dashboards?
    • Then open-source tools can likely get the job done.
  • Or do you rely heavily on customization and integrations?
    • Then paid products are a worthy expense.

 

How complex is the data?

Of course, many companies do indeed have complex data.

In those cases, relationships, modeling and automations are needed outside of the database.

Power BI is an example of a tool that’s great for these types of structures.

Looker, too, offers its own language (LookML) to slice, dice and join data.

But just because you “can” do this doesn’t mean you always need to.

Recommendation:

  • Can “reporting” logic instead be handled in the Data Mart table(s)?
    • Then consider a direct connection through an open-source tool.
  • Or do you require extra modeling across many different datasets?
    • Then stick with the enterprise tool and maximize its capabilities.
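To make the first option concrete, here’s a minimal sketch of pushing “reporting” logic into a Data Mart table so the dashboard only needs a plain SELECT. It uses Python’s built-in sqlite3 as a stand-in for a real warehouse, and the table and column names (raw_orders, mart_daily_revenue) are made up for illustration.

```python
import sqlite3

# In-memory database standing in for a warehouse (hypothetical schema).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE raw_orders (
        order_id INTEGER, order_date TEXT, status TEXT, amount REAL
    );
    INSERT INTO raw_orders VALUES
        (1, '2022-09-01', 'complete', 100.0),
        (2, '2022-09-01', 'refunded', 40.0),
        (3, '2022-09-02', 'complete', 60.0);

    -- The "reporting" logic (filtering, aggregating) lives in the mart,
    -- not in the visualization tool.
    CREATE TABLE mart_daily_revenue AS
    SELECT order_date, COUNT(*) AS orders, SUM(amount) AS revenue
    FROM raw_orders
    WHERE status = 'complete'
    GROUP BY order_date;
""")

# What the dashboard would run: a simple query with no extra modeling.
rows = conn.execute(
    "SELECT order_date, orders, revenue "
    "FROM mart_daily_revenue ORDER BY order_date"
).fetchall()
print(rows)
```

If a query this simple covers what your dashboards need, a direct connection from an open-source tool is likely enough.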

 

How well can it be self-managed?

The reality is that a truly open-source tool adds overhead.

Yes, you’ll save money on licenses.

But costs will come in the form of long-term maintenance.

Unlike a paid product, you’re responsible for:

  • Hosting
  • Networking
  • and other admin tasks

But you’ll also have free rein to use it as you please (for the most part).

Recommendation:

  • Is your team willing and skilled enough to host, manage and document this process?
    • If so, open-source tools could be a great option, and more fun to work with.
  • Or do you already have enough overhead and have no desire to add more?
    • Then don’t bother with open-source and just pay to make this problem go away.

 

Dashboards are often the only place people interact with data.

But before you assume you need to pay to have nice visuals, consider open-source options.

You can still provide a great experience all while avoiding up-front costs by understanding your:

  1. End goal
  2. Data complexity
  3. Ability to self-manage


Lastly, here’s a sample list of tools in both categories:

Open-Source:

  • Metabase
  • Apache Superset
  • Lightdash
  • Redash
  • Grafana

Paid:

  • Power BI
  • Tableau
  • Looker
  • Qlik
  • Sisense

 
