#012: Don’t Sleep on Open-Source Data Viz Tools

newsletter Sep 11, 2022

In this edition, I’m sharing 3 questions to ask before spending another dollar on a reporting tool.

Less than a decade ago, the options for data visualization were limited to a few big players.

But today there are multiple highly functional open-source tools on the market.

Unfortunately, they still get overlooked when compared to the more established products.

But when does it make sense to go open-source over a paid, enterprise option?

Let’s discuss.


What’s your end goal?

The first question is so simple it might seem strange.

But ask yourself: why am I creating these dashboards in the first place?

What questions are being answered and are the results truly useful?

It’s so easy to get caught up in features that we forget the original goal.

I’ve heard this referred to as “shiny object syndrome.”

Here’s one way to think about it:

Anything that isn’t actionable for a business is smoke and mirrors.

Or worse, a waste of time.

Recommendation:

  • Are you focused on a few KPIs that can be understood with simple dashboards?
    • Then open-source tools can likely get the job done.
  • Or do you rely heavily on customization and integrations?
    • Then paid products are a worthy expense.

 

How complex is the data?

Of course, many companies do indeed have complex data.

For them, relationships, modeling and automations are needed outside of the database.

Power BI is an example of a tool that’s great for these types of structures.

Looker, too, offers its own language (LookML) to slice, dice and join data.

But just because you “can” do this, doesn’t mean you always need to.

Recommendation:

  • Can “reporting” logic instead be handled in the Data Mart table(s)?
    • Then consider a direct connection through an open-source tool.
  • Or do you require extra modeling across many different datasets?
    • Then stick with the enterprise tool and maximize its capabilities.
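
To make the first option concrete, here’s a minimal sketch of pushing “reporting” logic into a Data Mart table. It uses Python’s built-in sqlite3 as a stand-in for your warehouse. The table names (orders, mart_daily_revenue), the columns and the sample rows are all made up for illustration.

```python
import sqlite3

# In-memory SQLite stands in for a real warehouse (assumption for this sketch).
conn = sqlite3.connect(":memory:")

conn.executescript("""
    -- Hypothetical raw table with made-up sample rows.
    CREATE TABLE orders (
        order_id   INTEGER PRIMARY KEY,
        order_date TEXT,
        region     TEXT,
        amount     REAL,
        status     TEXT
    );
    INSERT INTO orders VALUES
        (1, '2022-09-01', 'East', 100.0, 'complete'),
        (2, '2022-09-01', 'East',  50.0, 'refunded'),
        (3, '2022-09-01', 'West',  75.0, 'complete'),
        (4, '2022-09-02', 'East',  25.0, 'complete');

    -- The "reporting" logic (filter + aggregate) lives in the mart table,
    -- not in the dashboard tool.
    CREATE TABLE mart_daily_revenue AS
    SELECT order_date,
           region,
           COUNT(*)    AS completed_orders,
           SUM(amount) AS revenue
    FROM orders
    WHERE status = 'complete'
    GROUP BY order_date, region;
""")

# A dashboard tool would only need a plain SELECT against the mart table.
rows = conn.execute(
    "SELECT order_date, region, completed_orders, revenue "
    "FROM mart_daily_revenue ORDER BY order_date, region"
).fetchall()

for row in rows:
    print(row)
```

With the logic baked into the table, the visualization layer only has to chart columns that already exist, which is the kind of job a direct connection from an open-source tool can usually handle.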

 

How well can it be self-managed?

The reality is that a truly open-source tool adds overhead.

Yes, you’ll save money on licenses.

But costs will come in the form of long-term maintenance.

Unlike a paid product, you’re responsible for:

  • Hosting
  • Networking
  • and other admin tasks

But you’ll also have free rein to use it as you please (for the most part).

Recommendation:

  • Is your team willing and skilled enough to host, manage and document this process?
    • If so, open-source tools could be a great option and be more fun to work with.
  • Or do you already have enough overhead and have no desire to add more?
    • Then don’t bother with open-source and just pay this problem away.
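
If you want to put rough numbers on this trade-off, here’s a back-of-the-envelope sketch. Every figure below (seat count, license price, hosting bill, admin hours, hourly rate) is a hypothetical placeholder, not real pricing, so swap in your own estimates.

```python
def annual_cost_paid(seats, license_per_seat_per_month):
    """Yearly license spend for a paid tool."""
    return seats * license_per_seat_per_month * 12


def annual_cost_self_hosted(hosting_per_month, admin_hours_per_month, hourly_rate):
    """Yearly hosting spend plus the cost of the team's maintenance time."""
    return (hosting_per_month + admin_hours_per_month * hourly_rate) * 12


# Hypothetical inputs: replace with your own estimates.
paid = annual_cost_paid(seats=20, license_per_seat_per_month=30)
self_hosted = annual_cost_self_hosted(
    hosting_per_month=100,
    admin_hours_per_month=8,
    hourly_rate=60,
)

print(f"Paid tool:   ${paid:,} per year")
print(f"Self-hosted: ${self_hosted:,} per year")
```

The point isn’t the exact totals. It’s that maintenance hours belong in the comparison next to the license fee, and a few extra admin hours per month can erase the savings.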

 

Dashboards are often the only time people interact with data.

But before you assume you need to pay to have nice visuals, consider open-source options.

You can still provide a great experience all while avoiding up-front costs by understanding your:

  1. End goal
  2. Data complexity
  3. Ability to self-manage


Lastly, here’s a sample list of tools in both categories:

Open-Source:

  • Metabase
  • Apache Superset
  • Lightdash
  • Redash
  • Grafana

Paid:

  • Power BI
  • Tableau
  • Looker
  • Qlik
  • Sisense

 
