#027: Embracing AI (ChatGPT) as a Data Engineer

Jan 07, 2023

After weeks of watching from the sidelines, I finally decided to give ChatGPT a try.

Not only was I blown away, but I also found promising uses for data engineers who choose to embrace it.

If you're unfamiliar, ChatGPT is a free AI chatbot that answers questions in conversational text, and its usage has exploded since it launched about a month ago.

So today, I'll share how I think ChatGPT can help data engineers:

  1. Write code faster
  2. Learn more efficiently
  3. Research more effectively

 

ChatGPT can start writing code for you based on any request

Let’s say you’re using dbt and want to work with JSON data.

You know it's doable with a macro (dbt's version of a function), but you're not quite sure how to write it.

Ask ChatGPT to “write a flatten_json() macro in dbt for snowflake database”.

A pretty specific scenario, right?

Within seconds you’ll have a pre-written macro along with an explanation of how to use it.

While it won't be 100% correct, you can use it as a starting point and tweak as needed.

Using AI this way can drastically cut down development time while also promoting good practices.
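To make the idea concrete, here is a minimal sketch of what "flattening JSON" means, written in plain Python rather than as a dbt macro. This is not ChatGPT's output and not the macro from the prompt above. The function name mirrors the prompt, but the `sep` parameter and the sample record are my own assumptions for illustration.

```python
from typing import Any


def flatten_json(record: dict[str, Any], parent_key: str = "", sep: str = "_") -> dict[str, Any]:
    """Flatten nested dicts into one level, joining nested keys with `sep`.

    Illustrative only: lists are left untouched, which a real
    implementation would probably need to handle.
    """
    flat: dict[str, Any] = {}
    for key, value in record.items():
        # Prefix the key with its parent's name once we are below the top level.
        full_key = f"{parent_key}{sep}{key}" if parent_key else key
        if isinstance(value, dict):
            # Recurse into nested objects and merge their flattened keys.
            flat.update(flatten_json(value, full_key, sep))
        else:
            flat[full_key] = value
    return flat


# Hypothetical sample record, made up for this example.
sample = {"id": 1, "customer": {"name": "Ada", "address": {"city": "London"}}}
print(flatten_json(sample))
# {'id': 1, 'customer_name': 'Ada', 'customer_address_city': 'London'}
```

A generated macro would express similar logic in SQL and Jinja, which is exactly the part you'd still need to review and tweak yourself.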

 

ChatGPT can help you learn more efficiently with clear, high-level guides

Regardless of experience, there’s always something new to learn.

But the hardest part is often knowing where to start or what to do.

So why not ask ChatGPT?

For example, I’ve never worked with Hadoop, so I asked it:

“how can a beginner start using hadoop”

And here’s what it gave me: a nice overview of key topics and how to get started.

This is very different from aimlessly clicking through normal search-engine results.

And remember, this platform is still very new.

The results should only improve with time.

 

ChatGPT can help you research more effectively

The number of tools and approaches in the data world can make your head spin.

The problem is rarely a lack of quality options, but rather having to pick just one.

While I don’t suggest making a tooling decision based on one ChatGPT result, it can still help you gather ideas.

For example, a common decision in the database world is Snowflake vs Databricks.

ChatGPT gave me a nice breakdown of the two in under 10 seconds.

 

After seeing ChatGPT in action, my gut tells me this type of AI will inevitably keep growing in mainstream popularity.

But despite what some fear, I don’t see it as a replacement for data engineers.

In fact, I’d double down on learning more technical skills.

No result will be 100% correct, especially when it comes to custom code, which means you still need a deeper understanding to tweak and apply it properly.

I encourage you to embrace this technology as an amazing tool to facilitate our work, not replace it.

To summarize, ChatGPT has great potential to help data engineers write code faster, learn more efficiently and research more effectively.

(and much more)

 

