#005: 3 Ways to Optimize Your Git Projects

Jun 05, 2022

Version control platforms have become the central hub for modern data teams.

This is where all of our code moves and where we spend a large portion of our time.

In today’s edition, I want to share 3 simple ways to optimize your Git project so you get the most out of this critical piece of your workflow.

 

Tip 1: Create a Request Template

All code must be merged to the “master” or “main” branch before it is considered complete. 

Each platform handles this through a dedicated request form, such as a pull request on GitHub or a merge request on GitLab.

Rather than starting from scratch on each request, you can create a template.

Check out my tutorial on how to create a PR Template on GitHub.

 

How is this helpful?

Allows you to be more efficient

Cut out the time spent formatting descriptions of changes, which is a non-value-add activity.

Instead, focus that time on providing the important details so that the code can be properly reviewed by your teammates and merged. 

 

Helps catch common mistakes

We often develop quickly and may accidentally miss something obvious.

It can be frustrating as a reviewer to constantly be calling out simple misses.

A template with a checklist prompts developers to confirm these items before requesting a review, which makes code reviews more productive from the start.

 

Can be easily adjusted

The template itself is version controlled within the same project as a Markdown (.md) file.

You can adjust the template as needed so that it always stays up to date as your team requirements change.
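
To make this concrete, here is a minimal Python sketch that renders a checklist-style template and writes it into a repository. It assumes GitHub, which picks up a file named pull_request_template.md inside the .github folder. The section headings, the checklist items, and the helper names render_pr_template and write_pr_template are illustrative only, not a prescribed format.

```python
from pathlib import Path


def render_pr_template(checklist: list[str]) -> str:
    """Build the Markdown body: a description section plus a checkbox list."""
    lines = [
        "## Description",
        "<!-- What changed and why? -->",
        "",
        "## Checklist",
    ]
    # Each item becomes an unchecked Markdown checkbox.
    lines += [f"- [ ] {item}" for item in checklist]
    return "\n".join(lines) + "\n"


def write_pr_template(repo_root: str, checklist: list[str]) -> Path:
    """Write the template to .github/pull_request_template.md (GitHub assumption)."""
    path = Path(repo_root) / ".github" / "pull_request_template.md"
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(render_pr_template(checklist), encoding="utf-8")
    return path
```

Calling write_pr_template(".", ["Tests added or updated", "Documentation updated"]) from the project root would create the file. Because it lives in the repository, any later change to the checklist goes through the same review process as the rest of your code.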

 

Tip 2: Use Tags

While it is easy to build your project indefinitely and without any stopping points, taking this approach will only add more work in the long run.

Instead, use tags as a way to mark certain “versions” or “stopping points” in your project.
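
In practice, marking a stopping point usually comes down to two Git commands: create an annotated tag, then push it to the remote. The Python sketch below builds and runs those commands. The helper names, the example tag name, and the "origin" remote are assumptions for illustration.

```python
import subprocess


def tag_commands(tag: str, message: str, remote: str = "origin") -> list[list[str]]:
    """Return the Git commands that create an annotated tag and publish it."""
    return [
        ["git", "tag", "-a", tag, "-m", message],  # annotated tag on the current commit
        ["git", "push", remote, tag],              # tags are not pushed by default
    ]


def create_and_push_tag(tag: str, message: str, remote: str = "origin") -> None:
    """Run the commands in order, stopping at the first failure."""
    for cmd in tag_commands(tag, message, remote):
        subprocess.run(cmd, check=True)
```

For example, create_and_push_tag("v1.0.0", "First stable version") would tag the commit you currently have checked out.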

 

How is this helpful?

Can be used for automation

Tags can be used as a trigger for automated workflows built into your project.

You can even get more creative and create custom logic to run different workflow steps depending on the name of the tag. 

For example, you could conditionally run a workflow depending on whether a tag includes the word “qa” or “prod”.
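
A rough sketch of that kind of tag-based logic in Python is below. The function name environment_for_tag and the tag naming scheme (for example "v1.2.0-qa") are hypothetical, and the simple substring match is an assumption. A real project may want a stricter pattern.

```python
from typing import Optional


def environment_for_tag(tag: str) -> Optional[str]:
    """Map a tag name to a deployment target, or None if no workflow should run."""
    name = tag.lower()
    # Check "prod" first so a tag containing both words deploys to production.
    if "prod" in name:
        return "prod"
    if "qa" in name:
        return "qa"
    return None
```

How the tag name reaches this function depends on your platform. In GitHub Actions, for instance, a tag-triggered run exposes it through the GITHUB_REF_NAME environment variable.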

 

Can be referenced in other tools

Just like you can refer to a specific branch when calling your project from other tools, you can also refer to a specific tag.

This helps you avoid breaking changes that are made to the “main” branch without your knowledge.

If you have a tool set to use a specific tag, it will not use the latest changes unless you specifically change it.
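
As one illustration of pinning, the sketch below builds a git clone command that checks out a single tag rather than the latest commit on a branch. The helper name clone_at_tag_command and the example URL are placeholders. Other tools have their own way of specifying a tag, so treat this as the general idea rather than a recipe for any particular tool.

```python
def clone_at_tag_command(repo_url: str, tag: str, dest: str) -> list[str]:
    """Build a `git clone` command that checks out one specific tag."""
    # --branch accepts a tag name as well as a branch name;
    # --depth 1 skips history that a pinned consumer does not need.
    return ["git", "clone", "--branch", tag, "--depth", "1", repo_url, dest]


# Hypothetical repository URL, tag, and destination folder.
command = clone_at_tag_command(
    "https://example.com/team/project.git", "v1.0.0", "project"
)
print(" ".join(command))
```

Passing the resulting list to subprocess.run(command, check=True) would perform the clone. Moving to a newer version then means changing the tag argument on purpose.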

 

Attach to releases

A release will package up your code into a zip file and allow you to easily access all of the files as they were at a particular point in time.

This is a common practice in software development, and now in data development as well.

When you create a release, it needs to be tied to a tag.

Therefore, if you want to have releases (which you should), you’ll need to get used to using tags.

Here is an example of this in use:

 

Tip 3: Don’t ignore the README

Too often README files are overlooked and unfortunately left completely blank.

However, this is the first thing people see when viewing your project and is valuable real estate if used well.

Here is an example of the README for dbt:

 

How is this helpful?

Address FAQs

As a development project grows, so too does the tribal knowledge of those who built it.

Rather than keeping that knowledge in our heads, we can use the README to answer common questions or point to where other information can be found.

This is especially important when onboarding new developers who are trying to understand how and why a project is designed the way it is.

 

Avoid project scope creep

At its core, the README is there to provide a description and purpose for your project.

If the purpose is not clear, developers may get complacent and start adding unrelated code to the repository.

A clear README can help keep your project focused and avoid this accidental scope creep.

 

Give credit to others

Lastly, and maybe most importantly, a README allows you to shout out special contributors to the project.

It takes a lot of work to build a great project, and giving credit in the README puts that recognition front and center.

While some may not like the attention, it never hurts to let others know you appreciate them.

 


 

That's all for this edition. One (or more) tips to help you level up your skills as a data engineer.

If you found this helpful, the best way to say thanks is to share it with somebody else.

Thank you for reading and I'll see you next time.

- Mike

 
