Step 1: Design
Learn how to design a simple yet modern end-to-end pipeline
The first step is to establish a clear understanding of common design patterns so you can feel confident picking a strategy or learning a tool.
Without a strategy, you may look busy but will feel completely overwhelmed.
But with one, you’ll have more clarity & your efforts will have more productive, compounding results.
Step 2: Setup
Gain hands-on experience setting up common data tools
With our design in hand, we'll then configure each core component of the data stack, establish roles/permissions where needed & ensure connectivity between them.
You'll learn how to set up & use tools such as Postgres, Airbyte, dbt, Metabase, GitHub & Docker.
Step 3: Build
Confidently build a data architecture from scratch with best practices
In this step, we'll implement a single foundational end-to-end data pipeline that touches all 5 core components and establish Production vs Development environments.
This sets the standards & conventions that all future work on the architecture will follow.
By keeping the scope small, you’re able to pay close attention to each layer while also showcasing quick wins.
(Yes, we'll cover data modeling too)
Step 4: Automate
Discover how to automate your development workflow
Next, it's time to create automated workflows that validate data quality before changes reach Production & ensure our data models stay updated each day.
Automation makes the development cycle more efficient, reduces egos/emotions in code reviews & overall provides more control over the architecture.
Once you get this right, it's hard to work on any team that doesn't have it in place.
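To give a feel for what "validating data quality before Production" can look like, here is a minimal sketch in plain Python. It is only an illustration, not the course's actual workflow: the function names and the `order_id` column are hypothetical, and in a real stack checks like these would more likely be expressed as tests in your transformation tool and run by your automation jobs.

```python
from typing import Dict, List

Row = Dict[str, object]  # one record from a table, keyed by column name


def check_not_null(rows: List[Row], column: str) -> List[int]:
    """Return the indexes of rows where the column is missing or None."""
    return [i for i, row in enumerate(rows) if row.get(column) is None]


def check_unique(rows: List[Row], column: str) -> List[int]:
    """Return the indexes of rows whose non-null value was already seen."""
    seen = set()
    duplicates = []
    for i, row in enumerate(rows):
        value = row.get(column)
        if value is None:
            continue  # nulls are reported by check_not_null instead
        if value in seen:
            duplicates.append(i)
        seen.add(value)
    return duplicates


def validate(rows: List[Row], key: str) -> List[str]:
    """Run both checks on a key column and collect readable failure messages."""
    failures = []
    nulls = check_not_null(rows, key)
    if nulls:
        failures.append(f"{key}: null value in rows {nulls}")
    duplicates = check_unique(rows, key)
    if duplicates:
        failures.append(f"{key}: duplicate value in rows {duplicates}")
    return failures


if __name__ == "__main__":
    # Hypothetical sample data: one duplicate key and one missing key.
    sample = [{"order_id": 1}, {"order_id": 1}, {"order_id": None}]
    problems = validate(sample, "order_id")
    for problem in problems:
        print(problem)
    # A non-zero exit code is what lets an automated job block the change.
    raise SystemExit(1 if problems else 0)
```

The idea is that the job fails loudly when a check fails, so a broken change is stopped in Development instead of being discovered in Production.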
Step 5: Maintain
Learn how to maintain & scale your architecture long-term
Lastly, we'll create assets that provide transparency into the design, organize important information & overall establish a culture of high quality development long-term.
The biggest cost in any architecture is maintenance, which compounds over time.
Being proactive will save countless hours later while enabling developers to have fun creating new features vs troubleshooting.
Step 6: Migrate (BONUS)
Migrate from a self-hosted stack to a "Production-Ready" cloud setup
Open source & self-hosted stacks are excellent for learning & gaining experience.
But most Production-ready data stacks won’t be designed on Codespaces or a single virtual machine.
In this section, you'll learn how to set up & use new tools such as Snowflake, Fivetran & Tableau.
Resources (BONUS)
Access custom resources to prepare you for real implementations
As a student, you'll also get access to tons of resources to help you implement your Simple Stack faster & maintain it more effectively.
This includes (but is not limited to):
- A Simple Stack Checklist (Excel)
- Pre-made workflow automation jobs (yaml)
- Pre-written documentation, style guides & more
- Data Architecture designs
Each of these is based on real-world implementations & will save you a ton of time!