The Practical Data Engineering Resource

Photo by matthew Feeney on Unsplash

The data engineering space is evolving. Here are the resources I have collected for practical data engineering.

Last Updated: 2023-04-02


Photo by Susan Q Yin on Unsplash

The Essential Reading List for Data Engineers: 10 Classic Books You Can’t Miss

Discover the Essential Reading List for Data Engineers: 10 Classic Books You Can't Miss. While many free online resources are available, they often lack the ...

Data Engineering Space Leaders (Who to Follow)

Practical Data Engineering Framework

Data Processing

  • Apache Spark: Unified engine for large-scale data analytics
  • Apache Flink: Stateful Computations over Data Streams
  • Apache Beam: The easiest way to do batch and streaming data processing. 
Photo by Lizzi Sassman on Unsplash

Deep Dive into Handling Apache Spark Data Skew

"Why is my Spark job running slow?" is an inevitable question. We will cover how to identify Spark data skew and how to handle data ...

Workflow Orchestration

  • Airflow: a platform created by the community to programmatically author, schedule, and monitor workflows
  • Mage: A modern replacement for Airflow.
  • Kestra: an infinitely scalable orchestration and scheduling platform, creating, running, scheduling, and monitoring millions of complex pipelines.
Photo by Karsten Würth on Unsplash

Here Is What I Learned Using Apache Airflow over 6 Years

Apache Airflow has undoubtedly been the most popular open-source project for data engineering for years. It gained popularity at the right time with The Rise Of ...
Photo by Enis Yavuz on Unsplash

Is Apache Airflow Due for Replacement? The First Impression Of mage-ai

Airflow has been widespread for years. Is Apache Airflow due for a replacement? mage-ai is the new ETL tool for data engineers to check out ...
Photo by Daria Nepriakhina 🇺🇦 on Unsplash

5 Fantastic Data Pipeline Orchestration Tools For R

Many modern data orchestration projects like Apache Airflow and Luigi are Python-based. Let's explore the popular data pipeline orchestration options for R.

OLAP Query

  • Druid: a high-performance, real-time analytics database that delivers sub-second queries on streaming and batch data at scale and under load.
  • Trino: a query engine that runs at ludicrous speed
  • ClickHouse: a column-oriented database that enables its users to generate powerful analytics, using SQL queries, in real-time.

Data Visualization / Reporting

  • Superset: a modern data exploration and visualization platform
  • Metabase: Fast analytics with the friendly UX and integrated tooling to let your company explore data on their own.
  • ECharts: a powerful, interactive charting and data visualization library for the browser

Awesome Blogs

Classic Articles

About Me

I hope my stories are helpful to you. 

For more data engineering posts, you can subscribe to my new articles or become a referred Medium member, which also gives you full access to stories on Medium.

If you have questions or comments, do not hesitate to write in the comments of this story or reach me directly through LinkedIn or Twitter.

More Articles

Photo by Choong Deng Xiang on Unsplash

How to Visualize Monthly Expenses in a Comprehensive Way: Develop a Sankey Diagram in R

Personal budgeting apps like Mint, Personal Capital, and Clarity provide only three limited types of charts. Have you ever wondered if those charts are good enough to get better ...
Photo by JIUNN-YIH LAU on Unsplash

5 Tips for Self-Promotion as Data Professionals

Getting the work done isn't the journey's end. Your work should be your channel for self-promotion. I will give five tips to get ...
Second Iteration: Interactivity with User Click | Image By Author

How to Engage with Users By Storytelling: Show Data Analytics in R and Shiny

Using R and Shiny, we can build an app where the end users can interact with the data analysis we have done. I will show ...
