Welcome
Build Resilient, Scalable Data Platforms with a Data Architect (Apache Contributor)
I help engineering teams and startups design production-ready pipelines using Apache Airflow, Spark, Kafka, and Flink in cloud-native environments.
Contributor to Apache Airflow and Flink
Creator of Data Engineering Space
Latest Articles & Tutorials
Data Engineering
Unlocking the Secrets of Slowly Changing Dimension (SCD): A Comprehensive View of 8 Types
Slowly Changing Dimension (SCD) is critical to dimensional modeling. We will discuss the eight types of SCDs. By the end, you will clearly understand each type and be able to differentiate between SCDs in dimensional modeling.
July 14, 2023
Data Engineering Data Warehouse SCD
Read More
Data Engineering
Demystifying Null in SQL: A Comprehensive Guide for Data Professionals
Writing SQL can be frustrating, especially when you encounter NULL values. This article can help you better understand NULL in SQL.
June 30, 2023
Data Engineering
Read More
Data Engineering
How I Built a Tool to Visualize Expense In Sankey Diagram
My main goal is to enable people without programming experience to use the powerful Sankey diagram by simply uploading a transaction CSV file from the popular site Mint.com.
June 22, 2023
Data Engineering Data Visualization R Sankey Diagram Shiny
Read More
Data Engineering
Data Engineering: Why It’s About Much More Than Just the Tools You Use
One key lesson I learned while chasing the latest tools: tools are great, but many data engineering problems are solved not by the newest tool but by humans, the data engineers. I want to share my thoughts on why data engineering is about much more than just the tools you ...
May 17, 2023
Data Engineering Data Warehouse
Read More
Data Engineering
Building Better Data Warehouses with Dimensional Modeling: A Guide for Data Engineers
Let's bring the data community's attention to the essentials: building better data warehouses with dimensional modeling.
May 5, 2023
Data Engineering Data Warehouse
Read More
Data Engineering
Boosting Spark Union Operator Performance: Optimization Tips for Improved Query Speed
In this story, we focus on the performance of the Apache Spark union operator with examples, show you the physical query plan, and share optimization techniques.
April 20, 2023
Apache Spark Data Engineering Spark Performance
Read More
Data Engineering
Why R for Data Engineering is More Powerful Than You Thought
R could benefit the data engineering community. Let's discuss why R for data engineering is more powerful than you thought.
April 15, 2023
Data Engineering ggplot2 R Shiny
Read More
Data Engineering
5 Hidden Apache Spark Facts That Fewer People Talk About
I want to share 5 hidden facts about Apache Spark that I learned throughout my career. They may save you some time reading the Apache Spark source code.
April 4, 2023
Apache Spark Data Engineering Spark Performance
Read More
Data Engineering
Uncovering the Truth About Apache Spark Performance: coalesce(1) vs. repartition(1)
We will discuss an often-neglected aspect of Apache Spark performance: the difference between coalesce(1) and repartition(1). It is worth paying attention to when you check a Spark job's performance.
April 4, 2023
Apache Spark Data Engineering Spark Performance
Read More
Data Engineering
The Practical Data Engineering Resource
The data engineering space is evolving. Here are the resources I have collected for practical data engineering.
April 2, 2023
Data Data Engineering Data Visualization
Read More
Data Engineering
How to Find the Best Deals On Time with R and Mage
Finding the best deals and coupons promptly can save you money and time. We can quickly build a weekend project that automatically finds the best deals on time with R and Mage.
March 25, 2023
Data Engineering mage-ai R
Read More
Data Engineering
4 Free Fantastic Diagramming Tools To Make Yours Stand Out
Many diagrams are unexciting to work on and to view as a final result. I will share 4 free, fantastic diagramming tools to make yours stand out.
March 21, 2023
Data Engineering Data Visualization Diagramming Productivity Project Management
Read More












