Streaming Data Is Exciting: What You Need to Know Before Taking the Plunge
Is streaming data necessary for this particular use case? Rather than diving in blindly, it’s essential to first acknowledge the realities of working with streaming data.
Discover the Essential Reading List for Data Engineers: 10 Classic Books You Can’t Miss. While many free online resources are available, they often lack the depth and context needed to truly master the field. In this article, I will share ten classic books that cover everything from fundamental technical skills like Python and SQL, to more advanced topics like Apache Spark, Apache Flink, Apache Beam, Apache Airflow, Kubernetes, distributed systems, and dimensional modeling.
Understanding window functions is critical for anyone who writes SQL daily. In this story, let’s think in SQL and demystify window functions with examples and diagrams.
Data visualization has always been a delightful area for me to work in as a data professional. Visualizing data is like an art. Can I visualize streaming data in another way? I built a game using streaming data, and it is a fun way to visualize data.
Pandas is no doubt one of the most popular libraries in Python. However, Pandas doesn’t shine when processing large datasets. We will compare four faster pandas alternatives for data analysis: Polars, Dask, Vaex, and Modin.
Understanding SQL’s logical query processing order can help you see why you might want to stop writing SQL strictly from top to bottom. It can also help you think in SQL clearly and develop your queries more effectively.
Have you seen a beautiful racing bar chart animation on YouTube and wondered how it was built? I will show you how to use gganimate in R to animate data, using a racing bar chart as an example.
Data engineers can work on side projects to gain experience. Those projects can spark impressive discussions that help you land a dream job. We will introduce six data engineering side project ideas suited to any level of experience.
Airflow has been widely used for years. Is Apache Airflow due for a replacement? mage-ai is a new ETL tool that data engineers can check out as a substitute. I have formed a first impression of mage-ai and will share my thoughts.
Apache Airflow has undoubtedly been the most popular open-source project in data engineering for years. It gained popularity at the right time, alongside The Rise of the Data Engineer. Today, I want to share my journey with Airflow and what I have learned over six years.