Use Pipelines to streamline your data science project right now!

Photo by Myriam Jessier on Unsplash

Most data science projects (I am tempted to say all of them) require a certain level of data cleaning and preprocessing to get the most out of the machine learning models. Some common preprocessing steps or transformations are:

a. Imputing missing values

b. Removing outliers

c. Normalising or…
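As a rough illustration of the idea, here is a minimal sketch that chains two of the steps listed above (imputing missing values and normalising) in front of a model. It assumes that "Pipelines" refers to scikit-learn's `Pipeline`; the toy data, the step names, and the choice of `LogisticRegression` are hypothetical and only there to make the example runnable.

```python
import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data with a few missing values (np.nan).
X = np.array([
    [1.0, 200.0],
    [2.0, np.nan],
    [3.0, 180.0],
    [4.0, 220.0],
    [np.nan, 210.0],
    [6.0, 190.0],
])
y = np.array([0, 0, 0, 1, 1, 1])

# Each step is a (name, transformer) pair; the last step is the estimator.
pipe = Pipeline(steps=[
    ("impute", SimpleImputer(strategy="median")),  # fill missing values
    ("scale", StandardScaler()),                   # normalise each feature
    ("model", LogisticRegression()),               # assumed model choice
])

# fit() runs every preprocessing step in order, then trains the model.
pipe.fit(X, y)

# predict() reapplies the fitted preprocessing before predicting.
predictions = pipe.predict(X)
```

Because the preprocessing lives inside the pipeline, the same fitted transformations are applied at prediction time without any extra bookkeeping.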

Git initiation, rename, stash, reset, and rebase

Beyond the ordinary workflow, these are some of the Git practices that I have found very useful at work

Photo by Fotis Fotopoulos on Unsplash


  1. You need Git installed on your local machine. You can follow the official documentation to do so.
  2. You need a GitHub account (a free one will suffice).

1. Initiate Git in your project and publish it to Github

You might have been working on something on your own for a while. And one day you need to share it…

Git Question 101

This is probably the question Git users ask most often

Photo by mari lezhava on Unsplash

The community has long debated whether we should use merge or rebase.

Some people say merge is better because it preserves the most complete working history. Others argue that rebase is neater, which makes the reviewer’s life easier and more efficient. …

Go through the Git workflow step by step with hands-on practice (code example)

Photo by Jefferson Santos on Unsplash

Following the previous story, An Intro to Git and Github for Beginners, today we’re gonna get our hands dirty. This story is going to walk you through the Git workflow with practical code examples. Since GitHub is the most popular website for hosting Git repositories, we’ll be using it as an example…

Lesson no.1 of Git and Github

To finally understand what the heck those engineers are talking about!

Photo by Lorenzo Herrera on Unsplash

Engineer A: “Hey, have you merged your branch to develop yet?”

Engineer B: “No, I’m waiting for my previous PR to be merged so I can proceed. But I’ve already staged all my work and pushed.”

Engineer A: “Ok, I’ll review and merge it later. Don’t forget to rebase!”


See how we combat the pandemic from a statistical perspective

Photo by Edwin Hooper on Unsplash

To test or not to test, this is a statistical question.

In this global battle with the pandemic, Taiwan has done a marvelous job keeping its citizens safe and healthy, with only 850 confirmed cases and 7 deaths in total to date (17/01/2021). As a small island only a strait…

An easy trick with Python’s built-in database, SQLite, to make your data manipulation more flexible and effortless.

Photo by William Iven on Unsplash

Pandas is a powerful Python package for wrangling your data. However, have you ever encountered tasks that just make you think, ‘If only I could use a SQL query here!’? I personally find it particularly annoying when it comes to joining multiple tables and extracting only the columns you want…
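To make the idea concrete, here is a minimal sketch of one way to run a SQL join on pandas data: load the DataFrames into an in-memory SQLite database and query them back. The table names, columns, and values are hypothetical, and this is not necessarily the exact trick the story goes on to describe.

```python
import sqlite3

import pandas as pd

# Hypothetical example tables (not from the original story).
customers = pd.DataFrame({
    "customer_id": [1, 2, 3],
    "name": ["Ann", "Bob", "Cy"],
    "city": ["Taipei", "Tainan", "Hsinchu"],
})
orders = pd.DataFrame({
    "order_id": [10, 11, 12],
    "customer_id": [1, 1, 3],
    "amount": [25.0, 40.0, 15.5],
})

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")
customers.to_sql("customers", conn, index=False)
orders.to_sql("orders", conn, index=False)

# Join the two tables and keep only the columns we want.
query = """
SELECT c.name, o.order_id, o.amount
FROM orders AS o
JOIN customers AS c ON c.customer_id = o.customer_id
ORDER BY o.order_id
"""
result = pd.read_sql_query(query, conn)
conn.close()

print(result)
```

The result comes back as an ordinary DataFrame, so the rest of the analysis can continue in pandas.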

Photo by Markus Winkler on Unsplash

This article provides a step-by-step tutorial on connecting to Azure SQL Server using Python on Linux.

After creating an Azure SQL Database/Server, you can find the server name on the overview page.
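Once you have the server name, a connection could look roughly like the sketch below. It assumes the `pyodbc` package and Microsoft's "ODBC Driver 17 for SQL Server" are installed on the Linux machine, which the text above does not confirm; the server, database, and user names are placeholders, and the helper functions are hypothetical.

```python
def build_connection_string(server, database, username, password,
                            driver="ODBC Driver 17 for SQL Server"):
    """Build an ODBC connection string for an Azure SQL database.

    The driver name is an assumption; use whichever ODBC driver is installed.
    """
    return (
        f"DRIVER={{{driver}}};"
        f"SERVER=tcp:{server},1433;"   # 1433 is the default SQL Server port
        f"DATABASE={database};"
        f"UID={username};"
        f"PWD={password};"
        "Encrypt=yes;TrustServerCertificate=no;Connection Timeout=30;"
    )


def connect(conn_str):
    """Open a connection (requires pyodbc and a reachable server)."""
    import pyodbc  # imported lazily so the string builder works without it
    return pyodbc.connect(conn_str)


# Placeholder values: replace them with the server name from the overview page
# and your own credentials.
conn_str = build_connection_string(
    server="myserver.database.windows.net",
    database="mydb",
    username="myuser",
    password="<your-password>",
)

# conn = connect(conn_str)  # uncomment once the placeholders are filled in
```

In practice you would read the password from an environment variable or a secrets store rather than hard-coding it.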

How to use Resample in Pandas to enhance your time series data analysis

Photo by Jiyeon Park on Unsplash

When it comes to time series analysis, resampling is a critical technique that allows you to flexibly define the resolution of the data you want. …
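As a small sketch of what changing the resolution means, the example below downsamples two weeks of daily values to weekly totals and weekly means with `resample`. The dates and values are made up for illustration.

```python
import pandas as pd

# Hypothetical daily series: 14 days starting on Monday, 4 January 2021.
idx = pd.date_range("2021-01-04", periods=14, freq="D")
daily = pd.Series(range(1, 15), index=idx)

# "W" groups the data into weeks ending on Sunday; each week is labelled
# with its closing Sunday.
weekly_total = daily.resample("W").sum()
weekly_mean = daily.resample("W").mean()

print(weekly_total)
print(weekly_mean)
```

Here the first week (values 1 to 7) sums to 28 and the second (values 8 to 14) sums to 77, each labelled with the Sunday that closes the week.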

