JONATHAN AU

Data Analyst | Financial Analyst

Hi, I am Jonathan

I am a financial data analyst based in Vancouver, Canada.

I’ve worked as a financial analyst for the past three years and have a passion for sharing what I learn about data analytics. I hold a commerce degree with a specialization in Finance from UBC Vancouver.

I love conducting industry research, analyzing data, trends, and results, and generating insights. I am transitioning into a career as a data and BI analyst, which is why I created this site to share my thoughts and learnings.

My mission is to transform data into valuable, comprehensive insights that improve results and support data-driven decisions. I apply this approach not only to driving business growth but also to discovering the truth about our environment and society. I care about the future of nature and the well-being of humanity because these issues will always be with us and deserve a lifetime of attention.

Skillsets

 

Investment Research

As a financial analyst, I have developed a keen sense for interpreting financial data by calculating a range of metrics. My standard operating procedure (SOP) for investment research is to understand the industry, the company, and its valuation. This experience allows me to quickly understand the nature of a company's business, its profitability, and its competitive edge.

 

Dashboards

An interactive design is an engaging way to present data. By clicking and drilling down, stakeholders can examine trends and patterns themselves. Using storytelling techniques to convey data insights makes the message and its meaning easier for everyone involved in the project to absorb than presenting the same message as plain facts and figures.

 

Data Analysis

Turning raw data into meaningful insights requires analytical tools and contextual information. I follow SOPs to guide my data analysis process, with the goal of combining quantitative and qualitative insights to discover causal relationships and make better decisions. Depending on the problem statement and context, I can approach a data problem with either a data analytics approach or a data science approach. The two share some knowledge, but they solve problems differently, as visualized below.

 

Recent Projects

2016-2022 Vancouver Crime Data Exploration and Modelling

In this project, I applied data analytics and machine learning methodologies for the Vancouver Police Department (VPD) to predict hourly theft crimes across different neighbourhoods in Vancouver, BC. Multiple data sources were incorporated into the original VPD crime dataset for data exploration and feature engineering. The objective of the initial data analysis was to identify key crime patterns that could provide direction for further analysis, dashboard building, and model development. We examined the overall trends in reported crime cases and analyzed time-related and geographic patterns. Theft accounts for most of the crime in Vancouver, so I decided to narrow our focus to theft when building machine learning models. The ultimate goal was to implement predictive policing to help reduce crime while mitigating risk to law enforcement officers.

  • Preparing datasets by cleaning, transforming, joining, and aggregating in SQL and Python

  • Visualizing time-related and geographic patterns in Tableau

  • Building a binary classification model to classify high and low risk of theft activity at a given location and hour

  • The model captured 81% of unseen instances where actual theft exceeded 3 cases per neighbourhood per hour

  • Original Project Workbook and Code
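
The 81% figure above describes recall on the high-risk class: of the neighbourhood-hours that really had more than 3 theft cases, the share the model flagged. The snippet below is a minimal sketch of that calculation, not the project code. The threshold comes from the write-up above; the function names, sample counts, and predictions are made up for illustration.

```python
# Illustrative only: the threshold comes from the project description; the
# sample counts and predictions are invented and are not VPD data.
HIGH_RISK_THRESHOLD = 3  # "greater than 3 cases per neighbourhood per hour"


def label_high_risk(theft_counts, threshold=HIGH_RISK_THRESHOLD):
    """Turn hourly theft counts into binary labels: 1 = high risk, 0 = low risk."""
    return [1 if count > threshold else 0 for count in theft_counts]


def recall(actual, predicted):
    """Share of actual high-risk rows that were also predicted as high risk."""
    true_positives = sum(1 for a, p in zip(actual, predicted) if a == 1 and p == 1)
    actual_positives = sum(actual)
    if actual_positives == 0:
        raise ValueError("recall is undefined without any actual high-risk rows")
    return true_positives / actual_positives


# Hypothetical neighbourhood-hour theft counts and hypothetical model output.
sample_counts = [0, 5, 4, 1, 7, 2, 6, 3]
sample_predictions = [0, 1, 0, 0, 1, 1, 1, 0]

sample_labels = label_high_risk(sample_counts)
sample_recall = recall(sample_labels, sample_predictions)
print(f"Recall on the high-risk class: {sample_recall:.2f}")
```

Recall is a natural headline metric here because missing a genuinely high-risk hour is usually costlier than a false alarm, though it should be read alongside precision.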

 

New York Taxi Analysis

In this project, I applied the data science pipeline to explore deterministic features and use them to predict how much a cab driver can earn per hour in different areas of New York. Knowing which areas earn more or less on any given day and at any given hour allows the union to distribute cab assignments more equitably across areas and to rotate cab drivers between higher- and lower-income areas on a daily basis. This way, no driver dominates a high-income area while another is continually relegated to a low-income area.

  • Using Python (pandas, NumPy, Matplotlib) for data exploration, data cleaning, and data preparation

  • Identifying trends and visualizing time series data of taxi trips

  • Building machine learning algorithms to predict taxi fares in New York

  • Documenting each step in detail in a Jupyter notebook

  • The Python packages I utilized: pandas|numpy|matplotlib|scikit-learn

  • README | Project Workbook
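
To make "earnings per hour by area" concrete, here is a minimal sketch of how such a quantity could be derived from trip records. It is not the project code and it is a descriptive aggregate rather than a predictive model. The field names (zone, pickup, dropoff, fare) and the sample trips are assumptions made for illustration, and it uses only the standard library instead of pandas so that it runs anywhere.

```python
from collections import defaultdict
from datetime import datetime


def hourly_earnings_by_zone(trips):
    """Fare earned per hour of driving, grouped by (zone, pickup hour)."""
    fare_totals = defaultdict(float)
    hours_driven = defaultdict(float)
    for trip in trips:
        pickup = datetime.fromisoformat(trip["pickup"])
        dropoff = datetime.fromisoformat(trip["dropoff"])
        key = (trip["zone"], pickup.hour)
        fare_totals[key] += trip["fare"]
        hours_driven[key] += (dropoff - pickup).total_seconds() / 3600
    # Skip groups with zero recorded driving time to avoid dividing by zero.
    return {
        key: fare_totals[key] / hours_driven[key]
        for key in fare_totals
        if hours_driven[key] > 0
    }


# Hypothetical trips; zone names and fares are invented.
sample_trips = [
    {"zone": "A", "pickup": "2023-01-01T08:00:00", "dropoff": "2023-01-01T08:30:00", "fare": 20.0},
    {"zone": "A", "pickup": "2023-01-01T08:40:00", "dropoff": "2023-01-01T09:10:00", "fare": 10.0},
    {"zone": "B", "pickup": "2023-01-01T08:00:00", "dropoff": "2023-01-01T08:15:00", "fare": 5.0},
]

earnings = hourly_earnings_by_zone(sample_trips)
for (zone, hour), rate in sorted(earnings.items()):
    print(f"zone {zone}, hour {hour:02d}: {rate:.2f} per hour driven")
```

A table like this is the kind of target a regression model could then be trained to predict from day, hour, and area.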

 

COVID-19 DATA EXPLORATIONS

  • Identifying trends and visualizing time series data of global, continental, country cases and deaths

  • Examining how different vaccine manufacturers contributed to case reduction

  • Examining the adequacy of the policy response to the pandemic by analyzing the stringency index and cases

  • The tools I utilized: SQL|Tableau

 

POPULATION GROWTH AND ENVIRONMENTAL DESTRUCTION

  • How has the planet been adversely affected since the population boom of the past 100 years?

  • What are the links between climate change, resource depletion, natural disasters and overpopulation, and what are the implications for humanity?

  • Why is the "Great Green Goal" obsolete, and why will our existing problems grow further? What is the role of celebrities, politicians, or even activists? Have they really accomplished anything?

  • The tools I utilized: Excel|Tableau|Canva

 

Air Transport Database Design Project

Before the pandemic, air travel volume had reached an all-time peak. At the same time, there were many disturbing incidents, such as long delays and overbooked flights. Passengers were even dragged off planes, and airlines paid large fines for extreme delays. In our BCIT course project, my group and I therefore looked at the problem from the perspective of IATA to see how data could be used to analyze the inefficiency of airline operations. The database is modelled using the 7Ws dimensional modelling technique.

  • The goal of the database is to measure passenger counts and delay figures for each flight

  • Developing a star schema ER diagram in MySQL Workbench

  • Running SQL queries to narrow down problems and answer core questions

  • Translating query results into actionable insights to tackle flight overbooking and delays

  • The project includes 3 sections: Project Overview, Database Design Process, SQL Query
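
For readers unfamiliar with star schemas, the sketch below shows the general shape: one fact table of flight measurements surrounded by dimension tables, queried with a join and an aggregate. It is not the project's schema. The table and column names (including seat_capacity) and all rows are hypothetical, and it uses SQLite through Python's standard library in place of MySQL so that it is self-contained.

```python
import sqlite3

# In-memory database; every table, column, and row here is invented.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE dim_airline (
        airline_id   INTEGER PRIMARY KEY,
        airline_name TEXT NOT NULL
    );
    CREATE TABLE dim_date (
        date_id     INTEGER PRIMARY KEY,
        flight_date TEXT NOT NULL
    );
    -- Fact table: one row per flight, holding the measures to analyze.
    CREATE TABLE fact_flight (
        flight_id       INTEGER PRIMARY KEY,
        airline_id      INTEGER NOT NULL REFERENCES dim_airline (airline_id),
        date_id         INTEGER NOT NULL REFERENCES dim_date (date_id),
        passenger_count INTEGER NOT NULL,
        seat_capacity   INTEGER NOT NULL,
        delay_minutes   INTEGER NOT NULL
    );
    """
)
conn.executemany(
    "INSERT INTO dim_airline VALUES (?, ?)",
    [(1, "Airline A"), (2, "Airline B")],
)
conn.executemany(
    "INSERT INTO dim_date VALUES (?, ?)",
    [(1, "2019-07-01"), (2, "2019-07-02")],
)
conn.executemany(
    "INSERT INTO fact_flight VALUES (?, ?, ?, ?, ?, ?)",
    [
        (1, 1, 1, 180, 170, 45),  # more passengers booked than seats
        (2, 1, 2, 150, 170, 15),
        (3, 2, 1, 200, 200, 0),
        (4, 2, 2, 210, 200, 90),  # more passengers booked than seats
    ],
)

# Average delay and number of overbooked flights per airline.
rows = conn.execute(
    """
    SELECT a.airline_name,
           AVG(f.delay_minutes) AS avg_delay_minutes,
           SUM(f.passenger_count > f.seat_capacity) AS overbooked_flights
    FROM fact_flight AS f
    JOIN dim_airline AS a ON a.airline_id = f.airline_id
    GROUP BY a.airline_name
    ORDER BY a.airline_name
    """
).fetchall()
conn.close()

for airline_name, avg_delay, overbooked in rows:
    print(airline_name, avg_delay, overbooked)
```

Keeping the measures in a single fact table means questions about delays or overbooking reduce to one join per dimension plus an aggregate.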