I have a passion for Data Science, Innovation, Leadership, and trying to make the most out of every situation. You will see these themes reflected in these blogs.

Rotary Youth Leadership Awards

In January this year, I was part of the facilitation team on the annual Rotary Youth Leadership Awards (RYLA) course in North Sydney. It is a week-long experiential leadership program, aimed at empowering the next generation of leaders to achieve great things.

The program is designed around a reflective, experiential approach to leadership, using the abstraction of ‘ERLA’ to assist the participants through their journey. ERLA stands for Experience, Reflection, Learning, and Action. This concept is continually reinforced throughout the week.

In this article, I will discuss all four parts of ‘ERLA’, from both an experiential and an academic perspective.

[Read More]

How to Use GitHub Webhooks, Docker, and Python for Automatic End-to-End Deployments

If you’re anything like me, you’re a curious creature. So, when I started learning about what Webhooks are and how to use them, it felt like I was pulling at a loose thread on my tee shirt. The more I pulled, the more the world of Webhooks unravelled; the more I learnt, the more I understood, the more I uncovered about the mysterious world of APIs, Webhooks, and Automation, and the more I wanted to keep learning!

So, the motivation here is to create a streamlined and seamless way of doing deployments from GitHub to production servers. I feel it should be possible to write a simple Python App to do this. So let’s find out how!

This article is split into four sections, each outlining a different aspect of the process. Section Two is incredibly detailed, but I’ve tried to include as much description and as many screenshots as necessary to make it easier to follow what is happening. Section Three contains hands-on instructions for how to use the App. Enjoy!
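
To give a flavour of what that Python App could look like (this is just a hypothetical sketch, not the actual App built in the article; Flask, the /webhook route, and the ./redeploy.sh script are all assumptions of mine), a webhook listener can be as small as a single endpoint that verifies GitHub’s signature and then triggers a redeploy:

```python
# Hypothetical sketch of a GitHub webhook listener (illustrative only; the App
# described in the article may differ). It verifies GitHub's HMAC signature,
# then runs a redeploy script on pushes to the main branch.
import hashlib
import hmac
import os
import subprocess

from flask import Flask, abort, request

app = Flask(__name__)
WEBHOOK_SECRET = os.environ.get("WEBHOOK_SECRET", "")  # the same secret is configured in GitHub

@app.route("/webhook", methods=["POST"])
def webhook():
    # GitHub signs the raw payload with the shared secret
    signature = request.headers.get("X-Hub-Signature-256", "")
    expected = "sha256=" + hmac.new(
        WEBHOOK_SECRET.encode(), request.data, hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(signature, expected):
        abort(401)

    # Only redeploy on pushes to the main branch (details come from the JSON payload)
    payload = request.get_json(silent=True) or {}
    if payload.get("ref") == "refs/heads/main":
        subprocess.run(["./redeploy.sh"], check=False)  # hypothetical script, e.g. git pull && docker compose up -d

    return "", 204

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=8000)
```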

[Read More]

The Power of Code Snippets

Modern-day programmers write a lot of code. That’s part of the job. But one of the main principles of writing good code is the DRY principle: Don’t Repeat Yourself (see The DRY Principle: R Functions or The DRY Principle: Python Functions or Wikipedia or any number of other online sources). In essence, this principle states that if you are about to write the same code twice, don’t; instead, write the code once in a function, then call the function twice.
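
As a toy illustration of DRY (a made-up example of mine, not taken from the linked articles), the same calculation is written once and called twice:

```python
# A made-up example of the DRY principle: the same percentage calculation is
# needed in two places, so it lives in one function instead of being repeated.

def as_percentage(part, whole):
    """Express part/whole as a percentage, rounded to one decimal place."""
    return round(100 * part / whole, 1)

print(as_percentage(37, 250))    # 14.8 - written once...
print(as_percentage(180, 365))   # 49.3 - ...called twice
```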

There’s also another principle, which is equally if not more important: Document Your Work. For this, function docstrings are extremely helpful. Basically, you want to write your code for someone else (even if that someone else is the future version of you…), so that they can understand your code better. Here, you want to explain why your code is doing what it is doing; the reader can see what the code is doing by simply reading the code itself. So if there is some complicated logic, or a quirk in the data that needs handling, then it’s best to write a docstring. But don’t go overboard! There are many reasons why too much documentation can be a bad thing (see How to Comment Your Code Like a Pro: Best Practices and Good Habits, and Putting Comments in Code: The Good, the Bad, and the Ugly).

It is the intersection of these two principles that gives rise to a fantastic thing in the world of programming: Code Snippets. These snippets are effectively ‘saved chunks of code’, which allow you to speed up the implementation of your code and improve its overall readability. Ultimately, they will turn your code from Good to Great.
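
As a hypothetical example (the function, currencies, and reasoning below are invented purely for illustration), a short docstring can capture the why that the code alone cannot; a small, documented, reusable function like this is also exactly the sort of thing worth saving as a snippet:

```python
def to_aud(revenue, fx_rate):
    """Convert revenue to AUD before aggregation.

    Why: the source systems report revenue in their local currencies, so
    summing the raw figures would silently mix units. Converting here keeps
    every downstream report in a single, comparable currency.
    """
    return revenue * fx_rate
```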

When approaching this from a Data Science perspective, I will focus on the two most popular coding languages in this field, and the most popular IDE for each of them: R + RStudio and Python + VSCode.

[Read More]

Mental Models and Social Situations

Our mental models are deeply ingrained images of how we see the world and how we react in different scenarios and situations (McShane, Travaglione & Olekalns 2010, p. 91; Senge 2006, p. 164).

When I meet someone for the first time, my mental models greatly influence my perceptions of them: their words, their expressions, their reactions. And by extension, these mental models then influence my thoughts and actions in that situation: how I behave, how I speak, even my subconscious mannerisms. As a result, that first meeting can make for a pleasant first impression, or an embarrassingly regrettable occasion.

Conversely, I believe that my mental models also limit my perceptions of the people I meet. This is because I am basing my conclusions on two things: a single, short meeting, and my filtered model of the world, which has been developed over my entire lifetime. This leaves me with limited inferential flexibility, and the possibility of a prejudicial conclusion (Ormerod 2000, cited in Johnson-Laird 2001, p. 436; Markman & Gentner 2001, p. 230).

[Read More]

Reinforcement Learning

Reinforcement Learning is not a new concept; it has been developed and matured over 70 years of academic rigour. Fundamentally, Reinforcement Learning is a method of machine learning in which an algorithm makes decisions and takes actions within a given environment, and learns which decisions are appropriate through repeated trial and error. The academic discourse around Reinforcement Learning pursued three concurrent ‘threads’ of research (trial and error, optimal control, and temporal difference) before they were united in the 1990s. Reinforcement Learning then went on to master Chess, Go, and countless electronic games. The modern applications of Reinforcement Learning are enabling businesses to optimise, control, and monitor their processes to a phenomenal level of accuracy and finesse. As a result, the future of Reinforcement Learning is both exciting and fascinating, as research aims to improve these algorithms’ interpretability, accountability, and trustworthiness.
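
To give a rough flavour of that trial-and-error loop (a minimal, made-up sketch of tabular Q-learning, one classic temporal-difference method, and not anything taken from the article), the core of the idea is surprisingly small:

```python
import random

# A made-up, minimal sketch of tabular Q-learning, just to show the flavour of
# trial-and-error learning. The 'environment' is a toy: one state, two actions,
# and action 1 always pays a reward of 1.

ALPHA = 0.1      # learning rate
GAMMA = 0.9      # discount factor
EPSILON = 0.2    # exploration rate

q_values = {0: 0.0, 1: 0.0}   # estimated long-run value of each action

def step(action):
    """Toy environment: action 1 earns a reward of 1, action 0 earns nothing."""
    return 1.0 if action == 1 else 0.0

for _ in range(500):
    # Epsilon-greedy: mostly exploit the best-known action, occasionally explore.
    if random.random() < EPSILON:
        action = random.choice([0, 1])
    else:
        action = max(q_values, key=q_values.get)

    reward = step(action)

    # Temporal-difference update: nudge the estimate towards reward + discounted future value.
    best_next = max(q_values.values())
    q_values[action] += ALPHA * (reward + GAMMA * best_next - q_values[action])

print(q_values)   # the value for action 1 ends up higher, so the agent prefers it
```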

[Read More]

Vanilla Neural Networks in R

The purpose of this paper is to take a ‘back to basics’ approach to designing Deep Learning solutions. The intention is not to create the most predictive model, nor to use the latest and greatest techniques (such as convolution or recursion); it is to build a basic neural network, from scratch, using no frameworks, and to walk through the methodology.

[Read More]

Exploring Undernourishment

Our world today has many issues, and the Prevalence of Undernourishment is just one of them. Over the past twenty years, the United Nations, through the Food and Agriculture Organisation (FAO), has collected data on many countries, and has helped to influence and improve their levels of Undernourishment.

This report, and the associated Data Exploration App, looks to explore the data provided by the FAO, to understand its nuances, to learn what information it is telling us, and to derive meaning from it. The research activities undertaken focus on four key areas:

  1. General Trends;
  2. Most Successful Countries;
  3. Surprising Trends; and
  4. The Most Influential Indicator.

This report embarks on an exploratory data analysis, guided by the narratives told by this data.

[Read More]

Addressing the John Smith Problem

Many databases contain duplicate data, especially when manual data entry is involved. In order to clean the data and resolve unnecessary duplicates, it is first necessary to identify the messy records. However, many duplicates are non-matching; that is, the duplicated data may contain, for example, spelling errors. It is challenging to identify these duplicates using the SQL database language, because SQL relies on exact matching (due to the tenets of Relational Database theory). Therefore, other methods of identifying non-matching duplicates are needed, which is where Fuzzy Matching comes in.
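
As a tiny illustration (using made-up names and Python’s standard-library difflib, not the article’s actual implementation), fuzzy matching scores how similar two strings are, rather than demanding an exact match:

```python
# A made-up sketch of fuzzy matching: score how similar two names are,
# instead of requiring them to be identical as SQL would.
from difflib import SequenceMatcher

def similarity(a, b):
    """Return a 0-1 similarity score between two names (case-insensitive)."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()

print(similarity("John Smith", "Jon Smyth"))    # high score - likely the same person
print(similarity("John Smith", "Jane Brown"))   # low score - probably different people
```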

[Read More]

Reinforcement Learning in the Warehousing Industry

Artificial Intelligence and Machine Learning are advancing at an ever-increasing rate. Reinforcement Learning (RL) is one area of Machine Learning which is proving to be incredibly promising for the future of business efficiency and optimisation. Within the Warehousing and Logistics industry, there are some unique challenges, some of which can be addressed and improved with the application of Reinforcement Learning. One such example is the Picking and Putaway strategies implemented within modern Warehouse Management systems. If a Reinforcement Learning algorithm were developed to address this scenario, businesses would benefit from improved efficiency and profitability. However, Reinforcement Learning has some nuanced difficulties which will need to be handled when scaling a solution like this to a production-ready environment.

[Read More]