AI come in peace 😉 👋

I’m Kaushik Moudgalya, welcome to my slice of the internet, Old Habits AI Hard. I am a Master’s grad from the University of Montreal & MILA. I specialize in Machine Learning and NLP. Check out my projects or blog or just rest here before further internet travels. As you might have noticed, I like puns.

This page is mostly a collection of my blogs and projects. For more specific information about me, please use the links below. Site currently under construction, updates in progress!

This is also my experimental page for any random ML / DL / coding-related shenanigans that I find interesting. Any blog posts with [DRAFT] in the title are still works in progress and not meant for public consumption... yet. Think of them as trailers, but for blog posts.

Func-tastic Python: Mastering Functools

There’s your life as a Python user before you knew about functools and then there’s your life as a Python user after. The same can be said about itertools & collections, but that’s a story for another time. People even go as far as saying that functools is life-changing. Seeing that I’m writing this post, it’ll be no surprise that I agree. Functools is part of the Python standard library. The functools documentation itself is quite nice, but I thought I’d try presenting my own take on it....

June 28, 2024 · 22 min · Kaushik Moudgalya

Pre-commit for Data Scientists

There’s a nifty little Python library called pre-commit that’s doing the rounds on the internet. And it all starts with a little file called .pre-commit-config.yaml. But we’ll get into that later. First, let’s talk about pre-commits in general. As data scientists, we version our code, data and models. While versioning our code, we might use linters like flake8 and black to make sure our code conforms to the PEP guidelines. However, we usually have to run these manually and, more often than not, we forget to do so....

February 2, 2024 · 8 min · Kaushik Moudgalya

OOPs for Data Scientists

There are 4 principles of OOPs that everyone is expected to know. I used to be one of those people who thought that it was enough to know how to read and write OOPs code without actually knowing what those principles were as I thought they were more of a software engineering concept. But it makes sense to be aware of the common terms and verbalizations of the concepts within OOPs....

November 21, 2023 · 19 min · Kaushik Moudgalya

Numpy Exercises 1-50

First 50 numpy exercises. This is a set of exercises collected by Rougier. All credits to Rougier for curating this list. I am simply trying to solve it for practice and hoping it serves as a reference for others. I am surprised I didn’t come across it before. View this post in a Jupyter Notebook. This is intended to serve as a stepping stone to becoming a better Data Scientist / Machine Learning Researcher....

October 29, 2023 · 28 min · Kaushik Moudgalya

Demystifying the mathematics behind PCA

We all know PCA and we all love PCA: our friend that helps us deal with the curse of dimensionality. All data scientists have probably used PCA. I thought I knew PCA, until I was asked to explain the mathematics behind it in an interview and all I could murmur was that it somehow maximizes the variance of the new features. The interviewer was even kind enough to throw me a hint about projections....

October 7, 2023 · 23 min · Kaushik Moudgalya

NHL Data Science Project

Implemented a complete Data Science pipeline: data collection, tidying data, creation of synthetic features, basic and advanced interactive visualizations using plotly, tracking models through CometML, deploying models through a REST API using Docker and Flask.

December 23, 2021 · 1 min · Kaushik Moudgalya

Crop Harvest Classification

Given meteorological and satellite data, classified land as either crop or non-crop. Used techniques such as AutoML (LightAutoML and PyCaret) as well as blending and stacking to reduce bias and generalize better. Check out the doc link for a more detailed overview of the project.

November 30, 2021 · 1 min · Kaushik Moudgalya

Weather Events Classification

Classified events as standard background conditions, tropical cyclones, or atmospheric rivers. Used techniques such as SMOTE and SMOTE Tomek to fix class imbalance, hyperparameter tuning using HalvingRandomGridSearchCV, manual feature engineering, and a plethora of sklearn classification algorithms. Check out the doc link for a more detailed overview of the project.

October 15, 2021 · 1 min · Kaushik Moudgalya

Anime Project

My current magnum opus. Scraped a ton of images for the top 100 anime, then scraped even more images of the top 10 characters in each anime. The goal was to identify the anime in an image using an ML model and, once that worked, to identify the characters in the image as well. WIP.

July 23, 2021 · 1 min · Kaushik Moudgalya

Whitepaper: Ethically mitigating biases in DS / ML

The paper provides techniques and a checklist to prevent bias from creeping into Machine Learning models. (Co-author)

January 30, 2021 · 1 min · Kaushik Moudgalya