Elements of Statistical Learning - Chapter 2 Solutions
The Stanford textbook Elements of Statistical Learning by Hastie, Tibshirani, and Friedman is an excellent (and freely available) graduate-level text in data mining and machine learning. I'm currently...
University of Sydney Mathematics Notes
This is a compilation of various sets of lecture notes I created during my Bachelor's degree in Mathematics at the University of Sydney. All .tex files are available at the GitHub repository. Applied...
A Primer on Gradient Boosted Decision Trees
Gradient boosted decision trees are an effective off-the-shelf method for building models for classification and regression tasks. Gradient boosting is a generic technique that can be...
Purely Functional Tree Search in Haskell
Haskell is an absolute pleasure to write code in, and I've been trying to use it more and more. It's a language that rewards extended effort in a way that C++ et al. do not. Consider the following...
The Ito isometry
The Itō isometry is a useful theorem in stochastic calculus that provides a fundamental tool in computing stochastic integrals - integrals with respect to a Brownian motion \begin{equation}...
Speeding up Decision Tree Training
The classic algorithm for training a decision tree for classification/regression problems (CART) is well known. The underlying algorithm acts by recursively partitioning the dataset into subsets that...
Consistency of M-estimators
Let $\Theta \subseteq \mathbb{R}^{p}$ be compact. Let $Q: \Theta \rightarrow \mathbb{R}$ be a continuous, non-random function that has a unique minimizer $\theta_{0} \in \Theta$. Let $Q_{n}: \Theta...
A basic soft-margin kernel SVM implementation in Python
Support Vector Machines (SVMs) are a family of nice supervised learning algorithms that can train classification and regression models efficiently and with very good performance in practice. SVMs are...
Online Learning with Microsoft's AdPredictor algorithm
Online learning (as opposed to more traditional batched machine learning) is more and more commonly applied to training machine learned models at scale. The chief advantage is that the model is trained...
The Performance of Decision Tree Evaluation Strategies
UPDATE: Compiled evaluation is now implemented for scikit-learn regression tree/ensemble models, available at https://github.com/ajtulloch/sklearn-compiledtrees or pip install sklearn-compiledtrees. Our...
Stripe CTF Distributed Systems - Minimal Solutions
The Stripe CTF Distributed Systems edition has just finished, and I passed all the levels (and was one of the first fifty contestants to finish). In constructing my solutions, I thought it would be an...
Speeding up isotonic regression in scikit-learn by 5,000x
Isotonic regression is a useful non-parametric regression technique for fitting an increasing function to a given dataset. A classic use is in improving the calibration of a probabilistic classifier....
Python vs Julia - an example from machine learning
In Speeding up isotonic regression in scikit-learn, we dropped down into Cython to improve the performance of a regression algorithm. I thought it would be interesting to compare the performance of...
A Parallel Boggle Solver in Haskell
A cute interview question I've had is "given an $n \times n$ board of characters and a dictionary, find all possible words formed by a self-avoiding path in the grid". This is otherwise known as...
A modern Emacs setup in OS X
About a year ago I switched from Vim to Emacs, and I couldn't be happier about the move. I spent some time getting a setup I was happy with, and thought I'd share it for those who are also looking to...
The LASSO Estimator
As far as I can tell, the LASSO estimator is the closest thing we have to a miracle in modern statistics. The LASSO estimator is defined as a solution to the minimization problem $\frac{1}{n} \| Y - X...
Cambridge Part III Mathematics Notes
I've cleaned up (somewhat) my notes from Cambridge Part III and have put them online - with LaTeX sources available on GitHub and PDFs linked below. Advanced Financial Models: Lecture Notes, Summary. Advanced...