Channel: Andrew Tulloch
Browsing latest articles
Browse All 17 View Live

Elements of Statistical Learning - Chapter 2 Solutions

The Stanford textbook Elements of Statistical Learning by Hastie, Tibshirani, and Friedman is an excellent (and freely available) graduate-level text in data mining and machine learning. I'm currently...

View Article



University of Sydney Mathematics Notes

This is a compilation of various sets of lecture notes I created during my Bachelor's degree in Mathematics at the University of Sydney. All .tex files are available at the GitHub repository. Applied...

View Article

A Primer on Gradient Boosted Decision Trees

Gradient boosted decision trees are an effective off-the-shelf method for building models for classification and regression tasks. Gradient boosting is a generic technique that can be...

View Article

Purely Functional Tree Search in Haskell

Haskell is an absolute pleasure to write code in, and I've been trying to use it more and more. It's a language that rewards extended effort in a way that C++ et al. do not. Consider the following...

View Article

The Ito isometry

The Itō isometry is a useful theorem in stochastic calculus that provides a fundamental tool in computing stochastic integrals - integrals with respect to a Brownian motion \begin{equation}...

View Article


Speeding up Decision Tree Training

The classic algorithm for training a decision tree for classification/regression problems (CART) is well known. The underlying algorithm acts by recursively partitioning the dataset into subsets that...

View Article

Consistency of M-estimators

Let $\Theta \subseteq \mathbb{R}^{p}$ be compact. Let $Q: \Theta \rightarrow \mathbb{R}$ be a continuous, non-random function that has a unique minimizer $\theta_{0} \in \Theta$. Let $Q_{n}: \Theta...

View Article

A basic soft-margin kernel SVM implementation in Python

Support Vector Machines (SVMs) are a family of nice supervised learning algorithms that can train classification and regression models efficiently and with very good performance in practice. SVMs are...

View Article


Online Learning with Microsoft's AdPredictor algorithm

Online learning (as opposed to more traditional batched machine learning) is increasingly applied to training machine-learned models at scale. The chief advantage is that the model is trained...

View Article


The Performance of Decision Tree Evaluation Strategies

UPDATE: Compiled evaluation is now implemented for scikit-learn regression tree/ensemble models, available at https://github.com/ajtulloch/sklearn-compiledtrees or pip install sklearn-compiledtrees. Our...

View Article

Stripe CTF Distributed Systems - Minimal Solutions

The Stripe CTF Distributed Systems edition has just finished, and I passed all the levels (and was one of the first fifty contestants to finish). In constructing my solutions, I thought it would be an...

View Article

Speeding up isotonic regression in scikit-learn by 5,000x

Isotonic regression is a useful non-parametric regression technique for fitting an increasing function to a given dataset. A classic use is in improving the calibration of a probabilistic classifier....

View Article

Python vs Julia - an example from machine learning

In Speeding up isotonic regression in scikit-learn, we dropped down into Cython to improve the performance of a regression algorithm. I thought it would be interesting to compare the performance of...

View Article


A Parallel Boggle Solver in Haskell

A cute interview question I've had is "given an $n \times n$ board of characters and a dictionary, find all possible words formed by a self-avoiding path in the grid". This is otherwise known as...

View Article

A modern Emacs setup in OS X

About a year ago I switched from Vim to Emacs, and I couldn't be happier about the move. I spent some time getting a setup I was happy with, and thought I'd share it for those who are also looking to...

View Article


The LASSO Estimator

As far as I can tell, the LASSO estimator is the closest thing we have to a miracle in modern statistics. The LASSO estimator is defined as a solution to the minimization problem $\frac{1}{n} \| Y - X...

View Article

Cambridge Part III Mathematics Notes

I've cleaned up (somewhat) my notes from Cambridge Part III and have put them online - with LaTeX sources available on GitHub and PDFs linked below. Advanced Financial Models: Lecture Notes, Summary. Advanced...

View Article
