Citing a rare paper by von Neumann: Various techniques used in connection with random digits

By | July 18, 2019

John von Neumann wrote a marvelous paper in 1951 about random variate generation, which is widely cited in the literature. Here is the BibTeX for those who are here for the citation: @incollection{vonNeumann1951, title = {Various Techniques Used in Connection with Random Digits}, author = {von Neumann, John}, booktitle = {Monte Carlo Method}, editor =… Read More »

Programming and probability: Sampling from a discrete distribution over an infinite set

By | September 2, 2018

This post is an introductory tutorial which presents a simple algorithm for sampling from a discrete probability distribution $p(k)$ defined over a countably infinite set. We also show how to use higher-order programming to implement the algorithm in a modular and reusable way, and how to amortize the cost of sampling using memoization. Introduction and… Read More »

Notating “conditional” probabilities

By | February 27, 2018

Let $p$ denote a probability density with support $x \in \mathcal{X}$, and let $\theta \in \mathbf{T}$ parametrize $p$. It is common in the literature to express this situation using the notation $p(x\mid \theta)$. However, this “conditional” notation using the vertical bar (or mid in LaTeX) is often used ambiguously. Notation with the conditional symbol $p(x… Read More »

A pair of probability problems

By | January 15, 2018

Let $w = (w_1,w_2,\dots)$ be an infinite sequence which sums to unity. Consider the function $$f_w(u) = \sum_{j=1}^{\infty}\mathbf{I}(u < w_j).$$ Prove that $f_w$ is a probability density in $u > 0$. Describe a simulation technique for generating a random variable $U \sim f_w(u)$.

On statistical two-sample testing

By | July 3, 2017

Statistical two-sample testing concerns the following question: given observations $X = \lbrace x_1, x_2, \dots, x_m \rbrace$ and $Y = \lbrace y_1, y_2, \dots, y_n \rbrace$ drawn from distributions $p$ and $q$, respectively, can we decide whether $p=q$? Two-sample testing is an active field of research with an enormous amount of literature. Asymptotic theory of… Read More »

A thought experiment with the Bayesian posterior predictive distribution

By | January 29, 2017

Let $\pi(\theta)$ be a prior for parameter $\Theta$, and $p(x|\theta)$ a likelihood which generates an exchangeable sequence of random variables $(X_1,X_2,X_3\dots)$. Given a set of observations $D := \lbrace X_0=x_0, X_1=x_1, \dots, X_{N-1}=x_{N-1}\rbrace$, the posterior predictive distribution for the next random variable in the sequence $X_N$ is defined as $$p(X_{N}=s | D) = \int p(X_{N}=s… Read More »

Trump, Clinton, and the failures of election forecasting

By | November 13, 2016

Election season has come and gone. Donald Trump pulled off what has been repeatedly characterized as a “stunning upset” over Hillary Clinton. The web has reacted with intense postmortem analysis about the failures of election polling and forecasting. But was it really that surprising? Several polls conducted in the lead up to the election reported… Read More »

Metropolis-Hastings with Gaussian drift proposal on bounded support

By | February 13, 2016

Professor Darren Wilkinson has an excellent blog post on a common implementation issue that arises when the standard random-walk Metropolis Hastings algorithm, with a Gaussian proposal, is modified to sample from a target distribution $p(x)$ where $$\text{supp}(p) = \lbrace x : p(x) > 0 \rbrace = (0, \infty).$$ In this blog post, we are going to briefly review the… Read More »

Hello world!

By | February 13, 2016

Welcome to Random Seed. A good proportion of implementors of probabilistic computation fail to address two key issues relating to entropy control in their software Keeping track of the state of the random number generator. Permuting the random seed. We will look into these two issues, as well as many other topics, throughout future blog posts.