Commit 2bd8548

initial checkin
0 parents  commit 2bd8548

18 files changed, +1155 -0 lines changed

.gitignore

+1
@@ -0,0 +1 @@
*~

README.txt

+54
@@ -0,0 +1,54 @@
Michael Mandel
CS 4771 Final Project
The Infinite Gaussian Mixture Model
Prof. Tony Jebara
May 5, 2005

To generate the test data used in the paper, make this call in
MATLAB:
[Y,z] = drawGmm([-3 3], [1 10], [1 2], 500);

To run the infinite GMM on the data for 10000 iterations, make this
call:
Samp = igmm_uv(Y, 10000);

It's as easy as that.

To run the regular univariate Gibbs sampler on the data, do this:
[mu,sigSq,p,z,churn] = gibbsGmm(Y,2,0,100,2,1,2,1000);
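
The positional arguments in that call may be easier to read when bound to the parameter names from the gibbsGmm.m signature. This is only a sketch: the exist() guard is an illustrative addition so the snippet can be pasted without the repository on the path, and it assumes Y was generated by the drawGmm call above.

```matlab
% Sketch: the same gibbsGmm call with named hyperparameters.
% Names follow the signature in gibbsGmm.m; the exist() guard is an
% illustrative addition and not part of the original README.
k          = 2;     % number of mixture components
m          = 0;     % prior mean of each component mean
etaSq      = 100;   % prior variance of each component mean
nu0        = 2;     % prior degrees of freedom for the variances
nu0lambda0 = 1;     % product nu0*lambda0 of the variance prior
alpha      = 2;     % Dirichlet concentration (alpha/k per class)
Nsamp      = 1000;  % number of posterior samples (rows) to return

if exist('gibbsGmm', 'file') == 2 && exist('Y', 'var') == 1
  [mu, sigSq, p, z, churn] = ...
      gibbsGmm(Y, k, m, etaSq, nu0, nu0lambda0, alpha, Nsamp);
end
```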

The igmm for multivariate data is in igmm_mv.m, which uses
logmvbetpdf.m instead of the logbetapdf.m used by igmm_uv.m.
Otherwise, both igmms are self-contained.

To generate multivariate data, use e.g.
S = [2 1; 1 2]; S(:,:,2) = S;
[Y,z] = drawGmm([3 -3; -3 3], S, [1 1], 100);
Samp = igmm_mv(Y, 10000);
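
drawGmm expects one covariance matrix per component, stacked along the third dimension. As a hedged sketch (the repmat construction and the chol-based validity check are additions, not part of the original instructions), the stack can be built and sanity-checked like this:

```matlab
% Sketch: build the D x D x K covariance array used by the multivariate
% example and confirm each slice is a valid covariance matrix.
S = repmat([2 1; 1 2], [1 1 2]);   % two identical 2 x 2 covariances

K = size(S, 3);
isValid = false(1, K);
for i = 1:K
  Si = S(:,:,i);
  [R, flag] = chol(Si);            % flag == 0 means positive definite
  isValid(i) = isequal(Si, Si') && flag == 0;
end
disp(size(S));                     % array dimensions: 2 2 2
disp(isValid);                     % both slices pass the check
```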

To generate figure 3 in the paper, use the function plotAutoCov.m



=====================================================
COPYRIGHT / LICENSE
=====================================================
All code was written by Michael Mandel, and is copyrighted under the
(lesser) GPL:
Copyright (C) 2005 Michael Mandel

This program is free software; you can redistribute it and/or
modify it under the terms of the GNU Lesser General Public License
as published by the Free Software Foundation; version 2.1 or later.

This program is distributed in the hope that it will be useful,
but WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
GNU Lesser General Public License for more details.

You should have received a copy of the GNU Lesser General Public License
along with this program; if not, write to the Free Software
Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

The authors may be contacted via email at: mim at ee columbia edu

argmax.m

+15
@@ -0,0 +1,15 @@
function am = argmax(X, dim)

% am = argmax(X, dim)
%
% Find the index of the maximum value of the matrix X.  If dim is
% supplied, find the maximum index along the dimension dim.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL

if(nargin < 2)
  [dummy, am] = max(X);
else
  [dummy, am] = max(X, [], dim);
end
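
Because argmax only forwards to the second output of max, its behaviour can be illustrated with max alone. A small sketch (the matrix X and the variable names are made up for illustration):

```matlab
% Sketch: what argmax returns, shown with the built-in max it wraps.
X = [1 5 2; 7 0 3];

[dummy, amCols] = max(X);          % same as argmax(X): one index per column
[dummy, amRows] = max(X, [], 2);   % same as argmax(X, 2): one index per row
[dummy, amVec]  = max([4 9 1]);    % a row vector yields a single index

disp(amCols);   % 2 1 2
disp(amRows');  % 2 1
disp(amVec);    % 2
```

Note that without dim, max works along the first non-singleton dimension, so a row vector yields a scalar index while a matrix yields one index per column.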

ars.m

+127
@@ -0,0 +1,127 @@
function samples = ars(logpdf, pdfargs, N, xi, support)

% Perform adaptive rejection sampling as described in Gilks & Wild
% '92, and Wild & Gilks '93.  The PDF must be log-concave.  Draw N
% samples from the pdf passed in as a function handle to its log.  The
% log could be offset by an additive constant, corresponding to an
% unnormalized distribution.
%
% The pdf function should have prototype [h, hprime] = logpdf(x,
% pdfargs{:}), where x could be a vector of points, h is the value of
% the log pdf and hprime is its derivative.
%
% The xi argument to this function is a set of points at which to
% initially evaluate the pdf, which must lie on either side of the
% distribution's mode.  The support argument is a 2-vector specifying
% the support of the pdf; it defaults to [-inf inf].
%
% This function does not use the lower squeezing bound because it
% is optimized for generating a small number of samples each call.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

samples = [];

% Don't need to approximate the curve too well, all the sorting and
% whatnot gets expensive
Nxmax = 50;

if(nargin < 5) support = [-inf inf]; end

x = sort(xi);
[h, hprime] = feval(logpdf, x, pdfargs{:});
if(~isfinite(h(1)))
  % Dump the state for debugging before bailing out
  x
  h
  hprime
  logpdf
  pdfargs{:}
  size(pdfargs)
  det(pdfargs{2})
  error('h not finite');
end

if(support(1) == 0)
  % Cheat!  Get closer and closer to 0 as needed
  while(hprime(1) < 0)
    xt = x(1)/2;
    [ht,hpt] = feval(logpdf, xt, pdfargs{:});
    [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
  end

  while(hprime(end) > 0)
    xt = x(end)*2;
    [ht,hpt] = feval(logpdf, xt, pdfargs{:});
    [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
  end
end

if(hprime(1) < 0 || hprime(end) > 0)
  % If the lower bound isn't 0, can't help it (for now)
  error(['Starting points ' num2str(x) ' do not enclose the' ...
         ' mode']);
end


% Avoid under/overflow errors.  The envelope and pdf are only
% proportional to the true pdf, so we can choose any constant
% of proportionality.
offset = max(h);
h = h-offset;

[x,z,h,hprime,hu,sc,cu] = insert(x,[], h,[], hprime,[], support);

Nsamp = 0;
while Nsamp < N
  % Draw 2 random numbers in [0,1]
  u = rand(1,2);

  % Find the largest z such that sc(z) < u
  idx = find(sc/cu < u(1));
  idx = idx(end);

  % Figure out the x in that segment that u corresponds to
  xt = x(idx) + (-h(idx) + log(hprime(idx)*(cu*u(1) - sc(idx)) + ...
                               exp(hu(idx)))) / hprime(idx);
  [ht,hpt] = feval(logpdf, xt, pdfargs{:});
  ht = ht-offset;

  % Figure out what h_u(xt) is a dumb way, uses assumption that the
  % log pdf is concave
  hut = min(hprime.*(xt - x) + h);

  % Decide whether to keep the sample
  if(u(2) < exp(ht - hut))
    Nsamp = Nsamp+1;
    samples(Nsamp) = xt;
  else
% $$$     fprintf('.');
  end

  % Update vectors if necessary
  if(length(x) < Nxmax)
    [x,z,h,hprime,hu,sc,cu] = insert(x,xt,h,ht,hprime,hpt,support);
  end
end




function [x, z, h, hprime, hu, sc, cu] = ...
    insert(x, xnew, h, hnew, hprime, hprimenew, support)
% Insert xnew into x and update all other vectors to reflect the
% new point's addition.

[x,order] = sort([x xnew]);
h = [h hnew]; h = h(order);
hprime = [hprime hprimenew]; hprime = hprime(order);

z = [support(1) x(1:end-1)+(-diff(h)+hprime(2:end).*diff(x)) ./ ...
     diff(hprime) support(end)];
hu = [hprime(1) hprime] .* (z - [x(1) x]) + [h(1) h];

% $$$ plot(z, hu);

sc = [0 cumsum(diff(exp(hu)) ./ hprime)];
cu = sc(end);

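
The rejection step in ars.m relies on the tangent lines of a concave log-density forming an upper hull, evaluated as the minimum over all tangents. The following sketch illustrates just that property for an unnormalized standard normal. The log-density, tangent points, and grid are assumptions chosen for illustration, and the code does not call ars.m itself.

```matlab
% Sketch: upper hull of a log-concave density built from tangent lines.
% logpdf, dlogpdf, the tangent points x and the grid t are illustrative.
logpdf  = @(x) -x.^2 / 2;   % unnormalized standard normal, log-concave
dlogpdf = @(x) -x;          % its derivative

x  = [-1 0.5 2];            % tangent points enclosing the mode at 0
h  = logpdf(x);
hp = dlogpdf(x);

t = linspace(-4, 4, 201);   % grid on which to compare hull and log-pdf

% Row j holds tangent j evaluated on the grid: h(j) + hp(j)*(t - x(j))
tangents = bsxfun(@plus, h(:), ...
                  bsxfun(@times, hp(:), bsxfun(@minus, t, x(:))));
hull = min(tangents, [], 1);   % same form as hut = min(hprime.*(xt-x)+h)

gap    = hull - logpdf(t);     % nonnegative for a concave log-pdf
accept = exp(-gap);            % acceptance probability exp(ht - hut)

fprintf('smallest gap %.3g, lowest acceptance %.3g\n', ...
        min(gap), min(accept));
```

Adding tangent points shrinks the gap and raises the acceptance probability, which is why ars.m keeps inserting evaluated points until it holds Nxmax of them.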

drawGmm.m

+29
@@ -0,0 +1,29 @@
function [Y,z] = drawGmm(mu, sigSq, p, N)

% Draw N samples from a mixture of length(p) Gaussians.  In the
% multivariate case, mu is a matrix where each row is the mean of one
% Gaussian, and sigSq is a 3D array such that sigSq(:,:,i) is the
% covariance of the ith Gaussian.  In the univariate case, mu(i) is the
% mean and sigSq(i) is the variance of the ith Gaussian.  p(i) is the
% (possibly unnormalized) mixing weight of the ith Gaussian, and z
% holds the component each sample was drawn from.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt

% Y ~ sum[ p_i * N(mu_i, sigma_i) ]

[tD,D] = size(mu);
if(tD == 1) D=1; end

z = drawMultinom(repmat(p(:), 1, N));

if(D == 1)
  Y = randn(1,N).*sqrt(sigSq(z)) + mu(z);
else
  for i=1:length(p)
    inClass = find(z == i);
    n = numel(inClass);
    [u,s,v] = svd(sigSq(:,:,i));
    sig = sqrt(s)*v';
    Y(inClass,:) = randn(n,D) * sig + repmat(mu(i,:), n, 1);
  end
end
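
In the multivariate branch, drawGmm colours standard normal draws with the factor sqrt(s)*v' taken from the SVD of each covariance. As a hedged check (Sigma, mu_i, and n below are illustrative values, not taken from the paper), that factor does reproduce the covariance for a symmetric positive definite matrix:

```matlab
% Sketch: verify that the SVD-based factor used in drawGmm satisfies
% sig' * sig == Sigma, so rows of randn(n,D)*sig have covariance Sigma.
Sigma = [2 1; 1 2];          % illustrative covariance matrix
[u, s, v] = svd(Sigma);
sig = sqrt(s) * v';          % s is diagonal, so sqrt is elementwise

factorErr = norm(sig' * sig - Sigma);
fprintf('reconstruction error: %.3g\n', factorErr);

% Draw a few samples around an illustrative mean, as drawGmm does
n = 5; D = 2; mu_i = [3 -3];
Yi = randn(n, D) * sig + repmat(mu_i, n, 1);
```

A Cholesky factor from chol(Sigma) would serve the same purpose for positive definite inputs. The SVD route has the advantage of also tolerating singular covariances.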

drawMultinom.m

+15
@@ -0,0 +1,15 @@
function x = drawMultinom(p)

% Draw size(p,2) samples from a multinomial distribution where the
% elements [1..size(p,1)] have probabilities p.  There should be a way
% to do it without the repmats...

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


p = cumsum(p);
pmax = max(max(p))+1;
u = repmat(rand(1,size(p,2)).*p(end,:), size(p,1), 1);
m = (u < p) .* (pmax-p);
x = argmax(m);
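
drawMultinom is an inverse-CDF draw applied to each column: scale a uniform number by the column total and pick the first row whose cumulative sum exceeds it. One possible repmat-free formulation, offered as a sketch rather than a drop-in replacement, counts how many cumulative sums the uniform value has passed. The uniform values are fixed here (an assumption for reproducibility) instead of coming from rand.

```matlab
% Sketch: per-column inverse-CDF sampling without repmat.
% p and u are illustrative; in drawMultinom u comes from rand.
p = [1 1 1 1;
     2 2 2 2;
     1 1 1 1];               % unnormalized weights, one column per draw
c = cumsum(p, 1);            % cumulative weights: rows 1, 3, 4

u = [0.5 1.5 3.5 2.9];       % stand-in for rand(1,size(p,2)).*c(end,:)

% Number of cumulative sums at or below u, plus one, is the first row
% whose cumulative sum exceeds u.
x = sum(bsxfun(@ge, u, c), 1) + 1;
disp(x);                     % 1 2 3 2
```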

drawSpiral.m

+19
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,19 @@
function Y = drawSpiral(N, std)

% Y = drawSpiral(N, std)
%
% Draw N points from a noisy 3D spiral similar to the one used in
% Rasmussen's paper, which was taken from Ueda et al (1998).  Std is
% the standard deviation of noise around the spiral.  The default
% parameters should give something like Ueda's spiral.

% Copyright (C) 2005 Michael Mandel, mim at ee columbia edu;
% distributable under GPL, see README.txt


if(nargin < 2) std = .05; end
if(nargin < 1) N = 800; end

t = rand(1,N)*4*pi + 2*pi;
Y = [10*cos(t)./t;-10*sin(t)./t; t/(4*pi)]';
Y = Y + randn(size(Y))*std;
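
The geometry of the noise-free curve follows directly from the parameterisation above: t ranges over [2*pi, 6*pi], the radius in the first two coordinates is 10/t, and the third coordinate runs from 0.5 to 1.5. A short sketch that checks this on a deterministic grid (the grid replaces the random t used by drawSpiral, purely for illustration):

```matlab
% Sketch: noise-free spiral on a regular grid of t instead of random t.
t = linspace(2*pi, 6*pi, 200);
C = [10*cos(t)./t; -10*sin(t)./t; t/(4*pi)]';   % 200 x 3 curve points

r = sqrt(C(:,1).^2 + C(:,2).^2);                % radius in the x-y plane
radiusErr = max(abs(r - 10 ./ t'));             % should be about zero
zRange = [min(C(:,3)) max(C(:,3))];             % height range of the curve

fprintf('radius error %.3g, height from %.2f to %.2f\n', ...
        radiusErr, zRange(1), zRange(2));
```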

gibbsGmm.m

+97
@@ -0,0 +1,97 @@
function [mu, sigmaSq, p, z, churn] = ...
    gibbsGmm(Y, k, m, etaSq, nu0, nu0lambda0, alpha, Nsamp)

% Use Markov chain Monte Carlo simulation to cluster the data Y into a
% mixture of k univariate Gaussians.  Priors on variables are: mu ~
% N(m, etaSq), sigmaSq ~ scaled inverse chi-square(nu0, lambda0) (passed
% in as the product nu0lambda0), pi ~ Dirichlet(alpha/k).
% Outputs of function are samples from the posterior distributions, so
% that theta(i) = [mu(i,:) sigmaSq(i,:) z], i = 1..Nsamp

N = length(Y);

% Randomly assign to classes, initialize each mean to a random member
% of its class and each variance to the variance of that class.
z = drawMultinom(ones(k,N));
for j=1:k
  yj = Y(find(z == j));
  mu(1,j) = yj(unidrnd(numel(yj)));
  sigmaSq(1,j) = std(yj).^2;
end
p(1,:) = full(sparse(1, z, 1, 1, k));

% Go!
for i=2:Nsamp
  % Mu
  for j=1:k
    n = sum(z == j);
    if(n <= 0) ybar = 0;
    else ybar = mean(Y(find(z == j)));
    end

    tmp_sigSq = 1/(n/sigmaSq(i-1,j) + 1/etaSq);
    tmp_mu = tmp_sigSq*(n*ybar/sigmaSq(i-1,j) + m/etaSq);
    mu(i,j) = drawNormal(tmp_mu, tmp_sigSq);
  end

  % Sigma, conditioned on the means drawn in this sweep
  for j=1:k
    inClass = z == j;
    n = sum(inClass);
    if(n <= 0) sigbar = 0;
    else sigbar = sum((Y(find(inClass)) - mu(i,j)).^2);
    end

    tmp_nu = nu0+n;
    tmp_nu_lambda = (nu0lambda0 + sigbar);
    sigmaSq(i,j) = drawInvChiSq(tmp_nu, tmp_nu_lambda);
  end

  % z \in {1..k}, conditioned on this sweep's means and variances
  for j=1:k
    tmp_pr(j,:) = normalLike(Y, mu(i,j), sigmaSq(i,j));
  end
  % Class counts; always 1 x k, even when the last class is empty
  n = full(sparse(1, z, 1, 1, k));

  % Scale likelihoods by class memberships times prior
  pri = repmat((n'+alpha/k)/(sum(n)-1+alpha), 1, N);
  idxs = sub2ind(size(pri), z, [1:N]);
  pri(idxs) = pri(idxs) - 1/(sum(n)-1+alpha);

  tz = drawMultinom(pri .* tmp_pr);
  churn(i) = sum(tz ~= z);
  z = tz;
  p(i,:) = n;

% $$$   plotGmm(mu(i,1), mu(i,2), sigmaSq(i,1), sigmaSq(i,2), p(i));
% $$$   pause(.1)
end


function x = drawNormal(mu, sigSq)
% Draw one sample from a Gaussian with mean mu and variance sigSq
x = randn(1)*sqrt(sigSq) + mu;


function pr = normalLike(y, mu, sigSq)
% Evaluate the likelihood of the points y under the Gaussian with mean
% mu and variance sigSq
pr = 1/sqrt(2*pi*sigSq) .* exp(-(y-mu).^2/(2*sigSq));


function x = drawInvChiSq(nu, nu_lambda)
% Draw one sample from a scaled inverse chi-square distribution with
% nu degrees of freedom and scale lambda, given the product nu*lambda
x = nu_lambda / chi2rnd(nu);


function x = drawBeta(a, b)
% Draw one sample from a Beta distribution with parameters a and b
x = betarnd(a,b);


function x = drawBernoulli(p)
% Draw Bernoulli random variables with probability p of getting 1.
% x is the same size as p.
x = rand(size(p)) < p;
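
Three of the conditional updates inside the sampler can be checked by hand on tiny inputs. The sketch below uses made-up numbers throughout, and its third part swaps chi2rnd for a sum of squared standard normals (valid only for integer degrees of freedom) so that it runs without the Statistics Toolbox. It mirrors the formulas in gibbsGmm.m but is not the function itself.

```matlab
% Sketch of three conditional updates from gibbsGmm, on toy numbers.

% (1) Conjugate normal update for one component mean.
n = 4; ybar = 1; sigmaSq = 2;     % class size, class mean, current variance
m = 0; etaSq = 100;               % prior mean and variance of mu
postVar  = 1 / (n/sigmaSq + 1/etaSq);
postMean = postVar * (n*ybar/sigmaSq + m/etaSq);

% (2) Class prior for each point with that point's own label removed.
z = [1 1 2]; k = 2; alpha = 2; N = numel(z);
counts = accumarray(z(:), 1, [k 1]);              % class sizes [2; 1]
pri = repmat((counts + alpha/k) / (N - 1 + alpha), 1, N);
idx = sub2ind(size(pri), z, 1:N);
pri(idx) = pri(idx) - 1/(N - 1 + alpha);          % leave the point out
disp(pri);                                        % each column sums to 1

% (3) Scaled inverse chi-square draw for integer nu, without chi2rnd:
% a chi-square variable with nu degrees of freedom is a sum of nu
% squared standard normals.
nu = 5; nuLambda = 3;
x = nuLambda / sum(randn(nu, 1).^2);
```

With these numbers the posterior variance is 1/2.01 and the posterior mean is 2/2.01, so the weak prior (etaSq = 100) barely pulls the estimate away from the class mean of 1.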
