function [exemplars, dpsim_vsh, netsim_vsh] = vsh_fc(s, K, p); % TEITZ & BART (1968) VERTEX SUBSTITUTION HEURISTIC (VSH) % USING FAST UPDATES FROM HANSEN & MLADENOVIC (1997), WHICH ARE % BASED ON WHITAKER'S (1983) WORK. % ################################################################## % MODIFIED VERSION THAT HAS A "FIXED CHARGE" FOR EXEMPLARS % There is no difference between vsh_fc.m and vsh.m if the % preference vector elements are all the same (which is usually % the case), however, this program (vsh_fc.m) is preferable if the % preferences are different, as it captures these differences in % the objective function, netsim_vsh. % ################################################################## % NOTE: For consistency with Frey and Dueck's (2007) affinity propagation, % code (www.psi.toronto.edu/affinitypropagation), we have % reverse-coded this program to operate on a SIMILARITY matrix, % rather than dissimilarity or distance matrices. The program can % read the similarity data (s) either in the form of an n x n matrix % of the three column form such as used in Frey and Dueck's test % problems for their Science paper (DocumentSummarization, Travel % Routing, Face Images). % INPUTS: s = an n x n symmetric or asymmetric SIMILARITY matrix, % either n x n directly or 3-column form like F&D. % K = the desired number of medians/clusters/exemplars % p = preference vector of fixed charges % OUTPUTS: exemplars = the best found set of medians/exemplars % dpsim_vsh = the sum of the similarities between each point and its % nearest exemplar. COMPARE THIS VALUE TO "dpsim" % obtained using affinity propagation. BIGGER VALUES % of dpsim are better (more similarity / less error). % For example, for the face images data used in % Frey and Dueck's (2007) paper, the value of -9692 % obtained with vsh.m is better than the value of -9734 % produced by Frey and Dueck's affinity propagation % algorithm. % netsim_vsh = dpsim_vsh + (sum of exemplar preferences). This % sum of the exemplar preferences is added as a "fixed % charge" to the objective function, and this is the % criterion that is actually optimized by "vsh_fc.m" % RECOMMENDATION FOR EASY COMPARISON TO AFFINITY PROPAGATION: % 1) Load in a test problem and run Frey and Dueck's affinity % propagation algorithm (apcluster.m), which will produce 'idx' and % 'dpsim' as output. % 2) Set K = length(unique(idx)) to use same number of clusters as % affinity propagation. % 3) Run this program (vsh.m) using s and K as input arguments. % 4) Compare 'dpsim_vsh' to the value of 'dpsim' produced by affinity % propagation, recalling that LARGER VALUES ARE BETTER % % COMPARATIVE RESULTS: Affinity Propagation Vertex sub. heur. (VSH) % K dpsim) (dpsim_vsh) % --- ------------ ----------------------- % 1. Document Summarization 4 -9582.03 -9429.90 % 2. Travel Routing 7 -60960.00 -60654.00 % 3. Olivetti Face Images 62 -9734.72 -9692.34 % 4. Fisher's iris data 6 -45.96 -43.98 % 5. Hartigan birth/death 6 -863.41 -863.41 % 6. Lin/Kernighan-drilling 11 -44979936.00 -44882137.00 % 7. European cities (202) 17 -2150.64 -2065.57 % 8. European cities (431) 16 -47706.65 -45361.12 % 9. European cities (666) 17 -131944.17 -127716.91 %10. Reinelts circuit board 22 -21798150.52 -21450616.61 % Note that the two algorithms produce the same solution for problem 5. % However, VSH produced a better solution than Aff. Prop. for each of the % remaining problems (i.e., a larger value indicating greater similarity). state = 1; % fix state for testing rand('state', state); tic; P = p; p = []; p = K; [n1,n2] = size(s); if n1 == n2 % Is s a square matrix? n = n1; S = s; s = []; else % or 3-column data like Frey & Deuck? n = max(max(s(:,1)),max(s(:,2))); S = -9.9e+10.*ones(n,n); for i = 1:n1 i1 = s(i,1); i2 = s(i,2); S(i1,i2) = s(i,3); end s = []; end for i = 1:n S(i,i) = 0; end nreps = 20; gbest = -9.9e+12; fstore = zeros(nreps,1); c1 = zeros(n,1); c2 = zeros(n,1); for klk = 1:nreps s = randperm(n); % Randomly order the exemplar a1 = S(:,s(1:p)); % candidates and let the first fbest = sum(P(s(1:p))); % p be the selected subset for i = 1:n; % Note: In this program dmax = -9.9e+12; % i indexes the rows, which for j = 1:p % assigned to exemplars or if a1(i,j) > dmax % selected columns indexed by j dmax = a1(i,j); jsel = j; end end fbest = fbest + dmax; c1(i) = s(jsel); % fbest = initial objective; end % c1(i) = exemplar to which for i = 1:n % object i is most similar dmax = -9.9e+12; % c2(i) = exemplar to which for j = 1:p % object is 2nd most similar jj = s(j); % is computed in this block if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end trig = 0; while trig == 0 % The exchange process begins here wstar = 0; % The block below finds, for each unseleted point (goin), the best exemplar % to remove (goout) if adding goin. The best (goin, goout) pair is % identified and (wstar) is the resulting change in the objective function. % If wstar >= 0, then the algorithm terminates because there is no viable % exchange that will improve the objective function further. for goin = p+1:n ii = s(goin); w = P(ii); u = zeros(n,1); for i = 1:n if S(i,ii) > S(i,c1(i)) w = w + S(i,ii) - S(i,c1(i)); else u(c1(i)) = u(c1(i)) + max(S(i,ii),S(i,c2(i))) - S(i,c1(i)); end end dmax = -9.9e+12; for j = 1:p jj = s(j); if u(jj)-P(jj) > dmax dmax = u(jj)-P(jj); goout = j; end end w = w + dmax; if w > wstar wstar = w; goinb = goin; gooutb = goout; end end if wstar <= .00001 trig = 1; continue end % The block below updates the c1 and c2 vectors if an (goin, goout) swap % results in an improvement. fbest = fbest + wstar; goinc = s(goinb); gooutc = s(gooutb); idum = s(goinb); s(goinb) = s(gooutb); s(gooutb) = idum; for i = 1:n if c1(i) == gooutc if S(i,goinc) >= S(i,c2(i)) c1(i) = goinc; else c1(i) = c2(i); dmax = -9.9e+12; for j = 1:p jj = s(j); if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end else if S(i,c1(i)) < S(i,goinc) c2(i) = c1(i); c1(i) = goinc; elseif S(i,goinc) > S(i,c2(i)) c2(i) = goinc; elseif c2(i) == gooutc dmax = -9.9e+12; for j = 1:p jj = s(j); if c1(i) == jj continue end if S(i,jj) > dmax dmax = S(i,jj); jsel = jj; end end c2(i) = jsel; end end end end if fbest > gbest gbest = fbest; sbest = s; end fstore(klk) = fbest; end netsim_vsh = gbest; exemplars = sbest(1:p); % Store best solution dpsim_vsh = netsim_vsh - sum(P(exemplars)); fmean = mean(fstore); % Mean performance fmedian = median(fstore); % Median performance toc