Journal of Combinatorial Theory, Series A 85, 165193 (1999)
Coalescent Random Forests*
Jim Pitman
Department of Statistics,University of California,367 Evans Hall *3860,
Berkeley,California 94720-3860
Communicated by the Managing Editors
Received December 16, 1997
Various enumerations of labeled trees and forests, including Cayley's formula
n
n&2
for the number of trees labeled by [n], and Cayley's multinomial expansion
over trees, are derived from the following coalescent construction of a sequence of
random forests (R
n
,R
n&1
, ..., R
1
) such that R
k
has uniform distribution over the set
of all forests of krooted trees labeled by [n]. Let R
n
be the trivial forest with nroot
vertices and no edges. For nk2, given that R
n
, ..., R
k
have been defined so that
R
k
is a rooted forest of ktrees, define R
k&1
by addition to R
k
of a single edge
picked uniformly at random from the set of n(k&1) edges which when added to R
k
yield a rooted forest of k&1 trees. This coalescent construction is related to a
model for a physical process of clustering or coagulation, the additive coalescent in
which a system of masses is subject to binary coalescent collisions, with each pair
of masses of magnitudes xand yrunning a risk at rate x+yof a coalescent collision
resulting in a mass of magnitude x+y. The transition semigroup of the additive
coalescent is shown to involve probability distributions associated with a multi-
nomial expansion over rooted forests. 1999 Academic Press
1. INTRODUCTION
Let T
n
denote the set of all trees labeled by [n]:=[1, ..., n]. Cayley's
[14] formula *T
n
=n
n&2
is a well known consequence of the bijection
between T
n
and [n]
n&2
set up by Pru fer's [51] coding of trees. See [19,
37, 38, 60] for background, alternative proofs of Cayley's formula, and
related codings and enumerations. One purpose of this paper is to show
how various enumerations of labeled trees and forests, including Cayley's
formula, follow easily from a very different construction of random forests
by a coalescent process. A second purpose is to relate this construction to
various models of coalescent processes which have found applications
in statistical physics and polymer chemistry [34, 32, 31, 65, 22, 12, 8],
computer science [64, 26], genetics [25], combinatorics [13, 3, 7], and
astronomy [59]. A third purpose is to lay combinatorial foundations for
Article ID jcta.1998.2919, available online at http:www.idealibrary.com on
165
0097-316599 30.00
Copyright 1999 by Academic Press
All rights of reproduction in any form reserved.
* Research supported in part by N.S.F. Grants MCS94-04345 and DMS 97-03961.