The likelihood is just the binomial, with parameters $(n_i, \theta_i)$. Let $y_i$ be the data at time point $i$, $y_{1:N}$ be the data up to time point $N$, and $\theta_i$ be the parameters, with $\phi$ being the hyperparameters. Let $\Theta$ represent the collection of all $\theta_i$.
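Then the likelihood is given as
$$p(y_{1:N} \mid \Theta) = \prod_{i=1}^{N} p(y_i \mid \theta_i),$$
since $y_i$ is independent of $y_j$ (for $i \neq j$) conditional on $\Theta$. And this is just
$$\prod_{i=1}^{N} \binom{n_i}{y_i}\, \theta_i^{y_i} (1 - \theta_i)^{n_i - y_i}.$$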
Certainly one is interested in predicting $y_{N+1}$ given $y_{1:N}$ and the hyperparameters $\phi$.
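As a rough illustration of the binomial likelihood described above, the product of conditionally independent binomial terms (one per time point) can be evaluated on the log scale. This is only a sketch under stated assumptions: the function name `binomial_log_likelihood` and the argument names `y` (success counts), `n` (numbers of trials), and `theta` (success probabilities, each strictly between 0 and 1) are hypothetical and not taken from the original text.

```python
import math


def binomial_log_likelihood(y, n, theta):
    """Log of the product of independent binomial pmfs, one term per time point.

    y, n, theta are equal-length sequences (assumed names, for illustration):
    y[i] successes out of n[i] trials with success probability theta[i],
    where 0 < theta[i] < 1.
    """
    if not (len(y) == len(n) == len(theta)):
        raise ValueError("y, n and theta must have the same length")

    total = 0.0
    for y_i, n_i, theta_i in zip(y, n, theta):
        # log of the binomial coefficient C(n_i, y_i), computed via log-gamma
        log_coef = (
            math.lgamma(n_i + 1)
            - math.lgamma(y_i + 1)
            - math.lgamma(n_i - y_i + 1)
        )
        # log of theta^y * (1 - theta)^(n - y)
        total += log_coef + y_i * math.log(theta_i) + (n_i - y_i) * math.log1p(-theta_i)
    return total
```

Summing log terms rather than multiplying probabilities avoids numerical underflow when the number of time points is large; exponentiating the result recovers the likelihood itself.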