You have already encountered the Moment Generating
Function of a pdf in the Part IB probability course. This
function was closely related to the Laplace
Transform of the pdf.
Now we introduce the Characteristic Function for a
random variable, which is closely related to the Fourier
Transform of the pdf.
In the same way that Fourier Transforms allow easy manipulation
of signals when they are convolved with linear system impulse
responses, Characteristic Functions allow easy manipulation of
convolved pdfs when they represent sums of random processes.
The
Characteristic Function of a pdf is defined as:
$$\Phi_X(u) = E\left[e^{iuX}\right] = \int_{-\infty}^{\infty} e^{iux} f_X(x)\,dx = \mathcal{F}(-u) \tag{1}$$
where $\mathcal{F}(u)$ is the Fourier Transform of the pdf.
Note that whenever $f_X$ is a valid pdf,

$$\Phi(0) = \int f_X(x)\,dx = 1$$
Properties of Fourier Transforms apply with $-u$ substituted for $\omega$. In particular:
- Convolution (sums of independent rv's)

$$Y = \sum_{i=1}^{N} X_i \;\Rightarrow\; f_Y = f_{X_1} * f_{X_2} * \ldots * f_{X_N} \;\Rightarrow\; \Phi_Y(u) = \prod_{i=1}^{N} \Phi_{X_i}(u) \tag{2}$$
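The convolution property is easy to verify numerically. The sketch below is a Monte Carlo check, not part of the notes: it assumes $X_1, X_2 \sim U(0,1)$, whose characteristic function $(e^{iu}-1)/(iu)$ is a standard result, and confirms that the empirical characteristic function of $Y = X_1 + X_2$ matches the product of the individual ones.

```python
import numpy as np

rng = np.random.default_rng(0)

# Characteristic function of U(0,1): phi(u) = (e^{iu} - 1) / (iu)
def phi_uniform(u):
    return (np.exp(1j * u) - 1) / (1j * u)

# Monte Carlo samples of Y = X1 + X2 (independent uniforms)
n = 1_000_000
y = rng.uniform(size=n) + rng.uniform(size=n)

for u in [0.5, 1.0, 2.0]:
    empirical = np.mean(np.exp(1j * u * y))   # E[e^{iuY}] from samples
    product = phi_uniform(u) ** 2             # Phi_X1(u) * Phi_X2(u), Equation (2)
    assert abs(empirical - product) < 1e-2
```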
- Inversion

$$f_X(x) = \frac{1}{2\pi} \int e^{-iux}\, \Phi_X(u)\,du \tag{3}$$
- Moments

$$\frac{d^n}{du^n}\Phi_X(u) = \int (ix)^n e^{iux} f_X(x)\,dx \;\Rightarrow\; E\left[X^n\right] = \int x^n f_X(x)\,dx = \frac{1}{i^n}\left.\frac{d^n}{du^n}\Phi_X(u)\right|_{u=0} \tag{4}$$
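The moments property can be cross-checked by differentiating a known characteristic function at $u=0$ with finite differences. The sketch below (not from the notes; $\mu$ and $\sigma$ are chosen arbitrarily) uses the Gaussian characteristic function $e^{iu\mu - u^2\sigma^2/2}$ derived in Equation 7 and recovers $E[X] = \mu$ and $E[X^2] = \mu^2 + \sigma^2$.

```python
import numpy as np

# Closed-form Gaussian characteristic function (Equation 7)
mu, sigma = 1.5, 2.0
def phi(u):
    return np.exp(1j * u * mu - u**2 * sigma**2 / 2)

h = 1e-4  # finite-difference step

# First moment: E[X] = (1/i) Phi'(0), central difference for Phi'(0)
d1 = (phi(h) - phi(-h)) / (2 * h)
EX = (d1 / 1j).real
assert abs(EX - mu) < 1e-6

# Second moment: E[X^2] = (1/i^2) Phi''(0) = -Phi''(0)
d2 = (phi(h) - 2 * phi(0.0) + phi(-h)) / h**2
EX2 = (d2 / 1j**2).real
assert abs(EX2 - (mu**2 + sigma**2)) < 1e-4
```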
- Scaling

If $Y = aX$, $f_Y(y) = \frac{1}{a} f_X\!\left(\frac{y}{a}\right)$ from our previous discussion of functions of random variables, then

$$\Phi_Y(u) = \int e^{iuy} f_Y(y)\,dy = \int e^{iuax} f_X(x)\,dx = \Phi_X(au) \tag{5}$$
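A quick Monte Carlo sanity check of the scaling property (an illustration only, again assuming $X \sim U(0,1)$ with characteristic function $(e^{iu}-1)/(iu)$): the empirical characteristic function of $Y = aX$ should equal $\Phi_X(au)$.

```python
import numpy as np

rng = np.random.default_rng(1)

def phi_uniform(u):          # Characteristic function of X ~ U(0,1)
    return (np.exp(1j * u) - 1) / (1j * u)

a = 3.0
x = rng.uniform(size=1_000_000)
y = a * x                    # Y = aX

u = 0.7
empirical = np.mean(np.exp(1j * u * y))       # Phi_Y(u) from samples
assert abs(empirical - phi_uniform(a * u)) < 1e-2   # Equation (5)
```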
Characteristic Function of a Gaussian pdf
The Gaussian or normal distribution is very important, largely
because of the Central Limit Theorem which we
shall prove below. Because of this (and as part of the proof
of this theorem) we shall show here that a Gaussian pdf has a
Gaussian characteristic function too.
A Gaussian distribution with mean $\mu$ and variance $\sigma^2$ has pdf:
$$f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{(x-\mu)^2}{2\sigma^2}} \tag{6}$$
Its characteristic function is obtained as follows, using a
trick known as completing the square of the exponent:
$$\begin{aligned}
\Phi_X(u) &= E\left[e^{iuX}\right] = \int e^{iux} f_X(x)\,dx
= \frac{1}{\sqrt{2\pi\sigma^2}} \int e^{-\frac{x^2 - 2\mu x + \mu^2 - 2\sigma^2 iux}{2\sigma^2}}\,dx \\
&= \left[\frac{1}{\sqrt{2\pi\sigma^2}} \int e^{-\frac{\left(x - (\mu + iu\sigma^2)\right)^2}{2\sigma^2}}\,dx\right] e^{\frac{2iu\sigma^2\mu - u^2\sigma^4}{2\sigma^2}}
= e^{iu\mu}\, e^{-\frac{u^2\sigma^2}{2}}
\end{aligned} \tag{7}$$
since the integral in brackets is similar to a Gaussian pdf
and integrates to unity.
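The closed form in Equation 7 can be confirmed by direct numerical integration of $\int e^{iux} f_X(x)\,dx$. The sketch below (values of $\mu$ and $\sigma$ chosen arbitrarily) evaluates the integral on a fine grid and compares it with $e^{iu\mu}\,e^{-u^2\sigma^2/2}$.

```python
import numpy as np

mu, sigma = 0.5, 1.2

# Fine grid covering essentially all of the Gaussian's mass
x = np.linspace(mu - 10 * sigma, mu + 10 * sigma, 200_001)
dx = x[1] - x[0]
pdf = np.exp(-(x - mu)**2 / (2 * sigma**2)) / np.sqrt(2 * np.pi * sigma**2)

# The pdf should integrate to unity (checks Phi(0) = 1)
assert abs(np.sum(pdf) * dx - 1.0) < 1e-6

for u in [0.0, 0.8, 2.0]:
    numeric = np.sum(np.exp(1j * u * x) * pdf) * dx        # ∫ e^{iux} f(x) dx
    closed = np.exp(1j * u * mu - u**2 * sigma**2 / 2)     # Equation (7)
    assert abs(numeric - closed) < 1e-6
```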
Thus the characteristic function of a Gaussian pdf is also Gaussian in magnitude, $e^{-\frac{u^2\sigma^2}{2}}$, with standard deviation $\frac{1}{\sigma}$, and with a linear phase rotation term, $e^{iu\mu}$, whose rate of rotation equals the mean $\mu$ of the pdf. This coincides with standard results from Fourier analysis of Gaussian waveforms and their spectra (e.g. Fourier transform of a Gaussian waveform with time shift).
Summation of two or more Gaussian random variables
If two variables, $X_1$ and $X_2$, with Gaussian pdfs are summed to produce $X$, their characteristic functions will be multiplied together (equivalent to convolving their pdfs) to give

$$\Phi_X(u) = \Phi_{X_1}(u)\,\Phi_{X_2}(u) = e^{iu(\mu_1 + \mu_2)}\, e^{-\frac{u^2(\sigma_1^2 + \sigma_2^2)}{2}} \tag{8}$$

This is the characteristic function of a Gaussian pdf with mean $(\mu_1 + \mu_2)$ and variance $(\sigma_1^2 + \sigma_2^2)$.
Further Gaussian variables can be added and the pdf will
remain Gaussian with further terms added to the above
expressions for the combined mean and variance.
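A simulation check of this result (parameter values chosen arbitrarily for illustration): summing samples from two Gaussians should give sample mean $\mu_1 + \mu_2$ and sample variance $\sigma_1^2 + \sigma_2^2$.

```python
import numpy as np

rng = np.random.default_rng(2)
mu1, s1 = 1.0, 0.5     # first Gaussian: mean, std
mu2, s2 = -2.0, 1.5    # second Gaussian: mean, std

n = 1_000_000
x = rng.normal(mu1, s1, n) + rng.normal(mu2, s2, n)   # X = X1 + X2

# Means add; variances add (Equation 8)
assert abs(x.mean() - (mu1 + mu2)) < 0.01
assert abs(x.var() - (s1**2 + s2**2)) < 0.02
```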
Central Limit Theorem
The central limit theorem states broadly that if a large number $N$ of independent random variables of arbitrary pdf, but with equal variance $\sigma^2$ and zero mean, are summed together and scaled by $\frac{1}{\sqrt{N}}$ to keep the total energy independent of $N$, then the pdf of the resulting variable will tend to a zero-mean Gaussian with variance $\sigma^2$ as $N$ tends to infinity.
This result is obvious from the previous result if the
input pdfs are also Gaussian, but it is the fact
that it applies for arbitrary input pdfs
that is remarkable, and is the reason for the importance of
the Gaussian (or normal) pdf. Noise generated in nature is
nearly always the result of summing many tiny random processes
(e.g. noise from electron energy transitions in a resistor or
transistor, or from distant worldwide thunder storms at a
radio antenna) and hence tends to a Gaussian pdf.
Although for simplicity, we shall prove the result only for
the case when all the summed processes have the
same variance and pdfs, the central limit
result is more general than this and applies in many cases
even when the variance and pdfs are not all the same.
Proof:
Let $X_i$ ($i = 1$ to $N$) be the $N$ independent random processes, each with zero mean and variance $\sigma^2$, which are combined to give

$$X = \frac{1}{\sqrt{N}} \sum_{i=1}^{N} X_i \tag{9}$$
Then, if the characteristic function of each input process before scaling is $\Phi(u)$ and we use Equation 5 to include the scaling by $\frac{1}{\sqrt{N}}$, the characteristic function of $X$ is

$$\Phi_X(u) = \prod_{i=1}^{N} \Phi_{X_i}\!\left(\frac{u}{\sqrt{N}}\right) = \Phi^N\!\left(\frac{u}{\sqrt{N}}\right) \tag{10}$$
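Equation 10 can be checked by simulation for a non-Gaussian input. The sketch below (an illustration, not from the notes) assumes zero-mean uniform inputs on $(-\tfrac{1}{2}, \tfrac{1}{2})$, whose characteristic function is the standard result $\sin(u/2)/(u/2)$, and compares the empirical characteristic function of the scaled sum with $\Phi^N(u/\sqrt{N})$.

```python
import numpy as np

rng = np.random.default_rng(3)

def phi(u):
    # Characteristic function of zero-mean U(-0.5, 0.5): sin(u/2)/(u/2).
    # np.sinc(x) = sin(pi x)/(pi x), so evaluate at u/(2 pi).
    return np.sinc(u / (2 * np.pi))

N = 12
# X = (1/sqrt(N)) * sum of N independent uniforms (Equation 9)
samples = rng.uniform(-0.5, 0.5, size=(1_000_000, N)).sum(axis=1) / np.sqrt(N)

u = 2.0
empirical = np.mean(np.exp(1j * u * samples))   # Phi_X(u) from samples
predicted = phi(u / np.sqrt(N)) ** N            # Equation (10)
assert abs(empirical - predicted) < 1e-2
```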
Taking logs:

$$\log \Phi_X(u) = N \log \Phi\!\left(\frac{u}{\sqrt{N}}\right) \tag{11}$$
Using Taylor's theorem to expand $\Phi\!\left(\frac{u}{\sqrt{N}}\right)$ in terms of its derivatives at $u = 0$ (and hence its moments) gives

$$\Phi\!\left(\frac{u}{\sqrt{N}}\right) = \Phi(0) + \frac{u}{\sqrt{N}}\,\Phi'(0) + \frac{1}{2}\left(\frac{u}{\sqrt{N}}\right)^2 \Phi''(0) + \frac{1}{6}\left(\frac{u}{\sqrt{N}}\right)^3 \Phi'''(0) + \frac{1}{24}\left(\frac{u}{\sqrt{N}}\right)^4 \Phi^{(4)}(0) + \ldots \tag{12}$$
From the Moments property of characteristic functions with zero mean:

- valid pdf: $\Phi(0) = E\left[X_i^0\right] = 1$
- zero mean: $\Phi'(0) = i\,E\left[X_i\right] = 0$
- variance: $\Phi''(0) = i^2 E\left[X_i^2\right] = -\sigma^2$
- scaled skewness: $\Phi'''(0) = i^3 E\left[X_i^3\right] = -i\gamma\sigma^3$
- scaled kurtosis: $\Phi^{(4)}(0) = i^4 E\left[X_i^4\right] = (\kappa + 3)\sigma^4$

These are all constants, independent of $N$, and dependent only on the shape of the pdfs $f_{X_i}$.
Substituting these moments into Equation 11 and Equation 12 and using the series expansion, $\log(1 + x) = x\,+$ (terms of order $x^2$ or smaller), gives
$$\log \Phi_X(u) = N \log \Phi\!\left(\frac{u}{\sqrt{N}}\right) = N \log\!\left(1 - \frac{u^2\sigma^2}{2N} + **\right) = N\left(-\frac{u^2\sigma^2}{2N} + **\right) = -\frac{u^2\sigma^2}{2} + \#\# \tag{13}$$
where $**$ represents the terms of order $N^{-\frac{3}{2}}$ or smaller and $\#\#$ represents the terms of order $N^{-\frac{1}{2}}$ or smaller. As $N \to \infty$,
$$\log \Phi_X(u) \to -\frac{u^2\sigma^2}{2}$$

Therefore, as $N \to \infty$,

$$\Phi_X(u) \to e^{-\frac{u^2\sigma^2}{2}} \tag{14}$$
Note that, if the input pdfs are symmetric, the skewness will be zero and the error terms will decay as $N^{-1}$ rather than $N^{-\frac{1}{2}}$; and so convergence to a Gaussian characteristic function will be more rapid.
Hence we may now infer from Equation 6, Equation 7 and Equation 14 that the pdf of $X$ as $N \to \infty$ will be given by

$$f_X(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, e^{-\frac{x^2}{2\sigma^2}} \tag{15}$$
Thus we have proved the required central limit
result.
subfigure 1.1 shows an example of convergence when the input pdfs are uniform, and $N$ is gradually increased from $1$ to $50$. By $N = 12$, convergence is good, and this is how some 'Gaussian' random generator functions operate - by summing typically $12$ uncorrelated random numbers with uniform pdfs.
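A minimal sketch of such a generator (an illustration of the classic technique, not a recommendation over modern methods): $U(0,1)$ has mean $\tfrac{1}{2}$ and variance $\tfrac{1}{12}$, so summing 12 uniforms and subtracting 6 gives an approximately $N(0,1)$ variable.

```python
import numpy as np

rng = np.random.default_rng(4)

def approx_gaussian(size, rng):
    # Sum of 12 U(0,1) variables has mean 6 and variance 12 * (1/12) = 1;
    # subtracting 6 gives an approximately standard Gaussian sample.
    return rng.uniform(size=(size, 12)).sum(axis=1) - 6.0

z = approx_gaussian(1_000_000, rng)
assert abs(z.mean()) < 0.01          # approximately zero mean
assert abs(z.var() - 1.0) < 0.01     # approximately unit variance
```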
For some less smooth or more skewed pdfs, convergence can be
slower, as shown for a highly skewed triangular pdf in
subfigure 1.2; and pdfs of discrete
processes are particularly problematic in this respect, as
illustrated in
subfigure 1.3.