PROFESSOR: Let's look at an instance of Monte Carlo
simulation in more detail.
Now, again, I want to consider a model, which I'm going
to write as y equals f of x.
Now, y is some output of interest.
And let's say it's scalar valued to keep things simple.
And x is a vector of many parameters, all of which
are uncertain in some way.
Now, the first step in Monte Carlo simulation
is to assign this input vector x some kind of probability
distribution.
And this is really a modeling step,
and it depends on what x represents.
So for instance, if I'm concerned with maybe heat
transfer through a turbine blade,
and I think there's some variability, some manufacturing
variability, in the thickness of the thermal barrier coating,
then that uncertainty should be represented
in x. One element of x could basically
be the thickness of that thermal barrier
coating, and it could vary.
And the range of that distribution
really depends on how good my manufacturing process is.
Another setting might be in kind of an aerodynamic setting.
If I'm developing a wind turbine,
and I don't know what the wind conditions are
going to be from one day to the next, that's uncertain.
So I need to assign that input, say, the wind velocity
at the turbine, some kind of distribution that
reflects the knowledge I might have
about how things are distributed from one day to the next.
Another setting might be, for instance, again,
in an electrical network or some power generation
application, the demand.
And this varies.
It depends on how many people have their fridges on
and their air conditioners or various other things.
The demand varies from day to day.
That's an uncontrolled parameter.
It's uncertain.
Yet another application might be uncertainty due to lack
of knowledge in the model.
So if I'm developing a chemical kinetic model, very often
the rate parameters inside of that model,
the activation energy or the pre-exponential factor
in an Arrhenius rate expression, are uncertain, maybe
off by an order of magnitude.
I just don't have enough measurements
to really constrain that parameter very well.
But that's part of the model, and I want to make predictions.
Whatever it is, you put that into x,
and you assign a probability distribution
to x that reflects your state of knowledge.
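Here is a minimal sketch in Python of what that first step might look like. The distribution families and parameter values are purely illustrative assumptions standing in for the examples above (coating thickness, wind speed, demand); in practice each would come from data or expert judgment.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_inputs(n):
    """Draw n samples of the uncertain input vector x.

    The distributions below are illustrative stand-ins for the examples
    discussed above; they are not calibrated to any real system.
    """
    thickness = rng.normal(loc=0.5, scale=0.05, size=n)   # coating thickness (mm), manufacturing variability
    wind_speed = 8.0 * rng.weibull(2.0, size=n)           # wind speed (m/s), day-to-day variability
    demand = rng.uniform(0.8, 1.2, size=n)                # normalized demand
    return np.column_stack([thickness, wind_speed, demand])
```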
Now, the next step in Monte Carlo simulation
is to draw samples of x from that joint distribution
over all the elements of x that you've constructed.
And you evaluate the model at each of those samples.
And so if I draw n samples from x, then what I'm going to do
is create n corresponding samples of y,
call them y1 through y sub n.
Now, these samples are really the result of the Monte Carlo
simulation process.
And they reflect the mean of y, the spread of y,
the skewness, the shape.
If I histogram all the samples, it's
going to give me a sense of the overall distribution of y.
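Continuing the sketch, drawing the samples and pushing each one through a stand-in model f might look like this; the function f here is made up purely for illustration, not a real physical model.

```python
def f(x):
    """Stand-in scalar model y = f(x); purely illustrative."""
    thickness, wind_speed, demand = x
    return demand * wind_speed**2 / (1.0 + thickness)

n = 10_000
X = sample_inputs(n)                      # n samples of the input vector x
Y = np.array([f(xi) for xi in X])         # n corresponding samples of y

# Histogramming the output samples gives a picture of the distribution of y.
counts, edges = np.histogram(Y, bins=50)
```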
Now I want to consider a much more focused question that we
can analyze more carefully, which is to say,
let's just estimate the mean value of y
given the uncertainty in x.
Now, this mean value I'm going to write here
as expectation of y.
So this expectation operator, that just means the mean.
And this is going to be approximated given the results
of a Monte Carlo simulation by the average of the samples that
I've created, so the sum from i equals 1 to n
of all those yi's, divided by n.
Now, this n sample estimator I'm going
to give it a special name.
I'm going to call it y bar n.
So y bar n is the n sample estimate of the mean of y,
of the expectation of y.
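In the sketch above, that n-sample estimator is just the sample average of the output samples:

```python
# y_bar_n: the n-sample Monte Carlo estimate of the expectation of y
y_bar_n = Y.mean()
```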
So the first question you might want to ask
is, how does y bar n resemble the true mean of y?
How close is it?
Will it get there if I increase n?
So on and so forth.
So in general, y bar n, like the result of any Monte Carlo
simulation, is random.
It could be larger than the true mean of y.
It could be below it.
It could bounce up and down as I introduce
more and more samples.
It is a random variable.
The true mean, the thing I'd like to know,
is a deterministic quantity, one fixed value,
once I've prescribed my input probability
distribution on x.
But the hope is that y bar n is close to the true mean.
And the hope is that as I put in more computational effort,
as I increase the number of samples,
y bar n will somehow converge to the expectation of y.
Now, this is formalized by two laws, or theorems.
One is this notion of a law of large numbers.
And that basically says that as n goes to infinity, y bar n
converges to the actual expectation of y,
in a precise sense that we
don't need to worry about too much here.
So that's good news.
More computational effort, I know that I'll get there.
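One way to see the law of large numbers at work in the sketch is to track the running estimate as samples accumulate, reusing the Y array from above:

```python
# Running estimate y_bar_k for k = 1, ..., n; by the law of large numbers
# it should settle down around the true expectation of y as k grows.
running_mean = np.cumsum(Y) / np.arange(1, n + 1)
```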
The other very useful theorem is the central limit theorem.
And this is something maybe you remember from some probability
class you took a while ago.
It basically says that for any n that's large enough,
y bar n is approximately normally distributed.
It looks like a Gaussian.
And that Gaussian is centered around the true value,
the true mean of y.
So I'll write this formally saying,
this y bar n is normally distributed with its mean
equal to the actual expectation of y
and a variance that I'm just going to write as sigma
squared divided by n.
So visually, what does that mean?
It says that y bar n, this random variable,
has a probability density function that's bell shaped,
Gaussian shaped.
And it centers on the expectation of y.
And that if the variance is sigma squared over n,
then the standard deviation is sigma divided
by the square root of n.
And that essentially tells you how wide this bell is.
So there's an important lesson in here,
which is basically that the variance goes down as 1/n.
The standard deviation goes down as 1 over square root of n.
So for larger and larger n, then my random estimate, y bar n,
is going to cluster closer to the expectation of y.
So this bell curve will get narrower and narrower.
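In practice sigma is not known, but it can be estimated from the same samples, which gives an approximate standard error and confidence interval for the Monte Carlo estimate. A sketch, continuing from above:

```python
# Central limit theorem: y_bar_n is approximately Normal(E[y], sigma^2 / n),
# so the standard error of the estimate shrinks like sigma / sqrt(n).
sigma_hat = Y.std(ddof=1)               # sample estimate of sigma
std_error = sigma_hat / np.sqrt(n)      # width of the bell for y_bar_n

# Approximate 95% confidence interval for the expectation of y.
ci = (y_bar_n - 1.96 * std_error, y_bar_n + 1.96 * std_error)
```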
Now, the bad news is that one over square root of n
is kind of slow.
So what this means, in concrete terms,
is that if I want to reduce the standard deviation of my Monte
Carlo estimate by a factor of 10,
I need to multiply the number of samples
that I've taken by a factor of 100.
So I need to do 100 times more work
to get a result that's 10 times better.
So that's not great.
But that's Monte Carlo, and that's
actually kind of a universal rule in Monte Carlo.
And the good news is it's independent of how many input
variables I have.
I could have two uncertain input parameters in x.
I could have a million uncertain input parameters in x.
And I'm still going to converge at this 1-over-square-root-of-n
rate.
So it's very robust.
And it's also very, very flexible.
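A quick way to see that 1-over-square-root-of-n rate in the sketch is to compare the estimated standard error at two sample sizes; 100 times more samples should shrink it by roughly a factor of 10 (this just reuses the illustrative sample_inputs and f from above):

```python
# Check the 1/sqrt(n) scaling of the Monte Carlo standard error.
for m in (10_000, 1_000_000):
    Ym = np.array([f(xi) for xi in sample_inputs(m)])
    print(m, Ym.std(ddof=1) / np.sqrt(m))
```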
The other thing to note is that the distribution of y,
the distribution of these y1 through yn samples,
well, it depends on the distribution of the inputs x,
and it depends on what the function f did to those inputs,
by definition, by construction.
That's how we set up the whole simulation.
But this is really the key of what the Monte Carlo
method is trying to probe.
It's trying to understand: does the function f, my model,
amplify variability in certain inputs,
or does it diminish the variability in certain inputs?
Does my output really depend on some variables but not others?
Does it respond to certain combinations of input values?
Like, when the first value of x is high
and the second value of x is high,
then is my output value y super high or vice versa?
Does it not respond to certain other combinations
of input values?
Do the input values interact in different ways?
What we're really doing is probing this function
over the whole input space just by throwing darts,
just by taking Monte Carlo samples.
And this is really exactly what we want to understand.
This is really why Monte Carlo simulation is powerful,
because we're just looking at all the possible combinations,
seeing how they perform, and then evaluating the results.
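One crude way to start probing those questions from the same set of samples is to look at how each input co-varies with the output. This only captures simple linear effects, not interactions, and the input names here match the illustrative sketch above:

```python
# Crude sensitivity check: sample correlation of each input with the output.
for j, name in enumerate(["thickness", "wind_speed", "demand"]):
    r = np.corrcoef(X[:, j], Y)[0, 1]
    print(f"{name}: correlation with y = {r:+.2f}")
```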