|
4 | 4 | "cell_type": "markdown", |
5 | 5 | "metadata": {}, |
6 | 6 | "source": [ |
7 | | - "# MCMC Sampling\n", |
| 7 | + "# MCMC Sampling" |
| 8 | + ] |
| 9 | + }, |
| 10 | + { |
| 11 | + "cell_type": "markdown", |
| 12 | + "metadata": {}, |
| 13 | + "source": [ |
| 14 | + "## Overview\n", |
8 | 15 | "\n", |
9 | | - "The [CmdStanModel](https://mc-stan.org/cmdstanpy/api.html#cmdstanmodel) object's\n", |
10 | | - "method [sample](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanModel.sample)\n", |
11 | | - "invokes Stan's adaptive HMC-NUTS sampler which uses the Hamiltonian Monte Carlo (HMC) algorithm\n", |
12 | | - "and its adaptive variant the no-U-turn sampler (NUTS) to produce a set of\n", |
13 | | - "draws from the posterior distribution of the model parameters conditioned on the data.\n", |
| 16 | + "Stan's MCMC sampler implements the Hamiltonian Monte Carlo (HMC) algorithm and its adaptive variant\n", |
| 17 | + "the no-U-turn sampler (NUTS).\n", |
| 18 | + "It creates a set of draws from the posterior distribution of the model conditioned on the data,\n", |
| 19 | + "allowing for asymptotically exact Bayesian inference of the model parameters.\n", |
| 20 | + "Each draw consists of the values for all parameter, transformed parameter, and\n", |
| 21 | + "generated quantities variables, reported on the constrained scale.\n", |
14 | 22 | "\n", |
15 | | - "The `sample` method returns a [CmdStanMCMC](https://mc-stan.org/cmdstanpy/api.html#cmdstanmcmc) object.\n", |
16 | | - "Underlyingly, the sampler run outputs are a set of per-chain Stan CSV files.\n", |
17 | | - "The `CmdStanMCMC` object provide multiple accessor functions which allow the user\n", |
18 | | - "to access the resulting sample in whatever data format is needed for further analysis.\n", |
| 23 | + "The [CmdStanModel sample](https://mc-stan.org/cmdstanpy/api.html#cmdstanpy.CmdStanModel.sample) method\n", |
| 24 | + "wraps the CmdStan [sample](https://mc-stan.org/docs/cmdstan-guide/mcmc-config.html) method.\n", |
| 25 | + "Under the hood, CmdStan writes its outputs as a set of per-chain Stan CSV files.\n", |
| 26 | + "In addition to the resulting sample, reported as one row per draw,\n", |
| 27 | + "the Stan CSV files encode information about the inference engine configuration\n", |
| 28 | + "and the sampler state.\n", |
| 29 | + "The NUTS-HMC adaptive sampler algorithm also outputs the per-chain\n", |
| 30 | + "HMC tuning parameters `step_size` and `metric`.\n", |
19 | 31 | "\n", |
20 | | - "The sample can be extracted in tabular format, either as\n", |
| 32 | + "The `sample` method returns a [CmdStanMCMC](https://mc-stan.org/cmdstanpy/api.html#cmdstanmcmc) object,\n", |
| 33 | + "which provides access to the disparate information from the Stan CSV files.\n", |
| 34 | + "Its accessor methods return the sample\n", |
| 35 | + "in whatever data format is needed for further analysis.\n", |
21 | 36 | "\n", |
22 | | - "- an [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy.ndarray)\n", |
| 37 | + "- The sample can be extracted in tabular format, either as\n", |
23 | 38 | "\n", |
24 | | - "- a [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame)\n", |
| 39 | + " + a [numpy.ndarray](https://numpy.org/doc/stable/reference/generated/numpy.ndarray.html#numpy.ndarray)\n", |
25 | 40 | "\n", |
26 | | - "The sample can be treated as a collection of named, structured variables, and extracted as\n", |
| 41 | + " + a [pandas.DataFrame](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.html#pandas.DataFrame)\n", |
27 | 42 | "\n", |
28 | | - "- a Python `dict` mapping names to `numpy.ndarray` objects\n", |
| 43 | + "- The sample can be treated as a collection of named, structured variables, and extracted as\n", |
29 | 44 | "\n", |
30 | | - "- an [xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html)\n", |
| 45 | + " + a Python `dict` mapping names to `numpy.ndarray` objects\n", |
31 | 46 | "\n", |
32 | | - "The CmdStanMCMC object also provides access to the per-chain HMC tuning parameters `step_size` and `metric`\n", |
33 | | - "and the [InferenceMetadata](https://mc-stan.org/cmdstanpy/internal_api.html#inferencemetadata)\n", |
34 | | - "which consists of the CmdStan configuration, the layout of the CSV file data table,\n", |
35 | | - "and the mapping between the table columns and the Stan program structured variables.\n", |
| 47 | + " + an [xarray.Dataset](https://docs.xarray.dev/en/stable/generated/xarray.Dataset.html)\n", |
36 | 48 | "\n", |
37 | 49 | "\n", |
38 | | - "\n" |
| 50 | + "In addition, the `CmdStanMCMC` object has accessor methods for:\n", |
| 51 | + "\n", |
| 52 | + "- The per-chain HMC tuning parameters `step_size` and `metric`\n", |
| 53 | + "\n", |
| 54 | + "- The CmdStan run configuration and console outputs\n", |
| 55 | + "\n", |
| 56 | + "- The sampler algorithm diagnostics\n", |
| 57 | + "\n", |
| 58 | + "- The mapping between the Stan model variables and the corresponding CSV file columns" |
39 | 59 | ] |
40 | 60 | }, |
41 | 61 | { |
|
185 | 205 | "cell_type": "markdown", |
186 | 206 | "metadata": {}, |
187 | 207 | "source": [ |
188 | | - "## Summarizing the sample" |
189 | | - ] |
190 | | - }, |
191 | | - { |
192 | | - "cell_type": "code", |
193 | | - "execution_count": null, |
194 | | - "metadata": { |
195 | | - "scrolled": true |
196 | | - }, |
197 | | - "outputs": [], |
198 | | - "source": [ |
199 | | - "fit.summary()" |
200 | | - ] |
201 | | - }, |
202 | | - { |
203 | | - "cell_type": "markdown", |
204 | | - "metadata": {}, |
205 | | - "source": [ |
206 | | - "## Analyzing the sample" |
| 208 | + "## Accessing the sampler outputs" |
207 | 209 | ] |
208 | 210 | }, |
209 | 211 | { |
|
322 | 324 | "cell_type": "markdown", |
323 | 325 | "metadata": {}, |
324 | 326 | "source": [ |
325 | | - "### Saving the sampler output files" |
326 | | - ] |
327 | | - }, |
328 | | - { |
329 | | - "cell_type": "markdown", |
330 | | - "metadata": {}, |
331 | | - "source": [ |
332 | | - "The sampler output files are written to a temporary directory which\n", |
333 | | - "is deleted upon session exit unless the ``output_dir`` argument is specified.\n", |
334 | | - "The ``save_csvfiles`` function moves the CmdStan CSV output files\n", |
335 | | - "to a specified directory without having to re-run the sampler.\n", |
336 | | - "The console output files are not saved. These files are treated as ephemeral; if the sample is valid, all relevant information is recorded in the CSV files." |
| 327 | + "## Summarizing the sample" |
337 | 328 | ] |
338 | 329 | }, |
339 | 330 | { |
340 | 331 | "cell_type": "code", |
341 | 332 | "execution_count": null, |
342 | | - "metadata": {}, |
| 333 | + "metadata": { |
| 334 | + "scrolled": true |
| 335 | + }, |
343 | 336 | "outputs": [], |
344 | 337 | "source": [ |
345 | | - "# fit.save_csvfiles(dir=\"some_dir\")" |
| 338 | + "fit.summary()" |
346 | 339 | ] |
347 | 340 | }, |
348 | 341 | { |
|
369 | 362 | "cell_type": "markdown", |
370 | 363 | "metadata": {}, |
371 | 364 | "source": [ |
372 | | - "#### eight_schools.stan" |
| 365 | + "**eight_schools.stan**" |
373 | 366 | ] |
374 | 367 | }, |
375 | 368 | { |
|
386 | 379 | "cell_type": "markdown", |
387 | 380 | "metadata": {}, |
388 | 381 | "source": [ |
389 | | - "#### eight_schools.data.json" |
| 382 | + "**eight_schools.data.json**" |
390 | 383 | ] |
391 | 384 | }, |
392 | 385 | { |
|
434 | 427 | "source": [ |
435 | 428 | "print(eight_schools_fit.diagnose())" |
436 | 429 | ] |
| 430 | + }, |
| 431 | + { |
| 432 | + "cell_type": "markdown", |
| 433 | + "metadata": {}, |
| 434 | + "source": [ |
| 435 | + "## Saving the sampler output files\n", |
| 436 | + "\n", |
| 437 | + "By default, the sampler output files are written to a temporary directory,\n", |
| 438 | + "which is deleted upon session exit; specify the ``output_dir`` argument to keep them elsewhere.\n", |
| 439 | + "The ``save_csvfiles`` method moves the CmdStan CSV output files\n", |
| 440 | + "to a specified directory, so the sampler does not have to be re-run.\n", |
| 441 | + "The console output files are not saved. These files are treated as ephemeral; if the sample is valid, all relevant information is recorded in the CSV files." |
| 442 | + ] |
437 | 443 | } |
438 | 444 | ], |
439 | 445 | "metadata": { |
|