Commit 2d5d234

release/v1.0.0rc2: updating version numbers
1 parent 5b80eb8 commit 2d5d234

45 files changed

Lines changed: 5462 additions & 2592 deletions

cmdstanpy/_version.py

Lines changed: 1 addition & 1 deletion
@@ -1,3 +1,3 @@
 """PyPi Version"""
 
-__version__ = '1.0.0rc1'
+__version__ = '1.0.0rc2'
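
The version bump above is a one-line change to the `__version__` string. A minimal sketch of how such a bump could be scripted follows. `bump_version` is a hypothetical helper, not part of CmdStanPy or its release tooling, and it assumes the file holds exactly one single-quoted `__version__` assignment.

```python
import re

# Hypothetical helper for illustration; not part of CmdStanPy's release scripts.
# Matches a line such as: __version__ = '1.0.0rc1'
_VERSION_RE = re.compile(r"^(__version__\s*=\s*)'([^']*)'", re.MULTILINE)


def bump_version(source: str, new_version: str) -> str:
    """Return `source` with the single-quoted __version__ value replaced."""
    updated, count = _VERSION_RE.subn(
        lambda m: f"{m.group(1)}'{new_version}'", source
    )
    if count != 1:
        # Refuse to guess when the assignment is missing or duplicated.
        raise ValueError(
            f"expected exactly one __version__ assignment, found {count}"
        )
    return updated


old_text = '"""PyPi Version"""\n\n__version__ = \'1.0.0rc1\'\n'
new_text = bump_version(old_text, "1.0.0rc2")
print(new_text)
```

Raising on anything other than exactly one match keeps the sketch from silently leaving a file unchanged.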

docs/_modules/cmdstanpy/cmdstan_args.html

Lines changed: 1153 additions & 0 deletions
Large diffs are not rendered by default.

docs/_modules/cmdstanpy/compiler_opts.html

Lines changed: 465 additions & 0 deletions
Large diffs are not rendered by default.

docs/_modules/cmdstanpy/model.html

Lines changed: 400 additions & 243 deletions
Large diffs are not rendered by default.

docs/_modules/cmdstanpy/stanfit.html

Lines changed: 153 additions & 165 deletions
Large diffs are not rendered by default.

docs/_modules/cmdstanpy/utils.html

Lines changed: 99 additions & 64 deletions
Large diffs are not rendered by default.

docs/_modules/index.html

Lines changed: 6 additions & 4 deletions
@@ -5,7 +5,7 @@
 <head>
 <meta charset="utf-8" />
 <meta name="viewport" content="width=device-width, initial-scale=1.0" />
-<title>Overview: module code &#8212; CmdStanPy 1.0.0rc1 documentation</title>
+<title>Overview: module code &#8212; CmdStanPy 1.0.0rc2 documentation</title>
 
 <link href="../_static/css/theme.css" rel="stylesheet">
 <link href="../_static/css/index.ff1ffe594081f20da1ef19478df9384b.css" rel="stylesheet">
@@ -58,7 +58,7 @@
 <div id="navbar-start">
 
 <!-- This will display the version of the docs -->
-<a class='navbar-brand' href='index.html'>CmdStanPy 1.0.0rc1</a>
+<a class='navbar-brand' href='index.html'>CmdStanPy 1.0.0rc2</a>
 
 </div>
 
@@ -168,7 +168,9 @@
 <div>
 
 <h1>All modules for which code is available</h1>
-<ul><li><a href="cmdstanpy/model.html">cmdstanpy.model</a></li>
+<ul><li><a href="cmdstanpy/cmdstan_args.html">cmdstanpy.cmdstan_args</a></li>
+<li><a href="cmdstanpy/compiler_opts.html">cmdstanpy.compiler_opts</a></li>
+<li><a href="cmdstanpy/model.html">cmdstanpy.model</a></li>
 <li><a href="cmdstanpy/stanfit.html">cmdstanpy.stanfit</a></li>
 <li><a href="cmdstanpy/utils.html">cmdstanpy.utils</a></li>
 </ul>
@@ -198,7 +200,7 @@ <h1>All modules for which code is available</h1>
 
 <div class="footer-item">
 <p class="sphinx-version">
-Created using <a href="http://sphinx-doc.org/">Sphinx</a> 4.2.0.<br>
+Created using <a href="http://sphinx-doc.org/">Sphinx</a> 4.3.0.<br>
 </p>
 </div>

docs/_sources/api.rst.txt

Lines changed: 19 additions & 12 deletions
@@ -4,6 +4,15 @@
 API Reference
 #############
 
+The following documents the public API of CmdStanPy. It is expected to be stable between versions,
+with backwards compatibility between minor versions and deprecation warnings preceeding breaking changes.
+There is also the `internal API <internal_api.rst>`__, which is makes no such guarantees.
+
+.. toctree::
+:hidden:
+
+internal_api.rst
+
 *******
 Classes
 *******
@@ -54,22 +63,15 @@ CmdStanVB
 :members:
 
 
-InferenceMetadata
-=================
-
-.. autoclass:: cmdstanpy.InferenceMetadata
-:members:
-
-RunSet
-======
-
-.. autoclass:: cmdstanpy.stanfit.RunSet
-:members:
-
 *********
 Functions
 *********
 
+show_versions
+=============
+
+.. autofunction:: cmdstanpy.show_versions
+
 cmdstan_path
 ============
 
@@ -85,6 +87,11 @@ set_cmdstan_path
 
 .. autofunction:: cmdstanpy.set_cmdstan_path
 
+cmdstan_version
+================
+
+.. autofunction:: cmdstanpy.cmdstan_version
+
 set_make_env
 ============

docs/_sources/examples/VI as Sampler Inits.ipynb.txt

Lines changed: 42 additions & 68 deletions
@@ -7,14 +7,18 @@
 "## Using Variational Estimates to Initialize the NUTS-HMC Sampler\n",
 "\n",
 "In this example we show how to use the parameter estimates return by Stan's variational inference algorithm\n",
-"as the initial parameter values for Stan's NUTS-HMC sampler, using a the [earnings-logearn_height model](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/models/stan/logearn_height.stan) and data from the [posteriordb package](https://github.com/stan-dev/posteriordb).\n",
-"\n",
-"The experiments reported in the paper [Pathfinder: Parallel quasi-Newton variational inference](https://arxiv.org/abs/2108.03782) by Zhang et al. show that mean-field ADVI provides a better estimate of the posterior, as measured by the 1-Wasserstein distance to the reference posterior, than 75 iterations of the warmup Phase I algorithm used by the NUTS-HMC sampler, furthermore, ADVI is more computationally efficient, requiring fewer evaluations of the log density and gradient functions. Therefore, using the estimates from ADVI to initialize the parameter values for the NUTS-HMC sampler will allow the sampler to do a better job of adapting the stepsize and metric during warmup, resulting in better performance and estimation.\n",
-"\n",
+"as the initial parameter values for Stan's NUTS-HMC sampler.\n",
+"By default, the sampler algorithm randomly initializes all model parameters in the range uniform\\[-2, 2\\]. When the true parameter value is outside of this range, starting from the ADVI estimates will speed up and improve adaptation.\n",
 "\n",
 "### Model and data\n",
 "\n",
-"For conveince, we have copied the posteriordb model and data to this directory, in files [logearn_height.stan](logearn_height.stan) and [earnings.json](earnings.json)."
+"The Stan model and data are taken from the [posteriordb package](https://github.com/stan-dev/posteriordb).\n",
+"\n",
+"We use the [blr model](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/models/stan/blr.stan),\n",
+"a Bayesian standard linear regression model with noninformative priors,\n",
+"and its corresponding simulated dataset [sblri.json](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/data/data/sblri.json.zip),\n",
+"which was simulated via script [sblr.R](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/data/data-raw/sblr/sblr.R).\n",
+"For conveince, we have copied the posteriordb model and data to this directory, in files `blr.stan` and `sblri.json`."
 ]
 },
 {
@@ -25,19 +29,23 @@
 "source": [
 "import os\n",
 "from cmdstanpy import CmdStanModel\n",
-" \n",
-"stan_file = 'logearn_height.stan'\n",
-"data_file = 'earnings.json'\n",
 "\n",
-"# instantiate, compile bernoulli model\n",
-"model = CmdStanModel(stan_file=stan_file)"
+"stan_file = 'blr.stan' # basic linear regression\n",
+"data_file = 'sblri.json' # simulated data\n",
+"\n",
+"model = CmdStanModel(stan_file=stan_file)\n",
+"\n",
+"print(model.code())"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The earnings dataset is a set of 1192 observations of annual earnings in USD, height in inches, and indicator for sex==male."
+"### Run Stan's variational inference algorithm, obtain fitted estimates\n",
+"\n",
+"The `CmdStanModel` method `variational` runs CmdStan's ADVI algorithm.\n",
+"Because this algorithm is unstable and may fail to converge, we run it with argument `require_converged` set to `False`. We also specify a seed, to avoid instabilities as well as for reproducibility."
 ]
 },
 {
@@ -46,20 +54,17 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"import json\n",
-"with open(data_file, 'r') as fd:\n",
-" data_dict = json.load(fd)\n",
-"print(data_dict.keys())\n",
-"print(data_dict['N'])\n",
-"for i in range(5):\n",
-" print(data_dict['earn'][i], data_dict['height'][i])\n"
+"vb_fit = model.variational(data=data_file, require_converged=False, seed=123)"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The \"logearn_height\" model regresses the log earnings on height."
+"The ADVI algorithm provides estimates of all model parameters.\n",
+"\n",
+"The `variational` method returns a `CmdStanVB` object, with method `stan_variables`, which\n",
+"returns the approximate estimates of all model parameters as a Python dictionary."
 ]
 },
 {
@@ -68,17 +73,18 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(model.code())"
+"print(vb_fit.stan_variables())"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"### Run Stan's variational inference algorithm, obtain fitted estimates\n",
+"Posteriordb provides reference posteriors for all models. For the blr model, conditioned on the dataset `sblri.json`, the reference posteriors are in file [sblri-blr.json](https://github.com/stan-dev/posteriordb/blob/master/posterior_database/reference_posteriors/summary_statistics/mean/mean/sblri-blr.json)\n",
 "\n",
-"The `CmdStanModel` method `variational` runs CmdStan's ADVI algorithm.\n",
-"Conditioning the model on the data results in a posterior geometry which is difficult to navigate. Because the ADVI algorithm is unstable and may fail to converge, we run it with argument `require_converged` set to `False`. We also specify a seed, to avoid instabilities as well as for reproducibility."
+"The reference posteriors for all elements of `beta` and `sigma` are all very close to $1.0$.\n",
+"\n",
+"The experiments reported in the paper [Pathfinder: Parallel quasi-Newton variational inference](https://arxiv.org/abs/2108.03782) by Zhang et al. show that mean-field ADVI provides a better estimate of the posterior, as measured by the 1-Wasserstein distance to the reference posterior, than 75 iterations of the warmup Phase I algorithm used by the NUTS-HMC sampler, furthermore, ADVI is more computationally efficient, requiring fewer evaluations of the log density and gradient functions. Therefore, using the estimates from ADVI to initialize the parameter values for the NUTS-HMC sampler will allow the sampler to do a better job of adapting the stepsize and metric during warmup, resulting in better performance and estimation.\n"
 ]
 },
 {
@@ -87,17 +93,10 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"vb_fit = model.variational(data=data_file, require_converged=False, seed=123)"
-]
-},
-{
-"cell_type": "markdown",
-"metadata": {},
-"source": [
-"The ADVI algorithm provides estimates of all model parameters as well as the step size scaling factor `eta`.\n",
-"\n",
-"The `variational` method returns a `CmdStanVB` object, with methods `eta` and `stan_variables`, which\n",
-"return the step size scaling factor and estimates of all model parameters as a Python dictionary respectively."
+"vb_vars = vb_fit.stan_variables()\n",
+"mcmc_vb_inits_fit = model.sample(\n",
+" data=data_file, inits=vb_vars, iter_warmup=75, seed=12345\n",
+")"
 ]
 },
 {
@@ -106,21 +105,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"print(vb_fit.eta, vb_fit.stan_variables())"
+"mcmc_vb_inits_fit.summary()"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"Posteriordb provides reference posteriors for all models. For the logearn_height model, conditioned on the dataset `earnings.json`, the posterior variables are:\n",
-"\n",
-"- beta[1]: 5.782\n",
-"- beta[2]: 0.059\n",
-"- sigma: 0.894\n",
-"\n",
-"By default, the sampler algorithm randomly initializes all model parameters in the range uniform[-2, 2]. The ADVI estimates will provide a better starting point, especially w/r/t to parameter `beta[1]`, than the defaults.\n",
-"In addition, we can use the step size scaling factor to scale the initial step size, which allows us to skip the first phase of warmup (default 75 iterations)."
+"The sampler estimates match the reference posterior."
 ]
 },
 {
@@ -129,34 +121,14 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"vb_vars = vb_fit.stan_variables()\n",
-"vb_stepsize = 1.0 / vb_fit.eta\n",
-"mcmc_vb_inits_fit = model.sample(\n",
-" data=data_file, inits=vb_vars, step_size=vb_stepsize,\n",
-" adapt_init_phase=0, seed=123\n",
-")"
-]
-},
-{
-"cell_type": "code",
-"execution_count": null,
-"metadata": {},
-"outputs": [],
-"source": [
-"mcmc_vb_inits_fit.summary()"
+"print(mcmc_vb_inits_fit.diagnose())"
 ]
 },
 {
 "cell_type": "markdown",
 "metadata": {},
 "source": [
-"The sampler results match the reference posterior, (taking into account MCSE).\n",
-"\n",
-"- beta[1]: 5.782\n",
-"- beta[2]: 0.059\n",
-"- sigma: 0.894\n",
-"\n",
-"To see how this is useful, we run the sampler with default initializations, step size, and warmup."
+"Using the default random parameter initializations, we need to run more warmup iteratons. If we only run 75 warmup iterations with random inits, the result fails to estimate `sigma` correctly. It is necessary to run the model with at least 150 warmup iterations to produce a good set of estimates."
 ]
 },
 {
@@ -165,7 +137,7 @@
 "metadata": {},
 "outputs": [],
 "source": [
-"mcmc_random_inits_fit = model.sample(data=data_file, seed=123)"
+"mcmc_random_inits_fit = model.sample(data=data_file, iter_warmup=75, seed=12345)"
 ]
 },
 {
@@ -178,10 +150,12 @@
 ]
 },
 {
-"cell_type": "markdown",
+"cell_type": "code",
+"execution_count": null,
 "metadata": {},
+"outputs": [],
 "source": [
-"Using the variational estimates to skip warmup phase I shows improved N_Eff/s (number of effective sampler per second) values for all parameters. This is a simple model run on a small dataset. For complex models where the initial parameter values are far from the default initializations, this procedure may allow for faster and better adaptation during warmup."
+"print(mcmc_random_inits_fit.diagnose())"
 ]
 }
 ],
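
The revised notebook text argues that default inits drawn from uniform[-2, 2] can start far from the posterior, while approximate estimates start close to it. The sketch below illustrates only that distance argument in plain Python, without calling CmdStanPy. The helper names, parameter names, and numeric values are assumptions made up for illustration; they are not taken from the notebook or from posteriordb.

```python
import random


def default_style_inits(names, seed=None, radius=2.0):
    """Draw one init per name from uniform(-radius, radius).

    Illustrative only: Stan applies this default on the unconstrained
    scale, so constrained parameters (e.g. a positive sigma) are
    transformed before the sampler sees them.
    """
    rng = random.Random(seed)
    return {name: rng.uniform(-radius, radius) for name in names}


def max_abs_gap(inits, target):
    """Largest absolute distance between an init and its target value."""
    return max(abs(inits[k] - target[k]) for k in target)


# Hypothetical values chosen for illustration, not notebook output.
target = {"alpha": 5.0, "beta": 1.0}
approx_estimates = {"alpha": 4.8, "beta": 1.1}  # stand-in for VI estimates

random_inits = default_style_inits(target, seed=12345)
print("random inits gap:  ", max_abs_gap(random_inits, target))
print("estimate inits gap:", max_abs_gap(approx_estimates, target))
```

Because `alpha` sits at 5.0, no draw from [-2, 2] can land within 3.0 of it, whereas the stand-in estimates start within a few tenths. This is the situation in which the notebook expects initialization from variational estimates to help adaptation.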
