# ArviZ migration guide
We have been working on refactoring ArviZ to allow more flexibility and extensibility of its elements, while keeping, as much as possible, a friendly user interface that gives sensible results with few or no arguments.
One important change is enhanced modularity. Everything will still be available through a common namespace, `arviz`, but ArviZ will now be composed of 3 smaller libraries:

- `arviz-base`: data related functionality, including converters from different PPLs.
- `arviz-stats`: statistical functions and diagnostics.
- `arviz-plots`: visual checks built on top of `arviz-stats` and `arviz-base`.
Each library depends only on a minimal set of libraries, with a lot of functionality built on top of optional dependencies. This keeps ArviZ smaller and easier to install as you can install only the components you really need. The main examples are:
- `arviz-base` has no I/O library as a dependency, but you can use `netcdf4`, `h5netcdf` or `zarr` to read and write your data, allowing you to install only the one you need.
- `arviz-plots` has no plotting library as a dependency, but it can generate plots with `matplotlib`, `bokeh` or `plotly` if they are installed.
At the time of writing, the `arviz-xyz` libraries are independent of the `arviz` library, but `arviz` tries to import them and exposes all their elements through the `arviz.preview` namespace. In the future, with the ArviZ 1.0 release, the `arviz` namespace will look like `arviz.preview` looks today.
We encourage you to try it out and get a head start on the migration!
```python
import arviz.preview as az
# change to `import arviz as az` after the ArviZ 1.0 release
```
Check that all 3 libraries have been exposed correctly:
```python
print(az.info)
```
```
arviz_base available, exposing its functions as part of arviz.preview
arviz_stats available, exposing its functions as part of arviz.preview
arviz_plots available, exposing its functions as part of arviz.preview
```
## arviz-base
### Credible intervals and rcParams
Some global configuration settings have changed. For example, the default credible interval probability (`ci_prob`) has been updated from 0.94 to 0.89. Using 0.89 produces intervals with lower variability, leading to more stable summaries. At the same time, keeping a non-standard value (rather than 0.90 or 0.95) serves as a friendly reminder that the choice of interval should reflect the problem at hand.

In addition, a new setting, `ci_kind`, has been introduced, which defaults to "eti" (equal-tailed interval). This controls the method used to compute credible intervals. The alternative is "hdi" (highest density interval), which was previously the default.
Defaults set via `rcParams` are not fixed rules; they're meant to be adjusted to fit the needs of your analysis. `rcParams` offers a convenient way to establish global defaults for your workflow, while most functions that compute credible intervals also provide `ci_prob` and `ci_kind` arguments to override these settings locally.
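For example, a minimal sketch of both override styles (assuming `summary` exposes these arguments, as most interval-computing functions do):

```python
# global: change the default for the whole session
az.rcParams["stats.ci_prob"] = 0.9

# local: override for a single call only
dt = az.load_arviz_data("centered_eight")
az.summary(dt, ci_prob=0.95, ci_kind="hdi")
```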
You can check all default settings with:
```python
az.rcParams
```
```
RcParams({'data.http_protocol': 'https',
          'data.index_origin': 0,
          'data.sample_dims': ('chain', 'draw'),
          'data.save_warmup': False,
          'plot.backend': 'matplotlib',
          'plot.density_kind': 'kde',
          'plot.max_subplots': 40,
          'stats.ci_kind': 'eti',
          'stats.ci_prob': 0.89,
          'stats.ic_compare_method': 'stacking',
          'stats.ic_pointwise': True,
          'stats.ic_scale': 'log',
          'stats.module': 'base',
          'stats.point_estimate': 'mean'})
```
### DataTree
One of the main differences is that the `arviz.InferenceData` object no longer exists; `arviz-base` uses `xarray.DataTree` instead. This is a new data structure in xarray, so it might still have some rough edges, but it is much more flexible and powerful. To give some examples, I/O will now be more flexible, and any format supported by xarray is automatically available to you, with no need to add wrappers on top of it within ArviZ. It is also possible to have arbitrary nesting of variables within groups and subgroups.
Important

Not all the functionality of `xarray.DataTree` will be compatible with ArviZ, as it would be too much work for us to cover and maintain. If there are things you have always wanted to do that were not possible with `InferenceData` and are now possible with `DataTree`, please try them out and give feedback on them and on the desired behaviour for things that still don't work. After a couple of releases the "ArviZverse" will stabilize much more and it might no longer be possible to add support for them.
### I already have an `InferenceData` object from an external library
`InferenceData` already has a method to convert it to `DataTree`, `idata.to_datatree()`, but you can also use `az.convert_to_datatree(idata)`:
```python
import arviz as arviz_legacy

idata = arviz_legacy.load_arviz_data("centered_eight")
az.convert_to_datatree(idata)
```
### What about my existing netcdf/zarr files?
They are still valid. There have been no changes on this end, and we don't plan to make any. The only difference is which function handles I/O. There used to be functions in the main `arviz` namespace, like `arviz.from_zarr`, and methods on `InferenceData`, such as `.to_netcdf`. These are now part of xarray itself:
| Function in legacy ArviZ | New equivalent in xarray |
|---|---|
| `arviz.from_netcdf` | `xarray.open_datatree` |
| `arviz.from_zarr` | `xarray.open_datatree` |
| `arviz.to_netcdf` | - |
| `arviz.to_zarr` | - |
| `arviz.InferenceData.from_netcdf` | - |
| `arviz.InferenceData.from_zarr` | - |
| `arviz.InferenceData.to_netcdf` | `xarray.DataTree.to_netcdf` |
| `arviz.InferenceData.to_zarr` | `xarray.DataTree.to_zarr` |
Here is an example where we store an `InferenceData` as netcdf, then read the generated file as is through `xarray.open_datatree()`:

```python
idata.to_netcdf("example.nc")
```
```
'example.nc'
```
```python
from xarray import open_datatree

dt = open_datatree("example.nc")
dt
```
### Other key differences
`DataTree` supports an arbitrary level of nesting (as opposed to the exactly 1 level of nesting in `InferenceData`). Thus, to keep things consistent, accessing its groups returns a `DataTree`, even if it is a leaf group. This means that `dt["posterior"]` will now return a `DataTree`. In many cases this is irrelevant, but there will be some cases where you'll want the group as a `Dataset` instead. You can achieve this with `dt["posterior"].dataset`.
There are no changes at the variable/`DataArray` level. Thus, `dt["posterior"]["theta"]` is still a `DataArray`; accessing its variables is one of the cases where having either a `DataTree` or a `Dataset` is irrelevant.
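To make the distinction concrete, here is a minimal sketch of the three access patterns just described:

```python
import xarray as xr

dt = az.load_arviz_data("centered_eight")

group = dt["posterior"]           # an xarray.DataTree, even for leaf groups
ds = dt["posterior"].dataset      # the same group, now as a Dataset (view)
theta = dt["posterior"]["theta"]  # a DataArray, unchanged from before

print(isinstance(group, xr.DataTree), isinstance(ds, xr.Dataset), isinstance(theta, xr.DataArray))
```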
### Enhanced converter flexibility
Were you constantly needing to add an extra axis to your data because it didn't have any chain dimension? No more!
```python
import numpy as np

rng = np.random.default_rng()
data = rng.normal(size=1000)

# arviz_legacy.from_dict({"posterior": {"mu": data}}) would fail
# unless you did data[None, :] to add the chain dimension
az.rcParams["data.sample_dims"] = "sample"
dt = az.from_dict({"posterior": {"mu": data}})
dt
```
```python
# arviz-stats and arviz-plots also take it into account
az.plot_dist(dt);
```
Note

It is also possible to modify `sample_dims` through arguments to the different functions.
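For example, this hedged sketch passes `sample_dims` directly to the converter instead of changing the global default (reusing `data` from the cell above):

```python
# same result as setting az.rcParams["data.sample_dims"], but per call
dt = az.from_dict({"posterior": {"mu": data}}, sample_dims=["sample"])
```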
### New data wrangling features
We have also added multiple functions to help with common data wrangling tasks, mostly from and to `xarray.Dataset`. For example, you can convert a dataset to a wide format dataframe, with unique combinations of `sample_dims` as its rows, with `dataset_to_dataframe()`:
```python
# back to the default behaviour
az.rcParams["data.sample_dims"] = ["chain", "draw"]

dt = az.load_arviz_data("centered_eight")
az.dataset_to_dataframe(dt.posterior.dataset)
```
|  | mu | theta[Choate] | theta[Deerfield] | theta[Phillips Andover] | theta[Phillips Exeter] | theta[Hotchkiss] | theta[Lawrenceville] | theta[St. Paul's] | theta[Mt. Hermon] | tau |
|---|---|---|---|---|---|---|---|---|---|---|
| (0, 0) | 7.871796 | 12.320686 | 9.905367 | 14.951615 | 11.011485 | 5.579602 | 16.901795 | 13.198059 | 15.061366 | 4.725740 |
| (0, 1) | 3.384554 | 11.285623 | 9.129324 | 3.139263 | 9.433211 | 7.811516 | 2.393088 | 10.055223 | 6.176724 | 3.908994 |
| (0, 2) | 9.100476 | 5.708506 | 5.757932 | 10.944585 | 5.895436 | 9.992984 | 8.143327 | 7.604753 | 8.767647 | 4.844025 |
| (0, 3) | 7.304293 | 10.037275 | 8.809068 | 9.900924 | 5.768832 | 9.062876 | 6.958424 | 10.298256 | 3.155304 | 1.856703 |
| (0, 4) | 9.879675 | 9.149146 | 5.764986 | 7.015397 | 15.688710 | 3.097395 | 12.025763 | 11.316745 | 17.046142 | 4.748409 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| (3, 495) | 1.542688 | 3.737751 | 5.393632 | 0.487845 | 4.015486 | 0.717057 | -2.675760 | 0.415968 | -4.991247 | 2.786072 |
| (3, 496) | 1.858580 | -0.291737 | 0.110315 | 1.468877 | -3.653346 | 1.844292 | 6.055714 | 4.986218 | 9.290380 | 4.281961 |
| (3, 497) | 1.766733 | 3.532515 | 2.008901 | 0.510806 | 0.832185 | 2.647687 | 4.707249 | 3.073314 | -2.623069 | 2.740607 |
| (3, 498) | 3.486112 | 4.182751 | 7.554251 | 4.456034 | 3.300833 | 1.563307 | 1.528958 | 1.096098 | 8.452282 | 2.932379 |
| (3, 499) | 3.404464 | 0.192956 | 6.498428 | -0.894424 | 6.849020 | 1.859747 | 7.936460 | 6.762455 | 1.295051 | 4.461246 |

2000 rows × 10 columns
Note it is also aware of ArviZ naming conventions, in addition to using the `sample_dims` `rcParam`. It can be further customized through a `labeller` argument.
Tip

If you want to convert to a long format dataframe, you should use `xarray.Dataset.to_dataframe()` instead.
## arviz-stats
Stats and diagnostics functionality has also had some changes. It should also be noted that, out of the 3 new modular libraries, this is currently the one lagging furthest behind. At the same time, it already has several new features that won't be added to legacy ArviZ at any point; check out its API reference page for the complete and up-to-date list of available functions.
### Model comparison
For a long time we have been recommending PSIS-LOO-CV (`loo`) over WAIC. PSIS-LOO-CV is more robust, has better theoretical properties, and offers diagnostics to assess the reliability of the estimates. For these reasons, we have decided to remove WAIC from `arviz-stats` and instead focus exclusively on PSIS-LOO-CV for model comparison.
Thus, we now offer many new features related to LOO, including:

- Compute weighted expectations, including mean, variance, quantiles, etc. See `loo_expectations()`.
- Compute predictive metrics such as RMSE, MAE, etc. See `loo_metrics()`.
- Compute CRPS/SCRPS. See `loo_score()`.
- Compute PSIS-LOO-CV for approximate posteriors. See `loo_approximate_posterior()`.
For a complete list, check the API reference, in particular its model comparison section.
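As a quick sketch of the core workflow (`centered_eight` ships a log_likelihood group; the `loo_expectations` call and its `kind` argument are assumptions, check the API reference):

```python
dt = az.load_arviz_data("centered_eight")

elpd = az.loo(dt)  # PSIS-LOO-CV, with Pareto k diagnostics attached
print(elpd)

# mean of the LOO predictive distribution per observation (assumed signature)
print(az.loo_expectations(dt, kind="mean"))
```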
### `dim` and `sample_dims`
Similarly to the rest of the libraries, most functions take an argument to indicate which dimensions should be reduced (or considered core dims) in the different computations. Given that `arviz-stats` is the library with the behaviour and API closest to xarray itself, this argument can be either `dim` or `sample_dims`, as a way to keep the APIs of ArviZ and xarray similar.
Let's see the differences in action. `ess` uses `sample_dims`. This means we can do:
```python
dt = az.load_arviz_data("non_centered_eight")
az.ess(dt, sample_dims=["chain", "draw"])
```
```
<xarray.DatasetView> Size: 656B
Dimensions:  (school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
Data variables:
    mu       float64 8B 1.65e+03
    theta_t  (school) float64 64B 2.058e+03 2.51e+03 ... 2.455e+03 2.757e+03
    tau      float64 8B 1.115e+03
    theta    (school) float64 64B 1.942e+03 2.199e+03 ... 2.079e+03 2.106e+03
```
but we can’t do:
```python
try:
    az.ess(dt, sample_dims=["school", "draw"])
except Exception as err:
    import traceback
    traceback.print_exception(err)
```
```
Traceback (most recent call last):
  File "/tmp/ipykernel_12737/2159531800.py", line 2, in <module>
    az.ess(dt, sample_dims=["school", "draw"])
    ~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/arviz_stats/sampling_diagnostics.py", line 141, in ess
    return data.azstats.ess(
           ~~~~~~~~~~~~~~~~^
        sample_dims=sample_dims, group=group, method=method, relative=relative, prob=prob
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/arviz_stats/accessors.py", line 96, in ess
    return self._apply(
           ~~~~~~~~~~~^
        "ess", sample_dims=sample_dims, method=method, relative=relative, prob=prob, **kwargs
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/arviz_stats/accessors.py", line 291, in _apply
    group_i: apply_function_to_dataset(
             ~~~~~~~~~~~~~~~~~~~~~~~~~^
        func,
        ^^^^^
    ...<3 lines>...
        kwargs=kwargs,
        ^^^^^^^^^^^^^^
    )
    ^
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/arviz_stats/accessors.py", line 56, in apply_function_to_dataset
    result = func(da, **subset_kwargs)
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/arviz_stats/base/dataarray.py", line 77, in ess
    return apply_ufunc(
        self.array_class.ess,
    ...<9 lines>...
        },
    )
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 1267, in apply_ufunc
    return apply_dataarray_vfunc(
        variables_vfunc,
    ...<4 lines>...
        keep_attrs=keep_attrs,
    )
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 310, in apply_dataarray_vfunc
    result_var = func(*data_vars)
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 730, in apply_variable_ufunc
    broadcast_compat_data(arg, broadcast_dims, core_dims)
    ~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 675, in broadcast_compat_data
    order = tuple(old_dims.index(d) for d in reordered_dims)
  File "/home/osvaldo/anaconda3/envs/arviz/lib/python3.13/site-packages/xarray/computation/apply_ufunc.py", line 675, in <genexpr>
    order = tuple(old_dims.index(d) for d in reordered_dims)
                  ~~~~~~~~~~~~~~^^^
ValueError: tuple.index(x): x not in tuple
```
This limitation doesn't come from the fact that interpreting the "school" dimension as "chain" makes no sense, but from the fact that when using `ess` on multiple variables (i.e. on a Dataset), all dimensions in `sample_dims` must be present in all variables. Consequently, the following cell is technically valid, even if it still makes no sense conceptually:
```python
az.ess(dt, var_names=["theta", "theta_t"], sample_dims=["school", "draw"])
```
```
<xarray.DatasetView> Size: 96B
Dimensions:  (chain: 4)
Coordinates:
  * chain    (chain) int64 32B 0 1 2 3
Data variables:
    theta    (chain) float64 32B 1.129e+03 408.2 329.2 580.9
    theta_t  (chain) float64 32B 499.2 339.0 430.1 1.052e+03
```
When we restrict the target variables to only "theta" and "theta_t", we make it so all variables have both the "school" and "draw" dimensions.

All computations that require all input variables to have the full set of dimensions we want to operate on use `sample_dims`. On ArviZ's side this includes `ess`, `rhat` or `mcse`. Xarray has only one example of this: `to_stacked_array()`.
On the other hand, `hdi` uses `dim`. This means that both values we attempted for `sample_dims` with `ess` are valid without caveats:
dt.azstats.hdi(dim=["chain", "draw"])
<xarray.DatasetView> Size: 840B Dimensions: (ci_bound: 2, school: 8) Coordinates: * ci_bound (ci_bound) <U5 40B 'lower' 'upper' * school (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon' Data variables: mu (ci_bound) float64 16B -1.119 9.318 theta_t (school, ci_bound) float64 128B -1.199 2.192 ... -1.451 1.739 tau (ci_bound) float64 16B 0.004998 7.744 theta (school, ci_bound) float64 128B -2.087 14.15 ... -2.658 13.1
Here we have reduced both the "chain" and "draw" dimensions, like we did with `ess`. The only difference is that `hdi` also adds a "ci_bound" dimension, so instead of ending up with scalars and variables with a "school" dimension only, we end up with variables that have either "ci_bound" or ("ci_bound", "school") dimensionality.
Let’s continue with the other example:
dt.azstats.hdi(dim=["school", "draw"])
<xarray.DatasetView> Size: 328B Dimensions: (chain: 4, ci_bound: 2) Coordinates: * chain (chain) int64 32B 0 1 2 3 * ci_bound (ci_bound) <U5 40B 'lower' 'upper' Data variables: mu (chain, ci_bound) float64 64B -1.611 8.839 ... -0.9635 9.697 theta_t (chain, ci_bound) float64 64B -1.438 1.679 -1.493 ... -1.405 1.669 tau (chain, ci_bound) float64 64B 0.0111 7.34 0.02253 ... 0.1617 8.106 theta (chain, ci_bound) float64 64B -2.477 12.87 -2.538 ... -3.462 12.54
We are now reducing the subset of `dim` present in each variable. That means that `mu` and `tau` only have the "draw" dimension reduced, whereas `theta` and `theta_t` have both "draw" and "school" reduced. Consequently, all variables end up with ("chain", "ci_bound") dimensions.
All computations where operating over different subsets of the provided dimensions makes sense use `dim`. On ArviZ's side this includes functions like `hdi`, `eti` or `kde`. Most xarray functions fall in this category too; some examples are `mean()`, `quantile()`, `std()` or `cumsum()`.
### Accessors on xarray objects
We are also taking advantage of the fact that xarray allows third-party libraries to register accessors on its objects. This means that after importing `arviz_stats` (or a library that imports it, like `arviz.preview`), `DataArray`s, `Dataset`s and `DataTree`s get a new attribute, `azstats`. This attribute is called an accessor, and it exposes ArviZ functions that act on the object from which the accessor is used.
We plan to have most functions available both as top level functions and as accessor methods, to help with discoverability of ArviZ functions. But not all functions can be implemented as accessors on all objects: mainly, functions that need multiple groups can be available on the `DataTree` accessor, but not on the `Dataset` or `DataArray` ones. Moreover, at the time of writing, some functions are only available as one of the two options, but they should be extended soon.
We have already used the `azstats` accessor to compute the HDI; now we can check that using `ess` through the accessor gives the same result as the top level function:
```python
dt.azstats.ess()
```
```
<xarray.DatasetView> Size: 656B
Dimensions:  (school: 8)
Coordinates:
  * school   (school) <U16 512B 'Choate' 'Deerfield' ... 'Mt. Hermon'
Data variables:
    mu       float64 8B 1.65e+03
    theta_t  (school) float64 64B 2.058e+03 2.51e+03 ... 2.455e+03 2.757e+03
    tau      float64 8B 1.115e+03
    theta    (school) float64 64B 1.942e+03 2.199e+03 ... 2.079e+03 2.106e+03
```
### Computational backends
We have also modified a bit how computations accelerated by optional dependencies are handled. There are no longer dedicated "flag classes" like we had for Numba and Dask. Instead, the low level stats functions are implemented in classes, so we can subclass them and reimplement only the bottleneck computations (with the rest being inherited from the base class).
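The pattern looks roughly like this; a self-contained sketch with hypothetical class and method names, not the actual `arviz-stats` layout:

```python
import numpy as np

class BaseModule:
    """Hypothetical reference implementations in plain NumPy."""

    def histogram(self, values, bins=10):
        return np.histogram(values, bins=bins)

    def mean(self, values):
        # inherited as-is by every subclass that doesn't override it
        return np.mean(values)

class AcceleratedModule(BaseModule):
    """Hypothetical accelerated module: only the bottleneck is overridden."""

    def histogram(self, values, bins=10):
        # a jitted kernel would go here; we call NumPy so the sketch runs
        return np.histogram(values, bins=bins)

rng = np.random.default_rng(0)
x = rng.normal(size=1_000)
counts, edges = AcceleratedModule().histogram(x, bins=20)
print(counts.sum())  # 1000; .mean() comes from BaseModule via inheritance
```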
The default computational backend is controlled by `rcParams["stats.module"]`, which can be "base", "numba" or a user-defined custom computational module.
```python
dt = az.load_arviz_data("radon")

az.rcParams["stats.module"] = "base"
%timeit dt.azstats.histogram(dim="draw")
```
```
172 ms ± 10.6 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
```
```python
az.rcParams["stats.module"] = "numba"
%timeit dt.azstats.histogram(dim="draw")
```
```
108 ms ± 12.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
```
```python
az.rcParams["stats.module"] = "base"
```
The histogram method is one of the re-implemented ones, mostly so it scales better to larger data. However, it should be noted that we haven't really done much profiling or in-depth optimization yet. Please open issues if you notice performance regressions, or open issues/PRs to discuss and implement faster versions of the bottleneck methods.
### Array interface
It is also possible to install `arviz-stats` without xarray or `arviz-base`, in which case only a subset of the functionality is available, through an array-only API. This API has little to no defaults or assumptions baked into it, leaving all choices to the user, who has to be explicit in every call. As the dependencies needed to install this minimal version of `arviz-stats` are only NumPy and SciPy, we hope it will be particularly useful to other developers.
PPL developers can, for example, use `arviz-stats` for MCMC diagnostics without having to add xarray or pandas as dependencies of their library. This ensures they are using tested and up-to-date versions of the diagnostics without having to implement or maintain them as part of the PPL itself.
The array interface is covered in detail at the Using arviz_stats array interface page.
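A hedged sketch of what a call might look like (the `array_stats` entry point and the axis keywords are assumptions based on that page):

```python
import numpy as np
from arviz_stats.base import array_stats  # assumed import path

rng = np.random.default_rng(0)
samples = rng.normal(size=(4, 500))  # (chain, draw)

# no rcParams, no named dimensions: every choice is explicit in the call
print(array_stats.ess(samples, chain_axis=0, draw_axis=1, method="bulk"))
```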
## arviz-plots
Out of the 3 libraries, `arviz-plots` is the one with the most changes at all levels: breaking changes, new features and more layers to explore.
### More and better supported backends!
One of the key efforts of the refactor has been simplifying the way we interface with the different supported plotting backends. `arviz-plots` supports more backends: matplotlib, bokeh and plotly are now all supported, with (mostly) feature parity among them. All while having less backend-related code!
This also means that `az.style` is no longer an alias to `matplotlib.style` but its own module with a similar (reduced) API that sets the style for all compatible and installed backends (unless a backend is requested explicitly):
```python
az.style.use("arviz-vibrant")

dt = az.load_arviz_data("centered_eight")
az.plot_rank(dt, var_names=["mu", "tau"], backend="matplotlib");
```
```python
import plotly.io as pio
pio.renderers.default = "notebook"

pc = az.plot_rank(dt, var_names=["mu", "tau"], backend="plotly")
pc.show()
```
At the time of writing, there are three cross-backend themes defined by ArviZ: `arviz-variat`, `arviz-vibrant` and `arviz-cetrino`.
### Plotting function inventory
The following functions have been renamed or restructured:
| ArviZ <1 | ArviZ >=1 |
|---|---|
| `plot_bpv` | `plot_ppc_pit`, `plot_ppc_tstat` |
| `plot_dist_comparison` | `plot_prior_posterior` |
| `plot_ecdf` | `plot_dist`, `plot_ecdf_pit` |
| `plot_ess` | `plot_ess`, `plot_ess_evolution` |
| `plot_forest` | `plot_forest`, `plot_ridge` |
| `plot_ppc` | `plot_ppc_dist` |
| `plot_posterior`, `plot_density` | `plot_dist` |
| `plot_trace` | `plot_trace_dist`, `plot_trace_rank` |
Others have had their code rewritten and their arguments updated to some extent, but kept the same name:
- `plot_autocorr`
- `plot_bf`
- `plot_compare`
- `plot_energy`
- `plot_lm`
- `plot_loo_pit`
- `plot_mcse`
- `plot_pair`
- `plot_parallel`
- `plot_rank`
The following functions have been added:

- `plot_ppc_interval()`
Some functions have been removed and we don't plan to add them back:

- `plot_dist` (notice we have a new `plot_dist`, but it is a different function)
- `plot_kde` (this is now part of `plot_dist`)
- `plot_violin`
And there are also functions we plan to add but that aren't available yet:

- `plot_elpd`
- `plot_khat`
- `plot_ppc_censored`
- `plot_ppc_residuals`
- `plot_ts`
Note

For now, the documentation for arviz-plots defaults to `latest`, which is built from GitHub with each commit. If you see some of the functions in the last block already in the example gallery, you should be able to try them, but only if you install the development version! See Installation.
You can see all of them at the arviz-plots gallery.
### What to expect from the new plotting functions
There are two main differences with respect to the plotting functions in legacy ArviZ:

1. The way of forwarding arguments to the plotting backends.
2. The return type is now `PlotCollection`, one of the key features of `arviz-plots`. A quick overview in the context of `plot_xyz` is given here, but it later has a section of its own.
Other than that, some arguments have been renamed or have gotten different defaults, but nothing major. Note, however, that we have incorporated elements from the grammar of graphics into `arviz-plots`, and now that we'll cover the internals of `plot_xyz` in passing, we'll use some of its terms. If you have never heard about the grammar of graphics, we recommend you take a look at Overview before continuing.
### kwarg forwarding
Most `plot_xyz` functions now have a `visuals` and a `stats` argument. These arguments are dictionaries whose keys define where their values are forwarded to. The values are also dictionaries, representing keyword arguments that will be passed downstream via `**kwargs`. This allows you to send arbitrary keyword arguments to all the different visual elements or statistical computations that are part of a plot, without bloating the call signature with endless `xyz_kwargs` arguments like in legacy ArviZ.
These same arguments also allow indicating that a visual element should not be added to the plot, or providing pre-computed statistical summaries for faster re-rendering of plots (at the time of writing, pre-computed inputs only work in `plot_forest`, but this should be extended soon).
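As an illustrative sketch (the element names "credible_interval" and "title" are assumptions here; each function documents its own visual elements):

```python
dt = az.load_arviz_data("centered_eight")

az.plot_dist(
    dt,
    var_names=["mu"],
    # keys name visual elements of the plot; values are kwargs forwarded
    # downstream, or False to skip drawing that element altogether
    visuals={
        "credible_interval": False,
        "title": {"color": "gray"},
    },
);
```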
In addition, the call signature of the new plotting functions is `plot_xyz(..., **pc_kwargs)`, with these `pc_kwargs` being forwarded to the initialization of `PlotCollection`. This argument allows controlling the layout of the figure, as well as any aesthetic mappings that might be used by the plotting function. For a complete notebook introduction on this, see Introduction to batteries-included plots.
### New return type: `PlotCollection`
All `plot_xyz` functions now return a "plotting manager class". At the time of writing this means either `PlotCollection` (the vast majority of plots) or `PlotMatrix` (for the upcoming `plot_pair`, for example).
These classes are the ones that handle faceting and aesthetic mappings, and they allow the `plot_xyz` functions to focus on the visuals rather than on the plot layout or encodings. See Using PlotCollection objects for more details on how to work with existing `PlotCollection` instances.
### Plotting manager classes
As we have just mentioned, `plot_xyz` functions use these plotting manager classes and then return them as their output. In addition, we hope users will use these classes directly, to help them write custom plotting functions more easily and with more flexibility.

By using these classes, users should be able to focus on writing smaller functions that take care of a "unit of plotting". You can then use their `.map` methods to apply these plotting functions as many times as needed, given the faceting and aesthetic mappings defined by the user. Different layouts and different mappings will generally not require changes to these plotting functions, only to the arguments that define the aesthetic mappings and the faceting strategy. Take a look at Create your own figure with PlotCollection if that sounds interesting!
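To give a flavour, here is a heavily hedged sketch of the pattern (the `wrap`/`map` argument names and the signature of the mapped function are assumptions based on the linked guide):

```python
import numpy as np

def unit_hist(da, target, **kwargs):
    # hypothetical "unit of plotting": one histogram on one facet;
    # with the matplotlib backend, `target` is the Axes of that facet
    target.hist(np.ravel(da.values), bins=30, **kwargs)

# facet the posterior with one subplot per variable (assumed keyword names)
pc = az.PlotCollection.wrap(
    dt.posterior.dataset,
    cols=["__variable__"],
    backend="matplotlib",
)
pc.map(unit_hist)
```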
### Other arviz-plots features
There are also helper functions to compose or extend existing plotting functions. For example, we can create a new plot with a layout similar to that of `plot_trace_dist` or `plot_rank_dist`, but with custom diagnostics in each column: distribution, rank and ESS evolution:
```python
az.combine_plots(
    dt,
    [
        (az.plot_dist, {"kind": "ecdf"}),
        (az.plot_rank, {}),
        (az.plot_ess_evolution, {}),
    ],
    var_names=["theta", "mu", "tau"],
    coords={"school": ["Hotchkiss", "St. Paul's"]},
)
```
```
<arviz_plots.plot_collection.PlotCollection at 0x73bab414efd0>
```
## Other nice features
### Citation helper
We have also added a helper to cite ArviZ in your publications, as well as the methods implemented in it. You can get the citation in BibTeX format through `citation()`:
### Extended documentation
One recurring piece of feedback we have received is that the documentation was OK for people already familiar with Bayesian statistics and probabilistic programming, but not so much for newcomers. Thus, we have added more introductory material and examples to the documentation, including a separate resource that shows how to use ArviZ "in context"; see EABM. We have also attempted to make the documentation easier to navigate and understand for everyone.