Making 2D histogram plots with the ParaMonte visualization tools

NOTE

If you are viewing an HTML version of this MATLAB live script on the web, you can download the corresponding MATLAB live script file, visualization_histogram2.mlx, from,

https://github.com/cdslaborg/paramontex/blob/fbeca6745684c798ff28c1bf57cfae0c190db478/MATLAB/mlx

Once you download the file, open it in MATLAB to view and interact with its contents, which are the same as what you see on this page.

First, let's clean up the MATLAB environment and make sure the path to the ParaMonte library is in MATLAB's path list.

clc;
clear all;
close all;
format compact; format long;
%%%%%%%%%%%% IMPORTANT %%%%%%%%%%%%%
% Set the path to the ParaMonte library:
% Change the following path to the ParaMonte library root directory, 
% otherwise, make sure the path to the ParaMonte library is already added
% to MATLAB's path list.
pmlibRootDir = './';
addpath(genpath(pmlibRootDir),"-begin");
%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
% change MATLAB's working directory to the folder containing this script
% if MATLAB Live Scripts did not create a temporary folder, we would not
% have all of these problems!
try
    setwdFilePath = websave("setwd.m","https://github.com/cdslaborg/paramontex/blob/fbeca6745684c798ff28c1bf57cfae0c190db478/MATLAB/mlx/setwd.m");
    run(setwdFilePath); % This is a MATLAB script that you can download from the same GitHub location given above.
catch % alas, we will have to run the simulations in MATLAB Live Script's temporary folder
    % Change the working directory to the folder containing this script.
    cd(fileparts(mfilename('fullpath')));
end

ParaMonte's default visualization tools

The ParaMonte library ships with several visualization tools that automate much of the MATLAB coding required to visualize the output of the simulations performed by the ParaMonte library samplers.

By replacing the input dataFrame of these tools and following the conventions of the ParaMonte library, one can also use these visualization tools for any dataset, including data that was not generated by the ParaMonte library.

Consider the following Markov chain file on the web, written in compact format by the ParaDRAM sampler of the ParaMonte library while sampling a MultiVariate Normal distribution. Since this is a chain file, as indicated by its suffix "_chain.txt", we will read it via the ParaDRAM sampler's readChain() method.

pm = paramonte();

pmpd = pm.ParaDRAM();

url = "https://github.com/cdslaborg/paramontex/blob/fbeca6745684c798ff28c1bf57cfae0c190db478/MATLAB/mlx/sampling_multivariate_normal_distribution_via_paradram/out/mvn_serial_process_1_chain.txt";

pmpd.readChain(url); % read the chain file from the web

ParaDRAM - WARNING: The ParaDRAM input simulation specification `pmpd.spec.outputDelimiter` is not set.
ParaDRAM - WARNING: This information is essential for successful reading of the requested chain file(s).
ParaDRAM - WARNING: Proceeding with the default assumption of comma-delimited chain file contents...
ParaDRAM - NOTE: 1 files detected matching the pattern:
ParaDRAM - NOTE: "https://github.com/cdslaborg/paramontex/blob/fbeca6745684c798ff28c1bf57cfae0c190db478/MATLAB/mlx/sampling_multivariate_normal_distribution_via_paradram/out/mvn_serial_process_1_chain.txt"
ParaDRAM - NOTE: processing file: "D:\Dropbox\Projects\20180101_ParaMonte\paramontex\MATLAB\mlx\visualization\temp_20201004_040633_051.txt"
ParaDRAM - NOTE: reading the file contents...
ParaDRAM - NOTE: done in 0.497990 seconds.
ParaDRAM - NOTE: ndim = 4, count = 50000
ParaDRAM - NOTE: computing the sample correlation matrix...
ParaDRAM - NOTE: creating the heatmap plot object from scratch...
ParaDRAM - NOTE: done in 0.765700 seconds.
ParaDRAM - NOTE: computing the sample covariance matrix...
ParaDRAM - NOTE: creating the heatmap plot object from scratch...
ParaDRAM - NOTE: done in 0.155680 seconds.
ParaDRAM - NOTE: computing the sample autocorrelation...
ParaDRAM - NOTE: creating the line plot object from scratch...
ParaDRAM - NOTE: creating the scatter plot object from scratch...
ParaDRAM - NOTE: creating the lineScatter plot object from scratch...
ParaDRAM - NOTE: done in 0.871320 seconds.
ParaDRAM - NOTE: adding the graphics tools...
ParaDRAM - NOTE: creating the line plot object from scratch...
ParaDRAM - NOTE: creating the scatter plot object from scratch...
ParaDRAM - NOTE: creating the lineScatter plot object from scratch...
ParaDRAM - NOTE: creating the line3 plot object from scratch...
ParaDRAM - NOTE: creating the scatter3 plot object from scratch...
ParaDRAM - NOTE: creating the lineScatter3 plot object from scratch...
ParaDRAM - NOTE: creating the histogram plot object from scratch...
ParaDRAM - NOTE: creating the histogram2 plot object from scratch...
ParaDRAM - NOTE: creating the histfit plot object from scratch...
ParaDRAM - NOTE: creating the contour plot object from scratch...
ParaDRAM - NOTE: creating the contourf plot object from scratch...
ParaDRAM - NOTE: creating the contour3 plot object from scratch...
ParaDRAM - NOTE: creating the grid plot object from scratch...
ParaDRAM - NOTE: The processed chain files are now stored in the newly-created component "pmpd.chainList" of the
ParaDRAM - NOTE: ParaDRAM object as a cell array. For example, to access the contents of the first (or the only) chain
ParaDRAM - NOTE: file, try:
ParaDRAM - NOTE:
ParaDRAM - NOTE: pmpd.chainList{1}.df
ParaDRAM - NOTE:
ParaDRAM - NOTE: To access the plotting tools, try:
ParaDRAM - NOTE:
ParaDRAM - NOTE: pmpd.chainList{1}.plot.<PRESS TAB TO SEE THE LIST OF PLOTS>
ParaDRAM - NOTE:
ParaDRAM - NOTE: For example,
ParaDRAM - NOTE:
ParaDRAM - NOTE: pmpd.chainList{1}.plot.line.make() % to make 2D line plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.scatter.make() % to make 2D scatter plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.lineScatter.make() % to make 2D line-scatter plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.line3.make() % to make 3D line plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.scatter3.make() % to make 3D scatter plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.lineScatter3.make() % to make 3D line-scatter plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.contour3.make() % to make 3D kernel-density contour plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.contourf.make() % to make 2D kernel-density filled-contour plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.contour.make() % to make 2D kernel-density plots.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.histogram2.make() % to make 2D histograms.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.histogram.make() % to make 1D histograms.
ParaDRAM - NOTE: pmpd.chainList{1}.plot.grid.make() % to make GridPlot
ParaDRAM - NOTE:
ParaDRAM - NOTE: To plot or inspect the variable autocorrelations or the correlation/covariance matrices, try:
ParaDRAM - NOTE:
ParaDRAM - NOTE: pmpd.chainList{1}.stats.<PRESS TAB TO SEE THE LIST OF COMPONENTS>
ParaDRAM - NOTE:
ParaDRAM - NOTE: For more information and examples on the usage, visit:
ParaDRAM - NOTE:
ParaDRAM - NOTE: https://www.cdslab.org/paramonte

This method automatically generates a set of tools that can be used to visualize the contents of the compact chain file. Note that these visualization tools are not unique to this particular method of the ParaDRAM sampler or to other ParaMonte samplers. For the sake of illustration, however, we will create plots using the above dataset, read via the readChain() method of the ParaDRAM sampler.

chain = pmpd.chainList{1};

chain.plot.histogram2.make();

By default, the visualization tools are loaded with a set of predefined settings. For example, ParaMonte visualizations are colored by default (unless multiple variables are to be displayed). These settings, however, can be readily changed. For example, to change the colormap,

chain.plot.histogram2.colormap

ans = struct with fields:
    enabled: 1
     values: "parula"

chain.plot.histogram2.colormap.values = autumn;

chain.plot.histogram2.make()

To draw the 2D histogram, the ParaMonte visualizer utilizes the histogram2() function of MATLAB. One can pass pairs of (key,value) properties to this MATLAB function by defining those keyword properties in the histogram2 component of the plot object. There are a few properties defined already in this structure,

chain.plot.histogram2.histogram2.kws

ans = struct with fields:
        edgeColor: "none"
        faceColor: "flat"
     displayStyle: "bar3"
    showemptybins: "off"

chain.plot.histogram2.colormap.values = flipud(cold());

chain.plot.histogram2.make();
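
For orientation, the default keyword properties listed above correspond to name-value arguments of MATLAB's built-in histogram2() function. The following is a minimal sketch of an equivalent direct call, assuming synthetic data in place of the chain (the variables x, y, and h are illustrative and not part of ParaMonte); it is not a reproduction of how the ParaMonte visualizer is implemented internally.

```matlab
% Synthetic, correlated 2D sample (illustration only; not ParaMonte output).
rng(0);
x = randn(5000, 1);
y = 0.5 * x + randn(5000, 1);

% Name-value pairs mirroring the default fields of the kws structure above.
figure;
h = histogram2(x, y, ...
    "EdgeColor", "none", ...
    "FaceColor", "flat", ...
    "DisplayStyle", "bar3", ...
    "ShowEmptyBins", "off");
colorbar; % the bar colors encode the bin counts
```

Any other name-value property accepted by histogram2() could presumably be added to the kws structure in the same spirit.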

To reset the properties of the plot object to the default settings, try,

chain.plot.histogram2.reset();
ParaDRAM - NOTE: resetting the properties of the histogram2 plot...

To reset the entire plot object including reading the data again from the input dataFrame, try,

chain.plot.histogram2.reset("hard");
ParaDRAM - NOTE: creating the histogram2 plot object from scratch...

Similarly, to change the properties of the colorbar, try,

chain.plot.histogram2.colorbar.kws
ans = struct with fields:
    fontSize: []
chain.plot.histogram2.colorbar.kws.fontSize = 12;

To change properties that do not yet exist in the structure, simply add them to the kws component, for example,

chain.plot.histogram2.colorbar.kws.location = "northoutside";

chain.plot.histogram2.make();

chain.plot.histogram2.colorbar.kws

ans = struct with fields:
    fontSize: 12
    location: "northoutside"

Remember that handles to all objects in the plot are also stored in the currentFig component of the object. Most of the properties of the figure, axes, and plots can also be changed directly via these handles. For example, to change the colorbar label, we could try,

chain.plot.histogram2.currentFig.colorbar.Label.String

ans = 'Density of Points'

chain.plot.histogram2.reset();

ParaDRAM - NOTE: resetting the properties of the histogram2 plot...

chain.plot.histogram2.make();

chain.plot.histogram2.currentFig.axes.ZScale = "log";

chain.plot.histogram2.currentFig.colorbar.Label.FontSize = 12;

chain.plot.histogram2.currentFig.colorbar.Label.Interpreter = "tex"; % set the interpreter for the colorbar

chain.plot.histogram2.currentFig.colorbar.Label.String = "Density of Sampled Points";

Choosing different columns of data to plot

By default, the first two variables of the sampled space are shown in the plot. This can be readily changed to any pair of variables, for example,

chain.df.Properties.VariableNames
ans = 1×11 cell array
    {'ProcessID'}    {'DelayedRejectionStage'}    {'MeanAcceptanceRate'}    {'AdaptationMeasure'}    {'BurninLocation'}    {'SampleWeight'}    {'SampleLogFunc'}    {'SampleVariable1'}    {'SampleVariable2'}    {'SampleVariable3'}    {'SampleVariable4'}

chain.plot.histogram2.make( "xcolumns", "SampleVariable3", "ycolumns", 11);

Note that both column indices and column names can be used to point to a data column in the dataFrame.
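
To see which index corresponds to which name, the column names listed above can be searched directly. The sketch below hard-codes the names from the output shown earlier so that it runs on its own; with the chain loaded, the same list would come from chain.df.Properties.VariableNames. The variables names, indexOfVar3, and nameOfCol11 are illustrative only.

```matlab
% Column names copied from the chain.df.Properties.VariableNames output above.
names = {'ProcessID', 'DelayedRejectionStage', 'MeanAcceptanceRate', ...
         'AdaptationMeasure', 'BurninLocation', 'SampleWeight', ...
         'SampleLogFunc', 'SampleVariable1', 'SampleVariable2', ...
         'SampleVariable3', 'SampleVariable4'};

% Index of a column given its name.
indexOfVar3 = find(strcmp(names, 'SampleVariable3')); % 10

% Name of a column given its index.
nameOfCol11 = names{11}; % 'SampleVariable4'
```

The call above with "xcolumns" set to "SampleVariable3" and "ycolumns" set to 11 therefore plots SampleVariable3 against SampleVariable4.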

Unicolor plot

Turning the colormap off is very simple,

chain.plot.histogram2.colormap.enabled = false; % make monocolor plot

chain.plot.histogram2.make();

Plotting multiple columns of data in a single plot

The columns of data that are plotted are determined by the corresponding column names in xcolumns and ycolumns of the plot object:

If xcolumns is set to empty [], then the count of data will be used as the x values.
If ycolumns is set to empty [], then all data columns will be drawn on the same plot (not recommended).

chain.df.Properties.VariableNames
ans = 1×11 cell array
    {'ProcessID'}    {'DelayedRejectionStage'}    {'MeanAcceptanceRate'}    {'AdaptationMeasure'}    {'BurninLocation'}    {'SampleWeight'}    {'SampleLogFunc'}    {'SampleVariable1'}    {'SampleVariable2'}    {'SampleVariable3'}    {'SampleVariable4'}
chain.plot.histogram2.xcolumns
ans = "SampleVariable3"
chain.plot.histogram2.ycolumns
ans =     11

To plot multiple columns of data in a single plot, simply add the column names to the corresponding component, or more simply,

chain.plot.histogram2.reset("hard");

ParaDRAM - NOTE: creating the histogram2 plot object from scratch...

chain.plot.histogram2.colormap.enabled = false;

chain.plot.histogram2.xcolumns = 8:9;

chain.plot.histogram2.legend.enabled = true;

chain.plot.histogram2.legend.labels = ["variable1 - variable4", "variable2 - variable3"];

chain.plot.histogram2.legend.kws.location = "northwest";

chain.plot.histogram2.make("ycolumns", ["SampleVariable4", 10]) % notice the ability to mix column number with column names, or simply pass column ranges

view([46 54]); % change the 3D view to this angle

or, multiple variables against a single variable,

chain.plot.histogram2.legend.enabled = true;

chain.plot.histogram2.legend.labels = []; % reset the labels to automatic

chain.plot.histogram2.make("xcolumns", "SampleVariable2", "ycolumns", [10,11])

Plotting specific rows of data

Selected rows of data can also be plotted if not all data observations have to be included. For example, we can exclude the burn-in episode as determined by the ParaMonte sampler,

chain.plot.histogram2.reset();

ParaDRAM - NOTE: resetting the properties of the histogram2 plot...

burnin = chain.df.BurninLocation(end); % get the inferred burn-in location at the end of the chain

chain.plot.histogram2.rows = burnin:3:chain.count; % plot every third data row, starting from the burn-in location to the end of the chain.

chain.plot.histogram2.make();
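
The row selection above relies on MATLAB's colon operator, start:step:stop, which generates the row indices from the burn-in location to the end of the chain in steps of the given size. A small sketch with hypothetical numbers (the values 5 and 20 are made up for illustration and stand in for the burn-in location and chain.count):

```matlab
% Hypothetical values standing in for the burn-in location and the chain length.
burnin = 5;
count = 20;

% Every third row index from burnin up to (at most) count.
rows = burnin:3:count; % [5 8 11 14 17 20]

% Number of rows that would be passed to the plot.
nrows = numel(rows); % 6
```

Increasing the step thins the plotted sample further, which can speed up plotting for long chains.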

Exporting figures to external files

To export a figure to an external PNG file, try,

chain.plot.histogram2.exportFig("exportedFigure.png","-m4");

The above command will export the currently active figure to an output file with the relatively high resolution specified by the flag -m4. To make the exported figure smaller, one could specify -m2 instead. In addition, to generate a figure with background transparency, the flag -transparent can be added to the exportFig function call,

% chain.plot.histogram2.exportFig("exportedFigure.png","-m2 -transparent") % uncomment to export the figure with transparency

Final Note:

To see other more sophisticated types of plots that can be automatically made with the ParaMonte visualization tools, visit: https://www.cdslab.org/paramonte/notes/examples/matlab/mlx/