When is the input file needed?

The input files to the ParaMonte library sampler routines are optional. If no input file is provided, all simulation specifications are set to appropriate default values or to the ParaMonte samplers’ best guess for the proper values. However, if the user wants to fine-tune the specifics of a sampling simulation, then the input file may or may not be the sole method of setting up the simulation specifications, depending on the programming language interface to the ParaMonte library:

  • The C/C++ programming language: Providing the path to an input external file is the sole method of simulation setup by the user.
  • The Fortran programming language: Providing the path to an input external file is the preferred method of simulation setup by the user. However, the input file is not the sole method of specifying the simulation setup. See the ParaMonte Fortran documentation of the generic interface getErrSampling for more information about the alternative, less flexible method.
  • The MATLAB programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte MATLAB library already has a much more flexible dynamic method of simulation specifications setup from within the MATLAB programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.
  • The Python programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte Python library already has a much more flexible dynamic method of simulation specifications setup from within the Python programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.

The structure of the optional input file

Here is a summary of useful guidelines and rules for writing ParaMonte sampler input files.

Organization

  • The input file structure for all ParaMonte samplers is the same in all programming languages.
  • The simulation specifications for each ParaMonte sampler (e.g., the ParaDRAM MCMC sampler) must be grouped under the ParaMonte routine’s name. We call each group a namelist.
  • Each group, corresponding to one ParaMonte sampler routine, begins with the group name preceded by an & and ends with a forward slash / (see below for an example input file).
  • Multiple namelist groups can coexist within a single input file. Only the ones relevant to the simulation of interest will be read and used. The rest will be ignored.
  • Comments are allowed anywhere inside the input file.
  • Comments must begin with an exclamation mark (!).
  • Comments can appear on an empty line or after a value assignment.
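
As a rough illustration of the organization rules above, the following Python sketch lists the namelist groups that an input file opens. It is not part of the ParaMonte library: the helper name list_group_names and the group name &otherSampler are hypothetical, and the scan is deliberately naive (it only looks for lines whose first non-blank character is &).

```python
def list_group_names(text):
    """Return the lowercase names of all namelist groups opened in `text`."""
    names = []
    for line in text.splitlines():
        stripped = line.strip()
        # A group opens with `&name`. Comment lines start with `!`,
        # so they never match this test and are skipped.
        if stripped.startswith("&"):
            parts = stripped[1:].split()
            if parts:
                names.append(parts[0].lower())
    return names


# `otherSampler` is a made-up group name used only for illustration.
example = """
! settings for two different routines in one file
&ParaDRAM
    outputChainSize = 10000   ! trailing comment
/
&otherSampler
    outputFileName = "./out/run"
/
"""

print(list_group_names(example))  # prints ['paradram', 'othersampler']
```

A sampler that reads this file would pick up only its own group and ignore the other one.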

Variables

  • All variable assignments are optional and can be dropped or commented out. In such cases, the ParaMonte routines will assign appropriate default values to the missing variables in the input file.
  • Variables within a namelist group can be separated from each other by comma or whitespace characters.
  • The order in which the variables appear within a namelist group is irrelevant.
  • Variables can be defined multiple times, but only the last definition will be considered as input.
  • All variable names are case-insensitive. However, for clarity, the ParaMonte library follows the camelCase code-writing practice.
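
The rules "last definition wins" and "names are case-insensitive" can be mimicked with a small Python sketch. The helper read_scalar_assignments is hypothetical and handles only one scalar assignment per line; it also assumes that no exclamation mark appears inside a string value.

```python
def read_scalar_assignments(lines):
    """Collect `name = value` pairs; later definitions override earlier ones."""
    specs = {}
    for line in lines:
        # Drop everything after the comment mark (naive: no `!` inside strings).
        code = line.split("!", 1)[0].strip()
        if "=" not in code:
            continue
        name, value = code.split("=", 1)
        # Lowercasing the name makes the lookup case-insensitive.
        specs[name.strip().lower()] = value.strip()
    return specs


lines = [
    'outputChainSize = 10000',
    '!outputColumnWidth = 25       ! commented out, so it is skipped',
    'OUTPUTCHAINSIZE = 20000       ! overrides the earlier definition',
    'proposal = "normal"',
]

print(read_scalar_assignments(lines))
# prints {'outputchainsize': '20000', 'proposal': '"normal"'}
```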

Values

Like variables, values within a namelist group can be separated by either a comma or whitespace characters.

  • Strings
    • String values must be enclosed in single or double quotation marks: ' ' or " ". String values can be continued on multiple lines; however, any additional whitespace characters introduced by the line continuation will NOT be ignored and become part of the string.
  • Logical (Boolean)
    • Logical values are all case-insensitive and can be either .true., true, or t for a TRUE value or .false., false, or f for a FALSE value.
  • Real (Float)
    • Real values are, by default, double-precision in MATLAB and Python programming languages. But they can be single, double, or quad precision within the C and C++ programming languages and any precision supported by the processor within the Fortran programming language.
    • Double precision holds roughly 15 to 16 significant decimal digits and can represent numbers as large as $\approx 1.8 \times 10^{308}$ and as small in magnitude as $\approx 2.2 \times 10^{-308}$ (at full precision).
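
To make the value rules above concrete, here is a minimal Python sketch that interprets logical and real tokens the way this section describes them. The helpers parse_logical and parse_real are hypothetical names, not ParaMonte functions; parse_real simply rewrites the Fortran d exponent letter (as in 8.d1) to e before calling Python's float. The last line prints the double-precision limits reported by the Python interpreter.

```python
import sys

# Tokens listed in this section, compared case-insensitively.
TRUE_TOKENS = {".true.", "true", "t"}
FALSE_TOKENS = {".false.", "false", "f"}


def parse_logical(token):
    """Map a logical token from the input file to a Python bool."""
    key = token.strip().lower()
    if key in TRUE_TOKENS:
        return True
    if key in FALSE_TOKENS:
        return False
    raise ValueError("not a recognized logical value: " + repr(token))


def parse_real(token):
    """Convert a real token, accepting the Fortran `d` exponent letter."""
    return float(token.strip().lower().replace("d", "e"))


print(parse_logical(".TRUE."), parse_logical("F"))   # prints True False
print(parse_real("8.d1"), parse_real("5.6e7"))       # prints 80.0 56000000.0
# Guaranteed decimal digits, largest value, smallest normal positive value.
print(sys.float_info.dig, sys.float_info.max, sys.float_info.min)
```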

Vectors

  • All vectors and arrays that are specified inside the input file begin with index 1. This follows the convention of the majority of science-oriented programming languages and libraries including but not limited to Fortran, Julia, Mathematica, MATLAB, R, and LAPACK.
  • Vectors (and arrays) of strings, integers, or real numbers can be specified as comma-separated or space-separated values. For example,
    ! real-valued vector of length 5, specified as the starting point of an MCMC simulation
    proposalStart = 1.0, -100 3, 5.6e7 8.d1
    

    You may have noticed above that some values are comma-separated while others are space-separated, which is a valid syntax.

  • Vector (and array) values may be specified separately on multiple lines and in any order, like the following,
    ! a vector of strings specifying the names of the variables that are going to be sampled in the simulation, 
    ! each corresponds to one dimension of the objective function.
    domainAxisName(2) = "secondVariable"
    domainAxisName(1) = "FirstVariable"
    domainAxisName(3:4) = "ThirdVariable", "FourthVariable"
    
  • Vector values may be selectively provided in the input file, and some values may be missing. For example,
    ! a vector of length 4 specifying the random walker's step sizes along different dimensions of the objective function in a ParaDRAM simulation.
    proposalStd(3) = 3.0
    proposalStd(1:2) = 1.0, 2.0
    

    or,

    proposalStd = 1.0, 2.0, 3.0     ! This is identical to the above representation
    

    Notice that the missing fourth element will not be read from the input file. Instead, the ParaMonte routines will assign it a default value.

  • Similar values in a vector that appear sequentially can be represented in abbreviated format via a repetition pattern rule involving *. For example,
    ! vector of length 4, specifying the lower limits of the domain of the objective function along each dimension
    domainCubeLimitLower = -3.d100, 2*-20.0, -100
    

    is equivalent to,

    domainCubeLimitLower = -3.d100, -20.0, -20.0, -100
    

    or,

    domainCubeLimitLower = 3*-3.d100
    

    is equivalent to,

    domainCubeLimitLower = -3.d100, -3.d100, -3.d100,    ! notice the fourth value is missing
    

    In the latter example, only the first three values were provided. In such cases, the missing elements will be assigned appropriate default values.
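
The separator and repetition rules described in this section can be sketched in a few lines of Python. The helper expand_values is hypothetical and covers only real-valued lists: it splits on commas and whitespace, expands count*value patterns, and accepts the Fortran d exponent letter. It does not handle strings or empty (null) values between consecutive commas.

```python
import re


def expand_values(text):
    """Expand a list of real values such as '-3.d100, 2*-20.0, -100'."""
    values = []
    # Values may be separated by commas, whitespace, or a mix of both.
    for token in re.split(r"[,\s]+", text.strip()):
        if not token:
            continue
        count, star, item = token.partition("*")
        if star:
            # `count*item` stands for `item` repeated `count` times.
            repeat, literal = int(count), item
        else:
            repeat, literal = 1, token
        number = float(literal.lower().replace("d", "e"))
        values.extend([number] * repeat)
    return values


print(expand_values("-3.d100, 2*-20.0, -100"))
# prints [-3e+100, -20.0, -20.0, -100.0]
print(expand_values("1.0, -100 3, 5.6e7 8.d1"))
# prints [1.0, -100.0, 3.0, 56000000.0, 80.0]
print(len(expand_values("3*-3.d100")))  # prints 3
```

When fewer values are given than the vector has elements, as in the last call, the remaining elements are left for the sampler to fill with defaults.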

Arrays

  • The rules for representing arrays are identical to those for the vectors described in the previous section. For example, the following array value assignments are all equivalent,
    ! a symmetric matrix of size 4-by-4 of 64-bit real numbers representing the initial covariance matrix of the ParaDRAM sampler
    proposalCov = 1.0, 0.0, 0.0, 0.0,
                  0.0, 1.0, 0.0, 0.0,
                  0.0, 0.0, 1.0, 0.0,
                  0.0, 0.0, 0.0, 1.0,
    

    or,

    proposalCov(:,1) = 1.0, 0.0, 0.0, 0.0,
    proposalCov(:,2) = 0.0, 1.0, 0.0, 0.0,
    proposalCov(:,3) = 0.0, 0.0, 1.0, 0.0,
    proposalCov(:,4) = 0.0, 0.0, 0.0, 1.0,
    

    or,

    proposalCov(1:4,1:4) = 1.0, 4*0.0, 1.0, 4*0.0, 1.0, 4*0.0, 1.0
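
The equivalence of the three forms above relies on the values filling the matrix column by column, which is the Fortran array element order (compare the proposalCov(:,1) form, where each statement sets one column). The Python sketch below shows that mapping; reshape_column_major is a hypothetical helper, and the flat list mirrors the pattern 1.0, 4*0.0, 1.0, 4*0.0, 1.0, 4*0.0, 1.0.

```python
def reshape_column_major(flat, nrow, ncol):
    """Arrange a flat list into rows, filling the matrix column by column."""
    if len(flat) != nrow * ncol:
        raise ValueError("expected %d values, got %d" % (nrow * ncol, len(flat)))
    # Element (i, j) of the matrix is the value at flat position j*nrow + i.
    return [[flat[j * nrow + i] for j in range(ncol)] for i in range(nrow)]


# 1.0 followed by three repetitions of (4*0.0, 1.0): 16 values in total.
flat = [1.0] + ([0.0] * 4 + [1.0]) * 3
for row in reshape_column_major(flat, 4, 4):
    print(row)

# A non-symmetric case makes the column-by-column order visible.
print(reshape_column_major([1, 2, 3, 4, 5, 6], 2, 3))  # prints [[1, 3, 5], [2, 4, 6]]
```

For a symmetric matrix such as a covariance matrix, the fill order does not change the result, which is why the row-like layout in the first form is also valid.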
    

Example contents of a ParaDRAM simulation input file

The following box shows an example input specifications file for a ParaDRAM simulation of an objective function defined on a 4-dimensional domain. Notice the group name &paradram at the beginning and the / at the end, as well as the ample usage of the comment symbol wherever the user deems it appropriate:

!   DESCRIPTION:
!
!       The external input file for sampling the 4-dimensional Multivariate Normal distribution function as implemented in the accompanying source files.
!       This file is common between all supported programming language environments.
!
!   NOTE:
!
!       All simulation specifications (including this whole file) are optional and can be nearly safely commented out.
!       However, if domain boundaries are finite, set them explicitly.
!
!   USAGE:
!
!       --  Comments must begin with an exclamation mark `!`.
!       --  Comments can appear anywhere on an empty line or after a variable assignment
!           (but not in the middle of a variable assignment whether in single or multiple lines).
!       --  All variable assignments are optional and can be commented out. In such cases, appropriate default values will be assigned.
!       --  Use ParaDRAM namelist (group) name to group a set of ParaDRAM simulation specification variables.
!       --  The order of the input variables in the namelist groups is irrelevant and unimportant.
!       --  Variables can be defined multiple times, but only the last definition will be considered as input.
!       --  All variable names are case insensitive. However, for clarity, this software follows the camelCase code-writing practice.
!       --  String values must be enclosed with either single or double quotation marks.
!       --  Logical values are case-insensitive and can be either .true., true, or t for a TRUE value, and .false., false, or f for a FALSE value.
!       --  All vectors and arrays in the input file begin with index 1. This is following the convention of
!           the majority of science-oriented programming languages: Fortran, Julia, Mathematica, MATLAB, and R.
!
!      For comprehensive guidelines on the input file organization and rules, visit:
!
!           https://www.cdslab.org/paramonte/generic/latest/usage/sampling/paradram/input/
!
!      To see detailed descriptions of each of the variables, visit:
!
!           https://www.cdslab.org/paramonte/generic/latest/usage/sampling/paradram/specifications/
!
&paradram

    ! Base specifications.

    description                         = "
This\n
    is a\n
        multi-line\n
            description.\\n"                                    ! strings must be enclosed with "" or '' and can be continued on multiple lines.
                                                                ! No comments within strings are allowed.
    domain                              = "cube"
    domainAxisName                      = "variable1"
                                          "variable2"           ! values can appear in multiple lines.
    domainBallAvg                       = 0 0 0 0               ! values can be separated with blanks or commas.
    domainBallCor                       = 1 0 0 0
                                          0 1 0 0
                                          0 0 1 0
                                          0 0 0 1
    domainBallCov                       = 1 0 0 0
                                          0 1 0 0
                                          0 0 1 0
                                          0 0 0 1
    domainBallStd                       = 1 1 1 1
    domainCubeLimitLower                = 4*-1.e10              ! repetition pattern rules apply here. 4 dimensions => 4-element vector of values.
    domainCubeLimitUpper(1)             = +1.e10                ! Elements of vectors can be set individually.
    domainCubeLimitUpper(2:4)           = 3*+1.e10              ! Elements of vectors can be set individually.
    domainErrCount                      = 100
    domainErrCountMax                   = 1000
    inputFileHasPriority                = FALSE                 ! This is relevant only to simulations within the Fortran programming language.
    outputChainFileFormat               = "compact"
   !outputColumnWidth                   = 25                    ! This is an example of a variable that is commented out.
                                                                ! Therefore, its value will not be read by the sampler routine.
                                                                ! To pass it to the routine, simply remove the `!` mark at the beginning of the line.
    outputFileName                      = "./out/mvn"           ! A forward-slash character at the end of the string value would indicate the specified path
                                                                ! is to be interpreted as the name of the folder to contain the simulation output files.
                                                                ! The base name for the simulation output files will be generated from the current date and time.
                                                                ! Otherwise, the specified base name at the end of the string will be used in naming the simulation output files.
    outputPrecision                     = 17
    outputReportPeriod                  = 1000
    outputRestartFileFormat             = "ascii"
    outputSampleSize                    = -1
    outputSeparator                     = ","
    outputSplashMode                    = "normal"              ! or quiet or silent.
    outputStatus                        = "retry"               ! or extend.
    parallelism                         = "multi chain"         ! "singleChain" would also work. Similarly, "multichain", "multi chain", or "multiChain".
    parallelismMpiFinalizeEnabled       = false                 ! TRUE, true, .true., .t., and t would all be valid logical values representing truth.
    parallelismNumThread                = 3                     ! number of threads to use in shared-memory parallelism.
   !randomSeed                          = 2136275,
   !targetAcceptanceRate                = 0.23e0

    ! MCMC specifications.

    outputChainSize                     = 10000
    outputSampleRefinementCount         = 10
    outputSampleRefinementMethod        = "BatchMeans"
    proposal                            = "normal"              ! or "uniform" as you wish.
    proposalCor(:, 1)                   = 1 0 0 0               ! first matrix column.
    proposalCor(:, 2:4)                 = 0 1 0 0
                                          0 0 1 0
                                          0 0 0 1               ! other matrix columns.
    proposalCov                         = 1 0 0 0
                                          0 1 0 0
                                          0 0 1 0
                                          0 0 0 1               ! or specify all matrix elements in one statement.
    proposalScale                       = "2*0.5*Gelman"        ! The asterisk here means multiplication since it is enclosed within quotation marks.
   !proposalStart                       = 4*1.e0                ! four values of 1.e0 are specified here by the repetition pattern symbol *
    proposalStartDomainCubeLimitLower   = 4*-10.e0              ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    proposalStartDomainCubeLimitUpper   = 4*+10.e0              ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    proposalStartRandomized             = false
    proposalStd                         = 4*1.0                 ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.

    ! DRAM specifications.

    burninAdaptationMeasure             = 1.
    proposalAdaptationCount             = 10000000
    proposalAdaptationCountGreedy       = 0
    proposalAdaptationPeriod            = 35
    proposalDelayedRejectionCount       = 5
    proposalDelayedRejectionScale       = 4*1., 2.              ! The first four elements are 1, followed by 2.

/
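
An input file like the one above can also be generated programmatically. The following Python sketch is one possible way to do so and is not part of the ParaMonte library: format_value and format_namelist are hypothetical helpers, the specification names are taken from the example above, and string values containing double quotation marks are not handled.

```python
def format_value(value):
    """Render a Python value in the syntax described on this page."""
    if isinstance(value, bool):          # check bool before numbers
        return "true" if value else "false"
    if isinstance(value, str):
        return '"' + value + '"'
    if isinstance(value, (list, tuple)):
        return ", ".join(format_value(item) for item in value)
    return repr(value)


def format_namelist(group, specs):
    """Return the text of one namelist group: `&group`, assignments, `/`."""
    lines = ["&" + group]
    for name, value in specs.items():
        lines.append("    {} = {}".format(name, format_value(value)))
    lines.append("/")
    return "\n".join(lines) + "\n"


specs = {
    "outputFileName": "./out/mvn",
    "outputChainSize": 10000,
    "proposalStartRandomized": False,
    "proposalStd": [1.0] * 4,
}
text = format_namelist("paradram", specs)
print(text)
```

The resulting text can be written to a file and its path passed to the sampler in whichever way the chosen language interface expects.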

Why is an input file the preferred method of simulation setup?

Specifying the properties of a ParaMonte simulation via an external input file is particularly beneficial when the ParaMonte library routines are called from within compiled languages (e.g., C/C++/Fortran). The reasons may already be clear to advanced programmers:

  • Specifying the simulation properties in an external input file gives your simulation the highest level of flexibility and portability by avoiding the hardcoding of simulation specifications into your compiled code. Imagine you specify a simulation property inside your code, compile and run it, and then realize that you want to change that property value to something else. Without an external input file, you would have to recompile your code for every property change.
  • Also, the same specification input file can be used to set up the same simulation settings from any programming language without a single line of change in the input file. The contents of the input files are programming-language-agnostic.
  • All variable names are case-insensitive across all programming languages when specified from the input file.
  • The order by which the simulation specification variables appear in the input file is irrelevant.
  • Multiple simulation namelist groups, each corresponding to an independent call to a different ParaMonte routine, can be placed within a single input file, resulting in a cleaner, more portable organization of the input data for the given simulation problem.


If you have any questions about the topics discussed on this page, feel free to raise an issue on the GitHub page of the library or reach out to the ParaMonte library authors.