When is the input file needed?

The input files to the ParaMonte library routines are completely optional. If no input file is provided, all simulation specifications are set to appropriate default values or to the ParaMonte routine’s best guess for the proper values. If the user wants to fine-tune the specifics of a ParaMonte simulation, however, the input file may or may not be the sole method of setting up the simulation specifications, depending on the programming language interface to the ParaMonte library:

  • The C/C++ programming language: Providing the path to an input external file is the sole method of simulation setup by the user.
  • The Fortran programming language: Providing the path to an input external file is the preferred method of simulation setup by the user. However, the input file is not the sole method of specifying the simulation setup. See the usage page for more information about the alternative, less flexible method.
  • The MATLAB programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte MATLAB library already has a much more flexible dynamic method of simulation specifications setup from within the MATLAB programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.
  • The Python programming language: Providing the path to an input external file is NOT the preferred method of simulation setup by the user. This is because the ParaMonte Python library already has a much more flexible dynamic method of simulation specifications setup from within the Python programming language. Nevertheless, it is equally possible to specify everything from within an input simulation specifications file.

The structure of the optional input file

Here is a summary of useful guidelines and rules for writing ParaMonte input files.

Organization

  • The structure of the input file for all ParaMonte samplers is the same in all programming languages.
  • The simulation specifications for each ParaMonte sampler (e.g., the ParaDRAM MCMC sampler) must be grouped under the ParaMonte routine’s name. We call each group a namelist.
  • Each group, corresponding to one ParaMonte sampler routine, begins with the group name preceded by an ampersand & and ends with a forward slash / (see below for an example input file).
  • Multiple namelist groups can coexist within a single input file. Only the ones relevant to the simulation of interest will be read and used. The rest will be ignored.
  • Comments are allowed anywhere inside the input file.
  • Comments must begin with an exclamation mark (!).
  • Comments can appear on an otherwise empty line or after a value assignment, but not inside a quoted string.
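
As a minimal sketch of the organization rules above, the following file holds two namelist groups. The group name ParaDRAM and the variable chainSize are taken from the example file later on this page. The group name anotherSampler and its contents are hypothetical placeholders, used only to illustrate that a group irrelevant to the current simulation is skipped.

```fortran
! A comment on a line of its own, outside of any group.

&ParaDRAM
    chainSize = 30000    ! a comment after a value assignment
/

! Hypothetical second group (placeholder name, for illustration only).
! A ParaDRAM simulation would not read anything from it.
&anotherSampler
    chainSize = 500
/
```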

Variables

  • All variable assignments are optional and can be dropped or commented out. In such cases, the ParaMonte routines will assign appropriate default values to the missing variables.
  • Variables within a namelist group can be separated from each other by either commas , or white-space characters.
  • The order in which the variables appear within a namelist group is irrelevant.
  • Variables can be defined multiple times, but only the last definition will be considered as input.
  • All variable names are case-insensitive. However, for clarity, the ParaMonte library follows the camelCase code-writing practice.
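
The variable rules above can be sketched in a single group. The variable names below all appear in the example file later on this page; the values are illustrative only.

```fortran
&ParaDRAM
    chainSize = 1000, adaptiveUpdatePeriod = 35         ! two assignments separated by a comma
    progressReportPeriod = 1000 randomSeed = 2136275    ! two assignments separated by white space only
    CHAINSIZE = 30000                                   ! same variable as chainSize (names are case-insensitive);
                                                        ! being the last definition, 30000 is the value that is used
   !sampleSize = 111                                    ! commented out, so a default value is assigned instead
/
```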

Values

  • Like variables, values within a namelist group can be separated from each other by either commas , or white-space characters.

  • Strings
    • String values must be enclosed in either single or double quotation marks: '' or "".
    • String values can be continued on multiple lines. However, any additional white-space characters introduced by the line continuation will NOT be ignored.
  • Logical (Boolean)
    • Logical values are all case-insensitive and can be either .true., true, or t for a TRUE value, and .false., false, or f for a FALSE value.
  • Real (Float)
    • Real values are double-precision by default, holding roughly 16 significant decimal digits and representing magnitudes as large as $\approx 10^{307}$ and as small as $\approx 10^{-307}$.
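
The value rules above can be sketched as follows. The variable names are taken from the example file later on this page, while the values themselves are illustrative only.

```fortran
&ParaDRAM
    outputFileName       = './out/'               ! string in single quotation marks
    chainFileFormat      = "compact"              ! string in double quotation marks
    overwriteRequested   = .TRUE.                 ! logical values are case-insensitive
    silentModeRequested  = f                      ! short form of .false.
    startPointVec        = 1.0 2.5d0, -3.e-2 4    ! four values separated by commas or white space
/
```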

Vectors

  • All vectors and arrays that are specified inside the input file begin with index 1. This follows the convention of the majority of science-oriented programming languages and libraries including but not limited to Fortran, Julia, Mathematica, MATLAB, R, LAPACK, and Eigen (C++).
  • Vectors (and arrays) of strings, integers, or real numbers can be specified as comma-separated or space-separated values. For example,
    ! real-valued vector of length 5, specified as the starting point of an MCMC simulation
    startPointVec = 1.0, -100 3, 5.6e7 8.d1
    

    You may have noticed in the above that some values are comma-separated while some others are space-separated. This is valid.

  • Vector (and array) values may be specified separately on multiple lines and in any order, as in the following:
    ! a vector of strings specifying the names of the variables that are going to be sampled in the simulation, 
    ! each corresponding to one dimension of the objective function.
    variableNameList(2) = "secondVariable"
    variableNameList(1) = "FirstVariable"
    variableNameList(3:4) = "ThirdVariable", "FourthVariable"
    
  • Vector values may be selectively provided in the input file and some values may be missing. For example,
    ! a vector of length 4 specifying the random walker's step-sizes along different dimensions of the objective function in a ParaDRAM simulation.
    proposalStartStdVec(3) = 3.0
    proposalStartStdVec(1:2) = 1.0, 2.0
    

    or,

    proposalStartStdVec = 1.0, 2.0, 3.0     ! this is identical to the above representation
    

    Notice that the missing fourth element will not be read from the input file. Instead, it will be assigned a default value by the ParaMonte routines.

  • Identical values that appear sequentially in a vector can be abbreviated via a repetition pattern rule involving *. For example,
    ! vector of length 4, specifying the lower limits of the domain of the objective function along each dimension
    domainLowerLimitVec = -3.d100, 2*-20.0, -100
    

    is equivalent to,

    domainLowerLimitVec = -3.d100, -20.0, -20.0, -100
    

    or,

    domainLowerLimitVec = 3*-3.d100
    

    is equivalent to,

    domainLowerLimitVec = -3.d100, -3.d100, -3.d100,    ! notice the fourth value is missing
    

    Note that in the latter example, only the first three values were provided. In such cases, the missing elements will be assigned appropriate default values.
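
The vector rules above can also be combined. As a sketch, assuming a 4-dimensional objective function as in the examples above, a slice can be filled with a repetition pattern while the remaining elements are set separately or left to their default values:

```fortran
&ParaDRAM
    ! elements 2 and 3 are both set to -20.0 via the repetition pattern
    domainLowerLimitVec(2:3) = 2*-20.0
    ! element 1 is set separately; element 4 is not given and falls back to its default value
    domainLowerLimitVec(1) = -3.d100
/
```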

Arrays

  • The array representation rules are identical to those of vectors described in the previous section. For example, the following array value assignments are all equivalent:
    ! a symmetric matrix of size 4-by-4 of 64-bit real numbers representing the initial covariance matrix of the ParaDRAM sampler
    proposalStartCovMat =   1.0, 0.0, 0.0, 0.0,
                            0.0, 1.0, 0.0, 0.0,
                            0.0, 0.0, 1.0, 0.0,
                            0.0, 0.0, 0.0, 1.0,
    

    or,

    proposalStartCovMat(:,1) = 1.0, 0.0, 0.0, 0.0,
    proposalStartCovMat(:,2) = 0.0, 1.0, 0.0, 0.0,
    proposalStartCovMat(:,3) = 0.0, 0.0, 1.0, 0.0,
    proposalStartCovMat(:,4) = 0.0, 0.0, 0.0, 1.0,
    

    or,

    proposalStartCovMat(1:4,1:4) = 1.0, 4*0.0, 1.0, 4*0.0, 1.0, 4*0.0, 1.0
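
All of the examples above describe the identity matrix. As a further sketch with illustrative values (not taken from any real simulation), the following specifies a symmetric 4-by-4 covariance matrix with a covariance of 0.5 between the first and second dimensions, combining column-wise assignment with the repetition pattern. Because the matrix is symmetric, it reads the same whether the values are interpreted column by column or row by row.

```fortran
&ParaDRAM
    proposalStartCovMat(:,1) = 1.0, 0.5, 2*0.0    ! first column
    proposalStartCovMat(:,2) = 0.5, 1.0, 2*0.0    ! second column
    proposalStartCovMat(:,3) = 2*0.0, 1.0, 0.0    ! third column
    proposalStartCovMat(:,4) = 3*0.0, 1.0         ! fourth column
/
```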
    

Example contents of a ParaMonte simulation input file

The following box shows the contents of an example input specifications file for a ParaDRAM simulation of an objective function defined on a 4-dimensional domain. Notice the group name &ParaDRAM at the beginning and the / at the end, as well as the ample usage of the comment symbol wherever the user deems appropriate:

!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!%  
!%  Description:
!%      +   Run the Monte Carlo sampler of the ParaMonte library given the input log-target density function `getLogFunc()`.
!%  Output:
!%      +   The simulation output files.
!%  Author:
!%      +   Computational Data Science Lab, Monday 9:03 AM, May 16 2016, ICES, UT Austin
!%  Visit:
!%      +   https://www.cdslab.org/paramonte
!%
!%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%
!
!   USAGE:
!
!       --  Comments must begin with an exclamation mark (!).
!       --  Comments can appear on an otherwise empty line or after a value assignment.
!       --  All variable assignments are optional and can be commented out. In such cases, appropriate default values will be assigned.
!       --  Use ParaDRAM namelist (group) name to group a set of ParaDRAM simulation specification variables.
!       --  The order of the input variables in the namelist groups is irrelevant and unimportant.
!       --  Variables can be defined multiple times, but only the last definition will be considered as input.
!       --  All variable names are case insensitive. However, for clarity, this software follows the camelCase code-writing practice.
!       --  String values must be enclosed with either single or double quotation marks.
!       --  Logical values are case-insensitive and can be either .true., true, or t for a TRUE value, and .false., false, or f for a FALSE value.
!       --  All vectors and arrays in the input file begin with index 1. This is following the convention of 
!           the majority of science-oriented programming languages: Fortran, Julia, Mathematica, MATLAB, and R.
!
!      For comprehensive guidelines on the input file organization and rules, visit: 
!   
!           https://www.cdslab.org/paramonte/notes/usage/paradram/input/
!   
!      To see detailed descriptions of each of the variables, visit:
!   
!           https://www.cdslab.org/paramonte/notes/usage/paradram/specifications/
!

&ParaDRAM

    ! Base specifications

    description                         = "
This\n
    is a\n
        multi-line\n
            description."                                   ! strings must be enclosed with "" or '' and can be continued on multiple lines.
                                                            ! No comments within strings are allowed.
   !outputColumnWidth                   = 25                ! this is an example of a variable that is commented out and 
                                                            ! therefore, its value won't be read by the sampler routine.
                                                            ! To pass it to the routine, simply remove the ! mark at 
                                                            ! the beginning of the line.
    outputRealPrecision                 = 17
   !outputDelimiter                     = ","
    outputFileName                      = "./out/"          ! the last forward-slash character indicates that this 
                                                            ! is the folder where the output files will be stored.
                                                            ! However, since no output filename prefix has been specified,
                                                            ! the output filenames will be assigned a randomly-generated prefix.
   !sampleSize                          = 111
    randomSeed                          = 2136275,
    chainFileFormat                     = "compact"
    variableNameList                    = "variable1"       ! Notice the missing fourth variable name here. 
                                        , "variable2"       ! Any missing variable name will be automatically assigned an appropriate name. 
                                        , "variable3"
    domainLowerLimitVec                 = 4*-1.e300         ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    domainUpperLimitVec                 = 4*1.e300          ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    parallelizationModel                = "single chain"    ! "singleChain" would also work. Similarly, "multichain", "multi chain", or "multiChain".
   !targetAcceptanceRate                = 0.23e0
    progressReportPeriod                = 1000
    maxNumDomainCheckToWarn             = 100
    maxNumDomainCheckToStop             = 1000
    restartFileFormat                   = "binary"
    overwriteRequested                  = true              ! TRUE, true, .true., .t., and t would also all be valid logical values representing True
    silentModeRequested                 = false             ! FALSE, false, .false., .f., and f would be also all valid logical values representing False
   !mpiFinalizeRequested                = true              ! TRUE, true, .true., .t., and t would be also all valid logical values representing True

    ! MCMC specifications

    chainSize                           = 30000
    startPointVec                       = 4*1.e0            ! four values of 1.e0 are specified here by the repetition pattern symbol *
   !sampleRefinementCount               = 10
    sampleRefinementMethod              = "BatchMeans"
    randomStartPointDomainLowerLimitVec = 4*-100.e0         ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    randomStartPointDomainUpperLimitVec = 4*100.0           ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
    randomStartPointRequested           = false

    ! DRAM specifications

    scaleFactor                         = "2*0.5*Gelman"    ! The asterisk here means multiplication since it is enclosed within quotation marks.
    proposalModel                       = "normal"          ! or "uniform" as you wish.
    adaptiveUpdateCount                 = 10000000
    adaptiveUpdatePeriod                = 35
    greedyAdaptationCount               = 0
   !delayedRejectionCount               = 5
   !delayedRejectionScaleFactorVec      = 5*1.
   !burninAdaptationMeasure             = 1.
   !proposalStartStdVec                 = 4*1.0             ! repetition pattern rules apply again here. 4 dimensions => 4-element vector of values.
   !proposalStartCorMat                 =   1,0,0,0,        ! 2-dimensional correlation-matrix definition, although it is commented out and won't be read.
   !                                        0,1,0,0,
   !                                        0,0,1,0,
   !                                        0,0,0,1,
   !proposalStartCovMat                 =   100,0,0,0,
   !                                        0,100,0,0,
   !                                        0,0,100,0,
   !                                        0,0,0,100,

/

Why is the input file the preferred method of simulation setup?

Specifying the properties of a ParaMonte simulation via an external input file is particularly beneficial when the ParaMonte library routines are called from within compiled languages (e.g., C/C++/Fortran). The reasons may already be clear to advanced programmers:

  • Specifying the simulation properties in an external input file ensures the highest level of flexibility and portability of your simulation by avoiding the hardcoding of simulation specifications into your compiled code. Imagine you specify a simulation property inside your code, compile and run it, then you realize that you wanted to change that property value to something else. Without an external input file, you would have to recompile your code every time for every property change.
  • Also, the same specification input file can be used to set up the same simulation settings from any programming language without a single line of change in the input file. The contents of the input files are programming-language-agnostic.
  • All variable names, when specified from the input file, are case-insensitive across all programming languages.
  • The order by which the simulation specification variables appear in the input file is irrelevant.
  • Multiple simulation namelist groups, each corresponding to an independent call to a different ParaMonte routine, can be placed within a single input file, resulting in a cleaner, more portable organization of the input data for the given simulation problem.


If you have any questions about the topics discussed on this page, feel free to ask in the comment section below, or raise an issue on the GitHub page of the library, or reach out to the ParaMonte library authors.