## Vectorization in MATLAB

Experienced programmers who are concerned with producing compact and fast programs, try to avoid for-loops wherever possible in their MATLAB codes. There is a reason for this: for-loops and while-loops have significant overhead in interpreted languages such as MATLAB and Python.

There is, of course, a remedy for this inefficiency. Since MATLAB is a matrix language, many of the matrix-level operations and functions are carried out internally using compiled C, Fortran, or assembly codes and are therefore executed at near-optimum efficiency. This is true of the arithmetic operators `*`

, `+`

,`-`

,`\`

, `/`

and of relational and logical operators.

However, for-loops may be executed relatively slowly, depending on what is inside the loop, and MATLAB may or may not be able to optimize the loop. One of the most important tips for producing efficient M-files is to avoid for-loops in favor of vectorized constructs, that is, to convert for-loops into an equivalent vector or matrix operations. Vectorization has important benefits beyond simply increasing the speed of execution. It can lead to shorter and more readable MATLAB code. Furthermore, it expresses algorithms in terms of high-level constructs that are more appropriate for high-performance computing. For example, consider the process of summation of a random vector in MATLAB,

```
n = 5e7; x = randn(n,1);
tic, s = 0; for i=1:n, s = s + x(i)^2; end, toc
```

```
Elapsed time is 0.581945 seconds.
```

Now doing the same thing, using array notation would yield,

```
tic, s = sum(x.^2); toc
```

```
Elapsed time is 0.200450 seconds.
```

Amazing, isn’t it? You get almost 3x speedup in your MATLAB code if you use vectorized computation instead of for-loops. Later on, we will see that MATLAB has inherited these excellent vectorization techniques and syntax for matrix calculations from its high-performance ancestor, Fortran.

### Exercise

How do you vectorize the following code?

```
i = 0;
for t = 0:.01:10
i = i + 1;
y(i) = sin(t);
end
```

**Answer**

```
t = 0:.01:10;
y = sin(t);
```

## Vectorization of array operations

Vectorization of arrays can be done through **array operators**, which perform the same operation for all elements in the data set. These types of operations are useful for repetitive calculations. For example, suppose you collect the volume (V) of various cones by recording their diameter (D) and height (H). If you collect the information for just one cone, you can calculate the volume for that single cone as,

```
D = 0.2;
H = 0.04;
V = 1/12*pi*(D^2)*H
```

```
V =
4.1888e-04
```

Now, suppose we collect information on 10,000 cones. The vectors `D`

and `H`

each contain 10,000 elements, and you want to calculate 10,000 volumes. In most programming languages (except Fortran and R which have similar vectorization capabilities to MATLAB), you need to set up a loop similar to this MATLAB code (here instead of 10000, I am using 7),

```
D = [-0.2 1.0 1.5 3.0 -1.0 4.2 3.14];
H = [0.0400 1.0000 2.2500 9.0000 1.0000 17.6400 9.8596];
for n = 1:7
V(n) = 1/12*pi*(D(n)^2)*H(n);
end
V
```

```
V =
0.0004 0.2618 1.3254 21.2058 0.2618 81.4640 25.4500
```

With MATLAB, you can perform the calculation for each element of a vector with similar syntax as the scalar case,

```
V = 1/12*pi*(D.^2).*H; % Vectorized Calculation
V
```

```
V =
0.0004 0.2618 1.3254 21.2058 0.2618 81.4640 25.4500
```

**Important:**Placing a period (

`.`

) before the operators `*`

, `/`

, and `^`

, transforms them into array operators.## Logical array operations

MATLAB comparison operators also accept vector inputs and return vector outputs. For example, suppose while collecting data from 10,000 cones, you record several negative values for the diameter. You can determine which values in a vector are valid with the `>=`

operator,

```
D = [-0.2 1.0 1.5 3.0 -1.0 4.2 3.14];
D >= 0
```

```
ans =
0 1 1 1 0 1 1
```

```
class(ans)
```

```
ans =
logical
```

You can directly exploit the logical indexing power of MATLAB to select the valid cone volumes, `Vgood`

, for which the corresponding elements of D are nonnegative,

```
Vgood = V(D >= 0) % removing all data corresponding to negative diameters
```

```
Vgood =
0.2618 1.3254 21.2058 81.4640 25.4500
```

MATLAB allows you to perform a logical AND or OR on the elements of an entire vector with the functions all and any, respectively. You can throw a warning if all values of `D`

are below zero,

```
if all(D < 0) % gives no warning because not all values are negative
warning('All values of diameter are negative.')
end
```

or,

```
if (D < 0)
warning('Some values of diameter are negative.')
end
```

```
Warning: Some values of diameter are negative.
```

MATLAB can also compare two vectors of the same size, allowing you to impose further restrictions. This code finds all the values where `V`

is nonnegative and `D`

is greater than `H`

,

```
D = [-0.2 1.0 1.5 3.0 -1.0 4.2 3.14];
H = [0.0400 1.0000 2.2500 1.5000 1.0000 0.6400 9.8596];
V((V >= 0) & (D > H))
```

```
ans =
21.2058 81.4640
```

```
V
```

```
V =
0.0004 0.2618 1.3254 21.2058 0.2618 81.4640 25.4500
```

```
(V >= 0) & (D > H)
```

```
ans =
0 0 0 1 0 1 0
```

The resulting vector is the same size as the inputs. To aid comparison, MATLAB contains special values to denote **overflow**, **underflow**, and **undefined operators**, such as **inf** and **nan**. Logical operators isinf and isnan exist to help perform logical tests for these special values. For example, it is often useful to exclude `NaN`

values from computations,

```
x = [2 -1 0 3 NaN 2 NaN 11 4 Inf];
xvalid = x(~isnan(x))
```

```
xvalid =
2 -1 0 3 2 11 4 Inf
```

**Warning:**Note that

`Inf == Inf`

returns `true`

; however, `NaN == NaN`

always returns `false`

in MATLAB.## Matrix Operations

Matrix operations act according to the rules of linear algebra. These operations are most useful in vectorization if you are working with multidimensional data. Suppose you want to evaluate a function, $F$, of two variables, $x$, and $y$,

To evaluate this function at every combination of points in the $x$ and $y$, you need to define a grid of values,

```
x = -2:0.2:2;
y = -1.5:0.2:1.5;
[X,Y] = meshgrid(x,y);
F = X.*exp(-X.^2-Y.^2);
```

Without `meshgrid()`

, you might need to write two for-loops to iterate through vector combinations. The function `ndgrid()`

also creates number grids from vectors, but unlike `meshgrid()`

, it can construct grids beyond three dimensions. `meshgrid()`

can only construct 2-D and 3-D grids.

The following table contains a list of MATLAB functions that are commonly used in vectorized codes,

Function | Description |
---|---|

`all` | Determine if all array elements are nonzero or true |

`any` | Determine if any array elements are nonzero |

`cumsum` | Cumulative sum |

`diff` | Differences and Approximate Derivatives |

`find` | Find indices and values of nonzero elements |

`ind2sub` | Subscripts from linear index |

`ipermute` | Inverse permute dimensions of N-D array |

`logical` | Convert numeric values to logicals |

`meshgrid` | Rectangular grid in 2-D and 3-D space |

`ndgrid` | Rectangular grid in N-D space |

`permute` | Rearrange dimensions of N-D array |

`prod` | Product of array elements |

`repmat` | Repeat copies of array |

`reshape` | Reshape array |

`shiftdim` | Shift dimensions |

`sort` | Sort array elements |

`squeeze` | Remove singleton dimensions |

`sub2ind` | Convert subscripts to linear indices |

`sum` | Sum of array elements |

## Why is vectorized code faster than for-loops?

The reason for the speedup in vectorized has to be sought in the way the memory of the computer is built. The figure below represents a schematic diagram of the Central Processing Unit (CPU) of every modern computer in relationship with computer memory.

At the highest level of the memory hierarchy, closest to the CPU, we have the CPU register. A processor register is a quickly accessible location available to a computer’s CPU. Registers usually consist of a small amount of fast storage and maybe read-only or write-only. The CPU has super fast access to data stored in the register. But the problem is that this memory is very small, typically on the orders of bits of information.

At the second level of the hierarchy of memory, we have the CPU cache, typically comprised of three different levels L1, L2, L3, which rank from fastest to slowest respectively, in terms of CPU access. However, the faster the cache memory, the smaller it is. Therefore, L1 is the fastest of the three, but also the smallest of the three levels.

CPU Caching was invented to solve a significant problem. In the early decades of computing, the main memory was extremely slow and incredibly expensive — but CPUs weren’t particularly fast, either. Starting in the 1980s, the gap began to widen quickly. Microprocessor clock speeds took off, but memory access times improved far less dramatically. As this gap grew, it became increasingly clear that a new type of fast memory was needed to bridge the gap. See the figure below.

After CPU cache, there the Random Access Memory (RAM) which you hear the most about when you go to buy a new computer. Typical computers contain 4-32 Gb of RAM. When you open MATLAB and create some variables, all of your data is stored on this memory. However, this memory is the slowest of all in terms of access to CPU.

When you use for-loops in MATLAB to perform some specific calculations on a vector, you are asking MATLAB to go to this memory at each loop iteration to fetch an element of the loop, bring it to the CPU, perform the set of operations requested, and send it back to memory. However, the CPU is much more capable than doing a single calculation at a time. Therefore, if you could somehow tell MATLAB to fetch a bulk of elements from your vector and bring them to CPU to perform the requested operations, your code would become much faster. The way to tell MATLAB to do so is called **vectorization**. By vectorizing your code, you tell MATLAB to bring as much information as possible to the highest memory level close to CPU, to perform the operations on all of them simultaneously and return the result for all of them back to the memory altogether. This results in much faster code, since nowadays, as the figure above shows, the bottleneck in code performance is not the CPU speed, but the memory access.

## Measuring the performance of your MATLAB functions and scripts

MATLAB has several built-in methods of timing how long it takes to run a MATLAB function or script. The `timeit()`

function as well as `tic`

and `toc`

, are in particular very useful. Use the `timeit()`

function for rigorous measurement of your function’s execution time. Use `tic`

and `toc`

to estimate time for smaller portions of code that are not complete functions.

For additional details about the performance of your code, such as function call information and execution time of individual lines of code, MATLAB has more sophisticated tools such as *MATLAB® Profiler*.

### Timing MATLAB functions

To measure the time required to run a function, whether built-in or your own, you can use the `timeit()`

function. The `timeit()`

function calls the user-specified function multiple times and returns the median of the time measurements. This function takes **a handle to the function whose performance is to be measured** and returns the typical execution time, in seconds.

For example, suppose that you want to measure the performance of MATLAB’s built-in function, `isprime()`

for a given input value to this function. You can compute the time to execute the function using `timeit()`

like the following,

```
timeit( @()isprime(10^14) ) % pass the function as a handle to timeit()
```

```
ans =
0.0787
```

Note that, this function `isprime()`

will have different performance given different input numbers,

```
timeit( @()isprime(10^4) ) % pass the function as a handle to timeit()
```

```
ans =
2.0402e-05
```

### Time Portions of Code

To estimate how long a portion of your program takes to run or to compare the speed of different implementations of portions of your program, you can use MATLAB stopwatch timer functions: `tic`

and `toc`

. Invoking `tic`

starts the timer, and the next `toc`

reads the elapsed time.

```
tic
% The program section to time.
toc
```

Sometimes programs run too fast for `tic`

and `toc`

to provide useful data. If your code is faster than **1/10 second**, consider timing it while running in a loop, and then average the result to find the time for a single run of the loop.

#### The `cputime()`

function vs. `tic/toc`

and `timeit()`

There is another MATLAB function that can do the timing of your scripts or your functions: The `cputime()`

function measures the total CPU time and **sums across all threads (cores) in the CPU**. This measurement is different from the **wall-clock time** that `timeit()`

or `tic/toc`

return, and could be misleading. For example, the CPU time for the pause function is typically small, but the wall-clock time accounts for the actual time that MATLAB execution is paused. Therefore, the wall-clock time might be longer.

If your function uses four processing cores equally, the CPU time could be approximately four times higher than the wall-clock time.

Frequently, your best choice to measure the performance of your code is `timeit()`

or `tic`

and `toc`

. These functions return wall-clock time. Note that, unlike `tic`

and `toc`

, the `timeit()`

function calls your code multiple times, and, therefore, considers the cost of first-time calls to your functions, which are typically more time-consuming than subsequent calls.

### Some tips for Measuring Performance

- Always time a
**significant enough**portion of code. Normally, the code that you are timing should**take more than 1/10 second to run**, otherwise, the timing may not be very accurate. - Put the code you are trying to time into a function instead of timing it at the command line or inside a script.
- Unless you are trying to measure
**first-time cost of running your code**, run your code multiple times. Use the`timeit()`

function for multiple calls timing of your function. - Avoid
`clear all`

when measuring the performance of your MATLAB scripts. This will add additional time to wipe MATLAB workspace from all current existing variable definitions, and therefore contaminate the timing measurements of the actual code in your MATLAB scripts. - When performing timing measurements, assign your output to a variable instead of letting it default to
`ans()`

.