PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428Jekyll2019-12-07T14:20:13-06:00http:/DSP2019F/Amir Shahmoradihttp:/DSP2019F/shahmoradi@utexas.edu<![CDATA[Quiz 3: Monte Carlo Integration]]>http:/DSP2019F/quiz/3-monte-carlo2019-11-21T00:00:00-06:002019-11-21T00:00:00-06:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>This quiz aims to test your basic knowledge of Monte Carlo integration techniques in Python. Please commit all of your answers to your repository within 15 minutes from the start of the quiz.</p>
<hr />
<hr />
<p><br /></p>
<p><strong>1.</strong> <a href="https://www.cdslab.org/recipes/programming/monte-carlo-sampling-of-bimodal-gaussian/monte-carlo-sampling-of-bimodal-gaussian" target="_blank">Monte Carlo sampling of the sum of two Gaussian distributions</a>.</p>
<p><a href="http:/DSP2019F/quiz/3-monte-carlo">Quiz 3: Monte Carlo Integration</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on November 21, 2019.</p><![CDATA[Homework 4: Clustering techniques]]>http:/DSP2019F/homework/4-clustering2019-11-21T00:00:00-06:002019-11-21T00:00:00-06:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>♣ <strong>Due Date: Tuesday Dec 10, 2019 2:00 PM</strong>. This homework aims at giving you some basic experience with some of the most popular clustering techniques.</p>
<p><strong>1.</strong> <a href="https://www.cdslab.org/recipes/programming/clustering-kmeans/clustering-kmeans" target="_blank">Kmeans Clustering</a>.</p>
<p><a href="http:/DSP2019F/homework/4-clustering">Homework 4: Clustering techniques</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on November 21, 2019.</p><![CDATA[Final exam: semester project]]>http:/DSP2019F/exam/1-semester-project2019-11-01T00:00:00-05:002019-11-01T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>This page describes the final semester project that will serve as the final exam for this course. Please submit all your efforts for this project (all files, data, and results) in the <code>DSP2019F/exams/final/</code> directory in your private repository for this course. Don’t forget to push your answers to your remote Github repository by <strong>2 PM, Wednesday, Dec 11, 2019</strong>. <strong>Note: I strongly urge you to attend the future lectures until the end of the semester and seek help from the instructor (Amir) to tackle this project.</strong></p>
<p>Inside the directory for the project (<code>DSP2019F/exams/final/</code>) create three other folders: <code>data</code>, <code>src</code>, <code>results</code>. The <code>data</code> folder contains the <a href="http:/DSP2019F/exam/1-problem/cells.mat" target="_blank">input data</a> for this project. The <code>src</code> folder should contain all the codes that you write for this project, and the <code>results</code> folder should contain all the results generated by your code.</p>
<p>For your final project, you can pick one of the following two projects:</p>
<div class="post_toc"></div>
<h2 id="nonlinear-regression">Nonlinear Regression</h2>
<p><br /><br /></p>
<h3 id="data-reduction-and-visualization">Data reduction and visualization</h3>
<p>Our goal in this project is to fit a mathematical model of the growth of living cells to real experimental data for the growth of a cancer tumor in the brain of a rat. You can download the data in the form of a MATLAB data file for this project from <a href="http:/DSP2019F/exam/1-problem/cells.mat" target="_blank">here</a>. Write a set of separate Python codes that perform the following tasks one after the other, and output all the results to the <code>results</code> folder described above. Since you have multiple Python codes each in a separate file for different purposes, you should also write a <code>main</code> Python code, such that when the user of your codes runs on the Bash command line,</p>
<pre><code class="language-bash">main.py
</code></pre>
<p>then all the necessary Python codes to generate all the results will be called by this <code>main</code> script.</p>
<p>Initially at time $t=0 ~\mathrm{[days]}$, $100,000\pm10,000$ brain tumor cells are injected into the brain of the rat. These cells are then allowed to grow for 10 days. Then starting at day 10, the brain of the rat is imaged using an <a href="https://en.wikipedia.org/wiki/Magnetic_resonance_imaging" target="_blank">MRI machine</a>.</p>
<p>Each image results in a 4-dimensional double-precision MATLAB matrix <code>cells(:,:,:,:)</code>, corresponding to dimensions <code>cells(y,x,z,time)</code>. This data is collected from MRI imaging of the rat’s brain almost every other day for two weeks. For example, <code>cells(:,:,:,1)</code> contains the number of cells at each point in space (y,x,z) at the first time point, and <code>cells(:,:,10,1)</code> represents the (XY) slice of the MRI at $z=10$ at the first time point.</p>
<p>Therefore, the vector of times at which we have the number of tumor cells measured would be,</p>
<script type="math/tex; mode=display">Time = [ 0, 10, 12, 14, 16, 18, 20, 22 ] ~,</script>
<p>in units of days. Given this data set,</p>
<p><strong>1. </strong> First write a Python script that reads the input MATLAB binary file containing cell numbers at different positions in the rat’s brain measured by MRI, on different days.</p>
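<p>As a rough sketch of this first step, the snippet below reads a MATLAB file with <code>scipy.io.loadmat</code>. To stay self-contained it first writes a small stand-in file; the variable name <code>cells</code> inside the file and the array sizes are assumptions made for illustration, not properties of the real <code>cells.mat</code>. Note also that <code>loadmat</code> does not read MATLAB v7.3 (HDF5-based) files.</p>

```python
import os
import tempfile

import numpy as np
from scipy.io import loadmat, savemat

# Stand-in for data/cells.mat: a small random 4D array saved under the
# (assumed) variable name "cells", with dimensions (y, x, z, time).
rng = np.random.default_rng(0)
fake_cells = rng.integers(0, 100, size=(4, 5, 3, 2)).astype(float)
path = os.path.join(tempfile.mkdtemp(), "cells.mat")
savemat(path, {"cells": fake_cells})

# loadmat returns a dictionary keyed by the MATLAB variable names.
contents = loadmat(path)
cells = contents["cells"]

# MATLAB indexing is 1-based, NumPy indexing is 0-based:
# cells(:,:,3,1) in MATLAB corresponds to cells[:, :, 2, 0] here.
xy_slice = cells[:, :, 2, 0]

# Total number of cells at each time point: sum over y, x, and z.
total_per_time = cells.sum(axis=(0, 1, 2))
print(cells.shape, xy_slice.shape, total_per_time)
```

<p>The per-time totals computed in the last step are the quantities that the growth curve later in this project is built from.</p>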
<p><strong>2. </strong> Write Python codes that generate a set of figures as similar as possible to the following figures (specific color-codes of the curves and figures do not matter, focus more on the format of the plots and its parts).</p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_1_t10.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_2_t12.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_3_t14.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_5_t16.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_6_t18.0.png" width="900" />
</figure>
<p><br /></p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/tvccZSliceSubplotWithXYlab_rad_00gy_7_t20.0.png" width="900" />
</figure>
<p><br /></p>
<h3 id="obtaining-the-error-in-tumor-cell-count">Obtaining the error in tumor cell count</h3>
<p><strong>3. </strong> Our assumption here is that the uncertainty in the total number of tumor cells at each time point is given by the number of tumor cells at the boundary of the tumor. Therefore,</p>
<ul>
<li>
<p><strong>(This part is optional extra credit.)</strong> You will have to write a Python code that identifies the boundary of the tumor at each time point, sums the cell counts over all boundary points, and uses that sum as the error in the number of tumor cells.</p>
</li>
<li>
<p>If you did not solve the above optional part, then assume that the uncertainty in the count of tumor cells at any given point in time is just $5\%$ of the total count of tumor cells. For the illustration of the error bars, you will need Python functions such as <code>pyplot.errorbar()</code> of <code>matplotlib</code> module. In the end, you should get and save a figure in your project’s figure folder like the following figure,</p>
</li>
</ul>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/growthCurve.png" width="900" />
</figure>
<p><br /></p>
<p>Note that this part of the project is completely independent of the modeling part described in the following section.</p>
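<p>One possible way to approach the two error estimates of item 3 is sketched below on a toy array. It assumes that a voxel belongs to the tumor whenever its cell count is positive, and it takes the boundary to be the outermost layer of tumor voxels, found with <code>scipy.ndimage.binary_erosion</code>. Both choices are assumptions for illustration; the project may call for a different definition of the boundary.</p>

```python
import numpy as np
from scipy import ndimage

def boundary_error(counts):
    """Sum the cell counts over the boundary voxels of the tumor.

    counts: 3D array of cell counts at one time point.
    Assumption: a voxel belongs to the tumor if its count is positive.
    """
    mask = counts > 0
    # Eroding the mask removes its outermost layer of voxels, so the
    # boundary is whatever the erosion strips away.
    interior = ndimage.binary_erosion(mask)
    boundary = mask & ~interior
    return counts[boundary].sum()

# Toy example: a 5x5x5 block with 10 cells per voxel inside a 7x7x7 grid.
counts = np.zeros((7, 7, 7))
counts[1:6, 1:6, 1:6] = 10.0

total = counts.sum()
err_boundary = boundary_error(counts)   # optional extra-credit estimate
err_fallback = 0.05 * total             # the 5% rule from the text
print(total, err_boundary, err_fallback)
```

<p>Either error value, computed once per time point, can be passed as the <code>yerr</code> argument of <code>pyplot.errorbar()</code>.</p>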
<h3 id="the-mathematical-model-of-tumor-growth">The mathematical model of tumor growth</h3>
<p><strong>4. </strong> Now our goal is to fit the time evolution of the growth of this tumor, using a mathematical model. To do so, we need to find the best-fit parameters of the model. The mathematical model we will use here is called the <a href="https://en.wikipedia.org/wiki/Gompertz_function" target="_blank">Gompertzian growth model</a>. Here, we will use a slightly modified form of the Gompertzian function of the following form,</p>
<script type="math/tex; mode=display">N(t,\lambda,c) = N_0 \times \exp\bigg( \lambda~\bigg[ 1-\exp(-ct) \bigg] \bigg) ~,</script>
<p>where $N(t,\lambda,c)$ is the <strong>predicted number</strong> of tumor cells at time $t$, $N_0$ is the initial number of tumor cells at time $t=0$ days, $\lambda$ is the growth rate parameter of the model, and $c$ is just another parameter of the model. We already know the initial value of the number of tumor cells, $N_0=100,000\pm10,000$. Therefore, we can fix $N_0$ to $100,000$ in the equation of the model given above.</p>
<p>However, we don’t know the values of the parameters $\lambda$ and $c$. Thus, we would like to find their best values given the input tumor cell data using some Python optimization algorithm.</p>
<p>This Gompertzian growth model is called our <strong>physical model</strong> for this problem, because it describes the physics of our problem (The physics/biology of the tumor growth).</p>
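<p>The model equation above translates almost literally into Python. The following is a minimal sketch; the function name <code>gompertz</code> and the parameter values in the example call are illustrative choices (the values are simply the starting guesses mentioned later in this project, not fitted results).</p>

```python
import numpy as np

N0 = 100_000  # initial number of tumor cells, fixed as described in the text

def gompertz(t, lam, c, n0=N0):
    """Predicted number of tumor cells at time t (in days)."""
    t = np.asarray(t, dtype=float)
    return n0 * np.exp(lam * (1.0 - np.exp(-c * t)))

time = np.array([0, 10, 12, 14, 16, 18, 20, 22])
# Illustrative parameter values only, not fitted to any data.
prediction = gompertz(time, lam=10.0, c=0.1)
print(prediction)
```

<p>Two properties of the formula are easy to check with this function: at $t=0$ it returns $N_0$, and for large $t$ it saturates at $N_0 \exp(\lambda)$.</p>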
<h4 id="combining-the-physical-model-with-a-regression-model">Combining the physical model with a regression model</h4>
<p>Now, if our physical model was ideally perfect in describing the data, the curve of the model prediction would pass through all the points in the growth curve plot of the above figure, thus providing a perfect description of data. This is, however, never the case, as it is famously said <strong>all models are wrong, but some are useful</strong>. In other words, the model prediction never matches observation perfectly.</p>
<p>Therefore, we have to seek the parameter values that bring the model prediction as close as possible to the data. To do so, we define a <strong>statistical model</strong> in addition to the <strong>physical model</strong> described above. In other words, we have to define a statistical regression model (the renowned <strong>least-squares method</strong>) that gives us the probability $\pi(\log N_{obs}|\log N(t))$ of observing individual data points at each of the given times,</p>
<script type="math/tex; mode=display">\pi(\log N_{obs} | \log N(t,\lambda,c),\sigma) = \frac{1}{\sigma\sqrt{2\pi}} \exp\bigg( - \frac{ \big[ \log N_{obs}(t)-\log N(t,\lambda,c) \big]^2}{2\sigma^2} \bigg) ~,</script>
<p>Note that our statistical model given above is a Normal probability density function, with its mean parameter represented by <strong>the log</strong> of the output of our physical model, $\log N(t,\lambda,c)$, and its standard deviation represented by $\sigma$, which is unknown and which we seek to find. Whenever the symbol $\pi$ appears with parentheses, like $\pi()$, it means the probability of the entity inside the parentheses. Whenever it appears alone, it means the famous number PI, $\pi\approx 3.1415$.</p>
<p><strong>Why do we use the logarithm of the number of cells instead of using the number of cells directly?</strong> The reason behind it is slightly complicated. A simple (but not entirely correct) argument is the following: we do so because the tumor cell counts at later times become extremely large numbers, on the order of several million cells (for example, look at the number of cells in the late stages of the tumor growth, around $t=20$ days). Therefore, to make sure that we don’t hit any numerical precision limits of the computer when dealing with such huge numbers, we work with the logarithm of the number of tumor cells instead of their true non-logarithmic values.</p>
<p>We have eight data points (one for each entry of the time vector above), so the overall probability of observing all of data $\mathcal{D}$ together (the time vector and the logarithm of the number of cells at different times) given the parameters of the model, $\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)$, is the product of their individual probabilities of observations given by the above equation,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)
&= \prod_{i=1}^{n=8} \pi(\log N_{obs}(t_i) | \log N(t_i,\lambda,c),\sigma) \\\\
&= \prod_{i=1}^{n=8} \frac{1}{\sigma\sqrt{2\pi}} \exp\bigg( - \frac{ \big[ \log N_{obs}(t_i)-\log N(t_i,\lambda,c) \big]^2}{2\sigma^2} \bigg) ~.
\end{align*} %]]></script>
<p>Frequently, however, you would want to work with $\log\mathcal{L}$ instead of $\mathcal{L}$. This is again because the numbers involved are extremely small, often below the precision limits of the computer. So, by taking the logarithm of the numbers, we work instead with the number’s exponent, which looks just like a normal number (not so big, not so small). So, by taking the log, the above equation becomes,</p>
<script type="math/tex; mode=display">% <![CDATA[
\begin{align*}
\log\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)
&= \sum_{i=1}^{n=8} \log \pi( \log N_{obs}(t_i) | \log N(t_i,\lambda,c),\sigma) \\\\
&= \sum_{i=1}^{n=8} \log \bigg[ \frac{1}{\sigma\sqrt{2\pi}} \exp\bigg( - \frac{ \big[ \log N_{obs}(t_i) - \log N(t_i,\lambda,c) \big]^2}{2\sigma^2} \bigg) \bigg] ~.
\end{align*} %]]></script>
<p><br /></p>
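<p>Since the logarithm of each Normal density is $-\log(\sigma\sqrt{2\pi})$ minus the squared residual over $2\sigma^2$, the sum above simplifies to $-n\log(\sigma\sqrt{2\pi}) - \frac{1}{2\sigma^2}\sum_i \big[\log N_{obs}(t_i)-\log N(t_i,\lambda,c)\big]^2$. The sketch below implements this form and cross-checks it against <code>scipy.stats.norm.logpdf</code>. The function names and the observations are made up for illustration only.</p>

```python
import numpy as np
from scipy.stats import norm

N0 = 100_000  # initial number of tumor cells, fixed as described in the text

def log_gompertz(t, lam, c, n0=N0):
    """Logarithm of the Gompertzian model prediction, log N(t, lambda, c)."""
    t = np.asarray(t, dtype=float)
    return np.log(n0) + lam * (1.0 - np.exp(-c * t))

def log_likelihood(param, t, log_n_obs):
    """log L(D | lambda, c, sigma) for param = (lambda, c, sigma)."""
    lam, c, sigma = param
    resid = np.asarray(log_n_obs) - log_gompertz(t, lam, c)
    n = resid.size
    return -n * np.log(sigma * np.sqrt(2.0 * np.pi)) - np.sum(resid**2) / (2.0 * sigma**2)

# Made-up observations, for illustration only.
t = np.array([0, 10, 12, 14, 16, 18, 20, 22], dtype=float)
log_n_obs = log_gompertz(t, 9.0, 0.12) + 0.1 * np.cos(t)
param = (10.0, 0.1, 1.0)

direct = log_likelihood(param, t, log_n_obs)
# Cross-check: sum of SciPy's Normal log-densities over the data points.
check = norm.logpdf(log_n_obs, loc=log_gompertz(t, 10.0, 0.1), scale=1.0).sum()
print(direct, check)
```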
<p><strong>5. </strong>
Now the goal is to use an optimization algorithm in Python, such as <code>fmin()</code> of <code>scipy</code> package, to find the most likely set of the parameters of the model $\lambda,c,\sigma$ that give the highest likelihood of obtaining the available data, which is given by the number $\log\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)$ from the above equation. So we want to find the set of parameters for which this number given by the above equation is maximized. You can also use any Python optimization function or method that you wish, to obtain the best parameters.</p>
<p>However, if you use <code>fmin()</code> of <code>scipy</code> package, then note that this function finds the minimum of an input function, not the maximum. What we want is to find the maximum of $\log\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)$.
What is the solution then? Very simple.
We can multiply the value of $\log\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)$ by a negative sign so that the maximum value is converted to a minimum. But, note that the position (the set of parameter values) at which this minimum occurs, will remain the same as the maximum position for $\log\mathcal{L}(\mathcal{D}|\lambda,c,\sigma)$.</p>
<p>So, now rewrite your likelihood function above by multiplying its final result (which is just a number) by a negative sign. Then you pass this modified function to <code>fmin()</code> of <code>scipy</code> package and you find the optimal parameters. Note that <code>fmin()</code> of <code>scipy</code> package also takes as input a set of initial starting parameter values to initiate the search for the optimal parameters. You can use $(\lambda,c,\sigma) = [10,0.1,1]$ as your starting point given to <code>fmin()</code> of <code>scipy</code> package to search for the optimal values of the parameters.</p>
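<p>The minimization described above might look roughly like the following sketch, which uses <code>scipy.optimize.fmin</code> on synthetic data. The data, the "true" parameter values used to generate them, and the function names are all invented for illustration; the guard that returns infinity for non-positive $\sigma$ is one possible way to keep the search in the valid region, not a required part of the method.</p>

```python
import numpy as np
from scipy.optimize import fmin

N0 = 100_000  # initial number of tumor cells, fixed as described in the text

def log_gompertz(t, lam, c, n0=N0):
    """Logarithm of the Gompertzian model prediction."""
    return np.log(n0) + lam * (1.0 - np.exp(-c * t))

def neg_log_likelihood(param, t, log_n_obs):
    """Negative of log L, so that fmin's minimum is the likelihood maximum."""
    lam, c, sigma = param
    if sigma <= 0:
        return np.inf  # sigma is a standard deviation and must stay positive
    resid = log_n_obs - log_gompertz(t, lam, c)
    return resid.size * np.log(sigma * np.sqrt(2.0 * np.pi)) + np.sum(resid**2) / (2.0 * sigma**2)

# Synthetic data for illustration only: a Gompertz curve plus Normal noise
# in log space. These are NOT the tumor measurements of the project.
rng = np.random.default_rng(2019)
t = np.array([0, 10, 12, 14, 16, 18, 20, 22], dtype=float)
log_n_obs = log_gompertz(t, 6.0, 0.15) + rng.normal(0.0, 0.1, size=t.size)

start = np.array([10.0, 0.1, 1.0])  # starting point suggested in the text
best = fmin(neg_log_likelihood, start, args=(t, log_n_obs), disp=False)
print("lambda, c, sigma =", best)
```

<p>The extra arguments <code>t</code> and <code>log_n_obs</code> reach the objective function through the <code>args</code> parameter of <code>fmin()</code>, so the function itself only needs the parameter vector as its first argument.</p>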
<p>Then redraw the above tumor evolution curve and show the result from the model prediction as well, like the following,</p>
<figure>
<img src="http:/DSP2019F/exam/1-problem/figures/growthCurveFit.png" width="900" />
</figure>
<p><br /></p>
<p>Report also your best-fit parameters in a file and submit them with all the figures and your codes to your exam folder repository.</p>
<p><br /><br /></p>
<h2 id="hierarchical-clustering">Hierarchical Clustering</h2>
<p>Consider the set of (x,y) coordinates of 1000 points in this file: <a href="http:/DSP2019F/exam/2-problem/points.txt" target="_blank">points.txt</a>. Plotting these points would yield a scatter plot like the black points in the following plot,</p>
<figure>
<img src="http:/DSP2019F/exam/2-problem/scatterPlot.png" width="900" />
</figure>
<p><br /></p>
<p>The red points on this plot delineate the borders of the three ellipsoids from which these points have been drawn. Suppose, we did not have any a priori knowledge of these ellipsoids and we wanted to <strong>guess them</strong> to the best of our knowledge using Machine Learning methods, in particular, clustering techniques.</p>
<p>The problem here, however, is slightly more complex than the above supposition: We may not even know, a priori, how many clusters exist in our data set. Many clustering techniques have been developed over the past decades to automatically answer the question of how many clusters exist in a dataset and where and which objects belong to what clusters.</p>
<p>Here, we want to focus on a very special approach. To predict the original ellipsoids from which these points are drawn, we can start with a very simple assumption: suppose all points came from one single ellipsoid. We can build this hypothetical ellipsoid by constructing the covariance matrix of the set of points in the dataset and then scale it properly such that the ellipsoid covers all the points in our dataset. Here is a procedure to do this in Python,</p>
<p>First, we read the data set and visualize it,</p>
<pre><code class="language-python">%matplotlib notebook
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np
# read data
Data = pd.read_csv("points.txt")
Point = np.array([Data.x,Data.y])
fig = plt.figure( figsize=(4.5, 4) \
, dpi= 100 \
, facecolor='w' \
, edgecolor='w' \
) # create figure object
ax = fig.add_subplot(1,1,1) # Get the axes instance
ax.plot( Point[0,:] \
, Point[1,:] \
, 'r.' \
, markersize = 1 \
) # plot the points as red dots
ax.set_xlabel('X')
ax.set_ylabel('Y')
fig.savefig('points.png', dpi=200) # save the figure to an external file
plt.show() # display the figure
</code></pre>
<p>This will display the following figure,</p>
<figure>
<img src="http:/DSP2019F/exam/2-problem/points.png" width="900" />
</figure>
<p><br /></p>
<p>Now, here is a script that computes the covariance matrix of a given sample of points (here, our dataset),</p>
<pre><code class="language-python">def getMinVolPartition(Point):
    import numpy as np
    npoint = len(Point[0,:])
    ndim = len(Point[:,0])
    ncMax = npoint // (ndim + 1) # max number of clusters possible
    BoundingEllipsoidCenter = np.array([np.mean(Point[0,:]),np.mean(Point[1,:])])
    # np.asmatrix is used in place of np.mat, which was removed in NumPy 2.0
    SampleCovMat = np.asmatrix(np.cov(Point))
    SampleInvCovMat = np.asmatrix(np.linalg.inv(SampleCovMat))
    PointNormed = np.asmatrix(np.zeros((ndim,npoint)))
    for idim in range(ndim):
        PointNormed[idim,:] = Point[idim] - BoundingEllipsoidCenter[idim]
    # the diagonal of MahalSq holds the squared Mahalanobis distance of each point
    MahalSq = PointNormed.T * SampleInvCovMat * PointNormed
    maxMahalSq = np.max(MahalSq)
    BoundingEllipsoidVolume = np.linalg.det(SampleCovMat) * maxMahalSq**ndim
    BoundingEllipsoidCovMat = SampleCovMat * maxMahalSq
    print(
    """
    nd = {}
    np = {}
    ncMax = {}
    SampleCovMat = {}
    InvCovMat = {}
    max(MahalSq) = {}
    BoundingEllipsoidCenter = {}
    BoundingEllipsoidCovMat = {}
    BoundingEllipsoidVolume = {}
    """.format( ndim
              , npoint
              , ncMax
              , SampleCovMat[:]
              , SampleInvCovMat
              , maxMahalSq
              , BoundingEllipsoidCenter
              , BoundingEllipsoidCovMat
              , BoundingEllipsoidVolume
              ))
    return BoundingEllipsoidCenter, BoundingEllipsoidCovMat
</code></pre>
<p>Calling this function would give an output like the following,</p>
<pre><code class="language-python">getMinVolPartition(Point)
</code></pre>
<pre><code> nd = 2
np = 1000
ncMax = 333
SampleCovMat = [[1.0761723 0.36394188]
[0.36394188 0.71635847]]
InvCovMat = [[ 1.12198982 -0.5700206 ]
[-0.5700206 1.68554491]]
max(MahalSq) = 14.185346024371276
BoundingEllipsoidCenter = [6.44826263 6.14296536]
BoundingEllipsoidCovMat = [[15.26587652 5.16264153]
[ 5.16264153 10.16179275]]
BoundingEllipsoidVolume = 128.47580579408614
(array([6.44826263, 6.14296536]), matrix([[15.26587652, 5.16264153],
[ 5.16264153, 10.16179275]]))
</code></pre>
<p>where the variable <code>BoundingEllipsoidCenter</code> is a vector of length two, representing the center of the bounding ellipsoid of these points, the variable <code>BoundingEllipsoidCovMat</code> represents the 2-by-2 covariance matrix of this bounding ellipsoid, and the variable <code>BoundingEllipsoidVolume</code> is the determinant of the covariance matrix of this bounding ellipsoid, essentially representing the volume enclosed by it.</p>
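<p>The scaling step inside the function can be checked in isolation. The sketch below, on made-up points, computes the squared Mahalanobis distance of each point directly with <code>np.einsum</code> (rather than through the full <code>npoint</code>-by-<code>npoint</code> matrix product) and confirms that, after scaling the covariance matrix by the largest distance, every point lies inside or on the resulting ellipsoid. The variable names are illustrative.</p>

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up sample with the same layout as Point above: shape (ndim, npoint).
point = rng.multivariate_normal([6.0, 6.0], [[1.0, 0.4], [0.4, 0.7]], size=500).T

center = point.mean(axis=1)
cov = np.cov(point)
inv_cov = np.linalg.inv(cov)

# Squared Mahalanobis distance of every point, one value per point.
diff = point - center[:, None]
mahal_sq = np.einsum("ip,ij,jp->p", diff, inv_cov, diff)

# The full matrix product has these distances on its diagonal, and its
# largest entry coincides with the largest diagonal entry.
full = diff.T @ inv_cov @ diff
print(np.isclose(full.max(), mahal_sq.max()))

# Scaling the covariance matrix by the largest distance yields an
# ellipsoid whose boundary passes through the farthest point.
bounding_cov = cov * mahal_sq.max()
scaled = np.einsum("ip,ij,jp->p", diff, np.linalg.inv(bounding_cov), diff)
print(mahal_sq.max(), scaled.max())
```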
<p>To visualize this bounding ellipsoid, we can use the following code,</p>
<pre><code class="language-python">def getRandMVU(numRandMVU,MeanVec,CovMat,isInside=True):
    """
    generates numRandMVU uniformly-distributed random points from
    inside an ndim-dimensional ellipsoid with Covariance Matrix CovMat,
    centered at MeanVec[0:ndim].
    Output:
    Numpy matrix of shape numRandMVU by ndim
    """
    import numpy as np
    ndim = len(MeanVec)
    AvgStdMVN = np.zeros(ndim)
    CovStdMVN = np.eye(ndim)
    RandStdMVN = np.random.multivariate_normal(AvgStdMVN,CovStdMVN,numRandMVU)
    DistanceSq = np.sum(RandStdMVN**2, axis=1)
    #print(len(DistanceSq))
    if isInside:
        UnifRnd = np.random.random((numRandMVU,))
        UnifRnd = (UnifRnd**(1./ndim)) / np.sqrt(DistanceSq)
    # np.asmatrix is used in place of np.mat, which was removed in NumPy 2.0
    CholeskyLower = np.linalg.cholesky(np.asmatrix(CovMat))
    #print(CholeskyLower[1,0])
    RandMVU = np.zeros(np.shape(RandStdMVN))
    for iRandMVU in range(numRandMVU):
        if isInside:
            # rescale to a point inside the unit ball
            RandStdMVN[iRandMVU] *= UnifRnd[iRandMVU]
        else:
            # normalize to a point on the unit sphere
            RandStdMVN[iRandMVU] /= np.sqrt(DistanceSq[iRandMVU])
        # multiply by the lower Cholesky factor to map onto the ellipsoid
        for i in range(ndim):
            RandMVU[iRandMVU,i] = RandMVU[iRandMVU,i] + CholeskyLower[i,i] * RandStdMVN[iRandMVU,i]
            for j in range(i+1,ndim):
                RandMVU[iRandMVU,j] = RandMVU[iRandMVU,j] + CholeskyLower[j,i] * RandStdMVN[iRandMVU,i]
        RandMVU[iRandMVU] += MeanVec
    return RandMVU
</code></pre>
<p>The above code takes an input covariance matrix <code>CovMat</code> corresponding to an ellipsoid of interest centered at <code>MeanVec</code>, then outputs a set of <code>numRandMVU</code> random points that are uniformly distributed inside this ellipsoid. If the optional argument <code>isInside</code> is set to <code>False</code>, then the output random points will instead lie on the boundary of the ellipsoid.</p>
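<p>The same idea can be written without explicit loops. The following is a vectorized sketch, not a replacement for the function above: the name <code>sample_ellipsoid</code>, its arguments, and the rounded center and covariance values in the example are illustrative assumptions. It relies on the facts that normalized Gaussian vectors are uniform on the unit sphere and that the Cholesky factor of the covariance matrix maps the unit ball onto the ellipsoid.</p>

```python
import numpy as np

def sample_ellipsoid(n, mean, cov, inside=True, rng=None):
    """Draw n points uniformly inside the ellipsoid
    {x : (x - mean)^T cov^{-1} (x - mean) <= 1},
    or on its boundary when inside=False. Returns shape (n, ndim)."""
    rng = np.random.default_rng() if rng is None else rng
    mean = np.asarray(mean, dtype=float)
    ndim = mean.size
    # Normalized Gaussian vectors are uniform on the unit sphere.
    z = rng.standard_normal((n, ndim))
    z /= np.linalg.norm(z, axis=1, keepdims=True)
    if inside:
        # A radius of u**(1/ndim) makes the points uniform in the unit ball.
        z *= rng.random((n, 1)) ** (1.0 / ndim)
    # The lower Cholesky factor maps the unit ball onto the ellipsoid.
    chol = np.linalg.cholesky(np.asarray(cov, dtype=float))
    return mean + z @ chol.T

# Center and covariance rounded from the example output above.
mean = np.array([6.45, 6.14])
cov = np.array([[15.27, 5.16], [5.16, 10.16]])
rng = np.random.default_rng(1)
on_boundary = sample_ellipsoid(1000, mean, cov, inside=False, rng=rng)
inside = sample_ellipsoid(1000, mean, cov, inside=True, rng=rng)
print(on_boundary.shape, inside.shape)
```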
<p>Here is an illustration of the bounding ellipsoid of the points we are interested to classify in this problem,</p>
<pre><code class="language-python">MeanVec, CovMat = getMinVolPartition(Point)
RandMVU = getRandMVU( numRandMVU=10000
, MeanVec=MeanVec
, CovMat=CovMat
, isInside = False
)
%matplotlib notebook
import matplotlib.pyplot as plt
fig = plt.figure( figsize=(4.5, 4) \
, dpi= 100 \
, facecolor='w' \
, edgecolor='w' \
) # create figure object
# plot the points
plt.plot( Point[0,:] \
, Point[1,:] \
, 'r.' \
, markersize = 2 \
)
# plot the center point
plt.plot( MeanVec[0] \
, MeanVec[1] \
, 'b.' \
, markersize = 10 \
)
# plot the bounding ellipsoid
plt.scatter(RandMVU[:,0],RandMVU[:,1],1)
plt.xlabel('X') # label the axes of the current figure
plt.ylabel('Y')
plt.show()
</code></pre>
<figure>
<img src="http:/DSP2019F/exam/2-problem/boundingEllipsoid.png" width="900" />
</figure>
<p><br /></p>
<p>So far, we have been able to find an ellipsoid that encloses all of the points in our problem. But here is the second question: Does this single ellipsoid accurately describe the original ellipsoid(s) from which the points were drawn? And does it really represent the least-volume bounding ellipsoid for all of these points?</p>
<p>One way to ensure that this ellipsoid is indeed the least-volume ellipsoid is to check and see if the points are uniformly distributed inside our single ellipsoid. But this turns out to be a very challenging task.</p>
<p>An easier way to see if the single ellipsoid is a good fit to our points is to compute the area of the single ellipsoid, then move on to assume that our data came from <strong>two ellipsoids</strong> instead of a single ellipsoid. At this point, we can use the K-means clustering method to find the two clusters from which these points could have been drawn.</p>
<p>Now, here is the critical step: We compute the sum of the areas enclosed by these two ellipsoids (which could overlap, but that is fine, <strong>we proceed as if they were not overlapping</strong>). Then we can compare this sum with the area of the original single ellipsoid in the above figure:</p>
<ol>
<li>If the area of the single ellipsoid is smaller than the sum of the areas of the two child-ellipsoids, we assume that all of the points in our dataset came from the single ellipsoid, and stop further searches for potentially more clusters in our dataset.</li>
<li>However, if the area of the single ellipsoid is larger than the sum of the areas of the two child-ellipsoids, then we know that the two smaller ellipsoids are likely better fit to our dataset than a single ellipsoid. Therefore, our dataset was likely generated from two-ellipsoids.</li>
</ol>
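<p>The comparison in the two cases above can be sketched as follows. This is an illustration under several assumptions: points are stored as rows (shape <code>(npoint, ndim)</code>, the transpose of <code>Point</code> above), the function names are hypothetical, K-means is taken from <code>scipy.cluster.vq.kmeans2</code>, and the area is measured by the square root of the determinant of the bounding covariance matrix, which is proportional to the true area of the ellipse (unlike the plain determinant printed by <code>getMinVolPartition</code>).</p>

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def bounding_area(points):
    """Quantity proportional to the area (volume) of the covariance-based
    bounding ellipsoid of points, an array of shape (npoint, ndim)."""
    ndim = points.shape[1]
    diff = points - points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    mahal_sq = np.einsum("pi,ij,pj->p", diff, np.linalg.inv(cov), diff)
    # det(cov * s) = det(cov) * s**ndim; the volume scales as its square root.
    return np.sqrt(np.linalg.det(cov) * mahal_sq.max() ** ndim)

def prefer_split(points, labels):
    """True if the two child ellipsoids have a smaller total area than the parent."""
    ndim = points.shape[1]
    children = [points[labels == k] for k in (0, 1)]
    # A child with too few points has no well-defined ellipsoid.
    if any(len(child) < ndim + 1 for child in children):
        return False
    return sum(bounding_area(child) for child in children) < bounding_area(points)

# Made-up data: two well-separated blobs of 200 points each.
rng = np.random.default_rng(0)
blob_a = rng.normal([0.0, 0.0], 1.0, size=(200, 2))
blob_b = rng.normal([20.0, 0.0], 1.0, size=(200, 2))
points = np.vstack([blob_a, blob_b])

# K-means with two clusters assigns a label (0 or 1) to every point.
_, labels = kmeans2(points, 2, minit="++")
print("split into two clusters?", prefer_split(points, labels))
```

<p>Because K-means starts from random centers, the labels (and occasionally the decision) can differ between runs.</p>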
<p>But what if there are more than two ellipsoids responsible for the generation of the points? One way to test this hypothesis is to repeat the above procedure for the two child-ellipsoids and see whether any one of them can be replaced with two sub-child-ellipsoids instead. This procedure can then be repeated as many times as needed, until the algorithm stops, implying that all of the child-ellipsoids have been found, <strong>or</strong> the number of points for a sub-clustering task becomes three or fewer, in which case no more clustering is possible, because we need at least three points to form a 2D ellipsoid.</p>
<p>Write an algorithm based on the above description and provided scripts that can classify all of the points in a given dataset into an automatically-determined number of ellipsoids, such that each point in the dataset is enclosed by at least one ellipsoid. The first graph above shows an example of a set of such ellipsoids illustrated by the green dots.</p>
<p>Note that the ellipsoids found by your algorithm are not unique, meaning that different runs of the algorithm could potentially yield different sets of best-fit ellipsoids. However, we can hope that each set of such ellipsoids found by the algorithm is a good approximation to the original ellipsoids from which the points were drawn.</p>
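<p>To make the recursion concrete, here is one possible skeleton of the procedure, offered as a sketch under stated assumptions rather than a reference solution. Points are stored as rows, K-means comes from <code>scipy.cluster.vq.kmeans2</code>, the area measure is the square root of the determinant of the bounding covariance matrix, and the names <code>bounding_area</code>, <code>partition</code>, <code>disc</code>, and <code>min_points</code> are hypothetical. The test data are two made-up discs of uniformly distributed points.</p>

```python
import numpy as np
from scipy.cluster.vq import kmeans2

def bounding_area(points):
    """Quantity proportional to the area of the bounding ellipsoid of points."""
    ndim = points.shape[1]
    diff = points - points.mean(axis=0)
    cov = np.cov(points, rowvar=False)
    mahal_sq = np.einsum("pi,ij,pj->p", diff, np.linalg.inv(cov), diff)
    return np.sqrt(np.linalg.det(cov) * mahal_sq.max() ** ndim)

def partition(points, min_points):
    """Recursively split points (shape (npoint, ndim)); return a list of clusters."""
    # Too few points to form two children with min_points points each.
    if len(points) < 2 * min_points:
        return [points]
    _, labels = kmeans2(points, 2, minit="++")
    children = [points[labels == k] for k in (0, 1)]
    if min(len(child) for child in children) < min_points:
        return [points]
    # Keep the parent if the two children do not reduce the total area.
    if sum(bounding_area(child) for child in children) >= bounding_area(points):
        return [points]
    # Otherwise repeat the same test on each child.
    return partition(children[0], min_points) + partition(children[1], min_points)

def disc(n, center, rng):
    """n points uniformly distributed in a unit disc around center (made-up data)."""
    z = rng.standard_normal((n, 2))
    z *= (rng.random((n, 1)) ** 0.5) / np.linalg.norm(z, axis=1, keepdims=True)
    return np.asarray(center) + z

rng = np.random.default_rng(0)
points = np.vstack([disc(300, [0.0, 0.0], rng), disc(300, [6.0, 0.0], rng)])

clusters = partition(points, min_points=3)  # ndim + 1 points per ellipsoid
print("number of clusters found:", len(clusters))
```

<p>Each element of <code>clusters</code> is the set of points assigned to one ellipsoid, so the bounding ellipsoid of each cluster can then be computed and drawn as shown earlier. A larger <code>min_points</code> would stop the recursion earlier; whether that is desirable depends on the dataset.</p>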
<p>Here is an animation of this algorithm at work, for a set of points with an evolving overall-shape over time,</p>
<figure>
<img src="http:/DSP2019F/exam/2-problem/ellipsoids_forever.gif" width="900" />
</figure>
<p><br /></p>
<p><br /><br /></p>
<p><a href="http:/DSP2019F/exam/1-semester-project">Final exam: semester project</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on November 01, 2019.</p><![CDATA[Homework 3: More on Python programming, Monte Carlo methods, and regression]]>http:/DSP2019F/homework/3-more-on-python-programming-data-science2019-10-31T00:00:00-05:002019-10-31T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>♣ <strong>Due Date: Thursday Nov 21, 2019 2:00 PM</strong>. This homework aims at giving you some extra experience with the syntax of Python and programming via Python.</p>
<p><strong>1.</strong> <a href="https://www.cdslab.org/recipes/programming/one-line-check-even-number/one-line-check-even-number" target="_blank">Check if number is even in one line function definition</a>.</p>
<p><strong>2.</strong> <a href="https://www.cdslab.org/recipes/programming/finding-maximum-value-via-recursive-function/finding-maximum-value-via-recursive-function" target="_blank">Finding the maximum value of an array via recursive function calls</a>.</p>
<p><strong>3.</strong> <a href="https://www.cdslab.org/recipes/programming/finding-maximum-location-via-recursive-function/finding-maximum-location-via-recursive-function" target="_blank">Finding the position of the maximum value of an array via recursive function calls</a>.</p>
<p><strong>4.</strong> <a href="https://www.cdslab.org/recipes/programming/monte-carlo-approximation-of-pi/monte-carlo-approximation-of-pi" target="_blank">Monte Carlo approximation of the number Pi</a></p>
<p><strong>5.</strong> <a href="https://www.cdslab.org/recipes/programming/random-walk-central-limit-theorem/random-walk-central-limit-theorem" target="_blank">Understanding the Central Limit Theorem via random walk</a></p>
<p><strong>6.</strong> <a href="https://www.cdslab.org/recipes/programming/monte-carlo-sampling-of-distribution-functions/monte-carlo-sampling-of-distribution-functions" target="_blank">Monte Carlo sampling of distribution functions</a></p>
<p><strong>7.</strong> <a href="https://www.cdslab.org/recipes/programming/simulating-monty-hall-game/simulating-monty-hall-game" target="_blank">Simulating the Monty Hall game</a></p>
<p><strong>8.</strong> <a href="https://www.cdslab.org/recipes/programming/regression-standard-normal-distribution/regression-standard-normal-distribution" target="_blank">Regression: obtaining the most likely mean and standard deviation of a set of Standard Normally Distributed Random Variables</a></p>
<p><strong>9.</strong> <a href="https://www.cdslab.org/recipes/programming/regression-predicting-future-global-land-temperature/regression-predicting-future-global-land-temperature" target="_blank">Regression: Predicting the global land temperature of the Earth in 2050 from the past data</a></p>
<p><a href="http:/DSP2019F/homework/3-more-on-python-programming-data-science">Homework 3: More on Python programming, Monte Carlo methods, and regression</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on October 31, 2019.</p><![CDATA[Quiz 2: Python control constructs and program units]]>http:/DSP2019F/quiz/2-control-constructs-and-program-units2019-10-21T00:00:00-05:002019-10-21T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>This quiz aims to test your basic knowledge of control constructs and program units in Python.</p>
<hr />
<hr />
<p><br /></p>
<ol>
<li>Suppose you write a Python module that you would also like to run as a standalone Python code. If you wanted to make sure that some specific Python statements are executed only when the code is run as a standalone Python code (and not imported as a module), you may recall from the lecture that we had to use an if block like the following,
<pre><code class="language-python">if __name__ == "__main__":
    <Python statements>
</code></pre>
<p>Briefly explain what this if block does and means.</p>
</li>
<li>Suppose you write a module named <code>myModule</code>, which contains the function <code>myfunc</code>. Now you import this module to another code.<br />
(A) Write down the import statement that would enable you to use <code>myfunc</code> with name <code>f</code> instead.<br />
(B) What would be the output of the following Python print statement,
<pre><code class="language-python">import myModule as mm
print(mm.__name__)
</code></pre>
</li>
<li>Suppose there are two lists of numbers,
<pre><code class="language-python">even = [0,2,4,6,8]
odd = [1,3,5,7,9]
</code></pre>
<p>Write a <strong>one-line</strong> Python statement (list comprehension) that gives a list <code>summ</code> whose elements are the sum of the respective elements in the above two lists <code>odd</code> and <code>even</code>, that is,</p>
<pre><code class="language-python">summ
</code></pre>
<pre><code>[1, 5, 9, 13, 17]
</code></pre>
<p>(Hint: You can use <code>zip</code> function inside a list comprehension. There is also a more efficient way of achieving the goal, without list comprehension. Any guess?)</p>
</li>
<li>Consider the following for-loop,
<pre><code class="language-python">mylist = list(range(0,10,2))
for item in mylist:
    mylist.append(item+1)
</code></pre>
<p>How many iterations does this for-loop perform before ending? Explain briefly why.</p>
</li>
<li>Write a recursive function named <code>getSum</code> or <code>get_sum</code> that takes an input integer and gives as the output, the sum of all positive integers up to and including the input integer, for example,
<pre><code class="language-python">getSum(-1)
</code></pre>
<pre><code>0
</code></pre>
<pre><code class="language-python">getSum(0)
</code></pre>
<pre><code>0
</code></pre>
<pre><code class="language-python">getSum(1)
</code></pre>
<pre><code>1
</code></pre>
<pre><code class="language-python">getSum(2)
</code></pre>
<pre><code>3
</code></pre>
<pre><code class="language-python">getSum(3)
</code></pre>
<pre><code>6
</code></pre>
</li>
</ol>
<p><a href="http:/DSP2019F/quiz/2-control-constructs-and-program-units">Quiz 2: Python control constructs and program units</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on October 21, 2019.</p><![CDATA[Homework 2: Python programming]]>http:/DSP2019F/homework/2-python-programming2019-10-01T00:00:00-05:002019-10-01T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>♣ <strong>Due Date: Thursday Oct 17, 2019 2:00 PM</strong>. This homework aims at giving you some experience with the syntax of Python and programming via Python.</p>
<p><strong>1.</strong> <a href="https://www.cdslab.org/recipes/programming/python-call-script-from-bash/python-call-script-from-bash" target="_blank">Python script call from the Bash command line</a>.</p>
<p><strong>2.</strong> <a href="https://www.cdslab.org/recipes/programming/python-variable-aliasing-copying/python-variable-aliasing-copying" target="_blank">Python aliasing vs. copying variables</a>.</p>
<p><strong>3.</strong> <a href="https://www.cdslab.org/recipes/programming/implementing-gaussian-function/implementing-gaussian-function" target="_blank">Implementing the Bell-shaped (Gaussian) function</a>.</p>
<p><strong>4.</strong> <a href="https://www.cdslab.org/recipes/programming/branching-pythonic-way/branching-pythonic-way" target="_blank">Branching, the Pythonic way</a>.</p>
<p><strong>5.</strong> <a href="https://www.cdslab.org/recipes/programming/fibonacci-sequence-via-recursive-function-calls/fibonacci-sequence-via-recursive-function-calls" target="_blank">Computing the Fibonacci sequence via recursive function calls</a>.</p>
<p><strong>6.</strong> <a href="https://www.cdslab.org/recipes/programming/fibonacci-sequence-via-for-loop/fibonacci-sequence-via-for-loop#python" target="_blank">Computing the Fibonacci sequence via for-loop</a>.</p>
<p><strong>7.</strong> <a href="https://www.cdslab.org/recipes/programming/isprime-recursive/isprime-recursive" target="_blank">Checking if an input is a prime number (via recursive function calls)?</a>.</p>
<p><strong>8.</strong> <a href="https://www.cdslab.org/recipes/programming/largest-prime-number-smaller-than-input/largest-prime-number-smaller-than-input" target="_blank">Getting the largest prime number smaller than the input value</a>.</p>
<p><strong>9.</strong> <a href="https://www.cdslab.org/recipes/programming/triangle-area/triangle-area" target="_blank">Computing the area of a triangle</a>.</p>
<p><strong>10.</strong> <a href="https://www.cdslab.org/recipes/programming/while-loop-to-for-loop/while-loop-to-for-loop" target="_blank">The while-loop implementation of for-loop</a>.</p>
<p><strong>11.</strong> <a href="https://www.cdslab.org/recipes/programming/command-line-input-arguments-summation/command-line-input-arguments-summation" target="_blank">Command line input arguments summation via sum()</a>.</p>
<p><strong>12.</strong> <a href="https://www.cdslab.org/recipes/programming/command-line-input-arguments-eval/command-line-input-arguments-eval" target="_blank">Command line input arguments summation via eval()</a>.</p>
<p><strong>13.</strong> <a href="https://www.cdslab.org/recipes/programming/precision-error-paradox/precision-error-paradox" target="_blank">Impact of machine precision on numerical computation</a>.</p>
<p><strong>14.</strong> <a href="https://www.cdslab.org/recipes/programming/integer-overflow/integer-overflow" target="_blank">Integer overflow</a>.</p>
<p><strong>15.</strong> <a href="https://www.cdslab.org/recipes/programming/modifying-loop-index-value/modifying-loop-index-value" target="_blank">Modifying the index of a for-loop</a>.</p>
<p><a href="http:/DSP2019F/homework/2-python-programming">Homework 2: Python programming</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on October 01, 2019.</p><![CDATA[Quiz 1: Version control system, programming history]]>http:/DSP2019F/quiz/1-version-control-system-programming-history2019-09-12T00:00:00-05:002019-09-12T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<!--
This is the solution to [Quiz 1: Problems - Version control system](1-problems-version-control-system.html){:target="_blank"}.
The following figure illustrates the grade distribution for this quiz.
<figure>
<img src="http:/DSP2019F/quiz/gradeDist/gradeHistQuiz1.png" width="700">
<figcaption style="text-align:center">
Maximum possible points is 1.
</figcaption>
</figure>
-->
<p>This quiz aims at testing your basic knowledge of version control systems and the history of programming. Don’t forget to push your answers to your remote repository by the end of the quiz. Push all your answers to the <strong>quiz/1/</strong> folder in your GitHub project.</p>
<hr />
<hr />
<p><br /></p>
<ol>
<li>
<p>Which of the following Git commands can add all the <strong>new</strong> and <strong>modified-existing</strong> files to the staging area? Choose all that apply.<br />
<br />
(A) <code>git add -A</code><br />
(B) <code>git add --A</code><br />
(C) <code>git add -all</code><br />
(D) <code>git add --all</code><br />
(E) <code>git add -u</code><br />
(F) <code>git add .</code><br />
(G) <code>git add .; git add -u</code><br />
(H) <code>git add .; git add --u</code><br />
(I) <code>git add -u; git add .</code><br />
(J) <code>git add --u; git add .</code></p>
<p><strong>Answer: A, D, G, F, I</strong></p>
</li>
<li>
<p>Which of the following Git commands <strong>both</strong> stage and commit <strong>only modified and deleted files</strong> but <strong>NOT</strong> the <em>new files</em> added to the repository since the last commit? Choose all that apply.<br />
<br />
(A) <code>git commit</code><br />
(B) <code>git commit -a</code><br />
(C) <code>git commit -am</code></p>
<p><strong>Answer: B, C</strong></p>
</li>
<li>
<p>Write down the Git command that lists all Git commands for you.</p>
<p><strong>Answer:</strong></p>
<pre><code class="language-bash"> $ git help -a
</code></pre>
</li>
<li>
<p>(A) What is the closest programming language to machine code (i.e., binary code)?</p>
<p><strong>Answer:</strong><br />
Assembly</p>
<p>(B) Does it need interpretation in order to become machine-comprehensible?</p>
<p><strong>Answer:</strong><br />
Yes. An <em>assembler</em> translates the program into machine code.</p>
</li>
<li>
<p>(A) Name the oldest high-level programming language that is still in active daily use.</p>
<p><strong>Answer:</strong><br />
Fortran</p>
<p>(B) Approximately how many decades old is it? (An answer within $\pm15$ years is acceptable, as is the decade in which it was created.)</p>
<p><strong>Answer:</strong><br />
About six decades; it was created in the 1950s.</p>
</li>
<li>
<p>(A) Name a second-generation programming language.</p>
<p><strong>Answer:</strong><br />
Assembly</p>
<p>(B) To which language generation do Fortran, C, C++, MATLAB, Python, and R each belong?</p>
<p><strong>Answer:</strong><br />
third, third, third, fourth, fourth, fourth</p>
</li>
<li>
<p>In what decades were C, C++, MATLAB, and Python created, respectively?</p>
<p><strong>Answer:</strong><br />
1970s, 1980s, 1980s, 1990s</p>
</li>
<li>
<p>Name an ancestor programming language of C.</p>
<p><strong>Answer:</strong><br />
B</p>
</li>
<li>
<p>Name a programming language ancestor of C++.</p>
<p><strong>Answer:</strong><br />
C, Simula</p>
</li>
<li>
<p>Name a programming language ancestor of MATLAB and a programming language ancestor of Python.</p>
<p><strong>Answer:</strong><br />
Fortran/C</p>
</li>
<li>
<p>How would you distinguish exponential behavior vs. power-law behavior (function) in a 2-dimensional plot?</p>
<p><strong>Answer:</strong><br />
An exponential curve looks like a line only when the Y-axis alone is plotted on log-scale.<br />
A power-law curve looks like a line only when both the X- and Y- axes are plotted on log-scale.</p>
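<p>A minimal numerical sketch of the same idea, without any plotting: taking logarithms turns an exponential $y = a\,e^{bx}$ into $\log y = \log a + b\,x$ (a line in $x$), and a power law $y = a\,x^{k}$ into $\log y = \log a + k\,\log x$ (a line in $\log x$). The helper <code>slope</code> and the sample values of $a$, $b$, and $k$ below are illustrative assumptions, not part of the quiz.</p>
<pre><code class="language-python">import math

def slope(xs, ys):
    """Least-squares slope of ys against xs (hypothetical helper)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var = sum((x - mean_x) ** 2 for x in xs)
    return cov / var

x = [1.0, 2.0, 4.0, 8.0, 16.0]
exponential = [3.0 * math.exp(0.5 * xi) for xi in x]  # y = 3 exp(0.5 x)
power_law = [3.0 * xi ** 2.5 for xi in x]             # y = 3 x^2.5

log_x = [math.log(xi) for xi in x]

# Exponential: log(y) is linear in x, and the slope recovers the rate b.
exp_rate = slope(x, [math.log(yi) for yi in exponential])

# Power law: log(y) is linear in log(x), and the slope recovers the exponent k.
power_exponent = slope(log_x, [math.log(yi) for yi in power_law])

print(round(exp_rate, 6), round(power_exponent, 6))
</code></pre>
<p>Up to floating-point rounding, the two slopes come out as the assumed rate 0.5 and exponent 2.5, which is what a straight line on the corresponding log-scaled plot would show.</p>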
</li>
</ol>
<p><a href="http:/DSP2019F/quiz/1-version-control-system-programming-history">Quiz 1: Version control system, programming history</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on September 12, 2019.</p><![CDATA[Homework 1: Version Control Using Git and Github]]>http:/DSP2019F/homework/1-version-control-using-git-github2019-09-04T00:00:00-05:002019-09-04T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>♣ <strong>Due Date: Wednesday Sep 12, 2019 2:00 PM</strong>. This homework aims at giving you some experience on how to create Git branches, develop your project on multiple branches, merge them, resolve potential conflicts between different branches upon merging, and finally how to delete them. It also gives you some experience with using other commonly-used Git commands.</p>
<p>First, use the following Markdown language references, or any other reference that you find or prefer, to design a GitHub-interpretable README file for each of the folders in your project for this course, and a GitHub web-page for your project.</p>
<ul>
<li><a href="http:/DSP2019F/lecture/1/markdown-cheatsheet-online.pdf" target="_blank">Markdown language cheat-sheet (pdf)</a></li>
<li><a href="https://blog.ghost.org/markdown/" target="_blank">Markdown language reference (web)</a></li>
<li><a href="https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet" target="_blank">Adam Pritchard’s Markdown cheat-sheet (web)</a></li>
</ul>
<p>Write the code sections of your answers in Markdown syntax.
For example,<br />
<code>
```bash <br />
$ git branch -d test <br />
error: Cannot delete branch 'test' checked out at 'C:/Users/Amir/git/foo' <br />
```
</code>
<br />
will display the following text highlighted as <em>bash</em> code, in your <em>readme.md</em> file (albeit, with different style and color).</p>
<pre><code class="language-bash">$ git branch -d test
error: Cannot delete branch 'test' checked out at 'C:/Users/Amir/git/foo'
</code></pre>
<p><strong>1.</strong> <a href="https://www.cdslab.org/recipes/programming/version-control-using-git-github/version-control-using-git-github" target="_blank">Version-control using Git and GitHub</a>.</p>
<p><a href="http:/DSP2019F/homework/1-version-control-using-git-github">Homework 1: Version Control Using Git and Github</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on September 04, 2019.</p><![CDATA[Announcement 1: Assessing your programming knowledge and interests]]>http:/DSP2019F/announcement/1-assessing-your-knowledge-and-interests2019-08-22T00:00:00-05:002019-08-22T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>The goal of this survey is to assess your prior programming experience and to identify the favorite programming language and Data Science topics for this class, as well as the language and topics that your advisor deems essential for your research. Ask Amir to send you a link to the survey if you have not already received it.</p>
<h2 id="survey-results-as-of-august-22-2019">Survey results as of August 22, 2019</h2>
<p>The following are the summaries of the responses to the survey questions. The total number of survey respondents is 10. It appears that at least half of the class knows at least one programming language at some elementary level. It also appears that almost everyone in this class prefers Python to other choices for programming.</p>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q1.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q2.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q3.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q4.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q5.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q6.png" width="100%" />
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<img src="http:/DSP2019F/announcement/initial-survey/Q7.png" width="100%" />
</figure>
</div>
<p><br /></p>
<p><a href="http:/DSP2019F/announcement/1-assessing-your-knowledge-and-interests">Announcement 1: Assessing your programming knowledge and interests</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on August 22, 2019.</p><![CDATA[Announcement 0: Student-professor connection day]]>http:/DSP2019F/announcement/0-student-professor-connection-day2019-08-22T00:00:00-05:002019-08-22T00:00:00-05:00Amir Shahmoradihttp:/DSP2019Fshahmoradi@utexas.edu
<p>On the first day of our class, we will try to get to know each other and I will attempt to describe my research work and educational background for you, as well as what we should expect from this course. Then I will present the results of the survey that I sent out to you a week ago to assess your programming knowledge, your favorite programming language, and the programming language that you would need for your research. Based on the survey results and your feedback in class, we will decide on the choice of language and the design of this course.</p>
<div class="post_toc"></div>
<h2 id="about-me-amir-the-instructor">About me, Amir, the instructor</h2>
<p>I am a physicist and researcher, and currently a faculty member at <a target="_blank" href="https://www.uta.edu/physics/">the Department of Physics</a> as well as the Data Science Program in <a target="_blank" href="https://www.uta.edu/science/index.php">The College of Science</a> at <a target="_blank" href="https://www.uta.edu/">The University of Texas at Arlington</a>. You can find more information about me, our group, and our research at <a target="_blank" href="https://www.cdslab.org">cdslab.org</a>. Here is a summary of my life in a few pictures:</p>
<p>I was introduced to the world of information and computer programming around 1991 by my father and elder brother. By the end of elementary school, I was impressed enough with, and knew enough about, computer software to write a few simple <a target="_blank" href="https://en.wikipedia.org/wiki/PC_game">computer games</a> in the <a target="_blank" href="https://en.wikipedia.org/wiki/QBasic">QBasic programming language</a> on our first family personal computer, an <a target="_blank" href="https://en.wikipedia.org/wiki/IBM_386SLC">IBM 386</a>. Here is an example of how computer games looked 30 years ago.</p>
<div class="center">
<div class="video-wrapper">
<div class="video-container">
<iframe width="560" height="315" src="https://www.youtube.com/embed/4TSF5sIgorA" frameborder="0" allowfullscreen=""></iframe>
<!-- <iframe width="853" height="480" src="https://www.youtube.com/embed/0XL8RNxzrdw?rel=0" frameborder="0" allowfullscreen></iframe> -->
</div>
</div>
</div>
<p><br /></p>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<a href="https://cdslaborg.github.io/connection/memoriesOfGreen.jpg" target="_blank">
<img src="https://cdslaborg.github.io/connection/memoriesOfGreen.jpg" width="100%" />
</a>
<figcaption>A portrait of me at high-school by my friends</figcaption>
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<a href="https://cdslaborg.github.io/connection/EinsteinSandals.jpg" target="_blank">
<img src="https://cdslaborg.github.io/connection/EinsteinSandals.jpg" width="100%" />
</a>
<figcaption>What I imagined I'd do as a physicist</figcaption>
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<a href="https://cdslaborg.github.io/connection/PhDdefense.gif" target="_blank">
<img src="https://cdslaborg.github.io/connection/PhDdefense.gif" width="100%" />
</a>
<figcaption>A scene from my Ph.D. defense</figcaption>
</figure>
</div>
<p>I never imagined that one day I would use computer programming for scientific purposes. Nevertheless, ever since I entered graduate school, there has hardly been a single day on which I have not used scientific programming for my work and research.</p>
<h2 id="my-research-topics">My research topics</h2>
<p>The following are a few examples of what I do nowadays as part of my scientific research at UT Austin.</p>
<h3 id="theoretical-astrophysics">Theoretical Astrophysics</h3>
<p>While my current focus of research is mathematical modelling of tumor growth and Monte Carlo samplers, I am and have been working in multiple branches of science and engineering for several years, from the subatomic world of <a target="_blank" href="https://en.wikipedia.org/wiki/Elementary_particle">elementary particles</a>, to the microscopic world of <a target="_blank" href="https://en.wikipedia.org/wiki/Macromolecule">biological macromolecules</a>, to <a target="_blank" href="https://en.wikipedia.org/wiki/Gamma-ray_burst">astrophysical phenomena</a> occurring on the grandest scales of the <a target="_blank" href="https://en.wikipedia.org/wiki/Observable_universe">observable Universe</a>.</p>
<p>For several years of my research, I have been working on understanding Gamma-Ray Bursts (GRBs) and their physics. Below is a movie of the moment a short-duration GRB is generated from the merger of a binary neutron star system.</p>
<div class="center">
<div class="video-wrapper">
<div class="video-container">
<iframe width="853" height="480" src="https://www.youtube.com/embed/P2ESs1rPO_A?rel=0" frameborder="0" allowfullscreen=""></iframe>
</div>
</div>
</div>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/astro_1.png" width="100%" />
</figure>
<p><br /></p>
<h3 id="theoretical-biology-bioinformatics">Theoretical Biology, Bioinformatics</h3>
<p>I have also worked for a few years in the field of bioinformatics and evolutionary biology. The overarching goal in the field of protein bioinformatics and biophysics is to understand how proteins fold into their unique structure, and what determines the stability of the protein <abbr title="3-Dimensional">3D</abbr> structure.</p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/bio_1.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/bio_2.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<p>One of the workhorses of this field is therefore <a href="https://en.wikipedia.org/wiki/Molecular_dynamics" target="_blank">molecular dynamics simulation</a>, which is used to probe the dynamics of proteins and their interactions with other molecules. The following is a 1.5 ns molecular dynamics simulation of the Human Influenza H1 Hemagglutinin protein (<a href="https://www.rcsb.org/pdb/explore.do?structureId=1rd8" target="_blank">1RD8</a>, chains AB).</p>
<div class="center">
<div class="video-wrapper">
<div class="video-container">
<iframe width="853" height="480" src="https://www.youtube.com/embed/0XL8RNxzrdw?rel=0" frameborder="0" allowfullscreen=""></iframe>
</div>
</div>
</div>
<p><br /></p>
<h3 id="petroleum-engineering">Petroleum Engineering</h3>
<figure>
<img src="https://cdslaborg.github.io/connection/petro_1.png" width="100%" />
</figure>
<p><br /></p>
<h3 id="computational-oncology">Computational Oncology</h3>
<p>What you see in the figures below is a representation of the growth of glioblastoma tumor cells in a rat’s brain over time.</p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/onco_1.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/onco_2.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<figure>
<img src="https://cdslaborg.github.io/connection/onco_3.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<h4 id="the-temporal-evolution-of-the-growth-of-glioma-tumor-in-rat">The temporal evolution of glioma tumor growth in a rat</h4>
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_1_t10.0.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/GBGlastLong.gif" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<!--
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_2_t12.0.png" width="100%">
</figure><br>
---
<br>
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_3_t14.0.png" width="100%">
</figure><br>
---
<br>
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_5_t16.0.png" width="100%">
</figure><br>
---
<br>
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_6_t18.0.png" width="100%">
</figure><br>
---
<br>
<figure>
<img src="https://cdslaborg.github.io/connection/tvccZSliceSubplotWithXYlabWithTB_rad_00gy_7_t20.0.png" width="100%">
</figure><br>
-->
<h3 id="monte-carlo-simulation-and-integration-methods">Monte Carlo Simulation and Integration Methods</h3>
<p>One of the fields on which my research is currently focused is the development of Monte Carlo optimizer/sampler and integrator algorithms for Bayesian inverse problems.</p>
<h4 id="development-of-monte-carlo-sampling-algorithms">Development of Monte Carlo sampling algorithms</h4>
<p>Below you see example animations of two <a href="https://en.wikipedia.org/wiki/Markov_chain_Monte_Carlo" target="_blank">Markov Chain Monte Carlo</a> (MCMC) samplers, both of which sample a double Gaussian-peak function, but with different MCMC sampling parameters.</p>
<figure>
<img src="https://cdslaborg.github.io/connection/PDF_RS_H_Forever20ms.gif" width="100%" />
<figcaption style="text-align:center">Example of a high-efficiency, but bad-mixing MCMC sampler.</figcaption>
</figure>
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/PDF_RS_L_Forever20ms.gif" width="100%" />
<figcaption style="text-align:center">Example of a low-efficiency, but good-mixing MCMC sampler.</figcaption>
</figure>
<p><br /></p>
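<p>The trade-off in the two animations can be sketched with a basic random-walk Metropolis sampler. This is only an illustrative sketch, not the sampler used to produce the animations: the function names <code>target</code> and <code>metropolis</code>, the peak locations at $\pm3$, and the two step sizes are all assumptions, and "efficiency" is taken here to mean the fraction of accepted proposals.</p>
<pre><code class="language-python">import math
import random

def target(x):
    """Unnormalized double Gaussian-peak density with assumed modes at -3 and +3."""
    return math.exp(-0.5 * (x + 3.0) ** 2) + math.exp(-0.5 * (x - 3.0) ** 2)

def metropolis(n_steps, step_size, seed=0):
    """Random-walk Metropolis sampler; returns the chain and its acceptance rate."""
    rng = random.Random(seed)
    x = -3.0  # start on the left peak
    chain, accepted = [], 0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # Accept with probability min(1, target(proposal) / target(x)).
        if target(proposal) >= rng.random() * target(x):
            x = proposal
            accepted += 1
        chain.append(x)
    return chain, accepted / n_steps

# Small steps: almost every proposal is accepted, but the chain rarely crosses
# the valley between the peaks. Large steps: many rejections, frequent jumps.
for step_size in (0.1, 5.0):
    chain, rate = metropolis(20000, step_size)
    right_fraction = sum(1 for value in chain if value > 0.0) / len(chain)
    print(f"step={step_size}: acceptance={rate:.2f}, time on right peak={right_fraction:.2f}")
</code></pre>
<p>With the small step size the acceptance rate is close to one, yet the chain tends to stay near the peak where it started; with the large step size most proposals are rejected, but both peaks are visited. The exact numbers depend on the assumed seed and step sizes.</p>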
<h4 id="development-of-monte-carlo-integration-algorithms">Development of Monte Carlo integration algorithms</h4>
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/DRI.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/DLI.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/GR3D.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/GR2D.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/GR2D.gif" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/EB3D.png" width="100%" />
</figure>
<p><br /></p>
<hr />
<p><br /></p>
<figure>
<img src="https://cdslaborg.github.io/connection/EB2D.gif" width="100%" />
</figure>
<p><br /></p>
<h4 id="biomedical-data-science">Biomedical Data Science</h4>
<p><br /></p>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<a href="https://cdslaborg.github.io/connection/PT509_ST9501_SE00144_ADC0016.png" target="_blank">
<img src="https://cdslaborg.github.io/connection/PT509_ST9501_SE00144_ADC0016.png" width="100%" />
</a>
</figure>
</div>
<!-- include.path must be given relative to site.url, which is the project's root directory -->
<div style="display:block;text-align:center;margin-right:auto;margin-left:auto">
<figure>
<a href="https://cdslaborg.github.io/connection/PT509_ST9501_SE00144_ADC0016_overlay.png" target="_blank">
<img src="https://cdslaborg.github.io/connection/PT509_ST9501_SE00144_ADC0016_overlay.png" width="100%" />
</a>
</figure>
</div>
<p><br /><br /></p>
<p><a href="http:/DSP2019F/announcement/0-student-professor-connection-day">Announcement 0: Student-professor connection day</a> was originally published by Amir Shahmoradi at <a href="http:/DSP2019F">PHYS 5391 - Fall 2019 - TTH 14:00-13:30 - Life Sciences Building LS 428</a> on August 22, 2019.</p>