tags/sawsimunfolding disastershttp://blog.tremily.us//tags/sawsim/unfolding disastersikiwiki2012-02-22T16:17:31Zpysawsim.manager: parallelization-framework-agnostic testinghttp://blog.tremily.us//posts/pysawsim-manager/2010-10-27T20:03:35Z2010-10-27T19:59:04Z
<p><span class="infobox">
Available in a <a href="http://blog.tremily.us//tags/sawsim/../git/">git</a> repository.<br />
Repository: <a href="git://tremily.us/sawsim.git" rel="vcs-git" title="sawsim repository">sawsim</a><br />
Browsable repository: <a href="http://git.tremily.us/?p=sawsim.git" rel="vcs-git" title="sawsim repository">sawsim</a><br />
Author: W. Trevor King<br />
</span></p>
<p>Over the past week I've been working up a <a href="http://blog.tremily.us//tags/sawsim/../../posts/Python/">Python</a> framework for my
<a href="http://blog.tremily.us//tags/sawsim/../../posts/sawsim/">sawsim</a> simulator, to allow parallel execution on a variety of
systems. I think I've got the kinks worked out now, and <code>sawsim</code> (or
anything else that can subclass <code>pysawsim.manager.Job</code> can now be
transparently executed in parallel using any of</p>
<ul>
<li>Python's <a href="http://docs.python.org/library/threading.html">threading</a> module (the <code>thread</code> manager).</li>
<li>Python's <a href="http://docs.python.org/library/multiprocessing.html">multiprocessing</a> module (the <code>subproc</code> manager).</li>
<li><a href="http://blog.tremily.us//tags/sawsim/../../posts/MPI/">MPI</a> (via <a href="http://mpi4py.scipy.org/">mpi4py</a>, the <code>mpi</code> manager).</li>
<li><a href="http://blog.tremily.us//tags/sawsim/../../posts/PBS/">PBS</a> (via <a href="https://subtrac.sara.nl/oss/pbs_python">pbs_python</a>, the <code>pbs</code> manager).</li>
</ul>
<p>Most of the difficulty is in handling systems where not all of these
approaches are viable, (e.g. Python 2.5, so no <code>multiprocessing</code>
module, or an SMP host that wants to use <code>subproc</code> without installing
mpi4py or pbs_python). I also ran up against Python's <a href="http://bugs.python.org/issue5155">issue5155</a>
when I tried to test the <code>subproc</code> manager from within a:</p>
<pre><code>$ nosetests --with-doctest --doctest-tests --processes $N pysawsim
</code></pre>
<p>on a Python 2.6.2 system (it was fixed by 2.6.3).</p>
<p>Note that Python's <a href="http://docs.python.org/faq/library#can-t-we-get-rid-of-the-global-interpreter-lock">GIL restricts threads to a single core</a>. This
means that the <code>thread</code> manager is basically decorative, but I wrote
it anyway because it forms the basis of several other managers, and
it's the only manager that will definitely work on Python 2.5.</p>
sawsimhttp://blog.tremily.us//posts/sawsim/2012-02-22T16:17:31Z2010-10-23T18:03:10Z
<p><span class="infobox">
Available in a <a href="http://blog.tremily.us//tags/sawsim/../git/">git</a> repository.<br />
Repository: <a href="git://tremily.us/sawsim.git" rel="vcs-git" title="sawsim repository">sawsim</a><br />
Browsable repository: <a href="http://git.tremily.us/?p=sawsim.git" rel="vcs-git" title="sawsim repository">sawsim</a><br />
Author: W. Trevor King<br />
</span></p>
<h1>Introduction</h1>
<p>My <a href="http://blog.tremily.us//tags/sawsim/../../posts/Thesis/">thesis</a> project investigates protein unfolding via the
experimental technique of <a href="http://blog.tremily.us//tags/sawsim/../../posts/Force_spectroscopy/">force spectroscopy</a>. In force
spectroscopy, we mechanically stretch chains of proteins, usually by
pulling one end of the chain away from a surface with an <a href="http://en.wikipedia.org/wiki/Atomic_force_microscopy">AFM</a>.</p>
<p>For velocity clamp experiments (the simplest to carry out
experimentally), the experiments produce "sawtooth" force-displacement
curves. As the protein stretches, the tension increases. At some
point, a protein domain unfolds, increasing the total length of the
chain and relaxing the tension. As we continue to stretch the
protein, we see a series of unfolding peaks. The <span class="createlink">GPLed</span>
program <a href="http://blog.tremily.us//tags/sawsim/../../posts/Hooke/">Hooke</a> analyzes the sawtooth curves and extracts lists of
unfolding forces.</p>
<p>Lists of unfolding forces are not particularly interesting by
themselves. The most common approach for extracting some physical
insights from the unfolding curves is to take a guess at an
explanatory model and check the predicted behavior of the model
against the measured behavior of the protein. If the model does a
good job of explaining the protein behavior, it might be what's
actually going on behind the scenes. Sawsim is my (<a href="http://dx.doi.org/10.1016/j.ijbiomac.2009.12.001">published</a>!)
tool for simulating force spectroscopy experiments and matching the
simulations to experimental results.</p>
<p>The main benefits of sawsim are its ability to simulate systems with
arbitrary numbers of states (see the <span class="createlink">manual</span>) and to
easily compare the simulated data with experimental values. The
following figure shows a long valley of reasonable fits to some
ubiquitin unfolding data. See the IJBM paper (linked above) for more
details.</p>
<p><a href="http://blog.tremily.us//tags/sawsim/../../posts/sawsim/fit-space.png"><img src="http://blog.tremily.us//tags/sawsim/../../posts/sawsim/fit-space.png" width="1063" height="709" alt="Fit space Surface bump for photodiode sensitivity" title="Surface bump for photodiode sensitivity" class="img" /></a></p>
<h1>Getting started</h1>
<p>Sawsim should run anywhere you have a C compiler and Python 2.5+.
I've tested it on Gentoo and Debian, and I've got an ebuild in my
<a href="http://blog.tremily.us//tags/sawsim/../../posts/Gentoo_overlay/">Gentoo overlay</a>. It should also run fine on Windows, etc., but I
don't have access to any Windows boxes with a C compiler, so I haven't
tested that (<a href="http://blog.tremily.us//tags/sawsim/../../contact/">email me</a> if you have access to such a machine
and want to try installing Sawsim).</p>
<p>See the <span class="createlink">README</span>, <span class="createlink">manual</span>, and <a href="http://pypi.python.org/pypi/pysawsim/">PyPI page</a> for
more details.</p>
Git and sawsimhttp://blog.tremily.us//posts/Git_and_sawsim/2010-11-17T14:23:57Z2008-09-01T02:19:56Z
<p>I recently learned about distributed versioning systems like <a href="http://blog.tremily.us//tags/sawsim/../../posts/Git/">Git</a>,
<a href="http://www.selenic.com/mercurial/wiki/">Mercurial</a>, <a href="http://bazaar-vcs.org/">Bazaar</a>, etc., and I moved my <a href="http://blog.tremily.us//tags/sawsim/../../posts/sawsim/">sawsim</a> development
to Git from Subversion (instructions at <a href="http://www.simplisticcomplexity.com/2008/03/05/cleanly-migrate-your-subversion-repository-to-a-git-repository/">Simplistic Complexity</a>).</p>
<p>The benefit of a distributed version control system for me is that it
removes the need for a central repository. That means that if Bob
pulls my code and makes some changes, and Alice pulls my code and
makes some other changes, Bob and Alice can get together and merge
their code without my knowledge/permission/whatever. I think this
will lead to more lively and effective code.</p>
<p>As for picking Git, I'm sure all would be sufficient for my needs, and
Git seemed more like <a href="http://importantshock.wordpress.com/2008/08/07/git-vs-mercurial/">my style</a>.</p>
Simulation progress IIhttp://blog.tremily.us//posts/Simulation_progress_II/2010-11-17T14:23:57Z2008-08-07T09:22:32Z
<p>Alright, my tensions now agree with Prof. Yang's simulated tensions
for 4-domain ubiquitin poly-proteins. Presumably our
Bell-model-predicted histograms will match up as well, but that will
have to wait until my jobs make it to the head of the queue on the
cluster. Yes, shocking as that may seem, someone else (Brad) is using
the cluster! I was getting spoiled having it all to myself. In the
last 16 days since Maui has been up we've been running just under 6%
efficiency. Wohoo ;). Good thing I didn't spend too much time tuning
for efficiency :p.</p>
Simulation progresshttp://blog.tremily.us//posts/Simulation_progress/2010-11-17T14:23:57Z2008-07-23T04:07:38Z
<p>Oh, I forgot to mention, but I've been making some changes to the
simulation unrelated to the testing. Well, some of them have been
related…</p>
<p>In the tension model utilities, you can now pick <em>x_min</em> and <em>x_max</em>
when plotting energy landscapes <em>E(x)</em> (that was for plotting Kramers’
cusp potential). You can also pick <em>F_max</em> when plotting <em>F(x)</em>.</p>
<p>I also ripped apart the tension-group idea and put it back together.
The initial implementation was a bit embarassing. Now you can do things
that tension groups are intended for, like having two seperate WLC groups
(for, say, folded- and unfolded-domains).</p>
<p>Subversion is lots of fun, by the way ;).</p>
Giving up on Gompertz theoryhttp://blog.tremily.us//posts/Giving_up_on_Gompertz_theory/2011-12-15T18:57:28Z2008-07-01T04:22:51Z
<p>I think I've spent enough time trying to find a nice analytic way to guess parameters for a Gompertz model fit to my unfolding probability densities.
I now have a heuristic which seems to work :p, and I suppose I'll be satisfied with that for the time being.</p>
<p>On to find out about analytic solutions to Kramers' unfolding rates.</p>
<p>Update: I figured out how to use the <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda366g.htm">NIST reference</a> while writing
my <a href="http://blog.tremily.us//tags/sawsim/../../posts/sawsim/">sawsim</a> paper, listing the mean and standard deviation of the
Gumbel distribution. So many names… Anyway, the <code>pysawsim</code> tests
now use the <a href="http://git.tremily.us/?p=sawsim.git;a=blob;f=pysawsim/test/bell_rate.py;hb=837c425eaeccae280cc7f7784d03dfcfcb03678c#l106">improved guessing procedure</a>.</p>
Gompertz' paperhttp://blog.tremily.us//posts/Gompertz_paper/2010-11-17T13:23:24Z2008-07-01T02:05:40Z
<p>The natural logarithm (log base e) used to be called the <em>hyperbolic logarithm</em> (see <a href="http://www.humboldt.edu/~mef2/Presentations/Estimations.html">here</a>).
Other than the understandably old fasioned notation, Gompertz article is wonderfully written, so much so that I can follow all his math, even with the strange notation (although obviously I had to look hyperbolic logs up).
He makes a number of brief excursions to set up the problem, including a reference to "the great age of the patriarchs of scripture".
He even takes a break to clarify his exponent notation. I love this guy.</p>
<p>All the non-symbolic math was starting to confuse me, so I went back and Googled a bit more, turning up <a href="http://dx.doi.org/10.1016/j.amc.2003.08.086">this</a> paper which claims to help fit the Gompertz distribution, and even (gasp) uses the same Gompertz distribution that Gompertz does :p.
It even talks about probability density functions.
I dunno if I'm ready for another relationship yet though, I've been hurt so many times before ;).</p>
Gompertz/Gumbel distributionshttp://blog.tremily.us//posts/Gompertz-Gumbel_distributions/2010-11-17T14:23:57Z2008-07-01T00:39:51Z
<p><a href="http://www.statistik.uni-dortmund.de/de/content/forschung/publikationen/forschungsberichte/Life-Table-Forecasting.pdf">This paper</a> by Peter Pflaumer gives a bit more technical detail to the whole Gompertz/Gumbel lifespan thing.
I need to remember that my model works, I just need a stable way of estimating the model parameters from which to start optimizing a fit.
Pflaumer derives a formula for the mode, but it is still trancendental in the force scaling parameter.</p>
<p>Hmm, I should see what Garrett thinks of the <a href="http://dx.doi.org/10.1016/j.insmatheco.2006.02.012">economics</a> of this distribution, since that's what it was originally developed for.
He probably already knows all about it, why didn't I think to ask last weekend?
Ah, because I started this on Sunday and saw him on Saturday :p.</p>
<p>I though <a href="http://dx.doi.org/10.1016/j.cam.2003.12.030">this</a> aught to do it for me, but it seems their rate has an extra <code>-ln(P)</code> in it.
Rats.</p>
<p>Desperate times call for going back to <a href="http://www.jstor.org/stable/107756">Gompertz' original paper</a>.</p>
Gumbel/Fisher-Tippett distributionshttp://blog.tremily.us//posts/Gumbel-Fisher-Tippett_distributions/2010-11-17T14:36:33Z2008-06-30T01:10:54Z
<p>Aha, the probability of Bell-model unfolding under constant force-loading has a name!
It is a <a href="http://en.wikipedia.org/wiki/Fisher-Tippett_distribution">Fisher-Tippett distribution</a>, of which the Gumbel distribution is a particular type.
However, NIST refers to it as a <a href="http://www.itl.nist.gov/div898/handbook/eda/section3/eda366g.htm">minimum Gumbel distribution</a>.</p>
<p>Hmm, hopefully I'm not just confusing myself looking at the standardized form, let me go double check...
What is a <a href="">cumulative distribution function</a> anyway?
Ah, <code>CDF(x)</code> is just the probability that the variable will be <code><= x</code>, so the probability distribution function is given by <code>PDF(x) = -d(CDF)/dx</code>.</p>
<p>Alright, looks like my distribution is a bit different than the Fisher-Tippett because I need a non-unity a factor <code>a</code> in <code>PDF(x) = exp(-ax/b)*exp[-exp(x/b)]</code> with <code>z := exp(-x/b)</code>.
Basically, I have a Fisher-Tippett distribution with a poorly scaled <code>x</code>, but I don't know how to rescale <code>x</code> until I've fit my distribution.
So the search continues...</p>
<p>The <a href="http://en.wikipedia.org/wiki/Gompertz-Makeham_law_of_mortality">Gompertz-Makeham Law</a> law for exponentially increasing failure rate is what I want.
But Wikipedia says this is the same as Fisher-Tippett with time inversion.</p>
<p>There is a nice discussion of aging in general, but not much math <a href="http://longevity-science.org/Failure-Models-2006.pdf">here</a>.</p>
<p>Update: Related posts:</p>
<ul>
<li><a href="http://blog.tremily.us//tags/sawsim/../../posts/Gompertz-Gumbel_distributions/">Gompertz-Gumbel distributions</a></li>
<li><a href="http://blog.tremily.us//tags/sawsim/../../posts/Gompertz_paper/">Gompertz paper'</a></li>
<li><a href="http://blog.tremily.us//tags/sawsim/../../posts/Giving_up_on_Gompertz_theory/">Giving up on Gompertz theory</a></li>
</ul>