Scientific
Computing: Algorithm and Analysis - Part One
Over the
last decade and some, the life sciences have seen an exponential growth due to,
in a large part, to the methods employed in modelling, analysis, and
simulation. What was once the domain of the engineering sciences is now
likewise the province of bioinformatics and biophysics. And the software used
by aeronautics engineers is now applied by bioengineers to study molecular
dynamics simulations and their role in exploring the synthesis of large
molecules. Physical and chemical processes are understood more clearly today
owing to the application of algorithms and computer languages open to practically
every biological systems analyst. We would not have an implantable cardiac
defibrillator were it not for the biomedical engineers, clinicians and
researchers working together as a team. In the past, research was reserved for
the university and large institutions which could afford the resources, and were
either often gifted with large grants e.g. by the government, a drug company,
or by a foundation.
In addition,
during the last decade or so, Health IT is making the electronic medical record
system available to clinicians and hospitals. This has improved patient
management in many cases. The OR (operating room) has seen PCs and mobile
devices used by physicians to view x-rays, and even robotics guided surgical
equipment is being used, for example, by the OB-GYN to extirpate a uterine
mass. We are also witnessing the increasing use of nanotechnology to treat
cancer and other difficult cases.
All of the
preceding scenarios involve the use of algorithms and applications. Some use
low-level languages, while some use high-level ones. All require writing a
program whether for an embedded system, such as in robotics, or a program for
use in chemometrics which is an interdisciplinary field that combines statistics
and chemistry.
ALGORITHMS.
Several
books have been written on the description, nature, design, analysis, language
specifics, and optimization of algorithms. What is an algorithm? An algorithm
is a sequence of well-defined instructions that takes a value or set of values
entered as input into a computer (or other programmable device) to obtain a
specific result or group of results, as output.[1] An algorithm then is a “computable” solution
to a problem.
An example algorithm is Euclid’s
algorithm which calculates the greatest common divisor (GCD) of two natural numbers a and b.
The greatest common divisor g is the largest natural number that divides
both a and b without leaving a remainder. The identity bases on gcd(a, b) = gcd(a-b, b).
Synonyms for the GCD include the greatest common factor (GCF), the highest
common factor (HCF), and the greatest common measure (GCM). Euclid’s
algorithm, which computes the GCD of two integers, suffices to calculate the
GCD of arbitrarily many integers. To find the GCD of two numbers by this
algorithm, repeatedly replace the larger by subtracting the smaller from it
until the two numbers are equal. E.g. 132, 168 -> 132, 36 -> 96, 36 ->
60, 36 -> 24, 36 -> 24, 12 -> 12, 12 so the GCD of 132 and 168 is 12. The following block of code is design to express Euclid's algorithm.
For the following C++
code, you could use Code:Blocks available at this site: http://www.codeblocks.org/
C++ code for Euclid’s algorithm would be as follows:
START of
CODE
#include
<iostream.h>
// Fundamental idea of Euclid's algorithm (one of the oldest known algorithms)
// gcd(a,0) = a
// gcd(a,b) = gcd(b,a%b)
int gcd (int a, int b)
{
int temp;
while (b != 0)
{
temp = a % b;
a = b;
b = temp;
}
return(a);
}
int main ()
{
int x, y;
cout << "Enter two natural numbers: ";
cin >> x >> y;
cout << "gcd(" << x << ", " << y << ") = " << gcd(x,y) << endl;
return(0);
}
// Fundamental idea of Euclid's algorithm (one of the oldest known algorithms)
// gcd(a,0) = a
// gcd(a,b) = gcd(b,a%b)
int gcd (int a, int b)
{
int temp;
while (b != 0)
{
temp = a % b;
a = b;
b = temp;
}
return(a);
}
int main ()
{
int x, y;
cout << "Enter two natural numbers: ";
cin >> x >> y;
cout << "gcd(" << x << ", " << y << ") = " << gcd(x,y) << endl;
return(0);
}
END of
CODE
The binary GCD algorithm,
also known as Stein's algorithm, is an algorithm that computes the greatest common divisor of two non-negative integers. By replacing divisions
and multiplications with shifts, it gains more efficiency by using the binary
nature of computers, especially in embedded platforms that have no direct
processor support for division. For
large integers, a fast GCD algorithm
is Lehmer’s, an improvement on the simpler but slower Euclidean algorithm.
Example
Applications.
The problem
may be to compute for the interest on a time-deposit. Or, the problem may be to
sort all the city’s residents’ names (e.g. in alphabetical order, last name
first), addresses, phone numbers, and zip codes, for the purpose of publishing
a directory, for print or online access. Either way, the algorithm input values
may be numerical, or alphabetical, or both (alphanumeric). All this information
may then be stored in databases, for later retrieval.
Another
example of this process is the Human Genome Project (HGP) which has been able
to determine the sequence of chemical base pairs which make up DNA as well as identifying
and mapping the approximately 20,000 of the human genome. The genome was broken
into smaller pieces, approximately 150,000 base pairs in length. From the start
to completion of this project, it took approximately two decades to study.
Scientists have made strides in identifying particular genes that underlie a
large number of diseases, both genetic and acquired. This project was
officially completed in 2003, but the data continues to provide leads into the
origin and nature of certain illnesses. Now, it is evolving into public and
private ventures to provide individuals, with tissue samples, with a profile of
their DNA and their propensities to a certain a disease or diseases. This then
gives clinicians an insight into the possible means of either prevention or
treatment of the patient’s condition.
The same algorithm may be coded differently,
depending on the programming language chosen. And conversely, specific
languages may be “better” suited for a specific algorithm or purpose.
Languages of
Choice.
We’ll just take up a few
of these, since already known ones are now applied to biology, and new ones are
steadily being developed. The choice of language depends on at least 4 items:
- Familiarity. Previous exposure or work with a particular programming language, and its derivatives. Consider FORTRAN and its child, BASIC. Then we would have Visual Basic, Visual Basic for Applications (VBA), True Basic, etc.
- Ease of learning and use. Since Theory and Experiment are melding, the ability to express one’s ideas (or hypothesis) depends on how rapidly the scientists can develop a program or module to test it. At this point there has to be a computer programmer, or a team member who is adept at this craft. This is where the algorithms are tested, possibly with computer generated “normal” test data first to make sure the algorithms make sense. Once you have actual subject data (animal model or human patient), you can run the actual findings in the same computer program in place of the test data.
- Time factor. From protocol design to analysis of results, the ability of the research team to come up with the study publication depends, in no small way, on the effort to learn and apply technology in less time than previously known. The pressure is on to produce a significant outcome. By this I mean, whether the results are positive or negative, it must sensibly add and improve on previously held concepts.
- Cost. The team head and his colleagues must have a firm grasp on the financial needs of the experiment itself. If it’s tight on a budget, then it has to consider using free open source software, or usually, build a program. If there is a free software package that meets its requirements, then you’re good to go. Otherwise, you’ll have to shop around for cost-effective software.
1.) Adapted from: Introduction to Algorithms, 3rd
Edition; T.H.Cormen, C.E. Leiserson, R.L.Rivest, C. Stein. ©2009 Massachussets
Institute of Technology
Happy coding.
Fernando Yaakov Lalana, M.D.
------------------------------------------------------------
Sites you
may want to visit:
Applications Websites
Applications Websites
QResearch:
Scientific
Computing:
Empower 3 Chromatography:
Databases
Science
Direct:
Bioinformatics
Factsheet:
Entrez, The
Life Sciences Search Engine:
http://www.ncbi.nlm.nih.gov/sites/gquery
Application
to try out…
Find the
Greatest Common Divisor [Web App]:
VSEncryptor
64-bit [Encryption Software, Windows Free,]:
http://download.cnet.com/VSEncryptor-64-Bit/3000-2092_4-75629878.html?tag=mncol;2
No comments:
Post a Comment