The field of human genetics
has seen many breakthroughs in the past few
decades. Thanks to the Human Genome Project, we now have a nearly
complete sequence of the
DNA that makes us human. And thanks to other
researchers, we now understand many of the details of how our genes
create human cells.
Of course, as often happens in science, learning
more about our genetics has raised as many
new questions as it has
answered. There are still some enormous gaps
in our understanding of DNA-- how it works,
and how it came to be.
Let's take a look at some of the currently unsolved riddles.
From Soup to DNA
The first question is simple-- how did DNA ever manage to come
into existence, in the first place? It's a long and complex
molecule, and extremely unlikely to have
formed spontaneously on its own, back in the days of the primordial soup.
Even worse, DNA is useless without a complex
set of proteins that synthesize it, duplicate it, and then ‘read’ it.
Those proteins, in turn, couldn't have formed without the DNA genes
that now produce them. It's a 'chicken
and egg' problem… so, which came first, the proteins
or the DNA?
It is not a problem of raw materials, since
there is strong evidence that the Earth of
4 billion years ago contained the chemical
ingredients for life. Amino acids, nucleobases
and other 'life friendly' materials have been detected in
meteorites and comets, and simpler organic molecules in interstellar space, and it's
also likely that they were synthesized on
Earth by various natural processes.
However, trying to find a pathway from the
primordial soup of Oparin and Haldane
to the formation of DNA strands
is not so easy. Scientists have proposed
many theories for the early origins
of life-- from Darwin's 'warm
little pond',
to the currently popular 'RNA world'. But so far, nobody
has described a full set of chemical
steps capable of making the jump
from chaos to living organisms.
As a possible answer, the first half of this
book will explore a sequence of early, self-replicating
molecules that bridge the gap from random
organic soup to life. They will start
simple enough that they could have formed
on their own, even in a world with no true
'natural selection'. Then they gradually
become living cells with modern DNA.
We will use some parts of the 'RNA world' theory, but
we'll offer a clearer explanation for the formation of the
first genetic chains, and their eventual
conversion to the DNA that we have today.
Introns and Protein Positioning
Another mystery of modern genetics is the
presence of introns-- lengths of non-coding
DNA that are smack in the middle of almost
every gene. Cells remove the introns from the RNA transcript before
it is translated into protein-- either via complex enzymes
called spliceosomes, or by a clever 'self-splicing' action
that lets introns act as a catalyst for their own removal.
To put it mildly, this is a completely ridiculous
system! Why on earth would living organisms
go to so much trouble to muck up their genetic
code with such a Rube Goldbergian contraption?
Since introns are present in nearly all living
organisms, the only reasonable conclusion
is that they serve some extremely important
function-- one that
must be just as vital as protein-coding. What could it possibly be?
When we explore the early evolution of DNA,
we'll see a possible answer, since it's likely that the
earliest genetic chains were used for much more than just genetic
code. We'll look at some ways in which our genetic material
could have directly helped proteins to function, particularly in
the very early days of life's evolution. As it happens, some
of those functions are still necessary, and those 'legacy' uses
for RNA are preserved in the introns.
A large part of the story to come is about
the coevolution of regular protein-coding
genes, and the other types of genetic material
that help them to work more effectively.
Gene Count and Satellite DNA
Another mystery that has come out of the
Human Genome Project is the fact that humans
have far fewer genes than was originally
expected. In fact, our gene count is about
the same as a fruit fly's, and not much more
than that of a simple 959-cell roundworm. How can that be possible?
Intuition would suggest that there are simply
too many pieces to a human being, to be specified
by a mere 23,000 genes. After
all, our body contains dozens of organs,
each built from hundreds of different cell
types. Each of those cells contains hundreds
of specialized structures. If you do the
math, that ought to mean hundreds
of thousands of genes, merely to
specify our cells and organs.
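To make that back-of-the-envelope math concrete, here is a small Python sketch. The three counts are illustrative assumptions, picked only to stand in for the 'dozens' and 'hundreds' mentioned above; they are not measured values. The one-gene-per-structure rule is simply the naive intuition being described, not a claim about how genes really work.

```python
# Naive estimate: one gene for every distinct structure in the body.
# All three counts below are hypothetical placeholders, not measurements.
ORGANS = 24                  # assumed stand-in for "dozens of organs"
CELL_TYPES_PER_ORGAN = 200   # assumed stand-in for "hundreds of cell types"
STRUCTURES_PER_CELL = 200    # assumed stand-in for "hundreds of structures"
ACTUAL_GENE_COUNT = 23_000   # the figure quoted in the text


def naive_gene_estimate(organs, cell_types, structures):
    """Return the gene count if every structure needed its own gene."""
    return organs * cell_types * structures


estimate = naive_gene_estimate(ORGANS, CELL_TYPES_PER_ORGAN, STRUCTURES_PER_CELL)
ratio = estimate / ACTUAL_GENE_COUNT

print(f"Naive estimate: {estimate:,} genes")
print(f"Ratio of naive estimate to actual count: {ratio:.1f}")
```

Different guesses for the three counts will move the estimate up or down, but any reading of 'dozens' times 'hundreds' times 'hundreds' lands far above 23,000.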
Even worse, let's consider one of our more complicated organs-- our
brains. Humans have about 100 billion neuron
cells. Those cells are arranged just right so we can do some remarkable
and complex tricks.
For example, we are hard-wired so we can
recognize a familiar face in a crowd almost instantly, and then filter
their voice from a dozen
others when we talk to them. Or we can learn
how to look at patterns on a page, and translate them into abstract
concepts such as this one.
To accomplish such subtle tasks, our brains
need to have some very clever wiring, which
ought to require at least a few genes for
each bit of behavior. There are many thousands
of complex processing tasks that our brains
do routinely, so shouldn't
it take a huge number of genes to make that happen?
In fact, to make behavior more 'evolvable',
it would be very useful to have some
sort of 'programming language'
controlling neuron connections, so
a good connection that produces
a useful new behavior can actually
be passed along effectively to the next generation. It's hard
to imagine how any protein could
ever provide that sort of control.
On the other hand, protein-coding genes make
up less than 2% of our DNA. The remaining
98% is frequently very repetitive, and currently
without any known use. It is often called 'junk
DNA', or 'satellite DNA' when scientists are being polite.
Could there be a connection?
The small number of human genes is no longer
a mystery if satellite DNA also serves a
role in our genetics. In fact, it would be
quite logical if the extra 98% of our genome just
happened to contain 98% of our genetic coding.
The human genome includes approximately 2
million distinct chunks of satellite DNA
that are each enclosed within a 'jumping gene', or transposon. Altogether they make
up more than 40% of our DNA. If each piece of satellite DNA functions
as a gene, then we'd have plenty of extra information for coding
the myriad details of our complex bodies.
But, how could such simple, repetitive pieces of genetic code ever
manage to do that?
In the last half of this book we'll talk about the evolution
of genetic scripts, stored in introns and
satellite DNA-- first as a way to manage the size, content and position
of structures within
cells, and then as a way to specify the details
of tissues and organs in multi-cellular organisms.
By their gross appearance, these scripts
are much simpler than the protein-coding
portions of our genome, but they still carry
extremely valuable information. From an evolutionary
point of view, they act just like genes,
but in a more specialized way.
As it will turn out, satellite DNA is probably
the most important part of our genome, and
well deserving of its dominant presence in
our DNA. You might even say that it's
far more clever than the protein-coding parts, since it is data for
a much more complex 'programming language' than the one
that turns DNA into proteins.
Speed of Evolution and Transposons
One final problem in modern genetics is the
speed and reliability of evolution itself.
Simply put, nearly all random changes in
a protein-coding gene will be either ineffective,
or lethal. Only an extremely small percentage
of mutations will ever result in an improvement.
Because of that inefficiency, it seems necessary
that there be either an extremely high mutation
rate, or an extremely slow rate of evolutionary change.
And yet, in practice, even very complex organisms
are able to create functional offspring with
more than a 90% success rate, and species
still manage to evolve quickly enough that they
can handle all but the most cataclysmic of changes.
A simple phenomenon called 'replication slippage' provides
most of the answer. It randomly changes the
length of repetitive DNA in the genome, making some scripts a bit
longer in each generation,
and some shorter. That is a safe way to make
minor changes in the details of multi-cellular creatures, so a species
can ease into the
optimum size, shape and position for each structure.
In fact, most evolutionary change in complex
organisms probably happens via script changes,
or by mutations in the ID sequences that
pick which script to use.
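The idea of slippage nudging a repetitive stretch a little longer or shorter in each generation can be pictured with a toy simulation. This is only a sketch under stated assumptions: the 10% slip chance, the one-repeat step size, and the starting length of 20 repeats are all made-up values for illustration, and the function names are hypothetical. It models nothing more than a random walk in repeat count.

```python
import random


def slip(repeat_count, rng, slip_chance=0.1):
    """Return the repeat count after one generation of possible slippage.

    slip_chance is an assumed, illustrative probability, not a measured rate.
    """
    if rng.random() < slip_chance:
        # Slippage adds or removes a single repeat with equal odds.
        repeat_count += rng.choice((-1, 1))
    # A repeat region cannot shrink below one copy in this toy model.
    return max(1, repeat_count)


def simulate(start_count, generations, seed=0):
    """Track the repeat count of one stretch of DNA across generations."""
    rng = random.Random(seed)  # seeded so the run is repeatable
    history = [start_count]
    for _ in range(generations):
        history.append(slip(history[-1], rng))
    return history


history = simulate(start_count=20, generations=1000)
print("start:", history[0], "end:", history[-1])
print("shortest:", min(history), "longest:", max(history))
```

Each generation differs from the previous one by at most a single repeat, which is the 'safe', gradual kind of change described above. Selection is deliberately left out, so the sketch shows only the source of variation, not how a species settles on an optimum.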
A more dramatic part of the answer may result
from transposons, or 'jumping genes', which sometimes
make more or less random changes to our genetic material. That might
seem like a very bad idea-- unless it somehow increases the
chances for beneficial changes in some 'directed' way.
In the last half of this book, we'll talk about the use of
transposons as a 'delivery vehicle' for scripts stored
in the satellite DNA. Most likely, that is
their primary function, but it looks like they may have an important
fringe benefit, as well.
What if transposons are, in fact, a sophisticated
method to create large mutations with a higher
than usual probability of success? As we'll see, there are some fairly simple genetic 'tricks' that
DNA can use to greatly increase the effectiveness
of evolutionary changes, thanks to the scrambling effect of transposons.
It might make sense to consider transposons
as a way to cause 'smart' mutations that greatly increase
the odds of successful evolution. We'll talk more about their
evolutionary role near the end of this book.
What's to Come
You might say that this is a love story.
It starts with two molecules that just happened
to meet, about 4 billion years ago. There
was some interesting 'chemistry' between
them, so they spent some time together, and eventually created Life.
It's much like one of those multi-generation romance novels,
only the characters are way smaller, and
their bodice-ripping passion happens only on a molecular level.
We'll trace the birth of several generations of chemical characters,
starting from pure random 'primordial soup', and eventually
becoming self-replicating organisms, primitive cells, and then multi-cellular
organisms. Along the way, they'll also develop the modern system
of DNA transcription and protein translation.
Most of the characters in this story are
rather long and complicated molecules, despite
their sub-microscopic size. However, we'll treat them casually enough that they should
be understandable, even for readers who can't tell a pyrimidine
from a potato.
Our main story line is all about plot and
chemistry. However, if you'd like to have a gradual build up
to the main action, then you might want to start with the first three
chapters in the Appendix. They set the scene, by looking at conditions
on Earth four billion years ago. One chapter pays close attention
to tidal puddles and pools, since they are such a hospitable location
for our story to take place, and an excellent, romantic vacation
destination, besides. But you can skip those chapters completely
if you only want to read the 'juicy' parts!
We'll make many guesses and use plenty of imagination as we
attempt to 'reverse engineer' our DNA heritage. There
is no way to make firm conclusions, when speculating about conditions
4 billion years ago, or even 1 billion years ago. It's highly
unlikely that every single detail of this story will prove to be
correct. But with luck, maybe we'll manage to get a few things right.
Perhaps some of the insights in this story
will make it easier to understand how we
humans are created from our genes, and how those
genes came to be.