Genes And Knowledge


Stripped to its core, a living organism is nothing more than the packet of information recorded in its genes. And yet, if the very essence of life is information, one has to wonder why a column of numbers or a line of words isn't alive. Obviously, when digits or letters are arranged in a particular sequence, they convey information. But just as clearly, information, in and of itself, is not alive.


说到底,生物只不过是一包记录在基因中的信息。但是,如果生命的基本要素是信息的话,人们不禁要问,为什么一列数字或一行文字却没有生命?显然,数字和字母按特定的顺序排列可传输信息;但是,同样也很清楚的是,信息本身却没有生命。

Genetic information is special because it alone can make copies of itself. This remarkable ability is the basis of all the other differences that distinguish the living from the nonliving. Even a crystal of table salt is a form of information. Its sodium and chlorine atoms are arranged in a precise order, but a salt crystal cannot duplicate itself. Of all the substances on earth, only DNA, the molecule that carries genetic information, can orchestrate its own replication.


遗传信息的特殊之处在于它可以独自自我复制。这一非凡的能力是区分生物和非生物的各种差别的基础。即使一颗食盐晶体也是一种信息,其中钠原子和氯原子按精确的顺序排列。但是,食盐晶体却不能自我复制。地球上所有物质中,唯有携带遗传信息的DNA(脱氧核糖核酸)能够实现自我复制。

DNA's capacity to self-copy, as well as its ability to encode information, stems from its peculiar shape. First described in 1953 by James Watson and Francis Crick, the structure of the DNA (deoxyribonucleic acid) molecule is a double helix, a shape that looks like a long ladder twisted into a corkscrew. Each rung is a letter in a chemical alphabet limited to just four symbols. Arranged in varying but exact sequences, incredibly long strings of these four letters spell out the instructions for building and operating all living things. Every organism that has ever lived on this planet, from the greatest dinosaurs to the tiniest viruses, is a product of information recorded in its own particular version of the DNA molecule.


DNA的自我复制能力和编码能力来源于其独特的形状。一九五三年,詹姆斯·沃森和佛朗西斯·克里克首次描述了DNA分子的结构。其形状为一个双螺旋,象是一个长梯子被拧成了开塞钻似的。每个梯级为有限的四个化学符号中的一个字母,这四个字母排列变化有序,由这些字母连成的极长的字母串给出形成生命和维持生命的指令。曾在地球上生活过的所有的生物,从最大的恐龙到最小的病毒,都是记录在其特殊的DNA分子中的信息的产物。

No one knows how nature happened to settle on a coding system of four symbols. The simplest possible way of recording information, called binary notation, needs just two symbols: 1 and 0. Each binary symbol conveys one binary digit, or bit, of information. Like a simple yes or no answer, a bit is the smallest fragment of information one can receive and still learn anything at all. All the information flowing through the circuits of digital computers is encoded in immensely long strings of binary 1s and 0s.


谁也不知道自然界为什么采用这一四个符号的编码系统。最简单的记录信息的方法,即二进制法,仅需两个符号——1和0。每个二进制符号表达一个二进制数位信息。如同一个简单的是或否的回答一样,一个数位是能表达一定意义的最小信息单元,数字计算机中所有的信息流都是由很长很长的一串串1和0的代码来表示的。

Whether a chunk of information happens to be recorded by the four symbols of DNA or the digits used by computers, the basics of information processing are much the same. Meaning is captured in a linear sequence of a few simple symbols arranged in a precise order. And even though it is somewhat more complicated than the binary system, DNA's method has worked for four billion years.


无论信息是用DNA的四个符号记录的或是用计算机上的二进制数码记录的,信息处理的基本原理确是相同的。要传递的信息包含在由排列有序的简单符号所组成的线性序列中。DNA的四符号编码方法虽然比二进制编码复杂,但它从诞生到现在已有四十亿年了。
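
The comparison between DNA's four letters and binary digits can be made concrete. Four symbols can be told apart with exactly two yes-or-no answers, so each DNA letter carries two bits. The sketch below is only an illustration: the particular assignment of bit pairs to letters and the function names `dna_to_bits` and `bits_to_dna` are assumptions made for this example, not anything nature or the text prescribes.

```python
# Hypothetical 2-bit code for the four DNA letters.
# The assignment is arbitrary; any one-to-one mapping would work.
BASE_TO_BITS = {"A": "00", "C": "01", "G": "10", "T": "11"}
BITS_TO_BASE = {bits: base for base, bits in BASE_TO_BITS.items()}


def dna_to_bits(sequence: str) -> str:
    """Encode a DNA string as binary digits, two digits per letter."""
    return "".join(BASE_TO_BITS[base] for base in sequence)


def bits_to_dna(bits: str) -> str:
    """Decode a bit string produced by dna_to_bits back into DNA letters."""
    if len(bits) % 2 != 0:
        raise ValueError("bit string length must be even")
    return "".join(BITS_TO_BASE[bits[i:i + 2]] for i in range(0, len(bits), 2))


if __name__ == "__main__":
    encoded = dna_to_bits("TGGAAGATC")
    print(encoded)               # nine letters become eighteen binary digits
    print(bits_to_dna(encoded))  # decoding recovers the original letters
```

Because the mapping is one-to-one, nothing is lost in either direction: the same message can be written in either alphabet, which is the sense in which the two systems are "much the same."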

The letters in the genetic code (A, T, G, and C) are read off in groups called codons, just as computers read 1s and 0s in groups called bytes. Each codon stands for one amino acid, the building block of proteins. For example, if DNA's letters are arranged in the order TGG AAG ATC, the first codon, TGG, will be interpreted by the cell's machinery to mean, "Place the amino acid tryptophan here." The next codon, AAG, codes for the amino acid lysine. And so on. One after another, like beads on a string, amino acids are assembled into the proteins that make up living tissue.


遗传编码的字母(A、T、G和C)是按密码子识读的,就象计算机识读二进制数中由1和0组成的字节一样。每个密码子代表一个氨基酸。氨基酸是组成蛋白质的最小结构单元。举例来说,如果DNA的排列顺序为TGG AAG ATC,那么,第一个密码子TGG就会被细胞机构解释为“在这里放一个色氨酸”。下一个AAG代表赖氨酸,等等。氨基酸象串珠上的珠子一样,一个接一个地被安放到构成生物组织的蛋白质中。
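
Reading codons amounts to a table lookup, which a few lines of code can illustrate. The sketch below is deliberately partial: its table holds only the three codons from the example sequence (TGG and AAG as stated in the text, plus ATC, which codes for isoleucine in the standard genetic code). The names `CODON_TABLE`, `split_codons`, and `translate` are hypothetical, chosen for this example.

```python
# Partial codon table: only the three codons from the example sequence.
# The full standard genetic code has 64 entries.
CODON_TABLE = {
    "TGG": "tryptophan",
    "AAG": "lysine",
    "ATC": "isoleucine",
}


def split_codons(sequence: str) -> list[str]:
    """Cut a DNA string into consecutive three-letter codons."""
    letters = sequence.replace(" ", "").upper()
    if len(letters) % 3 != 0:
        raise ValueError("sequence length must be a multiple of three")
    return [letters[i:i + 3] for i in range(0, len(letters), 3)]


def translate(sequence: str) -> list[str]:
    """Look up each codon in turn, like threading beads on a string."""
    # A codon missing from the partial table raises KeyError.
    return [CODON_TABLE[codon] for codon in split_codons(sequence)]


if __name__ == "__main__":
    print(translate("TGG AAG ATC"))
```

A real cell performs this lookup with molecular machinery rather than a dictionary, and works from an RNA copy of the gene; the sketch captures only the bookkeeping of reading three letters at a time.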

DNA's double-helix architecture also endows the molecule with the ability to make precise copies of the information it contains. Each rung in DNA's spiral ladder is actually formed by a pair of its chemical letters: adenine (A), thymine (T), guanine (G), and cytosine (C). A and T fit each other perfectly, as do G and C. Consequently, the four chemical letters always form just two kinds of rungs, AT and GC. With only two kinds of rungs, it might seem that DNA uses a two-symbol code. But, in biochemistry, physical orientation makes a difference. Viewed from the vantage point of one of the ladder's side rails, the TA rung is read as a T, while the same rung flipped over, AT, represents an A.


DNA的双螺旋结构还使得分子具有准确复制其本身所包含的信息的能力。螺旋梯上的每一个梯级实际上是由两个字母组成的。DNA结构中的四个化学字母分别为:腺嘌呤(A)、胸腺嘧啶(T)、鸟嘌呤(G)及胞嘧啶(C)。A和T可完美地对接,G和C也是如此。因此,这四个化学符号总是形成两个梯级AT和GC。由于只有两个梯级,看起来好象DNA用的是双符号系统。实际上在生物化学中,物理取向是非常重要的。从梯子的一个边框方向看去,TA梯级被识读成T,若将同一梯级翻过来,也就是AT,则被识读为A。

When a DNA molecule copies itself, its rungs split down the middle. Each A lets go of its T and each G releases its C. The side rails of the molecule zipper apart, and the spiral ladder becomes two separate spirals, each with severed half-rungs hanging free. Because A will only bond to T and G will only cling to C, the sequence of broken rungs on each of these half-molecules is a mirror image of the other. From the chemical soup floating around the replicating DNA, unattached letters link up with the mates that are still hanging to the side rails. When this process is completed, two new DNA molecules appear. Each is an exact replica of the parent molecule.

DNA分子复制时,其梯级从中间断开,每个A松开与之相连的T,每个G放开与之相连的C。于是,分子的两个边框就象拉链似地拉开。由于A只能和T连接,G只能和C连接,每个断开的半分子梯框互成镜像关系。这时,复制DNA周围的化学溶液中未结合的自由字母就与挂在半梯框上的配对字母相连接。当这一过程完成时,两个新的DNA molecules就形成了,每一个都是母分子的精确复制品。
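
The copying process described above can be imitated with strings. The following is a minimal sketch under simplifying assumptions: each side rail is modeled as a string of letters, letter i on one rail is taken to pair with letter i on the other, and the enzymes and the opposite chemical direction of the two real strands are ignored. The names `PAIR`, `complement`, `make_helix`, and `replicate` are hypothetical.

```python
# Pairing rule from the text: A bonds only to T, and G only to C.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(rail: str) -> str:
    """Return the letters that would bond to each letter of one side rail."""
    return "".join(PAIR[base] for base in rail)


def make_helix(rail: str) -> tuple[str, str]:
    """Build a two-rail 'ladder' from one rail and its matching partner."""
    return rail, complement(rail)


def replicate(helix: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Unzip a ladder and rebuild the missing half of each rail."""
    rail_a, rail_b = helix
    if complement(rail_a) != rail_b:
        raise ValueError("rails are not properly paired")
    # Each separated rail attracts free letters that match its half-rungs.
    daughter_1 = (rail_a, complement(rail_a))
    daughter_2 = (complement(rail_b), rail_b)
    return daughter_1, daughter_2


if __name__ == "__main__":
    parent = make_helix("TGGAAGATC")
    first, second = replicate(parent)
    print(parent)
    print(first == parent and second == parent)
```

Each daughter ladder keeps one rail from the parent and gains one newly assembled rail, yet both come out letter-for-letter identical to the parent, because the pairing rule leaves no choice about which letter fills each gap.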