What is the genetic code and how does it work?
No matter how much morphological diversity we living beings display, we are all united under the same umbrella: our basic functional unit is the cell. A living being whose entire structure consists of a single cell is known as unicellular (as with protozoa or bacteria), while those of us made of many cells (from a few hundred to trillions) are multicellular beings.
Thus, every organism is built from cells, which is why some molecular entities such as viruses are not considered strictly “alive” from a biological point of view. In turn, one study estimated that a single cell (a yeast cell, in the study in question) contains a whopping 42 million protein molecules. It is therefore not surprising that proteins are estimated to make up about 50% of the dry weight of living tissues.
Why do we provide all this seemingly unrelated data? Today we set out to unravel the secret of life: the genetic code. As mysterious as it may seem at first glance, we assure you that you will grasp the concept quickly. It all comes down to cells, proteins, and DNA. Read on to find out.
What is the genetic code?
Let's start clearly and concisely: the genetic code is nothing more than the set of instructions that tells the cell how to make a specific protein. We have already said that proteins are the essential structural unit of living tissues, so this is no trivial matter: without proteins there is no life, it is that simple.
The characteristics of the genetic code were established in 1961 by Francis Crick, Sydney Brenner, and other collaborating molecular biologists. The concept rests on a series of premises, but first we must clarify certain terms in order to understand them. Here they are:
- DNA: nucleic acid that contains the genetic instructions used in the development and functioning of all existing living organisms.
- RNA: nucleic acid that performs various functions, including directing the intermediate stages of protein synthesis.
- Nucleotides: the organic molecules that, together, give rise to the DNA and RNA chains of living beings.
- Codon or triplet: every 3 nucleotides in an RNA chain form a codon, that is, a triplet of genetic information.
- Amino acids: organic molecules that, in a certain order, give rise to proteins. 20 amino acids are encoded in the genetic code.
The bases of the genetic code
Once we are clear about these very basic terms, it is time for us to explore the main features of the genetic code, established by Crick and his colleagues. These are the following:
- The code is organized in triplets or codons: every three nucleotides (codon or triplet) encodes an amino acid.
- The genetic code is degenerate: there are more triplets or codons than there are amino acids. This means that an amino acid is usually encoded by more than one triplet.
- The genetic code is not overlapping: a nucleotide only belongs to a single triplet. That is, a specific nucleotide is not in two codons at the same time.
- The reading is "without commas": without getting into overly complex terminology, this means that there are no "spaces" between the codons.
- The nuclear genetic code is universal: the same triplet in different species codes for the same amino acid.
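The triplet-based, non-overlapping, comma-free reading described in this list can be sketched in a few lines of Python. This is only an illustration under our own assumptions: the function name `split_into_codons` and the sample sequence are hypothetical and do not come from any library or from Crick's work.

```python
def split_into_codons(rna):
    """Split an RNA string into consecutive, non-overlapping triplets.

    Reading starts at the first letter and advances three letters at a time,
    so no nucleotide belongs to two codons and nothing separates the codons.
    Trailing letters that do not complete a triplet are ignored.
    """
    usable = len(rna) - len(rna) % 3
    return [rna[i:i + 3] for i in range(0, usable, 3)]


sample = "CCUCCACCGAUG"  # hypothetical sequence, chosen only for illustration
print(split_into_codons(sample))
```

Each letter of the sample ends up in exactly one triplet, which is what "not overlapping" and "without commas" mean in practice.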
Unraveling the genetic code
We already have the terminological bases and the theoretical pillars. Now it's time to put them into practice. First of all, each nucleotide is named with a letter determined by the nitrogenous base it carries. The nitrogenous bases are the following: adenine (A), cytosine (C), guanine (G), thymine (T), and uracil (U). Adenine, cytosine, and guanine occur in both DNA and RNA, while thymine is found only in DNA and uracil only in RNA. If you see the following, what do you think it means?
CCT
CCU
It is time to recall the terms described above. CCT is part of a DNA chain, that is, 3 nucleotides: two with the cytosine base and one with the thymine base. CCU, on the other hand, is a codon, since it is the DNA's genetic information “transcribed” into an RNA chain (hence the uracil where there was previously a thymine).
Thus, we can affirm that CCU is a codon that codes for the amino acid proline. As we have said before, the genetic code is degenerate, so proline is also encoded by other codons with different nucleotides: CCC, CCA, and CCG. In total, proline is encoded by 4 codons or triplets.
Note that the 4 codons are not all needed together to code for the amino acid; any one of them is enough. In general, amino acids are encoded by 2, 3, 4, or 6 different codons, except methionine and tryptophan, which have only one each.
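To make the degeneracy tangible, the sketch below builds the standard codon table in Python and groups the codons by amino acid. The names `CODON_TABLE` and `codons_for` are our own. The table uses the conventional one-letter amino acid codes (P for proline, M for methionine, W for tryptophan), and the `*` symbol marks the three codons that signal the end of a protein rather than an amino acid, a detail this article does not go into.

```python
from collections import defaultdict
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons, listed in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Group codons by the amino acid they encode.
codons_for = defaultdict(list)
for codon, amino_acid in CODON_TABLE.items():
    codons_for[amino_acid].append(codon)

print("Proline:", sorted(codons_for["P"]))
print("Methionine:", codons_for["M"])
print("Tryptophan:", codons_for["W"])
```

Looking up any one of the proline codons returns the same letter, `P`, which is exactly what "any of them is valid" means.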
Why so much complexity?
Let's do the math. If each codon consisted of only one nucleotide, only 4 different amino acids could be encoded. This would make protein synthesis as we know it impossible, since each protein is generally a chain of about 100-300 amino acids. Only 20 amino acids are included in the genetic code, but they can be arranged in different orders along the "assembly line" to give rise to the different proteins present in our tissues.
On the other hand, if each codon were made up of two nucleotides, the total number of possible "diplets" would be 16 (4 x 4). We would still fall short of the goal. Now, if each codon is made up of three nucleotides (as is the case), the number of possible combinations rises to 64 (4 x 4 x 4). Since there are 20 amino acids to encode, 64 codons are enough to cover every one of them and, on top of that, to offer several variants for most of them.
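The counting argument can be checked with a short Python sketch that enumerates every possible "word" of one, two, and three letters over the four RNA bases. The helper name `possible_codons` is hypothetical and used only for this illustration.

```python
from itertools import product

BASES = "ACGU"
AMINO_ACIDS_TO_ENCODE = 20


def possible_codons(length):
    """Return every sequence of the given length over the four RNA bases."""
    return ["".join(letters) for letters in product(BASES, repeat=length)]


for length in (1, 2, 3):
    count = len(possible_codons(length))
    verdict = "enough" if count >= AMINO_ACIDS_TO_ENCODE else "not enough"
    print(f"{length} nucleotide(s) per codon: {count} combinations ({verdict})")
```

Three is the smallest codon length at which the number of combinations reaches the 20 needed.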
An applied look
Space is running short, and it is truly hard to pack so much information into a few lines. Follow us through the diagram below, because we promise that tying together all this terminology is much easier than it seems:
CCT (DNA) → CCU (RNA) → Proline (ribosome)
This small diagram expresses the following: cellular DNA contains the 3 nucleotides CCT, but it cannot “express” its genetic information directly, since it is isolated from the rest of the cellular machinery inside the nucleus. For this reason, the enzyme RNA polymerase is responsible for transcribing the DNA nucleotides into RNA nucleotides (a process known as transcription), which will form the messenger RNA.
Now we have the CCU codon in the messenger RNA, which travels out of the nucleus through its pores to the cytosol, where the ribosomes are located. In short, the messenger RNA delivers this information to the ribosome, which "understands" that the amino acid proline must be added to the amino acid sequence already built, to give rise to a specific protein.
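The two steps of the diagram can be mimicked in Python as a rough sketch. It follows the simplified convention used in this article, in which the DNA triplet is rewritten letter by letter with U in place of T; in the cell, RNA polymerase actually builds the RNA as the complement of the opposite (template) DNA strand, which gives the same result. The function names and the deliberately partial lookup, which covers only the four proline codons mentioned earlier, are our own assumptions.

```python
# Partial lookup: only the four proline codons named in the article.
PROLINE_CODONS = {"CCU", "CCC", "CCA", "CCG"}


def transcribe(dna):
    """Rewrite a DNA triplet in the RNA alphabet (T becomes U)."""
    return dna.upper().replace("T", "U")


def translate_codon(codon):
    """Return the amino acid for a codon, if this partial sketch knows it."""
    if codon in PROLINE_CODONS:
        return "proline"
    return "not covered by this sketch"


mrna_codon = transcribe("CCT")
print("DNA CCT -> RNA", mrna_codon, "->", translate_codon(mrna_codon))
```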
As we have said before, a protein is made up of about 100-300 amino acids. Thus, a protein of about 300 amino acids will be encoded by a total of 300 triplets (one per amino acid) or, if you prefer, by 900 nucleotides (300 x 3). Now, imagine the letters of each of those 900 nucleotides, something like: AAAUCCCCGGUGAUUUAUAAGG (...) It is this arrangement, this succession of letters, that really makes up the genetic code. Easier than it first seemed, right?
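As a sketch of how such a string of letters is read, the Python snippet below translates the sample fragment codon by codon using the standard codon table. Two things are our own assumptions: that reading starts at the very first letter of the fragment, and that translation halts at a stop codon (marked `*`), a signal the article does not discuss. Amino acids are shown with their conventional one-letter codes, and the name `translate` is hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" is a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(triplet): amino_acid
    for triplet, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(rna):
    """Translate an RNA string, three letters at a time, from its first letter."""
    protein = []
    usable = len(rna) - len(rna) % 3  # ignore an incomplete final triplet
    for i in range(0, usable, 3):
        amino_acid = CODON_TABLE[rna[i:i + 3]]
        if amino_acid == "*":  # stop codon: the chain ends here
            break
        protein.append(amino_acid)
    return "".join(protein)


fragment = "AAAUCCCCGGUGAUUUAUAAGG"  # the sample fragment from the text
print(translate(fragment))
```

The 22 letters of the fragment contain 7 complete codons, so the result is a chain of 7 amino acids; the final G is left over because it does not complete a triplet.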
Summary
If you ask any biologist interested in molecular biology about the genetic code, you are sure to be in for a 4-5 hour conversation. It is truly fascinating that the secret of life, unreal as it may seem, is contained in a specific succession of "letters".
Thus, the genome of any living being can be written out with these 4 letters. For example, according to the Human Genome Project, the genetic information of our species consists of about 3 billion base pairs (pairs of nucleotides), which are found on the 23 pairs of chromosomes within the nucleus of our cells. Indeed, no matter how different living beings are, we all share a common “language”.
Bibliographic references:
- What is the genetic code? genotipia.com. Retrieved from: https://genotipia.com/codigo-genetico/
- Asimov, I., & de la Fuente, A. M. (1982). The genetic code. Plaza & Janés.
- Genetic code, National Human Genome Research Institute. Retrieved from: https://www.genome.gov/es/genetics-glossary/Codigo-genetico
- Genetic code: characteristics and deciphering, Complutense University of Madrid (UCM). Retrieved from: https://www.ucm.es/data/cont/media/www/pag-56185/08-C%C3%B3digo%20Gen%C3%A9tico-caracter%C3%ADsticas%20y%20desciframiento.pdf
- The Genetic Code, Khanacademy.org. Retrieved from: https://es.khanacademy.org/science/ap-biology/gene-expression-and-regulation/translation/a/the-genetic-code-discovery-and-properties
- It's official: there are 42 million protein molecules in every cell, europapress.com. Retrieved from: https://www.europapress.es/ciencia/laboratorio/noticia-oficial-hay-42-millones-moleculas-proteina-cada-celula-20180117181506.html
- Lee, T. F. (1994). The Human Genome Project: breaking the genetic code of life.