IBM Research Findings

A team from IBM has identified patterns, or "motifs", that occur both in the non-coding regions of the human genome and in the regions that code for proteins. The findings were reported in the journal Proceedings of the National Academy of Sciences (PNAS).

Lead author Isidore Rigoutsos and his colleagues from IBM’s Thomas J. Watson Research Center used a mathematical technique known as pattern discovery to tease out patterns in the genome. This technique is often used to mine useful information from very large repositories of data held on mainframe computers.

They sifted through about six billion letters in the non-coding regions of the human genome and searched for repeating sequence fragments, or motifs. The paper suggests that so-called ‘junk’ DNA may not be so junky.


The double-stranded DNA molecule is held together by four chemical components called bases:

Adenine (A) bonds with Thymine (T)

Cytosine (C) bonds with Guanine (G)

Groupings of these "letters" form the "code of life". There are over 3 billion base pairs in the human genome, wound into 24 distinct bundles called chromosomes. Within the DNA are about 20,000 to 25,000 genes, which human cells use as starting templates to make proteins. These sophisticated molecules build and maintain our bodies.
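The pairing rule above (A with T, C with G) means that one strand of DNA fully determines its partner. A minimal Python sketch of that rule follows. The function names and the short example sequence are invented for illustration and do not come from the IBM paper.

```python
# Base-pairing rule: A bonds with T, C bonds with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the partner base for each letter of a DNA strand."""
    return "".join(PAIRS[base] for base in strand.upper())


def reverse_complement(strand: str) -> str:
    """Return the partner strand, read in its own direction."""
    return complement(strand)[::-1]


# Hypothetical four-letter example sequence.
print(complement("ATGC"))          # TACG
print(reverse_complement("ATGC"))  # GCAT
```

The reversal in the second function reflects the fact that the two strands of the double helix run in opposite directions.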

The researchers found millions of motifs in non-coding DNA. Roughly 128,000 of these also occurred in the coding region of the genome. They were also over-represented in genes which are involved in specific biological processes.

These processes include the regulation of transcription (the first step in the process that ultimately leads to the translation of the genetic code into a peptide or protein) and communication between cells.
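To make the idea of a motif search concrete, here is a deliberately simplified Python sketch. It is not the pattern discovery method the IBM team used. It only counts exact fixed-length fragments that repeat in one sequence and then checks which of them also appear in a second sequence. The function names, the fragment length and the toy sequences are all assumptions made for illustration.

```python
from collections import Counter


def repeated_fragments(sequence: str, k: int, min_count: int = 2) -> set:
    """Return every length-k fragment that occurs at least min_count times."""
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return {frag for frag, n in counts.items() if n >= min_count}


def shared_with_coding(noncoding: str, coding: str, k: int, min_count: int = 2) -> set:
    """Return repeated non-coding fragments that also occur in the coding sequence."""
    motifs = repeated_fragments(noncoding, k, min_count)
    coding_fragments = {coding[i:i + k] for i in range(len(coding) - k + 1)}
    return motifs & coding_fragments


# Toy sequences, invented for illustration only.
noncoding = "ACGTACGTTTACGA"
coding = "GGACGTCC"

print(sorted(repeated_fragments(noncoding, 4)))          # ['ACGT', 'TACG']
print(sorted(shared_with_coding(noncoding, coding, 4)))  # ['ACGT']
```

A real analysis would work on billions of letters and would allow for variable-length and inexact patterns, which this sketch does not attempt.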

The work by Dr Rigoutsos’s team suggests "a connection between a vast area of the genome we didn’t think was functional with the part of the genome we knew was functional".

Gene silencing

The paper suggests that the actual positioning of the motifs is associated with small RNA molecules that are involved in a process called Post-Transcriptional Gene Silencing (PTGS).

A human embryo starts out as a single fertilized cell, which divides into a highly complex series of cells that become a human being. Every cell in that human being contains the same complement of genes. What makes each cell different is the precise way that genes are turned on and off.

PTGS turns genes off after the process of transcription has taken place. One way in which this occurs is through "RNA interference", which involves the introduction of double-stranded RNA molecules. These trigger the degradation of another type of RNA molecule known as messenger RNA (mRNA), "down-regulating" the gene.

mRNA is the molecule that, once produced by transcription, carries information from genes to the sites of protein synthesis.

"These regions may indeed contain a structure that we haven’t seen before," said Dr Rigoutsos. "If indeed one of them corresponds to an active element that is involved in some kind of process, then the extent of cell process regulation that actually takes place is way beyond anything we have seen in the last decade."

Would it be worth your while to consider the consequences of this discovery by the IBM team, and how you could benefit from this understanding?

How about taking this information into consideration when applying appliqués, or some of the practices we offer, such as c2C and 3DAA?