FUNCTIONAL GENOMICS

Functional genomics determines the functions of genes on a large scale using high-throughput technologies, which analyse all the genes of a genome simultaneously. A prominent form of high-throughput analysis is transcriptome analysis: the expression analysis of the full set of RNA molecules produced by a cell under a given set of conditions. Functional genomics is a general approach to assigning biological functions to genes whose roles are currently unknown, in any organism. It also plays a role in novel drug discovery. Functional genomics is mostly experiment based. Transcriptome analysis facilitates the understanding of metabolic, regulatory and signaling pathways within the cell.

Expressed Sequence Tags (ESTs)

EST analysis is one of the high-throughput approaches to genome-wide profiling of gene expression. ESTs are short sequences obtained from complementary DNA (cDNA) clones, and they help in the identification of full-length genes. They are about 200–400 nucleotides long and are obtained from the 5′ end or 3′ end of the cDNA of interest. To generate EST data, libraries of cDNA clones are prepared, and clones are randomly selected from a library for sequencing from either end of the insert. EST data provide a rough estimate of which genes are actively expressed in a genome. This is because the frequency of a particular EST reflects the abundance of the corresponding mRNA in the cell and hence gives a picture of gene expression. Because cDNA clones are sequenced at random, EST analysis also helps to discover new genes. TIGR Gene Indices (www.tigr.org/tdb/tgi.shtml) is an EST database, and dbEST (http://www.ncbi.nlm.nih.gov/dbEst) is the EST database of GenBank (Figure 11.3).
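The link between EST frequency and mRNA abundance can be illustrated with a minimal Python sketch. It assumes each sequenced EST has already been assigned to a gene (for example by a similarity search); the function name `est_abundance` and the gene labels are hypothetical and serve only as an illustration.

```python
from collections import Counter

def est_abundance(est_gene_hits):
    """Return the fraction of sequenced ESTs assigned to each gene.

    est_gene_hits: one gene label per sequenced EST (hypothetical input;
    the assignment of ESTs to genes is assumed to be done beforehand).
    """
    counts = Counter(est_gene_hits)
    total = sum(counts.values())
    return {gene: n / total for gene, n in counts.items()}

# Toy data: six randomly picked clones, three of them from geneA.
hits = ["geneA", "geneB", "geneA", "geneC", "geneA", "geneB"]
abundance = est_abundance(hits)
for gene, fraction in sorted(abundance.items()):
    print(gene, round(fraction, 3))
```

With so few clones the fractions are only a rough estimate, which matches the caveat in the text: rarely expressed genes may not be sampled at all.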

Serial Analysis of Gene Expression (SAGE)

SAGE is another high-throughput, sequence-based approach to gene expression analysis. It is more quantitative than EST analysis in determining mRNA expression in a cell. In this method, short DNA fragments of about 15 bp are excised from cDNA sequences and used as unique markers of gene transcripts. These sequence fragments are called tags. They are subsequently linked together, cloned and sequenced, and the resulting tag sequences are analysed computationally in a serial manner. Once the gene tags are identified, their frequency indicates the level of gene expression. This approach is more efficient than EST analysis, because it uses a short nucleotide tag to represent each gene transcript and allows multiple tags to be sequenced in a single clone. SAGE analysis also has a better chance of detecting weakly expressed genes (Figure 11.4).
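The computational step, reading tags serially from a sequenced clone and counting them, can be sketched as follows. This is a simplified illustration, not a full SAGE analysis tool. It assumes the anchoring-enzyme site is CATG (the NlaIII site, which the text does not name), a tag length of 10 bases after the site (an illustrative value), and clone inserts in which pairs of tags are joined tail to tail and separated by the anchoring site.

```python
from collections import Counter

ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme site (NlaIII); not given in the text
TAG_LEN = 10          # assumed tag length after the site; illustrative only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def tags_from_concatemer(read, tag_len=TAG_LEN):
    """Split one sequenced concatemer into its individual tags."""
    tags = []
    for ditag in read.split(ANCHOR_SITE):
        if len(ditag) != 2 * tag_len:
            continue  # skip empty flanks and fragments of unexpected length
        tags.append(ditag[:tag_len])                      # first tag, as read
        tags.append(reverse_complement(ditag[tag_len:]))  # second tag is inverted
    return tags

def tag_counts(reads):
    """Count every tag seen across a collection of concatemer reads."""
    counts = Counter()
    for read in reads:
        counts.update(tags_from_concatemer(read))
    return counts

# Toy concatemer holding two ditags (four tags in total).
read = ("CATG" + "AAAAAAAAAA" + "GGGGGGGGGG" +
        "CATG" + "AAAAAAAAAA" + "ACACACACAC" + "CATG")
counts = tag_counts([read])
for tag, n in counts.most_common():
    print(tag, n)
```

A tag seen twice as often as another is taken to come from a transcript roughly twice as abundant. The sketch ignores sequencing errors and the rare case where the junction of two tags happens to recreate the anchoring site.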

Figure 11.3 EST analysis

Figure 11.4 SAGE: serial analysis of gene expression

The SAGE procedure can be described as follows:

First, a cDNA strand of each transcript in the cell must be generated.
The mRNA of eukaryotes is polyadenylated, i.e., a poly(A) tail is added to the 3′ end of the final transcript.
A primer consisting of multiple ‘T’s can therefore be made that will base-pair with the poly(A) tail of every mRNA in the cell.
Once the primer has bound to the mRNA, the enzyme reverse transcriptase can make a DNA strand that is complementary to the RNA.
This DNA strand will then be converted to a double-stranded DNA molecule.
The cDNA that has been created is then cleaved using an ‘anchoring enzyme’.
The anchoring enzyme is a restriction endonuclease that recognizes and cuts a specific four-bp DNA sequence. Because the enzyme requires only four specific nucleotides, it cuts DNA frequently (on average about once every 4^4 = 256 bp in random sequence), so that almost every cDNA generated is cut at least once.
The 3′ portion of each cut cDNA is then bound to streptavidin beads through the oligo(dT) primer at its 3′ end, which is biotin-labelled for this purpose, and is thereby immobilized.
The sample of bound cDNAs is then divided in half and ligated to either linker A or B. These linkers are designed to contain a Type IIS restriction site.
Type IIS restriction endonucleases cut at a defined distance, up to 20 bp away, from their recognition sites. The Type IIS enzyme used here, called the ‘tagging enzyme’, cleaves the cDNA a short distance from the linker, releasing the linker together with a short cDNA tag from the bead.
Blunt ends are then created, so that neither the 3′ nor 5′ end has overhanging single-stranded sequences.
Once this is achieved, the cDNA tags bound to linker A and B are ligated to each other to create ditags.
These ditags have linker A on one end, linker B on the other and both transcript tags are adjacent to one another in the middle.
These ditags are then amplified by PCR, using primers that are complementary to sequences in the two linkers.
Once the ditags have been amplified, they are then cleaved using the anchoring enzyme again.
This has two effects. First, it releases the linkers from both ends of the ditag, leaving only the sequence of the two tags. Second, it creates sticky ends, i.e., ends with overhanging single-stranded DNA that can base-pair with the complementary overhang of another ditag.
In this way, all of the ditags generated are linked, or concatenated to produce one long string of tags. This collection of tags is then introduced into a vector to be cloned and sequenced.
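
The tag-generation steps above can be mimicked in silico with a short Python sketch. It is a simplified model under stated assumptions, not the laboratory protocol itself: the anchoring site is taken to be CATG (the NlaIII site, which the text does not name), the tag is taken to be the 10 bases immediately 3′ of that site (an illustrative length), and each cDNA is given as its sense strand. The function names `extract_tag` and `make_ditag` are hypothetical.

```python
ANCHOR_SITE = "CATG"  # assumed anchoring-enzyme site (NlaIII); not given in the text
TAG_LEN = 10          # assumed tag length; illustrative only

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]

def extract_tag(cdna, site=ANCHOR_SITE, tag_len=TAG_LEN):
    """Return the tag next to the 3'-most anchoring site, or None.

    Only the 3'-most fragment stays attached to the bead after the
    anchoring enzyme cuts, so the last occurrence of the site is used.
    """
    pos = cdna.rfind(site)
    if pos == -1:
        return None  # transcript has no anchoring site, so it yields no tag
    start = pos + len(site)
    tag = cdna[start:start + tag_len]
    return tag if len(tag) == tag_len else None

def make_ditag(tag_a, tag_b):
    """Join two tags tail to tail, as in the blunt-end ligation step."""
    return tag_a + reverse_complement(tag_b)

# Toy cDNA with two CATG sites; only the 3'-most one defines the tag.
cdna = "GGCATGTTTTCATGACGTACGTAGCCTTAAAA"
tag = extract_tag(cdna)
ditag = make_ditag(tag, "AAAAAAAAAA")
print(tag, ditag)
```

Because the tag is defined by its position next to the last anchoring site, every copy of the same transcript yields the same tag, which is what allows tag counts to stand in for transcript abundance.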
