Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1.

Article Details

Citation

Nomura N, Miyajima N, Sazuka T, Tanaka A, Kawarabayasi Y, Sato S, Nagase T, Seki N, Ishikawa K, Tabata S

Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1.

DNA Res. 1994;1(1):27-35.

PubMed ID
7584026

Abstract

We established a protocol for the prediction of the coding sequences of unidentified human genes based on the double selection and sequence analysis of cDNA clones with inserts carrying unreported 5'-terminal sequences and with insert sizes corresponding to nearly full-length transcripts. By applying the protocol, cDNA clones with inserts longer than 2 kb were isolated from a cDNA library of human immature myeloid cell line KG-1, and the coding sequences of 40 new genes were predicted. A computer search of the sequences indicated that 20 genes contained sequences similar to known genes in the GenBank/EMBL databases. The sequences of the remaining 20 genes were entirely new, and characteristic protein motifs or domains were identified in 32 genes. Other sequence features noted were that the coding sequences of 23 genes were followed by relatively long stretches of 3'-untranslated sequences and that 5 genes contained repetitive sequences in their 3'-untranslated regions. The chromosomal location of these genes has been determined. By increasing the scale of the above analysis, the coding sequences of many unidentified genes can be predicted.

DrugBank Data that Cites this Article

Polypeptides
Name                                                           UniProt ID
Probable leucine--tRNA ligase, mitochondrial                   Q15031
Farnesyl pyrophosphate synthase                                P14324
Phosphatidylserine synthase 1                                  P48651
Disintegrin and metalloproteinase domain-containing protein 9  Q13443
Nucleolar and coiled-body phosphoprotein 1                     Q14978
Splicing factor 3B subunit 3                                   Q15393
P2Y purinoceptor 14                                            Q15391
Adenylate cyclase type 7                                       P51828