Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1.
Article Details
- Citation
Nomura N, Miyajima N, Sazuka T, Tanaka A, Kawarabayasi Y, Sato S, Nagase T, Seki N, Ishikawa K, Tabata S
Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1.
DNA Res. 1994;1(1):27-35.
- PubMed ID
- 7584026
- Abstract
We established a protocol for the prediction of the coding sequences of unidentified human genes based on the double selection and sequence analysis of cDNA clones with inserts carrying unreported 5'-terminal sequences and with insert sizes corresponding to nearly full-length transcripts. By applying the protocol, cDNA clones with inserts longer than 2 kb were isolated from a cDNA library of human immature myeloid cell line KG-1, and the coding sequences of 40 new genes were predicted. A computer search of the sequences indicated that 20 genes contained sequences similar to known genes in the GenBank/EMBL databases. The sequences of the remaining 20 genes were entirely new, and characteristic protein motifs or domains were identified in 32 genes. Other sequence features noted were that the coding sequences of 23 genes were followed by relatively long stretches of 3'-untranslated sequences and that 5 genes contained repetitive sequences in their 3'-untranslated regions. The chromosomal location of these genes has been determined. By increasing the scale of the above analysis, the coding sequences of many unidentified genes can be predicted.
DrugBank Data that Cites this Article
- Polypeptides
| Name | UniProt ID |
| --- | --- |
| Probable leucine--tRNA ligase, mitochondrial | Q15031 |
| Farnesyl pyrophosphate synthase | P14324 |
| Phosphatidylserine synthase 1 | P48651 |
| Disintegrin and metalloproteinase domain-containing protein 9 | Q13443 |
| Nucleolar and coiled-body phosphoprotein 1 | Q14978 |
| Splicing factor 3B subunit 3 | Q15393 |
| P2Y purinoceptor 14 | Q15391 |
| Adenylate cyclase type 7 | P51828 |