Biological and genetic information; Genomic information; 3D structure of proteins; Protein function; Application of bioinformatics databases to tropical diseases
1) Students understand the basics of bioinformatics.
2) Students have the bioinformatics knowledge and skills to apply to tropical medicine research.
3) Students use computer-based bioinformatics tools properly and proficiently.
Course-level Learning Outcomes (CLOs)
At the end of the course, students should be able to:
CLO1: Describe the basics of bioinformatics and the importance of its applications.
CLO2: Select appropriate bioinformatics tools for analyzing biological data related to tropical diseases.
CLO3: Use bioinformatics tools properly to analyze biological data in tropical diseases.
I have a question:
1. How would you construct a benchmark dataset for genomic DNA? What features would you need to consider (e.g., protein versus DNA, degree of conservation, chromosomal rearrangements)?
2. How is quality control maintained in GenBank, given that thousands of individual investigators submit data?
1. I will try to answer your question clearly. I assume the benchmark datasets for genomic DNA you mention refer to whole-genome sequences used in comparative genomics. In general, similar nucleotide sequences can be expected to have similar functions. If a gene is similar in sequence to other genes, even genes in other organisms, we predict that it probably has a similar function. A gene typically encodes a protein, so we can predict the function of the protein from the similarity of the nucleotide or amino acid sequences. When nucleotide sequences are similar across different species, we say those genes are CONSERVED (conservation), and we can predict that they potentially have similar functions. A conserved region, particularly within an amino acid sequence, is often referred to as a “domain”.
2. GenBank is primarily an archive of submitted records, which NCBI staff check before release. To provide reliable reference nucleotide sequences, expert curators build RefSeq, a curated set of reference sequences that covers organisms with completed genome sequences and is intended for downstream bioinformatics analyses. There are also protein databases such as TrEMBL, whose amino acid sequences are derived by translating primary nucleotide sequences. You can use TrEMBL if you need to do protein research, but keep in mind that its entries are generally not manually reviewed. Swiss-Prot, which is manually reviewed, is recommended instead, although it contains far fewer entries than TrEMBL.
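The two answers above can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions, not part of the course material: `percent_identity` and `uniprot_review_status` are hypothetical helper names, the DNA strings are made up and assumed to be already aligned (for example by BLAST or another alignment tool), and the second helper relies on the convention that UniProt FASTA headers begin with `sp|` for Swiss-Prot (reviewed) entries and `tr|` for TrEMBL (unreviewed) entries.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percent of non-gap aligned columns in which both sequences match.

    Assumes the two sequences are already aligned (same length, '-' for gaps).
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    matches = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == "-" or b == "-":
            continue  # skip columns that contain a gap
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


def uniprot_review_status(fasta_header: str) -> str:
    """Classify a UniProt FASTA header by its database prefix."""
    header = fasta_header.strip().lstrip(">")
    prefix = header.split("|", 1)[0].lower()
    if prefix == "sp":
        return "Swiss-Prot (reviewed)"
    if prefix == "tr":
        return "TrEMBL (unreviewed)"
    return "unknown"


# Made-up aligned DNA fragments that differ at one of ten positions.
print(percent_identity("ATGGCCAAGT", "ATGGCTAAGT"))  # 90.0

# P69905 is human hemoglobin subunit alpha in Swiss-Prot.
print(uniprot_review_status(">sp|P69905|HBA_HUMAN Hemoglobin subunit alpha"))

# Placeholder TrEMBL-style header; the accession is not a real entry.
print(uniprot_review_status(">tr|X0X0X0|X0X0X0_EXAMPLE Hypothetical protein"))
```

A high percent identity only suggests a possibly similar function; it is not proof. In the same way, the header prefix tells you which UniProt section an entry comes from, not how good its annotation is.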