Abstract
Bioinformatics has been transforming how scientists analyze and interpret genetic data. This review highlights the fundamental role of bioinformatics tools in understanding genomic data. The article surveys the range of software and algorithms available for processing, analyzing, and interpreting genetic data. It addresses the relevance of these tools to gene identification, the detection of genetic variation, protein structure prediction, and studies of evolution and phylogeny. It also presents the challenges facing bioinformatics, including the integration of data from different sources, standardization, and the interpretation of results. The article further provides information on sequence alignment and the cleaning of sequencing data, both important steps when working with genetic datasets. Finally, discussions such as this one matter because bioinformatics tools are constantly evolving, which requires researchers to keep their knowledge and skills up to date.