Speaker: Professor Liang Liu (劉亮), Department of Statistics and Institute of Bioinformatics, University of Georgia
Title: Species delimitation using machine learning
Time: May 16, 2024, 9:30-10:30 a.m.
Venue: Room 620, Analysis and Testing Center
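Organizers: School of Life Sciences; Provincial Key Laboratory of Comparative Genomics for Universities; Jiangsu International Joint Research Center for Genomics; Institute of Science and Technology Research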
About the speaker:
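Liang Liu is a professor in the Department of Statistics and the Institute of Bioinformatics at the University of Georgia, USA. He is one of the founders of new species-tree methods in the international field of molecular phylogenomics and received the 2008 outstanding research award from the Society of Systematic Biologists. He has long served as a reviewer for international journals including Systematic Biology, Bioinformatics, Journal of Mathematical Biology, Molecular Biology and Evolution, and Molecular Ecology. He has published more than 80 papers in international journals such as Science, PNAS, Systematic Biology, Molecular Biology and Evolution, and Bioinformatics, with more than 35,000 citations in total and more than 24,000 citations for his most-cited paper. He also serves as a second-round reviewer for the U.S. National Science Foundation.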
Abstract:
In the realm of biology, species are identified through a classification system that groups together organisms with shared traits and the ability to reproduce with each other. There is keen interest in understanding how species are defined and whether their evolutionary roots can be traced through genetic sequences. Various techniques are employed for species identification, including Automated Barcode Gap Discovery, the Generalized Mixed Yule Coalescent method, and the Poisson Tree Processes method. Yet these methods come with drawbacks, such as long running times and difficulty handling large datasets. In this talk, we delve into employing supervised machine learning techniques such as CatBoost, XGBoost, Classification Tree, Support Vector Machine, and K-Nearest Neighbors. Additionally, we explore unsupervised machine learning with K-means Clustering and a deep learning approach using Neural Networks for species delimitation. We examine five distinct species trees as our test cases. Our findings reveal that supervised machine learning models exhibit superior accuracy compared to unsupervised machine learning and deep learning models.