About the Event
Important tasks in the study of genomic data include the identification of groups of similar cells (for example by clustering), and visualisation of data summaries (for example by dimensionality reduction). In this paper, we develop a novel approach to these tasks in the context of single-cell genomic data. To do so, we propose to model the observed genomic data count matrix $X \in \mathbb{Z}_{\geq 0}^{p \times n}$ by representing these measurements as a bipartite network with multi-edges. Utilising this first-principles network model of the raw data, we cluster single cells in a suitably identified d-dimensional Laplacian Eigenspace (LE) via a Gaussian mixture model (GMM-LE), and employ UMAP to non-linearly project the LE to two dimensions for visualisation (UMAP-LE). This LE representation of the data points estimates transformed latent positions (of genes and cells), under a latent position statistical model of nodes in a bipartite stochastic network. We demonstrate how transformations of these estimated latent positions can enable fine-grained clustering and visualisation of single-cell genomic data, by application to data from three recent genomics studies in different biological contexts. In each data application, clusters of cells independently learned by our proposed methodology are found to correspond to cells expressing specific marker genes that were independently defined by domain experts. In this validation setting, our proposed clustering methodology outperforms the industry standard for these data. Furthermore, we validate components of the LE decomposition of the data by contrasting healthy cells from normal and at-risk groups in a machine-learning model, thereby identifying an LE cancer biomarker that significantly predicts long-term patient survival outcome in two independent validation cohorts with data from 1904 and 1091 individuals.
Text provided by the author.
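As an illustration only (not taken from the author's text), the embedding step described in the abstract can be sketched as follows. The sketch assumes the common degree-normalised form of the bipartite Laplacian, whose leading singular vectors give positions for genes and cells; the talk's exact construction, any regularisation, and the choice of d may differ. The function name `laplacian_eigenspace` and the toy Poisson data are hypothetical.

```python
import numpy as np


def laplacian_eigenspace(X, d):
    """Embed genes (rows) and cells (columns) of a count matrix X (p x n).

    Assumption: uses the degree-normalised biadjacency matrix
    D_gene^{-1/2} X D_cell^{-1/2}; this is one standard choice, not
    necessarily the construction used by the speaker.
    """
    X = np.asarray(X, dtype=float)
    gene_deg = X.sum(axis=1)  # total count per gene (row degree)
    cell_deg = X.sum(axis=0)  # total count per cell (column degree)
    # Leave zero-degree nodes unscaled to avoid division by zero.
    gene_scale = 1.0 / np.sqrt(np.where(gene_deg > 0, gene_deg, 1.0))
    cell_scale = 1.0 / np.sqrt(np.where(cell_deg > 0, cell_deg, 1.0))
    L = gene_scale[:, None] * X * cell_scale[None, :]
    # Singular values are returned in descending order.
    U, s, Vt = np.linalg.svd(L, full_matrices=False)
    return U[:, :d], s[:d], Vt[:d].T


# Toy data (hypothetical): two gene blocks, each expressed in one cell group.
rng = np.random.default_rng(0)
rates = np.full((40, 30), 0.2)
rates[:20, :15] = 5.0
rates[20:, 15:] = 5.0
X = rng.poisson(rates)

genes, sing_vals, cells = laplacian_eigenspace(X, d=2)
```

Here `cells` holds one d-dimensional position per cell. A Gaussian mixture model (for example `sklearn.mixture.GaussianMixture`) could then be fitted to these positions, and UMAP (for example `umap.UMAP` from the umap-learn package) could project them to two dimensions; both steps are omitted to keep the sketch dependent on NumPy alone.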
* Seminar participants will not be admitted to FGV premises wearing shorts, flip-flops, tube tops or crop tops, miniskirts, or tank tops. Masks are optional, but proof of vaccination (physical or digital) is required.
Speakers
Thomas E. Bartlett
Dr Thomas Bartlett was appointed as lecturer (equivalent to assistant professor with tenure) in Statistical Science at University College London (UCL), UK, in 2020. Previously, during independent postdoctoral fellowships won by Dr Bartlett and during his interdisciplinary doctoral training (graduating in 2015), also at UCL, he worked on questions in both the mathematical and the biological sciences. During this time, Dr Bartlett has regularly participated in and peer-reviewed for conferences such as JSM, NeurIPS, ICML, and AISTATS. Dr Bartlett's main research interests cover mathematical-statistical models of networks, as well as computational-statistical methodology for analysis of large biomedical data-sets.
Venue
Address
a) In-person option *
Praia de Botafogo, 190
5th floor, Auditorium 537
b) Remote option (via Zoom)
ID: 992 0612 5192
https://fgv-br.zoom.us/j/99206125192
Additional information:
Tel: 55 21 3799-5917