Learning Object Representations to Predict and Explain Missing Associations

We consider a dataset of object associations (pairs), where each object belongs to one of two types (e.g. drugs and side effects). In some cases, additional information about the objects is also available (e.g. drug structures). Our aim is to learn object representations that can be used to predict missing associations and that encode interpretable object attributes able to explain the predictions. In this thesis, we propose three different approaches and apply them to real-world datasets, primarily from pharmacology, which differ in the type of data available.

Self-Matrix Factorization (SMF) is a method that learns object representations solely from association data. Exploiting the fact that, in general, objects lie in multiple linear manifolds embedded in a high-dimensional space, SMF learns similarities between objects (specifically, between those that share a manifold) directly from the observed associations. Thus, SMF simultaneously learns object similarities and representations, constraining them to reflect underlying structures in the data. We tested SMF extensively on association datasets containing user-item ratings and drug side-effect frequencies. Our results show that SMF outperforms competing methods in recovering missing associations and is also better at learning representations that capture meaningful object attributes.
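As a rough illustration of predicting missing associations from learned representations, the sketch below fits a plain masked low-rank factorization by gradient descent. It is not SMF: it does not learn object similarities or manifolds, and the function name, hyperparameters, and synthetic data are all assumptions made for this example.

```python
import numpy as np


def masked_matrix_factorization(R, mask, rank=2, lr=0.05, reg=0.01, epochs=5000, seed=0):
    """Fit R ≈ P @ Q.T using only the entries where mask is True.

    Generic baseline for illustration only (hypothetical helper, not SMF).
    """
    rng = np.random.default_rng(seed)
    n_rows, n_cols = R.shape
    P = 0.1 * rng.standard_normal((n_rows, rank))  # row-object representations
    Q = 0.1 * rng.standard_normal((n_cols, rank))  # column-object representations
    for _ in range(epochs):
        E = mask * (R - P @ Q.T)       # residual on observed entries only
        grad_P = -E @ Q + reg * P      # gradient of the regularized squared error
        grad_Q = -E.T @ P + reg * Q
        P -= lr * grad_P
        Q -= lr * grad_Q
    return P, Q


# Synthetic rank-2 association matrix (assumed data, e.g. 6 drugs x 5 side effects).
rng = np.random.default_rng(0)
R_true = rng.random((6, 2)) @ rng.random((5, 2)).T
mask = rng.random(R_true.shape) < 0.8  # True where an association is observed

P, Q = masked_matrix_factorization(R_true, mask)
R_hat = P @ Q.T                        # scores for every pair, observed or not
print("predicted scores for unobserved pairs:", R_hat[~mask])
```

The rows of `P` and `Q` play the role of object representations; the unobserved entries of `R_hat` are the predicted missing associations.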

In our second learning scenario, no explicit associations between objects are available; we observe only the perturbations that the objects (drugs and viruses) induce in an environment (a protein-protein interaction network). Our approach learns the object representations through the simultaneous decomposition of several matrices. We show that these representations encode interpretable attributes of the objects involved (drugs, viruses, etc.), that they can be used to predict effective antiviral treatments, and that these predictions are explainable in terms of the learned object attributes.
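One simple way to decompose several matrices simultaneously is to let them share a factor. The sketch below is an assumed setup, not the formulation used in the thesis: two perturbation matrices, `A` (drugs by proteins) and `B` (viruses by proteins), are factorized with a common protein factor `H` using regularized alternating least squares. All names, shapes, and data are hypothetical.

```python
import numpy as np


def joint_factorization(A, B, rank=2, reg=1e-3, iters=50, seed=0):
    """Fit A ≈ D @ H and B ≈ V @ H with a shared factor H.

    Each update solves a ridge-regularized least-squares problem exactly.
    """
    rng = np.random.default_rng(seed)
    H = rng.standard_normal((rank, A.shape[1]))
    ridge = reg * np.eye(rank)
    for _ in range(iters):
        G = H @ H.T + ridge
        D = np.linalg.solve(G, H @ A.T).T  # drug representations given H
        V = np.linalg.solve(G, H @ B.T).T  # virus representations given H
        # Shared factor given both sets of object representations.
        H = np.linalg.solve(D.T @ D + V.T @ V + ridge, D.T @ A + V.T @ B)
    return D, V, H


# Synthetic data (assumed): 5 drugs and 4 viruses perturbing 8 proteins.
rng = np.random.default_rng(0)
H_true = rng.random((2, 8))
A = rng.random((5, 2)) @ H_true
B = rng.random((4, 2)) @ H_true

D, V, H = joint_factorization(A, B)
print("drug representations:", D.shape, "virus representations:", V.shape)
```

Because `H` is shared, the rows of `D` and `V` live in the same latent space, which is what makes drug and virus representations comparable in this toy setting. With a single shared factor and equal weights, this is equivalent to factorizing the two matrices stacked vertically.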

Finally, in our third learning scenario, object associations (drugs and side effects) are available together with low-level object features (drug molecular graphs). Object representations are learned by a deep learning model called Features to Signatures (\(\phi 2 \sigma\)), and we show that these representations can be used to predict drug side-effect frequencies from molecular graphs. Importantly, \(\phi 2 \sigma\) can be used for ab initio prediction, that is, to predict side-effect frequencies for compounds with no previously detected side effects.

Date and location

March 27, 2025, at 11:30 a.m.

Zoom link: https://fgv-br.zoom.us/j/92321152709?pwd=viA26eSk7JcTCVKbI4RdW5L8iazXSN.1

Committee members
Advisor: Alberto Paccanaro - EMAp
Internal member: Moacyr Alvim Horta Barbosa da Silva - EMAp
External member: Luis Gustavo Nonato - USP São Carlos
External member: Haixuan Yang - University of Galway, Ireland
External member: Giorgio Valentini - University of Milan, Italy
External member: Kostas Stathis - Royal Holloway University of London, United Kingdom