Streaming, Distributed, and Asynchronous Amortized Inference
Date

We address the problem of sampling from an unnormalized distribution defined on a compositional space, i.e., a continuous or discrete set whose elements can be constructed sequentially from an initial state by applying simple actions. This definition covers the space of (directed acyclic) graphs, natural language sentences of bounded length, and Euclidean n-spaces, among others, and it underlies many applications in (Bayesian) statistics and machine learning. In particular, we focus on Generative Flow Networks (GFlowNets), a family of amortized samplers that cast sampling as the problem of finding a flow assignment in a flow network such that the total flow reaching each sink node equals that node's unnormalized probability. Despite their remarkable success in drug discovery, structure learning, and natural language processing, important questions about the scalability, generalization, and limitations of these models remain largely unexplored. This thesis contributes both methodological and theoretical advances that improve the usability and understanding of GFlowNets. From a computational perspective, we develop algorithms for learning these models in both streaming and distributed settings, and we show experimentally that the resulting methods drastically reduce the training time of GFlowNets. We also revisit conventional assessment techniques for GFlowNets and demonstrate that they catastrophically misrepresent the distributional correctness of these models. To address this, we propose the first sound and computationally tractable metric for quantifying the accuracy of a GFlowNet. From a theoretical point of view, we construct a family of graph generation problems that traditional GFlowNets cannot solve. To overcome this limitation, we present LA-GFlowNets, a GFlowNet variant with provably greater expressivity that performs a local search at each iteration of the generative process. Finally, we introduce the first non-vacuous statistical guarantees for the generalization of GFlowNets, addressing a long-standing issue in the literature.
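
To make the flow-network view in the abstract concrete, here is a minimal sketch on a toy network. The states, edge flows, and rewards are hypothetical values chosen for illustration and are not taken from the thesis. The flows are set by hand so that the flow entering each sink equals its unnormalized probability; the code then checks that moving along edges with probability proportional to their flow reaches each sink with probability proportional to its reward.

```python
from collections import defaultdict

# Hypothetical toy flow network: "s0" is the initial state; "x", "y", "z" are sinks.
# Edge flows are chosen by hand so that inflow equals outflow at "a" and "b",
# and the inflow at each sink equals its reward (unnormalized probability).
edge_flow = {
    ("s0", "a"): 1.5, ("s0", "b"): 4.5,
    ("a", "x"): 1.0, ("a", "y"): 0.5,
    ("b", "y"): 1.5, ("b", "z"): 3.0,
}
reward = {"x": 1.0, "y": 2.0, "z": 3.0}
topological_order = ["s0", "a", "b"]  # non-sink states, parents before children


def terminal_distribution(edge_flow, order, source="s0"):
    """Exact sink distribution of the policy P(child | parent) = F(parent, child) / F(parent)."""
    out_flow = defaultdict(float)
    for (parent, _), flow in edge_flow.items():
        out_flow[parent] += flow

    prob = defaultdict(float)
    prob[source] = 1.0
    for state in order:
        for (parent, child), flow in edge_flow.items():
            if parent == state:
                prob[child] += prob[state] * flow / out_flow[state]

    # Sinks are the states with no outgoing flow.
    return {s: p for s, p in prob.items() if s not in out_flow}


dist = terminal_distribution(edge_flow, topological_order)
total_reward = sum(reward.values())
for sink in sorted(reward):
    print(sink, dist[sink], reward[sink] / total_reward)
```

In a GFlowNet the flows are not set by hand; they are parameterized and learned, which is what makes the sampler amortized. The sketch only illustrates the condition the learned flows are meant to satisfy.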

Location

When: December 20, 2024

Time: 3:30 p.m.

Zoom link: https://fgv-br.zoom.us/j/6287410818?omn=93649091051

Committee members
Advisor: Diego Parente Paiva Mesquita - FGV EMAp
Internal member: Luiz Max Fagundes de Carvalho - FGV EMAp
External member: Fabio Gagliardi Cozman - USP
External member: Roberto Imbuzeiro Oliveira - IMPA
External member: Eduardo Sany Laber - PUC RJ