Victor-Alexandru Darvariu
I am a computer scientist working as a Postdoctoral Researcher at the Oxford Robotics Institute, University of Oxford, where I am part of the GOALS group led by Nick Hawes. My primary goal is to develop artificial intelligence techniques that solve challenging decision-making problems effectively.
My research interests lie at the intersection of reinforcement learning and planning, graph learning, combinatorial optimization, and multi-agent systems. I am broadly interested in both fundamental research and applications spanning robotics, operations research, computer and communication systems, and causal inference.
News
[Oct 2024] Our paper Tree Search in DAG Space with Model-based Reinforcement Learning for Causal Discovery has been accepted for publication in Proceedings of the Royal Society A. We address the problem of discovering causal graphs with a model-based reinforcement learning method, which is powered by an incremental algorithm for identifying cycle-inducing edges and is shown to compare favorably to model-free RL methods and greedy search. Code for the proposed CD-UCT algorithm and benchmarks of causal discovery methods is publicly available.
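To illustrate the general idea of rejecting cycle-inducing edges while growing a causal DAG, here is a minimal sketch; it is not the paper's CD-UCT implementation, and the class and method names are illustrative assumptions. It keeps each node's descendant set up to date so that checking a candidate edge is a set lookup rather than a fresh graph traversal.

```python
# Minimal sketch (not the paper's CD-UCT implementation): incrementally
# tracking which candidate edges would induce a cycle while growing a DAG.
# Class and method names are illustrative assumptions.

class IncrementalDAG:
    """A directed graph that rejects edges which would create a cycle."""

    def __init__(self, nodes):
        self.nodes = list(nodes)
        self.edges = set()
        # desc[x] = nodes reachable from x via directed paths
        self.desc = {x: set() for x in self.nodes}

    def would_create_cycle(self, u, v):
        # Adding u -> v closes a cycle iff v already reaches u.
        return u == v or u in self.desc[v]

    def add_edge(self, u, v):
        if self.would_create_cycle(u, v):
            return False
        self.edges.add((u, v))
        # Every node that reaches u (including u itself) now also
        # reaches v and all of v's descendants.
        gained = {v} | self.desc[v]
        for x in self.nodes:
            if x == u or u in self.desc[x]:
                self.desc[x] |= gained
        return True


if __name__ == "__main__":
    dag = IncrementalDAG(["A", "B", "C"])
    assert dag.add_edge("A", "B")
    assert dag.add_edge("B", "C")
    assert not dag.add_edge("C", "A")  # would close the cycle A -> B -> C -> A
    print(sorted(dag.edges))
```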
[Sep 2024] New pre-print: Reinforcement Learning Discovers Efficient Decentralized Graph Path Search Strategies. Inspired by Milgram's small-world experiment, we frame the problem of path search in graphs as a decentralized multi-agent decision-making process. We propose the GARDEN method and demonstrate its advantages over heuristic and learned baselines, including on real-world social media networks.
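The decentralized setup can be made concrete with a small sketch in the spirit of Milgram's experiment: each node sees only its own neighbours and greedily forwards the message toward the target using a local similarity signal. This illustrates the problem setting rather than the GARDEN method itself; the attribute-distance heuristic is an assumed stand-in for a learned policy.

```python
# Sketch of decentralized graph path search: the current message holder
# only inspects its own neighbours and forwards greedily toward the target.
# The attribute-distance heuristic is an assumption, not the GARDEN policy.

import math

def attribute_distance(a, b):
    """Euclidean distance between node attribute vectors (e.g. geography)."""
    return math.dist(a, b)

def decentralized_greedy_search(adj, attrs, source, target, max_hops=50):
    """Route from source to target using only local neighbourhood information.

    adj:   dict mapping each node to the list of its neighbours
    attrs: dict mapping each node to an attribute vector visible to neighbours
    Returns the path taken, or None if the search stalls or exceeds max_hops.
    """
    path, current, visited = [source], source, {source}
    for _ in range(max_hops):
        if current == target:
            return path
        # Only the current holder's own neighbourhood is visible.
        candidates = [n for n in adj[current] if n not in visited]
        if not candidates:
            return None
        current = min(candidates,
                      key=lambda n: attribute_distance(attrs[n], attrs[target]))
        visited.add(current)
        path.append(current)
    return path if current == target else None


if __name__ == "__main__":
    adj = {0: [1, 2], 1: [0, 3], 2: [0, 3], 3: [1, 2]}
    attrs = {0: (0.0, 0.0), 1: (1.0, 0.0), 2: (0.0, 1.0), 3: (1.0, 1.0)}
    print(decentralized_greedy_search(adj, attrs, source=0, target=3))
```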
[Sep 2024] The paper Graph Reinforcement Learning for Combinatorial Optimization: A Survey and Unifying Perspective has been accepted for publication in Transactions on Machine Learning Research (TMLR). In this survey, we review works that approach optimization problems over graphs with reinforcement learning techniques. Specifically, we focus on problems that currently lack satisfactory exact or heuristic solutions, and for which RL can be advantageous as an approach for algorithm discovery.