Physics - High Energy Astrophysical Phenomena; Physics - Instrumentation and Methods for Astrophysics
Joint observations in electromagnetic and gravitational waves shed light on
the physics of objects and surrounding environments with extreme gravity that
are otherwise unreachable via siloed observations in each messenger. However,
such detections remain challenging because the counterparts are faint and
evolve rapidly. Protocols for discovery and inference still rely on human experts
manually inspecting survey alert streams and intuiting optimal usage of limited
follow-up resources. Strategizing an optimal follow-up program requires
adaptive sequential decision-making given evolving light curve data that (i)
maximizes a global objective despite incomplete information and (ii) is robust
to stochasticity introduced by detectors/observing conditions. Reinforcement
learning (RL) approaches allow agents to implicitly learn the physics/detector
dynamics and the behavior policy that maximize a designated objective through
experience.
To demonstrate the utility of such an approach for the kilonova follow-up
problem, we train a toy RL agent with the goal of maximizing follow-up
photometry for the true kilonova among several contaminant transient light
curves. In a simulated environment where the agent learns online, it achieves
3x higher accuracy than a random strategy. However, it is surpassed by
human agents by up to a factor of 2. This is likely because our hypothesis
function (a Q-function that is linear in state-action features) is an
insufficient representation of the optimal behavior policy. More complex agents
could perform on par with or surpass human experts. Agents like these could pave the way
for machine-directed software infrastructure that responds efficiently to
next-generation detectors, conducts science inference, and optimally plans
expensive follow-up observations, scalably and with demonstrable performance
guarantees.
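
As a rough illustration of what a Q-function that is linear in state-action features can look like, the sketch below trains an epsilon-greedy agent online in a deliberately simplified setting. It is not the environment or agent used in this work. The number of candidates, the two-component feature vector (a bias term and a decline rate), the decline-rate ranges, and all function names are assumptions made up for the example. Episodes are a single step long, so the Q-learning target reduces to the immediate reward.

```python
import random

N_CANDIDATES = 5  # hypothetical: 1 kilonova + 4 contaminants per episode


def make_episode(rng):
    """Return (features, kn_index) for one toy episode.

    Each candidate gets phi = [bias, decline_rate]. The decline-rate ranges
    are made-up numbers chosen so the two classes are cleanly separable.
    """
    kn_index = rng.randrange(N_CANDIDATES)
    features = []
    for i in range(N_CANDIDATES):
        if i == kn_index:
            rate = rng.uniform(0.8, 1.2)  # fast fader (assumed)
        else:
            rate = rng.uniform(0.0, 0.4)  # slow contaminant (assumed)
        features.append([1.0, rate])
    return features, kn_index


def q_value(w, phi):
    """Linear action value: Q(s, a) = w . phi(s, a)."""
    return sum(wi * pi for wi, pi in zip(w, phi))


def choose(w, features, epsilon, rng):
    """Epsilon-greedy choice of which candidate to observe."""
    if rng.random() < epsilon:
        return rng.randrange(len(features))
    return max(range(len(features)), key=lambda a: q_value(w, features[a]))


def train(n_episodes=2000, alpha=0.1, epsilon=0.1, seed=0):
    """Learn the weights online, one update per episode."""
    rng = random.Random(seed)
    w = [0.0, 0.0]
    for _ in range(n_episodes):
        features, kn_index = make_episode(rng)
        a = choose(w, features, epsilon, rng)
        reward = 1.0 if a == kn_index else 0.0
        # One-step episodes: the target is just the immediate reward.
        td_error = reward - q_value(w, features[a])
        w = [wi + alpha * td_error * pi for wi, pi in zip(w, features[a])]
    return w


def accuracy(w, epsilon, n_episodes=500, seed=1):
    """Fraction of held-out episodes in which the kilonova is selected."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_episodes):
        features, kn_index = make_episode(rng)
        hits += choose(w, features, epsilon, rng) == kn_index
    return hits / n_episodes


if __name__ == "__main__":
    w = train()
    print("greedy policy:", accuracy(w, epsilon=0.0))
    print("random policy:", accuracy(w, epsilon=1.0))
```

Setting `epsilon=1.0` in `accuracy` gives the random-strategy baseline, and `epsilon=0.0` evaluates the learned greedy policy. Because the assumed features separate the classes cleanly, a linear Q suffices in this toy case. With realistic, noisy, evolving light curves that need not hold, which is consistent with the limitation noted above.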