Computer Science - Learning; Computer Science - Multiagent Systems
Web browsing agents powered by large language models (LLMs) have shown
tremendous potential in automating complex web-based tasks. Existing approaches
typically rely on large LLMs (e.g., GPT-4o) to explore web environments and
generate trajectory data, which is then used either for demonstration retrieval
(for large LLMs) or to distill small LLMs (e.g., Llama3) in a process that
remains decoupled from the exploration. In this paper, we propose
AgentSymbiotic, an iterative framework that couples data synthesis with
task performance, yielding a "symbiotic improvement" for both large and small
LLMs. Our study uncovers a complementary dynamic between LLM types: while large
LLMs excel at generating high-quality trajectories for distillation, the
distilled small LLMs, owing to their distinct reasoning capabilities, often
choose actions that diverge from those of their larger counterparts. This
divergence drives the exploration of novel trajectories, thereby enriching the
synthesized data. However, we also observe that the performance of small LLMs
becomes a bottleneck in this iterative enhancement process. To address this, we
propose two innovations in LLM distillation: a speculative data synthesis
strategy that mitigates off-policy bias, and a multi-task learning approach
designed to boost the reasoning capabilities of the student LLM. Furthermore,
we introduce a Hybrid Mode for Privacy Preservation to address user privacy
concerns. Evaluated on the WEBARENA benchmark, AgentSymbiotic achieves
state-of-the-art performance with both LLM types. Our best large LLM agent reaches 52%,
surpassing the previous best of 45%, while our 8B distilled model demonstrates
a competitive 49%, exceeding the prior best of 28%. Code will be released upon
acceptance.
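
The iterative coupling described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation: the policies, the success check, and every name here (`large_policy`, `make_small_policy`, `is_success`, `symbiotic_loop`) are hypothetical stand-ins, and the speculative data synthesis and multi-task learning components are not modelled. It only shows the shape of the loop: the large model explores with retrieved demonstrations, a small model is "distilled" from the shared pool, and the small model's divergent actions feed new trajectories back into that pool.

```python
import random

ACTIONS = ["click", "type", "scroll", "stop"]  # toy action space (assumption)

def large_policy(task, demos, rng):
    """Stand-in for the large LLM: reuse a retrieved demonstration if one exists."""
    for demo_task, actions in demos:
        if demo_task == task:
            return list(actions)
    return [rng.choice(ACTIONS) for _ in range(3)]

def make_small_policy(train_set):
    """Stand-in for distillation: the student copies the training data but
    perturbs one step, mimicking the divergent choices of a small model."""
    table = dict(train_set)  # later entries for a task override earlier ones
    def small_policy(task, rng):
        actions = list(table.get(task, [rng.choice(ACTIONS) for _ in range(3)]))
        i = rng.randrange(len(actions))
        actions[i] = rng.choice(ACTIONS)
        return actions
    return small_policy

def is_success(task, actions):
    """Toy success check (assumption): a trajectory succeeds if it ends with 'stop'."""
    return actions[-1] == "stop"

def symbiotic_loop(tasks, rounds=3, seed=0):
    rng = random.Random(seed)
    pool = []  # shared pool of successful (task, actions) trajectories
    for _ in range(rounds):
        # 1. The large model explores, using the pool for demonstration retrieval.
        for task in tasks:
            actions = large_policy(task, pool, rng)
            if is_success(task, actions) and (task, actions) not in pool:
                pool.append((task, actions))
        # 2. A small model is distilled from the current pool.
        small = make_small_policy(pool)
        # 3. The small model explores; novel successful trajectories enrich the pool.
        for task in tasks:
            actions = small(task, rng)
            if is_success(task, actions) and (task, actions) not in pool:
                pool.append((task, actions))
    return pool
```

Calling `symbiotic_loop(["search product", "post comment"], rounds=5)` returns the accumulated pool; in a real system each stub would be replaced by an LLM call, a fine-tuning step, and a benchmark evaluator respectively.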
Details
Title
Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs