Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs

Ruichen Zhang; Mufan Qiu; Zhen Tan; Mohan Zhang; Vincent Lu; Jie Peng; Kaidi Xu; Leandro Z Agudelo; Peter Qian; Tianlong Chen

doi:10.48550/arxiv.2502.07942

Preprint

Open access

11 Feb 2025

DOI: https://doi.org/10.48550/arxiv.2502.07942

Files and links (1)

URL: https://arxiv.org/abs/2502.07942

Preprint (Author's original); arXiv.org - Non-exclusive license to distribute; Open

Abstract

Computer Science - Learning

Computer Science - Multiagent Systems

Web browsing agents powered by large language models (LLMs) have shown tremendous potential in automating complex web-based tasks. Existing approaches typically rely on large LLMs (e.g., GPT-4o) to explore web environments and generate trajectory data, which is then used either for demonstration retrieval (for large LLMs) or to distill small LLMs (e.g., Llama3) in a process that remains decoupled from the exploration. In this paper, we propose AgentSymbiotic, an iterative framework that couples data synthesis with task performance, yielding a "symbiotic improvement" for both large and small LLMs. Our study uncovers a complementary dynamic between LLM types: while large LLMs excel at generating high-quality trajectories for distillation, the distilled small LLMs, owing to their distinct reasoning capabilities, often choose actions that diverge from those of their larger counterparts. This divergence drives the exploration of novel trajectories, thereby enriching the synthesized data. However, we also observe that the performance of small LLMs becomes a bottleneck in this iterative enhancement process. To address this, we propose two innovations in LLM distillation: a speculative data synthesis strategy that mitigates off-policy bias, and a multi-task learning approach designed to boost the reasoning capabilities of the student LLM. Furthermore, we introduce a Hybrid Mode for Privacy Preservation to address user privacy concerns. Evaluated on the WEBARENA benchmark, AgentSymbiotic achieves state-of-the-art (SOTA) performance with both LLM types. Our best large LLM agent reaches 52%, surpassing the previous best of 45%, while our 8B distilled model demonstrates a competitive 49%, exceeding the prior best of 28%. Code will be released upon acceptance.
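
The iterative coupling described in the abstract can be pictured with a toy sketch. This is not the authors' implementation: the policies are deterministic stand-ins rather than LLMs, and every name here (`rollout`, `large_policy`, `distill`, `symbiotic_loop`, `HORIZON`) is hypothetical. It only illustrates the claim that a distilled student which diverges from its teacher adds novel trajectories to a shared pool. Demonstration retrieval for the large model, speculative data synthesis, multi-task learning, and the Hybrid Mode are all omitted.

```python
from typing import Callable, Dict, List, Set, Tuple

Trajectory = Tuple[str, ...]          # one action label per step
Policy = Callable[[str, int], str]    # (task, step) -> action label

HORIZON = 3  # hypothetical fixed episode length


def rollout(policy: Policy, task: str) -> Trajectory:
    """Run a policy for HORIZON steps and record the actions it picks."""
    return tuple(policy(task, step) for step in range(HORIZON))


def large_policy(task: str, step: int) -> str:
    # Stand-in for a large LLM: always takes the same "canonical" action.
    return f"{task}:canonical_{step}"


def distill(pool: Set[Trajectory]) -> Policy:
    """Toy 'distillation': remember one demonstration per task from the pool."""
    demos: Dict[str, Trajectory] = {}
    for traj in sorted(pool):
        task = traj[0].split(":")[0]
        demos.setdefault(task, traj)

    def small_policy(task: str, step: int) -> str:
        # Assumption for illustration: the student imitates its demonstration
        # except at the final step, where it picks a different action.
        if step == HORIZON - 1:
            return f"{task}:alternative_{step}"
        return demos[task][step]

    return small_policy


def symbiotic_loop(tasks: List[str], rounds: int = 2) -> Set[Trajectory]:
    """Alternate teacher exploration, distillation, and student exploration."""
    pool: Set[Trajectory] = set()
    for _ in range(rounds):
        for task in tasks:                      # 1. large model explores
            pool.add(rollout(large_policy, task))
        small_policy = distill(pool)            # 2. student built from the pool
        for task in tasks:                      # 3. student explores, may diverge
            pool.add(rollout(small_policy, task))
    return pool


if __name__ == "__main__":
    for traj in sorted(symbiotic_loop(["shop", "forum"])):
        print(traj)
```

In this toy setting the teacher alone contributes one trajectory per task, and the student's divergent final action contributes a second one, so the shared pool ends up larger than teacher-only exploration would produce.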

Details

Title: Symbiotic Cooperation for Web Agents: Harnessing Complementary Strengths of Large and Small LLMs
Creators: Ruichen Zhang; Mufan Qiu; Zhen Tan; Mohan Zhang; Vincent Lu; Jie Peng; Kaidi Xu; Leandro Z Agudelo; Peter Qian; Tianlong Chen
Resource Type: Preprint
Language: English
Academic Unit: Computer Science
Other Identifier: 991022028082404721
