DroidRun

DroidRun is an open-source framework for controlling Android and iOS devices through LLM-driven agents. It turns natural-language commands into device interactions, supports multiple LLM providers (OpenAI, Anthropic, Gemini, Ollama, DeepSeek), and offers planning for multi-step tasks, screenshot-based visual understanding, execution tracing, and an extensible Python API for custom automations.

Introduction

DroidRun — LLM-driven mobile automation framework

DroidRun is an open-source framework that gives LLM agents native-like control over physical and virtual mobile devices (Android and iOS). Designed for developers, QA engineers and product teams, DroidRun turns natural language instructions into concrete UI actions, combining planning, vision (screenshot analysis), and multi-provider LLM support to automate complex mobile tasks.

Key features
  • Multi-LLM support: Works with OpenAI, Anthropic, Google Gemini, Ollama, DeepSeek and other providers, allowing users to swap or combine models.
  • Natural-language control: Users or programs can describe desired outcomes in plain language; the agent plans and executes the necessary UI steps.
  • Planning & multi-step tasks: Built-in planning capabilities let agents decompose and execute multi-step workflows reliably.
  • Screenshot analysis: Visual understanding using screenshots helps the agent find UI elements and make context-aware decisions.
  • CLI & Python API: Provides an easy-to-use CLI for quick experimentation and a Python API for integrating into test suites, automation pipelines, and custom tooling (see the sketch after this list).
  • Execution tracing & observability: Tracing integrations (e.g., Arize Phoenix, referenced in the README) help inspect actions, debug failures, and audit agent behavior.
  • Security checks: Repository integrates tools like bandit and safety to detect common security issues in code and dependencies.
  • Extensible: Plugin-like or API-driven extensions let teams add custom steps, connectors, or adaptors for specific device farms or cloud providers.
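
To make the Python API concrete, here is a minimal sketch of driving a device toward a natural-language goal. The names below (DroidAgent, AdbTools, load_llm) and their parameters are assumptions based on the project's documentation and may not match the current API exactly; consult the official docs before relying on them.

import asyncio

# Assumed entry points; verify against the current DroidRun release.
from droidrun import DroidAgent, AdbTools
from droidrun import load_llm  # hypothetical helper for picking an LLM provider

async def main():
    # Tools that talk to a connected Android device (e.g., over ADB).
    tools = AdbTools()

    # Any supported provider can back the agent (OpenAI, Anthropic, Gemini, Ollama, DeepSeek).
    llm = load_llm(provider="openai", model="gpt-4o")

    # The agent plans multi-step UI actions toward the goal, using
    # screenshots of the device for context along the way.
    agent = DroidAgent(
        goal="Open the settings app and enable dark mode",
        llm=llm,
        tools=tools,
    )
    result = await agent.run()
    print(result)

if __name__ == "__main__":
    asyncio.run(main())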

Typical use cases
  • Automated UI testing of mobile apps (functional and regression tests driven by high-level specs; see the test sketch after this list).
  • Automating repetitive user tasks on mobile devices (e.g., data entry, routine app workflows).
  • Remote assistance and guided workflows for non-technical users.
  • Rapid prototyping of mobile-agent workflows that combine vision, planning, and LLM reasoning.
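
As an illustration of the UI-testing use case, the sketch below wraps the same assumed DroidAgent API in a pytest test. It relies on the pytest-asyncio plugin for the async test, and the result structure checked at the end is also an assumption:

import pytest

# Assumed entry points, as in the earlier sketch; adapt to the actual API.
from droidrun import DroidAgent, AdbTools, load_llm

@pytest.mark.asyncio  # requires the pytest-asyncio plugin
async def test_login_flow_shows_home_screen():
    tools = AdbTools()
    llm = load_llm(provider="openai", model="gpt-4o")

    # A high-level spec instead of hand-coded UI steps.
    agent = DroidAgent(
        goal=(
            "Open the demo app, log in with the test account, "
            "and confirm the home screen greets the user by name"
        ),
        llm=llm,
        tools=tools,
    )

    result = await agent.run()

    # Assert on whatever success signal the agent reports
    # (a boolean success flag is assumed here).
    assert result["success"]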

Quickstart & installation

Install via pip (example from the README):

pip install 'droidrun[google,anthropic,openai,deepseek,ollama,dev]'
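
The bracketed extras pull in optional provider backends and development tooling; installing only the extras you need also works, for example:

pip install 'droidrun[openai]'

The package also ships a CLI for quick experiments. The exact invocation may differ between versions (check droidrun --help), but a goal-driven run looks roughly like:

droidrun "Open the settings app and enable dark mode"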

Full quickstart and docs are provided at the official docs site; the repo includes demo videos and benchmark links for evaluating capabilities.

Demos & benchmark

The project README links several demo videos (e.g., accommodation booking, trend hunting, streak saver) and a public benchmark page showing performance metrics. There is also a cloud offering and documentation for getting started quickly.

License & contribution

DroidRun is licensed under the MIT License. The project welcomes contributions via GitHub pull requests and includes guidance for security scanning and testing.

Why it matters

DroidRun bridges LLM reasoning and real-world mobile UIs, enabling higher-level automation without hand-coding each UI step. By combining multi-model support, screenshot analysis, and planning, it supports robust agent-driven mobile automation workflows suited to testing, productivity automation, and assistive technologies.

Information

  • Website: github.com
  • Authors: droidrun (GitHub: droidrun)
  • Published date: 2025/04/12
