This guide shows you how to configure and use different AI models with AskUI to optimize performance for your specific automation tasks. For a deeper understanding of how these models work and their capabilities, see AI Models.

Overview

AskUI supports multiple AI model providers, each with different strengths:
  • AskUI Models: Fast, production-ready models optimized for UI automation
  • Anthropic Models: Advanced language models for complex reasoning tasks
  • Hugging Face Models: Open-source models for experimentation
  • Self-Hosted Models: Custom models you host yourself

Quick Reference

Platform        Available Models
AskUI           askui, askui-combo, askui-pta, askui-ocr, askui-ai-element
Anthropic       anthropic-claude-3-5-sonnet-20241022
Hugging Face    AskUI/PTA-1, OS-Copilot/OS-Atlas-Base-7B, showlab/ShowUI-2B, Qwen/Qwen2-VL-2B-Instruct, Qwen/Qwen2-VL-7B-Instruct
Self-Hosted     UI-Tars

For detailed specifications of all available AI models, see the AI Models Reference.
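
Because model names are case-sensitive, it can help to check a name before passing it to a command. The following is a minimal sketch that copies the names from the table above into a plain dictionary. `MODELS_BY_PLATFORM` and `find_platform` are illustrative helpers, not part of the AskUI SDK, and the dictionary only reflects this page, not necessarily everything the SDK accepts.

```python
from typing import Optional

# Model names copied from the Quick Reference table above.
# This dictionary is an illustrative helper, not part of the AskUI SDK.
MODELS_BY_PLATFORM = {
    "AskUI": ["askui", "askui-combo", "askui-pta", "askui-ocr", "askui-ai-element"],
    "Anthropic": ["anthropic-claude-3-5-sonnet-20241022"],
    "Hugging Face": [
        "AskUI/PTA-1",
        "OS-Copilot/OS-Atlas-Base-7B",
        "showlab/ShowUI-2B",
        "Qwen/Qwen2-VL-2B-Instruct",
        "Qwen/Qwen2-VL-7B-Instruct",
    ],
    "Self-Hosted": ["UI-Tars"],
}


def find_platform(model: str) -> Optional[str]:
    """Return the platform that lists `model`, or None if the name is unknown.

    The comparison is exact, so "AskUI-OCR" does not match "askui-ocr".
    """
    for platform, names in MODELS_BY_PLATFORM.items():
        if model in names:
            return platform
    return None


if __name__ == "__main__":
    print(find_platform("askui-ocr"))
    print(find_platform("AskUI-OCR"))
```

A `None` result is a cue to re-check spelling and capitalization before running the automation.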

Step 1: Choose Your Model Provider

First, decide which model provider best fits your needs:

When to Use AskUI Models

  • Production environments: Fast, reliable, and enterprise-ready
  • Standard UI automation: Optimized for clicking, typing, and data extraction
  • Cost-conscious projects: Lower cost per operation

When to Use Anthropic Models

  • Complex reasoning tasks: Advanced decision-making capabilities
  • Natural language interactions: Better understanding of complex instructions
  • Experimental projects: Cutting-edge AI capabilities

When to Use Hugging Face Models

  • Open-source requirements: Community-driven development
  • Research and experimentation: Access to latest research models
  • Budget constraints: Free tier available (rate-limited)

Step 2: Set Up Authentication

Configure authentication for your chosen model provider:
AskUI models require workspace credentials from your AskUI account.

Required Environment Variables:
  • ASKUI_WORKSPACE_ID
  • ASKUI_TOKEN
Get Your Credentials:
  1. Sign in to app.caesr.ai
  2. Navigate to your workspace settings
  3. Copy your Workspace ID and generate an access token

Step 3: Configure Environment Variables

Set the required environment variables for your operating system. The commands below use the export syntax of macOS and Linux shells (bash, zsh):
# For AskUI models
export ASKUI_WORKSPACE_ID=<your-workspace-id>
export ASKUI_TOKEN=<your-access-token>

# For Anthropic models
export ANTHROPIC_API_KEY=<your-api-key>
Add these to your ~/.bashrc or ~/.zshrc file to persist across sessions.
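
Before running an automation, you may want to confirm that the variables are actually visible to Python. The sketch below uses only the standard library and the variable names from the export commands above. `REQUIRED_VARS` and `check_env` are hypothetical helpers, not part of the AskUI SDK. The check also flags stray whitespace, a common copy-and-paste problem with tokens.

```python
import os

# Variable names taken from the export commands above.
# The provider labels ("askui", "anthropic") are illustrative keys only.
REQUIRED_VARS = {
    "askui": ["ASKUI_WORKSPACE_ID", "ASKUI_TOKEN"],
    "anthropic": ["ANTHROPIC_API_KEY"],
}


def check_env(provider, env=None):
    """Return a list of human-readable problems for `provider` (empty if none)."""
    env = os.environ if env is None else env
    problems = []
    for name in REQUIRED_VARS[provider]:
        value = env.get(name)
        if not value:
            problems.append(f"{name} is not set")
        elif value != value.strip():
            problems.append(f"{name} has leading or trailing whitespace")
    return problems


if __name__ == "__main__":
    for provider in REQUIRED_VARS:
        issues = check_env(provider)
        print(provider, "OK" if not issues else issues)
```

Passing a dictionary as `env` makes the function easy to test without touching the real environment.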

Step 4: Use Models in Your Code

Specify which model to use by adding the model parameter to your commands:

Basic Model Usage

from askui import VisionAgent

with VisionAgent() as agent:
    # Use AskUI's default model (recommended for most cases)
    agent.click("login button")
    
    # Use a specific AskUI model
    agent.click("search field", model="askui-ocr")
    
    # Use Anthropic's model for complex reasoning
    agent.act("Fill out this form with realistic test data", model="anthropic-claude-3-5-sonnet-20241022")

Model Selection Strategy

Choose models based on your task requirements:
from askui import VisionAgent

with VisionAgent() as agent:
    # For simple element clicking, use fast AskUI models
    agent.click("submit button", model="askui-pta")

    # For text extraction, use OCR-optimized models
    text = agent.get("user name field", model="askui-ocr")

    # For complex multi-step tasks, use reasoning models
    agent.act("Navigate to settings and enable dark mode", model="anthropic-claude-3-5-sonnet-20241022")

    # For experimental features, try Hugging Face models
    agent.click("menu icon", model="AskUI/PTA-1")
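
If the same choices recur across a test suite, one option is to keep them in a single lookup so that a model can be swapped in one place. This is a sketch only: the task labels (`"click"`, `"text"`, and so on) and the `pick_model` helper are hypothetical, and treating `"askui"` as the fallback is an assumption based on the examples in this guide.

```python
# Task labels are hypothetical; the model names come from the examples above.
MODEL_FOR_TASK = {
    "click": "askui-pta",
    "text": "askui-ocr",
    "multi_step": "anthropic-claude-3-5-sonnet-20241022",
    "experimental": "AskUI/PTA-1",
}

# Assumed fallback for task kinds that are not listed above.
DEFAULT_MODEL = "askui"


def pick_model(task_kind: str) -> str:
    """Return the model name configured for `task_kind`, or the fallback."""
    return MODEL_FOR_TASK.get(task_kind, DEFAULT_MODEL)


if __name__ == "__main__":
    for kind in ("click", "text", "multi_step", "something_else"):
        print(kind, "->", pick_model(kind))
```

Inside a `with VisionAgent() as agent:` block, a call would then look like `agent.click("submit button", model=pick_model("click"))`.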

Step 5: Verify Your Configuration

Test your model configuration with a simple script:
import os

from askui import VisionAgent


def test_model_configuration():
    with VisionAgent() as agent:
        # Test AskUI model
        try:
            agent.click("desktop", model="askui")
            print("✓ AskUI models configured correctly")
        except Exception as e:
            print(f"✗ AskUI configuration error: {e}")

        # Test Anthropic model only if an API key is configured
        if os.environ.get("ANTHROPIC_API_KEY"):
            try:
                agent.act("Take a screenshot", model="anthropic-claude-3-5-sonnet-20241022")
                print("✓ Anthropic models configured correctly")
            except Exception as e:
                print(f"✗ Anthropic configuration error: {e}")


if __name__ == "__main__":
    test_model_configuration()

Troubleshooting

Authentication errors:
  • Verify environment variables are set correctly
  • Check for extra spaces in API keys
  • Ensure API keys are valid and not expired
  • Restart your terminal/IDE after setting variables

Model not found or not supported:
  • Check model name spelling (case-sensitive)
  • Verify the model is supported for your command
  • Some models may not support all AskUI commands

Rate limits:
  • Hugging Face models have rate limits
  • Consider upgrading to paid tiers for higher limits
  • Implement retry logic with exponential backoff
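
For the last point, retry with exponential backoff can be sketched with the standard library alone. `retry_with_backoff` is a hypothetical helper, not an AskUI function. This guide does not say which exception the SDK raises on a rate limit, so the sketch retries on any `Exception` by default; narrow `retry_on` once you know the actual type.

```python
import random
import time


def retry_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0,
                       retry_on=(Exception,), sleep=time.sleep):
    """Call `call()` until it succeeds or `max_attempts` is reached.

    Between attempts, waits base_delay * 2**attempt seconds (capped at
    max_delay) plus up to 10% random jitter. Re-raises the last error
    once the attempts are used up.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return call()
        except retry_on:
            if attempt == max_attempts - 1:
                raise
            delay = min(base_delay * 2 ** attempt, max_delay)
            sleep(delay + random.uniform(0, delay * 0.1))
```

Inside a `with VisionAgent() as agent:` block, a rate-limited call could be wrapped as `retry_with_backoff(lambda: agent.click("menu icon", model="AskUI/PTA-1"))`. The `sleep` parameter exists so that tests can replace the real delay.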

Next Steps

Now that you have models configured:
  1. Optimize Performance: Learn about model selection best practices
  2. Advanced Usage: Explore agentic workflows
  3. Production Deployment: Review enterprise considerations