Getting Started

What is Iris?

Iris is an advanced browser automation and Robotic Process Automation (RPA) platform that combines visual recognition, AI-driven analysis, and browser control capabilities. It allows you to:

Automate repetitive browser tasks with intelligent processing
Record and replay browser sessions with precise timing
Process videos of browser interactions for RPA analysis
Parameterize workflows for data-driven automation
Access browser sessions remotely through VNC interfaces

The platform is designed both for developers building automation tools and for end users who need to automate repetitive web tasks without writing code.

Key Features

Browser Automation: Control web browsers programmatically through an API
Session Recording: Capture and replay user interactions with browsers
Video Processing: Analyze recorded sessions with AI vision models
RPA Execution: Convert recorded sessions into replayable automation workflows
Parameterization: Create dynamic workflows with variable inputs
Remote Access: View and control automated browsers through VNC
RESTful API: Comprehensive API with OpenAPI documentation

Prerequisites

Node.js 20.x or later
PNPM 8.x or later (recommended package manager)
Docker and Docker Compose (for containerized setup)

Installation

Local Development Setup

Clone the repository:

git clone <repository-url>
cd iris

Install dependencies:

pnpm install

Set up environment variables:

cp .env.example .env

Edit the .env file with your configuration:

# Server Configuration
PORT=3000
HOST=0.0.0.0

# VLM Configuration
VLM_BASE_URL=...           # Visual Language Model API URL
VLM_API_KEY=...            # API key for VLM service
VLM_MODEL_NAME=tgi
VLM_PROVIDER=ui_tars_1_5

# Application Settings
LANGUAGE=en
MAX_LOOP_COUNT=10
LOOP_INTERVAL_MS=1000
DEFAULT_OPERATOR=browser
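
As a rough illustration of how the settings above might be read and validated at startup, here is a minimal sketch. The names `IrisConfig`, `loadConfig`, `intSetting`, and `required` are hypothetical and not part of Iris; the defaults simply mirror the example values shown above, and treating the two VLM connection settings as mandatory is an assumption.

```typescript
// Hypothetical shape for the settings shown in the .env example above.
interface IrisConfig {
  port: number;
  host: string;
  vlmBaseUrl: string;
  vlmApiKey: string;
  vlmModelName: string;
  vlmProvider: string;
  language: string;
  maxLoopCount: number;
  loopIntervalMs: number;
  defaultOperator: string;
}

type Env = Record<string, string | undefined>;

// Parse a non-negative integer setting, falling back to a default when unset.
function intSetting(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const value = Number(raw);
  if (!Number.isInteger(value) || value < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return value;
}

// Read a required string setting (assumption: the VLM URL and key are mandatory).
function required(env: Env, key: string): string {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") {
    throw new Error(`Missing required environment variable ${key}`);
  }
  return raw;
}

// Build a typed config object; defaults mirror the example .env values.
function loadConfig(env: Env): IrisConfig {
  return {
    port: intSetting(env, "PORT", 3000),
    host: env["HOST"] ?? "0.0.0.0",
    vlmBaseUrl: required(env, "VLM_BASE_URL"),
    vlmApiKey: required(env, "VLM_API_KEY"),
    vlmModelName: env["VLM_MODEL_NAME"] ?? "tgi",
    vlmProvider: env["VLM_PROVIDER"] ?? "ui_tars_1_5",
    language: env["LANGUAGE"] ?? "en",
    maxLoopCount: intSetting(env, "MAX_LOOP_COUNT", 10),
    loopIntervalMs: intSetting(env, "LOOP_INTERVAL_MS", 1000),
    defaultOperator: env["DEFAULT_OPERATOR"] ?? "browser",
  };
}
```

In a Node.js process this would typically be called as `loadConfig(process.env)`, so that a missing key fails fast at startup rather than during the first VLM request.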

Docker Setup

For a containerized setup:

docker-compose up --build

Running the Application

Development Mode

pnpm run start:dev

Production Mode

pnpm run build
pnpm run start:prod

Accessing the Application

After starting the application, you can access:

Web UI: http://localhost:3000/operator-ui.html
API Documentation (Swagger): http://localhost:3000/api/docs
API Reference (Scalar): http://localhost:3000/api/reference
VNC Interface: http://localhost:6901/vnc.html (when using Docker)
Direct VNC connection: localhost:5901 (when using Docker)

Core Functionality

Browser Automation

Iris provides two main operator types for automation:

Browser Operator: For web browser automation
Computer Operator: For desktop automation (using @ui-tars/operator-nut-js)

You can create sessions, execute actions, and record interactions through the API.

RPA Workflow

Record a Session: Capture browser interactions as a recording
Process the Recording: Extract actions and metadata
Parameterize if Needed: Add variable inputs to the workflow
Execute RPA: Replay the recorded actions automatically
Monitor Execution: Track progress and handle errors
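
To make the parameterization step more concrete, the sketch below substitutes variable inputs into recorded steps before replay. It is only an illustration: the `RecordedStep` shape, the `applyParameters` function, and the `{{name}}` placeholder syntax are all assumptions, and Iris's actual recording format may differ.

```typescript
// Hypothetical recorded step: an action name plus string arguments.
interface RecordedStep {
  action: string;
  args: Record<string, string>;
}

// Replace {{name}} placeholders (assumed syntax) in every argument
// with the matching value from params, returning new step objects.
function applyParameters(
  steps: RecordedStep[],
  params: Record<string, string>
): RecordedStep[] {
  return steps.map((step) => {
    const args: Record<string, string> = {};
    for (const key of Object.keys(step.args)) {
      const raw = step.args[key] ?? "";
      args[key] = raw.replace(/\{\{(\w+)\}\}/g, (_match, name: string) => {
        const value = params[name];
        if (value === undefined) {
          // Fail loudly instead of replaying a step with a hole in it.
          throw new Error(`Missing value for parameter "${name}"`);
        }
        return value;
      });
    }
    return { action: step.action, args };
  });
}
```

Running the same recording once per row of input data, each time with a different `params` object, is what makes the workflow data-driven.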

API Endpoints

/api/sessions - Session management
/api/config - Configuration management
/api/operators - Operator management
/api/rpa - RPA execution and management
/api/docs - Swagger API documentation
/api/reference - Scalar API Reference documentation
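
A client that talks to these endpoints might centralize the paths in one place. The following is a minimal sketch; `API_PATHS` and `apiUrl` are hypothetical helper names, and only the base paths themselves come from the list above.

```typescript
// Base paths taken from the endpoint list above.
const API_PATHS = {
  sessions: "/api/sessions",
  config: "/api/config",
  operators: "/api/operators",
  rpa: "/api/rpa",
} as const;

type ApiArea = keyof typeof API_PATHS;

// Join a base URL, an API area, and optional path segments into one URL.
// Segments are percent-encoded so IDs cannot alter the path structure.
function apiUrl(baseUrl: string, area: ApiArea, ...segments: string[]): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const tail = segments.map((s) => "/" + encodeURIComponent(s)).join("");
  return root + API_PATHS[area] + tail;
}
```

For example, `apiUrl("http://localhost:3000", "sessions", sessionId, "execute")` yields the execute URL for a given session.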

Example Workflow

Here's a typical workflow for using Iris:

Create a new browser automation session:
```
POST /api/sessions
{
  "operatorType": "browser"
}
```

Execute browser actions:

```
POST /api/sessions/{sessionId}/execute
{
  "action": "navigate",
  "url": "https://example.com"
}
```

Record the session for later replay:

```
POST /api/sessions/{sessionId}/record
{
  "recordingName": "example-workflow"
}
```

Execute the recording as an RPA workflow:

```
POST /api/rpa/execute
{
  "recordingId": "example-workflow",
  "actionDelay": 1000
}
```
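
The four calls above could be chained from a script roughly as follows. This is a sketch under stated assumptions, not Iris's client API: `ApiRequest`, `Send`, `sessionRequests`, and `runWorkflow` are hypothetical names, and the code assumes the create-session response carries a `sessionId` field, which the documentation above does not confirm.

```typescript
interface ApiRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

// Build the calls that follow session creation, in the order shown above.
function sessionRequests(
  sessionId: string,
  url: string,
  recordingName: string
): ApiRequest[] {
  const base = `/api/sessions/${encodeURIComponent(sessionId)}`;
  return [
    { method: "POST", path: `${base}/execute`, body: { action: "navigate", url } },
    { method: "POST", path: `${base}/record`, body: { recordingName } },
    {
      method: "POST",
      path: "/api/rpa/execute",
      // As in the example above, the recording name doubles as the recording ID.
      body: { recordingId: recordingName, actionDelay: 1000 },
    },
  ];
}

// The transport is injected so the sketch does not assume a particular HTTP client.
type Send = (request: ApiRequest) => Promise<unknown>;

async function runWorkflow(send: Send, url: string, recordingName: string): Promise<void> {
  const created = (await send({
    method: "POST",
    path: "/api/sessions",
    body: { operatorType: "browser" },
  })) as { sessionId?: string };
  // Assumption: the response exposes the new session's ID as `sessionId`.
  if (!created.sessionId) throw new Error("No sessionId in create-session response");
  for (const request of sessionRequests(created.sessionId, url, recordingName)) {
    await send(request); // run sequentially: each step depends on the previous one
  }
}
```

On Node.js 20, `send` could be a thin wrapper around the built-in `fetch` that prefixes `http://localhost:3000`, sets a JSON content type, and returns the parsed response body.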

Testing

# Run unit tests
pnpm run test

# Run e2e tests
pnpm run test:e2e

# Run test coverage
pnpm run test:cov

Security Considerations

When implementing and using the RPA video processing feature:

Validate all uploaded videos for potential security risks
Implement size and format restrictions for uploads
Ensure sensitive content in videos is handled appropriately
Implement access controls for generated RPA steps and recordings
Store API keys securely using environment variables
Sanitize user-supplied content before processing
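
The size and format restrictions mentioned above might be enforced with a check like the one below before a video is accepted for processing. The limit, the extension list, and the names `UploadInfo` and `validateUpload` are illustrative assumptions rather than Iris defaults.

```typescript
// Hypothetical limits; choose values that suit your deployment.
const MAX_UPLOAD_BYTES = 500 * 1024 * 1024; // 500 MiB
const ALLOWED_EXTENSIONS = ["mp4", "webm", "mov"];

interface UploadInfo {
  fileName: string;
  sizeBytes: number;
}

// Return a list of problems; an empty list means the upload passed these checks.
function validateUpload(upload: UploadInfo): string[] {
  const problems: string[] = [];

  // Reject names that could escape the upload directory.
  if (/[\/\\]/.test(upload.fileName) || upload.fileName.indexOf("..") !== -1) {
    problems.push("File name must not contain path separators or '..'");
  }

  const dot = upload.fileName.lastIndexOf(".");
  const ext = dot >= 0 ? upload.fileName.slice(dot + 1).toLowerCase() : "";
  if (ALLOWED_EXTENSIONS.indexOf(ext) === -1) {
    problems.push(`Unsupported format "${ext}"`);
  }

  if (upload.sizeBytes <= 0) {
    problems.push("File is empty");
  } else if (upload.sizeBytes > MAX_UPLOAD_BYTES) {
    problems.push(`File exceeds ${MAX_UPLOAD_BYTES} bytes`);
  }

  return problems;
}
```

An extension check alone does not prove a file is a real video, so a production setup would usually also inspect the file contents before handing them to a decoder.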

Troubleshooting

For browser automation issues, open the VNC interface to watch what the browser is doing in real time
Check the server logs for detailed error messages
Ensure your browser automation actions are compatible with the target website's structure