Getting Started
What is Iris
Iris is an advanced browser automation and Robotic Process Automation (RPA) platform that combines visual recognition, AI-driven analysis, and browser control capabilities. It allows you to:
- Automate repetitive browser tasks with intelligent processing
- Record and replay browser sessions with precise timing
- Process videos of browser interactions for RPA analysis
- Parameterize workflows for data-driven automation
- Access browser sessions remotely through VNC interfaces
 
The platform is designed for both developers building automation tools and end-users who need to automate repetitive web tasks without coding.
Key Features
- Browser Automation: Control web browsers programmatically through an API
- Session Recording: Capture and replay user interactions with browsers
- Video Processing: Analyze recorded sessions with AI vision models
- RPA Execution: Convert recorded sessions into replayable automation workflows
- Parameterization: Create dynamic workflows with variable inputs
- Remote Access: View and control automated browsers through VNC
- RESTful API: Comprehensive API with OpenAPI documentation
 
Prerequisites
- Node.js 20.x or later
- pnpm 8.x or later (recommended package manager)
- Docker and Docker Compose (for containerized setup)
 
Installation
Local Development Setup
1. Clone the repository:

git clone <repository-url>
cd iris

2. Install dependencies:

pnpm install

3. Set up environment variables:

cp .env.example .env

4. Edit the .env file with your configuration:

# Server Configuration
PORT=3000
HOST=0.0.0.0
# VLM Configuration
VLM_BASE_URL=...           # Visual Language Model API URL
VLM_API_KEY=...            # API key for VLM service
VLM_MODEL_NAME=tgi
VLM_PROVIDER=ui_tars_1_5
# Application Settings
LANGUAGE=en
MAX_LOOP_COUNT=10
LOOP_INTERVAL_MS=1000
DEFAULT_OPERATOR=browser
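
To illustrate how settings like these could be checked before the server starts, here is a minimal sketch that parses `.env`-style text and validates a few of the keys shown above. The `parseEnv` and `loadIrisConfig` helpers and the `IrisConfig` shape are hypothetical and not part of Iris. Treating the two VLM settings as required, and reusing the sample values as fallbacks, are assumptions made for the example.

```typescript
// Hypothetical config loader; not part of Iris.
interface IrisConfig {
  port: number;
  host: string;
  vlmBaseUrl: string;
  vlmApiKey: string;
  maxLoopCount: number;
  loopIntervalMs: number;
  defaultOperator: string;
}

// Parse KEY=VALUE lines, skipping blank lines and full-line comments.
// A trailing "  # note" (whitespace, then '#') is treated as a comment.
function parseEnv(text: string): Record<string, string> {
  const vars: Record<string, string> = {};
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line === "" || line.charAt(0) === "#") continue;
    const eq = line.indexOf("=");
    if (eq <= 0) continue; // no '=' at all, or an empty key
    const key = line.slice(0, eq).trim();
    vars[key] = line.slice(eq + 1).replace(/\s+#.*$/, "").trim();
  }
  return vars;
}

// Validate the parsed values. Assumption: VLM_BASE_URL and VLM_API_KEY are
// mandatory; every other key falls back to the sample value from this guide.
function loadIrisConfig(vars: Record<string, string>): IrisConfig {
  const need = (key: string): string => {
    const value = vars[key];
    if (!value) throw new Error("Missing required setting: " + key);
    return value;
  };
  const int = (key: string, fallback: number): number => {
    const raw = vars[key];
    if (!raw) return fallback;
    const n = Number(raw);
    if (!isFinite(n) || Math.floor(n) !== n || n < 0) {
      throw new Error(key + " must be a non-negative integer, got: " + raw);
    }
    return n;
  };
  return {
    port: int("PORT", 3000),
    host: vars["HOST"] || "0.0.0.0",
    vlmBaseUrl: need("VLM_BASE_URL"),
    vlmApiKey: need("VLM_API_KEY"),
    maxLoopCount: int("MAX_LOOP_COUNT", 10),
    loopIntervalMs: int("LOOP_INTERVAL_MS", 1000),
    defaultOperator: vars["DEFAULT_OPERATOR"] || "browser",
  };
}
```

Calling `loadIrisConfig(parseEnv(fileContents))` either returns a typed object or throws an error naming the offending key, so a misconfigured `.env` file is caught at startup and not midway through an automation run.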
Docker Setup
For a containerized setup:

docker-compose up --build

Running the Application
Development Mode

pnpm run start:dev

Production Mode

pnpm run build
pnpm run start:prod

Accessing the Application
After starting the application, you can access:
- API Documentation (Swagger): http://localhost:3000/api/docs
- API Reference (Scalar): http://localhost:3000/api/reference
- VNC Interface: http://localhost:6901/vnc.html (when using Docker)
- Direct VNC connection: localhost:5901 (when using Docker)
 
Core Functionality
Browser Automation
Iris provides two main operator types for automation:
- Browser Operator: For web browser automation
- Computer Operator: For desktop automation (using @ui-tars/operator-nut-js)
 
You can create sessions, execute actions, and record interactions through the API.
RPA Workflow
1. Record a Session: Capture browser interactions as a recording
2. Process the Recording: Extract actions and metadata
3. Parameterize if Needed: Add variable inputs to the workflow
4. Execute RPA: Replay the recorded actions automatically
5. Monitor Execution: Track progress and handle errors
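
The parameterization step can be pictured as template substitution over the recorded steps. The sketch below is only an illustration of that idea: the `RecordedStep` shape, the `{{name}}` placeholder syntax, and both helper functions are hypothetical, since this guide does not show the format Iris actually uses for recordings or parameters.

```typescript
// Hypothetical step shape; Iris's real recording format is not shown here.
interface RecordedStep {
  action: string;
  value?: string;
}

// Replace every {{name}} placeholder with the matching parameter value.
// Throws on an unknown placeholder so a bad data row fails loudly.
function fillTemplate(template: string, params: Record<string, string>): string {
  return template.replace(/\{\{\s*(\w+)\s*\}\}/g, (_match: string, key: string) => {
    const value = params[key];
    if (value === undefined || !Object.prototype.hasOwnProperty.call(params, key)) {
      throw new Error("No value supplied for parameter: " + key);
    }
    return value;
  });
}

// Return a copy of the steps with placeholders filled in; steps without a
// value (a plain click, for instance) are passed through unchanged.
function parameterize(steps: RecordedStep[], params: Record<string, string>): RecordedStep[] {
  return steps.map((step) =>
    step.value === undefined ? step : { ...step, value: fillTemplate(step.value, params) }
  );
}
```

Running `parameterize` once per row of input data yields one concrete workflow per row, which is the sense in which a parameterized recording becomes data-driven.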
 
API Endpoints
- /api/sessions - Session management
- /api/config - Configuration management
- /api/operators - Operator management
- /api/rpa - RPA execution and management
- /api/docs - Swagger API documentation
- /api/reference - Scalar API Reference documentation
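
When calling these endpoints from your own code, it can help to keep the route prefixes in one typed table instead of scattering string literals. The sketch below shows one way to do that. The `API_GROUPS` table and `apiUrl` helper are hypothetical names, and the default base URL is an assumption taken from the sample `PORT=3000` configuration.

```typescript
// Route prefixes as listed above.
const API_GROUPS = {
  sessions: "/api/sessions",
  config: "/api/config",
  operators: "/api/operators",
  rpa: "/api/rpa",
  docs: "/api/docs",
  reference: "/api/reference",
} as const;

type ApiGroup = keyof typeof API_GROUPS;

// Build a full URL for an endpoint group, tolerating stray slashes on
// either side of the join. The default base URL is an assumption.
function apiUrl(group: ApiGroup, subPath = "", baseUrl = "http://localhost:3000"): string {
  const base = baseUrl.replace(/\/+$/, "");
  const suffix = subPath.replace(/^\/+/, "");
  return base + API_GROUPS[group] + (suffix === "" ? "" : "/" + suffix);
}
```

Because `ApiGroup` is derived from the table, a typo such as `apiUrl("session")` is rejected at compile time.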
Example Workflow
Here's a typical workflow for using Iris:
1. Create a new browser automation session:

POST /api/sessions
{ "operatorType": "browser" }

2. Execute browser actions:

POST /api/sessions/{sessionId}/execute
{ "action": "navigate", "url": "https://example.com" }

3. Record the session for later replay:

POST /api/sessions/{sessionId}/record
{ "recordingName": "example-workflow" }

4. Execute the recording as an RPA workflow:

POST /api/rpa/execute
{ "recordingId": "example-workflow", "actionDelay": 1000 }
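
The four calls above can be wrapped in small request builders, sketched below. The paths and payload fields are copied from the steps in this section; everything else is an assumption made for illustration, including the `ApiRequest` shape, the function names, and the JSON content type. The response format is not documented here, so extracting the session ID from the create-session response is left to the caller.

```typescript
// A plain description of one HTTP call. Sending it is left to the caller,
// for example with the fetch built into Node.js 20:
//   await fetch(req.url, { method: req.method, headers: req.headers, body: req.body });
interface ApiRequest {
  method: "POST";
  url: string;
  headers: Record<string, string>;
  body: string; // JSON-encoded payload
}

function post(baseUrl: string, path: string, payload: object): ApiRequest {
  return {
    method: "POST",
    url: baseUrl.replace(/\/+$/, "") + path,
    headers: { "Content-Type": "application/json" }, // assumed content type
    body: JSON.stringify(payload),
  };
}

// Step 1: create a browser session.
function createSession(baseUrl: string): ApiRequest {
  return post(baseUrl, "/api/sessions", { operatorType: "browser" });
}

// Step 2: navigate the session's browser to a URL.
function navigateTo(baseUrl: string, sessionId: string, url: string): ApiRequest {
  const path = "/api/sessions/" + encodeURIComponent(sessionId) + "/execute";
  return post(baseUrl, path, { action: "navigate", url: url });
}

// Step 3: record the session under a name.
function recordSession(baseUrl: string, sessionId: string, recordingName: string): ApiRequest {
  const path = "/api/sessions/" + encodeURIComponent(sessionId) + "/record";
  return post(baseUrl, path, { recordingName: recordingName });
}

// Step 4: replay a recording. The unit of actionDelay is not stated in this guide.
function executeRpa(baseUrl: string, recordingId: string, actionDelay: number): ApiRequest {
  return post(baseUrl, "/api/rpa/execute", { recordingId: recordingId, actionDelay: actionDelay });
}
```

Keeping request construction separate from sending makes each step easy to log or unit test without a running server.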
Testing
# Run unit tests
pnpm run test
# Run e2e tests
pnpm run test:e2e
# Run test coverage
pnpm run test:cov
Security Considerations
When implementing and using the RPA video processing feature:
- Validate all uploaded videos for potential security risks
- Implement size and format restrictions for uploads
- Ensure sensitive content in videos is handled appropriately
- Implement access controls for generated RPA steps and recordings
- Store API keys securely using environment variables
- Sanitize user-supplied content before processing
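
As one possible shape for the size and format restrictions mentioned above, the sketch below checks an upload's file name and size against a policy before any processing starts. The `UploadPolicy` type, the `validateUpload` function, the 200 MiB limit, and the allowed extensions are all hypothetical; Iris's actual limits are not documented in this guide.

```typescript
// Hypothetical limits; adjust to your deployment.
interface UploadPolicy {
  maxBytes: number;
  allowedExtensions: string[]; // lowercase, without the dot
}

const DEFAULT_POLICY: UploadPolicy = {
  maxBytes: 200 * 1024 * 1024,
  allowedExtensions: ["mp4", "webm"],
};

// Return a list of problems; an empty list means the upload passed these checks.
function validateUpload(
  fileName: string,
  sizeBytes: number,
  policy: UploadPolicy = DEFAULT_POLICY
): string[] {
  // Reject path separators outright so the name cannot escape the upload folder.
  if (fileName === "" || /[\\\/]/.test(fileName)) {
    return ["file name is empty or contains a path separator"];
  }
  const problems: string[] = [];
  const dot = fileName.lastIndexOf(".");
  const ext = dot > 0 ? fileName.slice(dot + 1).toLowerCase() : "";
  if (policy.allowedExtensions.indexOf(ext) < 0) {
    problems.push("unsupported format: " + (ext === "" ? "none" : ext));
  }
  if (!(sizeBytes > 0)) {
    problems.push("file is empty or its size is unknown");
  } else if (sizeBytes > policy.maxBytes) {
    problems.push("file exceeds the " + policy.maxBytes + "-byte limit");
  }
  return problems;
}
```

A file extension is supplied by the uploader, so a check like this is only a first filter. Inspecting the file's actual contents on the server is still needed before trusting the format.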
 
Troubleshooting
- For browser automation issues, connect over VNC to watch what the browser is doing in real time
- Check the server logs for detailed error messages
- Ensure your browser automation actions are compatible with the target website's structure