Getting Started
What is Iris?
Iris is an advanced browser automation and Robotic Process Automation (RPA) platform that combines visual recognition, AI-driven analysis, and browser control capabilities. It allows you to:
- Automate repetitive browser tasks with intelligent processing
- Record and replay browser sessions with precise timing
- Process videos of browser interactions for RPA analysis
- Parameterize workflows for data-driven automation
- Access browser sessions remotely through VNC interfaces
The platform is designed for both developers building automation tools and end-users who need to automate repetitive web tasks without coding.
Key Features
- Browser Automation: Control web browsers programmatically through an API
- Session Recording: Capture and replay user interactions with browsers
- Video Processing: Analyze recorded sessions with AI vision models
- RPA Execution: Convert recorded sessions into replayable automation workflows
- Parameterization: Create dynamic workflows with variable inputs
- Remote Access: View and control automated browsers through VNC
- RESTful API: Comprehensive API with OpenAPI documentation
Prerequisites
- Node.js 20.x or later
- PNPM 8.x or later (recommended package manager)
- Docker and Docker Compose (for containerized setup)
Installation
Local Development Setup
- Clone the repository:
git clone <repository-url>
cd iris
- Install dependencies:
pnpm install
- Set up environment variables:
cp .env.example .env
- Edit the .env file with your configuration:
# Server Configuration
PORT=3000
HOST=0.0.0.0
# VLM Configuration
VLM_BASE_URL=... # Visual Language Model API URL
VLM_API_KEY=... # API key for VLM service
VLM_MODEL_NAME=tgi
VLM_PROVIDER=ui_tars_1_5
# Application Settings
LANGUAGE=en
MAX_LOOP_COUNT=10
LOOP_INTERVAL_MS=1000
DEFAULT_OPERATOR=browser
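As a rough illustration of how these settings might be consumed, the sketch below parses them into a typed object and fails fast on missing or malformed values. It is not Iris's actual configuration loader, which this guide does not show. The `IrisConfig` interface, the `loadConfig` function, and the choice to treat the sample values above as fallbacks are all assumptions made for the example.

```typescript
// Hypothetical typed view of the .env settings listed above.
// Iris's real configuration code may look different.
interface IrisConfig {
  port: number;
  host: string;
  vlmBaseUrl: string;
  vlmApiKey: string;
  vlmModelName: string;
  vlmProvider: string;
  language: string;
  maxLoopCount: number;
  loopIntervalMs: number;
  defaultOperator: string;
}

type Env = Record<string, string | undefined>;

// Read a variable that has no sensible fallback, failing with a clear message.
function required(env: Env, key: string): string {
  const value = env[key];
  if (value === undefined || value.trim() === "") {
    throw new Error(`Missing required environment variable: ${key}`);
  }
  return value;
}

// Read a non-negative integer, using the fallback when the variable is unset.
function integer(env: Env, key: string, fallback: number): number {
  const raw = env[key];
  if (raw === undefined || raw.trim() === "") return fallback;
  const parsed = Number(raw);
  if (!Number.isInteger(parsed) || parsed < 0) {
    throw new Error(`${key} must be a non-negative integer, got "${raw}"`);
  }
  return parsed;
}

// ASSUMPTION: the fallbacks mirror the sample values in the .env listing.
function loadConfig(env: Env): IrisConfig {
  return {
    port: integer(env, "PORT", 3000),
    host: env["HOST"] ?? "0.0.0.0",
    vlmBaseUrl: required(env, "VLM_BASE_URL"),
    vlmApiKey: required(env, "VLM_API_KEY"),
    vlmModelName: env["VLM_MODEL_NAME"] ?? "tgi",
    vlmProvider: env["VLM_PROVIDER"] ?? "ui_tars_1_5",
    language: env["LANGUAGE"] ?? "en",
    maxLoopCount: integer(env, "MAX_LOOP_COUNT", 10),
    loopIntervalMs: integer(env, "LOOP_INTERVAL_MS", 1000),
    defaultOperator: env["DEFAULT_OPERATOR"] ?? "browser",
  };
}
```

In a Node.js process this could be called as `loadConfig(process.env)` once the .env file has been loaded.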
Docker Setup
For a containerized setup:
docker-compose up --build
Running the Application
Development Mode
pnpm run start:dev
Production Mode
pnpm run build
pnpm run start:prod
Accessing the Application
After starting the application, you can access:
- API Documentation (Swagger): http://localhost:3000/api/docs
- API Reference (Scalar): http://localhost:3000/api/reference
- VNC Interface: http://localhost:6901/vnc.html (when using Docker)
- Direct VNC connection: localhost:5901 (when using Docker)
Core Functionality
Browser Automation
Iris provides two main operator types for automation:
- Browser Operator: For web browser automation
- Computer Operator: For desktop automation (using @ui-tars/operator-nut-js)
You can create sessions, execute actions, and record interactions through the API.
RPA Workflow
- Record a Session: Capture browser interactions as a recording
- Process the Recording: Extract actions and metadata
- Parameterize if Needed: Add variable inputs to the workflow
- Execute RPA: Replay the recorded actions automatically
- Monitor Execution: Track progress and handle errors
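To make the parameterization step more concrete, here is a minimal sketch of substituting variable inputs into recorded actions before replay. Iris's actual recording format and placeholder syntax are not described in this guide, so the `RecordedAction` shape, the `{{name}}` placeholder convention, the `type` action, and both function names are hypothetical.

```typescript
// Hypothetical shape of one recorded step; the real recording format may differ.
interface RecordedAction {
  action: string;
  url?: string;
  text?: string;
}

type Params = Record<string, string>;

// Replace every {{name}} placeholder with the matching parameter value.
// ASSUMPTION: the {{name}} syntax is invented for this example.
function fillTemplate(template: string, params: Params): string {
  return template.replace(/\{\{(\w+)\}\}/g, (_match: string, name: string) => {
    const value = params[name];
    if (value === undefined) {
      throw new Error(`No value supplied for parameter "${name}"`);
    }
    return value;
  });
}

// Build a concrete copy of the recording for one set of inputs.
// The original recording is left unchanged so it can be reused.
function parameterize(actions: RecordedAction[], params: Params): RecordedAction[] {
  return actions.map((step) => {
    const copy: RecordedAction = { ...step };
    if (copy.url !== undefined) copy.url = fillTemplate(copy.url, params);
    if (copy.text !== undefined) copy.text = fillTemplate(copy.text, params);
    return copy;
  });
}

const recording: RecordedAction[] = [
  { action: "navigate", url: "https://example.com/search?q={{query}}" },
  { action: "type", text: "{{query}}" },
];

const concrete = parameterize(recording, { query: "iris" });
```

Running the same recording against a list of parameter sets is what makes the workflow data-driven: each set produces its own concrete action list.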
API Endpoints
- /api/sessions: Session management
- /api/config: Configuration management
- /api/operators: Operator management
- /api/rpa: RPA execution and management
- /api/docs: Swagger API documentation
- /api/reference: Scalar API Reference documentation
Example Workflow
Here's a typical workflow for using Iris:
- Create a new browser automation session:
POST /api/sessions { "operatorType": "browser" }
- Execute browser actions:
POST /api/sessions/{sessionId}/execute { "action": "navigate", "url": "https://example.com" }
- Record the session for later replay:
POST /api/sessions/{sessionId}/record { "recordingName": "example-workflow" }
- Execute the recording as an RPA workflow:
POST /api/rpa/execute { "recordingId": "example-workflow", "actionDelay": 1000 }
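The four requests above could be chained from a script along the following lines. This is a sketch, not an official client: the paths and bodies are taken from the steps above, but the response format is not documented here, so the code assumes the create-session response exposes the new ID as a `sessionId` field. The `ApiRequest` and `Send` types and all function names are hypothetical, and the transport is passed in by the caller so the sketch does not depend on a particular HTTP library.

```typescript
interface ApiRequest {
  method: "POST";
  path: string;
  body: Record<string, unknown>;
}

// Transport supplied by the caller: sends one request and resolves with the parsed JSON response.
type Send = (req: ApiRequest) => Promise<unknown>;

// Step 1: create a browser automation session.
const createSession = (): ApiRequest => ({
  method: "POST",
  path: "/api/sessions",
  body: { operatorType: "browser" },
});

// Steps 2 to 4: the calls that follow session creation, in order.
function sessionSteps(sessionId: string, recordingName: string): ApiRequest[] {
  const id = encodeURIComponent(sessionId);
  return [
    { method: "POST", path: `/api/sessions/${id}/execute`, body: { action: "navigate", url: "https://example.com" } },
    { method: "POST", path: `/api/sessions/${id}/record`, body: { recordingName } },
    { method: "POST", path: "/api/rpa/execute", body: { recordingId: recordingName, actionDelay: 1000 } },
  ];
}

async function runExampleWorkflow(send: Send, recordingName = "example-workflow"): Promise<unknown> {
  // ASSUMPTION: the response to POST /api/sessions carries the id as `sessionId`.
  const created = (await send(createSession())) as { sessionId?: string } | null;
  if (!created?.sessionId) {
    throw new Error("Create-session response had no sessionId");
  }

  let last: unknown;
  for (const step of sessionSteps(created.sessionId, recordingName)) {
    // Send the requests one at a time, in the order listed above.
    last = await send(step);
  }
  return last;
}
```

On Node.js 20, `send` could wrap the built-in `fetch`, posting `JSON.stringify(req.body)` with a JSON content type to the server's base URL (for example http://localhost:3000) plus `req.path`.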
Testing
# Run unit tests
pnpm run test
# Run e2e tests
pnpm run test:e2e
# Run test coverage
pnpm run test:cov
Security Considerations
When implementing and using the RPA video processing feature:
- Validate all uploaded videos for potential security risks
- Implement size and format restrictions for uploads
- Ensure sensitive content in videos is handled appropriately
- Implement access controls for generated RPA steps and recordings
- Store API keys securely using environment variables
- Sanitize user-supplied content before processing
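The size and format restrictions mentioned above might be enforced with a check like the one sketched here. The 200 MiB cap, the allowed MIME types, the `UploadInfo` shape, and the `validateVideoUpload` function are illustrative assumptions, not values or APIs defined by Iris.

```typescript
// Hypothetical limits; choose values that suit your deployment.
const MAX_VIDEO_BYTES = 200 * 1024 * 1024; // illustrative 200 MiB cap
const ALLOWED_VIDEO_TYPES = new Set(["video/mp4", "video/webm"]);

interface UploadInfo {
  fileName: string;
  mimeType: string;
  sizeBytes: number;
}

// Returns a list of problems; an empty list means the upload passed these checks.
function validateVideoUpload(upload: UploadInfo): string[] {
  const problems: string[] = [];

  if (!ALLOWED_VIDEO_TYPES.has(upload.mimeType)) {
    problems.push(`Unsupported type: ${upload.mimeType}`);
  }

  if (!Number.isFinite(upload.sizeBytes) || upload.sizeBytes <= 0) {
    problems.push("File is empty or its size is invalid");
  } else if (upload.sizeBytes > MAX_VIDEO_BYTES) {
    problems.push(`File exceeds the ${MAX_VIDEO_BYTES} byte limit`);
  }

  // Reject names with path separators or parent-directory references.
  if (/[\\/]/.test(upload.fileName) || upload.fileName.includes("..")) {
    problems.push("File name contains path characters");
  }

  return problems;
}
```

A declared MIME type is supplied by the client, so a check like this is only a first filter; inspecting the file contents on the server is still advisable.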
Troubleshooting
- For browser automation issues, connect to the VNC interface to watch what the browser is doing in real time
- Check the server logs for detailed error messages
- Ensure your browser automation actions are compatible with the target website's structure