Technical_Implementation_Report

Last updated: 3/24/2025, 6:40:29 PM

AgentSociety: Technical Implementation Report

1. High-Level Architecture

AgentSociety's architecture is designed to support large-scale social simulations with thousands of LLM-driven agents interacting in a realistic societal environment. The system consists of three primary components that work together to create a comprehensive simulation platform:

1.1 Core Components

1.1.1 LLM-driven Social Generative Agents

The agent component represents individual social beings with:

  • Psychological states (emotions, needs, cognition)
  • Memory systems (profile, status, stream memory)
  • Behavioral capabilities (mobility, social interactions, economic activities)

1.1.2 Realistic Societal Environment

The environment provides the context for agent interactions through three integrated spaces:

  • Urban space (road networks, POIs, transportation systems)
  • Social space (social networks, online/offline interactions)
  • Economic space (firms, government, banks, economic indicators)

1.1.3 Large-Scale Simulation Engine

The engine enables efficient execution of simulations with:

  • Distributed computing capabilities
  • Agent messaging system
  • Monitoring and analysis tools
  • Social science research utilities

1.2 System Architecture

The system architecture consists of:

1.2.1 Shared Services

  • LLM API: Core intelligence for agents

    • Supports public services (OpenAI, DeepSeek) or local deployment (vLLM, Ollama)
    • Handles token management and response parsing
  • MQTT Server: Messaging system for inter-agent communication

    • Uses the EMQX broker for reliability and scalability
    • Enables protocol-compliant message delivery
  • Database: PostgreSQL for storing simulation results

    • Optimized for high-performance batch writing
    • Stores agent states, interactions, and outcomes
  • Metric Recorder: MLflow-based system for tracking metrics

    • Centralized server capabilities for research collaboration
    • Records key performance indicators

1.2.2 Simulation Tasks

Each experiment corresponds to an Agent Simulation object that manages:

  • Environment Simulators: Run as subprocesses

    • Urban environment simulator
    • Social environment simulator
    • Economic environment simulator
  • Agent Groups: Organized as Ray actors in separate processes

    • Each group contains multiple agents sharing client connections
    • Enables distributed computing across machines

1.2.3 GUI Component (Optional)

  • Backend: Connects to database and MQTT server

    • Retrieves simulation data for visualization
    • Processes user inputs for agent interaction
  • Frontend: Provides visualization and interaction interface

    • Displays agent states, locations, and interactions
    • Enables direct communication with agents

2. Detailed Implementation

2.1 LLM-driven Social Generative Agents

2.1.1 Mental Process Implementation

Emotions Module:

  • Based on Shvo et al.'s emotion measurement framework
  • Implemented as a structured object with:
    • Keyword descriptor for current emotional state
    • Sentence-based thought related to emotion
    • Intensity ratings (0-10) for six core emotions
  • Updated through event processing and cognitive appraisal
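
The structured emotion object described above could be sketched as a small dataclass. This is only an illustration: the six emotion names, the starting intensity of 5, and the `update` method are assumptions, since the report does not specify them.

```python
from dataclasses import dataclass, field

# Assumption: these six names are placeholders; the report only says
# "six core emotions" without listing them.
CORE_EMOTIONS = ("sadness", "joy", "fear", "disgust", "anger", "surprise")


def clamp(value: int, low: int = 0, high: int = 10) -> int:
    """Keep an intensity rating inside the 0-10 range used by the report."""
    return max(low, min(high, value))


@dataclass
class EmotionState:
    keyword: str = "neutral"  # one-word descriptor of the current emotion
    thought: str = ""         # sentence-level thought tied to the emotion
    intensity: dict = field(
        default_factory=lambda: {name: 5 for name in CORE_EMOTIONS}
    )

    def update(self, keyword: str, thought: str, deltas: dict) -> None:
        """Apply an appraisal result: new descriptor, thought, and intensity changes."""
        self.keyword = keyword
        self.thought = thought
        for name, delta in deltas.items():
            if name not in self.intensity:
                raise KeyError(f"unknown emotion: {name}")
            self.intensity[name] = clamp(self.intensity[name] + delta)


state = EmotionState()
state.update("anxious", "I might miss the bus.", {"fear": 3, "joy": -7})
print(state.keyword, state.intensity["fear"], state.intensity["joy"])
```

Clamping keeps every rating inside the 0-10 range even when an appraisal step proposes a large change.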

Needs Module:

  • Based on Maslow's hierarchy of needs
  • Implemented as a prioritized queue with:
    • Need categories (physiological, safety, love, esteem, self-actualization)
    • Satisfaction levels for each need
    • Priority weights based on current context
  • Updated through:
    • Behavior outcomes
    • Environmental stimuli
    • Time-based decay functions
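
A minimal sketch of how satisfaction levels, priority weights, and time-based decay might combine to pick the next need to act on. The 0-1 satisfaction scale, the linear decay, and the urgency formula are assumptions made for illustration, not details taken from AgentSociety.

```python
from dataclasses import dataclass


@dataclass
class Need:
    category: str          # e.g. "physiological", "safety"
    satisfaction: float    # assumed scale: 0.0 (unmet) to 1.0 (satisfied)
    weight: float          # assumed context-dependent priority weight
    decay_per_hour: float  # assumed linear decay rate

    def decay(self, hours: float) -> None:
        """Lower satisfaction as time passes, never dropping below zero."""
        self.satisfaction = max(0.0, self.satisfaction - self.decay_per_hour * hours)

    def urgency(self) -> float:
        """Higher weight and lower satisfaction both raise urgency."""
        return self.weight * (1.0 - self.satisfaction)


def most_urgent(needs):
    """Return the need that should be addressed first."""
    return max(needs, key=lambda need: need.urgency())


needs = [
    Need("physiological", satisfaction=0.9, weight=5.0, decay_per_hour=0.1),
    Need("safety", satisfaction=0.6, weight=4.0, decay_per_hour=0.0),
]
before = most_urgent(needs).category
for need in needs:
    need.decay(hours=4)
after = most_urgent(needs).category
print(before, "->", after)
```

In this example the physiological need starts out well satisfied, so safety ranks first; after four simulated hours of decay the ordering flips.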

Cognition Module:

  • Implements reasoning and decision-making processes
  • Components include:
    • Attitude tracking system (0-10 ratings on topics)
    • Thought generation mechanism
    • Decision evaluation framework
  • Integrated with Theory of Planned Behavior for action selection

2.1.2 Memory System Implementation

Profile Memory:

  • Implemented as a static JSON structure
  • Contains demographic data, personality traits, background information
  • Loaded at initialization and remains constant

Status Memory:

  • Implemented as a key-value store with change tracking
  • Updated dynamically during simulation
  • Includes current location, financial status, relationship states

Stream Memory:

  • Implemented as two parallel linked lists:
    • Event Flow: chronological record of objective events
    • Perception Flow: subjective experiences linked to events
  • Each node contains:
    • Timestamp
    • Location data
    • Content description
    • Relevance score
  • Retrieval mechanism uses:
    • Recency weighting
    • Relevance scoring
    • Context-based filtering
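
The retrieval mechanism above can be illustrated with a short ranking function. This is a sketch under assumptions: the exponential half-life for recency, the multiplicative combination of recency and relevance, and the use of location as the context filter are all choices made here for illustration.

```python
from dataclasses import dataclass


@dataclass
class MemoryNode:
    timestamp: float  # simulation time in seconds
    location: str
    content: str
    relevance: float  # assumed to be precomputed in the range 0..1


def retrieve(nodes, now, top_k=2, half_life=3600.0, location=None):
    """Rank nodes by recency-weighted relevance, optionally filtered by location."""
    scored = []
    for node in nodes:
        if location is not None and node.location != location:
            continue  # context-based filtering
        age = max(0.0, now - node.timestamp)
        recency = 0.5 ** (age / half_life)  # weight halves every half_life seconds
        scored.append((recency * node.relevance, node))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [node for _, node in scored[:top_k]]


stream = [
    MemoryNode(0.0, "home", "had breakfast", 0.9),
    MemoryNode(3600.0, "office", "met a colleague", 0.5),
    MemoryNode(7000.0, "home", "received a package", 0.4),
]
recent = retrieve(stream, now=7200.0)
at_home = retrieve(stream, now=7200.0, location="home")
print([node.content for node in recent])
print([node.content for node in at_home])
```

An old but highly relevant event can still be outranked by a recent, moderately relevant one, which is the intended effect of recency weighting.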

2.1.3 Behavior Implementation

Mobility Behavior:

  • Hierarchical decision framework implemented as a pipeline:
    1. Intention extraction using LLM-based need analysis
    2. Place type selection through POI category matching
    3. Radius decision using weighted constraint evaluation
    4. Place selection using gravity model implementation
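  • Gravity model formula: P_ij = (S_j / D_ij^β) / ∑_k (S_k / D_ik^β), where P_ij is the probability of choosing place j from origin i and D_ij is the distance between them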
    • Implemented with configurable distance decay coefficient (β)
    • Location attractiveness (S_j) calculated from POI metadata
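
The gravity model can be computed directly from the formula. The sketch below assumes attractiveness scores and distances are already available as plain lists; the function name and the default β of 2.0 are illustrative choices.

```python
import random


def gravity_probabilities(attractiveness, distances, beta=2.0):
    """Return P_ij for each candidate place j, given S_j and D_ij."""
    if len(attractiveness) != len(distances):
        raise ValueError("attractiveness and distances must have equal length")
    if any(d <= 0 for d in distances):
        raise ValueError("distances must be positive")
    weights = [s / (d ** beta) for s, d in zip(attractiveness, distances)]
    total = sum(weights)
    return [w / total for w in weights]


places = ["cafe", "mall", "park"]
probs = gravity_probabilities(attractiveness=[10, 40, 20], distances=[1, 2, 4])

# Sample one destination according to the computed probabilities.
rng = random.Random(0)
choice = rng.choices(places, weights=probs, k=1)[0]
print([round(p, 3) for p in probs], choice)
```

With β = 2, the mall is four times as attractive as the cafe but twice as far away, so the two end up equally likely; a larger β penalizes distance more strongly.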

Social Behavior:

  • Social relationships implemented as weighted graph structure
  • Interaction selection using multi-factor decision algorithm:
    • Relationship strength weighting
    • Need satisfaction potential
    • Context relevance scoring
  • Message generation pipeline:
    1. Intent determination based on needs
    2. Content generation using LLM with relationship context
    3. Tone adjustment based on emotional state
    4. Response handling with history integration

Economic Behavior:

  • Work propensity implemented as a dynamic variable influenced by:
    • Need satisfaction levels
    • Economic conditions
    • Personal preferences
  • Consumption modeled using:
    • Budget allocation algorithm
    • Utility maximization function
    • Price sensitivity parameters
  • Integration with economic environment through transaction API

2.1.4 Agent Workflow Implementation

  • Implemented as a continuous loop with the following steps:

    1. State assessment (needs, emotions, context)
    2. Action determination using weighted decision matrix
    3. Action execution through behavior modules
    4. Feedback processing from environment
    5. Memory update with event and perception recording
    6. State update (emotions, needs, cognition)
  • LLM prompting strategy:

    • Context-rich prompts with relevant memory retrieval
    • Structured output formats for consistent parsing
    • Chain-of-thought reasoning for complex decisions
    • Role-specific instruction tuning
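
The six-step loop above can be sketched with asyncio. Everything here is a simplified stand-in: `decide` replaces the LLM call with a rule that picks the least satisfied need, `execute` replaces the behavior modules and environment feedback, and the state dictionary layout is an assumption.

```python
import asyncio

# Hypothetical mapping from a need to the action that satisfies it.
ACTION_FOR_NEED = {"food": "eat", "rest": "sleep"}


async def decide(state):
    """Stand-in for an LLM call: choose the action for the least satisfied need."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    need = min(state["needs"], key=state["needs"].get)
    return need, ACTION_FOR_NEED[need]


def execute(action):
    """Stand-in for a behavior module plus environment feedback."""
    return {"action": action, "success": True}


async def run_agent(state, steps):
    for step in range(steps):
        need, action = await decide(state)   # steps 1-2: assess state, pick action
        feedback = execute(action)           # steps 3-4: act, receive feedback
        state["memory"].append((step, action, feedback["success"]))  # step 5
        for name in state["needs"]:          # step 6: update needs
            if name == need and feedback["success"]:
                state["needs"][name] = 1.0
            else:
                state["needs"][name] = max(0.0, state["needs"][name] - 0.3)
    return state


state = {"needs": {"food": 0.2, "rest": 0.7}, "memory": []}
asyncio.run(run_agent(state, steps=3))
print([action for _, action, _ in state["memory"]])
```

Because each decision is awaited, many such loops can share one event loop and overlap their waiting time on LLM responses.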

2.2 Realistic Societal Environment

2.2.1 Urban Space Implementation

Road Network:

  • Implemented using directed graph data structure
  • Nodes represent junctions, edges represent road segments
  • Attributes include:
    • Lane count and direction
    • Speed limits
    • Traffic signals
    • Pedestrian accessibility
  • Data sourced from OpenStreetMap with topological simplification

Points of Interest (POIs):

  • Implemented as geospatial database with:
    • Coordinates (latitude, longitude)
    • Category classification
    • Capacity attributes
    • Operating hours
    • Attractiveness metrics
  • Data sourced from SafeGraph with additional attributes

Transportation Simulation:

  • Implemented using discrete time-stepping mechanism
  • Vehicle movement follows:
    • IDM (Intelligent Driver Model) for acceleration
    • MOBIL model for lane-changing decisions
  • Pedestrian simulation uses:
    • Constant speed model on sidewalks
    • Traffic signal compliance at crossings
  • Public transit implemented with:
    • Fixed schedules and routes
    • Boarding/alighting time calculations
    • Capacity constraints
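
For reference, the standard IDM acceleration equation used for car following can be written in a few lines. The parameter defaults below are commonly quoted textbook values, not values taken from AgentSociety, and the Euler update in `idm_step` is one simple way to apply the result in a discrete time-stepping scheme.

```python
import math


def idm_acceleration(v, v_lead, gap, v0=30.0, T=1.5, a_max=1.0, b=2.0,
                     s0=2.0, delta=4.0):
    """Intelligent Driver Model acceleration in m/s^2.

    v, v_lead: speeds of the follower and the leader (m/s)
    gap: bumper-to-bumper distance to the leader (m)
    v0: desired speed, T: desired time headway, a_max: maximum acceleration,
    b: comfortable deceleration, s0: minimum gap, delta: acceleration exponent
    """
    if gap <= 0:
        raise ValueError("gap must be positive")
    approach_rate = v - v_lead
    desired_gap = s0 + max(
        0.0, v * T + v * approach_rate / (2.0 * math.sqrt(a_max * b))
    )
    return a_max * (1.0 - (v / v0) ** delta - (desired_gap / gap) ** 2)


def idm_step(v, v_lead, gap, dt=0.1):
    """Advance the follower's speed by one time step (explicit Euler)."""
    return max(0.0, v + idm_acceleration(v, v_lead, gap) * dt)


free_road = idm_acceleration(v=0.0, v_lead=0.0, gap=1e9)
at_desired_speed = idm_acceleration(v=30.0, v_lead=30.0, gap=1e9)
closing_in = idm_acceleration(v=20.0, v_lead=10.0, gap=10.0)
print(round(free_road, 3), round(at_desired_speed, 3), closing_in < 0)
```

On an empty road the vehicle accelerates at close to `a_max`, acceleration vanishes at the desired speed, and a fast approach to a nearby leader produces strong braking.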

API Interface:

  • RESTful API for agent-environment interaction
  • Endpoints include:
    • Position updates
    • Path planning
    • Travel time estimation
    • POI information retrieval

2.2.2 Social Space Implementation

Social Network:

  • Implemented as a weighted directed graph
  • Nodes represent agents, edges represent relationships
  • Edge attributes include:
    • Relationship type (family, friend, colleague)
    • Strength value (0-100)
    • Interaction history references
  • Stored within agent data structures for efficient access

Interaction System:

  • Message-based communication implemented through MQTT
  • Topic structure:
    • exps/<exp_uuid>/agents/<agent_uuid>/agent-chat
    • exps/<exp_uuid>/agents/<agent_uuid>/user-chat
    • exps/<exp_uuid>/agents/<agent_uuid>/user-survey
  • Message processing pipeline:
    1. Content generation by sender
    2. Supervisor filtering (if enabled)
    3. Delivery to recipient
    4. Response generation

Supervisor System:

  • Implemented as preprocessing middleware
  • Content analysis using:
    • Keyword filtering
    • LLM-based content evaluation
    • Rule-based policy enforcement
  • Actions include:
    • Message filtering/blocking
    • User suspension
    • Connection removal

2.2.3 Economic Space Implementation

Account System:

  • Double-entry bookkeeping implementation
  • Tracks transactions between economic entities
  • Supports:
    • Income recording
    • Expense tracking
    • Savings accumulation
    • Tax calculation
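
A minimal sketch of the double-entry idea: every transaction is recorded as two equal and opposite entries, so the sum of all balances never changes. The `Ledger` class, the account naming scheme, and the amounts are hypothetical.

```python
from collections import defaultdict
from decimal import Decimal


class Ledger:
    def __init__(self):
        self.balances = defaultdict(Decimal)  # missing accounts start at 0
        self.journal = []                     # chronological list of entries

    def transfer(self, source, target, amount, memo=""):
        """Move `amount` from `source` to `target` as one balanced transaction."""
        amount = Decimal(str(amount))
        if amount <= 0:
            raise ValueError("amount must be positive")
        # Two entries with opposite signs keep the books balanced.
        self.journal.append((source, -amount, memo))
        self.journal.append((target, amount, memo))
        self.balances[source] -= amount
        self.balances[target] += amount


ledger = Ledger()
ledger.transfer("firm:1", "agent:42", "3000", memo="wage")
ledger.transfer("agent:42", "government", "450", memo="income tax")
print(dict(ledger.balances))
```

Using `Decimal` rather than floats avoids rounding drift when many small transactions accumulate.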

Economic Entities:

  • Firms: Implemented as production functions

    • Labor input to goods output conversion
    • Dynamic wage and price adjustment
    • Revenue and profit calculation
  • Government: Implemented as fiscal policy engine

    • Progressive tax structure
    • Redistribution mechanisms
    • Policy intervention capabilities
  • Banks: Implemented as financial intermediaries

    • Interest rate calculation based on the Taylor rule
    • Savings management
    • Liquidity provision
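
For the bank's interest rate, the textbook form of the Taylor rule is shown below as a sketch. The exact variant and coefficients used in AgentSociety are not given in this report, so the 2% neutral rate, 2% inflation target, 0.5 weights, and the zero floor are assumptions.

```python
def taylor_rule_rate(inflation, output_gap, neutral_real_rate=0.02,
                     inflation_target=0.02, inflation_weight=0.5,
                     output_weight=0.5, floor=0.0):
    """Nominal policy rate suggested by the textbook Taylor rule.

    All rates are fractions (0.02 means 2%). output_gap is the relative
    deviation of real output from potential output.
    """
    rate = (
        neutral_real_rate
        + inflation
        + inflation_weight * (inflation - inflation_target)
        + output_weight * output_gap
    )
    return max(floor, rate)  # assumed floor: the nominal rate cannot go below zero


overheating = taylor_rule_rate(inflation=0.04, output_gap=0.01)
on_target = taylor_rule_rate(inflation=0.02, output_gap=0.0)
deep_slump = taylor_rule_rate(inflation=-0.05, output_gap=-0.05)
print(round(overheating, 4), round(on_target, 4), deep_slump)
```

Inflation above target or output above potential pushes the rate up; when both sit at their targets the rule returns the neutral real rate plus inflation.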

National Bureau of Statistics:

  • Data aggregation system for economic indicators
  • Metrics include:
    • Real GDP calculation
    • Income distribution analysis
    • Consumption patterns
    • Employment statistics
  • Implemented with time-series database for trend analysis

2.3 Large-Scale Simulation Engine

2.3.1 Distributed Execution Implementation

Ray Framework Integration:

  • Implemented using Ray 2.0+ for distributed computing
  • Components:
    • Ray actors for agent groups
    • Task-based parallelism for environment simulation
    • Resource management for efficient allocation
  • Configuration options:
    • CPU/GPU allocation per actor
    • Memory limits
    • Network bandwidth constraints

Agent Grouping Strategy:

  • Implemented as a load balancing algorithm
  • Groups created based on:
    • Available computational resources
    • Network topology
    • Expected agent interaction patterns
  • Each group contains:
    • Multiple agent instances
    • Shared client connections
    • Local state management
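
The report does not describe the grouping algorithm itself, so the sketch below swaps in a simple greedy heuristic (assign the most expensive remaining agent to the least loaded group). The per-agent cost estimates and the function name are hypothetical, and resource and topology constraints are left out.

```python
import heapq


def balance_groups(agent_costs, num_groups):
    """Greedy load balancing: heaviest agents first, each to the lightest group."""
    if num_groups < 1:
        raise ValueError("num_groups must be at least 1")
    heap = [(0.0, index) for index in range(num_groups)]  # (load, group index)
    heapq.heapify(heap)
    groups = [[] for _ in range(num_groups)]
    by_cost = sorted(agent_costs.items(), key=lambda item: item[1], reverse=True)
    for agent_id, cost in by_cost:
        load, index = heapq.heappop(heap)      # currently lightest group
        groups[index].append(agent_id)
        heapq.heappush(heap, (load + cost, index))
    return groups


# Hypothetical cost estimates, e.g. expected LLM calls per simulation step.
costs = {"a": 5, "b": 4, "c": 3, "d": 2, "e": 1}
groups = balance_groups(costs, num_groups=2)
loads = [sum(costs[agent] for agent in group) for group in groups]
print(groups, loads)
```

Each resulting group would then map to one process (a Ray actor in the architecture described here) whose agents share client connections.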

Asynchronous Execution:

  • Implemented using Python's asyncio
  • Key components:
    • Event loop management
    • Coroutine scheduling
    • I/O operation optimization
  • Specific optimizations:
    • Connection pooling for LLM API calls
    • Batched database operations
    • Concurrent environment interactions
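
A minimal sketch of concurrent LLM requests with asyncio. The `fake_llm_call` coroutine stands in for a real API client, and the semaphore is used here as a simple bound on in-flight requests; it is not the connection pool itself.

```python
import asyncio


async def fake_llm_call(prompt, delay=0.01):
    """Stand-in for a network call to an LLM API."""
    await asyncio.sleep(delay)
    return f"response to {prompt}"


async def run_all(prompts, max_concurrency=4):
    """Issue all requests concurrently, with at most max_concurrency in flight."""
    semaphore = asyncio.Semaphore(max_concurrency)

    async def guarded(prompt):
        async with semaphore:
            return await fake_llm_call(prompt)

    # gather returns results in the same order as the input prompts.
    return await asyncio.gather(*(guarded(p) for p in prompts))


results = asyncio.run(run_all([f"agent-{i}" for i in range(10)]))
print(len(results), results[0])
```

With ten requests and a limit of four, total wall time is roughly three delays rather than ten, which is why asynchronous execution matters when LLM latency dominates.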

2.3.2 MQTT Messaging System Implementation

MQTT Server Configuration:

  • Based on EMQX 5.8.1
  • Configured for:
    • High throughput (44,702 msg/s measured; see Section 3.3.1)
    • Reliable delivery
    • Topic-based routing
    • QoS level 1 (at least once delivery)

Client Implementation:

  • Asynchronous MQTT client using paho-mqtt
  • Features:
    • Automatic reconnection
    • Message buffering
    • Subscription management
    • Topic filtering

Topic Structure:

  • Hierarchical design for efficient routing:
    • exps/<exp_uuid>/agents/<agent_uuid>/agent-chat
    • exps/<exp_uuid>/agents/<agent_uuid>/user-chat
    • exps/<exp_uuid>/agents/<agent_uuid>/user-survey
  • Wildcards for subscription optimization:
    • Agents subscribe to exps/<exp_uuid>/agents/<agent_uuid>/#
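
The topic layout and the `#` wildcard subscription can be illustrated without a broker. The helper names below are hypothetical, and `topic_matches` is a simplified matcher for the `+` and `#` wildcards that ignores special cases such as `$`-prefixed topics.

```python
CHANNELS = ("agent-chat", "user-chat", "user-survey")


def agent_topic(exp_uuid, agent_uuid, channel):
    """Build the topic for one agent channel within one experiment."""
    if channel not in CHANNELS:
        raise ValueError(f"unknown channel: {channel}")
    return f"exps/{exp_uuid}/agents/{agent_uuid}/{channel}"


def agent_subscription(exp_uuid, agent_uuid):
    """Wildcard filter covering every channel of a single agent."""
    return f"exps/{exp_uuid}/agents/{agent_uuid}/#"


def topic_matches(topic_filter, topic):
    """Simplified MQTT matching: '+' is one level, a trailing '#' is any remainder."""
    filter_parts = topic_filter.split("/")
    topic_parts = topic.split("/")
    for index, part in enumerate(filter_parts):
        if part == "#":
            return index == len(filter_parts) - 1  # '#' is only valid as last level
        if index >= len(topic_parts):
            return False
        if part != "+" and part != topic_parts[index]:
            return False
    return len(filter_parts) == len(topic_parts)


subscription = agent_subscription("exp1", "agent7")
own = topic_matches(subscription, agent_topic("exp1", "agent7", "user-survey"))
other = topic_matches(subscription, agent_topic("exp1", "agent8", "agent-chat"))
print(subscription, own, other)
```

A single wildcard subscription per agent means one subscribe call covers chat, user messages, and surveys, while messages for other agents are never delivered to it.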

Message Format:

  • JSON-based payload structure
  • Fields include:
    • Sender ID
    • Timestamp
    • Message type
    • Content
    • Metadata
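
A sketch of encoding and validating such a payload. The key names (`sender_id`, `timestamp`, `type`, `content`, `metadata`) are assumptions chosen to mirror the field list above; the actual wire format may use different names.

```python
import json
import time

REQUIRED_FIELDS = ("sender_id", "timestamp", "type", "content", "metadata")


def encode_message(sender_id, msg_type, content, metadata=None, timestamp=None):
    """Serialize a message to UTF-8 JSON bytes, ready to publish."""
    payload = {
        "sender_id": sender_id,
        "timestamp": time.time() if timestamp is None else timestamp,
        "type": msg_type,
        "content": content,
        "metadata": metadata or {},
    }
    return json.dumps(payload).encode("utf-8")


def decode_message(raw):
    """Parse received bytes and check that every expected field is present."""
    payload = json.loads(raw.decode("utf-8"))
    missing = [name for name in REQUIRED_FIELDS if name not in payload]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return payload


raw = encode_message("agent7", "agent-chat", "Lunch at noon?", timestamp=1000.0)
message = decode_message(raw)
print(message["sender_id"], message["content"])
```

Validating on receipt keeps malformed payloads from reaching the agent's message handler.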

2.3.3 Utilities Implementation

LLM API Adapter:

  • Unified interface for multiple LLM providers
  • Supports:
    • OpenAI-compatible APIs
    • DeepSeek
    • ChatGLM
    • Local deployment (vLLM, Ollama)
  • Features:
    • Token counting and management
    • Rate limiting
    • Error handling
    • Response validation

Retry Mechanism:

  • Exponential backoff implementation
  • Configurable parameters:
    • Maximum retry count (default: 3)
    • Initial delay
    • Backoff factor
    • Jitter
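
The retry parameters listed above map onto a short helper. This sketch assumes that a retry count of 3 means up to three retries after the first attempt, and that jitter is added as a uniform random offset; both are interpretations rather than documented behavior.

```python
import asyncio
import random


async def retry(func, max_retries=3, initial_delay=0.5, backoff_factor=2.0,
                jitter=0.1):
    """Await func(), retrying failures with exponentially growing delays."""
    delay = initial_delay
    for attempt in range(max_retries + 1):
        try:
            return await func()
        except Exception:
            if attempt == max_retries:
                raise  # out of retries: surface the last error
            # Jitter spreads out retries from many agents failing at once.
            await asyncio.sleep(delay + random.uniform(0.0, jitter))
            delay *= backoff_factor


calls = {"count": 0}


async def flaky():
    """Fails twice with a transient error, then succeeds."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("transient failure")
    return "ok"


result = asyncio.run(retry(flaky, initial_delay=0.001, jitter=0.001))
print(result, calls["count"])
```

In practice the `except` clause would usually be narrowed to errors that are worth retrying, such as timeouts and rate-limit responses.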

JSON Parser:

  • Robust parsing of LLM responses
  • Features:
    • Markdown code block extraction
    • JSON validation
    • Error recovery
    • Schema enforcement
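
A sketch of the parsing steps above: extract a fenced block if one is present, parse it, fall back to the outermost braces on failure, and check for required keys. The function name and the simple key check (in place of full schema validation) are assumptions.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built this way to avoid nesting fences
FENCED_BLOCK = re.compile(
    FENCE + r"(?:json)?\s*(.*?)" + FENCE, re.DOTALL | re.IGNORECASE
)


def parse_llm_json(text, required_keys=()):
    """Extract a JSON object from an LLM response and check required keys."""
    match = FENCED_BLOCK.search(text)
    candidate = match.group(1) if match else text
    try:
        data = json.loads(candidate)
    except json.JSONDecodeError:
        # Error recovery: retry on the outermost {...} span, if there is one.
        start, end = candidate.find("{"), candidate.rfind("}")
        if start == -1 or end <= start:
            raise ValueError("no JSON object found in response")
        data = json.loads(candidate[start:end + 1])
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    missing = [key for key in required_keys if key not in data]
    if missing:
        raise ValueError(f"missing keys: {missing}")
    return data


fenced = "Sure, here it is:\n" + FENCE + 'json\n{"action": "eat"}\n' + FENCE
inline = 'I think the answer is {"action": "sleep"} for now.'
print(parse_llm_json(fenced, ["action"]), parse_llm_json(inline, ["action"]))
```

Raising `ValueError` on every failure path gives the caller a single exception type to catch before retrying the LLM request.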

Metric Recorder:

  • Based on MLflow with custom extensions
  • Metrics tracked:
    • Agent state statistics
    • Interaction counts
    • Economic indicators
    • Performance measurements
  • Implementation includes:
    • Thread-safe logging
    • Batched updates
    • Experiment tracking

Logging and Saving:

  • Dual storage approach:
    • Avro format for local files
    • PostgreSQL for online storage
  • Schema design:
    • Agent profiles
    • State histories
    • Interaction records
    • Experimental metadata
  • Optimization for:
    • High-volume writes
    • Efficient querying
    • Storage compression

2.3.4 Social Science Toolbox Implementation

Intervention Tools:

  • Agent Configuration: JSON-based configuration system

    • Pre-simulation parameter setting
    • Profile customization
    • Behavioral tendency adjustment
  • State Manipulation: Runtime state modification API

    • Direct memory access
    • Emotion state adjustment
    • Need satisfaction level control
  • Message Notification: MQTT-based notification system

    • Targeted message delivery
    • Event triggering
    • Environmental change announcements

Interview System:

  • MQTT-based question delivery
  • Processing pipeline:
    1. Question reception and parsing
    2. Context retrieval from memory
    3. Response generation using LLM
    4. Answer formatting and delivery
  • Features:
    • Non-blocking execution
    • Context-aware responses
    • Conversation history tracking

Survey System:

  • Structured questionnaire implementation
  • Components:
    • Question schema definition
    • Response format specification
    • Data collection and aggregation
  • Survey types supported:
    • Multiple-choice
    • Likert scale
    • Ranking
    • Open-ended
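
One way to represent the question schema and check responses for the four survey types is sketched below. The `Question` fields, the type labels, and the 1-5 Likert range are assumptions for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Question:
    qid: str
    text: str
    kind: str  # "multiple_choice", "likert", "ranking", or "open_ended"
    options: list = field(default_factory=list)


def validate_answer(question, answer):
    """Return True if the answer has the right shape for the question type."""
    if question.kind == "multiple_choice":
        return answer in question.options
    if question.kind == "likert":
        # Assumed 1-5 scale; bool is excluded because it is a subclass of int.
        return (isinstance(answer, int) and not isinstance(answer, bool)
                and 1 <= answer <= 5)
    if question.kind == "ranking":
        # A valid ranking is a permutation of the options.
        return isinstance(answer, list) and sorted(answer) == sorted(question.options)
    if question.kind == "open_ended":
        return isinstance(answer, str) and answer.strip() != ""
    raise ValueError(f"unknown question type: {question.kind}")


commute = Question("q1", "How do you usually commute?", "multiple_choice",
                   ["car", "bus", "walk"])
trust = Question("q2", "I trust my neighbors.", "likert")
priorities = Question("q3", "Rank these by importance.", "ranking",
                      ["income", "health", "leisure"])
print(validate_answer(commute, "bus"), validate_answer(trust, 7),
      validate_answer(priorities, ["health", "income", "leisure"]))
```

Checking the shape of each answer before aggregation catches cases where the LLM responds in free text to a structured question.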

3. Performance Optimization

3.1 Computational Efficiency

3.1.1 LLM Optimization

  • Prompt Engineering:

    • Optimized prompt templates for token efficiency
    • Context window management
    • Output format standardization
  • Batching Strategy:

    • Request batching for similar agent types
    • Priority-based scheduling
    • Adaptive batch sizing based on load
  • Caching Mechanism:

    • Response caching for common queries
    • Embedding-based similarity lookup
    • Cache invalidation strategies

3.1.2 Distributed Computing Optimization

  • Load Balancing:

    • Dynamic agent redistribution
    • Resource utilization monitoring
    • Adaptive group sizing
  • Communication Optimization:

    • Message compression
    • Selective subscription
    • Bandwidth management
  • Memory Management:

    • Shared memory for common resources
    • Garbage collection optimization
    • Memory-mapped file usage

3.2 Scalability Solutions

3.2.1 Horizontal Scaling

  • Cluster Configuration:

    • Multi-node Ray cluster setup
    • Resource allocation strategies
    • Network topology optimization
  • Shard-based Distribution:

    • Geographic sharding for urban space
    • Social network partitioning
    • Economic entity distribution

3.2.2 Vertical Scaling

  • Resource Utilization:

    • CPU/GPU optimization
    • Memory usage efficiency
    • I/O throughput maximization
  • Algorithm Optimization:

    • Spatial indexing for urban queries
    • Graph algorithms for social network
    • Numerical methods for economic calculations

3.3 Performance Metrics

3.3.1 System Performance

  • Throughput:

    • MQTT messaging: 44,702 msg/s
    • Environment simulation: 0.1680s per step for 10^6 agents
    • LLM processing: Variable based on model and provider
  • Latency:

    • Agent decision cycle: 2.94s average per call
    • Environment response: 9.55-33.55ms per call
    • End-to-end interaction: Variable based on complexity
  • Scalability:

    • Linear scaling up to 32 processes
    • Efficient handling of 10,000+ agents
    • Primary bottleneck: LLM API calls

4. Implementation Challenges and Solutions

4.1 Technical Challenges

4.1.1 TCP Port Exhaustion

Challenge: Running each agent as its own process, with its own client connections, would exhaust the available TCP ports (port numbers are 16-bit, so at most 65,535 per host address)

Solution: Group-based execution with connection sharing

  • Multiple agents operate within single processes
  • Shared client connections to services
  • Connection pooling implementation

4.1.2 LLM API Latency

Challenge: LLM API calls introduce significant latency

Solution: Asynchronous execution and parallelization

  • Concurrent LLM requests through asyncio
  • Multi-process execution through Ray
  • Adaptive retry and timeout mechanisms

4.1.3 Inter-agent Communication

Challenge: Efficient message routing between thousands of agents

Solution: MQTT-based messaging system

  • Lightweight publish/subscribe architecture
  • Topic-based routing for efficient delivery
  • Optimized subscription patterns

4.1.4 Data Management

Challenge: Storing and analyzing massive simulation data

Solution: Hybrid storage approach

  • PostgreSQL with COPY FROM optimization
  • Avro format for local file storage
  • Selective logging with importance sampling

4.2 Implementation Tradeoffs

4.2.1 Realism vs. Performance

Tradeoff: More realistic simulations require more computational resources

Solution: Multi-level simulation fidelity

  • Configurable detail levels for different components
  • Importance-based resource allocation
  • Simplified models for less critical aspects

4.2.2 Centralization vs. Distribution

Tradeoff: Centralized control simplifies coordination but limits scalability

Solution: Hybrid architecture

  • Centralized coordination for critical operations
  • Distributed execution for agent processing
  • Hierarchical organization with local autonomy

4.2.3 Generality vs. Optimization

Tradeoff: General-purpose code vs. optimized implementations

Solution: Layered architecture

  • Core framework with general interfaces
  • Specialized implementations for performance-critical components
  • Plugin system for domain-specific optimizations

5. Future Technical Directions

5.1 Architecture Enhancements

  • Adaptive Load Balancing: Dynamic agent redistribution based on interaction patterns
  • Hierarchical Simulation: Multi-level simulation with varying fidelity
  • Hybrid Cloud-Edge Deployment: Distributed processing across cloud and edge resources

5.2 Performance Improvements

  • Local LLM Deployment: On-premise inference for reduced latency
  • Specialized Hardware Acceleration: Custom FPGA/ASIC for environment simulation
  • Optimized Data Structures: Custom implementations for agent memory and environment

5.3 Feature Extensions

  • Enhanced Market Dynamics: Detailed modeling of goods and labor markets
  • Richer Urban Environments: Interior spaces and urban microenvironments
  • Platform-Specific Social Dynamics: Differentiated social media platforms
  • Cultural Variation: Integration of cultural differences in social behaviors

6. Conclusion

AgentSociety's technical implementation addresses three key challenges in large-scale social simulation: scalability, communication, and computational efficiency. The architecture integrates LLM-driven agents with a realistic societal environment through a distributed simulation engine that handles 10,000+ agents.

The system's design choices, particularly the group-based distributed execution model, the MQTT messaging system, and the supporting utilities, create a flexible and extensible platform for social science research. While LLM API latency remains the primary bottleneck, the overall architecture provides a solid foundation for future enhancements and optimizations.

As computational resources and LLM capabilities continue to advance, this architecture provides a blueprint for even more ambitious simulations of human society, potentially scaling to millions of agents with increasingly realistic behaviors and interactions.