Scaling Architecture

This document outlines the scaling strategies and architecture for the Stayzr hotel management system, covering horizontal and vertical scaling approaches for different system components.

Scaling Overview

The Stayzr platform is designed to scale efficiently across multiple dimensions:

  • Horizontal Scaling: Adding more instances to distribute load
  • Vertical Scaling: Increasing resources for existing instances
  • Database Scaling: Read replicas and sharding strategies
  • Caching Strategies: Multi-layer caching for performance
  • Auto-scaling: Dynamic resource allocation based on demand

Application Scaling

Horizontal Scaling Strategy

Stateless Application Design

// Stateless session management — sessions live in Redis so any
// application instance can serve any request
export class SessionManager {
  constructor(private redisClient: Redis) {}

  async createSession(userId: string, data: SessionData): Promise<string> {
    const sessionId = generateSecureId();
    const sessionKey = `session:${sessionId}`;

    await this.redisClient.setex(
      sessionKey,
      SESSION_TIMEOUT,
      JSON.stringify({
        userId,
        ...data,
        createdAt: new Date(),
        lastAccess: new Date()
      })
    );

    return sessionId;
  }

  async getSession(sessionId: string): Promise<SessionData | null> {
    const sessionKey = `session:${sessionId}`;
    const sessionData = await this.redisClient.get(sessionKey);

    if (!sessionData) return null;

    // Update last access time and refresh the TTL
    const parsed = JSON.parse(sessionData);
    parsed.lastAccess = new Date();

    await this.redisClient.setex(
      sessionKey,
      SESSION_TIMEOUT,
      JSON.stringify(parsed)
    );

    return parsed;
  }
}

Load Balancing Configuration

# nginx.conf for load balancing
upstream backend {
    least_conn;
    server app1:3000 weight=1 max_fails=3 fail_timeout=30s;
    server app2:3000 weight=1 max_fails=3 fail_timeout=30s;
    server app3:3000 weight=1 max_fails=3 fail_timeout=30s;

    # Keep idle upstream connections open for reuse
    keepalive 32;
}

server {
    listen 80;

    location / {
        proxy_pass http://backend;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;

        # Connection settings
        proxy_connect_timeout 30s;
        proxy_send_timeout 30s;
        proxy_read_timeout 30s;

        # Retry logic
        proxy_next_upstream error timeout http_500 http_502 http_503 http_504;
        proxy_next_upstream_tries 3;
    }

    # Health check endpoint
    location /health {
        proxy_pass http://backend;
        access_log off;
    }
}

Database Scaling

Read Replica Strategy

Database Architecture:
Primary Database:
- Handles all write operations
- Single source of truth
- Automatic failover setup

Read Replicas:
- Handle read-only queries
- Geographic distribution
- Load balancing across replicas

Connection Routing:
- Write operations → Primary
- Read operations → Replicas
- Fallback to primary if replicas unavailable

Database Connection Management

// Database connection pool with read/write splitting
export class DatabaseManager {
  private writePool: Pool;
  private readPools: Pool[];
  private nextReadPool = 0;

  constructor() {
    this.writePool = new Pool({
      connectionString: process.env.PRIMARY_DATABASE_URL,
      max: 20,
      idleTimeoutMillis: 30000,
      connectionTimeoutMillis: 2000
    });

    this.readPools = [
      new Pool({
        connectionString: process.env.READ_REPLICA_1_URL,
        max: 15,
        idleTimeoutMillis: 30000
      }),
      new Pool({
        connectionString: process.env.READ_REPLICA_2_URL,
        max: 15,
        idleTimeoutMillis: 30000
      })
    ];
  }

  async executeWrite(query: string, params: any[]): Promise<any> {
    const client = await this.writePool.connect();
    try {
      return await client.query(query, params);
    } finally {
      client.release();
    }
  }

  async executeRead(query: string, params: any[]): Promise<any> {
    // Round-robin selection of read replicas
    const pool = this.readPools[this.nextReadPool];
    this.nextReadPool = (this.nextReadPool + 1) % this.readPools.length;

    const client = await pool.connect();
    try {
      return await client.query(query, params);
    } catch (error) {
      // Fallback to primary if replica fails
      console.warn('Read replica failed, falling back to primary', error);
      return await this.executeWrite(query, params);
    } finally {
      client.release();
    }
  }
}

Database Sharding Strategy

Sharding Approach:
Shard Key: organization_id (hotel ID)
Reason: Natural tenant isolation

Shard Distribution:
Shard 1: Hotels 1-1000
Shard 2: Hotels 1001-2000
Shard 3: Hotels 2001-3000
# Additional shards provisioned as demand grows

Cross-Shard Queries:
- Avoid when possible
- Use aggregation services
- Implement query federation

// Shard routing logic — maps a hotel ID to its shard connection
export class ShardRouter {
  private shards: Map<string, DatabaseConnection> = new Map();

  constructor() {
    this.initializeShards();
  }

  private initializeShards() {
    // Initialize connections to different shards
    this.shards.set('shard1', new DatabaseConnection(process.env.SHARD1_URL));
    this.shards.set('shard2', new DatabaseConnection(process.env.SHARD2_URL));
    this.shards.set('shard3', new DatabaseConnection(process.env.SHARD3_URL));
  }

  getShardForHotel(hotelId: number): DatabaseConnection {
    const shardKey = this.calculateShardKey(hotelId);
    const connection = this.shards.get(shardKey);

    if (!connection) {
      throw new Error(`No shard found for hotel ${hotelId}`);
    }

    return connection;
  }

  private calculateShardKey(hotelId: number): string {
    if (hotelId <= 1000) return 'shard1';
    if (hotelId <= 2000) return 'shard2';
    return 'shard3';
  }
}

Caching Strategy

Multi-Layer Caching Architecture

Caching Implementation

// Multi-level cache implementation
export class CacheManager {
  private redisClient: Redis;
  private memoryCache: Map<string, any> = new Map();

  constructor(redisClient: Redis) {
    this.redisClient = redisClient;
  }

  async get<T>(key: string, fallback: () => Promise<T>, ttl: number = 300): Promise<T> {
    // Level 1: Memory cache
    if (this.memoryCache.has(key)) {
      return this.memoryCache.get(key);
    }

    // Level 2: Redis cache
    const redisValue = await this.redisClient.get(key);
    if (redisValue) {
      const parsed = JSON.parse(redisValue);
      this.memoryCache.set(key, parsed);
      return parsed;
    }

    // Level 3: Database fallback
    const value = await fallback();

    // Cache in both levels
    await this.set(key, value, ttl);

    return value;
  }

  async set(key: string, value: any, ttl: number = 300): Promise<void> {
    // Cache in memory (with shorter TTL, capped at 60s)
    this.memoryCache.set(key, value);
    setTimeout(() => this.memoryCache.delete(key), Math.min(ttl, 60) * 1000);

    // Cache in Redis
    await this.redisClient.setex(key, ttl, JSON.stringify(value));
  }

  async invalidate(pattern: string): Promise<void> {
    // Clear memory cache
    for (const key of this.memoryCache.keys()) {
      if (key.includes(pattern)) {
        this.memoryCache.delete(key);
      }
    }

    // Clear Redis cache
    // Note: KEYS blocks Redis while scanning the keyspace; prefer
    // incremental SCAN in production deployments
    const keys = await this.redisClient.keys(`*${pattern}*`);
    if (keys.length > 0) {
      await this.redisClient.del(...keys);
    }
  }
}

Cache Strategies by Data Type

Static Data (Long TTL - 24 hours):
- Hotel information
- Room types and amenities
- Service catalogs
- Configuration settings

Dynamic Data (Medium TTL - 1 hour):
- Room availability
- Pricing information
- Staff schedules
- Guest preferences

Real-time Data (Short TTL - 5 minutes):
- Room status updates
- Active reservations
- Current occupancy
- Live pricing

Session Data (Redis with expiration):
- User sessions
- Shopping cart data
- Temporary form data
- Authentication tokens
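
The tiers above can be collapsed into a single TTL policy table that the caching layer consults when writing entries. The sketch below is illustrative — the key prefixes and helper name are hypothetical, not taken from the Stayzr codebase:

```typescript
// Illustrative TTL policy (in seconds) mirroring the tiers above.
// Key prefixes are hypothetical examples, not actual Stayzr entities.
const CACHE_TTL_SECONDS: Record<string, number> = {
  // Static data — 24 hours
  'hotel:info': 24 * 60 * 60,
  'hotel:room-types': 24 * 60 * 60,
  // Dynamic data — 1 hour
  'rooms:availability': 60 * 60,
  'pricing:rates': 60 * 60,
  // Real-time data — 5 minutes
  'rooms:status': 5 * 60,
  'occupancy:current': 5 * 60,
};

// Resolve a TTL by key prefix, defaulting to the shortest tier so
// unrecognized data is never cached for too long.
export function ttlForKey(key: string): number {
  const prefix = key.split('/')[0];
  return CACHE_TTL_SECONDS[prefix] ?? 5 * 60;
}
```

Defaulting to the shortest TTL is the conservative choice: an unknown key that is secretly real-time data goes stale in minutes rather than hours.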

Auto-scaling Configuration

Kubernetes Horizontal Pod Autoscaler

# hpa.yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: stayzr-app-hpa
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: stayzr-app
  minReplicas: 3
  maxReplicas: 20
  metrics:
  - type: Resource
    resource:
      name: cpu
      target:
        type: Utilization
        averageUtilization: 70
  - type: Resource
    resource:
      name: memory
      target:
        type: Utilization
        averageUtilization: 80
  - type: Pods
    pods:
      metric:
        name: http_requests_per_second
      target:
        type: AverageValue
        averageValue: "100"
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 100
        periodSeconds: 15
      - type: Pods
        value: 4
        periodSeconds: 15
    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

Application Deployment for Scaling

# deployment.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stayzr-app
spec:
  replicas: 3
  selector:
    matchLabels:
      app: stayzr-app
  template:
    metadata:
      labels:
        app: stayzr-app
    spec:
      containers:
      - name: stayzr-app
        image: stayzr:latest
        ports:
        - containerPort: 3000
        env:
        - name: NODE_ENV
          value: "production"
        - name: REDIS_URL
          valueFrom:
            secretKeyRef:
              name: stayzr-secrets
              key: redis-url
        resources:
          requests:
            memory: "512Mi"
            cpu: "250m"
          limits:
            memory: "1Gi"
            cpu: "500m"
        livenessProbe:
          httpGet:
            path: /health
            port: 3000
          initialDelaySeconds: 30
          periodSeconds: 10
        readinessProbe:
          httpGet:
            path: /health/ready
            port: 3000
          initialDelaySeconds: 5
          periodSeconds: 5

Performance Optimization

Database Query Optimization

-- Index optimization for common queries
CREATE INDEX CONCURRENTLY idx_reservations_hotel_dates
ON reservations(hotel_id, check_in_date, check_out_date);

CREATE INDEX CONCURRENTLY idx_guests_hotel_email
ON guests(hotel_id, email);

CREATE INDEX CONCURRENTLY idx_rooms_hotel_status
ON rooms(hotel_id, status) WHERE status IN ('available', 'occupied');

-- Partitioning strategy for large tables
CREATE TABLE reservations_2024 PARTITION OF reservations
FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');

CREATE TABLE reservations_2025 PARTITION OF reservations
FOR VALUES FROM ('2025-01-01') TO ('2026-01-01');

Connection Pool Optimization

// Optimized connection pool configuration
export const createDatabasePool = (config: DatabaseConfig) => {
  return new Pool({
    host: config.host,
    port: config.port,
    database: config.database,
    user: config.user,
    password: config.password,

    // Connection pool settings
    min: 5,                          // Minimum connections
    max: 20,                         // Maximum connections
    acquireTimeoutMillis: 10000,     // Max time to get a connection
    createTimeoutMillis: 10000,      // Max time to create a connection
    destroyTimeoutMillis: 5000,      // Max time to destroy a connection
    idleTimeoutMillis: 30000,        // Close idle connections after 30s
    reapIntervalMillis: 1000,        // Check for idle connections every 1s
    createRetryIntervalMillis: 200,  // Retry interval for failed connections

    // Connection validation
    validate: (client: any) => {
      return !client.connection._socket.destroyed;
    },

    // Logging
    log: (message: string, level: string) => {
      logger.info(`DB Pool ${level}: ${message}`);
    }
  });
};

Monitoring Scaling Metrics

Key Scaling Metrics

Application Metrics:
- Response time (P95, P99)
- Request rate (requests/second)
- Error rate (errors/minute)
- CPU utilization per instance
- Memory usage per instance
- Active connections per instance

Database Metrics:
- Query execution time
- Connection pool usage
- Lock contention
- Replication lag
- Cache hit ratio
- Disk I/O utilization

Infrastructure Metrics:
- Node CPU and memory usage
- Network bandwidth utilization
- Disk usage and I/O
- Load balancer health
- Auto-scaling events
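
The P95/P99 response-time metrics above reduce to a percentile over a window of latency samples. A minimal sketch of that math, assuming raw samples are available (a metrics library's histogram would normally do this):

```typescript
// Minimal percentile helper for response-time metrics (P95, P99).
// Uses the nearest-rank method on a sorted copy of the samples; a
// real deployment would use a metrics library's histogram instead.
export function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error('no samples');
  const sorted = [...samplesMs].sort((a, b) => a - b);
  // Nearest rank: the smallest value such that at least p% of
  // samples are less than or equal to it.
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.min(rank, sorted.length) - 1];
}
```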

Scaling Alerts

# scaling-alerts.yml
groups:
- name: scaling
  rules:
  - alert: HighCPUUsage
    expr: avg(cpu_usage_percent) by (instance) > 80
    for: 5m
    annotations:
      summary: "High CPU usage detected"
      description: "CPU usage is {{ $value }}% on {{ $labels.instance }}"

  - alert: HighMemoryUsage
    expr: avg(memory_usage_percent) by (instance) > 85
    for: 3m
    annotations:
      summary: "High memory usage detected"
      description: "Memory usage is {{ $value }}% on {{ $labels.instance }}"

  - alert: DatabaseConnectionPoolExhausted
    expr: db_connections_active / db_connections_max > 0.9
    for: 2m
    annotations:
      summary: "Database connection pool nearly exhausted"
      description: "{{ $value | humanizePercentage }} of database connections in use"

  - alert: ScalingEventFrequent
    expr: rate(hpa_scaling_events_total[10m]) > 0.1
    for: 5m
    annotations:
      summary: "Frequent auto-scaling events"
      description: "Auto-scaler is averaging {{ $value }} scaling events per second over 10m"

Cost Optimization

Resource Right-Sizing

Cost Optimization Strategies:
Compute Resources:
- Use appropriate instance types
- Implement spot instances for development
- Schedule scaling for predictable patterns
- Use reserved instances for baseline load

Database Resources:
- Optimize queries to reduce CPU usage
- Use read replicas strategically
- Archive old data regularly
- Implement connection pooling

Storage Resources:
- Use appropriate storage classes
- Implement data lifecycle policies
- Compress backups and logs
- Clean up unused resources
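
"Schedule scaling for predictable patterns" can be as simple as a time-of-day baseline that feeds the autoscaler's minimum replica count. A sketch under assumed traffic patterns — the hours and replica counts below are illustrative, not measured Stayzr values:

```typescript
// Sketch of schedule-aware baseline sizing for predictable load
// patterns (e.g. morning check-out and afternoon check-in peaks).
// All hours and replica counts are hypothetical examples.
export function baselineReplicas(hourUtc: number): number {
  if (hourUtc < 0 || hourUtc > 23) throw new Error('hour out of range');
  if (hourUtc >= 14 && hourUtc <= 18) return 8; // check-in peak
  if (hourUtc >= 8 && hourUtc <= 11) return 6;  // check-out peak
  if (hourUtc <= 5) return 3;                   // overnight minimum
  return 4;                                     // default daytime baseline
}
```

Pre-scaling for known peaks is cheaper than reacting to them: the autoscaler never has to catch up, so fewer requests land on cold or overloaded instances.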

Scaling Policies

// Custom scaling logic based on business metrics
interface ScalingDecision {
  action: 'scale-up' | 'scale-down' | 'no-change';
  reason: string;
}

export class BusinessMetricScaler {
  async shouldScale(): Promise<ScalingDecision> {
    const metrics = await this.collectMetrics();

    // Business-aware scaling logic
    const checkInsPerMinute = metrics.checkInsPerMinute;
    const avgResponseTime = metrics.avgResponseTime;
    const errorRate = metrics.errorRate;

    // Scale up conditions
    if (checkInsPerMinute > 50 && avgResponseTime > 2000) {
      return { action: 'scale-up', reason: 'High check-in volume with slow response' };
    }

    if (errorRate > 0.05) {
      return { action: 'scale-up', reason: 'High error rate detected' };
    }

    // Scale down conditions
    if (checkInsPerMinute < 10 && avgResponseTime < 500) {
      return { action: 'scale-down', reason: 'Low traffic with fast response' };
    }

    return { action: 'no-change', reason: 'Metrics within normal range' };
  }
}

Disaster Recovery and Scaling

Multi-Region Deployment

Multi-Region Strategy:
Primary Region (US-East):
- Full application stack
- Primary database with backups
- Real-time monitoring

Secondary Region (US-West):
- Standby application instances
- Database read replica
- Reduced monitoring

Failover Process:
- Automatic DNS failover
- Database promotion
- Application restart
- Data synchronization
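
The failover decision itself reduces to a small routing rule: stay on the primary region while it is healthy, shift to the secondary only when it is not. A minimal sketch, with hypothetical region names matching the strategy above:

```typescript
// Hypothetical failover routing sketch for the process above.
type RegionHealth = { primaryHealthy: boolean; secondaryHealthy: boolean };

export function targetRegion(health: RegionHealth): 'us-east' | 'us-west' {
  if (health.primaryHealthy) return 'us-east';
  if (health.secondaryHealthy) return 'us-west';
  // Both unhealthy: stay on the primary rather than flap between
  // regions while operators investigate.
  return 'us-east';
}
```

In practice this logic lives in the DNS failover layer; the stable choice when both regions report unhealthy avoids oscillation during partial outages.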

Scaling During Incidents

Incident Response Scaling:
Immediate Actions:
- Increase replica count
- Enable emergency caching
- Route traffic to healthy instances
- Disable non-essential features

Recovery Actions:
- Gradual traffic restoration
- Performance monitoring
- Normal scaling restoration
- Post-incident analysis
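
The "disable non-essential features" step is typically a degradation flag that sheds optional work while core flows stay up. A minimal sketch — the feature names are hypothetical:

```typescript
// Minimal sketch of incident-mode feature shedding: in degraded
// mode, non-essential features are filtered out while core flows
// (check-in, reservations) keep running. Names are hypothetical.
const NON_ESSENTIAL = new Set(['recommendations', 'analytics', 'email-digests']);

export function activeFeatures(all: string[], degraded: boolean): string[] {
  return degraded ? all.filter((f) => !NON_ESSENTIAL.has(f)) : [...all];
}
```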

Testing Scaling Strategies

Load Testing

// Load testing script using k6
import http from 'k6/http';
import { check, sleep } from 'k6';

export let options = {
  stages: [
    { duration: '5m', target: 100 },  // Ramp up
    { duration: '10m', target: 100 }, // Stay at 100 users
    { duration: '5m', target: 200 },  // Ramp to 200 users
    { duration: '10m', target: 200 }, // Stay at 200 users
    { duration: '5m', target: 0 },    // Ramp down
  ],
};

export default function () {
  // Test check-in endpoint under load
  const response = http.post(
    'https://api.stayzr.com/v1/guest/checkin',
    JSON.stringify({
      bookingReference: 'TEST123',
      guestName: 'Load Test User'
    }),
    { headers: { 'Content-Type': 'application/json' } }
  );

  check(response, {
    'status is 200': (r) => r.status === 200,
    'response time < 2000ms': (r) => r.timings.duration < 2000,
  });

  sleep(1);
}

Chaos Engineering

Chaos Testing Scenarios:
- Random pod termination
- Database connection failures
- Network latency injection
- Memory pressure testing
- Disk space exhaustion
- Load balancer failures
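
Scenarios such as "database connection failures" can be exercised in code with a small fault-injection wrapper that fails a call with a configured probability. A sketch, not part of any particular chaos toolkit:

```typescript
// Sketch of a fault-injection wrapper: with probability `failureRate`
// the wrapped call throws instead of running, simulating scenarios
// like database connection failures. The injectable `random` source
// makes the behavior deterministic in tests.
export function withFaultInjection<T extends unknown[], R>(
  fn: (...args: T) => R,
  failureRate: number,
  random: () => number = Math.random
): (...args: T) => R {
  return (...args: T) => {
    if (random() < failureRate) {
      throw new Error('chaos: injected failure');
    }
    return fn(...args);
  };
}
```

Wrapping the database client with a small `failureRate` in a staging environment verifies that the replica fallback and retry paths described earlier actually engage.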