Implementing Role-Based Access Control with SSO in Multi-Tenant Applications

During my time at VMware, one of the most complex and critical projects I worked on was implementing Role-Based Access Control (RBAC) with Single Sign-On (SSO) for our SD-WAN orchestrator. As a multi-tenant application serving enterprise customers worldwide, we needed to support diverse authentication requirements while maintaining security, scalability, and ease of use.

The Challenge

Our SD-WAN orchestrator was rapidly growing from a single-tenant application to a multi-tenant SaaS platform. Each enterprise customer had unique requirements:

Diverse Identity Providers: Customers used different identity providers (Active Directory, Okta, Azure AD, and others) and different protocols, most commonly SAML
Complex Role Hierarchies: Different organizational structures required flexible role definitions
Tenant Isolation: Complete separation of data and permissions between customers
Compliance Requirements: SOC 2, GDPR, and industry-specific security standards
Scale: Supporting thousands of users across hundreds of enterprise customers

Architecture Overview

We designed a flexible RBAC system that could adapt to various enterprise authentication patterns:

// Core RBAC entities
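interface User {
  id: string
  email: string
  firstName: string
  lastName: string
  tenantId: string
  roles: Role[]
  identityProvider: string
  externalId: string   // subject identifier issued by the identity provider
  isActive: boolean
  lastLoginAt?: Date
}

interface Role {
  id: string
  name: string
  tenantId: string
  permissions: Permission[]
  isSystemRole: boolean
  isDefault: boolean   // assigned when no SSO group mapping matches
}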

interface Permission {
  resource: string
  actions: string[]
  conditions?: AccessCondition[]
}

interface Tenant {
  id: string
  name: string
  ssoConfig: SSOConfiguration
  customRoles: Role[]
}
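
To make the relationships between these entities concrete, here is a minimal sketch of how a permission lookup walks from a user through roles to permissions. The `can` helper and the sample data are illustrative assumptions, not the orchestrator's code, and the sketch uses exact string matching rather than the pattern matching used by the RBAC service.

```typescript
// Trimmed-down versions of the entities above (illustrative only).
interface Permission {
  resource: string
  actions: string[]
}

interface Role {
  id: string
  name: string
  tenantId: string
  permissions: Permission[]
}

interface User {
  id: string
  tenantId: string
  roles: Role[]
}

// Hypothetical helper: true if any of the user's roles, scoped to the user's
// own tenant, grants `action` on `resource`.
function can(user: User, resource: string, action: string): boolean {
  return user.roles
    .filter(role => role.tenantId === user.tenantId)
    .some(role =>
      role.permissions.some(
        p => p.resource === resource && p.actions.includes(action)
      )
    )
}

const viewer: Role = {
  id: 'role-1',
  name: 'viewer',
  tenantId: 'tenant-a',
  permissions: [{ resource: 'edge', actions: ['read'] }]
}
const alice: User = { id: 'user-1', tenantId: 'tenant-a', roles: [viewer] }

console.log(can(alice, 'edge', 'read'))   // true
console.log(can(alice, 'edge', 'delete')) // false
```

Filtering roles by the user's tenant is a defensive choice in this sketch: a role row from another tenant grants nothing even if it is attached by mistake.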

SSO Integration Strategy

1. Multi-Protocol Support

We implemented support for multiple authentication protocols to accommodate different customer preferences:

interface SSOConfiguration {
  tenantId: string
  protocol: 'SAML' | 'OIDC' | 'LDAP'
  providerName: string
  configuration: SAMLConfig | OIDCConfig | LDAPConfig
  attributeMapping: AttributeMapping
  isActive: boolean
}

interface AttributeMapping {
  email: string
  firstName: string
  lastName: string
  groups?: string
  roles?: string
  department?: string
}
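
The attribute mapping tells the application which claim in the identity provider's response holds each user field. The sketch below shows one way such a mapping could be applied. The `mapAttributes` function, the `RawClaims` shape, and the claim names in the example are assumptions for illustration, and lowercasing the email is a choice of this sketch rather than something the original system is known to do.

```typescript
interface AttributeMapping {
  email: string
  firstName: string
  lastName: string
  groups?: string
}

interface MappedUser {
  email: string
  firstName: string
  lastName: string
  groups: string[]
}

// Identity providers return some attributes as a single value and others as a list.
type RawClaims = Record<string, string | string[] | undefined>

// Take the first value of a possibly multi-valued attribute, or '' if it is absent.
function firstValue(value: string | string[] | undefined): string {
  if (Array.isArray(value)) return value[0] ?? ''
  return value ?? ''
}

function mapAttributes(claims: RawClaims, mapping: AttributeMapping): MappedUser {
  const rawGroups = mapping.groups ? claims[mapping.groups] : undefined
  const groups =
    rawGroups === undefined ? [] : Array.isArray(rawGroups) ? rawGroups : [rawGroups]

  return {
    email: firstValue(claims[mapping.email]).toLowerCase(),
    firstName: firstValue(claims[mapping.firstName]),
    lastName: firstValue(claims[mapping.lastName]),
    groups
  }
}

// Example with directory-style attribute names (hypothetical tenant configuration).
const mapping: AttributeMapping = {
  email: 'mail',
  firstName: 'givenName',
  lastName: 'sn',
  groups: 'memberOf'
}
const mapped = mapUserExample()

function mapUserExample(): MappedUser {
  return mapAttributes(
    { mail: 'Alice@Example.com', givenName: 'Alice', sn: 'Ng', memberOf: 'NetworkAdmins' },
    mapping
  )
}
console.log(mapped)
```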

2. Dynamic Provider Registration

Customers could configure their own identity providers through our admin interface:

import { Repository } from 'typeorm'
import { Injectable } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import * as crypto from 'crypto'
import { ConfigService } from '@nestjs/config'

@Injectable()
export class SSOConfigurationService {
  constructor(
    @InjectRepository(SSOConfiguration)
    private readonly ssoConfigRepository: Repository<SSOConfiguration>,
    private readonly configService: ConfigService
  ) {}

  async registerProvider(
    tenantId: string,
    config: SSOConfiguration & { secrets?: Record<string, unknown> }
  ): Promise<void> {
    // Validate the configuration and confirm the provider is reachable
    // before anything is persisted
    await this.validateSSOConfig(config)
    await this.testSSOConnection(config)

    // Separate secrets from the rest so plaintext never reaches the database
    const { secrets, ...publicConfig } = config
    const encryptedSecrets = this.encryptSecrets(secrets ?? {})

    // Store configuration securely
    const ssoConfig = this.ssoConfigRepository.create({
      ...publicConfig,
      tenantId,
      createdAt: new Date(),
      encryptedSecrets
    })

    await this.ssoConfigRepository.save(ssoConfig)

    // Update runtime configuration
    await this.updateRuntimeConfig(tenantId, config)
  }
  
  private async validateSSOConfig(config: SSOConfiguration): Promise<void> {
    switch (config.protocol) {
      case 'SAML':
        await this.validateSAMLConfig(config.configuration as SAMLConfig)
        break
      case 'OIDC':
        await this.validateOIDCConfig(config.configuration as OIDCConfig)
        break
      case 'LDAP':
        await this.validateLDAPConfig(config.configuration as LDAPConfig)
        break
      default:
        throw new Error(`Unsupported SSO protocol: ${config.protocol}`)
    }
  }

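  private encryptSecrets(secrets: Record<string, unknown>): string {
    const algorithm = 'aes-256-gcm'
    // ENCRYPTION_KEY is assumed to hold a 32-byte key, hex-encoded
    const keyHex = this.configService.get<string>('ENCRYPTION_KEY')
    if (!keyHex) {
      throw new Error('ENCRYPTION_KEY is not configured')
    }
    const key = Buffer.from(keyHex, 'hex')
    const iv = crypto.randomBytes(12) // 96-bit IV, the recommended size for GCM

    // createCipheriv actually uses the IV; the deprecated createCipher ignored it
    const cipher = crypto.createCipheriv(algorithm, key, iv)
    let encrypted = cipher.update(JSON.stringify(secrets), 'utf8', 'hex')
    encrypted += cipher.final('hex')

    // GCM produces an authentication tag that is needed to decrypt and to detect tampering
    const authTag = cipher.getAuthTag().toString('hex')

    return `${iv.toString('hex')}:${authTag}:${encrypted}`
  }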
}
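
Stored secrets also have to be read back when a login is processed. The following is a minimal sketch of an AES-256-GCM round trip using Node's built-in `crypto` module. It assumes a payload laid out as `iv:authTag:ciphertext` in hex and a 32-byte key passed in directly; both the layout and the standalone function names are assumptions of this sketch.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'crypto'

const ALGORITHM = 'aes-256-gcm'

// Encrypt a secrets object. The result is "iv:authTag:ciphertext", all hex-encoded.
function encryptSecrets(secrets: Record<string, unknown>, key: Buffer): string {
  const iv = randomBytes(12) // a fresh 96-bit IV for every encryption
  const cipher = createCipheriv(ALGORITHM, key, iv)
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(secrets), 'utf8'),
    cipher.final()
  ])
  const authTag = cipher.getAuthTag()
  return [iv, authTag, ciphertext].map(part => part.toString('hex')).join(':')
}

// Decrypt a payload produced by encryptSecrets. Throws if the payload is
// malformed, the key is wrong, or the data was modified.
function decryptSecrets(payload: string, key: Buffer): Record<string, unknown> {
  const [ivHex, tagHex, dataHex] = payload.split(':')
  if (!ivHex || !tagHex || !dataHex) {
    throw new Error('Malformed encrypted payload')
  }
  const decipher = createDecipheriv(ALGORITHM, key, Buffer.from(ivHex, 'hex'))
  decipher.setAuthTag(Buffer.from(tagHex, 'hex'))
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, 'hex')),
    decipher.final() // authentication is checked here
  ])
  return JSON.parse(plaintext.toString('utf8'))
}

const demoKey = randomBytes(32) // in practice the key comes from a secret store
const stored = encryptSecrets({ clientSecret: 'example-secret' }, demoKey)
const restored = decryptSecrets(stored, demoKey)
console.log(restored.clientSecret === 'example-secret')
```

Because GCM is authenticated, a modified ciphertext or a wrong key causes `decipher.final()` to throw instead of returning corrupted data.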

RBAC Implementation

1. Hierarchical Role System

We implemented a flexible role hierarchy that could accommodate different organizational structures:

import { Injectable, Logger } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import { Repository } from 'typeorm'
import { Redis } from 'ioredis'
import * as minimatch from 'minimatch'

@Injectable()
export class RBACService {
  // Members are protected rather than private so subclasses can reuse them
  protected readonly logger = new Logger(RBACService.name)

  constructor(
    @InjectRepository(User)
    protected readonly userRepository: Repository<User>,
    @InjectRepository(Role)
    protected readonly roleRepository: Repository<Role>,
    protected readonly redis: Redis
  ) {}

  async checkPermission(
    userId: string, 
    resource: string, 
    action: string, 
    context?: AccessContext
  ): Promise<boolean> {
    try {
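      // Check cache first. The tenant is part of the key so a decision made in one
      // tenant context is never reused in another, and the prefix matches the
      // pattern used for invalidation. Decisions that depend on time or IP
      // conditions can still be served from cache for up to the TTL.
      const cacheKey = `rbac:permission:${userId}:${context?.tenantId ?? 'none'}:${resource}:${action}`
      const cached = await this.redis.get(cacheKey)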
      
      if (cached !== null) {
        return cached === 'true'
      }

      const user = await this.userRepository.findOne({
        where: { id: userId },
        relations: ['roles', 'roles.permissions']
      })

      if (!user || !user.isActive) {
        await this.redis.setex(cacheKey, 300, 'false')
        return false
      }
      
      // Check all user roles
      for (const role of user.roles) {
        if (await this.roleHasPermission(role, resource, action, context)) {
          await this.redis.setex(cacheKey, 300, 'true')
          return true
        }
      }
      
      await this.redis.setex(cacheKey, 300, 'false')
      return false
    } catch (error) {
      this.logger.error(`Error checking permission for user ${userId}`, error)
      return false
    }
  }
  
  private async roleHasPermission(
    role: Role, 
    resource: string, 
    action: string, 
    context?: AccessContext
  ): Promise<boolean> {
    for (const permission of role.permissions) {
      // Use minimatch for resource pattern matching
      if (minimatch(resource, permission.resource) &&
          permission.actions.includes(action)) {
        
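        // Check conditions if present. If they fail, keep looking: another
        // permission on the same role may still grant access.
        if (permission.conditions && permission.conditions.length > 0) {
          if (await this.evaluateConditions(permission.conditions, context)) {
            return true
          }
          continue
        }
        
        return true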
      }
    }
    
    return false
  }

  private async evaluateConditions(
    conditions: AccessCondition[], 
    context?: AccessContext
  ): Promise<boolean> {
    // Every condition must hold; an unrecognized condition type denies access
    for (const condition of conditions) {
      switch (condition.type) {
        case 'tenant':
          if (context?.tenantId !== condition.value) return false
          break
        case 'time':
          if (!this.isWithinTimeRange(condition.value)) return false
          break
        case 'ip':
          if (!this.isAllowedIP(context?.ipAddress, condition.value)) return false
          break
        default:
          return false
      }
    }
    return true
  }
}
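
The `isAllowedIP` helper referenced in `evaluateConditions` is not shown. Here is a minimal sketch of what it might look like, assuming each `ip` condition carries a single IPv4 address or an IPv4 CIDR range. IPv6 is out of scope for this sketch, and the real helper may differ.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned integer, or null if invalid.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.')
  if (parts.length !== 4) return null
  let value = 0
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null
    const octet = Number(part)
    if (octet > 255) return null
    value = value * 256 + octet // arithmetic instead of bit shifts avoids signed 32-bit overflow
  }
  return value
}

// True if `ip` falls inside `cidr` (for example "10.0.0.0/8"). A bare address is
// treated as a /32. Missing or malformed input is denied.
function isAllowedIP(ip: string | undefined, cidr: string): boolean {
  if (!ip) return false
  const [range = '', bitsText = '32'] = cidr.split('/')
  if (!/^\d{1,2}$/.test(bitsText)) return false
  const bits = Number(bitsText)
  if (bits > 32) return false

  const ipInt = ipv4ToInt(ip)
  const rangeInt = ipv4ToInt(range)
  if (ipInt === null || rangeInt === null) return false

  // Two addresses are in the same block when they agree on the top `bits` bits.
  const blockSize = 2 ** (32 - bits)
  return Math.floor(ipInt / blockSize) === Math.floor(rangeInt / blockSize)
}

console.log(isAllowedIP('10.1.2.3', '10.0.0.0/8'))  // true
console.log(isAllowedIP('11.0.0.1', '10.0.0.0/8'))  // false
```

Denying on malformed input keeps the condition fail-closed, which matches how the permission check treats errors.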

2. Dynamic Role Assignment

We supported both static role assignment and dynamic role mapping from SSO attributes:

import { Injectable, Logger } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import { Repository, EntityManager } from 'typeorm'
import { EventEmitter2 } from '@nestjs/event-emitter'

@Injectable()
export class UserProvisioningService {
  private readonly logger = new Logger(UserProvisioningService.name)

  constructor(
    @InjectRepository(User)
    private readonly userRepository: Repository<User>,
    @InjectRepository(Role)
    private readonly roleRepository: Repository<Role>,
    @InjectRepository(SSOConfiguration)
    private readonly ssoConfigRepository: Repository<SSOConfiguration>,
    private readonly entityManager: EntityManager,
    private readonly eventEmitter: EventEmitter2
  ) {}

  async provisionUserFromSSO(
    ssoResponse: SSOResponse, 
    tenantId: string
  ): Promise<User> {
    return await this.entityManager.transaction(async (transactionManager) => {
      const ssoConfig = await this.getSSOConfig(tenantId)
      const userAttributes = this.mapAttributes(ssoResponse, ssoConfig.attributeMapping)
      
      let user = await transactionManager.findOne(User, {
        where: { 
          externalId: userAttributes.externalId, 
          tenantId 
        },
        relations: ['roles']
      })
      
      if (!user) {
        user = await this.createUser(userAttributes, tenantId, transactionManager)
        this.logger.log(`Created new user: ${user.email} for tenant: ${tenantId}`)
      } else {
        // Update user attributes
        user.email = userAttributes.email
        user.firstName = userAttributes.firstName
        user.lastName = userAttributes.lastName
        user.lastLoginAt = new Date()
        await transactionManager.save(user)
      }
      
      // Update roles based on SSO groups/attributes
      const newRoles = await this.mapSSOGroupsToRoles(
        userAttributes.groups || [], 
        tenantId,
        transactionManager
      )
      
      // Only update if roles have changed
      if (this.rolesHaveChanged(user.roles, newRoles)) {
        // Capture the previous role names before reassigning; otherwise the
        // audit event would report the new roles as the old ones
        const oldRoleNames = user.roles.map(r => r.name)

        user.roles = newRoles
        await transactionManager.save(user)
        
        // Emit event for audit logging
        this.eventEmitter.emit('user.roles.updated', {
          userId: user.id,
          tenantId,
          oldRoles: oldRoleNames,
          newRoles: newRoles.map(r => r.name)
        })
      }
      
      return user
    })
  }
  
  private async mapSSOGroupsToRoles(
    ssoGroups: string[], 
    tenantId: string,
    manager: EntityManager
  ): Promise<Role[]> {
    if (!ssoGroups || ssoGroups.length === 0) {
      // Return default role for tenant
      const defaultRole = await manager.findOne(Role, {
        where: { tenantId, isDefault: true }
      })
      return defaultRole ? [defaultRole] : []
    }

    const roleMappings = await manager.find(RoleMapping, {
      where: { tenantId },
      relations: ['role']
    })
    
    const roles: Role[] = []
    
    for (const group of ssoGroups) {
      const mapping = roleMappings.find(m => 
        m.ssoGroup.toLowerCase() === group.toLowerCase()
      )
      if (mapping && mapping.role) {
        roles.push(mapping.role)
      }
    }
    
    // Ensure user has at least one role
    if (roles.length === 0) {
      const defaultRole = await manager.findOne(Role, {
        where: { tenantId, isDefault: true }
      })
      if (defaultRole) roles.push(defaultRole)
    }
    
    return roles
  }

  private rolesHaveChanged(currentRoles: Role[], newRoles: Role[]): boolean {
    if (currentRoles.length !== newRoles.length) return true
    
    const currentRoleIds = new Set(currentRoles.map(r => r.id))
    const newRoleIds = new Set(newRoles.map(r => r.id))
    
    return !this.setsEqual(currentRoleIds, newRoleIds)
  }

  private setsEqual<T>(set1: Set<T>, set2: Set<T>): boolean {
    return set1.size === set2.size && [...set1].every(x => set2.has(x))
  }
}

Security Considerations

1. Token Management

We implemented secure token handling with proper validation and refresh mechanisms:

import { Injectable, Logger } from '@nestjs/common'
import { JwtService } from '@nestjs/jwt'
import { ConfigService } from '@nestjs/config'
import * as jwt from 'jsonwebtoken'
import * as jwksClient from 'jwks-rsa'

@Injectable()
export class TokenService {
  private readonly logger = new Logger(TokenService.name)
  private readonly jwksClients = new Map<string, jwksClient.JwksClient>()

  constructor(
    private readonly jwtService: JwtService,
    private readonly configService: ConfigService,
    private readonly userService: UserService
  ) {}

  async validateToken(token: string, tenantId: string): Promise<TokenValidationResult> {
    try {
      // Get the signing key for this tenant and the single algorithm it is valid for
      const { key, algorithm } = await this.getSigningKey(tenantId, token)
      
      // Verify signature, issuer, audience, and expiry. Pinning one algorithm per key
      // prevents a token signed with HS256, using the public RSA key as the secret,
      // from being accepted.
      const decoded = jwt.verify(token, key, {
        algorithms: [algorithm],
        issuer: await this.getExpectedIssuer(tenantId),
        audience: this.configService.get<string>('JWT_AUDIENCE')
      }) as any
      
      // Validate tenant context
      if (decoded.tenant !== tenantId) {
        return { valid: false, reason: 'Invalid tenant' }
      }
      
      // Check if user is still active
      const user = await this.userService.getUser(decoded.sub)
      if (!user || !user.isActive) {
        return { valid: false, reason: 'User inactive' }
      }
      
      // Check if user belongs to the tenant
      if (user.tenantId !== tenantId) {
        return { valid: false, reason: 'User not in tenant' }
      }
      
      return { valid: true, user, claims: decoded }
    } catch (error) {
      // jwt.verify already rejects expired tokens, so no manual exp check is needed
      if (error instanceof jwt.TokenExpiredError) {
        return { valid: false, reason: 'Token expired' }
      }
      this.logger.warn(`Token validation failed: ${(error as Error).message}`)
      return { valid: false, reason: 'Invalid token' }
    }
  }

  private async getSigningKey(
    tenantId: string,
    token: string
  ): Promise<{ key: string; algorithm: jwt.Algorithm }> {
    const ssoConfig = await this.getSSOConfig(tenantId)
    
    if (ssoConfig.protocol === 'OIDC') {
      // For OIDC, look up the key named by the token's `kid` header in the JWKS
      // endpoint. The header is unverified at this point; it only selects which
      // published key to verify against.
      const kid = jwt.decode(token, { complete: true })?.header.kid
      if (!kid) {
        throw new Error('Token header has no key ID (kid)')
      }
      return {
        key: await this.getJWKSKey(ssoConfig.jwksUri, tenantId, kid),
        algorithm: 'RS256'
      }
    }

    // For SAML or custom, use the configured key (assumed here to be a shared
    // HMAC secret; pin an asymmetric algorithm instead if it is a public key)
    return { key: this.decryptKey(ssoConfig.signingKey), algorithm: 'HS256' }
  }

  private async getJWKSKey(jwksUri: string, tenantId: string, kid: string): Promise<string> {
    if (!this.jwksClients.has(tenantId)) {
      this.jwksClients.set(tenantId, jwksClient({
        jwksUri,
        cache: true,
        cacheMaxAge: 600000, // 10 minutes
        rateLimit: true,
        jwksRequestsPerMinute: 10
      }))
    }

    const client = this.jwksClients.get(tenantId)!
    
    return new Promise((resolve, reject) => {
      client.getSigningKey(kid, (err, key) => {
        if (err || !key) {
          reject(err ?? new Error(`No signing key found for kid ${kid}`))
        } else {
          resolve(key.getPublicKey())
        }
      })
    })
  }

  async generateToken(user: User, tenantId: string): Promise<string> {
    const payload = {
      sub: user.id,
      email: user.email,
      tenant: tenantId,
      roles: user.roles.map(role => role.name),
      iat: Math.floor(Date.now() / 1000),
      exp: Math.floor(Date.now() / 1000) + (60 * 60 * 8) // 8 hours
    }

    return this.jwtService.sign(payload, {
      secret: this.configService.get<string>('JWT_SECRET'),
      algorithm: 'HS256'
    })
  }
}
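
The refresh side of token handling is not shown above. One common approach is to reissue a token shortly before it expires. The sketch below covers only that decision: `needsRefresh`, the `TokenClaims` shape, and the 15-minute window are assumptions for illustration, not the orchestrator's actual policy.

```typescript
interface TokenClaims {
  sub: string
  exp: number // expiry, in seconds since the Unix epoch
}

const REFRESH_WINDOW_SECONDS = 15 * 60

// True when the token is still valid but will expire within the refresh window.
// An already-expired token is not refreshed; the user must authenticate again.
function needsRefresh(
  claims: TokenClaims,
  nowSeconds: number,
  windowSeconds: number = REFRESH_WINDOW_SECONDS
): boolean {
  const remaining = claims.exp - nowSeconds
  return remaining > 0 && remaining <= windowSeconds
}

const issuedAt = 1_700_000_000
const claims: TokenClaims = { sub: 'user-1', exp: issuedAt + 8 * 60 * 60 } // 8-hour token

console.log(needsRefresh(claims, issuedAt))                    // false: just issued
console.log(needsRefresh(claims, claims.exp - 10 * 60))        // true: 10 minutes left
console.log(needsRefresh(claims, claims.exp + 1))              // false: already expired
```

Passing the current time in as a parameter keeps the function pure, which makes the boundary cases easy to test.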

2. Audit Logging

Comprehensive audit logging was crucial for compliance and security monitoring:

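import { Logger } from '@nestjs/common'
import { Repository } from 'typeorm'

class AuditService {
  private readonly logger = new Logger(AuditService.name)

  constructor(
    private readonly auditRepository: Repository<AuditLog>,
    private readonly siemConfig: { enabled: boolean }
  ) {}

  async logAccess(event: AccessEvent): Promise<void> {
    const auditLog: AuditLog = {
      timestamp: new Date(),
      tenantId: event.tenantId,
      userId: event.userId,
      action: event.action,
      resource: event.resource,
      result: event.result,
      ipAddress: event.ipAddress,
      userAgent: event.userAgent,
      sessionId: event.sessionId
    }
    
    // Store in secure audit database
    await this.auditRepository.save(auditLog)
    
    // Send to SIEM if configured. The record is already stored, so a SIEM
    // outage is logged rather than allowed to fail the caller.
    if (this.siemConfig.enabled) {
      try {
        await this.sendToSIEM(auditLog)
      } catch (error) {
        this.logger.error('Failed to forward audit log to SIEM', error)
      }
    }
  }
}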

Multi-Tenant Isolation

1. Data Isolation

We ensured complete data isolation between tenants at multiple levels:

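import { Repository, ObjectLiteral } from 'typeorm'

class TenantAwareRepository<T extends ObjectLiteral> {
  constructor(private readonly repository: Repository<T>) {}

  async findByTenant(tenantId: string, filters?: any): Promise<T[]> {
    return this.repository.find({
      where: {
        ...filters,
        // tenantId is applied last so a caller-supplied filter can never override it
        tenantId
      }
    })
  }
  
  async save(entity: T & { tenantId: string }): Promise<T> {
    // Ensure tenant ID is always set
    if (!entity.tenantId) {
      throw new Error('Tenant ID is required')
    }
    
    return this.repository.save(entity)
  }
}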

2. Permission Scoping

All permissions were automatically scoped to the tenant context:

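import { Request, Response, NextFunction } from 'express'

class PermissionMiddleware {
  constructor(private readonly rbacService: RBACService) {}

  // Assumes the Express Request type has been extended with `user` and `tenantContext`
  async checkPermission(req: Request, res: Response, next: NextFunction) {
    // req.user is populated by the authentication step; reject unauthenticated requests
    if (!req.user) {
      return res.status(401).json({ error: 'Authentication required' })
    }

    const { tenantId, userId } = req.user
    const { resource, action } = this.extractPermissionInfo(req)
    
    const hasPermission = await this.rbacService.checkPermission(
      userId,
      resource,
      action,
      { tenantId, ipAddress: req.ip, requestContext: req }
    )
    
    if (!hasPermission) {
      return res.status(403).json({ error: 'Insufficient permissions' })
    }
    
    // Add tenant context to request
    req.tenantContext = { tenantId }
    next()
  }
}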
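
The middleware relies on `extractPermissionInfo` to turn a request into a resource and an action, but that helper is not shown. A minimal sketch of one possible convention follows, assuming REST-style routes under an `/api` prefix and a fixed mapping from HTTP methods to actions. The prefix, the action names, and the function signature are all assumptions of this sketch.

```typescript
interface PermissionInfo {
  resource: string
  action: string
}

// Hypothetical mapping from HTTP method to RBAC action.
const METHOD_TO_ACTION: Record<string, string> = {
  GET: 'read',
  HEAD: 'read',
  POST: 'create',
  PUT: 'update',
  PATCH: 'update',
  DELETE: 'delete'
}

// Derive the resource and action for a request, or null if the request
// does not map to a protected resource.
function extractPermissionInfo(method: string, path: string): PermissionInfo | null {
  const action = METHOD_TO_ACTION[method.toUpperCase()]
  if (!action) return null

  // Ignore the query string and empty segments, then drop the leading "api" prefix.
  const pathOnly = path.split('?')[0] ?? ''
  const segments = pathOnly.split('/').filter(segment => segment.length > 0)
  if (segments[0] === 'api') segments.shift()
  if (segments.length === 0) return null

  return { resource: segments.join('/'), action }
}

console.log(extractPermissionInfo('GET', '/api/edges/42?verbose=1'))
// { resource: 'edges/42', action: 'read' }
```

With this convention, a permission whose resource pattern is `edges/*` would cover `edges/42` under glob matching. Callers should treat a `null` result as a denial rather than skipping the check.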

Performance Optimizations

1. Permission Caching

We cached permission decisions in Redis with a five-minute TTL to reduce database queries, with methods to invalidate cached entries per user or per tenant:

import { Injectable } from '@nestjs/common'
import { Redis } from 'ioredis'
import { InjectRedis } from '@liaoliaots/nestjs-redis'

@Injectable()
export class CachedRBACService extends RBACService {
  constructor(
    @InjectRedis() private readonly redis: Redis,
    // ... other dependencies
  ) {
    super(/* parent dependencies */)
  }
  
  async checkPermission(
    userId: string, 
    resource: string, 
    action: string, 
    context?: AccessContext
  ): Promise<boolean> {
    const cacheKey = `rbac:permission:${userId}:${resource}:${action}`
    
    try {
      // Check Redis cache first
      const cached = await this.redis.get(cacheKey)
      if (cached !== null) {
        return cached === 'true'
      }
      
      // If not in cache, check permission
      const result = await super.checkPermission(userId, resource, action, context)
      
      // Cache the result for 5 minutes
      await this.redis.setex(cacheKey, 300, result.toString())
      
      return result
    } catch (error) {
      // If cache fails, fall back to direct check
      this.logger.warn(`Cache error for permission check: ${error.message}`)
      return super.checkPermission(userId, resource, action, context)
    }
  }

  async invalidateUserPermissions(userId: string): Promise<void> {
    try {
      await this.deleteByPattern(`rbac:permission:${userId}:*`)
    } catch (error) {
      this.logger.error(`Failed to invalidate permissions for user ${userId}`, error)
    }
  }

  async invalidateTenantPermissions(tenantId: string): Promise<void> {
    try {
      // Get all users for tenant and invalidate their permissions
      const users = await this.userRepository.find({
        where: { tenantId },
        select: ['id']
      })

      for (const user of users) {
        await this.deleteByPattern(`rbac:permission:${user.id}:*`)
      }
    } catch (error) {
      this.logger.error(`Failed to invalidate permissions for tenant ${tenantId}`, error)
    }
  }

  // SCAN walks the keyspace in small batches. KEYS would block Redis while it
  // examines every key, which is unsafe on a production instance.
  private async deleteByPattern(pattern: string): Promise<void> {
    let cursor = '0'
    do {
      const [nextCursor, keys] = await this.redis.scan(cursor, 'MATCH', pattern, 'COUNT', 100)
      cursor = nextCursor
      if (keys.length > 0) {
        await this.redis.del(...keys)
      }
    } while (cursor !== '0')
  }
}

2. Bulk Operations

For operations involving multiple users or resources, we implemented bulk processing:

class BulkRBACService {
  constructor(private readonly userService: UserService) {}

  async checkMultiplePermissions(
    requests: PermissionRequest[]
  ): Promise<PermissionResult[]> {
    // Group by user so each user is loaded once, however many checks they have
    const userGroups = this.groupByUser(requests)
    
    // Process the groups concurrently. Results follow the grouping order,
    // not the order of the original requests.
    const perUser = await Promise.all(
      Array.from(userGroups).map(async ([userId, userRequests]) => {
        const user = await this.userService.getUser(userId)
        return this.checkUserPermissions(user, userRequests)
      })
    )
    
    return ([] as PermissionResult[]).concat(...perUser)
  }
}
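
The grouping step is what keeps the number of user lookups proportional to the number of distinct users rather than the number of requests. A minimal sketch of `groupByUser` follows, assuming each request carries a `userId`; the `PermissionRequest` shape shown here is an assumption.

```typescript
interface PermissionRequest {
  userId: string
  resource: string
  action: string
}

// Bucket requests by user, preserving the order in which each user first appears
// and the order of that user's requests.
function groupByUser(requests: PermissionRequest[]): Map<string, PermissionRequest[]> {
  const groups = new Map<string, PermissionRequest[]>()
  for (const request of requests) {
    const existing = groups.get(request.userId)
    if (existing) {
      existing.push(request)
    } else {
      groups.set(request.userId, [request])
    }
  }
  return groups
}

const grouped = groupByUser([
  { userId: 'u1', resource: 'edges/1', action: 'read' },
  { userId: 'u2', resource: 'edges/1', action: 'update' },
  { userId: 'u1', resource: 'profiles/7', action: 'delete' }
])

console.log(grouped.size)              // 2
console.log(grouped.get('u1')?.length) // 2
```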

Lessons Learned

1. Start Simple, Scale Gradually

We initially over-engineered the permission system. Starting with basic roles and gradually adding complexity based on actual customer needs proved more effective.

2. Standardize Early

Establishing clear patterns for permission naming and resource identification early saved significant refactoring effort later.

3. Test with Real Data

Mock data doesn't reveal the complexity of real enterprise directory structures. Testing with actual customer data uncovered edge cases we hadn't considered.

4. Monitor Performance

RBAC checks happen on every request. Implementing comprehensive monitoring and caching strategies from the beginning is crucial.

Results

After implementing the RBAC system:

Customer Onboarding: Reduced from weeks to hours for SSO integration
Security Compliance: Achieved SOC 2 Type II attestation
Performance: Sub-10ms permission checks even with complex role hierarchies
Scalability: Successfully supported 500+ enterprise customers with 50,000+ users
Flexibility: Customers could implement their exact organizational structure

Conclusion

Building RBAC with SSO for a multi-tenant application is complex, but the investment pays off in customer satisfaction, security, and scalability. The key is balancing flexibility with performance, and security with usability.

The system we built at VMware became a competitive advantage, enabling us to win enterprise deals that required sophisticated authentication and authorization capabilities. It also provided a solid foundation for future features and compliance requirements.

Interested in discussing authentication and authorization patterns? Connect with me on LinkedIn or email me to share your experiences with enterprise security implementations.