Implementing Role-Based Access Control with SSO in Multi-Tenant Applications
How we built a flexible RBAC system with SSO integration for VMware's SD-WAN orchestrator, supporting multiple identity providers and complex enterprise requirements.
During my time at VMware, one of the most complex and critical projects I worked on was implementing Role-Based Access Control (RBAC) with Single Sign-On (SSO) for our SD-WAN orchestrator. As a multi-tenant application serving enterprise customers worldwide, we needed to support diverse authentication requirements while maintaining security, scalability, and ease of use.
The Challenge
Our SD-WAN orchestrator was rapidly growing from a single-tenant application to a multi-tenant SaaS platform. Each enterprise customer had unique requirements:
- Diverse Identity Providers: Customers used different identity providers (Active Directory, Okta, Azure AD, and others) over different protocols (SAML, OIDC, LDAP)
- Complex Role Hierarchies: Different organizational structures required flexible role definitions
- Tenant Isolation: Complete separation of data and permissions between customers
- Compliance Requirements: SOC 2, GDPR, and industry-specific security standards
- Scale: Supporting thousands of users across hundreds of enterprise customers
Architecture Overview
We designed a flexible RBAC system that could adapt to various enterprise authentication patterns:
// Core RBAC entities
interface User {
id: string
email: string
tenantId: string
roles: Role[]
identityProvider: string
externalId: string
}
interface Role {
id: string
name: string
tenantId: string
permissions: Permission[]
isSystemRole: boolean
}
interface Permission {
resource: string
actions: string[]
conditions?: AccessCondition[]
}
interface Tenant {
id: string
name: string
ssoConfig: SSOConfiguration
customRoles: Role[]
}
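To make the entity shapes concrete, here is a small sketch of a tenant-defined role built from trimmed copies of the interfaces above. The role name, resource names, and the `describeRole` helper are illustrative assumptions, not part of the original system.

```typescript
// Trimmed copies of the entities above; conditions are omitted for brevity.
interface Permission {
  resource: string
  actions: string[]
}

interface Role {
  id: string
  name: string
  tenantId: string
  permissions: Permission[]
  isSystemRole: boolean
}

// Hypothetical helper: flatten a role into "resource:action" strings.
function describeRole(role: Role): string[] {
  const result: string[] = []
  for (const permission of role.permissions) {
    for (const action of permission.actions) {
      result.push(`${permission.resource}:${action}`)
    }
  }
  return result
}

// Hypothetical custom role for one tenant; resource names are made up.
const networkOperator: Role = {
  id: 'role-1',
  name: 'network-operator',
  tenantId: 'tenant-a',
  isSystemRole: false,
  permissions: [
    { resource: 'edges/*', actions: ['read', 'update'] },
    { resource: 'reports', actions: ['read'] }
  ]
}

console.log(describeRole(networkOperator))
```

Keeping permissions as resource patterns plus action lists is what lets a single role cover a whole family of resources.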
SSO Integration Strategy
1. Multi-Protocol Support
We implemented support for multiple authentication protocols to accommodate different customer preferences:
interface SSOConfiguration {
tenantId: string
protocol: 'SAML' | 'OIDC' | 'LDAP'
providerName: string
configuration: SAMLConfig | OIDCConfig | LDAPConfig
attributeMapping: AttributeMapping
isActive: boolean
}
interface AttributeMapping {
email: string
firstName: string
lastName: string
groups?: string
roles?: string
department?: string
}
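The attribute mapping tells the orchestrator which identity-provider attribute holds each user field. The sketch below shows one way such a mapping could be applied. The `RawAttributes` type, the `mapAttributes` body, and the sample attribute names (`mail`, `givenName`, `sn`, `memberOf`) are assumptions for illustration.

```typescript
interface AttributeMapping {
  email: string
  firstName: string
  lastName: string
  groups?: string
}

interface MappedUser {
  email: string
  firstName: string
  lastName: string
  groups: string[]
}

// Raw attributes as an IdP might send them; values can be single or multi-valued.
type RawAttributes = Record<string, string | string[] | undefined>

function firstValue(value: string | string[] | undefined): string | undefined {
  return Array.isArray(value) ? value[0] : value
}

// Hypothetical mapper: looks up each field under the attribute name the tenant configured.
function mapAttributes(raw: RawAttributes, mapping: AttributeMapping): MappedUser {
  const email = firstValue(raw[mapping.email])
  if (!email) {
    throw new Error(`Missing required attribute: ${mapping.email}`)
  }
  const groupsValue = mapping.groups ? raw[mapping.groups] : undefined
  let groups: string[] = []
  if (Array.isArray(groupsValue)) {
    groups = groupsValue
  } else if (groupsValue !== undefined) {
    groups = [groupsValue]
  }
  return {
    email,
    firstName: firstValue(raw[mapping.firstName]) ?? '',
    lastName: firstValue(raw[mapping.lastName]) ?? '',
    groups
  }
}

const mapped = mapAttributes(
  { mail: 'ada@example.com', givenName: 'Ada', sn: 'Lovelace', memberOf: ['netops', 'auditors'] },
  { email: 'mail', firstName: 'givenName', lastName: 'sn', groups: 'memberOf' }
)
console.log(mapped)
```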
2. Dynamic Provider Registration
Customers could configure their own identity providers through our admin interface:
import { Repository } from 'typeorm'
import { Injectable } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import * as crypto from 'crypto'
import { ConfigService } from '@nestjs/config'
@Injectable()
export class SSOConfigurationService {
constructor(
// TypeORM repositories are injected through @InjectRepository in NestJS
@InjectRepository(SSOConfiguration)
private readonly ssoConfigRepository: Repository<SSOConfiguration>,
private readonly configService: ConfigService
) {}
async registerProvider(tenantId: string, config: SSOConfiguration): Promise<void> {
// Validate configuration
await this.validateSSOConfig(config)
// Encrypt sensitive data
const encryptedSecrets = this.encryptSecrets(config.secrets)
// Store configuration securely
const ssoConfig = this.ssoConfigRepository.create({
...config,
tenantId,
createdAt: new Date(),
encryptedSecrets
})
await this.ssoConfigRepository.save(ssoConfig)
// Update runtime configuration
await this.updateRuntimeConfig(tenantId, config)
// Test connection
await this.testSSOConnection(config)
}
private async validateSSOConfig(config: SSOConfiguration): Promise<void> {
switch (config.protocol) {
case 'SAML':
await this.validateSAMLConfig(config.configuration as SAMLConfig)
break
case 'OIDC':
await this.validateOIDCConfig(config.configuration as OIDCConfig)
break
case 'LDAP':
await this.validateLDAPConfig(config.configuration as LDAPConfig)
break
default:
throw new Error(`Unsupported SSO protocol: ${config.protocol}`)
}
}
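private encryptSecrets(secrets: unknown): string {
const algorithm = 'aes-256-gcm'
// Assumption: ENCRYPTION_KEY holds a 32-byte key encoded as 64 hex characters
const key = Buffer.from(this.configService.get<string>('ENCRYPTION_KEY') ?? '', 'hex')
const iv = crypto.randomBytes(12) // 96-bit IV, the recommended size for GCM
// createCipher is deprecated and ignores the IV; createCipheriv actually uses it
const cipher = crypto.createCipheriv(algorithm, key, iv)
let encrypted = cipher.update(JSON.stringify(secrets), 'utf8', 'hex')
encrypted += cipher.final('hex')
// GCM produces an auth tag that must be stored to decrypt and verify later
const authTag = cipher.getAuthTag().toString('hex')
return `${iv.toString('hex')}:${authTag}:${encrypted}`
}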
}
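Stored secrets are only useful if they can be decrypted and verified later, so here is a minimal round-trip sketch with AES-256-GCM using Node's built-in `crypto` module. The `iv:authTag:ciphertext` hex layout and the function names are assumptions for illustration; in practice the key would come from a secrets manager rather than being generated in place.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto'

const ALGORITHM = 'aes-256-gcm'

// The key must be exactly 32 bytes for AES-256.
function encryptSecrets(secrets: unknown, key: Buffer): string {
  const iv = randomBytes(12) // 96-bit IV, the recommended size for GCM
  const cipher = createCipheriv(ALGORITHM, key, iv)
  const ciphertext = Buffer.concat([
    cipher.update(JSON.stringify(secrets), 'utf8'),
    cipher.final()
  ])
  const authTag = cipher.getAuthTag()
  // Assumed storage layout: iv:authTag:ciphertext, all hex-encoded
  return [iv, authTag, ciphertext].map(part => part.toString('hex')).join(':')
}

function decryptSecrets(payload: string, key: Buffer): unknown {
  const [ivHex, tagHex, dataHex] = payload.split(':')
  const decipher = createDecipheriv(ALGORITHM, key, Buffer.from(ivHex, 'hex'))
  // final() throws if the ciphertext or tag was modified
  decipher.setAuthTag(Buffer.from(tagHex, 'hex'))
  const plaintext = Buffer.concat([
    decipher.update(Buffer.from(dataHex, 'hex')),
    decipher.final()
  ])
  return JSON.parse(plaintext.toString('utf8'))
}

const demoKey = randomBytes(32) // stand-in for a key loaded from configuration
const sealed = encryptSecrets({ clientSecret: 's3cret' }, demoKey)
const opened = decryptSecrets(sealed, demoKey) as { clientSecret: string }
```

The authentication tag is the reason to prefer GCM here: a tampered record fails to decrypt instead of yielding corrupted credentials.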
RBAC Implementation
1. Hierarchical Role System
We implemented a flexible role hierarchy that could accommodate different organizational structures:
import { Injectable, Logger } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import { Repository } from 'typeorm'
import { Redis } from 'ioredis'
import * as minimatch from 'minimatch'
@Injectable()
export class RBACService {
private readonly logger = new Logger(RBACService.name)
constructor(
@InjectRepository(User)
private readonly userRepository: Repository<User>,
@InjectRepository(Role)
private readonly roleRepository: Repository<Role>,
private readonly redis: Redis
) {}
async checkPermission(
userId: string,
resource: string,
action: string,
context?: AccessContext
): Promise<boolean> {
try {
// Check cache first
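// Same key prefix as the cache invalidation helpers, so invalidation also clears these entries
const cacheKey = `rbac:permission:${userId}:${resource}:${action}`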
const cached = await this.redis.get(cacheKey)
if (cached !== null) {
return cached === 'true'
}
const user = await this.userRepository.findOne({
where: { id: userId },
relations: ['roles', 'roles.permissions']
})
if (!user || !user.isActive) {
await this.redis.setex(cacheKey, 300, 'false')
return false
}
// Check all user roles
for (const role of user.roles) {
if (await this.roleHasPermission(role, resource, action, context)) {
await this.redis.setex(cacheKey, 300, 'true')
return true
}
}
await this.redis.setex(cacheKey, 300, 'false')
return false
} catch (error) {
this.logger.error(`Error checking permission for user ${userId}`, error)
return false
}
}
private async roleHasPermission(
role: Role,
resource: string,
action: string,
context?: AccessContext
): Promise<boolean> {
for (const permission of role.permissions) {
// Use minimatch for resource pattern matching
if (minimatch(resource, permission.resource) &&
permission.actions.includes(action)) {
// Unconditional permissions grant access immediately
if (!permission.conditions || permission.conditions.length === 0) {
return true
}
// A failed condition only rules out this permission; keep checking the rest
if (await this.evaluateConditions(permission.conditions, context)) {
return true
}
}
}
return false
}
private async evaluateConditions(
conditions: AccessCondition[],
context?: AccessContext
): Promise<boolean> {
// Implement condition evaluation logic
for (const condition of conditions) {
switch (condition.type) {
case 'tenant':
if (context?.tenantId !== condition.value) return false
break
case 'time':
if (!this.isWithinTimeRange(condition.value)) return false
break
case 'ip':
if (!this.isAllowedIP(context?.ipAddress, condition.value)) return false
break
default:
// Fail closed on condition types this service does not recognize
return false
}
}
return true
}
}
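The `ip` condition relies on an `isAllowedIP` helper that is not shown. A minimal sketch of what it might look like follows, assuming the condition value is a list of IPv4 addresses or CIDR ranges; IPv6 is deliberately left out, and a missing address is treated as a denial.

```typescript
// Convert a dotted-quad IPv4 address to an unsigned 32-bit number, or null if malformed.
function ipv4ToInt(ip: string): number | null {
  const parts = ip.split('.')
  if (parts.length !== 4) return null
  let result = 0
  for (const part of parts) {
    if (!/^\d{1,3}$/.test(part)) return null
    const octet = Number(part)
    if (octet > 255) return null
    result = result * 256 + octet
  }
  return result
}

// Hypothetical stand-in for isAllowedIP: `allowed` holds IPv4 addresses or CIDR ranges.
function isAllowedIP(ipAddress: string | undefined, allowed: string[]): boolean {
  if (!ipAddress) return false // no address in the access context: deny
  const ip = ipv4ToInt(ipAddress)
  if (ip === null) return false
  return allowed.some(entry => {
    const [base, bitsText = '32'] = entry.split('/')
    const baseInt = ipv4ToInt(base)
    if (baseInt === null || !/^\d{1,2}$/.test(bitsText)) return false
    const bits = Number(bitsText)
    if (bits > 32) return false
    // Two addresses share a prefix when they fall in the same block of 2^(32 - bits) addresses
    const blockSize = 2 ** (32 - bits)
    return Math.floor(ip / blockSize) === Math.floor(baseInt / blockSize)
  })
}

console.log(isAllowedIP('10.1.2.3', ['10.1.0.0/16']))
```

Plain arithmetic is used instead of bitwise operators because JavaScript bitwise operations work on signed 32-bit values, which gets awkward for addresses above 127.255.255.255.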
2. Dynamic Role Assignment
We supported both static role assignment and dynamic role mapping from SSO attributes:
import { Injectable, Logger } from '@nestjs/common'
import { InjectRepository } from '@nestjs/typeorm'
import { Repository, EntityManager } from 'typeorm'
import { EventEmitter2 } from '@nestjs/event-emitter'
@Injectable()
export class UserProvisioningService {
private readonly logger = new Logger(UserProvisioningService.name)
constructor(
@InjectRepository(User)
private readonly userRepository: Repository<User>,
@InjectRepository(Role)
private readonly roleRepository: Repository<Role>,
@InjectRepository(SSOConfiguration)
private readonly ssoConfigRepository: Repository<SSOConfiguration>,
private readonly entityManager: EntityManager,
private readonly eventEmitter: EventEmitter2
) {}
async provisionUserFromSSO(
ssoResponse: SSOResponse,
tenantId: string
): Promise<User> {
return await this.entityManager.transaction(async (transactionManager) => {
const ssoConfig = await this.getSSOConfig(tenantId)
const userAttributes = this.mapAttributes(ssoResponse, ssoConfig.attributeMapping)
let user = await transactionManager.findOne(User, {
where: {
externalId: userAttributes.externalId,
tenantId
},
relations: ['roles']
})
if (!user) {
user = await this.createUser(userAttributes, tenantId, transactionManager)
this.logger.log(`Created new user: ${user.email} for tenant: ${tenantId}`)
} else {
// Update user attributes
user.email = userAttributes.email
user.firstName = userAttributes.firstName
user.lastName = userAttributes.lastName
user.lastLoginAt = new Date()
await transactionManager.save(user)
}
// Update roles based on SSO groups/attributes
const newRoles = await this.mapSSOGroupsToRoles(
userAttributes.groups || [],
tenantId,
transactionManager
)
// Only update if roles have changed
if (this.rolesHaveChanged(user.roles, newRoles)) {
// Capture the previous role names before they are overwritten
const oldRoleNames = user.roles.map(r => r.name)
user.roles = newRoles
await transactionManager.save(user)
// Emit event for audit logging
this.eventEmitter.emit('user.roles.updated', {
userId: user.id,
tenantId,
oldRoles: oldRoleNames,
newRoles: newRoles.map(r => r.name)
})
}
return user
})
}
private async mapSSOGroupsToRoles(
ssoGroups: string[],
tenantId: string,
manager: EntityManager
): Promise<Role[]> {
if (!ssoGroups || ssoGroups.length === 0) {
// Return default role for tenant
const defaultRole = await manager.findOne(Role, {
where: { tenantId, isDefault: true }
})
return defaultRole ? [defaultRole] : []
}
const roleMappings = await manager.find(RoleMapping, {
where: { tenantId },
relations: ['role']
})
const roles: Role[] = []
for (const group of ssoGroups) {
const mapping = roleMappings.find(m =>
m.ssoGroup.toLowerCase() === group.toLowerCase()
)
if (mapping && mapping.role) {
roles.push(mapping.role)
}
}
// Ensure user has at least one role
if (roles.length === 0) {
const defaultRole = await manager.findOne(Role, {
where: { tenantId, isDefault: true }
})
if (defaultRole) roles.push(defaultRole)
}
return roles
}
private rolesHaveChanged(currentRoles: Role[], newRoles: Role[]): boolean {
if (currentRoles.length !== newRoles.length) return true
const currentRoleIds = new Set(currentRoles.map(r => r.id))
const newRoleIds = new Set(newRoles.map(r => r.id))
return !this.setsEqual(currentRoleIds, newRoleIds)
}
private setsEqual<T>(set1: Set<T>, set2: Set<T>): boolean {
return set1.size === set2.size && [...set1].every(x => set2.has(x))
}
}
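The group-to-role lookup is easier to reason about, and to unit test, as a pure function without the database. The sketch below is one possible version; `RoleRef`, `GroupMapping`, and `resolveRoles` are hypothetical names. Unlike the loop above, it also drops duplicates when two groups map to the same role.

```typescript
interface RoleRef {
  id: string
  name: string
}

interface GroupMapping {
  ssoGroup: string
  role: RoleRef
}

// Hypothetical pure version of the group-to-role lookup.
function resolveRoles(
  groups: string[],
  mappings: GroupMapping[],
  defaultRole?: RoleRef
): RoleRef[] {
  // Index mappings by lower-cased group name; the first mapping for a group wins
  const byGroup = new Map<string, RoleRef>()
  for (const mapping of mappings) {
    const key = mapping.ssoGroup.toLowerCase()
    if (!byGroup.has(key)) byGroup.set(key, mapping.role)
  }
  // Key the result by role id so a role reached through two groups appears once
  const resolved = new Map<string, RoleRef>()
  for (const group of groups) {
    const role = byGroup.get(group.toLowerCase())
    if (role) resolved.set(role.id, role)
  }
  // Fall back to the tenant default so the user always has at least one role
  if (resolved.size === 0 && defaultRole) return [defaultRole]
  return Array.from(resolved.values())
}

const adminRole: RoleRef = { id: 'r-admin', name: 'admin' }
const viewerRole: RoleRef = { id: 'r-viewer', name: 'viewer' }
const groupMappings: GroupMapping[] = [
  { ssoGroup: 'NetOps', role: adminRole },
  { ssoGroup: 'netops-leads', role: adminRole }
]

console.log(resolveRoles(['netops', 'NETOPS-LEADS'], groupMappings, viewerRole))
```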
Security Considerations
1. Token Management
We implemented secure token handling with proper validation and refresh mechanisms:
import { Injectable, Logger } from '@nestjs/common'
import { JwtService } from '@nestjs/jwt'
import { ConfigService } from '@nestjs/config'
import * as jwt from 'jsonwebtoken'
import * as jwksClient from 'jwks-rsa'
@Injectable()
export class TokenService {
private readonly logger = new Logger(TokenService.name)
private readonly jwksClients = new Map<string, jwksClient.JwksClient>()
constructor(
private readonly jwtService: JwtService,
private readonly configService: ConfigService,
private readonly userService: UserService
) {}
async validateToken(token: string, tenantId: string): Promise<TokenValidationResult> {
try {
// Read the unverified header only to learn which key signed the token
const header = jwt.decode(token, { complete: true })?.header
// Get the signing key and the single algorithm expected for this tenant
const { key, algorithm } = await this.getSigningKey(tenantId, header?.kid)
// Verify signature, expiry, issuer and audience.
// Pinning one algorithm per tenant avoids RS256/HS256 confusion attacks.
const decoded = jwt.verify(token, key, {
algorithms: [algorithm],
issuer: await this.getExpectedIssuer(tenantId),
audience: this.configService.get<string>('JWT_AUDIENCE')
}) as any
// jwt.verify has already rejected expired tokens, so no manual exp check is needed
// Validate tenant context
if (decoded.tenant !== tenantId) {
return { valid: false, reason: 'Invalid tenant' }
}
// Check if user is still active
const user = await this.userService.getUser(decoded.sub)
if (!user || !user.isActive) {
return { valid: false, reason: 'User inactive' }
}
// Check if user belongs to the tenant
if (user.tenantId !== tenantId) {
return { valid: false, reason: 'User not in tenant' }
}
return { valid: true, user, claims: decoded }
} catch (error) {
this.logger.warn(`Token validation failed: ${error.message}`)
return { valid: false, reason: 'Invalid token' }
}
}
private async getSigningKey(
tenantId: string,
kid?: string
): Promise<{ key: string; algorithm: jwt.Algorithm }> {
const ssoConfig = await this.getSSOConfig(tenantId)
if (ssoConfig.protocol === 'OIDC') {
// For OIDC, get key from JWKS endpoint
return { key: await this.getJWKSKey(ssoConfig.jwksUri, tenantId, kid), algorithm: 'RS256' }
} else {
// For SAML or custom, use configured key.
// Assumption: the configured key is a shared secret used with HS256.
return { key: this.decryptKey(ssoConfig.signingKey), algorithm: 'HS256' }
}
}
private async getJWKSKey(jwksUri: string, tenantId: string, kid?: string): Promise<string> {
if (!kid) {
throw new Error('Token header does not contain a key ID (kid)')
}
if (!this.jwksClients.has(tenantId)) {
this.jwksClients.set(tenantId, jwksClient({
jwksUri,
cache: true,
cacheMaxAge: 600000, // 10 minutes
rateLimit: true,
jwksRequestsPerMinute: 10
}))
}
const client = this.jwksClients.get(tenantId)!
return new Promise((resolve, reject) => {
// Look up the key named in the token header instead of a hard-coded ID
client.getSigningKey(kid, (err, key) => {
if (err) {
reject(err)
} else {
resolve(key.getPublicKey())
}
})
})
}
async generateToken(user: User, tenantId: string): Promise<string> {
const payload = {
sub: user.id,
email: user.email,
tenant: tenantId,
roles: user.roles.map(role => role.name),
iat: Math.floor(Date.now() / 1000),
exp: Math.floor(Date.now() / 1000) + (60 * 60 * 8) // 8 hours
}
return this.jwtService.sign(payload, {
secret: this.configService.get<string>('JWT_SECRET'),
algorithm: 'HS256'
})
}
}
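A JWKS endpoint usually publishes several keys, and a token's header names the one that signed it through the `kid` field. The sketch below reads that header without any JWT library, purely to show where `kid` comes from. `readTokenHeader` is a hypothetical helper, and decoding is not verification: nothing read this way should be trusted until the signature has been checked.

```typescript
interface TokenHeader {
  alg?: string
  kid?: string
}

// Decode (NOT verify) the header segment of a compact JWT.
function readTokenHeader(token: string): TokenHeader | null {
  const segments = token.split('.')
  if (segments.length !== 3) return null
  try {
    // JWT segments are base64url-encoded JSON
    const json = Buffer.from(segments[0], 'base64url').toString('utf8')
    const parsed: unknown = JSON.parse(json)
    if (typeof parsed !== 'object' || parsed === null) return null
    return parsed as TokenHeader
  } catch {
    return null
  }
}

// Build a fake token for the demo; the payload and signature are placeholders.
const demoHeader = Buffer.from(JSON.stringify({ alg: 'RS256', kid: 'key-2024' })).toString('base64url')
const demoToken = `${demoHeader}.e30.placeholder-signature`

console.log(readTokenHeader(demoToken)?.kid)
```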
2. Audit Logging
Comprehensive audit logging was crucial for compliance and security monitoring:
class AuditService {
async logAccess(event: AccessEvent): Promise<void> {
const auditLog: AuditLog = {
timestamp: new Date(),
tenantId: event.tenantId,
userId: event.userId,
action: event.action,
resource: event.resource,
result: event.result,
ipAddress: event.ipAddress,
userAgent: event.userAgent,
sessionId: event.sessionId
}
// Store in secure audit database
await this.auditRepository.save(auditLog)
// Send to SIEM if configured
if (this.siemConfig.enabled) {
await this.sendToSIEM(auditLog)
}
}
}
Multi-Tenant Isolation
1. Data Isolation
We ensured complete data isolation between tenants at multiple levels:
class TenantAwareRepository<T> {
async findByTenant(tenantId: string, filters?: any): Promise<T[]> {
return this.repository.find({
where: {
// Spread filters first so a caller-supplied tenantId cannot override the tenant scope
...filters,
tenantId
}
})
}
async save(entity: T & { tenantId: string }): Promise<T> {
// Ensure tenant ID is always set
if (!entity.tenantId) {
throw new Error('Tenant ID is required')
}
return this.repository.save(entity)
}
}
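To illustrate the scoping rule without a database, here is an in-memory sketch of a tenant-aware repository. The class and the `Device` type are hypothetical; the point is that the tenant ID is applied after any caller-supplied filters, so a filter can never widen a query to another tenant.

```typescript
interface TenantOwned {
  tenantId: string
}

// In-memory stand-in for a database-backed repository (illustration only).
class InMemoryTenantRepository<T extends TenantOwned> {
  private readonly rows: T[] = []

  save(entity: T): T {
    if (!entity.tenantId) {
      throw new Error('Tenant ID is required')
    }
    this.rows.push(entity)
    return entity
  }

  findByTenant(tenantId: string, filters: Partial<T> = {}): T[] {
    // tenantId is applied last, overriding any tenantId smuggled in through filters
    const where = { ...filters, tenantId } as Partial<T>
    const keys = Object.keys(where) as (keyof T)[]
    return this.rows.filter(row => keys.every(key => row[key] === where[key]))
  }
}

interface Device extends TenantOwned {
  name: string
}

const devices = new InMemoryTenantRepository<Device>()
devices.save({ tenantId: 'tenant-a', name: 'edge-1' })
devices.save({ tenantId: 'tenant-b', name: 'edge-1' })

// The filter tries to reach tenant-b, but the query stays scoped to tenant-a.
const scoped = devices.findByTenant('tenant-a', { name: 'edge-1', tenantId: 'tenant-b' })
console.log(scoped)
```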
2. Permission Scoping
All permissions were automatically scoped to the tenant context:
class PermissionMiddleware {
async checkPermission(req: Request, res: Response, next: NextFunction) {
const { tenantId, userId } = req.user
const { resource, action } = this.extractPermissionInfo(req)
const hasPermission = await this.rbacService.checkPermission(
userId,
resource,
action,
{ tenantId, requestContext: req }
)
if (!hasPermission) {
return res.status(403).json({ error: 'Insufficient permissions' })
}
// Add tenant context to request
req.tenantContext = { tenantId }
next()
}
}
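The middleware depends on `extractPermissionInfo`, which is not shown. One plausible sketch derives the resource from the URL path and the action from the HTTP method. The `/api/v1/` prefix and the method-to-action table are assumptions, not the orchestrator's actual conventions.

```typescript
interface PermissionInfo {
  resource: string
  action: string
}

// Assumed convention for turning HTTP methods into RBAC actions.
const ACTION_BY_METHOD: Record<string, string> = {
  GET: 'read',
  POST: 'create',
  PUT: 'update',
  PATCH: 'update',
  DELETE: 'delete'
}

// Hypothetical extractor: returns null when the request cannot be mapped,
// which callers should treat as "deny".
function extractPermissionInfo(
  method: string,
  path: string,
  apiPrefix = '/api/v1/'
): PermissionInfo | null {
  const action = ACTION_BY_METHOD[method.toUpperCase()]
  if (!action || !path.startsWith(apiPrefix)) return null
  // Drop the prefix, the query string, and any trailing slashes
  const resource = path.slice(apiPrefix.length).split('?')[0].replace(/\/+$/, '')
  return resource ? { resource, action } : null
}

console.log(extractPermissionInfo('GET', '/api/v1/edges/42?verbose=1'))
```

Returning `null` for unmapped requests keeps the default closed: a route nobody classified is rejected rather than silently allowed.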
Performance Optimizations
1. Permission Caching
We cached permission decisions in Redis with a five-minute TTL to reduce database queries:
import { Injectable } from '@nestjs/common'
import { Redis } from 'ioredis'
import { InjectRedis } from '@liaoliaots/nestjs-redis'
@Injectable()
export class CachedRBACService extends RBACService {
constructor(
@InjectRedis() private readonly redis: Redis,
// ... other dependencies
) {
super(/* parent dependencies */)
}
async checkPermission(
userId: string,
resource: string,
action: string,
context?: AccessContext
): Promise<boolean> {
const cacheKey = `rbac:permission:${userId}:${resource}:${action}`
try {
// Check Redis cache first
const cached = await this.redis.get(cacheKey)
if (cached !== null) {
return cached === 'true'
}
// If not in cache, check permission
const result = await super.checkPermission(userId, resource, action, context)
// Cache the result for 5 minutes
await this.redis.setex(cacheKey, 300, result.toString())
return result
} catch (error) {
// If cache fails, fall back to direct check
this.logger.warn(`Cache error for permission check: ${error.message}`)
return super.checkPermission(userId, resource, action, context)
}
}
async invalidateUserPermissions(userId: string): Promise<void> {
try {
// SCAN walks the keyspace incrementally; KEYS would block Redis while it visits every key
const stream = this.redis.scanStream({ match: `rbac:permission:${userId}:*`, count: 100 })
for await (const keys of stream) {
if (keys.length > 0) {
await this.redis.del(...keys)
}
}
} catch (error) {
this.logger.error(`Failed to invalidate permissions for user ${userId}`, error)
}
}
async invalidateTenantPermissions(tenantId: string): Promise<void> {
try {
// Get all users for tenant and invalidate their permissions
const users = await this.userRepository.find({
where: { tenantId },
select: ['id']
})
for (const user of users) {
await this.invalidateUserPermissions(user.id)
}
} catch (error) {
this.logger.error(`Failed to invalidate permissions for tenant ${tenantId}`, error)
}
}
}
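Pattern-based invalidation gets expensive as the keyspace grows. An alternative is to remember which keys belong to each user so they can be removed directly. The in-memory sketch below shows the idea with a TTL and a per-user key index; the class name, the tenant segment in the key, and the injectable clock are assumptions made so the behavior can be tested.

```typescript
interface CacheEntry {
  allowed: boolean
  expiresAt: number
}

// In-memory illustration of a permission cache with TTL and per-user invalidation.
class PermissionCache {
  private readonly entries = new Map<string, CacheEntry>()
  private readonly keysByUser = new Map<string, Set<string>>()
  private readonly ttlMs: number
  private readonly now: () => number

  constructor(ttlMs: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs
    this.now = now
  }

  // Including the tenant in the key keeps decisions from leaking across tenants
  private key(tenantId: string, userId: string, resource: string, action: string): string {
    return `rbac:permission:${tenantId}:${userId}:${resource}:${action}`
  }

  get(tenantId: string, userId: string, resource: string, action: string): boolean | undefined {
    const key = this.key(tenantId, userId, resource, action)
    const entry = this.entries.get(key)
    if (!entry) return undefined
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key) // expired: treat as a miss
      return undefined
    }
    return entry.allowed
  }

  set(tenantId: string, userId: string, resource: string, action: string, allowed: boolean): void {
    const key = this.key(tenantId, userId, resource, action)
    this.entries.set(key, { allowed, expiresAt: this.now() + this.ttlMs })
    let keys = this.keysByUser.get(userId)
    if (!keys) {
      keys = new Set<string>()
      this.keysByUser.set(userId, keys)
    }
    keys.add(key)
  }

  // Remove every cached decision for one user, for example after a role change
  invalidateUser(userId: string): void {
    const keys = this.keysByUser.get(userId)
    if (!keys) return
    keys.forEach(key => this.entries.delete(key))
    this.keysByUser.delete(userId)
  }
}

let fakeTime = 0
const cache = new PermissionCache(300_000, () => fakeTime)
cache.set('tenant-a', 'user-1', 'edges/42', 'read', true)
```

With Redis, the same idea could be expressed by storing each user's cache keys in a set next to the cached values, so invalidation never has to scan.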
2. Bulk Operations
For operations involving multiple users or resources, we implemented bulk processing:
class BulkRBACService {
async checkMultiplePermissions(
requests: PermissionRequest[]
): Promise<PermissionResult[]> {
// Group by user to minimize database queries
const userGroups = this.groupByUser(requests)
const results: PermissionResult[] = []
for (const [userId, userRequests] of userGroups) {
const user = await this.userService.getUser(userId)
const userResults = await this.checkUserPermissions(user, userRequests)
results.push(...userResults)
}
return results
}
}
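The bulk check relies on a `groupByUser` helper that is not shown. A minimal sketch, assuming each `PermissionRequest` carries a `userId`, a `resource`, and an `action`:

```typescript
interface PermissionRequest {
  userId: string
  resource: string
  action: string
}

// Group requests by user so each user's roles are loaded once.
// A Map preserves insertion order, so users come out in first-seen order.
function groupByUser(requests: PermissionRequest[]): Map<string, PermissionRequest[]> {
  const groups = new Map<string, PermissionRequest[]>()
  for (const request of requests) {
    const existing = groups.get(request.userId)
    if (existing) {
      existing.push(request)
    } else {
      groups.set(request.userId, [request])
    }
  }
  return groups
}

const grouped = groupByUser([
  { userId: 'u1', resource: 'edges/1', action: 'read' },
  { userId: 'u2', resource: 'edges/1', action: 'update' },
  { userId: 'u1', resource: 'reports', action: 'read' }
])
console.log(grouped.size)
```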
Lessons Learned
1. Start Simple, Scale Gradually
We initially over-engineered the permission system. Starting with basic roles and gradually adding complexity based on actual customer needs proved more effective.
2. Standardize Early
Establishing clear patterns for permission naming and resource identification early saved significant refactoring effort later.
3. Test with Real Data
Mock data doesn't reveal the complexity of real enterprise directory structures. Testing with actual customer data uncovered edge cases we hadn't considered.
4. Monitor Performance
RBAC checks happen on every request. Implementing comprehensive monitoring and caching strategies from the beginning is crucial.
Results
After implementing the RBAC system:
- Customer Onboarding: Reduced from weeks to hours for SSO integration
- Security Compliance: Achieved SOC 2 Type II attestation
- Performance: Sub-10ms permission checks even with complex role hierarchies
- Scalability: Successfully supported 500+ enterprise customers with 50,000+ users
- Flexibility: Customers could implement their exact organizational structure
Conclusion
Building RBAC with SSO for a multi-tenant application is complex, but the investment pays off in customer satisfaction, security, and scalability. The key is balancing flexibility with performance, and security with usability.
The system we built at VMware became a competitive advantage, enabling us to win enterprise deals that required sophisticated authentication and authorization capabilities. It also provided a solid foundation for future features and compliance requirements.
Interested in discussing authentication and authorization patterns? Connect with me on LinkedIn or email me to share your experiences with enterprise security implementations.