
Azure Traffic Manager (ATM) Architecture for RC System – End-to-End Communication Flow

Overview

This document describes how to architect the RC (Remote Control) system using Azure Traffic Manager (ATM) instead of a Cross-Region Load Balancer (CR-LB). The architecture shows how the Controller and Target connect to Brokers in different regions, with ATM serving as the global DNS-based traffic distributor. It also explains the key trade-offs compared to a CR-LB-based architecture.


1. Architecture Diagram

[Figure: High-level (simple) architecture diagram]

[Figure: Flow-level architecture diagram]

2. Key Components


How this architecture works (flow, simplified)

What is Azure Traffic Manager?

Azure Traffic Manager (ATM) works like a smart phone book for our broker servers. When a client (Target or Controller) wants to connect to a broker, it asks ATM, “Which broker should I connect to?” ATM considers where the request comes from, checks which brokers are healthy, and returns the address of the best available broker.
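
The selection described above can be sketched in a few lines of Python. This is only an illustration of the idea (filter out unhealthy brokers, then prefer the closest one), not ATM's actual implementation. The Endpoint class, the region labels, and the latency figures in LATENCY_MS are hypothetical values invented for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Endpoint:
    name: str      # broker hostname, as used elsewhere in this document
    region: str    # hypothetical region label ("in" or "us")
    healthy: bool  # outcome of the most recent health check


# Hypothetical latencies (ms) from a client location to a broker region.
LATENCY_MS = {
    ("india", "in"): 20,
    ("india", "us"): 220,
    ("usa", "in"): 220,
    ("usa", "us"): 15,
}


def pick_endpoint(client_location: str, endpoints: List[Endpoint]) -> Optional[Endpoint]:
    """Return the healthy endpoint with the lowest assumed latency, or None."""
    healthy = [e for e in endpoints if e.healthy]
    if not healthy:
        return None
    return min(healthy, key=lambda e: LATENCY_MS[(client_location, e.region)])
```

With a healthy broker in India, a client in India is pointed at it; if that broker is marked unhealthy, the same call falls back to the nearest remaining healthy broker.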


Real-World Example: Connecting to Broker B3

Let’s walk through how Target (in India) connects to Broker B3 using Azure Traffic Manager.


Step 1: Target Wants to Connect


Step 2: Target Asks ATM for the Broker’s Address

Think of it like:


Step 3: ATM Gives Target the Best Broker Address

Think of it like:


Step 4: Target Connects Directly to the Broker

Think of it like:


Step 5: Load Balancer Passes the Call to Broker B3

Think of it like:


Step 6: Target is Connected to Broker B3


The Same Process for Controller (USA)

Controller (USA) wants to connect to the same Broker B3:

  1. Controller asks ATM: “What’s the address for broker-in-3.company.com?”
  2. ATM responds: “It’s at 20.192.45.100” (same India Load Balancer)
  3. Controller connects directly to 20.192.45.100 from the USA over the internet
  4. Load Balancer forwards to Broker B3
  5. Controller is now connected to Broker B3

Both Target (India) and Controller (USA) are now connected to the same Broker B3, so the remote session between them can be established.


Key Points (Simplified)

| What | Explanation |
|------|-------------|
| ATM’s job | Acts like a phone directory: tells you which broker to connect to |
| How it works | You ask for a broker’s name, ATM gives you its IP address |
| ATM is NOT in the call | After giving you the address, ATM steps away and you connect directly |
| Smart routing | ATM picks the best broker based on your location and broker health |
| DNS-based | Works using DNS (like how websites work: you type a name, get an IP) |

Why Do We Use ATM?

Benefits:

  • Automatic routing: Users get connected to the nearest healthy broker
  • No special IPs needed: Works with regular Azure Load Balancers
  • Health checks: Automatically avoids broken brokers
  • Global coverage: Works anywhere in the world

Limitations:

  • Not the fastest option: Traffic travels over the public internet to the broker’s regional public IP instead of entering Azure’s private network at the nearest edge
  • Slower failover: If a broker fails, clients keep using the cached DNS answer until it expires, so switching takes time
  • Variable performance: Users might not always connect to the absolute closest broker
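
To see why DNS caching slows failover, consider a minimal resolver cache. The sketch below is an illustration only: TtlDnsCache, the injected lookup function (standing in for a query that ATM answers), and the injected clock are hypothetical names introduced for this example.

```python
from typing import Callable, Dict, Tuple


class TtlDnsCache:
    """Tiny resolver cache: reuses an answer until its TTL expires."""

    def __init__(self,
                 lookup: Callable[[str], Tuple[str, float]],
                 clock: Callable[[], float]) -> None:
        self._lookup = lookup  # returns (ip, ttl_seconds) for a hostname
        self._clock = clock    # injectable clock keeps the example deterministic
        self._cache: Dict[str, Tuple[str, float]] = {}

    def resolve(self, hostname: str) -> str:
        now = self._clock()
        cached = self._cache.get(hostname)
        if cached is not None and now < cached[1]:
            return cached[0]   # still fresh: the authoritative side is not asked again
        ip, ttl = self._lookup(hostname)
        self._cache[hostname] = (ip, now + ttl)
        return ip
```

If the answer for a hostname changes while a cached entry is still fresh, resolve keeps returning the old IP until the TTL runs out. That window is the failover delay described in the limitation above.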


When Should We Use ATM?

Use ATM when:

Use Cross-Region Load Balancer (better option) when:


In short, Azure Traffic Manager is like a smart directory that helps clients find and connect to the best broker server. It works as follows:

  1. The client asks: “Which broker should I use?”
  2. ATM answers: “Use this IP address (the best one for you)”
  3. The client connects directly to that broker
  4. A session is established between Target and Controller

ATM only handles the “finding” part; it does not touch the actual connection.


3. End-to-End Communication Flow (in more depth)

A. Setup

  1. Each Broker DNS name (e.g., broker-us-1.company.com) is configured as an endpoint in Azure Traffic Manager.
  2. Azure Traffic Manager profile contains all broker endpoints pointing to Regional Standard LB Public IPs:
    • broker-in-3.company.com → ATM endpoint → slb-b3-in Public IP
    • broker-in-4.company.com → ATM endpoint → slb-b4-in Public IP
    • broker-us-1.company.com → ATM endpoint → slb-b1-us Public IP
    • broker-us-2.company.com → ATM endpoint → slb-b2-us Public IP
  3. Routing Method: ‘Performance’ or ‘Priority’; ‘Performance’ recommended.
  4. Health probing (TCP/443) checks endpoint status.
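
As a rough illustration of what a TCP/443 health probe checks, the sketch below simply tests whether a TCP connection can be opened within a timeout. It is not ATM's probe implementation; the function name tcp_probe and the 5-second default timeout are assumptions made for the example.

```python
import socket


def tcp_probe(host: str, port: int = 443, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to (host, port) succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False
```

A call such as tcp_probe("broker-us-1.company.com") would report whether the regional load balancer in front of that broker accepts connections on port 443. It says nothing about whether the broker application behind it is working correctly.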

B. Controller/Target Connectivity

Step 1: Controller (C1, USA) needs to initiate a remote session.

Step 2: Target (T1, India) prepares for remote session.

Step 3: Brokers coordinate via mesh and connect Controller and Target as needed.


4. Traffic Flow Clarification

Important: ATM is DNS-Only, Not in Data Path

Step-by-step flow:

1. Target/Controller → DNS Query for "broker-us-1.company.com" → Azure Traffic Manager
2. Azure Traffic Manager → DNS Response: "slb-b1-us Public IP = 20.x.x.x" → Target/Controller
3. Target/Controller → TCP+TLS connection directly to 20.x.x.x (slb-b1-us) → slb-b1-us → Broker B1
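
From the client's point of view, the three steps above are an ordinary DNS lookup followed by a direct TCP+TLS connection. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the real Target/Controller client may differ (for example, by loading a client certificate for mTLS).

```python
import socket
import ssl


def resolve_broker(hostname: str, port: int = 443) -> str:
    """Steps 1-2: DNS lookup; with ATM in front, the answer is the regional SLB IP."""
    infos = socket.getaddrinfo(hostname, port,
                               family=socket.AF_INET, type=socket.SOCK_STREAM)
    return infos[0][4][0]


def connect_to_broker(hostname: str, port: int = 443,
                      timeout: float = 10.0) -> ssl.SSLSocket:
    """Step 3: open TCP+TLS directly to the resolved IP; ATM plays no part here."""
    ip = resolve_broker(hostname, port)
    raw = socket.create_connection((ip, port), timeout=timeout)
    try:
        context = ssl.create_default_context()
        # server_hostname keeps SNI and certificate validation tied to the
        # broker name even though the socket was opened against the IP.
        return context.wrap_socket(raw, server_hostname=hostname)
    except Exception:
        raw.close()
        raise
```

Only resolve_broker involves ATM, and only indirectly through DNS. Once the IP is known, every byte of the session flows between the client and the regional load balancer.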

Key Points:

Flow Summary Table

| Step | Action | Goes Through ATM? |
|------|--------|-------------------|
| 1. DNS Query | Client queries broker hostname | ✅ Yes (DNS only) |
| 2. DNS Response | ATM returns regional SLB IP | ✅ Yes (DNS only) |
| 3. TCP Connection | Client connects to SLB IP | ❌ No (direct to SLB) |
| 4. Traffic Flow | SLB forwards to Broker VM | ❌ No (regional only) |

5. Sequence Diagram of Traffic Manager-based Routing

Controller (C1) or Target (T1)
    |
    |---(DNS query for broker-us-1.company.com)---> Azure Traffic Manager (ATM)
    |<--(ATM returns slb-b1-us public IP: 20.x.x.x)---
    |
    |---(TCP+TLS connect directly to 20.x.x.x)---> Regional Standard LB (slb-b1-us)
    |                                                      |
    |                                                      v
    |                                                 Broker B1 VM
    |
    |<---------[Remote Session Established via Broker Mesh]-----------------

6. Trade-Offs: Azure Traffic Manager vs. Cross-Region Load Balancer (CR-LB)

| Feature | CR-LB (Ideal) | ATM (Fallback/Current) |
|---------|---------------|------------------------|
| Entry Point to Azure | Nearest Azure Edge POP (Anycast IP) | Traffic enters at broker’s home region (regional public IP) |
| Global Single IP | ✅ (Anycast/Global IP per Broker) | ❌ (Returns regional IP per DNS response) |
| Routing Control | L4, in-path; instant failover, Azure backbone routing | DNS-based; depends on DNS cache, latency to DNS resolver |
| Protocol Support | TCP, mTLS, any L4 | TCP, mTLS, any protocol (since ATM is DNS-only) |
| Optimal Latency | Yes (always nearest POP, private backbone) | Not always (may not enter Azure at nearest edge) |
| Session Failover Speed | Fast (in-path health checks, instant reroute) | Slower (DNS TTL, cache delay) |
| DNS Complexity | Single A record per Broker | ATM profile replaces A record, points to regional IPs |
| Client Experience | Consistent, seamless | May experience more variable latency, slow failover |
| Azure Dependency | Requires global public IP inventory | No global IP required |
| Scalability | Native Anycast, easy global expansion | Add regional IPs to ATM profile |
| Data Path | CR-LB is in the data path (L4 pass-through) | ATM is NOT in data path (DNS only) |
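
The "Slower (DNS TTL, cache delay)" entry can be made more concrete with a back-of-the-envelope estimate. The model below is a rough simplification for discussion, not a formula taken from Azure documentation. It assumes an endpoint is marked down after one more failed probe than the tolerated count, and that clients then wait out one full DNS TTL. The parameter values in the usage note are illustrative only.

```python
def estimated_failover_seconds(probe_interval_s: float,
                               tolerated_failures: int,
                               probe_timeout_s: float,
                               dns_ttl_s: float) -> float:
    """Rough worst-case time until clients stop using a failed endpoint."""
    # Time for the monitoring side to decide the endpoint is down.
    detection_s = probe_interval_s * (tolerated_failures + 1) + probe_timeout_s
    # Time for already-cached DNS answers to expire on the client side.
    return detection_s + dns_ttl_s
```

For example, with a 30-second probe interval, 3 tolerated failures, a 10-second probe timeout, and a 60-second TTL, the estimate is 30 × 4 + 10 + 60 = 190 seconds. Shortening the TTL or the probe interval reduces this window at the cost of more DNS queries and probes.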

7. Summary and Guidance