Global Server Load Balancing
Overview
Global Server Load Balancing (GSLB) improves the performance and resiliency of your applications by distributing traffic to the nearest healthy point of presence, as measured by latency from the perspective of the connecting client. Every application you deliver with ngrok is automatically accelerated with our always-on, zero-configuration GSLB.
Conceptually, a GSLB is simple. It accelerates traffic by adjusting DNS resolution so that when clients connect to your endpoints on the internet, they connect to the nearest point of presence (PoP). It adds resiliency by using that same DNS resolution process to steer traffic away from failing PoPs.
vs. Traditional GSLBs
ngrok operates differently from a traditional GSLB to give you some important benefits.
- In a traditional GSLB deployment, you are responsible for running geographically redundant copies of your application and for configuring networking technologies to distribute and fail over traffic among those deployments. By contrast, ngrok runs a global delivery network of PoPs for you. Even if you only deploy your upstream service to a single geography, your apps still benefit from Connection Acceleration and Module Acceleration.
- Unlike a traditional GSLB deployment, when you deploy your application services to a new geographic location, ngrok automatically routes nearby traffic to those services without any configuration. Traditional GSLB deployments require many network configuration updates to bring new locations online, such as IP provisioning and DNS record changes. With ngrok, there is zero configuration: just start your new application services with their ngrok agents and you're done.
- Unlike a traditional reverse proxy, ngrok forwards traffic to your upstream services over secure connections established by the ngrok agent, Agent SDKs, or Kubernetes Controller. Which PoPs do those agents connect to? ngrok applies GSLB principles to those connections as well, ensuring that your agents connect to the geographically closest PoPs.
Endpoints
ngrok's GSLB improves the performance and resiliency of connections to your ngrok endpoints (e.g. your-app.ngrok.dev).
Connection Acceleration
When a client (like a web browser) connects to your ngrok endpoint, it is routed to the closest point of presence: DNS resolution of your endpoint's address returns the IP addresses of ngrok's closest point of presence, as determined by the shortest latency from the resolving DNS server to our points of presence. This ensures that clients are routed to the fastest point of presence even as internet routing conditions change.
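You can observe this routing yourself: resolving your endpoint's hostname from different networks returns the addresses of different PoPs. A minimal sketch in Go, using your-app.ngrok.dev as a stand-in for your own endpoint's domain:

import (
	"fmt"
	"net"
)

func printEndpointIPs() error {
	// Resolve the endpoint's hostname. The answer contains the addresses of
	// the ngrok point of presence closest to the resolver performing the lookup.
	ips, err := net.LookupIP("your-app.ngrok.dev")
	if err != nil {
		return err
	}
	for _, ip := range ips {
		fmt.Println(ip)
	}
	return nil
}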
Connecting to the closest point of presence accelerates your traffic by reducing the time needed to initialize TCP and TLS connections. Connection setup requires network round-trips; ngrok accelerates connections by routing clients to the closest point of presence, which reduces the round-trip times (RTTs) between the client and your endpoint.
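As a rough illustration, you can time the TCP and TLS handshakes to your endpoint; the closer the PoP, the lower this setup cost. A sketch in Go, again with your-app.ngrok.dev standing in for your own endpoint:

import (
	"crypto/tls"
	"fmt"
	"time"
)

func timeHandshake() error {
	start := time.Now()
	// tls.Dial performs the TCP connection and TLS handshake; both require
	// round-trips whose latency depends on the distance to the PoP.
	conn, err := tls.Dial("tcp", "your-app.ngrok.dev:443", nil)
	if err != nil {
		return err
	}
	defer conn.Close()
	fmt.Printf("TCP+TLS setup took %v\n", time.Since(start))
	return nil
}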
Module Acceleration
In addition to accelerating connection initialization, ngrok also accelerates the execution of Modules that you configure on your endpoints. Modules execute at the geographically closest point of presence, where the request is first received by the ngrok edge. This means that any behaviors you define in your modules are automatically accelerated everywhere your clients connect from.
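For instance, any modules you enable on an endpoint run at the edge rather than on your upstream. A minimal sketch with the Go SDK, assuming the compression and circuit breaker options in golang.ngrok.com/ngrok/config:

import (
	"context"
	"net"

	"golang.ngrok.com/ngrok"
	"golang.ngrok.com/ngrok/config"
)

func moduleListener(ctx context.Context) (net.Listener, error) {
	return ngrok.Listen(ctx,
		config.HTTPEndpoint(
			// Both behaviors execute at the PoP that first receives the
			// request, before traffic ever reaches your upstream.
			config.WithCompression(),
			config.WithCircuitBreaker(0.5),
		),
		ngrok.WithAuthtokenFromEnv(),
	)
}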
Geo-Aware Load Balancing
When you are using a Tunnel Group Backend with Edges, ngrok can automatically distribute traffic to the closest upstream if you run geographically distributed copies of your upstream application service.
First, consider a case without geo-aware load-balancing. Let's say you have instances of your application running in Japan. When someone in Belgium makes a request to your app's endpoint, that request will first be routed to the closest ngrok point of presence in Europe, like Frankfurt. From Frankfurt, it will then be routed through ngrok's internal network and finally to your upstream services in Japan.
Then let's say you start additional copies of your application service in France. With zero changes in configuration, those same requests from Belgium will still be routed to ngrok's point of presence in Frankfurt, but instead of routing to your upstream services in Japan, they will instead be routed to your new upstreams in France. That's Geo-Aware Load Balancing.
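In practice, this might look like the same labeled-tunnel program deployed on hosts in both Japan and France, with your Edge's tunnel group backend matching the label. A sketch using the Go SDK's labeled-tunnel helpers, where edghts_example is a placeholder for your Edge's ID:

import (
	"context"
	"net"

	"golang.ngrok.com/ngrok"
	"golang.ngrok.com/ngrok/config"
)

// Run this same program on your hosts in Japan and in France; ngrok routes
// each request to the copy closest to the PoP that received it.
func labeledListener(ctx context.Context) (net.Listener, error) {
	return ngrok.Listen(ctx,
		config.LabeledTunnel(
			config.WithLabel("edge", "edghts_example"), // placeholder edge ID
		),
		ngrok.WithAuthtokenFromEnv(),
	)
}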
Losing an Upstream
Consider the example in Geo-Aware Load Balancing where you are running upstream application services in both Japan and France. Let's then assume that all of your upstream services in Japan fail. Requests that were being routed to your Japanese upstream services will instead be routed to your remaining upstream services in France. Requests that were already being routed to your upstreams in France will be unaffected.
Losing a Point of Presence
In Losing an Upstream, we considered the case where your upstream services fail. But what if an ngrok point of presence fails?
If an ngrok point of presence fails, health checks automatically detect that failure and update our DNS resolution such that clients will attempt to connect to your endpoints via the next-closest healthy point of presence.
Disabling GSLB for an Endpoint
There is no way to disable GSLB for HTTPS and TLS endpoints. All endpoints you create in ngrok are available on ngrok's global delivery network, and inbound requests and connections are routed to the geographically closest point of presence.
TCP Endpoints
TCP endpoints do not support GSLB. Unlike domains, TCP addresses are provisioned for a specific point of presence: when you create a TCP address, you must select the point of presence where it will accept traffic. Traffic to a TCP address always enters ngrok's network at that point of presence rather than being globally routed.
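For example, with the Go SDK a TCP endpoint binds to a reserved address that is tied to a specific point of presence. A sketch, where 1.tcp.eu.ngrok.io:20222 is a hypothetical reserved address provisioned in the EU PoP:

import (
	"context"
	"net"

	"golang.ngrok.com/ngrok"
	"golang.ngrok.com/ngrok/config"
)

func tcpListener(ctx context.Context) (net.Listener, error) {
	return ngrok.Listen(ctx,
		config.TCPEndpoint(
			// Traffic to this address always enters ngrok's network at the
			// point of presence where the address was provisioned.
			config.WithRemoteAddr("1.tcp.eu.ngrok.io:20222"),
		),
		ngrok.WithAuthtokenFromEnv(),
	)
}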
Agents
ngrok's GSLB automatically improves the performance and resiliency of your ngrok agents, agent SDKs and the Kubernetes Controller. Agents automatically connect to ngrok's geographically-closest healthy point of presence.
Acceleration
Similar to endpoints, when an ngrok agent connects, DNS resolution returns the IPs of the closest ngrok point of presence. Connecting to the closest ngrok point of presence reduces agent initialization time and also reduces the latency of connections that are routed to that agent.
Losing a Point of Presence
If an ngrok point of presence fails, health checks automatically detect that failure and update our DNS resolution such that agents will attempt to connect to the next-closest healthy point of presence.
The ngrok agent connects to a single region at a time. If you want the ngrok agent to simultaneously connect to multiple regions, you must instead run multiple ngrok agents and explicitly disable agent GSLB.
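With the Go SDK, for example, each ngrok.Listen call establishes its own agent session, so you could pin separate sessions to separate regions. A sketch with illustrative region codes:

import (
	"context"
	"net"

	"golang.ngrok.com/ngrok"
	"golang.ngrok.com/ngrok/config"
)

func listenTwoRegions(ctx context.Context) (net.Listener, net.Listener, error) {
	// One agent session pinned to Europe, another to Asia/Pacific.
	eu, err := ngrok.Listen(ctx,
		config.HTTPEndpoint(),
		ngrok.WithRegion("eu"),
		ngrok.WithAuthtokenFromEnv(),
	)
	if err != nil {
		return nil, nil, err
	}
	ap, err := ngrok.Listen(ctx,
		config.HTTPEndpoint(),
		ngrok.WithRegion("ap"),
		ngrok.WithAuthtokenFromEnv(),
	)
	if err != nil {
		return nil, nil, err
	}
	return eu, ap, nil
}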
Disabling GSLB in the Agent
You should always prefer to allow the ngrok agent to use GSLB and connect to the closest point of presence. However, the ngrok agent does support disabling GSLB and explicitly choosing which point of presence to connect to.
If you wish to configure the agent to use a specific point of presence, you may do so with the following configuration:
- Agent CLI
- Agent Config
- SSH
- Go
- JavaScript
- Python
- Rust
- Kubernetes Controller
ngrok http 80 --region eu
region: eu
ssh -R 0:localhost:80 connect.eu.ngrok-agent.com http
import (
	"context"
	"net"

	"golang.ngrok.com/ngrok"
	"golang.ngrok.com/ngrok/config"
)

func ngrokListener(ctx context.Context) (net.Listener, error) {
	return ngrok.Listen(ctx,
		config.HTTPEndpoint(),
		ngrok.WithRegion("eu"),
		ngrok.WithAuthtokenFromEnv(),
	)
}
Go Package Docs:
const ngrok = require("@ngrok/ngrok");

(async function () {
  const session = await new ngrok.SessionBuilder()
    .authtokenFromEnv()
    .serverAddr("connect.eu.ngrok-agent.com:443")
    .connect();
  const listener = await session
    .httpEndpoint()
    .listenAndForward("http://localhost:8080");
  console.log(`Ingress established at: ${listener.url()}`);
})();
JavaScript SDK Docs:
import ngrok, asyncio

async def create_listener() -> ngrok.Listener:
    session: ngrok.Session = (
        await ngrok.SessionBuilder()
        .authtoken_from_env()
        .server_addr("connect.eu.ngrok-agent.com:443")
        .connect()
    )
    return await session.http_endpoint().listen_and_forward("http://localhost:8080")

listener = asyncio.run(create_listener())
print(f"Ingress established at: {listener.url()}")
Python SDK Docs:
use ngrok::prelude::*;

async fn listen_ngrok() -> anyhow::Result<impl Tunnel> {
    let sess = ngrok::Session::builder()
        .authtoken_from_env()
        .server_addr("connect.eu.ngrok-agent.com:443")
        .connect()
        .await?;
    let tun = sess
        .http_endpoint()
        .listen()
        .await?;
    println!("Listening on URL: {:?}", tun.url());
    Ok(tun)
}
Rust Crate Docs:
In your Helm values.yaml file, set
serverAddr: "connect.eu.ngrok-agent.com:443"
Legacy Agent Behavior
Beginning with v3, the ngrok agent uses GSLB to connect to the lowest-latency point of presence. Prior to that, v2 agents always connected to the US region by default.