Service availability - Temporal Cloud

The operating envelope of Temporal Cloud includes availability, regions, throughput, latency, and limits. If you need more details, contact us.

Available regions

Where is Temporal Cloud available?

Developers and applications can access Temporal Cloud from any location with internet connectivity, irrespective of where the Temporal Cloud resources (Namespaces) are located.

Temporal Cloud is compatible with applications deployed in various cloud environments or data centers.

To minimize latency, we advise creating your Namespace in a region geographically close to your Workers' hosting location.

Currently, Temporal Cloud operates in several regions on Amazon Web Services (AWS):

Area	Code	Region
Asia Pacific	ap-northeast-1	Tokyo
Asia Pacific	ap-northeast-2	Seoul
Asia Pacific	ap-south-1	Mumbai
Asia Pacific	ap-southeast-1	Singapore
Asia Pacific	ap-southeast-2	Sydney
Europe	eu-central-1	Frankfurt
Europe	eu-west-1	Ireland
Europe	eu-west-2	London
North America	ca-central-1	Central Canada
North America	us-east-1	Northern Virginia
North America	us-east-2	Ohio
North America	us-west-2	Oregon
South America	sa-east-1	São Paulo

Your Workers and Client code aren't required to be hosted on AWS.
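
If you validate a region choice in deployment scripts, the table above can be kept as a small lookup. The following is a minimal sketch: the region codes, areas, and names are copied from the table, while the dictionary name and the `regions_in_area` helper are hypothetical and not part of any Temporal SDK or API.

```python
# Region data copied from the table above: code -> (area, region name).
# The names below are illustrative only, not a Temporal API.
TEMPORAL_CLOUD_AWS_REGIONS = {
    "ap-northeast-1": ("Asia Pacific", "Tokyo"),
    "ap-northeast-2": ("Asia Pacific", "Seoul"),
    "ap-south-1": ("Asia Pacific", "Mumbai"),
    "ap-southeast-1": ("Asia Pacific", "Singapore"),
    "ap-southeast-2": ("Asia Pacific", "Sydney"),
    "eu-central-1": ("Europe", "Frankfurt"),
    "eu-west-1": ("Europe", "Ireland"),
    "eu-west-2": ("Europe", "London"),
    "ca-central-1": ("North America", "Central Canada"),
    "us-east-1": ("North America", "Northern Virginia"),
    "us-east-2": ("North America", "Ohio"),
    "us-west-2": ("North America", "Oregon"),
    "sa-east-1": ("South America", "São Paulo"),
}


def regions_in_area(area: str) -> list[str]:
    """Return the sorted region codes listed for one area."""
    return sorted(
        code
        for code, (region_area, _name) in TEMPORAL_CLOUD_AWS_REGIONS.items()
        if region_area == area
    )


def is_supported_region(code: str) -> bool:
    """Check whether a region code appears in the table."""
    return code in TEMPORAL_CLOUD_AWS_REGIONS
```

For example, `regions_in_area("Europe")` lists the three European codes, which you can then compare against the location of your Workers.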

Throughput expectations

What kind of throughput can I get with Temporal Cloud?

Each Namespace has a rate limit, which is measured in Actions per second (APS). A Namespace's default limit is set at 400 APS and automatically adjusts based on recent usage (over the prior 7 days). Your throughput limit will never fall below this default value.
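
The floor described above can be expressed as a one-line rule. This is a minimal sketch, not Temporal's implementation: the 400 APS default comes from the text, but `usage_based_limit` is a hypothetical input, because how Temporal Cloud derives a limit from the prior 7 days of usage is not specified here.

```python
DEFAULT_APS_LIMIT = 400  # default Namespace limit stated in the text


def effective_aps_limit(usage_based_limit: float) -> float:
    """Return the limit in effect for a Namespace.

    `usage_based_limit` stands in for whatever value the automatic
    adjustment produces (an assumption; the real formula is not given).
    The result never falls below the default.
    """
    return max(DEFAULT_APS_LIMIT, usage_based_limit)
```

With this rule, a quiet Namespace whose adjusted value would be 250 APS still gets 400 APS, while one adjusted to 900 APS gets 900 APS.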

When your Action rate exceeds your quota, Temporal Cloud throttles Actions until the rate matches your quota. Throttling means limiting the rate at which Actions are performed to prevent the Namespace from exceeding its APS limit.

Critical calls for external events, such as starting or Signaling a Workflow, are always prioritized and never throttled. There are four priority levels for Temporal Cloud API calls:

External events
Workflow progress updates
Visibility API calls
Cloud operations such as Namespace creation

When you exceed your APS limits, you might receive warnings about throttling. However, requests are never dropped, and high-priority calls are never delayed. Workers might take longer to complete Workflows.
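
To make the behavior concrete, here is a toy model of priority-based throttling over a single second. It is a sketch under stated assumptions, not how Temporal Cloud is implemented: the `Priority` enum, the `schedule_one_second` function, and the rule that the four levels are ordered from highest to lowest as listed are all illustrative. The model only captures two properties from the text: external events are always admitted, and throttled calls are deferred rather than dropped.

```python
from enum import IntEnum


class Priority(IntEnum):
    # Assumed ordering: lower number means higher priority.
    EXTERNAL_EVENT = 1      # e.g. starting or Signaling a Workflow
    WORKFLOW_PROGRESS = 2   # Workflow progress updates
    VISIBILITY = 3          # Visibility API calls
    CLOUD_OPERATION = 4     # e.g. Namespace creation


def schedule_one_second(pending, aps_limit):
    """Split pending (priority, name) calls into (admitted, deferred).

    External events are always admitted. Other calls are admitted in
    priority order while the per-second budget has room; the rest are
    deferred to a later second, never dropped.
    """
    admitted, deferred = [], []
    # sorted() is stable, so arrival order is kept within a priority level.
    for call in sorted(pending, key=lambda c: c[0]):
        priority, _name = call
        if priority == Priority.EXTERNAL_EVENT or len(admitted) < aps_limit:
            admitted.append(call)
        else:
            deferred.append(call)
    return admitted, deferred
```

Calling `schedule_one_second` with a budget of 2 and two external events plus two lower-priority calls admits both external events and defers the other two, which mirrors the observation that Workers might take longer to complete Workflows under throttling.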

If your usage grows slowly, your throughput limit grows with it. At times, you may hit a maximum throughput threshold and need to switch to a higher consumption tier. To learn more about our tiers, visit our information page, or reach out to our team for help sizing your number of Actions. Temporal Cloud can provide more than 150,000 Actions per second at its highest tier.

MEASURING THROUGHPUT WITH APS AND RPS

APS and RPS are both measures of throughput, but apply to different aspects of Temporal.

APS, or Actions Per Second, is specific to Temporal Cloud. It measures how many high-level operations (Actions), such as starting or Signaling a Workflow, can be performed in a specific Namespace each second. Temporal Cloud uses APS to manage and throttle Actions, preventing a Namespace from exceeding its limit.

RPS, or Requests Per Second, is used in the Temporal Service, both in self-hosted Temporal and Temporal Cloud. It measures and controls the rate of gRPC requests to the Service. This is a lower-level measure that manages rates at the service level, such as the Frontend, History, or Matching Services.

In summary, APS is a higher-level measure to limit and mitigate Action spikes in Temporal Cloud. RPS is a lower-level measure to control and balance request rates at the service level.

Latency Service Level Objective (SLO)

What kind of latency can I expect from Temporal Cloud?

Temporal Cloud has a p99 latency SLO of 200ms per region.

In March 2024, latency over a week-long period for starting and signaling Workflow Executions was as follows:

Operation	p90	p99
`StartWorkflowExecution`	24ms	54ms
`SignalWorkflowExecution`	14ms	40ms
`SignalWithStartWorkflowExecution`	24ms	61ms

As Temporal continues working on improving latencies, these numbers will progressively decrease.

Latency observed from the Temporal Client is influenced by other system components like the Codec Server, egress proxy, and the network itself. Also, concurrent operations on the same Workflow Execution may result in higher latency.
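
Because client-observed latency includes these extra components, it can help to measure percentiles from your own environment. The sketch below uses only the Python standard library and a nearest-rank percentile; the 200ms figure comes from the SLO above, while `percentile`, `time_call_ms`, and `within_p99_target` are hypothetical helper names. Treat any comparison against the SLO as indicative only, since the numbers you collect on the client side are not the same measurement as the service-side SLO.

```python
import math
import time

SLO_P99_MS = 200  # p99 latency SLO stated in the text


def percentile(samples, pct):
    """Nearest-rank percentile of a non-empty list of numbers."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    # Smallest rank such that at least pct percent of samples are at or below it.
    rank = math.ceil(pct * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def time_call_ms(call):
    """Run a zero-argument callable and return its wall-clock time in ms."""
    start = time.perf_counter()
    call()
    return (time.perf_counter() - start) * 1000


def within_p99_target(samples_ms, target_ms=SLO_P99_MS):
    """Check whether the observed p99 is at or below the target."""
    return percentile(samples_ms, 99) <= target_ms
```

In practice you would pass `time_call_ms` a function that performs one client operation (for example, starting a Workflow Execution through your SDK of choice), collect many samples, and then inspect `percentile(samples, 90)` and `percentile(samples, 99)`.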
