Why the network is not a bottleneck in microservice architectures

Samson Oyetola
6 min read · Apr 16, 2021
Communication Between Microservices

I’ve been reading a lot about microservice architectures for server applications, and I’ve been wondering why the internal network usage is not a bottleneck or a significant disadvantage compared to a monolithic architecture. Below is my verdict. So, grab a chair.

In the world of microservice architecture, we build out an application via a collection of services. Each service in the collection tends to meet the following criteria:

  • Loosely coupled
  • Maintainable and testable
  • Independently deployable

Each service in a microservice architecture solves a business problem in the application or at least supports one. A single team is responsible and accountable for one or more services in the application.

Microservice architectures can unlock a number of different benefits.

  • They are often easier to build and maintain
  • Services are organized around business problems
  • They increase productivity and speed
  • They encourage autonomous, independent teams

These benefits are a big reason microservices are increasing in popularity. But potholes exist that can derail all these benefits. Hit those and you’ll get an architecture that amounts to nothing more than distributed technical debt.

Communication between microservices is one such pothole that can wreak havoc if not considered ahead of time.

The goal of this architecture is to create loosely coupled services, and communication plays a key role in achieving that. In this article, we are going to focus on RESTful communication over HTTP/HTTPS.

Internal networks often use 1 Gbps connections, or faster. Optical fiber or link bonding allows much higher bandwidth between servers. Now imagine the average size of a JSON response from an API. How many such responses can be transmitted over a 1 Gbps connection in one second?

Let’s actually do the math. 1 Gbps is 125,000 KB per second. If an average JSON response is 5 KB (which is quite a lot!), you can send 25,000 responses per second through the wire with just one pair of machines. Not bad, is it?

This is why the network connection is usually not the bottleneck.

Another aspect of microservices is that you can scale easily. Imagine two servers, one hosting the API and the other consuming it. If the connection ever becomes the bottleneck, just add another pair of servers and you can double the throughput.

If our earlier 25,000 responses per second ever become too small for the scale of the app, you add nine more pairs and can now serve 250,000 responses per second.
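If you want to sanity-check these figures, here is a minimal back-of-the-envelope sketch in Python. It assumes a 1 Gbps link, 5 KB responses, perfect link utilization, and zero protocol overhead, so real-world throughput will be lower:

```python
# Back-of-the-envelope check of the numbers above. Assumptions: a 1 Gbps
# link, a 5 KB average JSON response, perfect utilization, and no
# protocol overhead -- real-world numbers will be lower.

LINK_GBPS = 1      # internal network link speed
RESPONSE_KB = 5    # average JSON response size

bytes_per_second = LINK_GBPS * 1_000_000_000 / 8        # 125,000,000 B/s
responses_per_second = bytes_per_second / (RESPONSE_KB * 1_000)

print(f"One pair of machines: {responses_per_second:,.0f} responses/s")
# -> One pair of machines: 25,000 responses/s

# Scaling out: each extra pair of servers brings its own link.
pairs = 10
print(f"{pairs} pairs: {pairs * responses_per_second:,.0f} responses/s")
# -> 10 pairs: 250,000 responses/s
```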

But let’s get back to our pair of servers and do some comparisons.

  • If an average non-cached database query takes 10 ms, a single connection is limited to 100 queries per second. 100 queries versus 25,000 responses: achieving 25,000 responses per second requires a great amount of caching and optimization (assuming the response actually does something useful, like querying a database; “Hello World”-style responses don’t qualify).
  • On my computer, right now, DOMContentLoaded for Google’s home page fired 394 ms after the request was sent. That’s less than 3 requests per second. For the Programmers.SE home page, it fired 603 ms after the request was sent. That’s not even 2 requests per second. By the way, I have a 100 Mbps internet connection and a fast computer: many users will wait longer.
  • If the bottleneck were the network speed between the servers, those two sites could literally make thousands of calls to different APIs while serving the page.

Those two cases show that the network probably won’t be your bottleneck in theory (in practice, you should run actual benchmarks and profile your particular system on its particular hardware to find the exact bottleneck). The time spent doing the actual work (be it SQL queries, compression, or anything else) and sending the result to the end user matters much more.
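As a starting point for such measurements, here is a minimal latency probe in Python. The endpoint URL is a hypothetical placeholder, and for serious benchmarks you’d reach for a dedicated load-testing tool, but even a crude probe like this shows where the time actually goes:

```python
import time
import urllib.request

# Crude latency probe (illustrative only -- use a proper load-testing
# tool for real benchmarks). The URL below is a placeholder; point it
# at one of your own services.
URL = "http://localhost:8080/health"  # hypothetical internal endpoint

def measure(url: str, samples: int = 20) -> float:
    """Return the average wall-clock latency of a GET request, in ms."""
    total = 0.0
    for _ in range(samples):
        start = time.perf_counter()
        with urllib.request.urlopen(url) as response:
            response.read()
        total += time.perf_counter() - start
    return total / samples * 1000

if __name__ == "__main__":
    avg_ms = measure(URL)
    print(f"avg latency: {avg_ms:.1f} ms "
          f"(~{1000 / avg_ms:.0f} sequential requests/s per connection)")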

Think about databases

Usually, databases are hosted separately from the web application using them. This can raise a concern: what about the connection speed between the server hosting the application and the server hosting the database?

It turns out there are indeed cases where the connection speed becomes problematic: when you store huge amounts of data that don’t need to be processed by the database itself but must be available right away (that is, large binary files). But such situations are rare: in most cases, the transfer time is small compared to the time spent processing the query itself.

Transfer speed actually matters when a company hosts large data sets on a NAS and that NAS is accessed by multiple clients at the same time. This is where a SAN can be a solution. That said, it is not the only one. Cat 6 cables can support speeds up to 10 Gbps; bonding can also be used to increase speed without changing the cables or network adapters. Other solutions exist, involving data replication across multiple NAS devices.

Forget about speed; think about scalability

An important property of a web app is the ability to scale. While raw performance matters (because nobody wants to pay for more powerful servers), scalability matters much more, because it lets you throw additional hardware at the problem when needed.

  • If your app isn’t particularly fast, you’ll lose money because you’ll need more powerful servers.
  • If your app is fast but can’t scale, you’ll lose customers because you won’t be able to respond to increasing demand.

In the same way, virtual machines were perceived a decade ago as a huge performance issue. Indeed, hosting an application on bare metal vs. hosting it in a virtual machine had a significant performance impact. While the gap is much smaller today, it still exists.

Despite this performance loss, virtual environments became very popular because of the flexibility they give.

As with network speed, you may find that the VM is the actual bottleneck and that, given your actual scale, you would save serious money by hosting your app directly on hardware, without VMs. But this is not what happens for 99.9% of apps: their bottleneck is somewhere else, and the loss of a few microseconds to virtualization is easily compensated by the benefits of hardware abstraction and scalability.

In conclusion, it’s not about network bottlenecks; it’s about network brittleness. So the first step is to avoid synchronous communication. It’s easier than it sounds. All you need is services with the right boundaries. The right boundaries result in services that are autonomous, loosely coupled, and highly cohesive. A good service doesn’t need information from another service; it already has it. The only way good services communicate is via events. Good services are eventually consistent as well, so there are no distributed transactions.
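To make this concrete, here is a minimal sketch of event-based communication in Python. The EventBus class and OrderPlaced event are illustrative inventions; in production, the bus would be a message broker such as Kafka or RabbitMQ, delivering events asynchronously:

```python
from dataclasses import dataclass

# A minimal in-process stand-in for event-based communication.
# In production the bus would be a message broker (e.g. Kafka or
# RabbitMQ) delivering events asynchronously; the coupling model
# it illustrates is the same.

@dataclass(frozen=True)
class OrderPlaced:
    order_id: str  # a reference to the business result, not the full data

class EventBus:
    def __init__(self):
        self._subscribers = {}

    def subscribe(self, event_type, handler):
        self._subscribers.setdefault(event_type, []).append(handler)

    def publish(self, event):
        # A real broker would queue this and deliver it asynchronously.
        for handler in self._subscribers.get(type(event), []):
            handler(event)

bus = EventBus()

# The billing service reacts to the event; the ordering service never
# calls billing directly and doesn't even know it exists.
bus.subscribe(OrderPlaced, lambda e: print(f"billing: invoicing order {e.order_id}"))
bus.publish(OrderPlaced(order_id="42"))
```

Notice that the publishing side only announces what happened; it neither knows nor cares which services react.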

The way to achieve this is to identify your business capabilities first. A business capability is a specific business responsibility, a specific contribution to overall business value. Here is the sequence of steps I take when thinking about system boundaries:

  1. Identify the higher-level business responsibilities. There will be a few of them. Treat these services as the steps your organization walks through to achieve its business goal.
  2. Delve deeper within each service. Identify the lower-level services that make up the parent one.
  3. Alongside the first two steps, think about how the services communicate. They should do so primarily via events, just to notify each other about their business-process results. Events should not be treated as data conveyors (the sketch after this list contrasts the two styles).
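To illustrate that last point, here is a small sketch contrasting a “fat” event that ferries data around with a thin notification. The event names and fields are hypothetical:

```python
from dataclasses import dataclass

# Hypothetical events illustrating "notification, not data conveyor".

# Anti-pattern: a "fat" event that ferries another service's data
# around, coupling every consumer to this payload's shape.
@dataclass(frozen=True)
class CustomerRegisteredFat:
    customer_id: str
    name: str
    email: str
    address: str

# Preferred: a thin event that only announces the business result.
# Consumers that genuinely need more data either own it already
# (right boundaries) or fetch it on their own terms.
@dataclass(frozen=True)
class CustomerRegistered:
    customer_id: str
```

The thin version keeps the contract between services tiny and stable, which is exactly what loose coupling asks for.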

Keep in mind that a business service includes people, applications, and business processes; usually, only part of it is represented as a technical authority.

This may sound a bit abstract, so an example of identifying service boundaries would probably be of some interest.


Samson Oyetola

Full-stack Developer and Creator of the Ehex (ex) PHP Framework