Kubernetes Service vs. Headless Service Explained: Use Cases, Differences, and Client-Side Load Balancing
- Jasper Zhang
- Jun 17
- 5 min read
Updated: Jun 28
When I first started deploying Timeplus on Kubernetes (commonly abbreviated as K8s), I realized that understanding the fundamentals of K8s networking was crucial for a successful deployment. Among all the concepts I encountered, the distinction between Service and Headless Service was the most confusing. This confusion isn't unique to me: the Stack Overflow question "What is a headless service, what does it do/accomplish, and what are some legitimate use cases for it?" has garnered over 250 upvotes, which suggests that many developers struggle with this concept.
As I dove deeper into distributed systems architecture and started working with stateful applications like databases, message queues, and stream processing systems, I discovered that Headless Service solves real and common problems that Service simply cannot address.
In this article, I'll bridge that gap by explaining not just what these services are, but more importantly, when and why you'd use each type, backed by real-world examples and practical use cases that you're likely to encounter in production environments.
Key Differences
Before we go through the comparison table, I assume you’ve already read the official Kubernetes documentation on Service. It introduces the basics of what Service is and how it helps expose a group of Pods through a stable network endpoint.
The difference between Service and Headless Service can still be unclear, especially if you’re new to the topic. The table below outlines the key differences. It’s okay if you don’t fully understand everything yet. We’ll cover real-world examples in the next section to help make these concepts more concrete.
| | Service | Headless Service | Notes |
| --- | --- | --- | --- |
| Who performs load balancing | The Service itself | Either the client or the server | A Headless Service has no ClusterIP; the client must resolve and select endpoints itself. |
| Typical backend type | Deployment | StatefulSet | StatefulSet provides stable identity and persistent storage. |
| DNS response | A single ClusterIP | All Pod IPs (one A record per Pod) | |
| Use case | Stateless services (e.g., web frontends, APIs) | Stateful services (e.g., databases, message queues) | Choose based on whether identity and storage need to persist. |
| Service discovery | Hides Pod details behind a common endpoint | Exposes the full Pod IP list to the client | |
Use Case Differences
Case 1: Service for Stateless Applications
Let's consider a typical RESTful API server deployment, such as a simple user management API or a product catalog service.
Requirements for the Deployment:
- The client doesn't need to talk to a specific server; it just sends a request and gets a response. Simplicity on the client side is important.
- All backend servers are stateless and interchangeable, so requests should be evenly distributed across all of them.
Why Service is a good fit:
Service hides the details of the backend Pods under a single DNS name. This makes the client code much simpler because it only needs to connect to one stable address.
Load balancing is handled automatically by Kubernetes (via kube-proxy, using iptables or IPVS), which ensures that traffic is evenly distributed across all healthy Pods.
The Service acts as a clean abstraction layer: clients see one endpoint, while the traffic is distributed across multiple backend instances automatically.
The following diagram illustrates this architecture. The client interacts with a single ClusterIP (10.244.0.100), while the Service transparently forwards requests to one of the backend Pods (Pod1, Pod2, or Pod3). Kubernetes handles the load balancing and service discovery behind the scenes.

In short, if every backend Pod is identical and you don't care which exact Pod handles a request, a regular Service is your go-to option.
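As a minimal sketch of this setup (all names, the image tag, and the ports here are made up for illustration), a stateless API like this is typically deployed as a Deployment exposed through a regular Service:

```yaml
# Hypothetical manifests for a stateless user-management API.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: user-api
spec:
  replicas: 3
  selector:
    matchLabels:
      app: user-api
  template:
    metadata:
      labels:
        app: user-api
    spec:
      containers:
        - name: user-api
          image: example/user-api:1.0   # placeholder image
          ports:
            - containerPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: user-api
spec:
  selector:
    app: user-api
  ports:
    - port: 80
      targetPort: 8080
```

Inside the namespace, clients simply call `http://user-api`; kube-proxy spreads the connections across the three replicas without the client ever knowing the Pod IPs.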
Case 2: Headless Service for Stateful Applications
Now let's look at a different scenario: deploying Timeplus, a stream processing engine that uses the Raft consensus algorithm. Don't worry, you don't need to understand how Raft works here. The key difference from the API servers in Case 1 is that each Timeplus server has its own state, and each server needs to reach the other servers by DNS name in order to exchange data and maintain that state.
As a result, a regular Service is no longer an option, because each Timeplus server needs to talk to specific peers. This is where Headless Service comes into play. The diagram below shows the overall architecture (the green text highlights the differences from a regular Service).

With a Headless Service*, each Pod behind the service gets a unique DNS name, following the convention `<hostname>.<service>.<ns>.svc.<zone>`. We can then add two environment variables to the StatefulSet:
ADVERTISED_HOST=<hostname>.timeplusd-svc.timeplus.svc.cluster.local
METADATA_NODE_QUORUM=timeplusd-0.timeplusd-svc.timeplus.svc.cluster.local:8464,timeplusd-1.timeplusd-svc.timeplus.svc.cluster.local:8464,timeplusd-2.timeplusd-svc.timeplus.svc.cluster.local:8464
Those two variables help each server figure out:
- Who am I
- How to connect to the others in the cluster
With these two pieces of information, the servers can properly form a Raft cluster.
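For reference, here is a minimal sketch of what the Kubernetes side of this could look like. The service and namespace names mirror the environment variables above, but the image tag and the assumption that port 8464 is the only port are illustrative:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: timeplusd-svc
  namespace: timeplus
spec:
  clusterIP: None        # this is what makes the Service headless
  selector:
    app: timeplusd
  ports:
    - name: raft
      port: 8464
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: timeplusd
  namespace: timeplus
spec:
  serviceName: timeplusd-svc   # gives each Pod its stable DNS name
  replicas: 3
  selector:
    matchLabels:
      app: timeplusd
  template:
    metadata:
      labels:
        app: timeplusd
    spec:
      containers:
        - name: timeplusd
          image: timeplus/timeplusd:latest   # placeholder tag
          ports:
            - containerPort: 8464
```

With `clusterIP: None` on the Service and `serviceName` on the StatefulSet, the Pods become reachable as `timeplusd-0.timeplusd-svc.timeplus.svc.cluster.local` through `timeplusd-2.timeplusd-svc.timeplus.svc.cluster.local`, exactly the names used in `METADATA_NODE_QUORUM` above.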
However, besides the per-Pod DNS records, you may also notice that the result of `nslookup` changes: instead of resolving to a single ClusterIP, it now returns all three A records behind the service. So what happens if I run `curl timeplusd-svc.timeplus`? Will it always hit a particular server, or is there some magic that load-balances the request across the 3 servers?
This brings us to our next topic: load balancing.
* This depends on the CNI implementation. On Amazon EKS, you will find that each Pod behind a regular Service gets a unique DNS name as well; this is not the case on Google GKE.
Load Balancing
Unfortunately, when you run `curl timeplusd-svc.timeplus`, the request will always hit the first live server. That means if your backend doesn't have a mechanism to balance the workload by itself, you will end up with one busy and two idle servers in your cluster.
To be more precise, as the name "headless" already indicates, the service won't do any magic here; you have to handle load balancing on either the client side or the server side. Most tools and HTTP libraries will simply send the request to the first available host behind the headless service.
For example, if you use Go's net/http library to do something like `res, err := http.Get("http://timeplusd-svc.timeplus")`, the Go resolver typically ends up using only the first IP, and the result may be cached depending on the DNS caching policy in play. As a result, all requests go to the same Pod until the cache expires.
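To make the idea of client-side load balancing concrete, here is a minimal, self-contained sketch of round-robin address selection. The `rrPicker` type and the fallback Pod IPs are invented for illustration; a production client would seed the list from DNS and refresh it when the TTL expires:

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// rrPicker hands out addresses in round-robin order. In a real client
// the list would come from net.LookupHost("timeplusd-svc.timeplus"),
// which returns every Pod IP behind the headless Service, and it would
// be refreshed when the DNS TTL expires.
type rrPicker struct {
	addrs []string
	next  uint64
}

func newRRPicker(addrs []string) *rrPicker {
	return &rrPicker{addrs: addrs}
}

// Next returns the next address, wrapping around at the end of the list.
// atomic.AddUint64 keeps it safe to call from concurrent goroutines.
func (p *rrPicker) Next() string {
	n := atomic.AddUint64(&p.next, 1)
	return p.addrs[(n-1)%uint64(len(p.addrs))]
}

func main() {
	// Hypothetical Pod IPs, standing in for the three timeplusd Pods.
	rr := newRRPicker([]string{"10.244.0.11", "10.244.0.12", "10.244.0.13"})
	for i := 0; i < 4; i++ {
		fmt.Println(rr.Next()) // .11, .12, .13, then wraps back to .11
	}
}
```

Dialing `rr.Next()` before each new connection is the core of what any client-side balancer over a headless Service has to do; everything else (TTL handling, health checks, timeouts) is refinement on top.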
To solve this issue, we implemented a DNS-name-based round-robin dialer that automatically distributes connections across all of the IPs resolved from the DNS name. It supports DNS TTLs and connection timeouts, and it plugs directly into Go's `http.Client`. You can find the source code here.
Here's how you can use it to fix the issue:
```go
import (
	"net/http"
	"time"

	rrd "myapp/roundrobindialer"
)

client := &http.Client{
	Transport: &http.Transport{
		// The round-robin dialer resolves the headless-service name and
		// rotates across all returned Pod IPs on every new connection.
		DialContext: rrd.NewRoundRobinDialer(
			rrd.WithDialTimeout(3*time.Second),
			rrd.WithKeepAlive(10*time.Second),
			rrd.WithDNSTTL(30*time.Second),
		).DialContext(),
		// Disable HTTP keep-alives so every request opens a fresh
		// connection and therefore lands on the next Pod in rotation.
		DisableKeepAlives: true,
	},
}

resp, err := client.Get("http://timeplusd-svc.timeplus")
```
With this solution, each outgoing connection is round-robin distributed across available Pods, enabling proper load balancing at the client level.
Recap
Through this exploration of Kubernetes Service vs Headless Service, we've uncovered several key insights:
The Fundamental Architecture Decision
The choice between Service and Headless Service isn't just a configuration detail—it reflects a fundamental architectural decision about how your application handles state and communication:
- Stateless applications (REST APIs, web servers) benefit from the abstraction layer that Service provides
- Stateful applications (databases, message queues, consensus systems) require the direct Pod-to-Pod communication that Headless Service enables
Load Balancing Responsibility Shift
We discovered that Headless Service creates a responsibility shift in load balancing:
- Service: Kubernetes handles load balancing automatically via kube-proxy and kernel-level mechanisms (iptables/IPVS)
- Headless Service: load balancing becomes the application's responsibility, either server-side or client-side
This shift isn't a limitation; it's a design choice that gives applications more control over traffic distribution when they need it.