We need to build a service that returns a new ID on every request, with four requirements: the IDs are sortable, the service is highly available, the IDs are unique across all geolocations, and the system can handle 10k req/sec.
Approach 1: Timestamp-Based ID Generation
Use time.Now(), or to be more precise an epoch timestamp, as the ID. This works for a very low user base, but a server handles many requests per second, and every request that lands on the same instant gets the same ID. Even playing with microsecond strings is difficult, since a server can handle around 100 requests per millisecond; we can keep going down to nanoseconds, but it stays a game of probability and there is still a chance of two users getting the same ID.
Implementation:
- Nanosecond precision with a server counter: use time.Now().UnixNano() combined with a rolling in-memory atomic counter
- Thread-safe counter: implement an atomic counter that increments for each request that lands on the same timestamp
- Collision detection: add a small retry mechanism with microsecond delays if a collision is detected (see the retry sketch after the code block below)
var serverCounter uint64

func generateID() uint64 {
	// Low 48 bits of the nanosecond timestamp in the high bits, 16-bit rolling counter in the low bits.
	timestamp := uint64(time.Now().UnixNano()) // convert so the bitwise OR type-checks
	counter := atomic.AddUint64(&serverCounter, 1)
	return (timestamp << 16) | (counter & 0xFFFF)
}
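A minimal sketch of the retry idea from the list above, built on the generateID function just shown: remember the last issued value and regenerate if the new one does not come out strictly larger. The helper name generateUniqueID, the retry budget, and the lastIssued variable are assumptions for illustration, not part of the original design.

import (
	"errors"
	"sync/atomic"
	"time"
)

var lastIssued uint64

// generateUniqueID retries when the freshly built ID collides with (or falls
// behind) the last issued one, sleeping a microsecond between attempts.
func generateUniqueID() (uint64, error) {
	for attempt := 0; attempt < 3; attempt++ {
		id := generateID()
		prev := atomic.LoadUint64(&lastIssued)
		if id > prev && atomic.CompareAndSwapUint64(&lastIssued, prev, id) {
			return id, nil
		}
		time.Sleep(time.Microsecond) // back off and try again
	}
	return 0, errors.New("could not generate a unique ID after 3 attempts")
}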
Limitations:
- Still vulnerable to clock synchronization issues across distributed systems
- Performance bottleneck due to atomic operations
- Limited scalability beyond single server deployments
Approach 2: Multi-Component ID Generation
- Derive the ID from four components generated by the server: epoch timestamp + location ID + server-specific ID + an incrementing sequence. This approach works: the IDs are unique, it scales, and it can handle many concurrent requests. But it has a few problems:
- How do we handle multiple instances in the same geolocation? The server-specific ID must be unique within each location
- To solve one unique-ID-generation problem we end up creating another service, one that allocates a unique ID whenever a new server registers with it (see the sketch after this list)
- Limited to 1024 location+server combinations (for example 32 locations × 32 servers with a 5+5 bit split)
- Clock synchronization requirements across locations
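The registration service mentioned above can be as simple as a counter behind a lock that hands each new server the next free ID. A minimal in-memory sketch, with hypothetical names (ServerRegistry, Register) and no persistence or failover:

import (
	"errors"
	"sync"
)

// ServerRegistry hands out server IDs within one location.
type ServerRegistry struct {
	mu     sync.Mutex
	nextID uint64
	maxID  uint64 // capacity of the server-ID space in this location
}

// Register returns the next free server ID, or an error when the location is full.
func (r *ServerRegistry) Register() (uint64, error) {
	r.mu.Lock()
	defer r.mu.Unlock()
	if r.nextID >= r.maxID {
		return 0, errors.New("no server IDs left in this location")
	}
	id := r.nextID
	r.nextID++
	return id, nil
}

For a 5-bit server ID you would initialize it as registry := &ServerRegistry{maxID: 32}.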
| 41 bits: Timestamp | 10 bits: Location+Server ID | 12 bits: Sequence |

It's a good approach for decent scale, when you have fewer users across all geolocations. It also lets us derive information from the ID itself instead of storing it separately, which reduces overhead (unpacking the components is sketched after the generator code below).
type IDGenerator struct {
	locationID uint64 // pre-assigned per datacenter (5 bits)
	serverID   uint64 // pre-assigned per server (5 bits)
	sequence   uint64
}

func (g *IDGenerator) generateID() uint64 {
	now := uint64(time.Now().UnixMilli())
	// Use the value returned by AddUint64 instead of re-reading g.sequence,
	// which would race with concurrent callers.
	seq := atomic.AddUint64(&g.sequence, 1) & 0xFFF // 12-bit rolling sequence
	machineID := (g.locationID << 5) | g.serverID   // 5 + 5 = 10 bits
	return (now << 22) | (machineID << 12) | seq
}
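Because the layout is fixed, the timestamp, location, server, and sequence can be unpacked from an ID with shifts and masks; this is what makes "derive the information from the ID itself" work. A minimal sketch (the decodeID helper is hypothetical):

// decodeID splits a 41/5/5/12-bit ID back into its components.
func decodeID(id uint64) (timestampMillis, locationID, serverID, seq uint64) {
	seq = id & 0xFFF               // low 12 bits
	serverID = (id >> 12) & 0x1F   // next 5 bits
	locationID = (id >> 17) & 0x1F // next 5 bits
	timestampMillis = id >> 22     // remaining high bits
	return
}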
Approach 3: Snowflake Algorithm Implementation
The Twitter Snowflake algorithm is the industry standard for distributed unique ID generation and matches our requirements: a single machine can issue up to 4096 IDs per millisecond (far more than 10k req/sec), the IDs are globally unique and roughly time-sorted, and generation needs no central coordination, which keeps the service highly available.
64-bit Snowflake Structure
| 1 bit: Sign | 41 bits: Timestamp | 10 bits: Machine ID | 12 bits: Sequence |

Specs:
- Timestamp: 41 bits = ~69 years of IDs (milliseconds since custom epoch)
- Machine ID: 10 bits = 1024 unique machines (can be split into datacenter + worker ID)
- Sequence: 12 bits = 4096 IDs per millisecond per machine
type Snowflake struct {
	mu        sync.Mutex // guards sequence and lastTime across goroutines
	machineID uint64
	sequence  uint64
	lastTime  uint64
}

func (s *Snowflake) generateID() uint64 {
	s.mu.Lock()
	defer s.mu.Unlock()
	now := uint64(time.Now().UnixMilli())
	if now == s.lastTime {
		s.sequence = (s.sequence + 1) & 0xFFF
		if s.sequence == 0 {
			// 4096 IDs already issued this millisecond; wait for the next one.
			for now <= s.lastTime {
				now = uint64(time.Now().UnixMilli())
			}
		}
	} else {
		s.sequence = 0
	}
	s.lastTime = now
	return (now << 22) | (s.machineID << 12) | s.sequence
}
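A quick usage sketch of the struct above, assuming machine ID 42 and a clock that never moves backwards; IDs from a single generator come out strictly increasing, which is what gives us the sortability requirement:

import "fmt"

func main() {
	gen := &Snowflake{machineID: 42}
	var prev uint64
	for i := 0; i < 5; i++ {
		id := gen.generateID()
		fmt.Println(id, id > prev) // each ID is strictly larger than the previous one
		prev = id
	}
}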
Approach 4: Database Sequence with Sharding
- Use database auto-increment with a step size equal to the number of servers
- Server 1: IDs 1, 4, 7, 10… (step=3)
- Server 2: IDs 2, 5, 8, 11… (step=3)
- Server 3: IDs 3, 6, 9, 12… (step=3)
- It guarantees uniqueness and also leverages the database's ACID properties
-- auto_increment_increment/offset are MySQL server (or session) variables
-- Server 1
SET @@auto_increment_increment = 3;
SET @@auto_increment_offset = 1;
-- Server 2
SET @@auto_increment_increment = 3;
SET @@auto_increment_offset = 2;
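With that configuration, each server can draw an ID by inserting a placeholder row into the id_generator table and reading back the auto-increment value. A minimal sketch using database/sql; the stub column, the nextID name, and the MySQL driver import are assumptions:

import (
	"database/sql"

	_ "github.com/go-sql-driver/mysql" // assumed MySQL driver
)

// nextID inserts a placeholder row and returns the auto-increment value,
// which follows the increment/offset configured on this server.
func nextID(db *sql.DB) (int64, error) {
	res, err := db.Exec("INSERT INTO id_generator (stub) VALUES ('a')")
	if err != nil {
		return 0, err
	}
	return res.LastInsertId()
}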
Limitations:
- Difficult to scale when adding/removing servers
- Single point of failure
Approach 5: Instagram’s Sharded Database Approach
Instagram uses a modified Snowflake approach built on PostgreSQL schemas.
Snowflake Structure
| 41 bits: Timestamp | 13 bits: Shard ID | 10 bits: Sequence |

Implementation:
- Each logical shard gets a unique 13-bit ID
- Uses PostgreSQL's nextval() function for sequence generation
- Multiple logical shards can exist on a single physical database
-- Create one sequence per logical shard
CREATE SEQUENCE shard_001_seq;
CREATE SEQUENCE shard_002_seq;

-- Generate an ID for shard 1. The timestamp is offset by a custom epoch (in ms)
-- so it fits in 41 bits, and the parentheses matter because PostgreSQL's
-- bitwise operators all share the same precedence.
SELECT (((EXTRACT(EPOCH FROM NOW()) * 1000)::bigint - 1314220021721) << 23)
     | (1 << 10)                        -- shard ID
     | (nextval('shard_001_seq') & 1023);
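On the application side each request simply runs that statement and scans the result. A minimal sketch using database/sql; the fixed shard-1 query, the nextInstagramID name, and the lib/pq driver are assumptions:

import (
	"database/sql"

	_ "github.com/lib/pq" // assumed PostgreSQL driver
)

// nextInstagramID runs the shard-1 ID expression and returns the generated value.
func nextInstagramID(db *sql.DB) (int64, error) {
	const q = `
		SELECT (((EXTRACT(EPOCH FROM NOW()) * 1000)::bigint - 1314220021721) << 23)
		     | (1 << 10)
		     | (nextval('shard_001_seq') & 1023)`
	var id int64
	err := db.QueryRow(q).Scan(&id)
	return id, err
}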