0%

一致性 hash 算法理解与实现

前言

近段时间在了解分布式时,经常绕不开一个算法: 一致性哈希算法。于是在了解并实践这个算法后,就有了此文章。

算法间的对比

在分布式分片中,存在着几种算法: 取模,分段,一致性 hash。

取模 分段 一致性哈希
上层是否感知
迁移成本 低,只涉及相邻节点
单点故障影响 低,只影响相邻节点
算法复杂度
热点数据 存在 存在 存在

一致性哈希主要解决问题

从上述对比可知,一致性哈希主要降低节点上下线中带来的数据迁移成本,同时节点数量的变化与分片原则于上层无感,使上层更专注于领域内逻辑的编写,使整体架构更加灵活。

一致性 hash 原理

  1. 基本数据结构

​ 基本数据类型因人而已,常规的哈希取模采用大多采用将数据 hash 到 2^32 - 1的空间中,一致性哈希通常在此基础上将空间变成一个环。如下图所示。

​ 本次实现采用的是 key 按大小排列的哈希表。原打算使用数组承接数据,但排序成本随 key 的增多而加大,遂放弃。

  1. 数据存储

    数据存储与哈希取模算法大致相同,都是通过计算存入数据的哈希值放入对应的哈希槽上。但一致性哈希差异之处在于当计算 hash 不在环上,数据存入首个 hash 槽中。

    场景假设: 现已上线 4 节点(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有5个数据(hash1 ~ 5)于存入节点中,结果如下图所示。

​ 本次实现采用的思路是

1
2
3
1. 计算存入数据的 hash 值
2. 寻找最近的(比数据 hash 值大的最小的节点 hash)节点并写入
3. 若 2 中未能寻找服务器,则写入第一个(hash 最小)节点中
  1. 节点上线

​ 新节点加入一致性哈希环中,原理是通过计算节点所代表的 hash 值,并根据计算值将节点映射在环上,最后迁移相邻节点数据到新节点上。

​ 场景假设: 现已上线 4 台服务器(server1 ~ 4),对应 hash 值为 hash1 ~ 4。现有一个新节点(hash5)节点上线到环上。结果如下图所示。

​ 本次实现采用的思路是

1
2
3
4
5
1. 计算上线节点 hash 值
2. 计算上线节点所新增的虚拟节点的 hash 值(若初始化指定虚拟节点数量)
3. 寻找最近的(比上线节点与虚拟节点 hash 值大的最小的节点 hash)节点,取出节点数据
4. 将1 2点节点加入到 hash 环中
5. 将 3 中取出的数据重新放入到 hash 环上
  1. 节点下线

​ 已有节点下线,原理是通过计算节点所代表的 hash 值,取出节点所含数据,下线节点,将取出数据重新放入 hash 环上。

​ 场景假设: 现已上线 5 台服务器(server1 ~ 5),对应 hash 值为 hash1 ~ 5。现节点 server4 下线。结果如下图所示。

​ 本次实现采用的思路是

1
2
3
4
1. 计算下线节点 hash 值
2. 取出下线节点以及虚拟节点(若初始化指定虚拟节点数量)存储数据
3. 将下线节点以及虚拟节点(若初始化指定虚拟节点数量)从 hash 环上移除
4. 将 2 中数据重新放入到环上

代码实现

一致性哈希分为两个方案: 不带虚拟节点与带虚拟节点。而两个方案实现类似,所以本次实现将两种方案合在一起实现。实现如下。

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;
import java.util.List;

/**
* consistentHashing interface
* @author cartoon
* @since 10/01/2021
* @version 1.1
*/
public interface ConsistentHashing {

/**
* put data to hash loop
* @param data data list
* @return put result
*/
boolean putData(List<String> data);

/**
* put data to hash loop
* @param data data
* @return put result
*/
boolean putData(String data);

/**
* remove node from hash loop
* @param node removing node
* @return remove result
*/
boolean removeNode(String node);

/**
* add node to hash loop
* @param node adding node
* @return add result
*/
boolean addNode(String node);

/**
* inject hash method to hash loop
* @param hashService hash method
* @throws UnsupportedOperationException if loop already has node
*/
void setHashMethod(HashService hashService);

/**
* print all data in loop according ascending hash value with nodes
*/
void printAllData();

}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
package org.CommonAlgorithms.ConsistentHash;

import org.CommonAlgorithms.HashAlgorithm.HashService;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import java.util.*;

/**
* consistent hash achieve
* @author cartoon
* @since 2021/01/17
*/
public class ConsistentHashingImpl implements ConsistentHashing {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingImpl.class);

/**
* virtual node name template
*/
private static final String virtualNodeFormat = "%s&&%d";

/**
* real node and its virtual node mapping
*/
private SortedMap<String, List<String>> realNodeToVirtualNode;

/**
* hash and its node mapping
*/
private SortedMap<Integer, String> hashToNodes;

/**
* node and its data mapping
*/
private Map<String, List<String>> nodeToData;

/**
* determine virtual node's number of each node
*/
private int virtualNodeNum;

/**
* inject hash method, if null, use loop default hash method
*/
private HashService hashService;


public ConsistentHashingImpl() {
this(0, new String[0]);
}

public ConsistentHashingImpl(String... nodes) {
this(0, nodes);
}

public ConsistentHashingImpl(int virtualNodeNum) {
this(virtualNodeNum, new String[0]);
}

public ConsistentHashingImpl(int virtualNodeNum, String... nodes) {
//1. intercept virtual num smaller than 0
if(virtualNodeNum < 0){
log.error("virtual num is not allow smaller than 0");
throw new IllegalArgumentException();
}
//2. initialize loop member attributes
this.virtualNodeNum = virtualNodeNum;
realNodeToVirtualNode = new TreeMap<>();
hashToNodes = new TreeMap<>();
nodeToData = new HashMap<>();
for(String server : nodes){
hashToNodes.put(getHash(server), server);
nodeToData.put(server, new LinkedList<>());
}
//3. if virtual node number bigger than 0, add virtual node
if(virtualNodeNum > 0){
for(String server : nodes){
addVirtualNode(server);
}
}
}

@Override
public boolean putData(List<String> data) {
//1. circulate call put data method to add data to loop
for(String incomingData : data){
if(!putData(incomingData)){
return false;
}
}
return true;
}

@Override
public boolean putData(String data) {
if(hashToNodes.isEmpty()){
log.error("put data, usable server is empty");
return false;
}
//1. calculate data's hash value
int currentHash = getHash(data);
//2. get usual node(node's hash value is bigger than data's hash value), if usual node list is empty, get first node in loop
SortedMap<Integer, String> usableNodes = hashToNodes.tailMap(currentHash);
String node = usableNodes.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : usableNodes.get(usableNodes.firstKey());
//3. add data to node
List<String> dataList = nodeToData.get(node);
dataList.add(data);
log.info("put data, data {} is placed to server {}, hash: {}", data, node, currentHash);
return true;
}

@Override
public boolean removeNode(String node) {
//1. calculate hash value of removing node
int removeServerHash = getHash(node);
if(!hashToNodes.containsKey(removeServerHash)){
log.error("remove server, current server is not in server list, please check server ip");
return false;
}
//2. get data from removing node
List<String> removeServerData = nodeToData.get(node);
//3. get removing node's virtual node data, remove all virtual node with removing node
if(virtualNodeNum != 0){
for(String virtualNode : realNodeToVirtualNode.get(node)){
removeServerData.addAll(nodeToData.get(virtualNode));
hashToNodes.remove(getHash(virtualNode));
nodeToData.remove(virtualNode);
}
}
//4. remove node from hash loop
hashToNodes.remove(removeServerHash);
nodeToData.remove(node);
if(hashToNodes.size() == 0){
log.info("remove server, after remove, server list is empty");
return true;
}
//5. put data to loop by call put data method
putData(removeServerData);
log.info("remove server, remove server {} success", node);
return true;
}

@Override
public boolean addNode(String node) {
//1, calculate adding node's hash value
int addServerHash = getHash(node);
//2. add node and migrate data
if(hashToNodes.isEmpty()){
//2.1 add node and its virtual node to loop directly when current loop is empty
hashToNodes.put(addServerHash, node);
nodeToData.put(node, new LinkedList<>());
if(virtualNodeNum > 0){
addVirtualNode(node);
}
} else{
//2.2.1 get data to be migrated from loop
SortedMap<Integer, String> greatServers = hashToNodes.tailMap(addServerHash);
String greatServer = greatServers.isEmpty() ? hashToNodes.get(hashToNodes.firstKey()) : greatServers.get(greatServers.firstKey());
List<String> firstGreatServerData = new LinkedList<>(nodeToData.get(greatServer));
//2.2.2 add node and its virtual node to loop
hashToNodes.put(addServerHash, node);
nodeToData.put(greatServer, new LinkedList<>());
nodeToData.put(node, new LinkedList<>());
if(virtualNodeNum != 0){
addVirtualNode(node);
}
//2.2.3 migrate 2.2.1 data to loop by call put data method
putData(firstGreatServerData);
}
log.info("add server, server {} has been added", node);
return true;
}

@Override
public void printAllData() {
nodeToData.forEach((server, data) -> log.info("server {} contains data {}", server, data));
}

@Override
public void setHashMethod(HashService hashService) {
if(!hashToNodes.isEmpty()){
throw new UnsupportedOperationException();
}
this.hashService = hashService;
}

private void addVirtualNode(String realNode){
if(virtualNodeNum > 0){
List<String> virtualNodeList = new LinkedList<>();
for(int cnt = 0; cnt < this.virtualNodeNum; cnt++){
//1. generate virtual node name by default format
String virtualNodeName = String.format(virtualNodeFormat, realNode, cnt);
//2. calculate each virtual node's hash value
int virtualNodeHash = getHash(virtualNodeName);
//3. current node already exist in loop, continue
if(hashToNodes.containsKey(virtualNodeHash)){
continue;
}
//4. add node to loop
virtualNodeList.add(virtualNodeName);
hashToNodes.put(virtualNodeHash, virtualNodeName);
nodeToData.put(virtualNodeName, new LinkedList<>());
}
//5. map virtual node to real node
realNodeToVirtualNode.put(realNode, virtualNodeList);
}
}


private int getHash(String data){
return hashService == null ? defaultGetHash(data) : hashService.getHash(data);
}

private int defaultGetHash(String data){
int res = 0;
for(char tempChar : data.toCharArray()){
if(tempChar >= '0' && tempChar <= '9'){
res += tempChar;
}
}
return res;
}
}

测试结果

不带虚拟节点的一致性哈希
测试代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;
import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* @author cartoon
* @date 2020/12/27
*/
public class ConsistentHashingWithoutVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithoutVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before
public void before(){
servers = new String[]{"000", "111", "222", "333", "555"};
consistentHashing = new ConsistentHashingImpl(servers);
data = new String[]{"000", "111", "222", "333", "555"};
}

@Test
public void testConsistentHashing(){
for(String str : data){
Assert.assertTrue(consistentHashing.putData(str));
}
consistentHashing.removeNode("333");
consistentHashing.addNode("444");
consistentHashing.putData("444");
consistentHashing.printAllData();
}
}
测试结果
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
含虚拟节点的一致性哈希测试
测试代码
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
package ConsistentHash;

import org.CommonAlgorithms.ConsistentHash.ConsistentHashing;
import org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl;
import org.junit.Assert;
import org.junit.Before;
import org.junit.Test;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

/**
* @author cartoon
* @date 2021/01/17
*/
public class ConsistentHashingWithVirtualNodeTest {

private static final Logger log = LoggerFactory.getLogger(ConsistentHashingWithVirtualNodeTest.class);

private ConsistentHashing consistentHashing;

private String[] servers;

private String[] data;

@Before
public void before(){
servers = new String[]{"000", "111", "222", "333", "555"};
consistentHashing = new ConsistentHashingImpl(3, servers);
data = new String[]{"000", "111", "222", "333", "555"};
}

@Test
public void testConsistentHashing(){
for(String str : data){
Assert.assertTrue(consistentHashing.putData(str));
}
consistentHashing.removeNode("333");
consistentHashing.addNode("444");
consistentHashing.putData("444");
consistentHashing.putData("555&&0");
consistentHashing.printAllData();
}
}
测试结果
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 000 is placed to server 000, hash: 144
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 111 is placed to server 111, hash: 147
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 222 is placed to server 222, hash: 150
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 333, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 555, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - remove server, remove server 333 success
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555 is placed to server 555, hash: 159
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 333 is placed to server 444, hash: 153
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - add server, server 444 has been added
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 444 is placed to server 444, hash: 156
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - put data, data 555&&0 is placed to server 555&&0, hash: 207
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444&&2 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555&&0 contains data [555&&0]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 000 contains data [000]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111 contains data [111]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222 contains data [222]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&0 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 444 contains data [333, 444]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 555 contains data [555]
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 222&&1 contains data []
[main] INFO org.CommonAlgorithms.ConsistentHash.ConsistentHashingImpl - server 111&&0 contains data []

实现存在的缺陷

1
2
3
1. 哈希算法过于简单,哈希冲突概率较大
2. 真实节点含有虚拟节点的数量不均
3. 节点上线时真实节点与已存在的虚拟节点的顺序冲突尚未解决

后记

本次实现的所有代码已全部上传到 https://github.com/cartoonYu/CommonAlgorithms,项目主要包含一些常用的算法,如排序算法,限流算法的简单实现,欢迎提 issue。

本文首发于cartoon的博客

转载请注明出处:https://cartoonyu.github.io