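Notes on nginx load balancing experiments

These are notes from a set of nginx load balancing experiments.

Overview

The forwarding program was finished earlier, and I have also tried out a few load balancing algorithms. This post runs some simple tests on nginx's load balancing. Some of the experiments are there to settle questions that came up while talking with colleagues.

Program

The program used here is the forwarding program I wrote earlier. In practice, any program that can respond to POST requests will do.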

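Environment

The test environment is as follows:
A Linux virtual machine runs the containers.
A Windows virtual machine sends the POST requests.

The image centos/nginx-116-centos7 is used for testing.
The container is started with: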

# $PWD/bin holds the backend service binaries, so mount it into the container
docker run -itd --name nginx -p 8080:8080 -v $PWD/bin:/home/latelee/bin centos/nginx-116-centos7 nginx -g "daemon off;"

Configuring nginx requires root privileges, so enter the container with:

sudo docker exec -u root -it nginx bash

The backend services are started with:

/home/latelee/bin/httpforward_back.exe -p 9001 -i "hello in 9001"
/home/latelee/bin/httpforward_back.exe -p 9002 -i "hello in 9002"
/home/latelee/bin/httpforward_back.exe -p 9003 -i "hello in 9003"
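
For readers who do not have httpforward_back.exe, a stand-in backend is easy to sketch, since any program that answers POST requests will do. The following is a minimal sketch using only Python's standard library. The names make_handler and serve are made up for this illustration, the reply text simply mirrors the -i argument above, and this is not the author's program.

```python
import threading
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer


def make_handler(message):
    """Build a handler class that answers every POST with `message`."""

    class Handler(BaseHTTPRequestHandler):
        def do_POST(self):
            # Drain the request body so the client can finish sending it.
            length = int(self.headers.get("Content-Length", 0))
            self.rfile.read(length)
            body = message.encode("utf-8")
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, fmt, *args):
            # Keep the console quiet; nginx's access log is what we watch.
            pass

    return Handler


def serve(port, message):
    """Start a backend on 127.0.0.1:port in a background thread."""
    server = ThreadingHTTPServer(("127.0.0.1", port), make_handler(message))
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

Calling serve(9001, "hello in 9001") and serve(9002, "hello in 9002") from one script, and then keeping that script alive (for example with threading.Event().wait()), would stand in for two of the commands above.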

nginx is told to reload its configuration with:

nginx -s reload

The nginx configuration file is as follows:

# Quote the heredoc delimiter so the shell leaves nginx variables such as $remote_addr alone
cat > /etc/nginx/nginx.conf <<'EOF'
worker_processes auto;
error_log /var/opt/rh/rh-nginx116/log/nginx/error.log;
pid /var/opt/rh/rh-nginx116/run/nginx/nginx.pid;

# Load dynamic modules. See /opt/rh/rh-nginx116/root/usr/share/doc/README.dynamic.
include /opt/rh/rh-nginx116/root/usr/share/nginx/modules/*.conf;

events {
    worker_connections 1024;
}

http {
    log_format main '[$time_local] $remote_addr: "$request" '
                    '$status "$http_referer" '
                    '"$http_user_agent" [$upstream_addr $upstream_status $upstream_response_time ms $request_time ms]';

    access_log /var/opt/rh/rh-nginx116/log/nginx/access.log main;

    sendfile on;
    tcp_nopush on;
    tcp_nodelay on;
    keepalive_timeout 65;
    types_hash_max_size 2048;
    proxy_connect_timeout 10;

    include /etc/opt/rh/rh-nginx116/nginx/mime.types;
    default_type application/octet-stream;

    # Load modular configuration files from the /etc/nginx/conf.d directory.
    # See http://nginx.org/en/docs/ngx_core_module.html#include
    # for more information.
    include /opt/app-root/etc/nginx.d/*.conf;

    server {
        listen 8080 default_server;
        listen [::]:8080 default_server;
        server_name _;
        root /opt/app-root/src;

        # Load configuration files for the default server block.
        include /opt/app-root/etc/nginx.default.d/*.conf;

        location / {
            proxy_pass http://foobar;
            proxy_set_header Host $proxy_host;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
        }

        location /fee/sleep {
            proxy_pass http://foobar;
        }
    }

    upstream foobar {
        server 127.0.0.1:9001;
        server 127.0.0.1:9002;
    }

    #upstream foobar {
    #    server 127.0.0.1:9001 weight=3;
    #    server 127.0.0.1:9002 weight=1;
    #}

    #upstream foobar {
    #    ip_hash;
    #    server 127.0.0.1:9001;
    #    server 127.0.0.1:9002;
    #    server 127.0.0.1:9003;
    #}
}
EOF

This configuration mainly sets the IP addresses and ports of the upstream group foobar, and then maps specific URLs to it, as follows:

location / {
    proxy_pass http://foobar;
}

location /fee/sleep {
    proxy_pass http://foobar;
}

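The experiments below mostly modify the upstream foobar block in order to try out different algorithms.

Experiments

Running the experiments takes several terminals inside the container: one to edit the configuration and reload nginx, one to run the backend programs, one to watch the log, and so on.
To watch the access log: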

tail -f /var/opt/rh/rh-nginx116/log/nginx/access.log

In another terminal (on Windows or in the virtual machine), send a request with:

curl http://192.168.28.11:8080/ -X POST -F  "file=@sample.json"

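The configuration, the resulting log and the observed behavior are given below for each experiment.

Basic experiments

Default round robin

Configuration: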

upstream foobar {
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
}

Log:

[21/Sep/2021:04:47:26 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.001 ms]
[21/Sep/2021:04:47:27 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]
[21/Sep/2021:04:47:29 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.002 ms 0.003 ms]
[21/Sep/2021:04:47:31 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:04:47:34 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:04:47:35 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]

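Conclusion: 9001 and 9002 take turns, each serving three of the six requests. The order is not strictly alternating, possibly because worker_processes auto starts several worker processes and each one keeps its own round-robin position.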

Weighted round robin

Configuration:

upstream foobar {
    server 127.0.0.1:9001 weight=4;
    server 127.0.0.1:9002 weight=1;
}

Log:

[21/Sep/2021:04:52:17 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.000 ms 0.000 ms]
[21/Sep/2021:04:52:18 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]
[21/Sep/2021:04:52:18 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.000 ms 0.000 ms]
[21/Sep/2021:04:52:19 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]
[21/Sep/2021:04:52:20 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]

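Conclusion: 9001 appears 4 times and 9002 appears once.
Note: testing shows that nginx's weighted round robin is itself a smooth weighted round robin. The weight gap was deliberately widened here for the sake of demonstration.

Smooth weighted round robin

Configuration: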

upstream foobar {
    server 127.0.0.1:9001 weight=2;
    server 127.0.0.1:9002 weight=5;
    server 127.0.0.1:9003 weight=3;
}

Log:

Round 1:
[21/Sep/2021:05:00:39 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:00:39 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:00:40 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9003 200 0.001 ms 0.000 ms]
[21/Sep/2021:05:00:41 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9003 200 0.000 ms 0.001 ms]
[21/Sep/2021:05:00:42 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.008 ms 0.009 ms]
[21/Sep/2021:05:00:43 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:00:44 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:00:44 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.000 ms 0.001 ms]
[21/Sep/2021:05:00:45 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.001 ms]
[21/Sep/2021:05:00:46 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9003 200 0.001 ms 0.001 ms]

Round 2:
[21/Sep/2021:05:03:24 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:03:25 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:03:26 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.000 ms]
[21/Sep/2021:05:03:27 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9003 200 0.000 ms 0.000 ms]
[21/Sep/2021:05:03:28 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.000 ms]
[21/Sep/2021:05:03:29 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:03:30 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9003 200 0.002 ms 0.001 ms]
[21/Sep/2021:05:03:31 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]
[21/Sep/2021:05:03:31 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.000 ms]
[21/Sep/2021:05:03:32 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]

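Conclusion: in round 1, 9001, 9002 and 9003 appear 2, 5 and 3 times, matching the configured weights. The 10 requests of round 2 give 2, 6 and 2, so an arbitrary window of 10 requests does not always line up exactly with the weights. In terms of order, the heavily weighted server is not bunched together, and the three servers are interleaved fairly evenly. Comparing the two rounds, the order in which each server appears differs from one round to the next.

Compared with the smooth algorithm I implemented earlier (1, 2 and 3 stand for ports 9001, 9002 and 9003):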

Self-implemented:
2 3 1 2 2 3 2 1 3 2
nginx:
2 2 3 3 1 2 2 1 2 3
2 1 2 3 2 2 3 1 2 2

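As can be seen, the two still differ. One possible reason is that each nginx worker process keeps its own round-robin state, so requests handled by different workers interleave several independent sequences.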
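
For reference, the smooth weighted round robin as it is commonly described can be sketched in a few lines. This is a minimal sketch, not the author's implementation (which is not shown in this post) and not nginx's source. The function name is made up for this illustration, and ties are assumed to go to the server listed first. Under those assumptions it happens to reproduce the self-implemented sequence listed above.

```python
def smooth_weighted_round_robin(weights, count):
    """Return `count` picks from `weights`, a list of (name, weight) pairs."""
    total = sum(weight for _, weight in weights)
    current = [0] * len(weights)
    picks = []
    for _ in range(count):
        # Every server gains its configured weight each round.
        for i, (_name, weight) in enumerate(weights):
            current[i] += weight
        # Pick the server with the largest current weight (first one on ties).
        best = max(range(len(weights)), key=lambda i: current[i])
        # The winner pays back the total weight, which spreads its turns out.
        current[best] -= total
        picks.append(weights[best][0])
    return picks


servers = [(9001, 2), (9002, 5), (9003, 3)]
print(smooth_weighted_round_robin(servers, 10))
# prints [9002, 9003, 9001, 9002, 9002, 9003, 9002, 9001, 9003, 9002]
```

After 10 picks every current weight is back at zero, so the same sequence repeats. A single process running this algorithm would therefore always produce the same order, which is one reason to suspect multiple workers when the order observed through nginx varies.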

ip_hash

Configuration:

upstream foobar {
    ip_hash;
    server 127.0.0.1:9001;
    server 127.0.0.1:9002;
    server 127.0.0.1:9003;
}

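To simulate requests from different IPs, POST requests were sent from the virtual machine, the physical machine and other containers while watching the log: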

[21/Sep/2021:06:59:55 +0000] 127.0.0.1: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9003 200 0.002 ms 0.042 ms]
[21/Sep/2021:06:59:55 +0000] 127.0.0.1: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9003 200 0.001 ms 0.041 ms]
[21/Sep/2021:06:59:56 +0000] 127.0.0.1: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9003 200 0.001 ms 0.041 ms]
[21/Sep/2021:07:02:20 +0000] 172.17.0.1: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.001 ms 0.041 ms]
[21/Sep/2021:07:02:23 +0000] 172.17.0.1: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.001 ms 0.044 ms]
[21/Sep/2021:07:03:20 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:07:03:21 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:07:03:22 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.000 ms]
[21/Sep/2021:07:03:22 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" "curl/7.73.0" [127.0.0.1:9002 200 0.001 ms 0.001 ms]
[21/Sep/2021:07:03:54 +0000] 172.17.0.3: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.000 ms 0.042 ms]
[21/Sep/2021:07:04:07 +0000] 172.17.0.3: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.002 ms 0.042 ms]
[21/Sep/2021:07:04:08 +0000] 172.17.0.3: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.002 ms 0.044 ms]
[21/Sep/2021:07:04:42 +0000] 192.168.28.11: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.001 ms 0.042 ms]
[21/Sep/2021:07:04:49 +0000] 192.168.28.11: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.001 ms 0.042 ms]
[21/Sep/2021:07:04:50 +0000] 192.168.28.11: "POST / HTTP/1.1" 200 "-" "curl/7.29.0" [127.0.0.1:9002 200 0.002 ms 0.043 ms]

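Conclusion: 127.0.0.1 is the local IP of the container running nginx, 172.17.0.3 is another container, 192.168.28.5 is the physical machine and 192.168.28.11 is the virtual machine. In the log, each source IP is always served by the same backend port. It was not obvious at first why the 9001 backend was never selected. A likely explanation is that ip_hash keys on only the first three octets of an IPv4 address, so these clients collapse into just three keys (127.0.0, 172.17.0 and 192.168.28), and none of the three happens to map to 9001.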
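
The server selection can be sketched as follows. This is a simplified sketch based on my recollection of nginx's ip_hash module: the constants 89, 113 and 6271 are recalled from that source and should be treated as an assumption, the function name is made up, and the retry logic for failed servers is left out. With equal weights it reproduces the mapping seen in the log above.

```python
def ip_hash_pick(client_ip, servers):
    """Pick a server for an IPv4 client; `servers` is a list of (name, weight)."""
    # Only the first three octets take part, so a whole /24 shares one key.
    octets = [int(part) for part in client_ip.split(".")[:3]]
    # Constants as recalled from nginx's ip_hash module (an assumption here).
    value = 89
    for octet in octets:
        value = (value * 113 + octet) % 6271
    # Walk the server list by weight to turn the hash into a choice.
    remainder = value % sum(weight for _, weight in servers)
    for name, weight in servers:
        if remainder < weight:
            return name
        remainder -= weight


servers = [(9001, 1), (9002, 1), (9003, 1)]
for ip in ["127.0.0.1", "172.17.0.1", "172.17.0.3", "192.168.28.5", "192.168.28.11"]:
    print(ip, "->", ip_hash_pick(ip, servers))
```

Under these assumptions 127.0.0.1 maps to 9003 and the other four addresses map to 9002. A client from a different /24 network would be needed to see 9001 selected.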

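Custom experiments

Below are some experiments I had long wanted to run.

Backend services not started: returns 502

Scenario: none of the backend services are running, but they are listed in the nginx configuration.
nginx access log: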

[21/Sep/2021:06:43:48 +0000] 192.168.28.5: "POST / HTTP/1.1" 502 "-" "curl/7.73.0" [127.0.0.1:9003, 127.0.0.1:9002, 127.0.0.1:9001 502, 502, 502 0.000, 0.000, 0.001 ms 0.001 ms]

curl output:

$ curl http://192.168.28.11:8080/ -X POST -F  "file=@sample.json"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 493 100 157 100 336 78500 164k --:--:-- --:--:-- --:--:-- 481k<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>

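Conclusion: 502 is returned. Note that the access log shows a 502 for every backend, which suggests that nginx tried each of them in turn before giving up.

Service stops abruptly mid-request: returns 502, and that request fails

Scenario: a service suddenly stops while handling a request (for example because of a segmentation fault or a power failure).
To simulate this, a dedicated sleep request was implemented that delays for 4 seconds, leaving time to stop the service.
nginx access log: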

[21/Sep/2021:06:49:18 +0000] 192.168.28.5: "POST /fee/sleep HTTP/1.1" 502 "-" "curl/7.73.0" [127.0.0.1:9001 502 3.035 ms 3.035 ms]

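As shown, handling the request took a little over 3 seconds (the logged values are in seconds, despite the ms label in the log format), because I stopped the backend service roughly 3 seconds in.

curl output: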

$ curl http://192.168.28.11:8080/fee/sleep -X POST -F  "file=@sample.json"
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 493 100 157 100 336 40 85 0:00:03 0:00:03 --:--:-- 125<html>
<head><title>502 Bad Gateway</title></head>
<body>
<center><h1>502 Bad Gateway</h1></center>
<hr><center>nginx/1.16.1</center>
</body>
</html>

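Conclusion: from the client's point of view, the response is the same as in the previous experiment. The access log, however, shows a 502 for 9001 only. The other backends are still running, so the next request succeeds: nginx picks a server that is still working.

Reconfiguring nginx while a request is in progress: nginx waits for the request to finish

Scenario: with several backend services, stop and upgrade some of them, start them again, then upgrade the rest.
First widen the weight gap between two servers, e.g. 10 for 9001 and 1 for 9002, so that most requests are forwarded to port 9001. While a request is being processed, edit the nginx configuration to remove the 9001 server and reload nginx, then observe.

Conclusion: nginx waits for the 9001 service to finish the request it is handling, and later requests are no longer forwarded to it. A request that is already being processed is therefore allowed to complete.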

Setting response timeouts

Configuration:

location / {
    proxy_pass http://foobar;
    proxy_set_header Host $proxy_host;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_connect_timeout 1;
    proxy_read_timeout 1;
    proxy_send_timeout 1;
}

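Note: I do not know how to test this. The backend is built on the gin framework and has handler functions for its routes; once a request enters a handler it seems to count as responded, and reusing the earlier sleep request did not work either. One thing worth checking: these timeouts are set inside location /, while requests to /fee/sleep are handled by the separate location /fee/sleep block, which does not pick up directives from a sibling location.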
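
One way to exercise these timeouts (a suggestion, not something tried in this post) would be a backend that accepts the connection but never answers. With proxy_read_timeout set to 1 in the location that handles the request, nginx would be expected to give up after about a second and return 504. The sketch below uses only Python's standard library; the function name and the hold_seconds parameter are made up for this illustration.

```python
import socket
import threading


def start_stalled_backend(port=0, hold_seconds=5.0):
    """Accept connections on 127.0.0.1 and stay silent for `hold_seconds`."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listener.bind(("127.0.0.1", port))
    listener.listen()

    def hold(conn):
        # Never write a byte; just keep the connection open, then drop it.
        threading.Event().wait(hold_seconds)
        conn.close()

    def accept_loop():
        while True:
            try:
                conn, _addr = listener.accept()
            except OSError:
                return  # the listener was closed
            threading.Thread(target=hold, args=(conn,), daemon=True).start()

    threading.Thread(target=accept_loop, daemon=True).start()
    return listener
```

Running start_stalled_backend(9001) in place of the real 9001 service, in a script that is kept alive, and then sending the usual curl request should show whether the timeout takes effect in the access log.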

Things learned

A URL suffix cannot be appended to a server address inside upstream. Otherwise nginx reports:

nginx: [emerg] invalid host in upstream "127.0.0.1:9001/foobar" in /etc/opt/rh/rh-nginx116/nginx/nginx.conf:54

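The URL can instead be added to the location, e.g. location /foobar.

To make the request log easier to read, a custom nginx log format is needed. This post uses the format below. Note that $upstream_response_time and $request_time are reported in seconds (with millisecond resolution), so the ms label in this format is misleading: a value such as 3.035 means about 3 seconds.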

log_format  main  '[$time_local] $remote_addr: "$request" '
'$status "$http_referer" '
'"$http_user_agent" [$upstream_addr $upstream_status $upstream_response_time ms $request_time ms]';
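
With this format, counting which upstream served each request can be automated instead of read off by eye. The following is a minimal sketch that assumes exactly the log format above, including lines where nginx tried several upstreams and wrote comma-separated lists. The names TAIL, parse_line and tally are made up for this illustration.

```python
import re
from collections import Counter

# Matches the bracketed tail written by the log format above, e.g.
# [127.0.0.1:9002 200 0.001 ms 0.001 ms]
TAIL = re.compile(
    r"\[(?P<addrs>[\d.:]+(?:, [\d.:]+)*) "
    r"(?P<statuses>\d{3}(?:, \d{3})*) "
    r"(?P<times>[\d.]+(?:, [\d.]+)*) ms "
    r"(?P<request_time>[\d.]+) ms\]$"
)


def parse_line(line):
    """Return the (address, status) attempts recorded on one log line."""
    match = TAIL.search(line.strip())
    if match is None:
        return []
    addrs = match["addrs"].split(", ")
    statuses = match["statuses"].split(", ")
    return list(zip(addrs, statuses))


def tally(lines):
    """Count how many times each upstream address was tried."""
    counts = Counter()
    for line in lines:
        for addr, _status in parse_line(line):
            counts[addr] += 1
    return counts


sample = [
    '[21/Sep/2021:04:47:26 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" '
    '"curl/7.73.0" [127.0.0.1:9002 200 0.000 ms 0.001 ms]',
    '[21/Sep/2021:04:47:27 +0000] 192.168.28.5: "POST / HTTP/1.1" 200 "-" '
    '"curl/7.73.0" [127.0.0.1:9001 200 0.001 ms 0.001 ms]',
]
print(tally(sample))
```

Passing an open access log file to tally in place of sample would give the per-backend counts for a whole experiment.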

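Memo

Can nginx be used to block clients from accessing the real web servers directly? There does not seem to be a relevant solution online yet.

Summary

A while back, while watching videos on distributed systems, I came across load balancing, the snowflake algorithm, consistent hashing and so on, which was a real eye-opener. With the Mid-Autumn Festival holiday here and no way to travel, I spent the evenings digging in, going from my own content-based request forwarding tool to nginx's load balancing algorithms, and have now been through the basics once. There are no plans yet for the other topics.

Early morning, 2021.9.23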