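Deploying KubeEdge on an ARM Platform

This article records the process of deploying KubeEdge on an armv7l platform.

Docker environment

The ARM system needs a working Docker environment, which requires adding driver support. The process is fairly involved and is not covered here.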

Cross-compilation

Only the edge side needs to be compiled.
Install the cross-compiler:

sudo apt-get install gcc-arm-linux-gnueabihf

Set the environment variables and build:

export GOARCH=arm
export GOOS="linux"
export GOARM=7
export CGO_ENABLED=1
export CC=arm-linux-gnueabihf-gcc
export GO111MODULE=off # disable Go modules
make all WHAT=edgecore

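Note: if the build machine has too little memory, the Go compilation fails with a Killed error. Memory usage observed during the build peaked at about 2 GB, which is also why the official documentation advises against compiling on the target board.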
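Before starting a build, it can help to check how much memory is actually available. The following Go sketch reads /proc/meminfo and compares MemAvailable with the roughly 2 GB peak mentioned above. The helper name parseMemAvailableKB and the threshold constant are illustrative assumptions, not part of KubeEdge.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"strings"
)

// minBuildKB is the ~2 GB peak observed in this article, in kB.
// ASSUMPTION: treat it as a rough threshold, not a guarantee.
const minBuildKB = 2 * 1024 * 1024

// parseMemAvailableKB extracts the MemAvailable value (in kB) from the
// contents of /proc/meminfo.
func parseMemAvailableKB(meminfo string) (int64, error) {
	sc := bufio.NewScanner(strings.NewReader(meminfo))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) >= 2 && fields[0] == "MemAvailable:" {
			return strconv.ParseInt(fields[1], 10, 64)
		}
	}
	return 0, fmt.Errorf("MemAvailable not found")
}

func main() {
	data, err := os.ReadFile("/proc/meminfo")
	if err != nil {
		fmt.Println("cannot read /proc/meminfo:", err)
		return
	}
	kb, err := parseMemAvailableKB(string(data))
	if err != nil {
		fmt.Println(err)
		return
	}
	if kb < minBuildKB {
		fmt.Printf("only %d kB available; the build may be killed\n", kb)
	} else {
		fmt.Printf("%d kB available; this should be enough\n", kb)
	}
}
```

If the check reports too little memory, adding swap or building on a larger host are the usual ways out.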
The non-hf (soft-float) version of the cross-compiler can also be used:

sudo apt-get install gcc-arm-linux-gnueabi
export CC=arm-linux-gnueabi-gcc
# the remaining settings are the same

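Note: ideally the cross-compiler should match the version of the cross-compiler used to build the target board's system. If it does not match, the result must be tested. The target board in this article was built with gcc 8, but binaries built with the cross-compiler packaged by Ubuntu also run on it.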

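Deployment

The deployment process is the same as on x86.
Note, however, that the Docker-related settings need to be changed: the container settings in edge.yaml must be modified. The main point is the node name: wherever it appears (url, node-id, hostname-override, and so on) it must be consistent, and it must differ from the names of the other nodes in the cluster. In addition, podsandbox-image must use the ARM version.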
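The consistency requirement on the node name can be checked mechanically. The Go sketch below compares the three values after they have been read from edge.yaml. It assumes the websocket URL ends in /<project-id>/<node-id>/events; that layout, the helper name checkNodeName and the sample values are assumptions made for illustration and should be verified against the actual configuration file.

```go
package main

import (
	"fmt"
	"net/url"
	"strings"
)

// checkNodeName reports an error when the node name embedded in the
// websocket URL, node-id and hostname-override disagree.
// ASSUMPTION: the URL path ends in /<project-id>/<node-id>/events.
func checkNodeName(wsURL, nodeID, hostnameOverride string) error {
	u, err := url.Parse(wsURL)
	if err != nil {
		return err
	}
	parts := strings.Split(strings.Trim(u.Path, "/"), "/")
	if len(parts) < 3 || parts[len(parts)-1] != "events" {
		return fmt.Errorf("unexpected url path %q", u.Path)
	}
	urlNode := parts[len(parts)-2]
	if urlNode != nodeID || nodeID != hostnameOverride {
		return fmt.Errorf("node names differ: url=%q node-id=%q hostname-override=%q",
			urlNode, nodeID, hostnameOverride)
	}
	return nil
}

func main() {
	// Placeholder values for illustration only.
	err := checkNodeName(
		"wss://192.0.2.10:10000/project-id/edge-node-arm/events",
		"edge-node-arm", "edge-node-arm")
	fmt.Println("consistent:", err == nil)
}
```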

Check from the cloud side:

# kubectl get nodes
NAME STATUS ROLES AGE VERSION
edge-node-arm Ready edge 46h v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292
ubuntu Ready master 7d23h v1.17.0

On the edge side:

# docker ps
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
70fa72886761 nginx "nginx -g 'daemon ..." 23 seconds ago Up 17 seconds k8s_nginx_nginx-deployment-77698bff7d-q5shs_default_236ddc3e-17de-4d61-92bc-4dd86d9dad92_0
4a4013a1a488 kubeedge/pause-arm:3.1 "/pause" 3 minutes ago Up 3 minutes 0.0.0.0:80->80/tcp k8s_POD_nginx-deployment-77698bff7d-q5shs_default_236ddc3e-17de-4d61-92bc-4dd86d9dad92_0

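Problems encountered

Official ARM build

The officially built ARM binary does not run on the board; it crashes with a core dump. This looks like a compatibility problem with the compiler or the versions of the linked libraries, so edgecore has to be cross-compiled locally.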
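One way to narrow down this kind of incompatibility is to look at which dynamic loader a binary asks for. The Go sketch below uses the standard debug/elf package to print the machine type and the PT_INTERP entry of an ELF file. On 32-bit ARM Linux, hard-float binaries normally request /lib/ld-linux-armhf.so.3 and soft-float ones /lib/ld-linux.so.3; whether that is what went wrong here is only a guess. The helper name inspectELF is an assumption for illustration.

```go
package main

import (
	"debug/elf"
	"fmt"
	"io"
	"os"
	"strings"
)

// inspectELF returns the target machine and the dynamic loader path
// (PT_INTERP) of an ELF binary. The loader is empty for static binaries.
func inspectELF(path string) (string, string, error) {
	f, err := elf.Open(path)
	if err != nil {
		return "", "", err
	}
	defer f.Close()

	interp := ""
	for _, p := range f.Progs {
		if p.Type != elf.PT_INTERP {
			continue
		}
		b, err := io.ReadAll(p.Open())
		if err != nil {
			return "", "", err
		}
		// The segment holds a NUL-terminated path.
		interp = strings.TrimRight(string(b), "\x00")
	}
	return f.Machine.String(), interp, nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: inspect <elf-binary>")
		return
	}
	machine, interp, err := inspectELF(os.Args[1])
	if err != nil {
		fmt.Println("not a readable ELF file:", err)
		return
	}
	fmt.Println("machine:", machine)
	fmt.Println("loader: ", interp)
}
```

Running it on both the official binary and a locally built edgecore, then comparing the loader paths with what exists under /lib on the board, shows quickly whether the ABI matches.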

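Cross-compilation

By default GO111MODULE=auto, so the build tries to download dependency packages; some of them cannot be downloaded and the build fails.
With it turned off, the build succeeds. (Open question: in this mode no dependencies appear to be downloaded, yet the build still passes. A likely explanation is that the repository vendors its dependencies, as the vendor/ paths in the stack traces further down suggest, and GOPATH mode uses that directory directly; this has not been confirmed.)

Runtime problems

Build of December 31:
Output on the edge side:

RoundRobin.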
W1231 17:01:33.736283 625 proxy.go:78] [L4 Proxy] create Device is failed : operation not supported

2019-12-31 17:01:33.732 +08:00 INFO core/core.go:23 starting module servicebus
I1231 17:01:33.739747 625 tcp.go:27] start listening at 172.17.0.1:8080
I1231 17:01:34.023391 625 cpu_manager.go:155] [cpumanager] starting with none policy
I1231 17:01:34.025776 625 cpu_manager.go:156] [cpumanager] reconciling every 0s
I1231 17:01:34.027894 625 policy_none.go:42] [cpumanager] none policy: Start
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x14 pc=0x1d1440c]

goroutine 77 [running]:
github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).enforceNodeAllocatableCgroups(0x3cec3c0, 0x6, 0x0)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/node_container_manager_linux.go:78 +0x144
github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).setupNode(0x3cec3c0, 0x40905a0, 0x0, 0x11)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/container_manager_linux.go:437 +0xa8
github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm.(*containerManagerImpl).Start(0x3cec3c0, 0x0, 0x40905a0, 0x2899eb0, 0x3b2daa0, 0xa539cd58, 0x3e77800, 0x28e30a0, 0x3dc0fe0, 0x0, ...)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/k8s.io/kubernetes/pkg/kubelet/cm/container_manager_linux.go:588 +0xd4
github.com/kubeedge/kubeedge/edge/pkg/edged.(*edged).initializeModules(0x3caa000, 0x3e77800, 0x3d48100)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edge/pkg/edged/edged.go:639 +0xf4
github.com/kubeedge/kubeedge/edge/pkg/edged.(*edged).Start(0x3caa000, 0x3e61380)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edge/pkg/edged/edged.go:294 +0x35c
created by github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core.StartModules
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core/core.go:22 +0x124

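Cause and resolution:
Suspected to be a permissions problem, although edgecore was already running as root. Unresolved at this point.

Building from the master branch as of January 6 gives the following error: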

[L4 Proxy] Add ip is failed : [L4 Proxy] Device edge0 is not exist!! please checkout the env
panic: runtime error: index out of range

goroutine 119 [running]:
github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.addServer(0x3f61120, 0x14, 0x427e300, 0x2, 0x2)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:388 +0x8f4
github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.updateServer(0x3f61120, 0x14, 0x427e300, 0x2, 0x2, 0x427e308, 0x0)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:443 +0x508
github.com/kubeedge/kubeedge/edgemesh/pkg/proxy.MsgProcess(0x3e96ba0, 0x24, 0x0, 0x0, 0x7a370c50, 0x16f, 0x0, 0x3ef3b50, 0xe, 0x3ef3b60, ...)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/proxy/proxy.go:359 +0x520
github.com/kubeedge/kubeedge/edgemesh/pkg.(*EdgeMesh).Start(0x3b0f934)
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/edgemesh/pkg/module.go:51 +0x190
created by github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core.StartModules
/home/ubuntu/kubeedge/src/github.com/kubeedge/kubeedge/vendor/github.com/kubeedge/beehive/pkg/core/core.go:23 +0x11c

Cause and resolution:
On an Ubuntu edge node, searching for the edge0 device gives:

# find / -name "edge0"
/sys/devices/virtual/net/edge0
/sys/class/net/edge0

The guess is that the error comes from failing to create the edge0 device.
The out-of-range panic is in addServer in proxy.go:

	} else {
		if len(ports) == 0 {
			return
		}
		if len(unused) == 0 {
			expandIpPool()
		}
		ip = unused[0]
		unused = unused[1:]
	}

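The panic occurs because unused still has length 0 when it is indexed, so unused[0] is out of range. The fix is to take an element only when the length is non-zero.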
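The guard can be sketched in isolation as follows. This is not the actual KubeEdge patch: the ipPool type, its expand callback and the error value are hypothetical stand-ins for the package-level state in proxy.go, used only to show the length check after the expansion attempt.

```go
package main

import (
	"errors"
	"fmt"
)

// ipPool is a hypothetical stand-in for the proxy's pool of unused IPs.
type ipPool struct {
	unused []string
	// expand is hypothetical: it returns newly allocated IPs, possibly none
	// (for example when the edge0 device could not be created).
	expand func() []string
}

var errNoIP = errors.New("no unused IP available")

// take returns the next unused IP. It tries to expand the pool once when
// it is empty, and returns an error instead of indexing an empty slice.
func (p *ipPool) take() (string, error) {
	if len(p.unused) == 0 && p.expand != nil {
		p.unused = append(p.unused, p.expand()...)
	}
	if len(p.unused) == 0 {
		return "", errNoIP
	}
	ip := p.unused[0]
	p.unused = p.unused[1:]
	return ip, nil
}

func main() {
	// Expansion fails: the caller gets an error rather than a panic.
	failing := &ipPool{expand: func() []string { return nil }}
	if _, err := failing.take(); err != nil {
		fmt.Println("skipped:", err)
	}

	// Normal case: the first unused address is handed out.
	ok := &ipPool{unused: []string{"192.0.2.1"}}
	ip, _ := ok.take()
	fmt.Println("allocated:", ip)
}
```

Returning an error keeps the caller in control; the check only hides the panic and does not address why the pool could not be expanded.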
After this change the out-of-range panic is gone, but errors remain:

I0106 22:44:28.682395    3562 generic.go:81] GenericLifecycle: Relisting
W0106 22:44:29.026038 3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
I0106 22:44:29.689864 3562 generic.go:81] GenericLifecycle: Relisting
I0106 22:44:30.697860 3562 generic.go:81] GenericLifecycle: Relisting
W0106 22:44:31.068954 3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
I0106 22:44:31.705060 3562 generic.go:81] GenericLifecycle: Relisting
I0106 22:44:32.724408 3562 generic.go:81] GenericLifecycle: Relisting
W0106 22:44:33.117475 3562 proxy.go:76] [L4 Proxy] create Device is failed : operation not supported
I0106 22:44:33.307057 3562 communicate.go:148] CheckConfirm

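The cause is the same as before: a permissions problem that prevents the virtual network device edge0 from being created.
The node nevertheless shows as Ready, so something else must still be wrong: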

# kubectl get nodes
NAME STATUS ROLES AGE VERSION
edge-node Ready edge 6d5h v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292
edge-node-arm Ready edge 6h6m v1.15.3-kubeedge-v1.1.0-beta.0.323+52dd841358b292-dirty
ubuntu Ready master 6d7h v1.17.0

Log analysis:

cni.go:213] Unable to update cni config: No networks found in /etc/cni/net.d

hostport_manager.go:68] The binary conntrack is not installed, this can cause failures in network connection cleanup.
--> http://conntrack-tools.netfilter.org/downloads.html

container_manager_linux.go:295] Creating device plugin manager: false
cpu_manager.go:135] [cpumanager] Unknown policy "", falling back to default policy "none"

csi_plugin.go:222] kubernetes.io/csi: kubeclient not set, assuming standalone kubelet

plugin_watcher.go:81] failed to traverse deprecated plugin socket path "/var/lib/edged/plugins", err: error accessing path: /var/lib/edged/plugins error: lstat /var/lib/edged/plugins: no such file or directory

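Analysis of the vishvananda source code:
/dev/net/tun does not exist on the board, whereas on Ubuntu it does: crw-rw-rw- 1 root root 10, 200 Jan 6 16:25 /dev/net/tun
Cause: CONFIG_TUN is not enabled in the board's kernel configuration. The relevant kernel configuration menu: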

Device Drivers  --->   
[*] Network device support --->
--- Network device support
[*] Network core driver support
<M> Bonding driver support
<M> Dummy net driver support
<M> EQL (serial line load balancing) support
<M> Ethernet team driver support ---> // note: everything under it is set to M
<M> MAC-VLAN support
<M> MAC-VLAN based tap driver
<M> Virtual eXtensible Local Area Network (VXLAN)
<M> Generic Network Virtualization Encapsulation
<M> GPRS Tunneling Protocol datapath (GTP-U)
<M> IEEE 802.1AE MAC-level encryption (MACsec)
<M> Network console logging support
[*] Dynamic reconfiguration of logging targets
<*> Universal TUN/TAP device driver support // !! this is CONFIG_TUN
[ ] Support for cross-endian vnet headers on little-endian kernels
<*> Virtual ethernet pair device
<M> Virtual netlink monitoring device
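
After rebuilding the kernel with CONFIG_TUN enabled, a quick check on the board can confirm that the device node is there. A minimal Go sketch, assuming the standard /dev/net/tun path; checkTun is a hypothetical helper, and the presence of the node does not by itself prove that edgecore can create edge0.

```go
package main

import (
	"fmt"
	"os"
)

// checkTun verifies that path exists and is a character device,
// as /dev/net/tun is on a kernel with TUN/TAP support available.
func checkTun(path string) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("%s: %w", path, err)
	}
	if info.Mode()&os.ModeCharDevice == 0 {
		return fmt.Errorf("%s exists but is not a character device", path)
	}
	return nil
}

func main() {
	if err := checkTun("/dev/net/tun"); err != nil {
		fmt.Println("TUN/TAP unavailable:", err)
		return
	}
	fmt.Println("TUN/TAP device node present")
}
```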
