WorkingTipsOnLXDMigration

Environment

Two virtual machines are used to simulate LXD container migration after a physical machine shuts down, in a two-node scenario (extensible to multiple nodes) that mirrors a real environment.

Machine configuration (using mig2 as an example):

root@mig2:/home/test# cat /etc/issue
Ubuntu 20.04.2 LTS \n \l
root@mig2:/home/test# free -g
              total        used        free      shared  buff/cache   available
Mem:              9           0           8           0           0           9
Swap:             0           0           0
root@mig2:/home/test# lxd --version
4.0.7

Steps

Initialize LXD on the mig1 node:

root@mig1:/home/test# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this node? [default=192.168.89.11]: 
Are you joining an existing cluster? (yes/no) [default=no]: no
What name should be used to identify this node in the cluster? [default=mig1]: 
Setup password authentication on the cluster? (yes/no) [default=no]: yes
Trust password for new clients: 
Again: 
Do you want to configure a new local storage pool? (yes/no) [default=yes]: 
Name of the storage backend to use (lvm, zfs, btrfs, dir) [default=zfs]: 
Create a new ZFS pool? (yes/no) [default=yes]: 
Would you like to use an existing empty block device (e.g. a disk or partition)? (yes/no) [default=no]: 
Size in GB of the new loop device (1GB minimum) [default=30GB]: 
Do you want to configure a new remote storage pool? (yes/no) [default=no]: 
Would you like to connect to a MAAS server? (yes/no) [default=no]: 
Would you like to configure LXD to use an existing bridge or host interface? (yes/no) [default=no]: 
Would you like to create a new Fan overlay network? (yes/no) [default=yes]: 
What subnet should be used as the Fan underlay? [default=auto]: 
Would you like stale cached images to be updated automatically? (yes/no) [default=yes] 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]:
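
For a repeatable setup, the same answers can be fed to lxd init non-interactively through a preseed file. Below is a rough sketch only; the authoritative version is the YAML that lxd init offers to print in the last prompt above, and the file name plus the exact keys/values here merely mirror the interactive answers:

$ cat > lxd-preseed.yaml <<'EOF'
# approximate preseed for the mig1 answers above (default profile section omitted)
config:
  core.https_address: 192.168.89.11:8443
  core.trust_password: <trust-password>
cluster:
  server_name: mig1
  enabled: true
storage_pools:
- name: local
  driver: zfs
  config:
    size: 30GB
networks:
- name: lxdfan0
  type: bridge
  config:
    bridge.mode: fan
    fan.underlay_subnet: auto
EOF
$ lxd init --preseed < lxd-preseed.yaml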

Join the mig2 node to the cluster:

root@mig2:/home/test# lxd init
Would you like to use LXD clustering? (yes/no) [default=no]: yes
What IP address or DNS name should be used to reach this node? [default=192.168.89.12]: 
Are you joining an existing cluster? (yes/no) [default=no]: yes
Do you have a join token? (yes/no) [default=no]: no
What name should be used to identify this node in the cluster? [default=mig2]: 
IP address or FQDN of an existing cluster node: 192.168.89.11
Cluster fingerprint: 75ee6a1962985e0262d6bea9f95d554f197719cca19820671856280fe0d2e28b
You can validate this fingerprint by running "lxc info" locally on an existing node.
Is this the correct fingerprint? (yes/no) [default=no]: yes
Cluster trust password: 
All existing data is lost when joining a cluster, continue? (yes/no) [default=no] yes
Choose "zfs.pool_name" property for storage pool "local": 
Choose "size" property for storage pool "local": 30GB
Choose "source" property for storage pool "local": 
Would you like a YAML "lxd init" preseed to be printed? (yes/no) [default=no]: 
root@mig2:/home/test# 

After the cluster is created successfully, check the cluster status:

root@mig2:/home/test# lxc cluster list
To start your first instance, try: lxc launch ubuntu:18.04

+------+----------------------------+----------+--------+-------------------+--------------+
| NAME |            URL             | DATABASE | STATE  |      MESSAGE      | ARCHITECTURE |
+------+----------------------------+----------+--------+-------------------+--------------+
| mig1 | https://192.168.89.11:8443 | YES      | ONLINE | Fully operational | x86_64       |
+------+----------------------------+----------+--------+-------------------+--------------+
| mig2 | https://192.168.89.12:8443 | YES      | ONLINE | Fully operational | x86_64       |
+------+----------------------------+----------+--------+-------------------+--------------+
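
With the cluster formed, the migration scenario from the title can be exercised: place an instance on one member and move it to the other. A minimal sketch, assuming a test instance named c1 (the instance name and image are placeholders):

$ lxc launch ubuntu:20.04 c1 --target mig1
$ lxc stop c1
$ lxc move c1 --target mig2
$ lxc start c1
$ lxc ls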

WorkingTipsOnSway

Install sway via:

$ sudo pacman -S sway alacritty

Log out and type sway in a terminal; here we encounter the Nvidia issue:

$ sway
sway/main.c:100: Proprietary Nvidia drivers are NOT supported. Use Nouveau. To launch ....
$ sway --my-next-gpu-wont-be-nvidia
Could not connect to socket /run/seatd.sock: No such file or directory
$ echo LIBSEAT_BACKEND=logind | sudo tee -a /etc/environment
$ sway --my-next-gpu-wont-be-nvidia 2>&1 | tee start.log
sway/main.c:202: Unable to drop root
$ sudo useradd -m dash1
$ sudo passwd dash1
$ exit
Log in with the newly created dash1 user and re-test.

Install gdm and switch to the open-source Nvidia driver (nouveau):

$ sudo pacman -S gdm
$ sudo rm -f /etc/modprobe.d/nouveau_blacklist.conf
$ sudo vim /etc/mkinitcpio.conf
MODULES=(... nouveau ...)

$ sudo mkinitcpio -p linux
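
After regenerating the initramfs, reboot and confirm that nouveau is actually loaded before handing the session over to gdm (plain sanity checks, assuming gdm is meant to manage the sway session):

$ sudo reboot
$ lsmod | grep nouveau
$ sudo systemctl enable --now gdm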

WorkingTipsOnMultiSeat

steps

default seat:

# loginctl seat-status seat0 > seat0.txt
# cat seat0.txt
seat0
	Sessions: *1
	 Devices:
		  ├─/sys/devices/LNXSYSTM:00/LNXPWRBN:00/input/input1
		  │ input:input1 "Power Button"
		  ├─/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/LNXVIDEO:00/input/input14
		  │ input:input14 "Video Bus"
		  ├─/sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0C:00/input/input0
		  │ input:input0 "Power Button"
		  ├─/sys/devices/pci0000:00/0000:00:02.0/drm/card0
		  │ [MASTER] drm:card0
		  │ ├─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-DP-1
		  │ │ [MASTER] drm:card0-DP-1
		  │ ├─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-HDMI-A-1
		  │ │ [MASTER] drm:card0-HDMI-A-1
		  │ ├─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-HDMI-A-2
		  │ │ [MASTER] drm:card0-HDMI-A-2
		  │ └─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1
		  │   [MASTER] drm:card0-eDP-1
		  │   └─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1/intel_backlight
		  │     backlight:intel_backlight
		  ├─/sys/devices/pci0000:00/0000:00:02.0/graphics/fb0
		  │ graphics:fb0 "i915drmfb"
		  ├─/sys/devices/pci0000:00/0000:00:03.0/sound/card1
		  │ sound:card1 "HDMI"
		  │ ├─/sys/devices/pci0000:00/0000:00:03.0/sound/card1/input16
		  │ │ input:input16 "HDA Intel HDMI HDMI/DP,pcm=3"
		  │ ├─/sys/devices/pci0000:00/0000:00:03.0/sound/card1/input17
		  │ │ input:input17 "HDA Intel HDMI HDMI/DP,pcm=7"
		  │ ├─/sys/devices/pci0000:00/0000:00:03.0/sound/card1/input18
		  │ │ input:input18 "HDA Intel HDMI HDMI/DP,pcm=8"
		  │ ├─/sys/devices/pci0000:00/0000:00:03.0/sound/card1/input19
		  │ │ input:input19 "HDA Intel HDMI HDMI/DP,pcm=9"
		  │ └─/sys/devices/pci0000:00/0000:00:03.0/sound/card1/input20
		  │   input:input20 "HDA Intel HDMI HDMI/DP,pcm=10"
		  ├─/sys/devices/pci0000:00/0000:00:14.0/usb2
		  │ usb:usb2
		  │ └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3
		  │   usb:2-3
		  │   └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15
		  │     input:input15 "Logitech K400 Plus"
		  │     ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15/input15::capslock
		  │     │ leds:input15::capslock
		  │     ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15/input15::compose
		  │     │ leds:input15::compose
		  │     ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15/input15::kana
		  │     │ leds:input15::kana
		  │     ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15/input15::numlock
		  │     │ leds:input15::numlock
		  │     └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3/2-3:1.2/0003:046D:C52B.0003/0003:046D:404D.0004/input/input15/input15::scrolllock
		  │       leds:input15::scrolllock
		  ├─/sys/devices/pci0000:00/0000:00:14.0/usb3
		  │ usb:usb3
		  ├─/sys/devices/pci0000:00/0000:00:1b.0/sound/card0
		  │ sound:card0 "PCH"
		  │ └─/sys/devices/pci0000:00/0000:00:1b.0/sound/card0/input8
		  │   input:input8 "HDA Intel PCH Headphone"
		  ├─/sys/devices/pci0000:00/0000:00:1d.0/usb1
		  │ usb:usb1
		  │ └─/sys/devices/pci0000:00/0000:00:1d.0/usb1/1-1
		  │   usb:1-1
		  └─/sys/devices/platform/pcspkr/input/input7
		    input:input7 "PC Speaker"

Find the corresponding device paths:

/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1
/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3

/sys/devices/pci0000:00/0000:00:14.0/usb2/2-3

/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-HDMI-A-1

Attach:

# loginctl attach seat1  /sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1
# loginctl attach seat1  /sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3
➜  ~ loginctl seat-status seat1
seat1
	 Devices:
		  ├─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1
		  │ [MASTER] drm:card0-eDP-1
		  │ └─/sys/devices/pci0000:00/0000:00:02.0/drm/card0/card0-eDP-1/intel_backlight
		  │   backlight:intel_backlight
		  └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3
		    usb:2-2.3
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10
		    │ input:input10 "G-Tech Fuhlen SM680 Mechanical Keyboard"
		    │ ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10/input10::capslock
		    │ │ leds:input10::capslock
		    │ ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10/input10::compose
		    │ │ leds:input10::compose
		    │ ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10/input10::kana
		    │ │ leds:input10::kana
		    │ ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10/input10::numlock
		    │ │ leds:input10::numlock
		    │ └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.0/0003:1A81:2019.0005/input/input10/input10::scrolllock
		    │   leds:input10::scrolllock
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.1/0003:1A81:2019.0006/input/input11
		    │ input:input11 "G-Tech Fuhlen SM680 Mechanical Keyboard Mouse"
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.1/0003:1A81:2019.0006/input/input12
		    │ input:input12 "G-Tech Fuhlen SM680 Mechanical Keyboard"
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.1/0003:1A81:2019.0006/input/input13
		    │ input:input13 "G-Tech Fuhlen SM680 Mechanical Keyboard Consumer Control"
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.1/0003:1A81:2019.0006/input/input14
		    │ input:input14 "G-Tech Fuhlen SM680 Mechanical Keyboard System Control"
		    ├─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.2/0003:1A81:2019.0007/input/input16
		    │ input:input16 "G-Tech Fuhlen SM680 Mechanical Keyboard"
		    └─/sys/devices/pci0000:00/0000:00:14.0/usb2/2-2/2-2.3/2-2.3:1.3/0003:1A81:2019.0008/input/input17
		      input:input17 "G-Tech Fuhlen SM680 Mechanical Keyboard"

Verify:

➜  ~ ls -l  /etc/udev/rules.d/
total 12
-rw-r--r-- 1 root root  76 Jul  6 06:22 72-seat-drm-pci-0000_00_02_0.rules
-rw-r--r-- 1 root root  86 Jul  6 06:23 72-seat-usb-pci-0000_00_14_0-usb-0_2_3.rules
-rw-r--r-- 1 root root 432 Aug 10  2020 99-kvmd.rules.pacsave

Disable lxdm and test:

# systemctl disable lxdm
Removed /etc/systemd/system/display-manager.service.
# reboot
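
After the reboot, which seat each login session landed on can be checked from another TTY or over SSH with plain loginctl queries (the session id below is a placeholder taken from list-sessions):

# loginctl list-sessions
# loginctl seat-status seat1
# loginctl show-session <session-id> -p Seat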

WorkingTipsOnLXDJuju

Commands

Bootstrap via:

test@edge5:~$ juju bootstrap localhost overlord
Creating Juju controller "overlord" on localhost/localhost
Looking for packaged Juju agent version 2.9.5 for amd64
Located Juju agent version 2.9.5-ubuntu-amd64 at https://streams.canonical.com/juju/tools/agent/2.9.5/juju-2.9.5-ubuntu-amd64.tgz
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on localhost/localhost...
 - juju-55e209-0 (arch=amd64)                 
Installing Juju agent on bootstrap instance
Fetching Juju Dashboard 0.7.1
Waiting for address
Attempting to connect to 10.53.118.136:22
Connected to 10.53.118.136
Running machine configuration script...
Bootstrap agent now started
Contacting Juju controller at 10.53.118.136 to verify accessibility...

Bootstrap complete, controller "overlord" is now available
Controller machines are in the "controller" model
Initial model "default" added

Verify bootstrap status:

test@edge5:~$ lxc ls
+---------------+---------+----------------------+------+-----------+-----------+
|     NAME      |  STATE  |         IPV4         | IPV6 |   TYPE    | SNAPSHOTS |
+---------------+---------+----------------------+------+-----------+-----------+
| juju-55e209-0 | RUNNING | 10.53.118.136 (eth0) |      | CONTAINER | 0         |
+---------------+---------+----------------------+------+-----------+-----------+
test@edge5:~$ juju status
Model    Controller  Cloud/Region         Version  SLA          Timestamp
default  overlord    localhost/localhost  2.9.5    unsupported  15:32:12+08:00

Model "admin/default" is empty.

microk8s

Deploy microk8s via:

$ juju deploy -n3 cs:~pjdc/microk8s
Located charm "microk8s" in charm-store, revision 24
Deploying "microk8s" from charm-store charm "microk8s", revision 24 in channel stable

View juju status:

test@edge5:~$ juju status
Model    Controller  Cloud/Region         Version  SLA          Timestamp
default  overlord    localhost/localhost  2.9.5    unsupported  15:35:54+08:00

App       Version  Status   Scale  Charm     Store       Channel  Rev  OS      Message
microk8s           waiting    0/3  microk8s  charmstore  stable    24  ubuntu  waiting for machine

Unit        Workload  Agent       Machine  Public address  Ports  Message
microk8s/0  waiting   allocating  0        10.53.118.110          waiting for machine
microk8s/1  waiting   allocating  1        10.53.118.99           waiting for machine
microk8s/2  waiting   allocating  2        10.53.118.115          waiting for machine

Machine  State    DNS            Inst id        Series  AZ  Message
0        pending  10.53.118.110  juju-585d2d-0  focal       Running
1        pending  10.53.118.99   juju-585d2d-1  focal       Running
2        pending  10.53.118.115  juju-585d2d-2  focal       Running

Wait until it succeeds:

test@edge5:~$ juju status
Model    Controller  Cloud/Region         Version  SLA          Timestamp
default  overlord    localhost/localhost  2.9.5    unsupported  15:49:48+08:00

App       Version  Status  Scale  Charm     Store       Channel  Rev  OS      Message
microk8s           active      3  microk8s  charmstore  stable    24  ubuntu  

Unit         Workload  Agent  Machine  Public address  Ports                     Message
microk8s/0*  active    idle   0        10.53.118.110   80/tcp,443/tcp,16443/tcp  
microk8s/1   active    idle   1        10.53.118.99    80/tcp,443/tcp,16443/tcp  
microk8s/2   active    idle   2        10.53.118.115   80/tcp,443/tcp,16443/tcp  

Machine  State    DNS            Inst id        Series  AZ  Message
0        started  10.53.118.110  juju-585d2d-0  focal       Running
1        started  10.53.118.99   juju-585d2d-1  focal       Running
2        started  10.53.118.115  juju-585d2d-2  focal       Running

own cloud

IPs and hostnames are listed as follows:

192.168.89.6	edge5
192.168.89.7	edge6
192.168.89.8	edge7
192.168.89.9	edge8
192.168.89.10	edge9

Add SSH access and the manual cloud:

test@edge5:~$ ssh-copy-id test@192.168.89.7
test@edge5:~$ juju add-cloud                                                                                                                                  
This operation can be applied to both a copy on this client and to the one on a controller.
No current controller was detected and there are no registered controllers on this client: either bootstrap one or register one.
Cloud Types
  lxd
  maas
  manual
  openstack
  vsphere

Select cloud type: manual

Enter a name for your manual cloud: manual-cloud

Enter the ssh connection string for controller, username@<hostname or IP> or <hostname or IP>: test@192.168.89.7

Cloud "manual-cloud" successfully added to your local client.

Verify the clouds available:

test@edge5:~$ juju clouds
Only clouds with registered credentials are shown.
There are more clouds, use --all to see them.
You can bootstrap a new controller using one of these clouds...

Clouds available on the client:
Cloud         Regions  Default    Type    Credentials  Source    Description
localhost     1        localhost  lxd     0            built-in  LXD Container Hypervisor
manual-cloud  1        default    manual  0            local 

Now bootstrap the manual-cloud:

$ juju bootstrap manual-cloud

Add machines:

test@edge5:~$ juju add-machine ssh:test@192.168.89.8
created machine 0
test@edge5:~$ juju add-machine ssh:test@192.168.89.9 && juju add-machine ssh:test@192.168.89.10
created machine 1
created machine 2
test@edge5:~$ juju machines
Machine  State    DNS            Inst id               Series  AZ  Message
0        started  192.168.89.8   manual:192.168.89.8   focal       Manually provisioned machine
1        started  192.168.89.9   manual:192.168.89.9   focal       Manually provisioned machine
2        started  192.168.89.10  manual:192.168.89.10  focal       Manually provisioned machine

Deploy microk8s via:

$  juju deploy -n3 cs:~pjdc/microk8s

After deployment, juju status shows:

test@edge5:~$ juju status
Model    Controller            Cloud/Region          Version  SLA          Timestamp
default  manual-cloud-default  manual-cloud/default  2.9.5    unsupported  17:27:04+08:00

App       Version  Status  Scale  Charm     Store       Channel  Rev  OS      Message
microk8s           active      3  microk8s  charmstore  stable    24  ubuntu  

Unit         Workload  Agent  Machine  Public address  Ports                     Message
microk8s/0*  active    idle   0        192.168.89.8    80/tcp,443/tcp,16443/tcp  
microk8s/1   active    idle   1        192.168.89.9    80/tcp,443/tcp,16443/tcp  
microk8s/2   active    idle   2        192.168.89.10   80/tcp,443/tcp,16443/tcp  

Machine  State    DNS            Inst id               Series  AZ  Message
0        started  192.168.89.8   manual:192.168.89.8   focal       Manually provisioned machine
1        started  192.168.89.9   manual:192.168.89.9   focal       Manually provisioned machine
2        started  192.168.89.10  manual:192.168.89.10  focal       Manually provisioned machine

Verify HA:

test@edge5:~$ juju exec --application microk8s -- 'microk8s status | grep -A2 high-availability:'
- return-code: 0
  stdout: |
    high-availability: yes
      datastore master nodes: 192.168.89.8:19001 192.168.89.9:19001 192.168.89.10:19001
      datastore standby nodes: none
  unit: microk8s/0
- return-code: 0
  stdout: |
    high-availability: yes
      datastore master nodes: 192.168.89.8:19001 192.168.89.9:19001 192.168.89.10:19001
      datastore standby nodes: none
  unit: microk8s/1
- return-code: 0
  stdout: |
    high-availability: yes
      datastore master nodes: 192.168.89.8:19001 192.168.89.9:19001 192.168.89.10:19001
      datastore standby nodes: none
  unit: microk8s/2

Verify the k8s status:

test@edge5:~$ juju exec --application microk8s -- microk8s kubectl get node
- return-code: 0
  stdout: |
    NAME    STATUS   ROLES    AGE   VERSION
    edge8   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
    edge7   Ready    <none>   18m   v1.21.1-3+ba118484dd39df
    edge9   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
  unit: microk8s/0
- return-code: 0
  stdout: |
    NAME    STATUS   ROLES    AGE   VERSION
    edge8   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
    edge7   Ready    <none>   18m   v1.21.1-3+ba118484dd39df
    edge9   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
  unit: microk8s/1
- return-code: 0
  stdout: |
    NAME    STATUS   ROLES    AGE   VERSION
    edge8   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
    edge7   Ready    <none>   18m   v1.21.1-3+ba118484dd39df
    edge9   Ready    <none>   13m   v1.21.1-3+ba118484dd39df
  unit: microk8s/2
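
To drive the cluster with a local kubectl instead of going through juju exec every time, the kubeconfig can be pulled from one of the units. A sketch under the assumption that microk8s config prints a usable kubeconfig on this charm revision; the output may need minor cleanup depending on how juju ssh allocates a terminal:

$ juju ssh microk8s/0 "sudo microk8s config" > ~/.kube/config
$ kubectl get nodes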

WorkingTipsOnLXDAndJuju

Before

An LXD cluster and its juju status.

LXD cluster status and lxc instances before deploying the workload:

test@freeedge1:~$ lxc cluster list
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
|   NAME    |            URL            | DATABASE | ARCHITECTURE | FAILURE DOMAIN | DESCRIPTION | STATE  |      MESSAGE      |
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| freeedge1 | https://192.168.89.2:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| freeedge2 | https://192.168.89.3:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| freeedge3 | https://192.168.89.4:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
| freeedge4 | https://192.168.89.5:8443 | YES      | x86_64       | default        |             | ONLINE | Fully operational |
+-----------+---------------------------+----------+--------------+----------------+-------------+--------+-------------------+
test@freeedge1:~$ lxc ls
+------+-------+------+------+------+-----------+----------+
| NAME | STATE | IPV4 | IPV6 | TYPE | SNAPSHOTS | LOCATION |
+------+-------+------+------+------+-----------+----------+

juju bootstrap provisions a machine with LXD and creates a controller running within it:

$ juju bootstrap localhost overlord
Creating Juju controller "overlord" on localhost/localhost
Looking for packaged Juju agent version 2.9.5 for amd64
WARNING Got error requesting "https://streams.canonical.com/juju/tools/streams/v1/index2.sjson": Get "https://streams.canonical.com/juju/tools/streams/v1/index2.sjson": dial tcp [2001:67c:1360:8001::33]:443: connect: network is unreachable
No packaged binary found, preparing local Juju agent binary
To configure your system to better support LXD containers, please see: https://github.com/lxc/lxd/blob/master/doc/production-setup.md
Launching controller instance(s) on localhost/localhost...
 - juju-5a91a2-0 (arch=amd64)                 
Installing Juju agent on bootstrap instance
Fetching Juju Dashboard 0.7.1
Waiting for address
Attempting to connect to 192.168.89.155:22
Connected to 192.168.89.155
Running machine configuration script...
Bootstrap agent now started
Contacting Juju controller at 192.168.89.155 to verify accessibility...

Bootstrap complete, controller "overlord" is now available
Controller machines are in the "controller" model
Initial model "default" added

This command automatically creates a new LXD instance and runs the machine configuration script inside it, like so:

test@freeedge1:~$ lxc ls
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
|     NAME      |  STATE  |         IPV4          | IPV6 |   TYPE    | SNAPSHOTS | LOCATION  |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
| juju-5a91a2-0 | RUNNING | 192.168.89.155 (eth0) |      | CONTAINER | 0         | freeedge1 |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+

Add juju machines:

test@freeedge1:~$ juju add-machine -n 2
created machine 0
created machine 1
test@freeedge1:~$ juju machines
Machine  State    DNS  Inst id  Series  AZ  Message
0        pending       pending  focal       starting
1        pending       pending  focal       starting

Wait a while until the state changes to started:

$ juju machines
Machine  State    DNS             Inst id        Series  AZ  Message
0        started  192.168.89.197  juju-a5a008-0  focal       Running
1        started  192.168.89.173  juju-a5a008-1  focal       Running

These machines are in fact plain LXC workloads; exec-ing into an instance shows that no resource limits are applied at all. If resources need to be constrained, refer to https://juju.is/docs/olm/constraints#heading--constraints-and-lxd-containers (a constraints example follows the listing below):

test@freeedge1:~$ lxc ls
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
|     NAME      |  STATE  |         IPV4          | IPV6 |   TYPE    | SNAPSHOTS | LOCATION  |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
| juju-5a91a2-0 | RUNNING | 192.168.89.155 (eth0) |      | CONTAINER | 0         | freeedge1 |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
| juju-a5a008-0 | RUNNING | 192.168.89.197 (eth0) |      | CONTAINER | 0         | freeedge2 |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
| juju-a5a008-1 | RUNNING | 192.168.89.173 (eth0) |      | CONTAINER | 0         | freeedge2 |
+---------------+---------+-----------------------+------+-----------+-----------+-----------+
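
A hedged example of applying such constraints when adding machines on the LXD cloud (the values here are arbitrary; the constraints document linked above describes what the LXD provider actually enforces):

$ juju add-machine --constraints "cores=2 mem=4G root-disk=16G"
$ juju set-model-constraints cores=2 mem=4G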

Destroy the machines just created:

test@freeedge1:~$ juju remove-machine 0
removing machine 0
test@freeedge1:~$ juju machines
Machine  State    DNS             Inst id        Series  AZ  Message
0        stopped  192.168.89.197  juju-a5a008-0  focal       Running
1        started  192.168.89.173  juju-a5a008-1  focal       Running

test@freeedge1:~$ juju remove-machine 1
removing machine 1
test@freeedge1:~$ juju machines
Machine  State    DNS             Inst id        Series  AZ  Message
1        stopped  192.168.89.173  juju-a5a008-1  focal       Running

Deploy a hello-juju charmed operator:

test@freeedge1:~$ juju deploy hello-juju
Located charm "hello-juju" in charm-hub, revision 8
Deploying "hello-juju" from charm-hub charm "hello-juju", revision 8 in channel stable
test@freeedge1:~$ juju status
Model    Controller  Cloud/Region         Version  SLA          Timestamp
default  overlord    localhost/localhost  2.9.5    unsupported  16:27:44+08:00

App         Version  Status  Scale  Charm       Store     Channel  Rev  OS      Message
hello-juju           active      1  hello-juju  charmhub  stable     8  ubuntu  

Unit           Workload  Agent  Machine  Public address  Ports   Message
hello-juju/0*  active    idle   2        192.168.89.160  80/tcp  

Machine  State    DNS             Inst id        Series  AZ  Message
2        started  192.168.89.160  juju-a5a008-2  focal       Running

test@freeedge1:~$ juju expose hello-juju
test@freeedge1:~$ juju status
Model    Controller  Cloud/Region         Version  SLA          Timestamp
default  overlord    localhost/localhost  2.9.5    unsupported  16:28:06+08:00

App         Version  Status  Scale  Charm       Store     Channel  Rev  OS      Message
hello-juju           active      1  hello-juju  charmhub  stable     8  ubuntu  

Unit           Workload  Agent  Machine  Public address  Ports   Message
hello-juju/0*  active    idle   2        192.168.89.160  80/tcp  

Machine  State    DNS             Inst id        Series  AZ  Message
2        started  192.168.89.160  juju-a5a008-2  focal       Running

/images/2021_07_03_16_31_31_680x551.jpg
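
Once exposed, the unit can be spot-checked from the host with a plain HTTP request against the address and port shown in the status output:

$ curl -I http://192.168.89.160/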

Deploy Charmed Kubernetes

Recording the steps here: on an LXD cluster, deploying directly fails with the following error:

# juju deploy charmed-kubernetes
# juju status
Machine  State  DNS  Inst id  Series  AZ  Message
0        down        pending  focal       Failed creating instance record: Failed initialising instance: Failed loading storage pool: No such object
1        down        pending  focal       Failed creating instance record: Failed initialising instance: Failed loading storage pool: No such object
2        down        pending  focal       Failed creating instance record: Failed initialising instance: Failed loading storage pool: No such object
3        down        pending  focal       Failed creating instance record: Failed initialising instance: Failed loading storage pool: No such object
....

This is because no storage pool named default is defined:

test@freeedge1:~$ lxc storage list
+-------+--------+-------------+---------+---------+
| NAME  | DRIVER | DESCRIPTION | USED BY |  STATE  |
+-------+--------+-------------+---------+---------+
| local | zfs    |             | 3       | CREATED |
+-------+--------+-------------+---------+---------+

First remove the charmed-kubernetes that was already deployed (there is currently no way to remove the charmed operator bundle directly; all of its applications have to be removed one by one):

 juju remove-application hello-juju
 juju remove-application containerd
 juju remove-application easyrsa
 juju remove-application etcd
 juju remove-application flannel
 juju remove-application kubeapi-load-balancer
 juju remove-application kubernetes-master
 juju remove-application kubernetes-worker

Create the same directory on every node and add it as a new storage pool definition:

$ ansible -i hosts.ini all -m shell -a "sudo mkdir -p /data/lxd && sudo chmod 777 -R /data/lxd"
$ lxc storage create --target freeedge1 default dir source=/data/lxd 
Storage pool default pending on member freeedge1
$ lxc storage create --target freeedge2 default dir source=/data/lxd 
Storage pool default pending on member freeedge2
$ lxc storage create --target freeedge3 default dir source=/data/lxd 
Storage pool default pending on member freeedge3
$ lxc storage create --target freeedge4 default dir source=/data/lxd 
Storage pool default pending on member freeedge4
$ lxc storage create default dir
Storage pool default created
$ lxc storage volume create default lxdvol --target freeedge1
Storage volume lxdvol created
$ lxc storage volume show default lxdvol --target freeedge1
config: {}
description: ""
name: lxdvol
type: custom
used_by: []
location: freeedge1
content_type: filesystem

After all of these messy operations, clean everything up:

juju destroy-controller overlord
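
If the models still contain applications or storage, the teardown may additionally need the standard flags:

juju destroy-controller overlord --destroy-all-models --destroy-storage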

Now start the juju deployment over again (see https://juju.is/docs/olm/lxd):

$ juju bootstrap localhost overlord
$ juju deploy charmed-kubernetes
Located bundle "charmed-kubernetes" in charm-hub, revision 679
WARNING "services" key found in bundle file is deprecated, superseded by "applications" key.
Located charm "containerd" in charm-store, revision 130
Located charm "easyrsa" in charm-store, revision 384
Located charm "etcd" in charm-store, revision 594
Located charm "flannel" in charm-store, revision 558
Located charm "kubeapi-load-balancer" in charm-store, revision 798
Located charm "kubernetes-master" in charm-store, revision 1008
Located charm "kubernetes-worker" in charm-store, revision 768
Executing changes:
- upload charm containerd from charm-store for series focal with architecture=amd64
- deploy application containerd from charm-store on focal
- set annotations for containerd
- upload charm easyrsa from charm-store for series focal with architecture=amd64
- deploy application easyrsa from charm-store on focal
  added resource easyrsa
- set annotations for easyrsa
- upload charm etcd from charm-store for series focal with architecture=amd64
- deploy application etcd from charm-store on focal
  added resource core
  added resource etcd
  added resource snapshot
- set annotations for etcd
- upload charm flannel from charm-store for series focal with architecture=amd64
- deploy application flannel from charm-store on focal
  added resource flannel-amd64
  added resource flannel-arm64
  added resource flannel-s390x
- set annotations for flannel
- upload charm kubeapi-load-balancer from charm-store for series focal with architecture=amd64
- deploy application kubeapi-load-balancer from charm-store on focal
- expose all endpoints of kubeapi-load-balancer and allow access from CIDRs 0.0.0.0/0 and ::/0
- set annotations for kubeapi-load-balancer
- upload charm kubernetes-master from charm-store for series focal with architecture=amd64
- deploy application kubernetes-master from charm-store on focal
  added resource cdk-addons
  added resource core
  added resource kube-apiserver
  added resource kube-controller-manager
  added resource kube-proxy
  added resource kube-scheduler
  added resource kubectl
- set annotations for kubernetes-master
- upload charm kubernetes-worker from charm-store for series focal with architecture=amd64
- deploy application kubernetes-worker from charm-store on focal
  added resource cni-amd64
  added resource cni-arm64
  added resource cni-s390x
  added resource core
  added resource kube-proxy
  added resource kubectl
  added resource kubelet
- expose all endpoints of kubernetes-worker and allow access from CIDRs 0.0.0.0/0 and ::/0
- set annotations for kubernetes-worker
- add relation kubernetes-master:kube-api-endpoint - kubeapi-load-balancer:apiserver
- add relation kubernetes-master:loadbalancer - kubeapi-load-balancer:loadbalancer
- add relation kubernetes-master:kube-control - kubernetes-worker:kube-control
- add relation kubernetes-master:certificates - easyrsa:client
- add relation etcd:certificates - easyrsa:client
- add relation kubernetes-master:etcd - etcd:db
- add relation kubernetes-worker:certificates - easyrsa:client
- add relation kubernetes-worker:kube-api-endpoint - kubeapi-load-balancer:website
- add relation kubeapi-load-balancer:certificates - easyrsa:client
- add relation flannel:etcd - etcd:db
- add relation flannel:cni - kubernetes-master:cni
- add relation flannel:cni - kubernetes-worker:cni
- add relation containerd:containerd - kubernetes-worker:container-runtime
- add relation containerd:containerd - kubernetes-master:container-runtime
- add unit easyrsa/0 to new machine 0
- add unit etcd/0 to new machine 1
- add unit etcd/1 to new machine 2
- add unit etcd/2 to new machine 3
- add unit kubeapi-load-balancer/0 to new machine 4
- add unit kubernetes-master/0 to new machine 5
- add unit kubernetes-master/1 to new machine 6
- add unit kubernetes-worker/0 to new machine 7
- add unit kubernetes-worker/1 to new machine 8
- add unit kubernetes-worker/2 to new machine 9
Deploy of bundle completed.

During deployment, the live status changes can be watched with juju status (a one-liner for this follows the screenshot):

/images/2021_07_03_17_51_14_1359x899.jpg
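
Instead of re-running the command by hand, the rollout can be followed live from a terminal (assuming watch is installed):

$ watch -c juju status --color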

The actual LXD instances involved:

/images/2021_07_03_17_51_58_956x946.jpg

Install kubectl to manage the cluster:

test@freeedge1:~$ sudo snap install kubectl --classic
.....
kubectl 1.21.1 from Canonical✓ installed

Intermediate states:

/images/2021_07_03_18_00_15_1526x913.jpg

/images/2021_07_03_18_07_12_1525x940.jpg

/images/2021_07_03_18_09_07_1489x922.jpg

Cluster ready:

/images/2021_07_03_18_40_34_1384x934.jpg

$ juju scp kubernetes-master/0:config ~/.kube/config
$ kubectl get nodes -A
NAME            STATUS   ROLES    AGE   VERSION
juju-08ae08-7   Ready    <none>   19m   v1.21.1
juju-08ae08-8   Ready    <none>   19m   v1.21.1
juju-08ae08-9   Ready    <none>   13m   v1.21.1

/images/2021_07_03_18_42_18_1481x912.jpg

Add a node:

$ juju add-unit kubernetes-worker

/images/2021_07_03_18_43_47_709x158.jpg

After the addition:

$ kubectl get nodes -A                                                                                                                                                      
NAME             STATUS   ROLES    AGE   VERSION
juju-08ae08-10   Ready    <none>   12m   v1.21.1
juju-08ae08-7    Ready    <none>   42m   v1.21.1
juju-08ae08-8    Ready    <none>   42m   v1.21.1
juju-08ae08-9    Ready    <none>   36m   v1.21.1

Add 3 more:

$ juju add-unit kubernetes-worker -n 3
$ juju status
.....
11       pending                  pending         focal       starting
12       pending                  pending         focal       starting
13       pending                  pending         focal       starting
$ kubectl get nodes -A
NAME             STATUS   ROLES    AGE     VERSION
juju-08ae08-10   Ready    <none>   29m     v1.21.1
juju-08ae08-11   Ready    <none>   5m2s    v1.21.1
juju-08ae08-12   Ready    <none>   3m21s   v1.21.1
juju-08ae08-13   Ready    <none>   86s     v1.21.1
juju-08ae08-7    Ready    <none>   59m     v1.21.1
juju-08ae08-8    Ready    <none>   59m     v1.21.1
juju-08ae08-9    Ready    <none>   53m     v1.21.1