目录

1. 创建kops所需的AWS IAM的用户组用户kops

可以通过AWS命令或AWS控制台创建kops用户组及用户.

因现在已经有一台操作机,操作机已经有绑定IAM Role,其有全部的AWS权限,所以使用AWS命令创建:

[ec2-user@ip-172-31-3-142 ~]$ aws iam create-group --group-name kops
{
    "Group": {
        "Path": "/", 
        "CreateDate": "2021-05-17T07:03:18Z", 
        "GroupId": "AGPAW5FY7AWUKCOI3KGNK", 
        "Arn": "arn:aws:iam::474981795240:group/kops", 
        "GroupName": "kops"
    }
}
[ec2-user@ip-172-31-3-142 ~]$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonEC2FullAccess --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonRoute53FullAccess --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonS3FullAccess --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/IAMFullAccess --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam attach-group-policy --policy-arn arn:aws:iam::aws:policy/AmazonVPCFullAccess --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam create-user --user-name kops
{
    "User": {
        "UserName": "kops", 
        "Path": "/", 
        "CreateDate": "2021-05-17T07:04:05Z", 
        "UserId": "AIDAW5FY7AWUOPCWZEIKU", 
        "Arn": "arn:aws:iam::474981795240:user/kops"
    }
}
[ec2-user@ip-172-31-3-142 ~]$ aws iam add-user-to-group --user-name kops --group-name kops
[ec2-user@ip-172-31-3-142 ~]$ aws iam create-access-key --user-name kops
{
    "AccessKey": {
        "UserName": "kops", 
        "Status": "Active", 
        "CreateDate": "2021-05-17T07:04:22Z", 
        "SecretAccessKey": "xxxxxxxxxx", 
        "AccessKeyId": "xxxxxxxx"
    }
}

记住上面的AccessKeyId及SecretAccessKey,后面能用得到。注意:私有密钥只能在创建时进行查看或下载。如果您的现有私有密钥放错位置,请创建新的访问密钥。

2. 配置kops部署机

2.1 配置aws accesskey

现在需要一台部署操作机,将前面创建的AccessKeyId及SecretAccessKey配置到AWS管理命令上:

# configure the aws client to use your new IAM user
aws configure           # Use your new access and secret key here
aws iam list-users      # you should see a list of all your IAM users here

# Because "aws configure" doesn't export these vars for kops to use, we export them now
export AWS_ACCESS_KEY_ID=$(aws configure get aws_access_key_id)
export AWS_SECRET_ACCESS_KEY=$(aws configure get aws_secret_access_key)

2.2 安装kops

curl -Lo kops https://github.com/kubernetes/kops/releases/download/$(curl -s https://api.github.com/repos/kubernetes/kops/releases/latest | grep tag_name | cut -d '"' -f 4)/kops-linux-amd64
chmod +x ./kops
sudo mv ./kops /usr/local/bin/

2.3 安装kubectl

curl -Lo kubectl https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/amd64/kubectl
chmod +x ./kubectl
sudo mv ./kubectl /usr/local/bin/kubectl

2.4 配置DNS

kops需要使用DNS域名,所以需要配置route53,增加一个托管区域,下面是使用AWS命令创建route53托管区域的示例:

ID=$(uuidgen) && aws route53 create-hosted-zone --name kops.domain.com \
    --hosted-zone-config "Comment=kops,PrivateZone=true" \
    --vpc '{"VPCRegion": "ap-northeast-2","VPCId": "vpc-07750e59beab05d39"}' \
    --caller-reference $ID 

如下这个创建的例子:

[ec2-user@ip-172-31-3-142 ~]$ ID=$(uuidgen) && aws route53 create-hosted-zone --name kops.domain.com \
>     --hosted-zone-config "Comment=kops,PrivateZone=true" \
>     --vpc '{"VPCRegion": "ap-northeast-2","VPCId": "vpc-07750e59beab05d39"}' \
>     --caller-reference $ID 
{
    "ChangeInfo": {
        "Status": "PENDING", 
        "SubmittedAt": "2021-05-17T11:09:07.070Z", 
        "Id": "/change/C03008732SWSQI8NSG7F5"
    }, 
    "HostedZone": {
        "ResourceRecordSetCount": 2, 
        "CallerReference": "36167321-c7fe-4fd8-920e-214ee0ea131a", 
        "Config": {
            "Comment": "kops", 
            "PrivateZone": true
        }, 
        "Id": "/hostedzone/Z074430711ZURAVUJI398", 
        "Name": "kops.domain.com."
    }, 
    "Location": "https://route53.amazonaws.com/2013-04-01/hostedzone/Z074430711ZURAVUJI398", 
    "VPC": {
        "VPCId": "vpc-07750e59beab05d39", 
        "VPCRegion": "ap-northeast-2"
    }
}

通过下面的命令可以检查是否成功(关注结果是否为awsdns的服务器):

[ec2-user@ip-10-1-41-220 ~]$ dig  kops.domain.com ns

; <<>> DiG 9.11.4-P2-RedHat-9.11.4-26.P2.amzn2.4 <<>> kops.domain.com ns
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 55197
;; flags: qr rd ra; QUERY: 1, ANSWER: 4, AUTHORITY: 0, ADDITIONAL: 1

;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;kops.domain.com.               IN      NS

;; ANSWER SECTION:
kops.domain.com.        300     IN      NS      ns-0.awsdns-00.com.
kops.domain.com.        300     IN      NS      ns-512.awsdns-00.net.
kops.domain.com.        300     IN      NS      ns-1024.awsdns-00.org.
kops.domain.com.        300     IN      NS      ns-1536.awsdns-00.co.uk.

;; Query time: 1 msec
;; SERVER: 10.1.0.2#53(10.1.0.2)
;; WHEN: Mon May 17 11:19:49 UTC 2021
;; MSG SIZE  rcvd: 179

后面创建集群时就可以指定DNS参数,如:

kops create cluster --dns private --dns-zone Z074430711ZURAVUJI398 $NAME

需要注意的是主机所使用的VPC要开启<DNS主机名>和<DNS 解析>这两个参数。

2.5 创建S3

kops需要使用S3来备份集群的数据等信息,可使用以下命令创建S3:

aws s3api create-bucket \
    --bucket kops-backup-weimob \
    --create-bucket-configuration "LocationConstraint=ap-northeast-2" \
    --region ap-northeast-2

需要注意的是bucket是全局唯一的,要注意冲突。

3.创建集群

通过以下命令找到可用区:

aws ec2 describe-availability-zones --region ap-northeast-2

通过以下命令找到vpc:

aws ec2 describe-vpcs --region ap-northeast-2

通过以下命令找到vpc:

aws ec2 describe-subnets --region ap-northeast-2 --filters="Name=vpc-id,Values=vpc-07750e59beab05d39"

使用以下命令生成集群yaml文件:

export NAME=kops.domain.com
export KOPS_STATE_STORE=s3://kops-backup-weimob
kops create cluster \
    --dns private \
    --dns-zone=Z074430711ZURAVUJI398 \  # route53托管区域ID
    --state=${KOPS_STATE_STORE} \
    --zones="ap-northeast-2b,ap-northeast-2c" \ # 可用区
    --networking=amazonvpc \
    --subnets="subnet-08e309d914e8d7a6e,subnet-0d96fcd37a51857a5" \ # 子网ID
    --node-count=2 \
    --dry-run \
    -o yaml \
    ${NAME} > filename.yaml

然后可以查看或修改yaml文件:

apiVersion: kops.k8s.io/v1alpha2
kind: Cluster
metadata:
  creationTimestamp: null
  name: kops.domain.com
spec:
  api:
    dns: {}
  authorization:
    rbac: {}
  channel: stable
  cloudProvider: aws
  configBase: s3://kops-backup-weimob/kops.domain.com
  dnsZone: Z074430711ZURAVUJI398
  etcdClusters:
  - cpuRequest: 200m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-ap-northeast-2b
      name: b
    memoryRequest: 100Mi
    name: main
  - cpuRequest: 100m
    etcdMembers:
    - encryptedVolume: true
      instanceGroup: master-ap-northeast-2b
      name: b
    memoryRequest: 100Mi
    name: events
  iam:
    allowContainerRegistry: true
    legacy: false
  kubelet:
    anonymousAuth: false
  kubernetesApiAccess:
  - 0.0.0.0/0
  kubernetesVersion: 1.20.6
  masterPublicName: api.kops.domain.com
  networkCIDR: 10.1.0.0/16
  networkID: vpc-07750e59beab05d39
  networking:
    amazonvpc: {}
  nonMasqueradeCIDR: 10.1.0.0/16
  sshAccess:
  - 0.0.0.0/0
  subnets:
  - cidr: 10.1.16.0/20
    id: subnet-08e309d914e8d7a6e
    name: ap-northeast-2b
    type: Public
    zone: ap-northeast-2b
  - cidr: 10.1.64.0/20
    id: subnet-0d96fcd37a51857a5
    name: ap-northeast-2c
    type: Public
    zone: ap-northeast-2c
  topology:
    dns:
      type: Private
    masters: public
    nodes: public

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.domain.com
  name: master-ap-northeast-2b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210415
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: master-ap-northeast-2b
  role: Master
  subnets:
  - ap-northeast-2b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.domain.com
  name: nodes-ap-northeast-2b
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210415
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-ap-northeast-2b
  role: Node
  subnets:
  - ap-northeast-2b

---

apiVersion: kops.k8s.io/v1alpha2
kind: InstanceGroup
metadata:
  creationTimestamp: null
  labels:
    kops.k8s.io/cluster: kops.domain.com
  name: nodes-ap-northeast-2c
spec:
  image: 099720109477/ubuntu/images/hvm-ssd/ubuntu-focal-20.04-amd64-server-20210415
  machineType: t3.medium
  maxSize: 1
  minSize: 1
  nodeLabels:
    kops.k8s.io/instancegroup: nodes-ap-northeast-2c
  role: Node
  subnets:
  - ap-northeast-2c

现在可以通过yaml文件来创建集群:

[ec2-user@ip-10-1-41-220 ~]$ kops create -f ./filename.yaml 

Created cluster/kops.domain.com
Created instancegroup/master-ap-northeast-2b
Created instancegroup/nodes-ap-northeast-2b
Created instancegroup/nodes-ap-northeast-2c

To deploy these resources, run: kops update cluster kops.domain.com --yes

[ec2-user@ip-10-1-41-220 ~]$ ll
total 4
-rw-rw-r-- 1 ec2-user ec2-user 2618 May 18 06:31 filename.yaml
[ec2-user@ip-10-1-41-220 ~]$ kops update cluster kops.domain.com --yes

SSH public key must be specified when running with AWS (create with `kops create secret --name kops.domain.com sshpublickey admin -i ~/.ssh/id_rsa.pub`)

有个报错,处理下,再继续:

[ec2-user@ip-10-1-41-220 ~]$ kops create secret --name kops.domain.com sshpublickey admin -i ~/.ssh/id_rsa.pub
[ec2-user@ip-10-1-41-220 ~]$ ls .ssh/
authorized_keys  id_rsa.pub
[ec2-user@ip-10-1-41-220 ~]$ kops update cluster kops.domain.com --yes
I0518 07:12:39.400346    9935 dns.go:97] Private DNS: skipping DNS validation
I0518 07:12:39.871032    9935 executor.go:111] Tasks: 0 done / 79 total; 44 can run
W0518 07:12:39.953253    9935 vfs_castore.go:612] CA private key was not found
W0518 07:12:39.986549    9935 vfs_castore.go:612] CA private key was not found
I0518 07:12:39.986655    9935 keypair.go:195] Issuing new certificate: "ca"
I0518 07:12:40.018999    9935 keypair.go:195] Issuing new certificate: "etcd-manager-ca-main"
I0518 07:12:40.080283    9935 keypair.go:195] Issuing new certificate: "etcd-peers-ca-events"
I0518 07:12:40.104703    9935 keypair.go:195] Issuing new certificate: "apiserver-aggregator-ca"
I0518 07:12:40.160323    9935 keypair.go:195] Issuing new certificate: "etcd-clients-ca"
I0518 07:12:40.322550    9935 keypair.go:195] Issuing new certificate: "etcd-manager-ca-events"
I0518 07:12:40.342144    9935 keypair.go:195] Issuing new certificate: "etcd-peers-ca-main"
I0518 07:12:40.605820    9935 keypair.go:195] Issuing new certificate: "service-account"
I0518 07:12:42.325976    9935 executor.go:111] Tasks: 44 done / 79 total; 14 can run
I0518 07:12:43.246360    9935 executor.go:111] Tasks: 58 done / 79 total; 18 can run
I0518 07:12:43.934107    9935 executor.go:111] Tasks: 76 done / 79 total; 3 can run
I0518 07:12:44.724913    9935 executor.go:137] Task "AutoscalingGroup/master-ap-northeast-2b.masters.kops.domain.com" not ready: waiting for the IAM Instance Profile to be propagated
I0518 07:12:44.725022    9935 executor.go:137] Task "AutoscalingGroup/nodes-ap-northeast-2c.kops.domain.com" not ready: waiting for the IAM Instance Profile to be propagated
I0518 07:12:44.725109    9935 executor.go:137] Task "AutoscalingGroup/nodes-ap-northeast-2b.kops.domain.com" not ready: waiting for the IAM Instance Profile to be propagated
I0518 07:12:44.725166    9935 executor.go:155] No progress made, sleeping before retrying 3 task(s)
I0518 07:12:54.725432    9935 executor.go:111] Tasks: 76 done / 79 total; 3 can run
I0518 07:12:56.038961    9935 executor.go:111] Tasks: 79 done / 79 total; 0 can run
I0518 07:12:56.039099    9935 dns.go:157] Pre-creating DNS records
I0518 07:12:58.287195    9935 update_cluster.go:313] Exporting kubecfg for cluster
W0518 07:12:58.317706    9935 create_kubecfg.go:91] Did not find API endpoint for gossip hostname; may not be able to reach cluster
kops has set your kubectl context to kops.domain.com
W0518 07:12:58.367711    9935 update_cluster.go:337] Exported kubecfg with no user authentication; use --admin, --user or --auth-plugin flags with `kops export kubecfg`

Cluster is starting.  It should be ready in a few minutes.

Suggestions:
 * validate cluster: kops validate cluster --wait 10m
 * list nodes: kubectl get nodes --show-labels
 * ssh to the master: ssh -i ~/.ssh/id_rsa ubuntu@api.kops.domain.com
 * the ubuntu user is specific to Ubuntu. If not using Ubuntu please use the appropriate user based on your OS.
 * read about installing addons at: https://kops.sigs.k8s.io/operations/addons.

[ec2-user@ip-10-1-41-220 ~]$ 

4. 使用及管理集群

下面可以使用集群了,试试kubectl get nodes:

[ec2-user@ip-10-1-41-220 ~]$ kubectl  get nodes
Please enter Username: 

看来是kube/config配置文件有问题,仔细看上面创建集群的时候有个警告:

W0518 07:12:58.317706    9935 create_kubecfg.go:91] Did not find API endpoint for gossip hostname; may not be able to reach cluster
kops has set your kubectl context to kops.domain.com
W0518 07:12:58.367711    9935 update_cluster.go:337] Exported kubecfg with no user authentication; use --admin, --user or --auth-plugin flags with `kops export kubecfg`

重新生成一下配置文件:

kops export kubecfg kops.domain.com --admin

再次执行:

[ec2-user@ip-10-1-41-220 ~]$ kubectl  get nodes
NAME                                             STATUS   ROLES                  AGE   VERSION
ip-10-1-19-76.ap-northeast-2.compute.internal    Ready    node                   43m   v1.20.6
ip-10-1-31-224.ap-northeast-2.compute.internal   Ready    control-plane,master   45m   v1.20.6
ip-10-1-79-247.ap-northeast-2.compute.internal   Ready    node                   43m   v1.20.6
[ec2-user@ip-10-1-41-220 ~]$

好了,正常了

本文档参考自: https://kops.sigs.k8s.io/getting_started/aws/