阿辉的博客

系统 网络 集群 数据库 分布式云计算等 研究

通过zabbix监控kubernetes集群

日前写了一个zabbix的监控脚本来监控kubernetes集群,主要用于报警的功能。性能监控还是使用其它方式来实现。

github URL

https://github.com/farmerluo/k8s_zabbix

k8s_zabbix说明

k8s_zabbix实现了使用zabbix监控kubernetes的ingress,hpa,pod状态等功能。

Template Check K8S Cluster Status.xml:zabbix模板,可通过此文件导入到zabbix

check_k8s_status.py:kubernetes的监控脚本

userparameter_k8s.conf:zabbix agent端的配置文件,需要注意脚本的路径

check_k8s_status.py说明

  • 监控的k8s集群配置:
conf.host = "https://10.10.88.20:8443"
conf.api_key['authorization'] = "xxxxxx.xxxxxxx.x-x-xxx-xxx-x"
  • 脚本会监控traefik ingress的访问状态,将对400~599的非正常状态进行报警,需事先将traefik的访问日志通过fluentd或filebeat等导入到elasticsearch集群,脚本将定时通过查询访问日志来监控ingress的访问状态。监控脚本内的配置:
# elasticsearch server config
es_server = [{"host": "10.16.252.50", "port": 9200},
             {"host": "10.16.252.50", "port": 9200},
             {"host": "10.16.252.50", "port": 9200}
             ]
# 索引名
es_index = "logstash-traefik-ingress-lb-*"
# es查询间隔,ms
es_query_duration = 60000

# 状态码报警及阈值配置
# xxx.com为自定域名的例子
status_code_config = {
    'default': {'403': '90', '404': '90', '500': '2', '502': '2', '499': '70', '406': '70', '503': '5',
                '504': '5', '599': '2', 'other': '30', '429': '5', '430': '1'
                },
    'xxx.com': {'403': '100', '404': '100', '500': '70', '502': '70', '499': '100', '406': '80', '503': '70',
                '504': '70', '599': '0', 'other': '60', '429': '5', '430': '1'
                }
}

coredns解析结果异常的问题

我发现k8s内coredns的解析结果有点问题。经常解析不出来。

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local
Server:         10.253.255.10
Address:        10.253.255.10:53

Non-authoritative answer:

*** Can't find kubernetes-dashboard.kube-system.svc.cluster.local: No answer

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local
Server:         10.253.255.10
Address:        10.253.255.10:53

Name:   kubernetes-dashboard.kube-system.svc.cluster.local
Address: 10.253.255.40

*** Can't find kubernetes-dashboard.kube-system.svc.cluster.local: No answer

/ # nslookup kubernetes-dashboard.kube-system.svc.cluster.local
Server:         10.253.255.10
Address:        10.253.255.10:53

Name:   kubernetes-dashboard.kube-system.svc.cluster.local
Address: 10.253.255.40

(更多…)

kubernetes 日常使用及操作测试


前文已经安装好了一套kubernetes 1.10,下面我们来进行日常使用测试

1. 创建部署及服务

编辑一个yaml文件:

vim nginx.yaml 
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: nginx-router
  namespace: test
spec:
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx-router
    spec:
      containers:
      - name: nginx-router
        image: 172.21.248.242/base/nginx
        ports:
        - containerPort: 80

---
kind: Service
apiVersion: v1
metadata:
  name: nginx-router
  namespace: test
spec:
  selector:
    app: nginx-router
  ports:
    - protocol: TCP
      port: 80
      targetPort: 80

---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: nginx-router-ingress
  namespace: test
  annotations:
    kubernetes.io/ingress.class: traefik
spec:
  rules:
  - host: "nginx.k8s.dev.huilog.com"
    http:
      paths:
      - backend:
          serviceName: nginx-router
          servicePort: 80

# 创建部署及服务:
[root@bs-ops-test-docker-dev-01 dev]# kubectl create -f nginx.yaml 
deployment.extensions "nginx-router" created
service "nginx-router" created
ingress.extensions "nginx-router-ingress" created
[root@bs-ops-test-docker-dev-01 dev]#

(更多…)

kubernetes traefik ingress 安装及配置

Traefik是一款开源的反向代理与负载均衡工具。它最大的优点是能够与常见的微服务系统直接整合,可以实现自动化动态配置。目前支持Docker, Swarm, Mesos/Marathon, Mesos, Kubernetes, Consul, Etcd, Zookeeper, BoltDB, Rest API等等后端模型。
以下是架构图:

需要指出的是,ingress-controllers其实是kubernetes的一部分,ingress就是从kubernetes集群外访问集群的入口,将用户的URL请求转发到不同的service上。Ingress相当于nginx、apache等负载均衡方向代理服务器,其中还包括规则定义,即URL的路由信息,路由信息得的刷新由Ingress controller来提供。

Ingress Controller 实质上可以理解为是个监视器,Ingress Controller 通过不断地跟 kubernetes API 打交道,实时的感知后端 service、pod 等变化,比如新增和减少 pod,service 增加与减少等;当得到这些变化信息后,Ingress Controller 再结合下文的 Ingress 生成配置,然后更新反向代理负载均衡器,并刷新其配置,达到服务发现的作用。
(更多…)

kubernetes V1.10集群安装及配置

1. k8s集群系统规划

1.1. kubernetes 1.10的依赖

k8s V1.10对一些相关的软件包,如etcd,docker并不是全版本支持或全版本测试,建议的版本如下:
– docker: 1.11.2 to 1.13.1 and 17.03.x
– etcd: 3.1.12
– 全部信息如下:

参考:External Dependencies
* The supported etcd server version is 3.1.12, as compared to 3.0.17 in v1.9 (#60988)
* The validated docker versions are the same as for v1.9: 1.11.2 to 1.13.1 and 17.03.x (ref)
* The Go version is go1.9.3, as compared to go1.9.2 in v1.9. (#59012)
* The minimum supported go is the same as for v1.9: go1.9.1. (#55301)
* CNI is the same as v1.9: v0.6.0 (#51250)
* CSI is updated to 0.2.0 as compared to 0.1.0 in v1.9. (#60736)
* The dashboard add-on has been updated to v1.8.3, as compared to 1.8.0 in v1.9. (#57326)
* Heapster has is the same as v1.9: v1.5.0. It will be upgraded in v1.11. (ref)
* Cluster Autoscaler has been updated to v1.2.0. (#60842, @mwielgus)
* Updates kube-dns to v1.14.8 (#57918, @rramkumar1)
* Influxdb is unchanged from v1.9: v1.3.3 (#53319)
* Grafana is unchanged from v1.9: v4.4.3 (#53319)
* CAdvisor is v0.29.1 (#60867)
* fluentd-gcp-scaler is v0.3.0 (#61269)
* Updated fluentd in fluentd-es-image to fluentd v1.1.0 (#58525, @monotek)
* fluentd-elasticsearch is v2.0.4 (#58525)
* Updated fluentd-gcp to v3.0.0. (#60722)
* Ingress glbc is v1.0.0 (#61302)
* OIDC authentication is coreos/go-oidc v2 (#58544)
* Updated fluentd-gcp updated to v2.0.11. (#56927, @x13n)
* Calico has been updated to v2.6.7 (#59130, @caseydavenport)

(更多…)

在K8S 的ingress上配置HTTP认证的方法

在K8S 的ingress上配置HTTP认证的方法如下:

1 . 使用htpasswd创建一个auth文件:

htpasswd -c ./auth myusername
cat auth
myusername:$apr1$78Jyn/1K$ERHKVRPPlzAX8eBtLuvRZ0

2. 创建一个K8S的secret:

kubectl create secret generic mysecret --from-file auth --namespace=monitoring 
kubectl --namespace=monitoring get secret mysecret 
NAME      TYPE    DATA    AGE 
mysecret Opaque   1      106d

3. 通过以下参数将创建的secret与ingress关联起来:

  • ingress.kubernetes.io/auth-type: "basic"
  • ingress.kubernetes.io/auth-secret: "mysecret"
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
 name: prometheus-dashboard
 namespace: monitoring
 annotations:
   kubernetes.io/ingress.class: traefik
   ingress.kubernetes.io/auth-type: "basic"
   ingress.kubernetes.io/auth-secret: "mysecret"
spec:
 rules:
 - host: dashboard.prometheus.example.com
   http:
     paths:
     - backend:
         serviceName: prometheus
         servicePort: 9090

最后创建就可以了:

kubectl create -f prometheus-ingress.yaml -n monitoring

参考:

https://docs.traefik.io/user-guide/kubernetes/

https://docs.traefik.io/configuration/backends/kubernetes/