CKA Certification Experience Report (certified on 2020-08-17)

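I. Background

    Because of the pandemic at the start of the year, I was lucky enough to be picked by my manager to evaluate private cloud platforms, even though I work in traditional IT and spend part of my time travelling around the country "firefighting". That laid the groundwork for the certification later on. I first built a highly available cluster using the kubeadm that ships with Kubernetes v1.17, then took charge of containerizing each module with Docker. Later, chatting with peers, I found that everyone was using the Rancher management platform, so I tore everything down and started over with Rancher 2.x. Quite a slog.

    As a container orchestration engine, k8s was born with a silver spoon in its mouth. Google created it and, with strong backing from Red Hat and others, donated it to the CNCF (Cloud Native Computing Foundation), which now hosts the project. CKA stands for Certified Kubernetes Administrator; the certification is authorized by the CNCF, the body that governs Kubernetes, and is the CNCF's official Kubernetes administrator credential. Alongside relevant work experience, passing it is one way for a candidate to demonstrate the ability to use and operate k8s. In recent years the Linux Foundation has set up an operations team in China (Chinese registration site: LF开源软件大学-Linux Foundation OSS University) and has also opened offline test centers. After registering you have one year to sit the exam, including one retake.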

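II. CKA curriculum (v1.18; moving to v1.19 in September)

    Networking 11%, Storage 7%, Security 12%, Cluster Maintenance 11%, Troubleshooting 10%, Core Concepts 19%, Installation, Configuration & Validation 12%, Logging & Monitoring 5%, Application Lifecycle Management 8%, Scheduling 5%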

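III. Preparation

    In May this year, the Linux Foundation and the CNCF offered 30% off CKA and CKAD registration, dropping the fee from RMB 2,088 to RMB 1,461.60. I grabbed the offer before the end of May. After that, while "firefighting" around the country, I worked on and off through the book 《每天5分钟玩转kubernetes》 and Zhang Lei's Geek Time (极客时间) course 《深入剖析kubernetes》 to deepen my understanding, and then practiced on real CKA questions found on GitHub.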

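IV. Scheduling the exam

    Exam scheduling site: https://portal.linuxfoundation.org/portal. Log in, pick the time you want to take the exam, and the site will match you to the nearest available slots.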

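V. Exam-day notes

  1.   Offline test centers are temporarily closed during the pandemic. The proctors are foreigners; ask them directly about anything you are unsure of, in Chinese or English, and they are very patient.
  2.   You need ID that shows your name in English. An ID card plus a credit card works, but a passport is best. (I have no passport, so I used my Hong Kong/Macau travel permit and a Visa credit card; after I explained, the proctor let me through.)
  3.   According to the official site, the questions change in September: the exam shrinks from 3 hours to 2 and the focus shifts, so it is best to pass before the end of August.
  4.   Book a morning slot, when the network is stable. Only after sitting an exam hosted overseas do you realize how much a good VPN matters.
  5.   Chrome must be the only program listed under Applications in Task Manager. The VPN can stay running in the background; mine connected directly to a line in a Los Angeles data center (here's hoping the "Clean Network" program the US is brewing falls through ^_^).
  6.   Only two tabs may be open during the exam: the PSI exam interface and one kubernetes.io tab. No third tab is allowed, including tabs for third-party links found in the official docs. (A fellow candidate opened the Notepad that ships with Windows; the proctor treated it as a violation and suspended the exam for half an hour, which wrecked their composure, and they failed.)
  7.   When practicing on real questions, aim to finish within 2 hours; otherwise the exam will be a close call, because the exam terminal and notes tool are not pleasant to use and respond slowly.
  8.   Be sure to bookmark YAML examples for the common objects in the curriculum (creating a Pod, initContainer, Secret, DaemonSet, Deployment and so on), so that during the exam you can paste them and edit in place.
  9.   The questions span several k8s clusters: most stay on one cluster, and a few are on clusters such as ik8s, vk8s and bk8s.
  10.   Pay close attention to the details of each question: names, images, directories and so on. I suggest pasting the YAML first and then, following the question, pasting in the name, namespace, image, container name, labels and so on. Click a keyword with the mouse, then copy with Ctrl+Insert and paste with Shift+Insert.
  11.   The exam is nominally 3 hours, but be prepared for 4 hours without a toilet break or a drink, because you may hit unexplained network drops, plus the very thorough environment check by the overseas PSI proctor.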

VI. August exam questions

1. Logs: kubectl logs

# Set configuration context: $ kubectl config use-context k8s
# Monitor the logs of Pod foobar and
# Extract log lines corresponding to error file-not-found
# Write them to /opt/KULM00201/foobar
kubectl logs foobar | grep file-not-found > /opt/KULM00201/foobar

2. Sorting output: --sort-by=.metadata.name

# List all PVs sorted by name, saving the full kubectl output to /opt/KUCC0010/my_volumes.
# Use kubectl's own functionality for sorting the output, and do not manipulate it any further.
# PVs are cluster-scoped, so no namespace flag is needed.
kubectl get pv --sort-by=.metadata.name > /opt/KUCC0010/my_volumes

3. Deploying a DaemonSet

# Ensure a single instance of Pod nginx is running on each node of the kubernetes cluster where nginx also represents 
# the image name which has to be used. Do not override any taints currently in place.
# Use Daemonsets to complete this task and use ds.kusc00201 as Daemonset name
# Relevant docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
# Copy the example from the docs down to the line image: gcr.io/fluentd-elasticsearch/fluentd:v2.5.1, delete the tolerations field, then edit the YAML to match the task.

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: ds.kusc00201
  namespace: kube-system
  labels:
    k8s-app: fluentd-logging
spec:
  selector:
    matchLabels:
      name: fluentd-elasticsearch
  template:
    metadata:
      labels:
        name: fluentd-elasticsearch
    spec:
      containers:
      - name: fluentd-elasticsearch
        image: nginx

4. initContainers

# Add an init container to lumpy--koala (Which has been defined in spec file /opt/kucc00100/pod-spec-KUCC00100.yaml)
# The init container should create an empty file named /workdir/calm.txt
# If /workdir/calm.txt is not detected, the Pod should exit
# Once the spec file has been updated with the init container definition, the Pod should be created.
The task already provides the YAML file; you only need to add the initContainers section and the emptyDir: {} volume.
Init container docs: https://kubernetes.io/docs/concepts/workloads/pods/init-containers/

apiVersion: v1
kind: Pod
metadata:
  name: lumpy--koala
  labels:
    app: myapp
spec:
  containers:
  - name: myapp-con
    image: nginx
    command: ['sh', '-c', 'echo The app is running! && sleep 3600']
    volumeMounts: # mount point for the shared volume
    - name: data
      mountPath: /workdir

    livenessProbe:  # health check
      exec:
        command:
        - cat
        - /workdir/calm.txt

  initContainers:
  - name: init-myservice
    image: busybox:1.28
    command: ['sh', '-c', "touch /workdir/calm.txt"]

    volumeMounts: # mount point for the shared volume
    - name: data
      mountPath: /workdir

  volumes: # empty (emptyDir) volume
  - name: data
    emptyDir: {}

5. Multi-container Pod

# Create a pod named kucc4 with a single container for each of the following images running inside
#(there may be between 1 and 4 images specified): nginx + redis + memcached + consul
apiVersion: v1
kind: Pod
metadata:
  name: kucc4
spec:
  containers:
  - name: nginx
    image: nginx
  - name: redis
    image: redis
  - name: memcached
    image: memcached
  - name: consul
    image: consul

6. nodeSelector

# Schedule a Pod as follows:
# Name: nginx-kusc00101
# Image: nginx
# Node selector: disk=ssd
apiVersion: v1
kind: Pod
metadata:
  name: nginx-kusc00101
  labels:
    env: test
spec:
  containers:
  - name: nginx
    image: nginx
    imagePullPolicy: IfNotPresent
  nodeSelector:
    disk: ssd

7. Deployment upgrade and rollback (set image --record, rollout undo)

# Create a deployment as follows
# Name: nginx-app
# Using container nginx with version 1.10.2-alpine
# The deployment should contain 3 replicas
# Next, deploy the app with new version 1.13.0-alpine by performing a rolling update and record that update.
# Finally, rollback that update to the previous version 1.10.2-alpine
kubectl create deployment nginx-app --image=nginx:1.10.2-alpine
kubectl scale deploy nginx-app --replicas=3
kubectl set image deploy nginx-app nginx=nginx:1.13.0-alpine --record  # record the change
kubectl rollout history deploy nginx-app  # view the revision history
kubectl rollout undo deploy nginx-app --to-revision=1  # roll back to the previous version
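
To double-check the rollback, one option is to read the image back from the Deployment spec. This is only a sketch: `deployment_image` is a hypothetical helper name, and it assumes the Deployment has a single container.

```shell
# deployment_image NAME: print the image of the first container in the
# Deployment's Pod template (hypothetical helper, not part of kubectl).
deployment_image() {
    kubectl get deployment "$1" \
        -o jsonpath='{.spec.template.spec.containers[0].image}'
}
```

Once the rollback has finished, `deployment_image nginx-app` should print the 1.10.2-alpine image again.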

8. NodePort

# Create and configure the service front-end-service so it’s accessible through NodePort 
# and routes to the existing pod named front-end
kubectl expose pod front-end --name=front-end-service --port=80 --type=NodePort

9. Namespace

# Create a Pod as follows:
# Name: jenkins
# Using image: jenkins
# In a new Kubernetes namespace named website-frontend
kubectl create ns website-frontend

kubectl run jenkins --image=jenkins -n website-frontend

# Equivalent manifest (object names must be lowercase):
apiVersion: v1
kind: Pod
metadata:
  name: jenkins
  namespace: website-frontend
spec:
  containers:
  - name: jenkins
    image: jenkins

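10. Generating a spec file (kubectl run / create with --dry-run)

# Create a deployment spec file that will:
# Launch 7 replicas of the redis image with the label: app_env_stage=dev
# Deployment name: kual0020
# Save a copy of this spec file to /opt/KUAL00201/deploy_spec.yaml (or .json)
# When you are done, clean up (delete) any new k8s API objects that you produced during this task 
# Since kubectl v1.18, kubectl run only creates Pods, so generate the Deployment spec with kubectl create deployment instead.
kubectl create deployment kual00201 --image=redis --dry-run=client -o yaml > /opt/KUAL00201/deploy_spec.yaml
# Then edit the file: set replicas: 7 and replace the generated app label with app_env_stage: dev
# (in metadata.labels, spec.selector.matchLabels and spec.template.metadata.labels).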

11. Finding Pods by a Service's selector

# Create a file /opt/KUCC00302/kucc00302.txt that lists all pods that implement Service foo in Namespace production.
# The format of the file should be one pod name per line
# Look up the Service's selector (shown in the SELECTOR column)
kubectl get svc foo -n production -o wide
# Use that selector (app=foo here is a placeholder) to list the Pods, one name per line
kubectl get pods -n production -l app=foo --no-headers | awk '{print $1}' > /opt/KUCC00302/kucc00302.txt
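
An alternative that avoids reading the selector by hand is to ask the Service's Endpoints object which Pods back it. This is a sketch rather than the approach used above: `pods_behind_service` is a hypothetical helper, and it only lists Pods that are currently ready, since unready Pods are kept out of the `addresses` list.

```shell
# pods_behind_service SERVICE NAMESPACE: print one backing Pod name per line.
pods_behind_service() {
    kubectl get endpoints "$1" -n "$2" \
        -o jsonpath='{range .subsets[*].addresses[*]}{.targetRef.name}{"\n"}{end}' |
        sort -u   # a Pod can appear in more than one subset
}
```

For this task it could be used as `pods_behind_service foo production > /opt/KUCC00302/kucc00302.txt`.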

12. emptyDir

# Create a pod as follows:
# Name: non-persistent-redis
# Container image: redis
# Named-volume with name: cache-control
# Mount path: /data/redis
# It should launch in the pre-prod namespace and the volume MUST NOT be persistent.
The task does not ask for a specific location on the node, so use emptyDir: {}, which lives in a kubelet-managed directory. If it had asked for a specific path on the host, hostPath would be the choice.
1. Create the pre-prod namespace
kubectl create ns pre-prod
2. Create the YAML file as follows:
apiVersion: v1
kind: Pod
metadata:
  name: non-persistent-redis
  namespace: pre-prod
spec:
  containers:
  - image: redis
    name: redis
    volumeMounts:
    - mountPath: /data/redis
      name: cache-control
  volumes:
  - name: cache-control
    emptyDir: {}

13. Scaling a Deployment

kubectl scale deployment website --replicas=6

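14. Counting schedulable nodes

# Check to see how many nodes are ready (not including nodes tainted NoSchedule) and write the number to /opt/nodenum
1. Count the Ready nodes and call the result N (grep -w matches whole words, so NotReady is excluded)
kubectl get node | grep -w Ready | wc -l
2. Count the nodes tainted NoSchedule and call the result M
kubectl describe nodes | grep Taints | grep -i noschedule | wc -l
3. Write N minus M to /opt/nodenum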
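
The subtraction above can go wrong when a NotReady node also carries a NoSchedule taint, because that node is subtracted without ever having been counted. A per-node check avoids this. The following is a minimal sketch: `count_schedulable_nodes` is a hypothetical helper, and like the commands above it only inspects the first `Taints:` line of `kubectl describe`.

```shell
# count_schedulable_nodes: print the number of Ready nodes that do not
# show a NoSchedule taint on their "Taints:" line.
count_schedulable_nodes() {
    count=0
    # Names of nodes whose STATUS column starts with "Ready" (skips NotReady)
    for node in $(kubectl get nodes --no-headers | awk '$2 ~ /^Ready/ { print $1 }'); do
        # Skip the node if its Taints line mentions NoSchedule
        if kubectl describe node "$node" | grep 'Taints:' | grep -qi 'noschedule'; then
            continue
        fi
        count=$((count + 1))
    done
    echo "$count"
}
```

Usage would be `count_schedulable_nodes > /opt/nodenum`.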

15. kubectl top

# From the Pod label name=cpu-utilizer, find pods running high CPU workloads 
# and write the name of the Pod consuming most CPU to the file /opt/cpu.txt (which already exists)
kubectl top pods -l name=cpu-utilizer --sort-by=cpu
# Write the name of the first Pod listed (highest CPU) to /opt/cpu.txt
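
If you would rather not copy the name by hand, the output of `kubectl top pods` can be filtered. This is a sketch under two assumptions: `top_cpu_pod` is a hypothetical helper, and the CPU column is reported in millicores (for example `250m`).

```shell
# top_cpu_pod: read `kubectl top pods` output on stdin and print the name
# of the Pod with the highest CPU value.
top_cpu_pod() {
    awk 'NR > 1 {
        cpu = $2
        sub(/m$/, "", cpu)   # strip the millicore suffix
        if (name == "" || cpu + 0 > max) { max = cpu + 0; name = $1 }
    }
    END { print name }'
}
```

For example: `kubectl top pods -l name=cpu-utilizer | top_cpu_pod > /opt/cpu.txt`.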

16. Node NotReady

# A Kubernetes worker node, labelled with name=wk8s-node-0 is in state NotReady .
# Investigate why this is the case, and perform any appropriate steps to bring the node to a Ready state, 
# ensuring that any changes are made permanent.
kubectl get nodes | grep NotReady
ssh node  
systemctl status kubelet
systemctl start kubelet   
systemctl enable kubelet

17. Creating a PV

# Create a persistent volume with name app-config of capacity 1Gi and access mode ReadWriteOnce. 
# The type of volume is hostPath and its location is /srv/app-config
# https://kubernetes.io/docs/tasks/configure-pod-container/configure-persistent-volume-storage/#create-a-persistentvolume

apiVersion: v1
kind: PersistentVolume
metadata:
  name: app-config
spec:
  capacity:
    storage: 1Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  storageClassName: slow
  hostPath:
    path: /srv/app-config

18. etcd backup

# Create a snapshot of the etcd instance running at https://127.0.0.1:2379 saving the snapshot to the 
# file path /data/backup/etcd-snapshot.db
# The etcd instance is running etcd version 3.1.10
# The following TLS certificates/key are supplied for connecting to the server with etcdctl
# CA certificate: /opt/KUCM00302/ca.crt
# Client certificate: /opt/KUCM00302/etcd-client.crt
# Client key: /opt/KUCM00302/etcd-client.key
General form:
ETCDCTL_API=3 etcdctl --endpoints=https://127.0.0.1:2379 --cacert=ca.pem --cert=server.pem --key=server-key.pem snapshot save <given path>
Backup for this task:
ETCDCTL_API=3 /usr/bin/etcdctl snapshot save /data/backup/etcd-snapshot.db \
  --endpoints=https://127.0.0.1:2379 \
  --cacert=/opt/KUCM00302/ca.crt \
  --cert=/opt/KUCM00302/etcd-client.crt \
  --key=/opt/KUCM00302/etcd-client.key
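
It can be reassuring to confirm that the snapshot file is readable before moving on. The sketch below wraps the save in a hypothetical `etcd_backup` helper and follows it with `etcdctl snapshot status`; the directory layout (ca.crt, etcd-client.crt and etcd-client.key in one directory) is taken from this task and is an assumption anywhere else.

```shell
# etcd_backup CERT_DIR SNAPSHOT_FILE: save a snapshot, then print its status.
etcd_backup() {
    ETCDCTL_API=3 etcdctl \
        --endpoints=https://127.0.0.1:2379 \
        --cacert="$1/ca.crt" \
        --cert="$1/etcd-client.crt" \
        --key="$1/etcd-client.key" \
        snapshot save "$2" &&
    # status reads the local file, so no endpoint or TLS flags are needed
    ETCDCTL_API=3 etcdctl snapshot status "$2" --write-out=table
}
```

For this task: `etcd_backup /opt/KUCM00302 /data/backup/etcd-snapshot.db`.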

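19. Node maintenance (drain, cordon, uncordon)

# Set the node labelled with name=ek8s-node-1 as unavailable and reschedule all the pods running on it.
First switch context to the ek8s cluster
kubectl get nodes -l name=ek8s-node-1
kubectl drain ek8s-node-1  # use the node name returned by the previous command
# Some people report that the command fails unless these flags are added; I did not hit that myself
# --ignore-daemonsets=true --delete-local-data=true --force=true

# Normal procedure for taking a node offline:
# 1 cordon: mark the node unschedulable
# 2 drain: evict the Pods on the node
# 3 delete node
kubectl cordon  k8s-node2
kubectl drain k8s-node2 --ignore-daemonsets --force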

20. Service DNS

# Create a deployment as follows
# Name: nginx-dns
# Exposed via a service: nginx-dns
# Ensure that the service & pod are accessible via their respective DNS records
# The container(s) within any Pod(s) running as a part of this deployment should use the nginx image
# Next, use the utility nslookup to look up the DNS records of the service & pod and write the output to /opt/service.dns and /opt/pod.dns respectively.
# Ensure you use the busybox:1.28 image (or earlier) for any testing, as the latest release has an upstream bug which impacts the use of nslookup.
Step 1: create the deployment (since kubectl v1.18, kubectl run only creates a Pod)
kubectl create deployment nginx-dns --image=nginx
Step 2: expose it as a service
kubectl expose deployment nginx-dns --name=nginx-dns --port=80 --type=NodePort
Step 3: look up the Pod IP
kubectl get pods -o wide    (get the Pod's IP, for example 10.244.1.37)
Step 4: test with busybox 1.28
kubectl run busybox -it --rm --image=busybox:1.28 sh
/ # nslookup nginx-dns      # look up the record for the nginx-dns service
/ # nslookup 10.244.1.37    # look up the record for the Pod
Step 5:
Write the results into the files the task asks for, /opt/service.dns and /opt/pod.dns
# This question is ambiguous, so just write everything you found into the files. Whether it scores is down to luck, so err on the side of completeness.
1. For nginx-dns
echo 'Name: nginx-dns' >> /opt/service.dns
echo 'Address 1: 10.96.0.10 kube-dns.kube-system.svc.cluster.local' >> /opt/service.dns
2. For the Pod
echo 'Name:      10.244.1.37' >> /opt/pod.dns
echo 'Address 1: 10.244.1.37 10-244-1-37.nginx-dns.default.svc.cluster.local' >> /opt/pod.dns
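
Besides the reverse lookup by IP, a Pod also has a forward DNS name built from its IP address with the dots replaced by dashes, followed by the namespace and `pod.<cluster domain>`. The helper below is a sketch: `pod_dns_name` is a hypothetical name, and `cluster.local` is assumed as the cluster domain.

```shell
# pod_dns_name POD_IP [NAMESPACE]: print the Pod's DNS name,
# assuming the default cluster domain cluster.local.
pod_dns_name() {
    printf '%s.%s.pod.cluster.local\n' \
        "$(printf '%s' "$1" | tr '.' '-')" "${2:-default}"
}
```

The printed name can then be passed to nslookup inside the busybox Pod as an extra check on the Pod record.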

21. Mounting a Secret

# Create a Kubernetes Secret as follows:
# Name: super-secret
# Credential: alice  or username:bob 
# Create a Pod named pod-secrets-via-file using the redis image which mounts a secret named super-secret at /secrets
# Create a second Pod named pod-secrets-via-env using the redis image, which exports credential as TOPSECRET
Docs: https://kubernetes.io/zh/docs/concepts/configuration/secret/#%E8%AF%A6%E7%BB%86
echo -n "bob" | base64

apiVersion: v1
kind: Secret
metadata:
  name: super-secret
type: Opaque
data:
  username: Ym9i  # echo -n "bob" | base64
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-file
spec:
  containers:
  - name: mypod
    image: redis
    volumeMounts:
    - name: foo
      mountPath: "/secrets"
      readOnly: true
  volumes:
  - name: foo
    secret:
      secretName: super-secret
---
apiVersion: v1
kind: Pod
metadata:
  name: pod-secrets-via-env
spec:
  containers:
  - name: mycontainer
    image: redis
    env:
    - name: TOPSECRET
      valueFrom:
        secretKeyRef:
          name: super-secret
          key: username
  restartPolicy: Never
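
The `-n` in `echo -n "bob" | base64` matters: without it the trailing newline is encoded as well, and the Secret ends up holding "bob" plus a newline. A small sketch that sidesteps the issue with printf (`encode_secret_value` is a hypothetical helper name):

```shell
# encode_secret_value VALUE: base64-encode VALUE without a trailing newline.
encode_secret_value() {
    printf '%s' "$1" | base64
}
```

`encode_secret_value bob` prints Ym9i, whereas `echo bob | base64` prints Ym9iCg==.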

22. Static Pod: --pod-manifest-path

# Configure the kubelet systemd managed service, on the node labelled with name=wk8s-node-1, 
# to launch a Pod containing a single container of image nginx named myservice automatically. 
# Any spec files required should be placed in the /etc/kubernetes/manifests directory on the node.
The manifest goes in /etc/kubernetes/manifests (the task gives the path)
1. ssh to the node, then sudo -i
2. vi /etc/kubernetes/manifests/static-pod.yaml
   Define a Pod
3. systemctl status kubelet    (shows the path of kubelet.service)
4. vi /etc/systemd/system/kubelet.service    (use the path reported in step 3)
   Check for --pod-manifest-path=/etc/kubernetes/manifests
   and add it if it is missing
5. systemctl daemon-reload
   systemctl restart kubelet.service
   systemctl enable kubelet
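
For the "define a Pod" step, the manifest can also be written with a here-document instead of vi. This is a sketch: `write_static_pod` is a hypothetical helper, and using myservice for both the Pod name and the container name is my reading of the task, not something it states explicitly.

```shell
# write_static_pod DIR: write a minimal nginx static Pod manifest into DIR.
write_static_pod() {
    cat > "$1/static-pod.yaml" <<'EOF'
apiVersion: v1
kind: Pod
metadata:
  name: myservice
spec:
  containers:
  - name: myservice
    image: nginx
EOF
}
```

On the node this would be run as `write_static_pod /etc/kubernetes/manifests`.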

23. Troubleshooting a cluster

# Determine the node, the failing service and take actions to bring up the failed service 
# and restore the health of the cluster. Ensure that any changes are made permanently.
# The worker node in this cluster is labelled with name=bk8s-node-0

ps -ef|grep kubelet

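Look in the YAML file named by --config=/var/lib/kubelet/config.yaml for a static Pod path. If none is set there either, the kubelet will not create static Pods automatically, because pod-manifest-path has no default value. cat /var/lib/kubelet/config.yaml

There is no static Pod path parameter in the file, so append staticPodPath: /etc/kubernetes/manifests at the end and then run:

systemctl restart kubelet
systemctl enable kubelet

Check the node again and it should be fine.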
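
The edit to the kubelet config can be scripted so that it is safe to run twice. A minimal sketch, assuming the file ends with a newline and that `staticPodPath` would appear as a top-level key; `ensure_static_pod_path` is a hypothetical helper name.

```shell
# ensure_static_pod_path CONFIG_FILE: append staticPodPath if it is not set yet.
ensure_static_pod_path() {
    if ! grep -q '^staticPodPath:' "$1"; then
        echo 'staticPodPath: /etc/kubernetes/manifests' >> "$1"
    fi
}
```

On the node: `ensure_static_pod_path /var/lib/kubelet/config.yaml`, followed by the kubelet restart.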

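24. Deploying a cluster with kubeadm

Requirement:

   Two nodes, master1 and node1, plus an admin.conf file are provided; deploy the cluster.

Steps:

1. ssh to the master node
2. Install the required components and enable kubelet at boot
Official docs (https://kubernetes.io/zh/docs/setup/production-environment/tools/kubeadm/install-kubeadm/#%E5%AE%89%E8%A3%85-kubeadm-kubelet-%E5%92%8C-kubectl)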
sudo apt-get update && sudo apt-get install -y apt-transport-https curl
curl -s https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -  
cat <<EOF | sudo tee /etc/apt/sources.list.d/kubernetes.list 
deb https://apt.kubernetes.io/ kubernetes-xenial main
EOF
sudo apt-get update
sudo apt-get install -y kubelet kubeadm kubectl
sudo apt-mark hold kubelet kubeadm kubectl
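3. Initialize the master node
kubeadm init --config /etc/kubeadm.conf --ignore-preflight-errors=all  # both the ignore-errors flag and the config file are provided; the config file needs no changes. Read the task carefully!
4. Copy the worker join command
5. Switch back to the student host
6. ssh to the node host
7. Install the required components and enable kubelet at boot
8. Paste the join command
9. Switch back to the student host
10. ssh to the master node
11. Check that the worker has joined
12. Install the network plugin from the master node
Official docs (https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#pod-network)
kubectl apply -f https://docs.projectcalico.org/v3.14/manifests/calico.yaml
13. Check that all nodes are Ready
14. Switch back to the student host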
 
Posted 2020-08-22 21:58 by Aaron-Ye