
Commit c03b9d5 (1 parent: 047c046)

Add network diagrams and restructure README.

10 files changed: +431 −396 lines

README.md

Lines changed: 26 additions & 396 deletions
Large diffs are not rendered by default.

docs/cluster-access.md

Lines changed: 123 additions & 0 deletions
# Accessing the Cluster

## Access the cluster using kubectl, continuous build pipelines, or other clients
If you've chosen to configure a _public_ Load Balancer for your Kubernetes Master(s) (i.e. `control_plane_subnet_access=public`, or `control_plane_subnet_access=private` _and_ `k8s_master_lb_access=public`), you can interact with your cluster using kubectl, continuous build pipelines, or any other client over the Internet. A working kubeconfig can be found in the `./generated` folder, or generated on the fly using the `kubeconfig` Terraform output variable.
```bash
# warning: 0.0.0.0/0 is wide open. Consider limiting HTTPS ingress to a smaller set of IPs.
$ terraform plan -var master_https_ingress=0.0.0.0/0
$ terraform apply -var master_https_ingress=0.0.0.0/0
# consider closing access off again using terraform apply -var master_https_ingress=10.0.0.0/16
```
```bash
$ export KUBECONFIG=`pwd`/generated/kubeconfig
$ kubectl cluster-info
$ kubectl get nodes
```
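As a quick sanity check once kubectl can reach the API, you can schedule and remove a throwaway pod (a minimal sketch; the pod name and the busybox image are arbitrary placeholders, not part of this project):

```bash
# Hypothetical smoke test: run a short-lived busybox pod and confirm it lands on a worker node.
$ kubectl run smoke-test --image=busybox --restart=Never --command -- sleep 60
$ kubectl get pod smoke-test -o wide
$ kubectl delete pod smoke-test
```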
If you've chosen to configure a strictly _private_ cluster (i.e. `control_plane_subnet_access=private` _and_ `k8s_master_lb_access=private`), access to the cluster will be limited to the NAT instance(s), similar to how you would use a bastion host, e.g.
```bash
$ terraform plan -var public_subnet_ssh_ingress=0.0.0.0/0
$ terraform apply -var public_subnet_ssh_ingress=0.0.0.0/0
$ terraform output ssh_private_key > generated/instances_id_rsa
$ chmod 600 generated/instances_id_rsa
$ scp -i generated/instances_id_rsa generated/instances_id_rsa opc@NAT_INSTANCE_PUBLIC_IP:/home/opc/
$ ssh -i generated/instances_id_rsa opc@NAT_INSTANCE_PUBLIC_IP
nat$ ssh -i /home/opc/instances_id_rsa opc@K8SMASTER_INSTANCE_PRIVATE_IP
master$ kubectl cluster-info
master$ kubectl get nodes
```
Note: for easier access, consider setting up an SSH tunnel between your local host and a NAT instance.
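For example, a minimal sketch of such a tunnel, assuming the placeholder addresses from the block above and that the master's API server listens on port 443 (as in the generated kubeconfig):

```bash
# Forward a local port through the NAT instance to the private master's API endpoint.
# NAT_INSTANCE_PUBLIC_IP and K8SMASTER_INSTANCE_PRIVATE_IP are placeholders, as above.
$ ssh -i generated/instances_id_rsa -N -L 6443:K8SMASTER_INSTANCE_PRIVATE_IP:443 opc@NAT_INSTANCE_PUBLIC_IP &
# Point kubectl at the forwarded port, e.g. by editing the server address in
# generated/kubeconfig to https://localhost:6443 (TLS certificate names permitting).
```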
## Access the cluster using Kubernetes Dashboard

Assuming `kubectl` has access to the Kubernetes Master Load Balancer, you can use `kubectl proxy` to access the Dashboard:
```bash
kubectl proxy &
open http://localhost:8001/ui
```
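On hosts without the macOS `open` command, browse to the URL directly or use an equivalent opener (a sketch; assumes a Linux desktop where `xdg-open` is available):

```bash
kubectl proxy &
xdg-open http://localhost:8001/ui
```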
## Verifying your cluster
If you've chosen to configure a public cluster, you can do a quick and automated verification of your cluster from your local machine by running the `cluster-check.sh` script located in the `scripts` directory. Note that this script requires your `KUBECONFIG` environment variable to be set (see above), and SSH and HTTPS access to be open to etcd and worker nodes.
To temporarily open SSH and HTTPS access for `cluster-check.sh`, add the following to your `terraform.tfvars` file:
```bash
# warning: 0.0.0.0/0 is wide open. remember to undo this.
etcd_ssh_ingress = "0.0.0.0/0"
master_ssh_ingress = "0.0.0.0/0"
worker_ssh_ingress = "0.0.0.0/0"
master_https_ingress = "0.0.0.0/0"
worker_nodeport_ingress = "0.0.0.0/0"
```
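After editing `terraform.tfvars`, apply the change before running the check (and remember to revert these ingress settings when you're done):

```bash
$ terraform plan
$ terraform apply
```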
```bash
$ scripts/cluster-check.sh
```
```
[cluster-check.sh] Running some basic checks on Kubernetes cluster....
[cluster-check.sh] Checking ssh connectivity to each node...
[cluster-check.sh] Checking whether instance bootstrap has completed on each node...
[cluster-check.sh] Checking Flannel's etcd key from each node...
[cluster-check.sh] Checking whether expected system services are running on each node...
[cluster-check.sh] Checking status of /healthz endpoint at each k8s master node...
[cluster-check.sh] Checking status of /healthz endpoint at the LB...
[cluster-check.sh] Running 'kubectl get nodes' a number of times through the master LB...

The Kubernetes cluster is up and appears to be healthy.
Kubernetes master is running at https://129.146.22.175:443
KubeDNS is running at https://129.146.22.175:443/api/v1/proxy/namespaces/kube-system/services/kube-dns
kubernetes-dashboard is running at https://129.146.22.175:443/ui
```
## SSH into OCI Instances

If you've chosen to launch your control plane instances in _public_ subnets (i.e. `control_plane_subnet_access=public`), you can open SSH access to your etcd, master, and worker nodes by adding the following to your `terraform.tfvars` file:
```bash
# warning: 0.0.0.0/0 is wide open. remember to undo this.
etcd_ssh_ingress = "0.0.0.0/0"
master_ssh_ingress = "0.0.0.0/0"
worker_ssh_ingress = "0.0.0.0/0"
```
```bash
# Create a local SSH private key file for logging into OCI instances
$ terraform output ssh_private_key > generated/instances_id_rsa
$ chmod 600 generated/instances_id_rsa
# Retrieve the public IPs for the etcd nodes
$ terraform output etcd_public_ips
# Log in as user opc on the Oracle Linux (OEL) OS
$ ssh -i `pwd`/generated/instances_id_rsa opc@ETCD_INSTANCE_PUBLIC_IP
# Retrieve the public IPs for the k8s masters
$ terraform output master_public_ips
$ ssh -i `pwd`/generated/instances_id_rsa opc@K8SMASTER_INSTANCE_PUBLIC_IP
# Retrieve the public IPs for the k8s workers
$ terraform output worker_public_ips
$ ssh -i `pwd`/generated/instances_id_rsa opc@K8SWORKER_INSTANCE_PUBLIC_IP
```
If you've chosen to launch your control plane instances in _private_ subnets (i.e. `control_plane_subnet_access=private`), you'll first need to SSH into a NAT instance, and from there to a worker, master, or etcd node:
```bash
$ terraform plan -var public_subnet_ssh_ingress=0.0.0.0/0
$ terraform apply -var public_subnet_ssh_ingress=0.0.0.0/0
$ terraform output ssh_private_key > generated/instances_id_rsa
$ chmod 600 generated/instances_id_rsa
$ terraform output nat_instance_public_ips
$ scp -i generated/instances_id_rsa generated/instances_id_rsa opc@NAT_INSTANCE_PUBLIC_IP:/home/opc/
$ ssh -i generated/instances_id_rsa opc@NAT_INSTANCE_PUBLIC_IP
nat$ ssh -i /home/opc/instances_id_rsa opc@PRIVATE_IP
```

docs/examples.md

Lines changed: 130 additions & 0 deletions
# Example Operations

## Deploying a new cluster using terraform apply
Override any of the input variables in your `terraform.tfvars` file and run the plan and apply commands:
```bash
# verify what will change
$ terraform plan

# deploy the cluster
$ terraform apply
```
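For reference, a hypothetical `terraform.tfvars` snippet using the worker/master count variables described in the sections below (the values shown are placeholders, not recommended defaults):

```bash
# Hypothetical terraform.tfvars overrides; adjust to your own capacity needs.
k8sMasterAd1Count = "1"
k8sWorkerAd1Count = "2"
k8sWorkerAd2Count = "2"
```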
## Scaling k8s workers (in or out) using terraform apply

To scale workers in or out, adjust the `k8sWorkerAd1Count`, `k8sWorkerAd2Count`, or `k8sWorkerAd3Count` input variables in `terraform.tfvars` and run the plan and apply commands:
```bash
# verify changes
$ terraform plan

# scale workers (use -target=module.instances-k8sworker-adX to only target workers in a particular AD)
$ terraform apply
```
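For example, to limit the apply to the workers in a single AD, as the comment above suggests (AD1 shown; substitute the module for your AD):

```bash
# Only touch the AD1 worker module during this apply.
$ terraform apply -target=module.instances-k8sworker-ad1
```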
When scaling worker nodes _up_, you will need to wait for the node initialization to finish asynchronously before the new nodes appear in `kubectl get nodes`.
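One way to watch for the new workers to register and become Ready (standard kubectl; nothing project-specific assumed):

```bash
# Watch node status until the new workers show up and report Ready.
$ kubectl get nodes -w
```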
When scaling worker nodes _down_, the `instances/k8sworker` module's `user_data` code will take care of running `kubectl drain` and `kubectl delete node` on the nodes being terminated.
## Scaling k8s masters (in or out) using terraform apply

To scale the masters in or out, adjust the `k8sMasterAd1Count`, `k8sMasterAd2Count`, or `k8sMasterAd3Count` input variables in `terraform.tfvars` and run the plan and apply commands:
```bash
# verify changes
$ terraform plan

# scale master nodes
$ terraform apply
```
Similar to the initial deployment, you will need to wait for the node initialization to finish asynchronously.
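One rough way to confirm a master is serving traffic again is to hit the `/healthz` endpoint that `cluster-check.sh` also probes (the address below is a placeholder; `-k` skips certificate verification for a quick check):

```bash
# Placeholder address; use your master LB public IP or a master node's address.
$ curl -k https://MASTER_LB_PUBLIC_IP:443/healthz
```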
## Scaling etcd nodes (in or out) using terraform apply

Scaling the etcd nodes in or out after the initial deployment is not currently supported. Terminating all the nodes in the etcd cluster will result in data loss.
## Replacing worker nodes using terraform taint

We can use `terraform taint` to mark worker instances in a particular AD as "tainted", which will cause them to be destroyed and recreated on the next apply. This can be a useful strategy for reverting local changes or regenerating a misbehaving worker.
```bash
# taint all workers in AD1
terraform taint -module=instances-k8sworker-ad1 oci_core_instance.TFInstanceK8sWorker
# optionally taint workers in AD2 and AD3 or do so in a subsequent apply
# terraform taint -module=instances-k8sworker-ad2 oci_core_instance.TFInstanceK8sWorker
# terraform taint -module=instances-k8sworker-ad3 oci_core_instance.TFInstanceK8sWorker

# preview changes
$ terraform plan

# replace workers
$ terraform apply
```
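If you taint the wrong resource, Terraform also provides an `untaint` command; the module-scoped form below mirrors the taint syntax above (verify the exact flags against your Terraform version's documentation):

```bash
# Clear an accidental taint before applying.
terraform untaint -module=instances-k8sworker-ad1 oci_core_instance.TFInstanceK8sWorker
```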
## Replacing masters using terraform taint

We can also use `terraform taint` to mark master instances in a particular AD as "tainted", which will cause them to be destroyed and recreated on the next apply. This can be a useful strategy for reverting local changes or regenerating a misbehaving master.
```bash
# taint all masters in AD1
terraform taint -module=instances-k8smaster-ad1 oci_core_instance.TFInstanceK8sMaster
# optionally taint masters in AD2 and AD3 or do so in a subsequent apply
# terraform taint -module=instances-k8smaster-ad2 oci_core_instance.TFInstanceK8sMaster
# terraform taint -module=instances-k8smaster-ad3 oci_core_instance.TFInstanceK8sMaster

# preview changes
$ terraform plan

# replace masters
$ terraform apply
```
## Upgrading cluster using the k8s_ver input variable

One way to upgrade your cluster is by incrementally changing the value of the `k8s_ver` input variable, first on your master nodes and then on your worker nodes.
```bash
# preview upgrade of all workers in AD1 to K8s 1.7.5
$ terraform plan -var k8s_ver=1.7.5 -target=module.instances-k8sworker-ad1

# perform upgrade/replace workers
$ terraform apply -var k8s_ver=1.7.5 -target=module.instances-k8sworker-ad1
```
The above apply will:

1. drain all worker nodes in AD1, shifting their workloads to your nodes in AD2 and AD3
2. destroy all worker nodes in AD1
3. re-create the worker nodes in AD1 using Kubernetes 1.7.5
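After the apply finishes and the replacement nodes register, you can confirm the kubelet version each node reports (standard kubectl output includes a VERSION column):

```bash
# The VERSION column should show v1.7.5 for the replaced AD1 workers.
$ kubectl get nodes
```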
If you have more than one worker in an AD, you can upgrade worker nodes individually using the subscript operator:
```bash
# preview upgrade of a single worker in AD1 to K8s 1.7.5
$ terraform plan -var k8s_ver=1.7.5 -target=module.instances-k8sworker-ad1.oci_core_instance.TFInstanceK8sWorker[1]

# perform upgrade/replace of the worker
$ terraform apply -var k8s_ver=1.7.5 -target=module.instances-k8sworker-ad1.oci_core_instance.TFInstanceK8sWorker[1]
```
Be sure to smoke test this approach on a stand-by cluster to weed out pitfalls and to ensure the scripts are compatible with the version of Kubernetes you are upgrading to. We have not tested versions of Kubernetes other than the current default.
## Replacing etcd cluster members using terraform taint

Replacing etcd cluster members after the initial deployment is not currently supported.
## Deleting a cluster using terraform destroy

```bash
$ terraform destroy
```
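Before committing to the destroy, you can preview exactly which resources will be removed (standard Terraform workflow, nothing repo-specific):

```bash
# Show the full set of resources terraform destroy would delete.
$ terraform plan -destroy
```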

docs/images/arch.jpg

[Binary image diffs not rendered (−6.39 KB; 104 KB)]
