# plank

plank is a tool for multi-tenant incident handling. Multi-tenancy in this case means handling alerts from any GCP project. Edge clusters will be a focal point, which may result in branching behavior, but such cases should be few and heavily scrutinized.

## Prereqs

### Infra

A [sandbox](https://docs.edge-infra.dev/dev/contributing/create-shared-dev-sandbox/) is recommended for testing, as plank relies on GCP infrastructure (PubSub and permissions) to function. These instructions assume that KCC is already running in the sandbox.

### AlloyDB

plank currently relies on [AlloyDB](https://cloud.google.com/alloydb) for its storage. Because this technology is still new, it cannot be fully automated through our existing means, so a number of steps must first be done manually. As these steps will change, follow the [official docs](https://cloud.google.com/alloydb/docs/quickstart/create-and-connect) to create a new database. Two values are needed from this process: the database `password`, which is given or generated at creation time, and the AlloyDB [instance URI](https://cloud.google.com/alloydb/docs/auth-proxy/connect#instance-uris).
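
As a minimal sketch, the instance URI can be assembled from its parts. The URI shape follows the instance URI documentation linked above; the project, region, cluster, and instance names below are placeholder assumptions, not values taken from this repository.

```shell
# Placeholder values (assumptions); substitute the ones from your AlloyDB setup.
PROJECT_ID="my-project"
REGION="us-central1"
CLUSTER_ID="plank-cluster"
INSTANCE_ID="plank-primary"

# AlloyDB instance URIs have the shape:
#   projects/<project>/locations/<region>/clusters/<cluster>/instances/<instance>
ALLOYDB_INSTANCE="projects/${PROJECT_ID}/locations/${REGION}/clusters/${CLUSTER_ID}/instances/${INSTANCE_ID}"
echo "${ALLOYDB_INSTANCE}"
```

Keeping the value in a variable such as `ALLOYDB_INSTANCE` makes it easy to reuse when creating the configuration.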

Optionally

### Create Configuration

Before the pallet can be deployed successfully, it needs a `ConfigMap` named `plank-config`. This contains the AlloyDB instance URI and the name of the PubSub Subscription to listen to. The instance URI can be constructed from the previous step. The Subscription name is currently always `plank-handler` unless manually changed in the [KCC resource](gcpinfra/sub.yaml). **NOTE** Once pallets support generalized rendering parameters, this may be any value.

```shell
kubectl create configmap plank-config -n plank --from-literal=ALLOYDB_INSTANCE=<instance> --from-literal=SUBID=plank-handler
```
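
Because a malformed instance URI only surfaces later at runtime, it can help to check its shape before creating the `ConfigMap`. The sketch below is one way to do that; `is_instance_uri` and `create_plank_config` are hypothetical helper names, and the check only looks at the overall path shape, not whether the instance exists.

```shell
# Hypothetical helper: returns 0 when the argument looks like an AlloyDB instance URI.
is_instance_uri() {
  case "$1" in
    projects/*/locations/*/clusters/*/instances/*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical helper: creates the ConfigMap only if the URI passes the shape check.
# Usage: create_plank_config <instance-uri> [subscription-name]
create_plank_config() {
  instance_uri="$1"
  sub_id="${2:-plank-handler}" # default Subscription name from this README
  if ! is_instance_uri "$instance_uri"; then
    echo "unexpected instance URI: $instance_uri" >&2
    return 1
  fi
  kubectl create configmap plank-config -n plank \
    --from-literal=ALLOYDB_INSTANCE="$instance_uri" \
    --from-literal=SUBID="$sub_id"
}
```

After creation, `kubectl get configmap plank-config -n plank -o yaml` shows the stored values.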

### Create Secret

In addition, a secret containing the AlloyDB password must be added to `Secret Manager`. **NOTE** Be sure you're targeting the right project!

```shell
gcloud secrets create plank-alloydb-secret --replication-policy="automatic" --project <project-id>
echo -n "<password>" | gcloud secrets versions add plank-alloydb-secret --data-file=- --project <project-id>
```
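
Since targeting the wrong project is an easy mistake, a quick check that the secret landed where expected may be worthwhile. This is a sketch only; `check_plank_secret` is a hypothetical helper name, and it relies on the standard `gcloud secrets describe` and `gcloud secrets versions list` commands.

```shell
# Hypothetical helper: confirm the secret exists in the given project
# and list its versions (the secret payload itself is never printed).
# Usage: check_plank_secret <project-id>
check_plank_secret() {
  project_id="$1"
  if [ -z "$project_id" ]; then
    echo "usage: check_plank_secret <project-id>" >&2
    return 2
  fi
  gcloud secrets describe plank-alloydb-secret --project "$project_id" &&
    gcloud secrets versions list plank-alloydb-secret --project "$project_id"
}
```

A non-zero exit status from `check_plank_secret <project-id>` suggests the secret is missing from that project or that you lack permission to view it.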

## Installation

### Apply Pallet

The steps below assume a sandbox infra cluster and a separate GKE cluster for plank itself. Any other valid pallet cluster topology will work, but the `lift` commands will vary.

Apply both the infra and runtime resources. An example of the full command:

```shell
lift apply --infra=true --infra-namespace=plank-testing-infra --infra-context=sandbox-o11y --gcp-project-id=ret-edge-o11y --cluster-uuid=plank-testing --cluster-provider=gke config/pallets/plank
```

### Create SQL Table

The database table needed by plank must be created manually for now. This step may be done as part of the prereqs, but it is easier to perform once plank has been deployed, so that the AlloyDB instance can be accessed via the auth proxy container. See more details on connecting in [troubleshooting](#alloydb-access).

Once a connection has been established, use a tool like [pgAdmin](https://www.pgadmin.org/) to run SQL statements against the deployed instance. Below is the current schema for incidents used by the plank binary.

```sql
CREATE TABLE incidents (
    incident_id VARCHAR ( 50 ) PRIMARY KEY,
    policy_name VARCHAR ( 255 ) NOT NULL,
    started_at int,
    ended_at int,
    project_id VARCHAR ( 50 ),
    cluster_id VARCHAR ( 50 ),
    url VARCHAR ( 255 ),
    state VARCHAR ( 50 ),
    location VARCHAR ( 50 ),
    namespace VARCHAR ( 50 ),
    pod VARCHAR ( 255 ),
    container VARCHAR ( 255 ),
    metrics_labels VARCHAR ( 255 )
);
```

## Troubleshooting

### external secrets runtime timeout

When first applying the pallet to a GKE cluster, the external secrets dependency may hit the default 60s timeout while waiting for all of its resources to be up and running. Example output is captured below.

```
waiting for runtime resources to become ready...
Object kexternal-secrets-cert-controller Status:False Reason:MinimumReplicasUnavailable Message:Deployment does not have minimum availability.
Object kexternal-secrets-webhook Status:False Reason:MinimumReplicasUnavailable Message:Deployment does not have minimum availability.
timeout waiting for: [Deployment/external-secrets/kexternal-secrets-webhook status: 'InProgress', Deployment/external-secrets/kexternal-secrets-cert-controller status: 'InProgress']
```

If this happens, run the `lift` command again; it will progress past this step if there are no other issues.

### plank infra timeout

When first applying the pallet, the infra resources for plank may hit the default 60s timeout while waiting for all resources to be up and running. Example output is captured below.

```
waiting for infrastructure resources to become ready...
Object plank-handler-subscriber Status:False Reason:DependencyNotReady Message:reference PubSubSubscription plank-testing-infra/plank-handler is not ready
timeout waiting for: [IAMPolicyMember/plank-testing-infra/plank-handler-subscriber status: 'InProgress']
```

If this happens, run the `lift` command again; it will progress past this step if there are no other issues.
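
Since the remedy for these timeouts is simply to re-run the command, the retry can be scripted. The sketch below is a generic shell wrapper, not a feature of `lift`; the `retry` function name, attempt count, and delay are illustrative assumptions.

```shell
# Hypothetical helper: re-run a command until it succeeds or attempts run out.
# Usage: retry <max-attempts> <delay-seconds> <command> [args...]
retry() {
  max="$1"
  delay="$2"
  shift 2
  attempt=1
  while :; do
    # Run the wrapped command; stop as soon as it succeeds.
    if "$@"; then
      return 0
    fi
    if [ "$attempt" -ge "$max" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    echo "attempt ${attempt} failed; retrying in ${delay}s" >&2
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}
```

For example, `retry 3 10 lift apply ...` (with the same flags as the full command in the installation section) would try up to three times with a ten second pause between attempts. A failure that persists across retries likely points to a different issue.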

### AlloyDB access

AlloyDB is under private preview, and one of its current limitations is that it cannot be publicly exposed via GCP configuration. As a workaround, you must use the [AlloyDB Auth proxy](https://cloud.google.com/alloydb/docs/auth-proxy/overview). While the documentation may be followed to deploy a separate instance, the plank pallet already deploys this component. To open a connection from your machine to the proxy, use the following `port-forward` command:

```shell
kubectl port-forward service/plank 5432:5432 -n plank
```

From there, using a PostgreSQL client tool (we recommend [pgAdmin](https://www.pgadmin.org/)), you can access the cluster via `localhost` with the given `port` and database `password`. Default values for everything else should suffice.
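
If you prefer a command-line client, the same connection can be sketched with `psql`. The snippet assumes the default `postgres` user and `postgres` database, which may not match your setup; `plank_db_uri` is a hypothetical helper name. The `PG*` variables are the standard libpq environment variables that `psql` reads.

```shell
# Assumed defaults (user and database "postgres"); adjust if yours differ.
PGHOST="localhost"
PGPORT="5432" # local side of the port-forward
PGUSER="postgres"
PGDATABASE="postgres"
export PGHOST PGPORT PGUSER PGDATABASE

# Hypothetical helper: print the connection URI for the forwarded port.
plank_db_uri() {
  printf 'postgresql://%s@%s:%s/%s\n' "$PGUSER" "$PGHOST" "$PGPORT" "$PGDATABASE"
}
```

With the port-forward running, `psql "$(plank_db_uri)"` should prompt for the database password and open a session against the instance.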