Cassandra basic docs? #26

Open
paralin opened this issue Jun 6, 2016 · 22 comments

@paralin
paralin commented Jun 6, 2016
Hey,

Could you potentially make a short readme on how to set up the cassandra petset? Just to get me up to speed? I'm reading through it right now and it seems to make sense except for:

$ kubectl create -f cassandra-petset-local.yaml
unable to decode "cassandra-petset-local.yaml": quantities must match the regular expression '^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$'

Just a short doc with how you usually go about testing it would be nice. Nothing fancy or polished.

Thanks.

@paralin
Author

paralin commented Jun 6, 2016

I fixed the document by putting quotes around any value that starts with a number but ends with (or contains) a letter, e.g. 512M.

The basic idea is:

$ kubectl create -f cassandra-service.yaml
$ kubectl create -f cassandra-petset-local.yaml

Wait for the first node, "cassandra-0", to be ready, then change "replicas" to 3 or so on the PetSet. As the nodes become stable, edit each of their pod definitions and flip "initialized" to false, which will trigger the node to enter the quorum and go from "UP" to "NORMAL".
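A rough sketch of those steps as kubectl commands (the PetSet and pod names, and the use of `kubectl patch`/`kubectl annotate` rather than interactive edits, are assumptions based on this thread, not a tested procedure):

```shell
# Scale the PetSet from 1 to 3 replicas (field names per the 1.3 alpha API)
$ kubectl patch petset cassandra -p '{"spec": {"replicas": 3}}'

# Once a new pod is stable, flip the alpha annotation as described above
$ kubectl annotate pod cassandra-1 pod.alpha.kubernetes.io/initialized="false" --overwrite
```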

Works very well. Great work!

@chrislovecnm
Member

Thanks @paralin, that was actually an issue I was working through that was introduced in v1.3.0-alpha.5. I am going to file a bug with the Kubernetes folks. I will update either the Cassandra example in the kubernetes project or the documentation here.

@chrislovecnm
Member

@paralin I don't think you have to wait for a node to be ready. I now have an environment where I can test more, but I have launched 3 nodes at the same time.

@paralin
Author

paralin commented Jun 6, 2016

I used a linter to check the YAML and it is actually valid, but for some reason Kubernetes doesn't read it properly. That makes sense, since the integer at the beginning of the value suggests a number, so it gets read as one; it's not too big of a deal to put quotes around the values.
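The regular expression from the error message can be checked directly; every value in question satisfies the pattern on its own, which supports the idea that the failure comes from how the client decodes unquoted YAML scalars rather than from the values themselves. A quick sketch in Python:

```python
import re

# Quantity pattern copied from the kubectl error message in this thread.
QUANTITY_RE = re.compile(r'^([+-]?[0-9.]+)([eEinumkKMGTP]*[-+]?[0-9]*)$')

for value in ["512M", "100M", "1m", "1"]:
    # All of these match, so the strings are valid quantities; the trouble
    # is what type the client assigns them before validation runs.
    print(value, bool(QUANTITY_RE.match(value)))
```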

@chrislovecnm What does setting initialized to false actually do? How do you control what it does?

Also, we need to mint a new image for Cassandra that uses dumb-init; the current one does not handle SIGTERM properly, which means Cassandra will always wait out the 30-second grace period and then be killed ungracefully.

@chrislovecnm
Member

@chrislovecnm What does setting initialized to false actually do? How do you control what it does?

I am guessing that you are referring to

pod.alpha.kubernetes.io/initialized: "true"

It is my understanding that changing it to false will only launch one pod. Something else that you will want in your documentation @bprashanth :)
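For reference, the annotation sits in the PetSet's pod template metadata; a minimal fragment (the surrounding field layout is assumed from the k8s 1.3 alpha examples) would look like:

```yaml
apiVersion: apps/v1alpha1
kind: PetSet
metadata:
  name: cassandra
spec:
  template:
    metadata:
      annotations:
        pod.alpha.kubernetes.io/initialized: "true"   # quoted: must be a string
```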

@chrislovecnm
Member

chrislovecnm commented Jun 6, 2016

Thanks for the catch regarding stopping. Would we have the same behavior in Alpine? I have been wanting to move this over to a small image, minimal Ubuntu or Alpine. @mward29 has done some work with Alpine already, but will we have the same issue?

@paralin
Author

paralin commented Jun 6, 2016

Alpine is fine; you can just download the precompiled dumb-init binary in a RUN step. But yes, you would need dumb-init.
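On Alpine that could look something like the Dockerfile fragment below. This is a sketch: the dumb-init version, release URL, and `/run.sh` entrypoint are illustrative assumptions, not verified against the project's actual releases or the existing image.

```dockerfile
FROM alpine:3.4
# Fetch a prebuilt dumb-init binary in a RUN step (version/URL are examples)
RUN wget -O /usr/local/bin/dumb-init \
        https://github.com/Yelp/dumb-init/releases/download/v1.0.2/dumb-init_1.0.2_amd64 \
    && chmod +x /usr/local/bin/dumb-init
# dumb-init runs as PID 1 and forwards SIGTERM to the child process,
# so Cassandra can shut down within the grace period
ENTRYPOINT ["/usr/local/bin/dumb-init", "--"]
CMD ["/run.sh"]
```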

@chrislovecnm
Member

Opened another issue about the init issue.

@chrislovecnm chrislovecnm self-assigned this Jun 6, 2016
@chrislovecnm
Member

@paralin What did you do to fix the YAML? We need to file this as a bug with k8s.

@paralin
Author

paralin commented Jun 6, 2016

Actually, there's no bug; it's just a problem with your "resources.limits": the CPU value should be something like "1m", not just "1". I removed the section entirely.
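If the limits section is kept rather than removed, quoting the CPU value (or using a millicore string) sidesteps the decoding problem; a sketch of the fragment:

```yaml
resources:
  limits:
    cpu: "1"      # quoted, or use a millicore value such as "500m"
```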

@paralin
Author

paralin commented Jun 6, 2016

Patch here:

diff --git a/pet-race-devops/k8s/cassandra/cassandra-petset-local.yaml b/pet-race-devops/k8s/cassandra/cassandra-petset-local.yaml
index d65f79e..c651116 100644
--- a/pet-race-devops/k8s/cassandra/cassandra-petset-local.yaml
+++ b/pet-race-devops/k8s/cassandra/cassandra-petset-local.yaml
@@ -17,7 +17,6 @@ spec:
       containers:
       - name: cassandra
         image: gcr.io/google-samples/cassandra:v9
-        #image: 10.100.179.231:5000/cassandra
         imagePullPolicy: Always
         command:
           - /run.sh
@@ -33,14 +32,11 @@ spec:
         # If you need it it is going away in C* 4.0
         #- containerPort: 9160
         #  name: thrift
-        resources:
-          limits:
-            cpu: 1
         env:
           - name: MAX_HEAP_SIZE
-            value: 512M
+            value: "512M"
           - name: HEAP_NEWSIZE
-            value: 100M
+            value: "100M"
           - name: POD_NAMESPACE
             valueFrom:
               fieldRef:

@chrislovecnm
Member

yah ... that is a bug ...

@chrislovecnm
Member

The issue with the quoting is with the cpu limit. See kubernetes/kubernetes#26898

@paralin
Author

paralin commented Jun 10, 2016

@chrislovecnm Can we maybe start to outline how to set up a multiple datacenter cluster with petsets?

@chrislovecnm
Member

chrislovecnm commented Jun 10, 2016

@paralin just thinking about that actually ... Literally. Maybe you can help with the design. Spanning more than one Kubernetes instance does not work: all DCs, racks, and the cluster need to be in the same k8s instance, and your clients need to be inside it as well. The reason it does not work is a proxying problem. You cannot use load balancers, every C* node needs to talk to every other C* node, and only nodes inside a single k8s cluster can communicate with each other.

There are a few different challenges:

  1. The seed provider is not DC aware. You need a seed in every DC, and all nodes need to see the other seeds. That is the fun one.
  2. Do you know if a Service can talk to multiple applications? I probably need to file an issue to see if a service selector can be set up for multiple apps to talk to it. If we had multiple services and modified the seed provider, we could do that as well.
  3. Setting up a snitch can depend on your cloud deployment. With affinity rules or node labels you can pretty much have a balanced deployment across multiple zones in AWS or GCE.
  4. We could hack in a SimpleSnitch by accessing node labels from the pod itself.
  5. Rolling restarts are not supported yet. Will be soon.

I will probably open an issue in the kubernetes project and continue the discussion there.

tl;dr: DCs across multiple k8s instances are not supported, and you cannot use load balancers. We need to file an issue about this with the Kubernetes project. The seed provider needs some design and some work to support multiple DCs. Multiple racks will probably work, but that is not tested.
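On point 2, a Service selector matches labels rather than a single app, so one headless Service could front seed pods from several DCs if they share a label. A hypothetical sketch (the service name and label are invented for illustration):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: cassandra-seeds
spec:
  clusterIP: None            # headless: DNS returns the pod IPs directly
  selector:
    cassandra-seed: "true"   # put this label on one or two seed pods per DC
  ports:
    - port: 9042
      name: cql
```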

@mward29 you want to pipe in here??

@chrislovecnm
Member

Btw, the issue you found with the numbers in YAML is fixed in HEAD of k8s: kubernetes/kubernetes#26907

@paralin
Author

paralin commented Jun 10, 2016

The way I'm going to do it is:

  • Route everything over a vpn such that you can hit pod IPs from outside of the cluster
  • Use a simple seed provider and just hardcode the pod IP addresses.

A service can talk to multiple things, yes.

I don't see why you wouldn't be able to do multiple DCs with this in mind.
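Hardcoding the seeds as described above would be a cassandra.yaml fragment along these lines (the IPs are placeholders; pod IPs being routable over the VPN is the assumption that makes this work):

```yaml
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      # One or two seed pod IPs per DC, reachable over the VPN (placeholders)
      - seeds: "10.244.0.5,10.200.3.7"
```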

@paralin paralin closed this as completed Jun 10, 2016
@paralin paralin reopened this Jun 10, 2016
@paralin
Author

paralin commented Jun 10, 2016

Whoops... didn't mean to close

@chrislovecnm
Member

@paralin please document how you set up the VPN. What cloud provider are you on? How are you going to know what your seed IPs are?

@paralin
Author

paralin commented Jun 10, 2016

GCE, and it is actually an extremely complex setup I ended up building for this... I will document it eventually. But the nice thing is I am now bridging a couple of mesh networks with the GCE network, and everything can talk to everything else.

The only thing I haven't figured out is hitting services from outside the cluster, which I made an issue about here:

kubernetes/kubernetes#27161

The basic components of the setup are babeld and openvpn with some iptables rules.

@paralin paralin closed this as completed Jun 10, 2016
@paralin paralin reopened this Jun 10, 2016
@paralin
Author

paralin commented Jun 10, 2016

... and I accidentally closed it again...

@chrislovecnm
Member

@paralin I looped you and @mward29 in on a couple of issues on the k8s project. Working through these issues with the k8s team will further strengthen the capability of C* on k8s.
