Master node dies after running Kubernetes on AWS for a few days

0 votes

I set up a Kubernetes cluster and noticed that the master dies after a few days. The pods are still running and available, but there is no way to manage the nodes because the master node is dead.

This is what I found after further debugging:

goroutine 571 [running]:
net/http.func·018()
    /usr/src/go/src/net/http/transport.go:517 +0x2a
net/http.(*Transport).CancelRequest(0xc2083c0630, 0xc209750d00)
    /usr/src/go/src/net/http/transport.go:284 +0x97
github.com/coreos/go-etcd/etcd.func·003()
    /go/src/github.com/GoogleCloudPlatform/kubernetes/Godeps/_workspace/src/github.com/coreos/go-etcd/etcd/requests.go:159 +0x236
created by github.com/coreos/go-etcd/etcd.(*Client).SendRequest
    /go/src/github.com/GoogleCloudPlatform/kubernetes/Godeps/_workspace/src/github.com/coreos/go-etcd/etcd/requests.go:168 +0x3e3

goroutine 1 [IO wait, 12 minutes]:
net.(*pollDesc).Wait(0xc20870e760, 0x72, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:84 +0x47
net.(*pollDesc).WaitRead(0xc20870e760, 0x0, 0x0)
    /usr/src/go/src/net/fd_poll_runtime.go:89 +0x43
net.(*netFD).accept(0xc20870e700, 0x0, 0x7f4424a42008, 0xc20930a168)
    /usr/src/go/src/net/fd_unix.go:419 +0x40b
net.(*TCPListener).AcceptTCP(0xc20804bec0, 0x5bccce, 0x0, 0x0)
    /usr/src/go/src/net/tcpsock_posix.go:234 +0x4e
net/http.tcpKeepAliveListener.Accept(0xc20804bec0, 0x0, 0x0, 0x0, 0x0)
    /usr/src/go/src/net/http/server.go:1976 +0x4c
net/http.(*Server).Serve(0xc20887ec60, 0x7f4424a66dc8, 0xc20804bec0, 0x0, 0x0)
    /usr/src/go/src/net/http/server.go:1728 +0x92
net/http.(*Server).ListenAndServe(0xc20887ec60, 0x0, 0x0)
    /usr/src/go/src/net/http/server.go:1718 +0x154
github.com/GoogleCloudPlatform/kubernetes/cmd/kube-apiserver/app.(*APIServer).Run(0xc2081f0e00, 0xc20806e0e0, 0x0, 0xe, 0x0, 0x0)
    /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/cmd/kube-apiserver/app/server.go:484 +0x264a
main.main()
    /go/src/github.com/GoogleCloudPlatform/kubernetes/_output/dockerized/go/src/github.com/GoogleCloudPlatform/kubernetes/cmd/kube-apiserver/apiserver.go:48 +0x154
Oct 4, 2018 in Kubernetes by lina
• 8,220 points
1,531 views

1 answer to this question.

0 votes
I think the master is running out of memory: the machine size you started with was probably too small for the workload. Increase the cluster size and it should be fine.
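
One way to check the memory theory before resizing is to look for OOM-killer messages in the kernel log on the master. Below is a minimal Python sketch, assuming you have saved the output of `dmesg` (or `journalctl -k`) as text. The function name `find_oom_kills` and the sample lines are illustrative, and the exact wording of OOM messages varies between kernel versions, so treat the pattern as a starting point.

```python
import re

# Matches kernel OOM-killer lines such as "Kill process <pid> (<name>)" and
# "Killed process <pid> (<name>)". Wording differs between kernel versions,
# so this pattern is an assumption and may need adjusting.
OOM_PATTERN = re.compile(r"Kill(?:ed)? process (\d+) \(([^)]+)\)")


def find_oom_kills(log_text):
    """Return a list of (pid, process_name) pairs found in OOM-killer lines."""
    kills = []
    for line in log_text.splitlines():
        match = OOM_PATTERN.search(line)
        if match:
            kills.append((int(match.group(1)), match.group(2)))
    return kills


if __name__ == "__main__":
    # Illustrative sample only; these lines are not from the question's logs.
    sample = (
        "[1234.5] Out of memory: Kill process 4321 (kube-apiserver) score 912 or sacrifice child\n"
        "[1234.6] Killed process 4321 (kube-apiserver) total-vm:2048000kB, anon-rss:1024000kB\n"
        "[1300.0] eth0: link up\n"
    )
    for pid, name in find_oom_kills(sample):
        print(pid, name)
```

If kube-apiserver or etcd shows up in the result, that supports the out-of-memory explanation. If nothing matches, the cause may lie elsewhere.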
answered Oct 4, 2018 by Kalgi
• 52,350 points
