November 13, 2018 in Systems · 7 minutes
When deploying an application to Kubernetes, you will almost certainly want to create a Service to represent that application. Rather than relying on direct connectivity to Pods, which may be ephemeral, Services are long-lived resources that sit on top of one or more Pods. They are also the bare minimum for allowing those Pods to communicate outside the cluster.
While Services are a nice abstraction so we don’t have to worry about individual Pods, they are also fairly dumb. They don’t look at the application layer or allow us to make decisions. Considering this, there are two main options for directing traffic to the appropriate application within your cluster:
The latter option, using Ingress resources together with an ingress controller, is increasingly popular for a few reasons. First, it allows us to store our application routing rules “as code”, alongside all our other Kubernetes manifests, rather than relying on an external load balancer to be configured correctly. It also provides a high level of automation: whatever ingress controller we’re using will watch for new Ingresses to be created, and will automatically configure the appropriate load balancer.
NRE Labs uses this model with the nginx-ingress controller for two main use cases:
- syringe and antidote-web are deployed with their own Ingress rules so that users can access each from the web.
- Web resources embedded in lessons get their own Ingress so that antidote-web can access them.

What this means is that whenever an Ingress resource is created, regardless of the purpose, the NGINX ingress controller will pick it up and modify its configuration dynamically.
When building out the functionality to use Ingresses to dynamically create embeddable web resources for lessons, I ran into an issue. When a user loads a lesson that contains a web resource, syringe will not only create the appropriate Pod(s) and Service(s) but will also create an Ingress to make sure it’s externally available.
To make sure all users can access their respective web resources without stepping on each other, these Ingress rules are designed to present a unique path externally, and rewrite it to the appropriate path per the lesson definition. For instance, if the web resource is a Jupyter notebook, the Pod might be expecting a path like /notebooks/lesson-13/stage1/notebook.ipynb.
However, that will be the same for all users trying to access this lesson. So, when we create the Ingress, we add a rewrite rule so that externally, the URL is a combination of the underlying Kubernetes namespace, and the name of the web resource:
This path is guaranteed to be unique per-user, per-resource.
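As a minimal sketch of why such a path is unique, assume (hypothetically) that each user's lesson instance runs in its own namespace and that the two names are simply joined with a hyphen. The function name, the separator, and the example names below are illustrative assumptions, not the format syringe actually uses:

```python
def external_path(namespace: str, resource_name: str) -> str:
    """Combine a namespace and a web resource name into one URL path.

    The "/<namespace>-<resource>" layout is an assumption made for
    illustration; the real naming scheme may differ.
    """
    return f"/{namespace}-{resource_name}"


# Two users loading the same lesson land in different (hypothetical)
# namespaces, so the resulting external paths never collide.
path_a = external_path("lesson-ns-user-a", "jupyter")
path_b = external_path("lesson-ns-user-b", "jupyter")
print(path_a)
print(path_b)
```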
The Ingress resource that’s created to accomplish this is fairly straightforward. The appropriate NGINX controller annotations specify the rewrite that’s to take place:
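For illustration, here is a rough Python sketch that builds such an Ingress as a plain dictionary and prints it as JSON, which kubectl also accepts. The nginx.ingress.kubernetes.io/rewrite-target annotation is the controller's rewrite annotation; the resource names, namespace, port, and external path are hypothetical. The sketch uses the extensions/v1beta1 schema that was current when this post was written (it was removed in Kubernetes 1.22), so treat it as a period example rather than something to apply to a modern cluster:

```python
import json


def build_ingress(name, namespace, external_path, target_path,
                  service_name, service_port):
    """Return an Ingress manifest (as a dict) with a rewrite annotation."""
    return {
        "apiVersion": "extensions/v1beta1",  # 2018-era API group
        "kind": "Ingress",
        "metadata": {
            "name": name,
            "namespace": namespace,
            "annotations": {
                # Ask the NGINX ingress controller to rewrite the
                # externally visible path to the path the Pod expects.
                "nginx.ingress.kubernetes.io/rewrite-target": target_path,
            },
        },
        "spec": {
            "rules": [{
                "http": {
                    "paths": [{
                        "path": external_path,
                        "backend": {
                            "serviceName": service_name,
                            "servicePort": service_port,
                        },
                    }],
                },
            }],
        },
    }


# All of the values below are hypothetical placeholders.
ingress = build_ingress(
    name="jupyter",
    namespace="lesson-ns",
    external_path="/lesson-ns-jupyter",
    target_path="/notebooks/lesson-13/stage1/notebook.ipynb",
    service_name="jupyter",
    service_port=8888,
)
print(json.dumps(ingress, indent=2))
```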
And now, the problem. When I was testing this resource, I was getting 404s.
However, these were coming from the Jupyter Notebook application, so the actual application routing wasn’t the problem - there was something else going on here. Time to look at the Jupyter logs to see what the incoming requests look like:
Right there in the Jupyter logs, we can see the rewrite isn’t working. But why? In this case, it’s best to dive behind the abstraction and look at the actual NGINX configuration that the controller rendered for us from our Ingress definition:
And there lies the culprit - it appears that the NGINX ingress controller appended a trailing slash to the path we provided in the Ingress definition. Take a look at the two important directives:
- The location directive contains the trailing slash, but uses the ? regex token to indicate that it’s optional. This means that we get routed correctly regardless of the presence of the slash in the request.
- The rewrite directive is not so flexible. The regex used here strictly matches a path that ends in a slash. If that slash doesn’t exist, the rewrite doesn’t happen.

So the net behavior is that our traffic gets routed to the appropriate place, but without the rewrite we asked for. We can see this behavior in action in the shell:
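The same mismatch can also be modelled offline. The following is a minimal Python sketch of the two directives as described above: one regex where the trailing slash is optional (the location match) and one where it is required (the rewrite). The external path is hypothetical, and the exact shape of both regexes and of the rewritten path is an assumption for illustration, not a copy of what the controller renders:

```python
import re

EXTERNAL = "/lesson-ns-jupyter"  # hypothetical external path
TARGET = "/notebooks/lesson-13/stage1/notebook.ipynb"

# "location"-style match: the trailing slash is optional.
LOCATION_RE = re.compile("^" + re.escape(EXTERNAL) + r"/?(?P<baseuri>.*)")
# "rewrite"-style match: the trailing slash is required.
REWRITE_RE = re.compile(re.escape(EXTERNAL) + r"/(.*)")


def route(path):
    """Return the path sent upstream, or None if no location matches."""
    if not LOCATION_RE.match(path):
        return None  # this model's stand-in for a 404 from NGINX
    match = REWRITE_RE.search(path)
    if match is None:
        # Routed to the right backend, but the path is left untouched.
        return path
    # Assumed rewrite shape: target, a slash, then the captured remainder.
    return TARGET + "/" + match.group(1)


for request_path in (EXTERNAL, EXTERNAL + "/"):
    print(repr(request_path), "->", repr(route(request_path)))
```

In this model the request without the trailing slash reaches the backend with its external path intact, which is what produces a 404 from the application rather than from the load balancer.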
Easy fix, right? Just append a trailing slash, right? Well, here’s where it gets wonky:
We’re still getting a 404 with the trailing slash, but before that, we’re being redirected to /notebooks/lesson-13/stage1/notebook.ipynb. The idea with the rewrite is that the user shouldn’t ever see this path, so this is strange. Back to the Jupyter server:
It appears as though Jupyter is just as particular about trailing slashes, but in the opposite direction: it redirects to a new URL without one.
We can see the fresh 302s, but the 404s we saw on the client side are nowhere to be found. Because we’re erroneously redirecting to /notebooks/lesson-13/stage1/notebook.ipynb, our browser is sending the second request through the NGINX load balancer, which doesn’t have that path in its incoming configuration. The reason we’re not seeing the 404s from the redirection on our Jupyter pod is that it’s our NGINX load balancer that’s responding with the 404.
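To check this kind of redirect chain one hop at a time, a small helper like the following can be handy. It is a sketch using only the Python standard library: it issues a GET, records the status code and Location header, and then follows the redirect manually. The function name and the URL in the usage comment are hypothetical:

```python
import http.client
from urllib.parse import urljoin, urlsplit

REDIRECT_CODES = (301, 302, 303, 307, 308)


def trace_redirects(url, max_hops=5):
    """Follow redirects manually; return a list of (url, status, location)."""
    hops = []
    for _ in range(max_hops):
        parts = urlsplit(url)
        if parts.scheme == "https":
            conn = http.client.HTTPSConnection(parts.netloc, timeout=5)
        else:
            conn = http.client.HTTPConnection(parts.netloc, timeout=5)
        try:
            path = parts.path or "/"
            if parts.query:
                path += "?" + parts.query
            conn.request("GET", path)
            resp = conn.getresponse()
            resp.read()  # drain the body before closing the connection
            status = resp.status
            location = resp.getheader("Location")
        finally:
            conn.close()
        hops.append((url, status, location))
        if status not in REDIRECT_CODES or not location:
            break
        # The Location header may be relative, so resolve it against the URL.
        url = urljoin(url, location)
    return hops


# Usage against a hypothetical ingress hostname and path:
# for hop in trace_redirects("http://ingress.example.com/lesson-ns-jupyter/"):
#     print(hop)
```

A 302 followed by a 404 on the redirected URL would match the behavior described above: the first response comes from the application, the second from the load balancer.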
Fortunately, the solution was very easy. I was using a pretty old version of the NGINX ingress controller, and a recent PR fixed rewrites for paths not ending in a slash. Using a newer version of the controller resolved this issue.
In my research I also came across a GitHub issue that recommends using the configuration-snippet annotation in lieu of a rewrite annotation, to directly affect the NGINX configuration:
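As a rough sketch of that approach (not the snippet from the issue itself), the annotation value is just a fragment of NGINX configuration. The helper below builds one containing a rewrite directive whose regex makes the trailing slash optional. The helper name and the example paths are hypothetical, and the external path is assumed to contain no regex metacharacters:

```python
SNIPPET_KEY = "nginx.ingress.kubernetes.io/configuration-snippet"


def rewrite_snippet_annotation(external_path, target_path):
    """Return an annotations dict carrying a hand-written rewrite rule.

    Assumes external_path has no characters that are special in a regex.
    """
    # "/?$" lets the rule match with or without the trailing slash.
    directive = f"rewrite ^{external_path}/?$ {target_path} break;\n"
    return {SNIPPET_KEY: directive}


annotations = rewrite_snippet_annotation(
    "/lesson-ns-jupyter",  # hypothetical external path
    "/notebooks/lesson-13/stage1/notebook.ipynb",
)
print(annotations[SNIPPET_KEY], end="")
```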
In my case I preferred to stick with the rewrite annotation and let the controller do its thing, but I think I’ll hide this away in the back of my mind for later. It’s a useful way of inserting your own custom logic.
Anyways, despite the overwhelming simplicity of the solution, the point of this post was to document my troubleshooting, in the hope that it might be useful to you in your similar endeavors.