S3 through nginx ingress

Proxy an S3 bucket with an ingress and external service
By vicjicama

Introduction

This post is about how to proxy an S3 bucket using a Kubernetes ingress and an external service. I have a bucket named resources.repoflow.com with all the CSS and JS files that the blog and the linker services need. Instead of accessing the resources directly through the bucket URL like this:

http://s3.amazonaws.com/resources.repoflow.com/resources/react-toastify/dist/ReactToastify.css
I wanted to access the resources like this:
https://blog.repoflow.com/resources/react-toastify/dist/ReactToastify.css

If I need to change the source of the resources, I only need to reconfigure the ingress, and the other services are not affected.
At first I used a node server that was serving static routes...
Then I had a container with nginx configured to proxy the S3 bucket to the needed route (without rewrite)...
Now, as described in this post, I am using an external service...


Kubernetes entities

We need to set the upstream-vhost and rewrite-target annotations on the ingress.

The upstream-vhost annotation sets the Host header of the request sent upstream, so S3 receives s3.amazonaws.com instead of the ingress host s3-proxy.repoflow.com.

The rewrite-target annotation builds the upstream path: the $2 in the annotation is replaced with the second captured group of the path, (.*). The first captured group is (/|$); check the troubleshooting section for more details on that part of the path.

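# Ingress is not part of the core v1 API. networking.k8s.io/v1beta1 matches the
# serviceName/servicePort syntax used here; it was removed in Kubernetes 1.22.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: proxy-ingress
  namespace: repoflow-s3-proxy-demo
  # annotations must be nested under metadata
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /resources.repoflow.com/resources/$2
    nginx.ingress.kubernetes.io/upstream-vhost: s3.amazonaws.com
spec:
  rules:
  - host: s3-proxy.repoflow.com
    http:
      paths:
        - path: /resources(/|$)(.*)
          backend:
            serviceName: resources-external-service
            servicePort: 80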
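
To see how the path pattern and the rewrite-target fit together, here is a rough Python approximation of the substitution. It is not the nginx implementation: the function name rewrite_path is hypothetical, and treating a missing capture group as an empty string is an assumption that is consistent with the log output shown in the troubleshooting section.

```python
import re


def rewrite_path(path_regex, target, request_path):
    """Approximate the ingress rewrite: match the path, then fill $N in the target.

    Assumption: a $N that refers to a group the pattern does not define
    expands to an empty string.
    """
    match = re.search(path_regex, request_path, flags=re.IGNORECASE)
    if match is None:
        return None  # the path rule does not apply to this request

    def fill(ref):
        index = int(ref.group(1))
        if index > match.re.groups:
            return ""  # the pattern has no such capture group
        return match.group(index) or ""

    return re.sub(r"\$(\d+)", fill, target)


TARGET = "/resources.repoflow.com/resources/$2"
PATH = "/resources/react-toastify/dist/ReactToastify.css"

# Two groups: (/|$) is $1 and (.*) is $2, so the file path is kept.
print(rewrite_path(r"/resources(/|$)(.*)", TARGET, PATH))
# -> /resources.repoflow.com/resources/react-toastify/dist/ReactToastify.css

# One group only: $2 has nothing to refer to, so the file path is lost.
print(rewrite_path(r"/resources/(.*)", TARGET, PATH))
# -> /resources.repoflow.com/resources/
```

The second call reproduces the kind of truncated path described in the troubleshooting section, where the ingress path had a single capture group.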
        

For the external service you need to set the externalName to a DNS name, in this case s3.amazonaws.com. An external service is also useful when you want to access a service from another namespace: you can use the cluster DNS name, like service-name.namespace.svc.cluster.local, and use the service port as the targetPort.

apiVersion: v1
kind: Service
metadata:
  name: resources-external-service
  namespace: repoflow-s3-proxy-demo
spec:
  type: ExternalName
  externalName: s3.amazonaws.com
  ports:
    - name: http
      protocol: TCP
      port: 80
      targetPort: 80
        

Check the repository


Troubleshooting

At the beginning of testing this was not working: I was getting an application/x-directory response in the browser... I had an issue with the path rewrite... To check the result of the rewrite, I added the following configuration-snippet to the ingress:

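# Ingress is not part of the core v1 API. networking.k8s.io/v1beta1 matches the
# serviceName/servicePort syntax used here; it was removed in Kubernetes 1.22.
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: proxy-ingress
  namespace: repoflow-s3-proxy-demo
  # annotations must be nested under metadata
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /resources.repoflow.com/resources/$2
    nginx.ingress.kubernetes.io/upstream-vhost: s3.amazonaws.com
    # log the result of every rewrite at notice level
    nginx.ingress.kubernetes.io/configuration-snippet: |
      rewrite_log on;
spec:
  rules:
  - host: s3-proxy.repoflow.com
    http:
      paths:
        - path: /resources(/|$)(.*)
          backend:
            serviceName: resources-external-service
            servicePort: 80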
  

This is part of the nginx-ingress log. You can see that the rewritten data was not correct... the whole resource path was missing.

[notice] 333#333: *17610 "(?i)/resources/(.*)" matches "/resources/react-toastify/dist/ReactToastify.css", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0", host: "blog-stage.repoflow.com"
2020/01/07 22:37:54 [notice] 333#333: *17610 rewritten data: "/resources.repoflow.com/resources/", args: "", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0"

I found how to solve this issue in this Stack Overflow Q/A:

I needed to change the ingress path from /resources/(.*) to /resources(/|$)(.*). With /resources/(.*) there is only one capture group, so the $2 in the rewrite target was empty. After the change, the rewritten data showed the right value and the requests returned the expected resource.

[notice] 189#189: *16440 "(?i)/resources(/|$)(.*)" matches "/resources/react-toastify/dist/ReactToastify.css", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0", host: "blog-stage.repoflow.com"
2020/01/07 22:36:26 [notice] 189#189: *16440 rewritten data: "/resources.repoflow.com/resources/react-toastify/dist/ReactToastify.css", args: "", client: 10.244.0.28, server: blog-stage.repoflow.com, request: "GET /resources/react-toastify/dist/ReactToastify.css HTTP/2.0"
        

Testing

I used a minikube instance that lives on an old PC to test this concept. To get this example working, all you need to do is apply the entities.yaml file. This creates the namespace, the external service and the ingress described previously.

kubectl apply -f entities.yaml

Since the minikube instance is on another machine, you need to forward the minikube ingress to your local machine and then modify the /etc/hosts file to add the ingress host. I do this kind of forwarding for my local machine using the linker tool.

Forwarding the ingress in this way is very useful, for example, if you want to test something using the same URL as the production cluster but in a minikube instance.
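
As a quick check once the forward is in place, a small Python sketch can request a resource through the ingress. This is only an illustration under assumptions: the address 127.0.0.1:8080 is a hypothetical local endpoint for the forwarded ingress, the helper names are made up, and the sketch sends the ingress host in the Host header, which is a different technique from the /etc/hosts edit described above.

```python
import urllib.request


def build_ingress_request(ingress_address, ingress_host, path):
    """Build a GET request aimed at the forwarded ingress address.

    The Host header carries the ingress host, so the ingress rule for
    that host can match even though the URL points at a local address.
    """
    url = f"http://{ingress_address}{path}"
    return urllib.request.Request(url, headers={"Host": ingress_host})


def fetch_status(request, timeout=5):
    """Send the request and return (status code, Content-Type header)."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.status, response.headers.get("Content-Type")


# Hypothetical local address where the minikube ingress is forwarded.
request = build_ingress_request(
    "127.0.0.1:8080",
    "s3-proxy.repoflow.com",
    "/resources/react-toastify/dist/ReactToastify.css",
)
print(request.full_url, request.get_header("Host"))
```

Nothing is sent until fetch_status(request) is called. With the forward running, the returned content type is a quick way to spot the application/x-directory response mentioned in the troubleshooting section.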


Conclusion

Using ingress definitions to proxy external services opens up a lot of possibilities. I will keep improving the way those resources are shared across multiple services and projects, and I will keep sharing the findings.

If you are interested, you can find the code for this blog, which is running in Kubernetes, with all the entities, the containers, graphql and node services here: microservices

If you have suggestions for a change or a post, have any feedback, or want to reach out, don't hesitate to contact me. My email is vic@repoflow.com