Inference Metrics

To measure inference performance and dynamically scale inference resources, the following metrics are collected from the inference engines.

On Kubernetes, Prometheus + Grafana is commonly used to monitor and visualize metrics at the service, cluster, and node level. The inference service also reports the above metrics as service metrics.

Beyond individual services' metrics, you can also monitor and visualize the total inference FPS and drop FPS to gauge the whole cluster's performance and resource gap.
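The cluster-wide figures are simply sums over the per-service values; a minimal sketch in Python (the service names and FPS numbers below are illustrative — in practice Prometheus computes these totals with `sum(ei_infer_fps)` and `sum(ei_drop_fps)`):

```python
# Aggregate per-service inference metrics into cluster-wide totals.
# Service names and values are illustrative placeholders.
service_metrics = {
    "ei-infer-face-fp32-app": {"ei_infer_fps": 24.0, "ei_drop_fps": 1.0},
    "ei-infer-body-fp32-app": {"ei_infer_fps": 18.0, "ei_drop_fps": 6.0},
}

total_infer_fps = sum(m["ei_infer_fps"] for m in service_metrics.values())
total_drop_fps = sum(m["ei_drop_fps"] for m in service_metrics.values())

# A large drop total relative to the inference total indicates a resource gap.
print(total_infer_fps, total_drop_fps)
```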

Setup Prometheus+Grafana

Prometheus + Grafana is a very common monitoring stack, but its configuration is somewhat complex; Kubernetes operators may set it up in different ways, or a CSP orchestrator may already provide it. The steps here are based on CoreOS's open-source project kube-prometheus at https://github.com/coreos/kube-prometheus.

Install kube-prometheus

(Note: you can also use install-kube-prometheus.sh, or follow the steps below.)

  1. Clone v0.3.0

    git clone https://github.com/coreos/kube-prometheus.git -b v0.3.0

    (Note: other branches or tags might not work with the custom-metrics-based HPA.)

  2. Install

    kubectl apply -f kube-prometheus/manifests/setup/
    kubectl apply -f kube-prometheus/manifests

    # Expose grafana service via NodePort for external access
    kubectl patch svc grafana -n monitoring --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'

    # Expose prometheus-k8s service via NodePort for external access
    kubectl patch svc prometheus-k8s -n monitoring --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'

    After that, you should be able to access Prometheus and Grafana via their NodePorts.

Monitor inference service

To monitor the inference service, apply the ServiceMonitor resource:

cd cloud-native-demo/elastic_inference/kubernetes/monitoring
kubectl apply -f servicemonitor.yaml
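For reference, a ServiceMonitor for the inference services looks roughly like the following sketch. The name, namespace, selector label, port name, and scrape interval here are assumptions for illustration — check the actual servicemonitor.yaml in the repository:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: ei-inference          # hypothetical name
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: ei-inference       # assumed label on the inference services
  endpoints:
    - port: metrics           # assumed name of the service's metrics port
      interval: 15s
```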

View Inference Metrics

View from Inference Service

You can read metric values from an individual inference service after exposing it via NodePort:

kubectl patch svc ei-infer-face-fp32-app --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'
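The metrics endpoint returns Prometheus text exposition format; below is a minimal sketch of pulling the `ei_*` values out of such a response. The sample body and its values are placeholders — in practice you would fetch the text from the service's NodePort:

```python
import re

def parse_ei_metrics(text):
    """Extract ei_* metric samples from Prometheus text exposition format."""
    metrics = {}
    for line in text.splitlines():
        if line.startswith("#") or not line.strip():
            continue  # skip HELP/TYPE comment lines and blanks
        m = re.match(r"(ei_\w+)(?:\{[^}]*\})?\s+([0-9.eE+-]+)", line)
        if m:
            metrics[m.group(1)] = float(m.group(2))
    return metrics

# Placeholder response body; fetch the real one from the service's NodePort.
sample = """\
# HELP ei_infer_fps Inference frames per second
# TYPE ei_infer_fps gauge
ei_infer_fps 24.5
ei_drop_fps 1.5
"""
print(parse_ei_metrics(sample))  # {'ei_infer_fps': 24.5, 'ei_drop_fps': 1.5}
```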

View from Prometheus

You can monitor metrics from Prometheus after exposing the prometheus-k8s service via NodePort:

# Expose prometheus-k8s service via NodePort for external access
kubectl patch svc prometheus-k8s -n monitoring --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'

(Note: if you can get metric values in the previous step, "View from Inference Service", but not in this step, the most likely cause is an invalid timezone or date on the Kubernetes cluster.)
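You can also query these metrics programmatically through Prometheus's HTTP API (`/api/v1/query`). A minimal sketch that builds the instant-query URL, assuming prometheus-k8s is reachable at some base URL (the node address and port are placeholders):

```python
from urllib.parse import urlencode

def prom_query_url(base_url, promql):
    """Build a Prometheus instant-query URL for the given PromQL expression."""
    return f"{base_url}/api/v1/query?{urlencode({'query': promql})}"

# e.g. with prometheus-k8s exposed on a NodePort:
url = prom_query_url("http://<node-ip>:<node-port>", "sum(ei_infer_fps)")
print(url)
```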

View from Grafana

You can monitor individual and cluster-wide metrics in Grafana after exposing the grafana service via NodePort:

# Expose grafana service via NodePort for external access
kubectl patch svc grafana -n monitoring --type='json' -p '[{"op":"replace","path":"/spec/type","value":"NodePort"}]'

You also need to create a customized graph and add the following metrics:

- ei_infer_fps
- ei_drop_fps
- sum(ei_infer_fps)
- sum(ei_drop_fps)
- ei_scale_ratio