
160 points simplesort | 3 comments
slt2021 ◴[] No.43626982[source]
Question to the Netflix folks: I saw a lot of in-house developed tools being quoted. Do you guys have a service mesh like Linkerd?

Have you guys evaluated vendors like Kentik?

I would love to get more insight into what you guys actually do with flow logs. For example, if I store 1 TB of flow logs, what value can I actually derive from them that justifies the cost of collection, processing, and storage?

replies(2): >>43627581 #>>43628881 #
1. retiredpapaya ◴[] No.43627581[source]
I think Netflix does use an Envoy-based Service Mesh [1], and they roll their own control plane.

[1] https://netflixtechblog.com/zero-configuration-service-mesh-...

replies(1): >>43627703 #
2. slt2021 ◴[] No.43627703[source]
If the goal of gathering and attributing VPC flows is to have workload-granularity flow logs, then imho gathering mesh-level logs is the more direct and straightforward approach, because the mesh (and the workload orchestrator) is uniquely qualified to know when workload A is running on host X and is trying to connect to workload B.

Looking at Envoy access logs, for example, is a simpler and more straightforward approach than running distributed eBPF and a large, memory-intensive Spark streaming job.
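To make the attribution problem concrete, here is a minimal sketch of what "knowing when workload A is running on host X" amounts to: a time-indexed lookup from IP address to workload. Every name in it (`IpOwnership`, `record`, `owner_at`, `attribute`, the flow dict fields) is hypothetical and made up for illustration; it is not how Netflix, Envoy, or any orchestrator actually does this.

```python
from bisect import bisect_right
from dataclasses import dataclass


@dataclass(frozen=True)
class Assignment:
    start: float   # epoch seconds when the workload started using the IP
    workload: str  # hypothetical workload identifier


class IpOwnership:
    """Hypothetical time-indexed map from IP address to owning workload."""

    def __init__(self):
        self._by_ip = {}  # ip -> list of Assignment, kept sorted by start

    def record(self, ip, start, workload):
        assignments = self._by_ip.setdefault(ip, [])
        assignments.append(Assignment(start, workload))
        assignments.sort(key=lambda a: a.start)

    def owner_at(self, ip, ts):
        """Return the workload that held `ip` at time `ts`, or None."""
        assignments = self._by_ip.get(ip, [])
        starts = [a.start for a in assignments]
        i = bisect_right(starts, ts)  # number of assignments with start <= ts
        return assignments[i - 1].workload if i else None


def attribute(flow, ownership):
    """Map a raw flow record (IPs plus timestamp) to a workload pair."""
    return (
        ownership.owner_at(flow["src_ip"], flow["ts"]),
        ownership.owner_at(flow["dst_ip"], flow["ts"]),
    )


ownership = IpOwnership()
ownership.record("10.0.0.5", 100, "workload-a")
ownership.record("10.0.0.5", 200, "workload-c")  # same IP reused later
ownership.record("10.0.0.9", 50, "workload-b")

flow = {"src_ip": "10.0.0.5", "dst_ip": "10.0.0.9", "ts": 150}
print(attribute(flow, ownership))
```

The point is that IPs get reused, so the same address maps to different workloads depending on the timestamp. A mesh or orchestrator already has this mapping, while a flow-log pipeline has to rebuild it.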

replies(1): >>43627867 #
3. nptr ◴[] No.43627867[source]
The blog post mentioned that "The eBPF flow logs provide a comprehensive view of service topology and network health across Netflix’s extensive microservices fleet, regardless of the programming language, RPC mechanism, or application-layer protocol used by individual workloads."

A service mesh may have restrictions on the network protocols it supports and may not cover all network traffic (like connections to Kafka and databases).
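As a rough illustration of the coverage gap, assuming you had both data sets reduced to (source, destination, port) edges, the traffic the mesh never sees is just a set difference. The function name, the workload names, and the sample edges below are all made up for the example.

```python
def uncovered_edges(flow_edges, mesh_edges):
    """Return edges seen in flow logs but absent from mesh access logs."""
    return sorted(set(flow_edges) - set(mesh_edges))


# Hypothetical (source workload, destination, destination port) edges.
flow_edges = [
    ("api", "recommendations", 8443),
    ("api", "kafka-broker", 9092),
    ("api", "postgres", 5432),
]
mesh_edges = [
    ("api", "recommendations", 8443),  # only the RPC call went through the proxy
]

print(uncovered_edges(flow_edges, mesh_edges))
```

In this made-up data, the Kafka and Postgres connections show up only in the flow logs.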