FLARE implementation #143
Signed-off-by: Dario Tranchitella <[email protected]>
I tried testing this but noticed that the Flavor doesn't reflect some values from the FLARE annotations (GPU count and GPU memory stay at 0). Are the annotations the same as those in https://github.com/clastix/flare? For testing, should I follow the quickstart from that repo, or are there specific steps or examples for this PR that would make validation easier?
Yes, I implemented the required code following that documentation. Here are the node annotations and the resulting Flavor.
// kubectl get nodes fluidos-provider-2-worker -ojsonpath='{.metadata.annotations}'
{
"cost.fluidos.eu/currency": "EUR",
"cost.fluidos.eu/hourly-rate": "2.1",
"gpu.fluidos.eu/architecture": "ampere",
"gpu.fluidos.eu/clock-speed": "1.80G",
"gpu.fluidos.eu/compute-capability": "8.6",
"gpu.fluidos.eu/cores": "10752",
"gpu.fluidos.eu/count": "8",
"gpu.fluidos.eu/dedicated": "true",
"gpu.fluidos.eu/fp32-tflops": "38.7",
"gpu.fluidos.eu/interconnect": "nvlink",
"gpu.fluidos.eu/interconnect-bandwidth-gbps": "600",
"gpu.fluidos.eu/interruptible": "false",
"gpu.fluidos.eu/memory-per-gpu": "48Gi",
"gpu.fluidos.eu/model": "nvidia-a6000",
"gpu.fluidos.eu/multi-gpu-efficiency": "0.85",
"gpu.fluidos.eu/sharing-capable": "false",
"gpu.fluidos.eu/sharing-strategy": "none",
"gpu.fluidos.eu/tier": "standard",
"gpu.fluidos.eu/topology": "ring",
"gpu.fluidos.eu/vendor": "nvidia",
"kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock",
"location.fluidos.eu/region": "us-east-1",
"location.fluidos.eu/zone": "zone-a",
"network.fluidos.eu/bandwidth-gbps": "25",
"network.fluidos.eu/latency-ms": "5",
"network.fluidos.eu/tier": "standard",
"node.alpha.kubernetes.io/ttl": "0",
"provider.fluidos.eu/name": "cloud-provider-2",
"provider.fluidos.eu/preemptible": "false",
"volumes.kubernetes.io/controller-managed-attach-detach": "true",
"workload.fluidos.eu/graphics-score": "0.95",
"workload.fluidos.eu/hpc-score": "0.80",
"workload.fluidos.eu/inference-score": "0.90",
"workload.fluidos.eu/training-score": "0.85"
}
// kubectl get flavors.nodecore.fluidos.eu fluidos.eu-k8slice-89ad -ojsonpath='{.spec.flavorType.typeData.characteristics.gpu}' | jq
{
"architecture": "ampere",
"clock_speed": "1800M",
"compute_capability": "8.6",
"cores": "86016",
"count": 8,
"dedicated": true,
"fp32_tflops": 38.7,
"graphics_score": 0.95,
"hourly_rate": 2.1,
"hpc_score": 0.8,
"inference_score": 0.9,
"interconnect": "nvlink",
"interconnect_bandwidth": "600",
"memory": "384Gi",
"model": "nvidia-a6000",
"multi_gpu_efficiency": "0.85",
"network_bandwidth": "25",
"network_latency_ms": 5,
"network_tier": "standard",
"provider": "cloud-provider-2",
"region": "zone-a",
"sharing_strategy": "none",
"tier": "standard",
"topology": "ring",
"training_score": 0.85,
"vendor": "nvidia",
"zone": "us-east-1"
}
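For anyone validating a similar setup, the two dumps above look consistent with the Flavor reporting per-node totals: 8 GPUs × 48Gi gives 384Gi of memory, 8 × 10752 gives 86016 cores, and the clock speed 1.80G is rendered as 1800M. Below is a minimal sketch of that check. The aggregation rule is inferred from the output above, not confirmed from the PR code, and the `expected_gpu_totals` helper is hypothetical.

```python
from decimal import Decimal

# Subset of the node annotations shown above.
annotations = {
    "gpu.fluidos.eu/count": "8",
    "gpu.fluidos.eu/cores": "10752",
    "gpu.fluidos.eu/memory-per-gpu": "48Gi",
    "gpu.fluidos.eu/clock-speed": "1.80G",
}


def expected_gpu_totals(ann):
    """Hypothetical helper: derive the GPU fields one would expect in the Flavor.

    Assumption: cores and memory are per-GPU values multiplied by the GPU count,
    and the clock speed is re-expressed in M instead of G.
    """
    count = int(ann["gpu.fluidos.eu/count"])
    cores = int(ann["gpu.fluidos.eu/cores"]) * count

    memory = ann["gpu.fluidos.eu/memory-per-gpu"]
    # Only the "<integer>Gi" form seen above is handled in this sketch.
    if not memory.endswith("Gi"):
        raise ValueError(f"unsupported memory quantity: {memory}")
    memory_gi = int(memory[:-2]) * count

    clock = ann["gpu.fluidos.eu/clock-speed"]
    # Only the "<decimal>G" form seen above is handled in this sketch.
    if not clock.endswith("G"):
        raise ValueError(f"unsupported clock quantity: {clock}")
    clock_m = int(Decimal(clock[:-1]) * 1000)

    return {
        "count": count,
        "cores": str(cores),
        "memory": f"{memory_gi}Gi",
        "clock_speed": f"{clock_m}M",
    }


totals = expected_gpu_totals(annotations)
print(totals)
```

Comparing this dictionary with the `gpu` characteristics of the Flavor is a quick way to spot fields that stayed at 0.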
Yes, we're going to release the code before the end of the week, which should make testing easier.