Conversation

@prometherion

No description provided.

@rmedina97
Collaborator

I tried testing this, but I noticed that the Flavor doesn't reflect some values from the FLARE annotations (GPU count and GPU memory stay at 0). Are the annotations the same as in this repository, https://github.com/clastix/flare? For testing, should I follow the quickstart from that repo, or are there specific steps/examples for this PR to make validation easier?

@prometherion
Author

prometherion commented Aug 28, 2025

> Are the annotations the same as in this repository https://github.com/clastix/flare?

Yes, I implemented the required code following that documentation. I'm just sharing the annotations and the resulting Flavor below.

// kubectl get nodes fluidos-provider-2-worker -ojsonpath='{.metadata.annotations}'
{
  "cost.fluidos.eu/currency": "EUR",
  "cost.fluidos.eu/hourly-rate": "2.1",
  "gpu.fluidos.eu/architecture": "ampere",
  "gpu.fluidos.eu/clock-speed": "1.80G",
  "gpu.fluidos.eu/compute-capability": "8.6",
  "gpu.fluidos.eu/cores": "10752",
  "gpu.fluidos.eu/count": "8",
  "gpu.fluidos.eu/dedicated": "true",
  "gpu.fluidos.eu/fp32-tflops": "38.7",
  "gpu.fluidos.eu/interconnect": "nvlink",
  "gpu.fluidos.eu/interconnect-bandwidth-gbps": "600",
  "gpu.fluidos.eu/interruptible": "false",
  "gpu.fluidos.eu/memory-per-gpu": "48Gi",
  "gpu.fluidos.eu/model": "nvidia-a6000",
  "gpu.fluidos.eu/multi-gpu-efficiency": "0.85",
  "gpu.fluidos.eu/sharing-capable": "false",
  "gpu.fluidos.eu/sharing-strategy": "none",
  "gpu.fluidos.eu/tier": "standard",
  "gpu.fluidos.eu/topology": "ring",
  "gpu.fluidos.eu/vendor": "nvidia",
  "kubeadm.alpha.kubernetes.io/cri-socket": "unix:///run/containerd/containerd.sock",
  "location.fluidos.eu/region": "us-east-1",
  "location.fluidos.eu/zone": "zone-a",
  "network.fluidos.eu/bandwidth-gbps": "25",
  "network.fluidos.eu/latency-ms": "5",
  "network.fluidos.eu/tier": "standard",
  "node.alpha.kubernetes.io/ttl": "0",
  "provider.fluidos.eu/name": "cloud-provider-2",
  "provider.fluidos.eu/preemptible": "false",
  "volumes.kubernetes.io/controller-managed-attach-detach": "true",
  "workload.fluidos.eu/graphics-score": "0.95",
  "workload.fluidos.eu/hpc-score": "0.80",
  "workload.fluidos.eu/inference-score": "0.90",
  "workload.fluidos.eu/training-score": "0.85"
}
// kubectl get flavors.nodecore.fluidos.eu fluidos.eu-k8slice-89ad -ojsonpath='{.spec.flavorType.typeData.characteristics.gpu}'|jq
{
  "architecture": "ampere",
  "clock_speed": "1800M",
  "compute_capability": "8.6",
  "cores": "86016",
  "count": 8,
  "dedicated": true,
  "fp32_tflops": 38.7,
  "graphics_score": 0.95,
  "hourly_rate": 2.1,
  "hpc_score": 0.8,
  "inference_score": 0.9,
  "interconnect": "nvlink",
  "interconnect_bandwidth": "600",
  "memory": "384Gi",
  "model": "nvidia-a6000",
  "multi_gpu_efficiency": "0.85",
  "network_bandwidth": "25",
  "network_latency_ms": 5,
  "network_tier": "standard",
  "provider": "cloud-provider-2",
  "region": "zone-a",
  "sharing_strategy": "none",
  "tier": "standard",
  "topology": "ring",
  "training_score": 0.85,
  "vendor": "nvidia",
  "zone": "us-east-1"
}
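As a reference for how the flavor values relate to the annotations, the `memory` field above is the per-GPU memory multiplied by the GPU count (48Gi × 8 → 384Gi). A minimal sketch of that aggregation is below; `totalGPUMemory` is a hypothetical helper for illustration, not the actual controller code, and it only handles the `Gi` suffix seen in these annotations:

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// totalGPUMemory multiplies a per-GPU memory annotation value
// (e.g. "48Gi", as in gpu.fluidos.eu/memory-per-gpu) by the GPU
// count (gpu.fluidos.eu/count) to produce the flavor-level total.
// Hypothetical sketch: only the "Gi" suffix is supported here.
func totalGPUMemory(memoryPerGPU string, count int) (string, error) {
	v := strings.TrimSuffix(memoryPerGPU, "Gi")
	n, err := strconv.Atoi(v)
	if err != nil {
		return "", fmt.Errorf("unsupported memory value %q: %w", memoryPerGPU, err)
	}
	return fmt.Sprintf("%dGi", n*count), nil
}

func main() {
	total, _ := totalGPUMemory("48Gi", 8)
	fmt.Println(total) // prints "384Gi", matching the flavor output above
}
```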

> For testing, should I follow the quickstart from that repo, or are there specific steps/examples for this PR to make validation easier?

Yes, we're going to release the code before the end of the week; testing will then be easier.
