Description
I have a question about two of the backends: Vulkan and Kompute.
I am running the latest version of Windows 11 with the latest AMD drivers on an AMD Radeon RX 7800 XT graphics card.
I have tried different Windows installations and the result is the same. I also tried different driver versions with a clean installation, and it is still the same.
I just want to know whether this card is supported on Windows with Kompute and Vulkan, or only on Linux.
Thanks
If I run the Vulkan version (b2251), I get this error:
main: build = 2251 (fd43d66)
main: built with MSVC 19.38.33135.0 for x64
Starting Test
Allocating Memory of size 800194560 bytes, 763 MB
ggml_vulkan: Found 1 Vulkan devices:
Vulkan0: AMD Radeon RX 7800 XT | uma: 0 | fp16: 1 | warp size: 64
Creating new tensors
------ Test 1 - Matrix Mult via F32 code
n_threads=1
m11: type = 0 ( f32) ne = 11008 x 4096 x 1, nb = ( 4, 44032, 180355072) - Sum of tensor m11 is 45088768.00
m2: type = 0 ( f32) ne = 11008 x 128 x 1, nb = ( 4, 44032, 5636096) - Sum of tensor m2 is 2818048.00
GGML_ASSERT: D:\a\llama.cpp\llama.cpp\ggml-vulkan.cpp:1767: false
If I run the Kompute version (b2251), I don't get an error, but it doesn't seem to use the graphics card, judging by the low scores:
main: build = 2251 (fd43d66)
main: built with MSVC 19.38.33135.0 for x64
Starting Test
Allocating Memory of size 800194560 bytes, 763 MB
Creating new tensors
------ Test 1 - Matrix Mult via F32 code
n_threads=1
m11: type = 0 ( f32) ne = 11008 x 4096 x 1, nb = ( 4, 44032, 180355072) - Sum of tensor m11 is 45088768.00
m2: type = 0 ( f32) ne = 11008 x 128 x 1, nb = ( 4, 44032, 5636096) - Sum of tensor m2 is 2818048.00
gf->nodes[0]: type = 0 ( f32) ne = 4096 x 128 x 1, nb = ( 4, 16384, 2097152) - Sum of tensor gf->nodes[0] is 11542724608.00
------ Test 2 - Matrix Mult via q4_1 code
n_threads=1
Matrix Multiplication of (11008,4096,1) x (11008,128,1) - about 11.54 gFLOPS
Iteration;NThreads; SizeX; SizeY; SizeZ; Required_FLOPS; Elapsed_u_Seconds; gigaFLOPS
0; 1; 11008; 4096; 128; 11542724608; 141751; 81.43
1; 1; 11008; 4096; 128; 11542724608; 140819; 81.97
2; 1; 11008; 4096; 128; 11542724608; 141098; 81.81
3; 1; 11008; 4096; 128; 11542724608; 140593; 82.10
4; 1; 11008; 4096; 128; 11542724608; 140639; 82.07
5; 1; 11008; 4096; 128; 11542724608; 140766; 82.00
6; 1; 11008; 4096; 128; 11542724608; 140835; 81.96
7; 1; 11008; 4096; 128; 11542724608; 141091; 81.81
8; 1; 11008; 4096; 128; 11542724608; 140719; 82.03
9; 1; 11008; 4096; 128; 11542724608; 140794; 81.98
Average 81.92
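As a sanity check on the figures in the table above, here is a minimal sketch of how the Required_FLOPS and gigaFLOPS columns can be reproduced from the logged sizes and timings. It assumes the benchmark counts 2 FLOPs per multiply-add, which matches the logged value of 11542724608. The function names are hypothetical and are not taken from llama.cpp.

```python
# Hypothetical helpers; names are illustrative, not llama.cpp functions.

def required_flops(size_x: int, size_y: int, size_z: int) -> int:
    """FLOP count for the matrix multiplication, assuming 2 FLOPs per multiply-add."""
    return 2 * size_x * size_y * size_z


def gigaflops(flops: int, elapsed_us: int) -> float:
    """Convert a FLOP count and an elapsed time in microseconds to GFLOPS."""
    # flops / (elapsed_us * 1e-6 s) / 1e9 simplifies to flops / (elapsed_us * 1e3)
    return flops / (elapsed_us * 1e3)


# Per-iteration gigaFLOPS values copied from the log above.
logged = [81.43, 81.97, 81.81, 82.10, 82.07, 82.00, 81.96, 81.81, 82.03, 81.98]

flops = required_flops(11008, 4096, 128)
rate = gigaflops(flops, 141751)          # iteration 0 from the log
average = sum(logged) / len(logged)

print(flops)                # 11542724608
print(round(rate, 2))       # 81.43
print(round(average, 2))    # 81.92
```

Since the log reports n_threads=1 and no Vulkan device line appears in the Kompute run, these numbers alone do not show which device performed the multiplication.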
The first part of my vulkaninfo output (it was too big to include in full) looks like this:
WARNING: [Loader Message] Code 0 : windows_read_data_files_in_registry: Registry lookup failed to get layer manifest files.
VULKANINFO
Vulkan Instance Version: 1.3.261
Instance Extensions: count = 13
VK_EXT_debug_report : extension revision 10
VK_EXT_debug_utils : extension revision 2
VK_EXT_swapchain_colorspace : extension revision 4
VK_KHR_device_group_creation : extension revision 1
VK_KHR_external_fence_capabilities : extension revision 1
VK_KHR_external_memory_capabilities : extension revision 1
VK_KHR_external_semaphore_capabilities : extension revision 1
VK_KHR_get_physical_device_properties2 : extension revision 2
VK_KHR_get_surface_capabilities2 : extension revision 1
VK_KHR_portability_enumeration : extension revision 1
VK_KHR_surface : extension revision 25
VK_KHR_win32_surface : extension revision 6
VK_LUNARG_direct_driver_loading : extension revision 1
Layers: count = 1
VK_LAYER_AMD_switchable_graphics (AMD switchable graphics layer) Vulkan version 1.3.277, layer version 1:
Layer Extensions: count = 0
Devices: count = 1
GPU id = 0 (AMD Radeon RX 7800 XT)
Layer-Device Extensions: count = 0
Presentable Surfaces:
GPU id : 0 (AMD Radeon RX 7800 XT):
Surface type = VK_KHR_win32_surface
Formats: count = 4
SurfaceFormat[0]:
format = FORMAT_R8G8B8A8_UNORM
colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
SurfaceFormat[1]:
format = FORMAT_B8G8R8A8_UNORM
colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
SurfaceFormat[2]:
format = FORMAT_R8G8B8A8_SRGB
colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
SurfaceFormat[3]:
format = FORMAT_B8G8R8A8_SRGB
colorSpace = COLOR_SPACE_SRGB_NONLINEAR_KHR
Present Modes: count = 3
PRESENT_MODE_IMMEDIATE_KHR
PRESENT_MODE_FIFO_KHR
PRESENT_MODE_FIFO_RELAXED_KHR
VkSurfaceCapabilitiesKHR:
-------------------------
minImageCount = 2
maxImageCount = 16
currentExtent:
width = 256
height = 256
minImageExtent:
width = 256
height = 256
maxImageExtent:
width = 256
height = 256
maxImageArrayLayers = 1
supportedTransforms: count = 1
SURFACE_TRANSFORM_IDENTITY_BIT_KHR
currentTransform = SURFACE_TRANSFORM_IDENTITY_BIT_KHR
supportedCompositeAlpha: count = 1
COMPOSITE_ALPHA_OPAQUE_BIT_KHR
supportedUsageFlags: count = 6
IMAGE_USAGE_TRANSFER_SRC_BIT
IMAGE_USAGE_TRANSFER_DST_BIT
IMAGE_USAGE_SAMPLED_BIT
IMAGE_USAGE_STORAGE_BIT
IMAGE_USAGE_COLOR_ATTACHMENT_BIT
IMAGE_USAGE_INPUT_ATTACHMENT_BIT
VkSurfaceCapabilitiesFullScreenExclusiveEXT:
--------------------------------------------
fullScreenExclusiveSupported = true
Device Properties and Extensions:
GPU0:
VkPhysicalDeviceProperties:
apiVersion = 1.3.277 (4206869)
driverVersion = 2.0.299 (8388907)
vendorID = 0x1002
deviceID = 0x747e
deviceType = PHYSICAL_DEVICE_TYPE_DISCRETE_GPU
deviceName = AMD Radeon RX 7800 XT
pipelineCacheUUID = 342bec4f-5205-5a35-9265-2b80db05cfac
VkPhysicalDeviceLimits:
maxImageDimension1D = 16384
maxImageDimension2D = 16384
maxImageDimension3D = 8192
maxImageDimensionCube = 16384
maxImageArrayLayers = 8192
maxTexelBufferElements = 4294967295
maxUniformBufferRange = 4294967295
maxStorageBufferRange = 4294967295
maxPushConstantsSize = 128
maxMemoryAllocationCount = 4294967295
maxSamplerAllocationCount = 1048576
bufferImageGranularity = 0x00000001
sparseAddressSpaceSize = 0x7ffa00000000
maxBoundDescriptorSets = 32
maxPerStageDescriptorSamplers = 4294967295
maxPerStageDescriptorUniformBuffers = 4294967295
maxPerStageDescriptorStorageBuffers = 4294967295
maxPerStageDescriptorSampledImages = 4294967295
maxPerStageDescriptorStorageImages = 4294967295
maxPerStageDescriptorInputAttachments = 4294967295
maxPerStageResources = 4294967295
maxDescriptorSetSamplers = 4294967295
maxDescriptorSetUniformBuffers = 4294967295
maxDescriptorSetUniformBuffersDynamic = 8
maxDescriptorSetStorageBuffers = 4294967295
maxDescriptorSetStorageBuffersDynamic = 8
maxDescriptorSetSampledImages = 4294967295
maxDescriptorSetStorageImages = 4294967295
maxDescriptorSetInputAttachments = 4294967295
maxVertexInputAttributes = 64
maxVertexInputBindings = 32
maxVertexInputAttributeOffset = 4294967295
maxVertexInputBindingStride = 16383
maxVertexOutputComponents = 128
maxTessellationGenerationLevel = 64
maxTessellationPatchSize = 32
maxTessellationControlPerVertexInputComponents = 128
maxTessellationControlPerVertexOutputComponents = 128
maxTessellationControlPerPatchOutputComponents = 120
maxTessellationControlTotalOutputComponents = 4096
maxTessellationEvaluationInputComponents = 128
maxTessellationEvaluationOutputComponents = 128
maxGeometryShaderInvocations = 126
maxGeometryInputComponents = 128
maxGeometryOutputComponents = 128
maxGeometryOutputVertices = 256
maxGeometryTotalOutputComponents = 1024
maxFragmentInputComponents = 128
maxFragmentOutputAttachments = 8
maxFragmentDualSrcAttachments = 1
maxFragmentCombinedOutputResources = 4294967295
maxComputeSharedMemorySize = 32768
maxComputeWorkGroupCount: count = 3
4294967295
65535
65535
maxComputeWorkGroupInvocations = 1024
maxComputeWorkGroupSize: count = 3
1024
1024
1024
subPixelPrecisionBits = 8
subTexelPrecisionBits = 8
mipmapPrecisionBits = 8
maxDrawIndexedIndexValue = 4294967295
maxDrawIndirectCount = 4294967295
maxSamplerLodBias = 15.9961
maxSamplerAnisotropy = 16
maxViewports = 16
maxViewportDimensions: count = 2
16384
16384
viewportBoundsRange: count = 2
-32768
32767
viewportSubPixelBits = 8
minMemoryMapAlignment = 64
minTexelBufferOffsetAlignment = 0x00000004
minUniformBufferOffsetAlignment = 0x00000010
minStorageBufferOffsetAlignment = 0x00000004
minTexelOffset = -64
maxTexelOffset = 63
minTexelGatherOffset = -32
maxTexelGatherOffset = 31
minInterpolationOffset = -2
maxInterpolationOffset = 1
subPixelInterpolationOffsetBits = 8
maxFramebufferWidth = 16384
maxFramebufferHeight = 16384
maxFramebufferLayers = 8192
framebufferColorSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
framebufferDepthSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
framebufferStencilSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
framebufferNoAttachmentsSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
maxColorAttachments = 8
sampledImageColorSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
sampledImageIntegerSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
sampledImageDepthSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
sampledImageStencilSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
storageImageSampleCounts: count = 4
SAMPLE_COUNT_1_BIT
SAMPLE_COUNT_2_BIT
SAMPLE_COUNT_4_BIT
SAMPLE_COUNT_8_BIT
maxSampleMaskWords = 1
timestampComputeAndGraphics = true
timestampPeriod = 10
maxClipDistances = 8
maxCullDistances = 8
maxCombinedClipAndCullDistances = 8
discreteQueuePriorities = 2
pointSizeRange: count = 2
0
8191.88
lineWidthRange: count = 2
0
8191.88
pointSizeGranularity = 0.125
lineWidthGranularity = 0.125
strictLines = false
standardSampleLocations = true
optimalBufferCopyOffsetAlignment = 0x00000001
optimalBufferCopyRowPitchAlignment = 0x00000001
nonCoherentAtomSize = 0x00000080