Merged
2 changes: 2 additions & 0 deletions .github/workflows/native_test.yaml
@@ -11,12 +11,14 @@ on:
  paths-ignore:
  - 'lglpy/**'
  - '**/*.md'
+ - '**/*.json'
  pull_request:
  branches:
  - main
  paths-ignore:
  - 'lglpy/**'
  - '**/*.md'
+ - '**/*.json'

  env:
  CMAKE_BUILD_PARALLEL_LEVEL: '8'
2 changes: 2 additions & 0 deletions .github/workflows/python_test.yaml
@@ -10,11 +10,13 @@ on:
  - '*'
  paths-ignore:
  - '**/*.md'
+ - '**/*.json'
  pull_request:
  branches:
  - main
  paths-ignore:
  - '**/*.md'
+ - '**/*.json'

  jobs:
  python-test:
41 changes: 31 additions & 10 deletions layer_gpu_profile/README_LAYER.md
@@ -113,19 +113,40 @@ application under test and the capture process. For full instructions see the

  ## Layer configuration

- The current layer supports two `sampling_mode` values:
+ ### Setting frame selection mode

- * `periodic_frame`: Sample every N frames.
- * `frame_list`: Sample specific frames.
+ The current layer supports the following ways to select frames to profile using
+ the `frame_mode` config option:

- When `mode` is `periodic_frame` the integer value of the `periodic_frame` key
- defines the frame sampling period. The integer value of the
- `periodic_min_frame` key defines the first possible frame that could be
- profiled, allowing profiles to skip over any loading frames. By default frame 0
- is ignored.
+ * `disabled`: Sampling is disabled.
+ * `periodic`: Sample every N frames.
+ * `list`: Sample specific frames.

- When `mode` is `frame_list` the value of the `frame_list` key defines a list
- of integers giving the specific frames to capture.
+ When frame selection mode is `periodic` the integer value of the
+ `periodic_frame` key defines the frame sampling period. The integer value of
+ the `periodic_min_frame` key defines the first possible frame that could be
+ profiled, allowing profiles to skip over any loading frames. By default frame
+ 0 is ignored.

+ When frame selection mode is `list` the value of the `frame_list` key defines
+ a list of integers giving the specific frames to capture.
+
+ ### Setting counter sampling mode
+
+ The current layer supports the following ways to select how to sample counters
+ to profile using the `sample_mode` config option:
+
+ * `disabled`: Sampling is disabled.
+ * `workload`: Sample every workload in each frame of interest.
+ * `frame`: Sample at the end of each frame of interest.
+
+ By default per-frame samples are isolated from other frames by inserting a
+ `vkDeviceWaitIdle()` before and after the frame to ensure that workload
+ in the sampled region does not overlap neighboring frames. Setting the
+ `frame_serialization` config option to `false` will allow frames to overlap
+ without serialization, but can add noise to the returned counter values. This
+ option has no effect for per-workload sampling, which must always use
+ serialization.
+
  ## Layer counters

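The README change above states that `frame_serialization` only affects per-frame sampling, and that per-workload sampling always serializes. As a reading aid, that rule can be written as a small decision function. This is a minimal standalone sketch, not code from the PR: `SampleMode` and `needsSerialization` are hypothetical names, and the function assumes the frame in question has already been selected as a frame of interest.

```cpp
// Hypothetical mirror of the `sample_mode` config values (illustrative only).
enum class SampleMode
{
    Disabled,
    Workload,
    Frame
};

// Returns true if a frame of interest would be isolated from its neighbors.
// Assumption: the frame has already passed the `frame_mode` selection test.
bool needsSerialization(SampleMode mode, bool frameSerialization)
{
    switch (mode)
    {
    case SampleMode::Workload:
        // Per-workload sampling always serializes, whatever the option says
        return true;
    case SampleMode::Frame:
        // Per-frame sampling follows the `frame_serialization` option
        return frameSerialization;
    case SampleMode::Disabled:
        // Nothing is sampled, so there is nothing to isolate
        return false;
    }

    // Unreachable for valid enum values
    return false;
}
```

With `sample_mode` set to `workload`, the result is `true` even when `frame_serialization` is `false`, which matches the "no effect for per-workload sampling" wording.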
2 changes: 1 addition & 1 deletion layer_gpu_profile/android_build.sh
@@ -68,7 +68,7 @@ cmake \
  -DCMAKE_WARN_DEPRECATED=OFF \
  ..

- cmake --build . -j4
+ cmake --build .

  popd

6 changes: 4 additions & 2 deletions layer_gpu_profile/layer_config.json
@@ -1,7 +1,9 @@
  {
  "layer": "VK_LAYER_LGL_gpu_profile",
- "sample_mode": "periodic_frame",
+ "frame_mode": "periodic",
+ "sample_mode": "frame",
  "periodic_min_frame": 1,
  "periodic_frame": 600,
- "frame_list": []
+ "frame_list": [],
+ "frame_serialization": true
  }
1 change: 1 addition & 0 deletions layer_gpu_profile/source/CMakeLists.txt
@@ -55,6 +55,7 @@ add_library(
  layer_device_functions_render_pass.cpp
  layer_device_functions_trace_rays.cpp
  layer_device_functions_transfer.cpp
+ layer_instance_functions.cpp
  submit_visitor.cpp)

  target_include_directories(
3 changes: 2 additions & 1 deletion layer_gpu_profile/source/device_utils.hpp
@@ -58,7 +58,8 @@
  VkCommandBuffer commandBuffer
  ) {
  // Don't instrument outside of active frame of interest
- if(!layer.isFrameOfInterest)
+ bool isEnabled = layer.instance->config.isSamplingWorkloads();
+ if(!layer.isFrameOfInterest || !isEnabled)
  {
  return;
  }
4 changes: 3 additions & 1 deletion layer_gpu_profile/source/instance.cpp
@@ -46,7 +46,9 @@ const std::vector<std::string> Instance::requiredDriverExtensions {
  const std::vector<std::pair<std::string, uint32_t>> Instance::injectedInstanceExtensions {};

  /* See header for documentation. */
- std::vector<std::pair<std::string, uint32_t>> Instance::injectedDeviceExtensions {};
+ std::vector<std::pair<std::string, uint32_t>> Instance::injectedDeviceExtensions {
+ {VK_EXT_FRAME_BOUNDARY_EXTENSION_NAME, VK_EXT_FRAME_BOUNDARY_SPEC_VERSION}
+ };

  /* See header for documentation. */
  void Instance::store(VkInstance handle, std::unique_ptr<Instance>& instance)
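This hunk adds `VK_EXT_frame_boundary` to the device extensions the layer injects. The code that consumes `injectedDeviceExtensions` is not part of this diff, so the following is only a sketch of one way such a list could be merged into an application's requested extensions without adding duplicates. `mergeExtensions` is a hypothetical helper, and plain strings stand in for the Vulkan header macros so the example compiles on its own. A real layer would also need to confirm that the driver supports the extension before enabling it.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper: append each injected extension name to the list the
// application requested, skipping names that are already present.
std::vector<std::string> mergeExtensions(
    std::vector<std::string> requested,
    const std::vector<std::pair<std::string, std::uint32_t>>& injected)
{
    for (const auto& ext : injected)
    {
        bool present = std::find(requested.begin(), requested.end(), ext.first)
                       != requested.end();
        if (!present)
        {
            requested.push_back(ext.first);
        }
    }

    return requested;
}
```

The spec version in each pair is unused here; it is kept only so the parameter type matches the shape of the list declared in `instance.cpp`.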
94 changes: 77 additions & 17 deletions layer_gpu_profile/source/layer_config.cpp
@@ -43,45 +43,78 @@
  /* See header for documentation. */
  void LayerConfig::parseSamplingOptions(const json& config)
  {
- // Decode top level options
- std::string rawMode = config.at("sample_mode");
+ // Decode frame selection mode
+ std::string rawFrameMode = config.at("frame_mode");

- if (rawMode == "disabled")
+ if (rawFrameMode == "disabled")
  {
- mode = MODE_DISABLED;
+ frameMode = FRAME_SELECTION_DISABLED;
  }
- else if (rawMode == "periodic_frame")
+ else if (rawFrameMode == "periodic")
  {
- mode = MODE_PERIODIC_FRAME;
+ frameMode = FRAME_SELECTION_PERIODIC;
  periodicFrame = config.at("periodic_frame");
  periodicMinFrame = config.at("periodic_min_frame");
  }
- else if (rawMode == "frame_list")
+ else if (rawFrameMode == "list")
  {
- mode = MODE_FRAME_LIST;
+ frameMode = FRAME_SELECTION_LIST;
  specificFrames = config.at("frame_list").get<std::vector<uint64_t>>();
  }
  else
  {
- LAYER_ERR("Unknown counter sample_mode: %s", rawMode.c_str());
- rawMode = "disabled";
+ LAYER_ERR("Unknown frame_mode: %s", rawFrameMode.c_str());
+ frameMode = FRAME_SELECTION_DISABLED;
+ rawFrameMode = "disabled";
  }

+ // Decode counter sampling mode
+ std::string rawSampleMode = config.at("sample_mode");
+
+ if (rawSampleMode == "disabled")
+ {
+ samplingMode = COUNTER_SAMPLING_DISABLED;
+ }
+ else if (rawSampleMode == "frame")
+ {
+ samplingMode = COUNTER_SAMPLING_FRAMES;
+ }
+ else if (rawSampleMode == "workload")
+ {
+ samplingMode = COUNTER_SAMPLING_WORKLOADS;
+ }
+ else
+ {
+ LAYER_ERR("Unknown sample_mode: %s", rawSampleMode.c_str());
+ samplingMode = COUNTER_SAMPLING_DISABLED;
+ rawSampleMode = "disabled";
+ }
+
+ // Decode frame serialization mode
+ frameSerialization = config.at("frame_serialization");
+
  LAYER_LOG("Layer sampling configuration");
  LAYER_LOG("============================");
- LAYER_LOG(" - Sample mode: %s", rawMode.c_str());
+ LAYER_LOG(" - Frame selection mode: %s", rawFrameMode.c_str());

- if (mode == MODE_PERIODIC_FRAME)
+ if (frameMode == FRAME_SELECTION_PERIODIC)
  {
  LAYER_LOG(" - Frame period: %" PRIu64, periodicFrame);
  LAYER_LOG(" - Minimum frame: %" PRIu64, periodicMinFrame);
  }
- else if (mode == MODE_FRAME_LIST)
+ else if (frameMode == FRAME_SELECTION_LIST)
  {
  std::stringstream result;
  std::copy(specificFrames.begin(), specificFrames.end(), std::ostream_iterator<uint64_t>(result, " "));
  LAYER_LOG(" - Frames: %s", result.str().c_str());
  }
+
+ LAYER_LOG(" - Counter sampling mode: %s", rawSampleMode.c_str());
+
+ if (samplingMode == COUNTER_SAMPLING_FRAMES)
+ {
+ LAYER_LOG(" - Frame serialization: %u", frameSerialization);
+ }
  }

  /* See header for documentation. */
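The two if/else chains in `parseSamplingOptions()` follow the same pattern: map a config string to an enum value, and fall back to the disabled value (rewriting the raw string to `"disabled"` for the log) when the string is unknown. A table-driven helper is one possible way to express that pattern once. This is a sketch of an alternative, not what the PR does: `decodeOption` and `decodeSampleMode` are hypothetical names, the enum is redeclared locally so the example is self-contained, and the `LAYER_ERR` call is omitted.

```cpp
#include <string>
#include <utility>
#include <vector>

// Local copy of the counter sampling enum added by this PR (for illustration).
enum CounterSamplingMode
{
    COUNTER_SAMPLING_DISABLED,
    COUNTER_SAMPLING_WORKLOADS,
    COUNTER_SAMPLING_FRAMES
};

// Hypothetical helper: look up `raw` in a table of known option names.
// On a miss, rewrite `raw` to "disabled" and return the fallback value.
template <typename Enum>
Enum decodeOption(std::string& raw,
                  const std::vector<std::pair<std::string, Enum>>& table,
                  Enum fallback)
{
    for (const auto& entry : table)
    {
        if (entry.first == raw)
        {
            return entry.second;
        }
    }

    // Unknown value: a real implementation would log an error here
    raw = "disabled";
    return fallback;
}

// Hypothetical wrapper for the `sample_mode` option.
CounterSamplingMode decodeSampleMode(std::string& raw)
{
    static const std::vector<std::pair<std::string, CounterSamplingMode>> table {
        {"disabled", COUNTER_SAMPLING_DISABLED},
        {"frame", COUNTER_SAMPLING_FRAMES},
        {"workload", COUNTER_SAMPLING_WORKLOADS},
    };

    return decodeOption(raw, table, COUNTER_SAMPLING_DISABLED);
}
```

The same helper could serve `frame_mode` with a second table, although the `periodic` and `list` branches also read extra keys, so they would still need mode-specific code after the lookup.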
@@ -131,18 +164,45 @@ LayerConfig::LayerConfig()
  bool LayerConfig::isFrameOfInterest(
  uint64_t frameID
  ) const {
- switch(mode)
+ switch(frameMode)
  {
- case MODE_DISABLED:
+ case FRAME_SELECTION_DISABLED:
  return false;
- case MODE_PERIODIC_FRAME:
+ case FRAME_SELECTION_PERIODIC:
  return (frameID >= periodicMinFrame) &&
  ((frameID % periodicFrame) == 0);
- case MODE_FRAME_LIST:
+ case FRAME_SELECTION_LIST:
  return isIn(frameID, specificFrames);
  }

  // Should never reach here
  return false;
  }

+ /* See header for documentation. */
+ bool LayerConfig::isSamplingWorkloads() const
+ {
+ return frameMode != FRAME_SELECTION_DISABLED &&
+ samplingMode == COUNTER_SAMPLING_WORKLOADS;
+ }
+
+ /* See header for documentation. */
+ bool LayerConfig::isSamplingFrames() const
+ {
+ return frameMode != FRAME_SELECTION_DISABLED &&
+ samplingMode == COUNTER_SAMPLING_FRAMES;
+ }
+
+ /* See header for documentation. */
+ bool LayerConfig::isSamplingAny() const
+ {
+ return frameMode != FRAME_SELECTION_DISABLED &&
+ samplingMode != COUNTER_SAMPLING_DISABLED;
+ }
+
+ /* See header for documentation. */
+ bool LayerConfig::isSerializingFrames() const
+ {
+ return isSamplingWorkloads() ||
+ (isSamplingFrames() && frameSerialization);
+ };
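One observation on `isFrameOfInterest()`: in periodic mode it evaluates `frameID % periodicFrame`, and the header describes the period as "0 if disabled". Nothing in the hunks shown here rejects a config that sets `frame_mode` to `periodic` with `periodic_frame` set to 0, and a modulo by zero is undefined behavior in C++. Validation may well exist elsewhere in the file. If it does not, a guarded check could look like the sketch below. `isPeriodicFrameOfInterest` is a hypothetical free function written for illustration, not part of the PR.

```cpp
#include <cstdint>

// Hypothetical standalone version of the periodic frame test with a guard.
bool isPeriodicFrameOfInterest(std::uint64_t frameID,
                               std::uint64_t period,
                               std::uint64_t minFrame)
{
    // Treat a zero period as "never sample" instead of dividing by zero
    if (period == 0)
    {
        return false;
    }

    return (frameID >= minFrame) && ((frameID % period) == 0);
}
```

With the values in the updated `layer_config.json` (`periodic_frame` of 600 and `periodic_min_frame` of 1), frame 0 is skipped because it is below the minimum, and frame 600 is the first frame selected.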
64 changes: 56 additions & 8 deletions layer_gpu_profile/source/layer_config.hpp
@@ -54,19 +54,57 @@ class LayerConfig
  *
  * @param frameID The index of the next frame.
  *
- * @return True if profiling should be enabled, False otherwise.
+ * @return @c true if profiling should be enabled, @c false otherwise.
  */
  bool isFrameOfInterest(uint64_t frameID) const;

+ /**
+ * @brief Test if we are sampling workloads.
+ *
+ * @return @c true if profiling workloads, @c false otherwise.
+ */
+ bool isSamplingWorkloads() const;
+
+ /**
+ * @brief Test if we are sampling frames.
+ *
+ * @return @c true if profiling frames, @c false otherwise.
+ */
+ bool isSamplingFrames() const;
+
+ /**
+ * @brief Test if any kind of sampling is active.
+ *
+ * @return @c true if profiling, @c false otherwise.
+ */
+ bool isSamplingAny() const;
+
+ /**
+ * @brief Test if we are serializing frames.
+ *
+ * @return @c true if serializing, @c false otherwise.
+ */
+ bool isSerializingFrames() const;
+
  private:
  /**
- * @brief Supported sampling modes.
+ * @brief Supported frame selection modes.
+ */
+ enum FrameSelectionMode
+ {
+ FRAME_SELECTION_DISABLED,
+ FRAME_SELECTION_LIST,
+ FRAME_SELECTION_PERIODIC
+ };
+
+ /**
+ * @brief Supported counter sampling modes.
  */
- enum SamplingMode
+ enum CounterSamplingMode
  {
- MODE_DISABLED,
- MODE_FRAME_LIST,
- MODE_PERIODIC_FRAME
+ COUNTER_SAMPLING_DISABLED,
+ COUNTER_SAMPLING_WORKLOADS,
+ COUNTER_SAMPLING_FRAMES
  };

  /**
@@ -79,9 +117,19 @@ class LayerConfig
  void parseSamplingOptions(const json& config);

  /**
- * @brief The sampling mode.
+ * @brief The frame selection mode.
+ */
+ FrameSelectionMode frameMode {FRAME_SELECTION_DISABLED};
+
+ /**
+ * @brief The counter sampling mode.
+ */
+ CounterSamplingMode samplingMode {COUNTER_SAMPLING_DISABLED};
+
+ /**
+ * @brief The frame sample serialization mode.
  */
- SamplingMode mode {MODE_DISABLED};
+ bool frameSerialization {true};

  /**
  * @brief The sampling period in frames, or 0 if disabled.
7 changes: 7 additions & 0 deletions layer_gpu_profile/source/layer_device_functions.hpp
@@ -419,3 +419,10 @@ VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueueSubmit2KHR<user_tag>(VkQueue queue,
  uint32_t submitCount,
  const VkSubmitInfo2* pSubmits,
  VkFence fence);
+
+ /* See Vulkan API for documentation. */
+ template <>
+ VKAPI_ATTR VkResult VKAPI_CALL layer_vkQueueBindSparse<user_tag>(VkQueue queue,
+ uint32_t bindInfoCount,
+ const VkBindSparseInfo* pBindInfo,
+ VkFence fence);