-
Notifications
You must be signed in to change notification settings - Fork 65
New debug draw extension for AABBs #900
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
struct PSInput | ||
{ | ||
float32_t4 position : SV_Position; | ||
float32_t4 color : TEXCOORD0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why are you using HW attributes for color? the color is per-instance
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah these are Vx-Px inter-stage shenanigans, I'll allow, make sure you label color flat though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
still not labelled as flat interpolation
include/nbl/ext/DebugDraw/builtin/hlsl/aabb_instances.fragment.hlsl
Outdated
Show resolved
Hide resolved
include/nbl/ext/DebugDraw/builtin/hlsl/aabb_instances.vertex.hlsl
Outdated
Show resolved
Hide resolved
include/nbl/ext/DebugDraw/builtin/hlsl/aabb_instances.vertex.hlsl
Outdated
Show resolved
Hide resolved
include/nbl/ext/DebugDraw/builtin/hlsl/aabb_instances.vertex.hlsl
Outdated
Show resolved
Hide resolved
const float32_t3 unitAABBVertices[8] = { | ||
float32_t3(0.0, 0.0, 0.0), | ||
float32_t3(1.0, 0.0, 0.0), | ||
float32_t3(0.0, 0.0, 1.0), | ||
float32_t3(1.0, 0.0, 1.0), | ||
float32_t3(0.0, 1.0, 0.0), | ||
float32_t3(1.0, 1.0, 0.0), | ||
float32_t3(0.0, 1.0, 1.0), | ||
float32_t3(1.0, 1.0, 1.0) | ||
}; | ||
|
||
PSInput output; | ||
float32_t3 vertex = unitAABBVertices[glsl::gl_VertexIndex()]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
its better to use arithmetic for this, e.g.
const uint32_t index = glsl::gl_VertexIndex();
cosnt float32_t vertex = (promote<uint32_t3>(index)>>uint32_t(0,2,1)) & 0x1u;
Also its slightly annoying that that you took Y coordinate to be more important than Z
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this repeats betwee n the single and multiple draw vertex shader, turn this into a function and slap it in common.hlsl
#include "nbl/builtin/hlsl/math/linalg/fast_affine.hlsl" | ||
#include "nbl/builtin/hlsl/glsl_compat/core.hlsl" | ||
#include "nbl/builtin/hlsl/bda/__ptr.hlsl" | ||
#include "common.hlsl" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the includes repeat between the single and multiple draw vertex shader, slap them in common.hlsl
under a __HLSL_VERSION
#ifndef _NBL_EXT_DRAW_AABB_H_ | ||
#define _NBL_EXT_DRAW_AABB_H_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
full header guard
{ | ||
public: | ||
static constexpr inline uint32_t IndicesCount = 24u; | ||
static constexpr inline uint32_t VerticesCount = 8u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unused variable, yeet it
|
||
bool render(video::IGPUCommandBuffer* commandBuffer, video::ISemaphore::SWaitInfo waitInfo, std::span<const InstanceData> aabbInstances, const hlsl::float32_t4x4& cameraMat); | ||
|
||
static hlsl::float32_t4x4 getTransformFromAABB(const hlsl::shapes::AABB<3, float>& aabb); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
docs please for this, also make it inline
~DrawAABB() override; | ||
|
||
private: | ||
static bool validateCreationParameters(SCreationParameters& params); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it an operator bool() const
of the creation params
|
||
struct SCreationParameters : SCachedCreationParameters | ||
{ | ||
video::IQueue* transfer = nullptr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comment that its only needed to make the 24 element index buffer and not used for anything later
DrawAABB::DrawAABB(SCreationParameters&& params, core::smart_refctd_ptr<video::IGPUGraphicsPipeline> singlePipeline, smart_refctd_ptr<IGPUGraphicsPipeline> batchPipeline, smart_refctd_ptr<IGPUBuffer> indicesBuffer) | ||
: m_cachedCreationParams(std::move(params)), m_singlePipeline(std::move(singlePipeline)), m_batchPipeline(std::move(batchPipeline)), | ||
m_indicesBuffer(std::move(indicesBuffer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nitpick you could roll all these parameters into a struct so you have less typing as you move between create
, new
, the constructor and initialization
#ifdef NBL_EMBED_BUILTIN_RESOURCES | ||
auto archive = make_smart_refctd_ptr<builtin::CArchive>(smart_refctd_ptr(logger)); | ||
system->mount(smart_refctd_ptr(archive), archiveAlias.data()); | ||
#else | ||
auto NBL_EXTENSION_MOUNT_DIRECTORY_ENTRY = (path(_ARCHIVE_ABSOLUTE_ENTRY_PATH_) / NBL_ARCHIVE_ENTRY).make_preferred(); | ||
auto archive = make_smart_refctd_ptr<nbl::system::CMountDirectoryArchive>(std::move(NBL_EXTENSION_MOUNT_DIRECTORY_ENTRY), smart_refctd_ptr(logger), system); | ||
system->mount(smart_refctd_ptr(archive), archiveAlias.data()); | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I'm starting to think that the HLSL includes can't be found because of a bad archive alias (esp in the non-embedded case)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also how do you protect against mounting the same archive twice ?
const auto validation = std::to_array | ||
({ | ||
std::make_pair(bool(creationParams.assetManager), "Invalid `creationParams.assetManager` is nullptr!"), | ||
std::make_pair(bool(creationParams.assetManager->getSystem()), "Invalid `creationParams.assetManager->getSystem()` is nullptr!"), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this can be asserted
// | ||
virtual inline bool isDirectory(const system::path& p) const | ||
{ | ||
// TODO: fix bug, input "nbl/ext/DebugDraw/builtin/hlsl" -> returs true when no such dir present in mounted stuff due to how it uses parent paths in loop (goes up up till matches "nbl" builtin archive and thinks it resolved the requested dir) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AnastaZIuk open an issue about it
if (!system->exists(path(NBL_ARCHIVE_ENTRY) / "common.hlsl", {})) | ||
mount(smart_refctd_ptr<ILogger>(params.utilities->getLogger()), system.get(), NBL_ARCHIVE_ENTRY); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't this check be inside mount
? https://github.com/Devsh-Graphics-Programming/Nabla/pull/900/files#r2353004514
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually yes it could be
// note we use archive entry explicitly for temporary compiler include search path & asset cwd to use keys directly | ||
constexpr std::string_view NBL_ARCHIVE_ENTRY = _ARCHIVE_ENTRY_KEY_; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does this variable come from and does everything still build without embedded resources cmake option?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
where does this variable come
_ARCHIVE_ENTRY_KEY_
is private define set with CMake, scoped to the lib's sources only
does everything still build without embedded resources cmake option?
builds and runs
CHLSLCompiler::SOptions options = {}; | ||
options.stage = stage; | ||
options.preprocessorOptions.sourceIdentifier = filePath; | ||
options.preprocessorOptions.logger = params.utilities->getLogger(); | ||
options.preprocessorOptions.includeFinder = includeFinder.get(); | ||
shaderSrc = compiler->compileToSPIRV((const char*)shaderSrc->getContent()->getPointer(), options); | ||
|
||
return params.utilities->getLogicalDevice()->compileShader({ shaderSrc.get() }); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@AnastaZIuk can we make @YasInvolved use CMake to make precompiled SPIR-V with NSC right way in a future PR to CI the shaders?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes we can
auto getRequiredAccessFlags = [&](const bitflag<IDeviceMemoryAllocation::E_MEMORY_PROPERTY_FLAGS>& properties) | ||
{ | ||
bitflag<IDeviceMemoryAllocation::E_MAPPING_CPU_ACCESS_FLAGS> flags(IDeviceMemoryAllocation::EMCAF_NO_MAPPING_ACCESS); | ||
|
||
if (properties.hasFlags(IDeviceMemoryAllocation::EMPF_HOST_READABLE_BIT)) | ||
flags |= IDeviceMemoryAllocation::EMCAF_READ; | ||
if (properties.hasFlags(IDeviceMemoryAllocation::EMPF_HOST_WRITABLE_BIT)) | ||
flags |= IDeviceMemoryAllocation::EMCAF_WRITE; | ||
|
||
return flags; | ||
}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only check for write flag on the mapping
smart_refctd_ptr<IGPUBuffer> indicesBuffer; | ||
params.utilities->createFilledDeviceLocalBufferOnDedMem( | ||
SIntendedSubmitInfo{ .queue = params.transfer }, | ||
std::move(bufparams), | ||
unitAABBIndices.data() | ||
).move_into(indicesBuffer); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be slightly less complex (wouldn't have to take the utilities
in the creation params) if you just shot off a single use commandbuffer with updateBuffer
command since the update is less than 64kb (you just need the usage flag on the buffer to allow it)
.offset = 0, | ||
.size = sizeof(SPushConstants) | ||
}; | ||
return device->createPipelineLayout({ &pcRange , 1 }, nullptr, nullptr, nullptr, nullptr); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why no tcall the createPipelineLayoutFromPCRange
singlePipeline = createPipeline(params, params.singlePipelineLayout.get(), "single.vertex.hlsl", "aabb_instances.fragment.hlsl"); | ||
if (!singlePipeline) | ||
{ | ||
logger->log("Failed to create pipeline!", ILogger::ELL_ERROR); | ||
return nullptr; | ||
} | ||
} | ||
|
||
smart_refctd_ptr<IGPUGraphicsPipeline> batchPipeline = nullptr; | ||
if (params.drawMode & ADM_DRAW_BATCH) | ||
{ | ||
batchPipeline = createPipeline(params, params.batchPipelineLayout.get(), "aabb_instances.vertex.hlsl", "aabb_instances.fragment.hlsl"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
shouldn't you default the layouts if they're missing?
SPushConstantRange pcRange = { | ||
.stageFlags = IShader::E_SHADER_STAGE::ESS_VERTEX, | ||
.offset = 0, | ||
.size = sizeof(SPushConstants) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
your createDefaultPipelineLayout
needs to take an enum about what pipeline this is for, cause you use two different push constant structs
return true; | ||
} | ||
|
||
hlsl::float32_t4x4 DrawAABB::getTransformFromAABB(const hlsl::shapes::AABB<3, float>& aabb) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this should return 3x4
not 4x4
bool DrawAABB::renderSingle(IGPUCommandBuffer* commandBuffer, const hlsl::shapes::AABB<3, float>& aabb, const hlsl::float32_t4& color, const hlsl::float32_t4x4& cameraMat) | ||
{ | ||
if (!(m_cachedCreationParams.drawMode & ADM_DRAW_SINGLE)) | ||
{ | ||
m_cachedCreationParams.utilities->getLogger()->log("DrawAABB has not been enabled for draw single!", ILogger::ELL_ERROR); | ||
return false; | ||
} | ||
|
||
commandBuffer->bindGraphicsPipeline(m_singlePipeline.get()); | ||
commandBuffer->setLineWidth(1.f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a struct with command buffer pointer, camera viewProj matrix and line width to use for both functions
SSinglePushConstants pc; | ||
|
||
hlsl::float32_t4x4 instanceTransform = getTransformFromAABB(aabb); | ||
pc.instance.transform = hlsl::mul(cameraMat, instanceTransform); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promoted_mul to mul 4x4 with 3x4
std::vector<InstanceData> instances(aabbInstances.size()); | ||
for (uint32_t i = 0; i < aabbInstances.size(); i++) | ||
{ | ||
auto& inst = instances[i]; | ||
inst = aabbInstances[i]; | ||
inst.transform = hlsl::mul(cameraMat, inst.transform); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can we skip having this temporary vector and do this loop as a lambda we call instead of the memcpy
below?
auto instancesIt = instances.begin(); | ||
const uint32_t instancesPerIter = streaming->getBuffer()->getSize() / sizeof(InstanceData); | ||
using suballocator_t = core::LinearAddressAllocatorST<offset_t>; | ||
while (instancesIt != instances.end()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you'll deadlock if the streaming buffer is not large enough, check aabbInstances.size()>instancesPerIter
and return false
No description provided.