Add BCn encoders/decoders with RDO support#1167
Conversation
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Add encode/decode tests (that use both CompressBCn/DecodeBCn) *Add BCn ktx2 test files (transcoded from tests/resources/ktx2/color_grid_uastc_zstd_5.ktx2) *Cleanup BCn test fixtures *Remove `std::cout` statement Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
There are also a lot of compiler warnings from bc7enc_rdo dependency. These should be straightforward to address directly in copied files from bc7enc_rdo. |
*Add BC1, BC3, BC4, BC5, and BC7 encoding support to "ktx encode" command. *Cleanup ktxBCnParams and add BC1/BC3 quality and mode params. *Add docstrings/documentation to newly added enums/structs in ktx.h. Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
@walcht, Thank you for this. Can you view the build logs? One issue I notice immediately is that there are no changes to Regarding RDO and multi-threading, the UASTC encoder also has separate multi-threading options for the encoder and for the RDO step. You can follow the same model. If we need to fix warnings in the encoder I suggest forking the encoder then incorporating the fork here by way of Re. SIMD and ISPC, be careful how you support this. We need to support building and running on arm64 processors. Also compile flags to enable SSE or other SIMD options are not compatible with straightforward use of universal build tool chains, which is why this project does not do universal builds. Better is use of compiler pre-defined macros and run-time queries to discover what the software is being compiled for and running on. However since we aren't doing universal builds there is no need to obsess over this last detail. |
|
@MarkCallow - Concerning the Concerning the build/CI logs: they are mostly failing because of bc7enc_rdo compiler warnings which should be suppressed. They will also fail because I haven't re-generated the golden files yet for ktx CTS (e.g.,
Please do so (as far as I understood, this will be forked under the KhronosGroup and any updates here will be pushed there via the Concerning SIMD and ISPC: I think it makes sense to leave this for another PR, do you agree? (reasoning is this: I have to get the basics working properly, add proper testsuite that covers all supported BCn formats, etc. Once that is done, I can open another PR for SIMD performance improvements). Or I will leave this to very end (last in TODO list above). |
*Suppress bc7enc_rdo warnings like unused-variables, memset'ing a non-trivial class (in this case the class is obviously trivial hence a void* cast is used to suppress this warning). Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Some CIs report further unused variables/functions that are not reported when building locally. Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
Updates about CI jobs:
|
|
You will need to rebase to or merge current main to get the fix for the NSIS issue on Windows. I have created a fork of https://github.com/richgel999/bc7enc_rdo. To incorporate it in your add-BCn-decoder branch do the following in the repo root directory: You can make changes in this subdirectory, as you are now, and when everything is working I can push the changes to the fork.
I agree. I wanted to make you aware that arm64 is a build target. Re. reuse, you have to add an entry to REUSE.toml to get external/bc7enc_rdo ignored. It is better to do that than add SPDX comments to all the files. The entry in REUSE.toml will have to mention a license. Use the MIT license option. Here are a few high level points.
|
|
Re the macOS build failure, because the output from the Xcode build is so voluminous it is run through a script, xcpretty, to prettify it. On a past CI service, without this, the logs exceeded the maximum allowed. Recently, for reasons I have yet to investigate, it has started swallowing compile errors. You can turn it off by editing scripts/install_macos.sh and commenting out the line that installs |
*Add BCn encoder support for `ktx create` command. *Add missing tests for `ktx encode`, `ktx extract`, and `ktx create`. *Expose BC1/BC3 approximation mode option to ktx CLIs *Misc cleanups (still early-stage PR) Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Agree. I wasn't initially aware of this. Apparently bc7f is significantly faster that the bc7 encoder used here (also the added benefit of being continuously maintained). I will integrate this right now since this seems to be straightforward (bc7enc_rdo also seems a bit not-longer-maintained so l think it's better to just integrate it now rather than waiting for it to be integrated into bc7enc_rdo repo). Concerning the decoding API: I added BCn decoders for VkUpload/GLUpload as a TODO (will follow same API as in etcunpack). Might also open a PR to add it for ASTC since I have already spent some time getting familiar with this code base.
Will add it in this PR. Will add it at the very end though since I have to finalize current formats.
I will try to address CI issues now (It is fine if I incrementally commit here to see if certain jobs pass?). |
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
…Group/bc7enc_rdo.git external/bc7enc_rdo subrepo: subdir: "external/bc7enc_rdo" merged: "dbe416d2" upstream: origin: "https://github.com/KhronosGroup/bc7enc_rdo.git" branch: "changes_for_ktx" commit: "dbe416d2" git-subrepo: version: "0.4.9" origin: "https://github.com/ingydotnet/git-subrepo" commit: "5e0f401"
*Before this commit, bc7enc_rdo dependency was manually copied to external/bc7enc_rdo directory (only needed files were copied). This was not ideal for a lot of reasons (mainly that we are introducing changes that may be streamed back to the original repo and having a subrepo/submodule is better suited for that than manually copying dependency files). *Add bc7enc_rdo to REUSE.toml with MIT license. *Git ignore compile_commands.json file (used by clangd LSP) Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
The integrated basisu_transcoder is a bit outdated (actually, significantly) and doesn't contain |
|
I was not aware that bc7f is included in basisu_transcoder. By pure coincidence I have just completed integration of Basis Universal release 2.1.0. See the Neither the KTX-Software code nor our golden files required any updates for our extensive test suite to pass with BU 2.1.0. I am therefore amenable to merging it now but will have to discuss within the Khronos WG and can't make any promises. The single image decoders used by GLUpload/VkUpload should be exposed in the library API but must be independent of the ktxTexture* classes. Software reading a KTX file incrementally will find them useful. Regarding the current ETC decoder, please note that it does not have a recognized open source license so might not be suitable for your OIIO work. |
*Add initial RDO post processing step for BC1, BC3, and BC7 but without ultrasmooth blocks support see: https://richg42.blogspot.com/2021/02/updated-bc7encrdo-with-improved-smooth.html *Fix encoder in case input texture does not have multiple-of-4 dimensions. This was reading beyond std::vector size before this commit. *Add initial RDO params to BCnParams with verbose explanation/description comments. *Misc refactoring/adjustments. Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
@MarkCallow - please don't approve the CI workflow yet (it will fail because I haven't updated the golden test files yet).
Will add bc7f once I finish RDO post processing. |
|
I have just merged PR #1170 so you will now find bc7f in One other thing re. |
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
The arguments I decided to expose BCn options/args as such (this is output from |
UASTC RDO has many of the same options. Please use the same names, just changing the prefix to something like I also want to include the high level |
Oops. I misunderstood and removed the subrepo. Nothing is lost and will add it again as subrepo with all other unused files removed.
This is already done (with EXR output for BC6HU and PNG for others). |
*Through testing, RDO ultrasmooth block handling significantly reduces block artifacts for smooth/very smooth blocks (e.g., gradients, sky, blurred background, etc.). Ideally, this should be enabled by default. Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
I am happy you have extract working. As for ultrasmooth, the lower image definitely looks better at a glance but when zooming in, I find it difficult to see precisely why. I don't see any hugely noticeable blockiness in either image. Have you got metrics working? Metrics relies on extracting an image and comparing it against the original input. That will enable quantitative comparison. |
Now you have pointed it out, I can see it.
I look forward to it. It will likely be a useful guide for UASTC RDO too.
See |
|
This is the link to the guide/detailed notes about different RDO parameters: Hopefully by next commit (in which all your points should be addressed) I will finally mark this as |
I had a quick read. It is very thorough and interesting. It needs some editing - to remove duplication, aid clarity and fix some typos. Providing the feedback will take more time than I have today. When I have time, should I open issues at https://github.com/walcht/walcht/issues? I can probably provide PRs for some simple things but I wouldn't want to try for others. |
Yes, please (that would be really nice). I will add more details + benchmarks at some point. |
|
@MarkCallow - Some minor updates:
Done. I chose to use exact same name as ktxBasisParams but with I have also removed the
I looked into some details in basisu about this. Should I do this now (in this PR) or wait for the PR that will add |
|
And this is the single image BCn decoder that you requested me to include. It is independent of KtxTexture* C classes. I included this in /**
* @ingroup reader
* @brief Decodes a provided BCn-encoded image. All BCn formats are supported
* (BC1, BC2, BC3, BC4, BC5, BC6HU, BC6HS, or BC7).
*
* Decoding into non-multiple-of-4 texture dimensions is also supported
* (decoded blocks that fall out of the image's dimensions are simply
* discarded).
*
* @param [in] src_blocks pointer to the BCn-encoded blocks.
* @param [in] dst pointer to where to write the decoded image. Should
* be able to hold the size of the corresponding
* decompressed vkFormat.
* @param [in] width current image's width.
* @param [in] height current image's height.
* @param [in] bcn which BCn compression kind the provided image is
* encoded in.
* @param [in] params pointer to BC1, and subsequently BC3, decoder
* parameters.
*
* @return KTX_SUCCESS on success, other KTX_* enum values on
* error.
*
* @exception KTX_INVALID_VALUE
* @p params is NULL but @p This texture is BC1 or BC3
* compressed.
* @exception KTX_INVALID_OPERATION
* Decoder/Unpacker returned an error exit code or a
* non-success return flag. Only occurs for BC1, BC2,
* BC3, and BC7 (BC2 and BC3 are based on BC1).
*/
KTX_API KTX_error_code KTX_APIENTRY
ktxUnpackBCn(const ktx_uint8_t* src_blocks, ktx_uint8_t* dst, ktx_uint32_t width,
ktx_uint32_t height, ktx_bcn_compression_e bcn, ktxBC1UnpackParams* params);Now all that remains are CTS tests. |
… code *Remove --rdo-allow-relative-movement since it makes encoding significantly slower with no noticeable improvements. *Use same RDO option names as UASTC RDO (also use same defaults and same ranges). *Add single-image BCn decoder that is independant of KtxTexture* C classes and which can be used by Vulkan/OpenGL loaders to upload a single image/mip level. *Refact BCn encoding code to use the said function above. *Fix a pointer arithmetic issue with encoder when given non-multiple-of-4 input images/textures. Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
|
Please make a PR for the CTS tests so I review them. Also update the CTS ref here to resolve the conflict which will also allow me to run CI on this PR. Note, I am not sure if having a CTS ref to a commit in the branch that is source for your PR will work. Let's try it and see. |
*Basisu's BC6HU encoder fails if it encounters NaNs, infinites, negative, or larger than basist::ASTC_HDR_MAX_VAL half floats. This commit addresses that by adding a HDR cleanup pre-processing step (basis_compressor::clean_hdr_image). Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
*Point to original CTS submodule commit (not the PR containing the additional BCn encoder/decoder tests). Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
KhronosGroup/KTX-Software-CTS#74 (I haven't finalized these yet).
I reverted to the same CTS submodule commit as main (I just keep getting a merge conflict even after updating the CTS PR - have no idea why). |
I think you need to merge the latest main from KTX-Software-CTS into your branch. |
For the moment, I am fixing Window build CI issues. |
*size_t were used all over the place and silently converted to uint32_t which throws warnings (which are treated as errors) for Windows platforms on Visual Studio 2022 17. Now uint32_t is used consistently and size_t is only used very sparingly. *CTS tests now point to the PR corresponding to BCn encoders/decoders PR. *multithreading.h/cpp was wrongly refactored. static keyword is removed and functions are properly declared and defined in header and cpp files, respectively. *REUSE complained about missing license of bc6hu file. Added Khronos Group license to this file (it was created by me from scratch). Signed-off-by: Walid Chtioui <walid.chtioui.main@gmail.com>
The conflict may have been because the selected action for this PR was "Create a merge commit". If so, that was selected because that was the action I used on the most recent PR that I merged. It now shows "Squash and merge" which vastly reduces the possibility of a conflict. With "Create a merge commit", each commit has the possibility to be causing a conflict. I have been caught out by this before. I hate that there is no way to set a default action. |





As discussed in #1159:
ktxTexture2_CompressBCnandktxTexture2_DecodeBCnare introduced in this PR to allow libktx users/consumers to encode/decode BCn textures from/to raw decompressed formats.https://github.com/richgel999/bc7enc_rdo does not support BC6HU/BC6HS encoding/decoding and also no BC2 (this format is essentially dead since BC3 replaces it).
Please feel free to give feedback, edit, and nitpick as much as possible.
Some context: I am adding KTX2 support to OIIO (PR: AcademySoftwareFoundation/OpenImageIO#5185) and having libktx encode/decode BCn formats significantly simplifies things (also ETC encoding/decoding which I can also open a PR for - if approved).
I haven't updated the KTX-Software-CTS with the added BCn test files.
Once this is finalized, this will fix #587.
Current TODOs:
[x] BC2 mode RDOBC2 is rarely used (I don't see any reason why to use it instead of BC3). Postponed or not planned to be implemented at all.BC6H mode RDO (HDR). I don't know if this is possible. If not, then I will not adjust the ert::reduce_entropy function for the moment.postponed (too much work/overload to include in this PR).ktx encodecommandktx extractcommandktx createcommandktxBCnParamsstruct (apparently there are many and I am not qualified to know which subset to expose or to expose them all). For the moment I will just expose them all.We agreed on using same parameters as UASTC RDO and adding additional ones if they make sense (I haven't benchmarked skip 0 MSE error option so I might remove it if it useless).
enable SIMD acceleration for BC7 encoder (using ISPC - see https://github.com/ispc/ispc) (this should be straightforward and should be enabled by default via a CMake flag; e.g., BC7_SIMD).=> since we are planning to use bc7f we will be planning to use bc7g once it is released (this is the SIMD equivalent of bc7f).LIBKTX_FEATURE_BCN_DECODERCMake flag optioncopied from MIT Licensed https://github.com/iOrange/bcdecfrom basisu)[x] add BC2 encoderpostponed/aborted (rarely used).Note1: no LLMs/AI coding tools were used in any capacity whatsoever in writing or aiding in the writing of this PR.
Note2: I am an individual contributor (main reason I am contributing here is to add support for KTX2 in Blender).
Edit: TODO list edits