Log1p #377

ManasaDattaT · 2024-12-31T04:16:06Z

Host and HIP implementations of log1p, using the existing log implementation as a reference.
On HOST with AVX2, the 3D and 4D cases combine the dimensions so that the data can be vectorized.
On HIP, the 3D log implementation uses multiple kernel launches, so log1p uses the ND kernel for 3D as well.
Some changes were hard-coded in the test suite so that it can run for the I16 datatype; they are not meant to be merged and are there only to evaluate the QA test.
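
To illustrate the idea of combining dimensions, here is a minimal scalar sketch, not the code from this PR. It assumes the tensor is contiguous, that the input is I16 and the output F32, and that the operation is log1p(|x|), an assumption based on the "added abs in AVX2" commit. The function name `log1p_flattened` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Hypothetical helper, not part of the RPP API.
// Applies log1p(|x|) to a contiguous I16 tensor and returns F32 values.
// dims holds the extent of each dimension (for example 3 or 4 entries).
std::vector<float> log1p_flattened(const std::vector<int16_t> &src,
                                   const std::vector<std::size_t> &dims)
{
    // Collapse all dimensions into a single length.
    // This is only valid when the data is contiguous in memory.
    std::size_t length = 1;
    for (std::size_t d : dims)
        length *= d;
    assert(src.size() == length);

    std::vector<float> dst(length);
    // One flat loop replaces the nested per-dimension loops, which is the
    // shape a SIMD implementation would process in fixed-width chunks.
    for (std::size_t i = 0; i < length; i++)
    {
        // Widen to int before abs so that -32768 does not overflow.
        float v = static_cast<float>(std::abs(static_cast<int>(src[i])));
        dst[i] = std::log1p(v);
    }
    return dst;
}
```

With a single flat length, the loop body no longer depends on whether the tensor is 3D or 4D, so one vectorized path could serve both cases.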

… log1p_HOST

HOST Implementation

HazarathKumarM

@ManasaDattaT please address the review comments

HazarathKumarM · 2025-01-02T05:59:28Z

include/rppt_tensor_arithmetic_operations.h

+ * \retval RPP_SUCCESS Successful completion.
+ * \retval RPP_ERROR* Unsuccessful completion.
+ */
+RppStatus rppt_log1p_gpu(RppPtr_t srcPtr, RpptGenericDescPtr srcGenericDescPtr, RppPtr_t dstPtr, RpptGenericDescPtr dstGenericDescPtr, Rpp32u *roiTensor, rppHandle_t rppHandle);


Please check the GPU declaration; it appears twice.

HazarathKumarM · 2025-01-02T06:15:09Z

src/include/cpu/rpp_cpu_simd.hpp

@@ -148,6 +149,8 @@ const __m128i xmm_pxMask04To07 = _mm_setr_epi8(4, 0x80, 0x80, 0x80, 5, 0x80, 0x8
 const __m128i xmm_pxMask08To11 = _mm_setr_epi8(8, 0x80, 0x80, 0x80, 9, 0x80, 0x80, 0x80, 10, 0x80, 0x80, 0x80, 11, 0x80, 0x80, 0x80);
 const __m128i xmm_pxMask12To15 = _mm_setr_epi8(12, 0x80, 0x80, 0x80, 13, 0x80, 0x80, 0x80, 14, 0x80, 0x80, 0x80, 15, 0x80, 0x80, 0x80);

+


Please remove these extra blank lines.

HazarathKumarM · 2025-01-02T06:31:02Z

src/include/cpu/rpp_cpu_simd.hpp

@@ -2693,6 +2713,8 @@ static inline __m256 log_ps(__m256 x)
    __m256 one = *(__m256 *)&avx_p1;
    __m256 invalid_mask = _mm256_cmp_ps(x, avx_p0, _CMP_LE_OQ);

+    // x = _mm256_add_ps(x, one);


Please remove the commented-out lines if they are not necessary.
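
For context on the commented-out line: adding one and then calling a log routine is the direct way to obtain log1p from log, but for float inputs close to zero the addition rounds to exactly 1 and the result collapses to 0. The scalar sketch below shows the difference using the standard library; the function names are hypothetical and this is not the AVX2 code from the PR.

```cpp
#include <cmath>

// Hypothetical helper: add one, then take the log.
// For |x| far below float epsilon, 1.0f + x rounds to 1.0f.
float log1p_naive(float x)
{
    float shifted = 1.0f + x; // stored as float, so tiny x is lost here
    return std::log(shifted);
}

// Hypothetical helper: the standard library routine keeps precision near 0.
float log1p_accurate(float x)
{
    return std::log1p(x);
}
```

For integer-valued inputs such as I16 the two forms agree closely, so this mainly matters if the same path is also used for small floating-point values.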

HazarathKumarM · 2025-01-02T06:31:26Z

src/include/hip/rpp_hip_common.hpp

@@ -464,6 +464,18 @@ __device__ __forceinline__ float rpp_hip_unpack3(int src)
 {
    return (float)(schar)((src >> 24) & 0xFF);
 }
+// Un-Packing from I16s
+


Please remove the blank line at L468.

HazarathKumarM · 2025-01-02T06:33:45Z

src/modules/rppt_tensor_arithmetic_operations.cpp

+    else if ((srcGenericDescPtr->dataType == RpptDataType::I8) && (dstGenericDescPtr->dataType == RpptDataType::I8)) return RPP_ERROR_INVALID_DST_DATATYPE;
+    else if ((srcGenericDescPtr->dataType == RpptDataType::I16) && (dstGenericDescPtr->dataType == RpptDataType::F32))
+    {
+        log1p_generic_host_tensor(static_cast<Rpp16s *>(srcPtr) + srcGenericDescPtr->offsetInBytes,


Please check the alignment here.

HazarathKumarM · 2025-01-02T06:34:00Z

src/modules/rppt_tensor_arithmetic_operations.cpp

@@ -559,4 +594,66 @@ RppStatus rppt_log_gpu(RppPtr_t srcPtr,
 #endif // backend
 }

+RppStatus rppt_log1p_gpu(RppPtr_t srcPtr,


Please check the alignment.

HazarathKumarM · 2025-01-02T06:35:17Z

utilities/test_suite/HIP/Tensor_misc_hip.cpp

    set_generic_descriptor(srcDescriptorPtrND, nDim, offSetInBytes, bitDepth, batchSize, roiTensor);
-    set_generic_descriptor(dstDescriptorPtrND, nDim, offSetInBytes, bitDepth, batchSize, roiTensor);
+    set_generic_descriptor(dstDescriptorPtrND, nDim, offSetInBytes, 2, batchSize, roiTensor);


I guess you have hard-coded the value for your use case. Please revert these changes.

HazarathKumarM · 2025-01-02T06:36:44Z

utilities/test_suite/rpp_test_suite_misc.h

@@ -129,7 +142,7 @@ void fill_roi_values(Rpp32u nDim, Rpp32u batchSize, Rpp32u *roiTensor, bool qaMo
            }
            case 4:
            {
-                std::array<Rpp32u, 8> roi = {0, 0, 0, 0, 1, 128, 128, 128};
+                std::array<Rpp32u, 8> roi = {0, 0, 0, 0, 128, 128, 128, 4};


Please revert these changes.

HazarathKumarM · 2025-01-02T06:37:09Z

utilities/test_suite/rpp_test_suite_misc.h

@@ -369,8 +384,14 @@ void compare_output(Rpp32f *outputF32, Rpp32u nDim, Rpp32u batchSize, Rpp32u buf
        for(int j = 0; j < sampleLength; j++)
        {
            bool invalid_comparision = ((out[j] == 0.0f) && (ref[j] != 0.0f));
-            if(!invalid_comparision && abs(out[j] - ref[j]) < 1e-4)
-                cnt++;
+            if(!invalid_comparision )


Please revert these changes.
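
The check being restored here can be sketched as a stand-alone function. This is a minimal illustration based only on the removed lines in the hunk above; the function name `count_matches` and the `tolerance` parameter are hypothetical and not part of the test suite.

```cpp
#include <cmath>
#include <cstddef>

// Hypothetical stand-alone version of the tolerance check from the diff.
// Counts elements where the output matches the reference within tolerance.
std::size_t count_matches(const float *out, const float *ref,
                          std::size_t sampleLength, float tolerance = 1e-4f)
{
    std::size_t cnt = 0;
    for (std::size_t j = 0; j < sampleLength; j++)
    {
        // A zero output against a non-zero reference is never a match.
        bool invalidComparison = (out[j] == 0.0f) && (ref[j] != 0.0f);
        if (!invalidComparison && std::fabs(out[j] - ref[j]) < tolerance)
            cnt++;
    }
    return cnt;
}
```

Dropping the tolerance condition, as the modified line does, would count every element that passes the zero check as a match regardless of its value.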

ManasaDattaT and others added 18 commits December 18, 2024 15:12

Initial log1p implementation in C++

d834306

Added for nDim = 4 separately instead of recursive loop in log1p

0cfe759

Test by converting existing input F32 to I16

80184d2

log1p_HIP_Implementation

ddd7cb4

added abs in AVX2

1f6a0ce

HIP calls

3f58b8c

log1p HOST

6edb178

log1p HOST

2967bb7

calls in HIP backend

5513f2d

#

eef3ae2

Add files via upload

be135aa

reference output files for log1p

8f31d18

Merge branch 'log1p_HOST' of https://github.com/ManasaDattaT/rpp into…

aafae05

… log1p_HOST

log1p HOST implementation

0117c34

log1p HOST implementation

2eb2039

Merge branch 'log1p_HOST' into log1p_HOST_HIP

c65051e

HOST Implementation

merge conflicts resolved

ae0fd19

removed print statements

b82517f

HazarathKumarM reviewed Jan 2, 2025

ManasaDattaT added 7 commits January 2, 2025 18:36

Worked on the review comment

d6beaa4

Worked on the review comment

d78c287

Update rpp_hip_common.hpp

3038153

Minor changes after review

835a8ad

Resolved merge conflicts

8872db6

Reverted the testsuite changes, which were added in support for I16

578a967

removed the testsuite support

2173693
