HPCC-30365 Add XRef Sasha service to K8s #19639

jackdelv · 2025-03-19T13:20:30Z

Type of change:

This change is a bug fix (non-breaking change which fixes an issue).
This change is a new feature (non-breaking change which adds functionality).
This change improves the code (refactor or other change that does not change the functionality)
This change fixes warnings (the fix does not alter the functionality or the generated code)
This change is a breaking change (fix or feature that will cause existing behavior to change).
This change alters the query API (existing queries will have to be recompiled)

Checklist:

Smoketest:

Send notifications about my Pull Request position in Smoketest queue.
Test my draft Pull Request.

Testing:

- expandMask wasn't taking stripeNum and dirPerPart into account when creating the physical file name from the part mask.
- scanDirectory was calling itself recursively on the dirPerPart directories, causing file parts to end up under different cDirDesc entries, which prevented them from being logged as found.

github-actions · 2025-03-19T13:20:48Z

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-30365

Jirabot Action Result:
Workflow Transition To: Merge Pending
Additional PR: #19639

jakesmith

@jackdelv - I think the changes look logical, but there are various problems with the changes to expandMask.
It could do with some unit tests to prove that it creates the expected output for various inputs.

jakesmith · 2025-03-20T09:21:32Z

dali/base/dautils.cpp

@@ -3890,6 +3890,7 @@ extern da_decl void parseFileName(const char *name,StringBuffer &mname,unsigned
                        mname.append(s);
                    num = pn;
                    max = mn;
+                    break;


this change looks sensible, but what was happening before this change was made?

Before the break was added, it would iterate over the tail of the string again, i.e. it would check each character in "_1_of_2" after it had already been processed for the part and max numbers.
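
To illustrate why stopping after the first match matters, here is a minimal standalone sketch of this kind of tail parsing. It is not the actual parseFileName: `readNumber` and `parsePartTail` are hypothetical names, std::string stands in for StringBuffer, and the `$P$`/`$N$` mask form is assumed from the masks quoted elsewhere in this review.

```cpp
#include <cctype>
#include <string>

// Hypothetical helper: reads a run of digits starting at 'pos' and advances 'pos'.
// Returns false if there were no digits.
inline bool readNumber(const std::string &s, std::string::size_type &pos, unsigned &value)
{
    std::string::size_type start = pos;
    value = 0;
    while (pos < s.size() && std::isdigit(static_cast<unsigned char>(s[pos])))
        value = value * 10 + static_cast<unsigned>(s[pos++] - '0');
    return pos > start;
}

// Hypothetical sketch: looks for a trailing "_<part>_of_<max>" and, on the first
// match, fills in the mask and numbers and stops scanning.
inline bool parsePartTail(const std::string &name, std::string &mask, unsigned &num, unsigned &max)
{
    for (std::string::size_type i = 0; i < name.size(); ++i)
    {
        if (name[i] != '_')
            continue;
        std::string::size_type pos = i + 1;
        unsigned pn = 0, mn = 0;
        if (readNumber(name, pos, pn) && name.compare(pos, 4, "_of_") == 0)
        {
            pos += 4;
            if (readNumber(name, pos, mn) && pos == name.size())
            {
                mask = name.substr(0, i) + "_$P$_of_$N$";
                num = pn;
                max = mn;
                return true; // plays the role of the added 'break'
            }
        }
    }
    return false;
}
```

Without the early return, the loop would carry on over the characters of the tail that had already been consumed.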

jakesmith · 2025-03-20T09:23:29Z

dali/sasha/saxref.cpp

+                }
+                if (isDirPerPart) {
+                    // MORE: Should maybe check this doesn't contain any subdirectories to make
+                    // sure it is really a dirPerPart directory. Is an all numbers subdirectory valid in ecl?


in practice yes, a scope must have a leading alpha char. So worth clarifying the comment.

Clarified comment.

jakesmith · 2025-03-20T09:28:19Z

dali/sasha/saxref.cpp

+                        isDirPerPart = false;
+                }
+                if (isDirPerPart) {
+                    // MORE: Should maybe check this doesn't contain any subdirectories to make


can check pdir.dirs after the scanDirectory, should be empty.
Let's add a check and throw an exception if not.

Added check (pdir->dirs.ordinality()>0)
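
As a rough sketch of the two checks discussed here (an all-numeric name cannot be a scope, and a dir-per-part directory should have no subdirectories), assuming hypothetical helper names and a plain vector standing in for `pdir->dirs`:

```cpp
#include <algorithm>
#include <cctype>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical helper: a name made up only of digits is treated as a dir-per-part
// directory, because a scope must have a leading alpha character.
inline bool looksLikeDirPerPart(const std::string &name)
{
    return !name.empty() &&
           std::all_of(name.begin(), name.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Hypothetical post-scan check: 'subDirs' stands in for pdir->dirs after scanDirectory.
inline void checkDirPerPartIsLeaf(const std::string &name, const std::vector<std::string> &subDirs)
{
    if (looksLikeDirPerPart(name) && !subDirs.empty())
        throw std::runtime_error("dir-per-part directory '" + name +
                                 "' unexpectedly contains subdirectories");
}
```

The real code would throw an IException rather than std::runtime_error; the standard exception is used only to keep the sketch self-contained.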

jakesmith · 2025-03-20T09:42:51Z

dali/sasha/saxref.cpp

-            unsigned maxMb = serverConfig->getPropInt("DfuXRef/@memoryLimit", DEFAULT_MAXMEMORY);
+            unsigned maxMb;
+            if (isContainerized()) {
+                const char *resourcedMemory = getComponentConfigSP()->queryProp("resources/@memory");


now 'props' is saved, should use it instead of getComponentConfigSP() here.

Fixed to use new props member.

jakesmith · 2025-03-20T09:45:07Z

system/jlib/jutil.cpp

    const char *s=mask;
    if (s)
        while (*s) {
            char next = *(s++);
            if (next=='$') {
+                if (dirPerPart)
+                {


trivial/formatting: Allman vs K&R

Removed change.

jakesmith · 2025-03-20T09:57:01Z

system/jlib/jutil.cpp

+                {
+                    const char * start = buf.str();
+                    const char * slash = start + buf.length();
+                    while (slash > start && *slash != PATHSEPCHAR)


'slash' at the start is potentially beyond the end of the allocated buffer I think.

Reverted change.

jakesmith · 2025-03-20T10:00:12Z

system/jlib/jutil.cpp

+                    while (slash > start && *slash != PATHSEPCHAR)
+                        slash--;
+                    buf.insert(slash-start, PATHSEPCHAR);
+                    buf.insert((slash+1)-start, p+1);


not safe, buf may have been reallocated by the 1st insert, meaning 'slash' points to freed memory.

These insert also don't look right in that they're inserting "/", but the mask is typically something like "myfilename._$P$_of_10", so it's going to end up with e.g. "/5myfilename._5_of_10" afaics.

If it did find a slash, it would prob be okay, but there isn't necessarily a full path so there's no guarantee there's a slash.

Also, the mask can be "myfilename._$P$_of_$N$" (see 'N' handling below).
This insert code is going to be hit twice in that case, and will therefore insert the dir-per-part directory twice...

Reverted changes to expandMask in favor of storing prefix and scope directory paths in cFileDesc and building up full paths in getPartName instead.
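
For reference, the pointer hazards raised above (scanning from one past the end of the buffer, and holding a raw pointer across an insert that may reallocate) can be avoided by working with offsets. A minimal sketch, using std::string in place of jlib's StringBuffer and a hypothetical helper name:

```cpp
#include <string>

// Hypothetical helper: inserts 'dirName' plus a separator after the last path
// separator in 'path', or at the front when there is no separator at all.
inline std::string &insertDirBeforeTail(std::string &path, const std::string &dirName, char sep = '/')
{
    // rfind returns an offset (or npos) and never reads past the end of the string.
    std::string::size_type pos = path.rfind(sep);
    std::string::size_type insertAt = (pos == std::string::npos) ? 0 : pos + 1;
    // An offset stays meaningful even if the insert reallocates; a raw pointer would not.
    path.insert(insertAt, dirName + sep);
    return path;
}
```

Calling something like this once on the finished name, rather than inside the mask expansion loop, also avoids the double insertion when both `$P$` and `$N$` are present.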

jakesmith · 2025-03-20T10:28:18Z

system/jlib/jutil.cpp

 {
+    if (stripeNum>0)
+        addPathSepChar(buf.append('d').append(stripeNum));


hm, how is this stripe dir going to end up in the right place?

Removed changes to expandMask.

- Change cFileDesc to track the prefix and scope paths separately from name
- When getPartName is called build up stripeNum and dirperpart in between prefix and scope paths
- Find matching file parts by hashing full path minus stripeNum and dirperpart numbers
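
A rough sketch of how such a path could be assembled from its separate pieces. The struct, its field names and the helper names are hypothetical, and the layout `<prefix>/d<stripeNum>/<scope dirs>/<partNum>/<part file name>` is an assumption based on the `'d'` + stripeNum directory in the diff above:

```cpp
#include <string>

// Hypothetical description of a plane; field names are illustrative only.
struct PlaneLayout
{
    std::string prefix;       // root path of the plane
    unsigned numStripes = 1;  // more than 1 means the plane is striped
    bool dirPerPart = false;  // each part lives in its own numbered directory
};

// Appends 'part' to 'path', adding a separator only when one is needed.
inline void appendPathPart(std::string &path, const std::string &part)
{
    if (part.empty())
        return;
    if (!path.empty() && path.back() != '/')
        path += '/';
    path += part;
}

// Assumed layout: <prefix>/d<stripeNum>/<scope dirs>/<partNum>/<part file name>
inline std::string buildPartPath(const PlaneLayout &plane, unsigned stripeNum,
                                 const std::string &scopeDirs, unsigned partNum,
                                 const std::string &partFileName)
{
    std::string path = plane.prefix;
    if (plane.numStripes > 1 && stripeNum > 0)
        appendPathPart(path, "d" + std::to_string(stripeNum));
    appendPathPart(path, scopeDirs);
    if (plane.dirPerPart)
        appendPathPart(path, std::to_string(partNum));
    appendPathPart(path, partFileName);
    return path;
}
```

Keeping the stripe and dir-per-part components out of the stored name means the same mask can be matched regardless of how the plane lays parts out on disk.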

jakesmith

As discussed in our meeting, there are some issues with the way the information is being scanned and stored in the hierarchical structure that is built up during the physical file scan and used subsequently.
Currently too much information is being stored per cFileDesc - as before, the file path shouldn't be needed, only the deduced filename mask.
Lost files are probably being misreported (at least during initial phase), because the paths they walk do not match the physical representation on disk (due to striping and dir-per-part directories).

saxref should:

- Ensure runXRef is dealing with 1 plane at a time (i.e. if multiple are selected, process them 1 at a time).
- Store plane details (IPT) so they are accessible to other xref tasks during the scan.
- During scanDirectories, if striped, detect when at stripe level, and 'skip'.
- Detect a dir-per-part directory and 'skip'. But check, after recursing the dir, that it contains no subdirs.
- Keep the filename mask only in cFileDesc.
- Add scope/lfn to listOrphans. Build it up in the same way as 'basedir' is now.
  - Think we can get rid of baseDir altogether.
- During scanLogicalFiles, parse file paths fetched from parts (add helper func) to remove stripe and dir-per-part, so the pathing can be married with the cDirDesc tree.
  - NB: LATER - may be better to assume that the cDirDesc is a representation of scopes, and walk scopes of logical file (+part endpoints), instead of getting part directories.
- In listOrphans(cFileDesc), deduce file path from lfn, partNum and plane. Add a utility func that deduces and uses stripe num and dir-per-part if relevant (based on plane details).
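
The helper suggested for scanLogicalFiles could look something like the sketch below. It is only an illustration: the function names are hypothetical, the input is assumed to be a path relative to the plane prefix, and the stripe directory is assumed to be named `d<N>`.

```cpp
#include <algorithm>
#include <cctype>
#include <string>
#include <vector>

// Splits a '/'-separated path into its non-empty components.
inline std::vector<std::string> splitPath(const std::string &path)
{
    std::vector<std::string> parts;
    std::string::size_type start = 0;
    while (start <= path.size())
    {
        std::string::size_type end = path.find('/', start);
        if (end == std::string::npos)
            end = path.size();
        if (end > start)
            parts.push_back(path.substr(start, end - start));
        start = end + 1;
    }
    return parts;
}

inline bool allDigits(const std::string &s)
{
    return !s.empty() &&
           std::all_of(s.begin(), s.end(),
                       [](unsigned char c) { return std::isdigit(c) != 0; });
}

// Hypothetical helper: drops a leading stripe directory ("d<N>") and an
// all-numeric dir-per-part directory sitting directly above the file name,
// leaving only the scope directories and the file name.
inline std::string stripPlaneDirs(const std::string &relPath, bool striped, bool dirPerPart)
{
    std::vector<std::string> parts = splitPath(relPath);
    if (striped && !parts.empty() && parts.front().size() > 1 &&
        parts.front()[0] == 'd' && allDigits(parts.front().substr(1)))
        parts.erase(parts.begin());
    if (dirPerPart && parts.size() >= 2 && allDigits(parts[parts.size() - 2]))
        parts.erase(parts.end() - 2);
    std::string out;
    for (const std::string &p : parts)
    {
        if (!out.empty())
            out += '/';
        out += p;
    }
    return out;
}
```

Both flags would come from the stored plane details, so a path is only altered when the plane is actually striped or uses dir-per-part.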

@jackdelv

jakesmith · 2025-03-21T10:53:47Z

dali/base/dautils.cpp

@@ -3834,7 +3835,8 @@ extern da_decl void parseFileName(const char *name,StringBuffer &mname,unsigned
        }
    }

-    if (mname.isEmpty())
+    // Assume that if prefix is passed in a match is required
+    if (prefix && prefix->isEmpty())
        throw makeStringExceptionV(-1, "Could not find matching prefix in plane definition for file %s", filename);


I think throwing an exception here means that a single path that fails to match will cause the whole xref process to fail?

True of any exception in parseFilename. NB: addFile only issues warnings.

addFile should likely have a try/catch and issue warnings in case of any exception in parseFilename.
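
A minimal sketch of that pattern, with a stand-in for parseFileName and standard exceptions in place of IException (all names here are hypothetical):

```cpp
#include <iostream>
#include <stdexcept>
#include <string>
#include <vector>

// Stand-in for parseFileName: throws when 'path' is not under 'prefix'.
inline std::string parseUnderPrefix(const std::string &path, const std::string &prefix)
{
    if (path.compare(0, prefix.size(), prefix) != 0)
        throw std::runtime_error("Could not find matching prefix for file " + path);
    return path.substr(prefix.size());
}

// Adds every path that parses; a failure is reported as a warning and the
// file is skipped, so one bad path does not abort the whole scan.
// Returns the number of skipped files.
inline unsigned addFiles(const std::vector<std::string> &paths, const std::string &prefix,
                         std::vector<std::string> &added)
{
    unsigned skipped = 0;
    for (const std::string &path : paths)
    {
        try
        {
            added.push_back(parseUnderPrefix(path, prefix));
        }
        catch (const std::exception &e)
        {
            std::cerr << "warning: " << e.what() << '\n';
            ++skipped;
        }
    }
    return skipped;
}
```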

There's something that doesn't make sense here in fact I think..

How would the files not match prefix?
And, why is it scanning planes for every file being added, to determine which plane the prefix path is in?
Given, we are specifically xref'ing a given plane, and therefore start at the prefix path..
This relates to HPCC-33151.

Let's discuss.

jakesmith · 2025-03-21T11:10:53Z

dali/base/dautils.cpp

-                mname.append((d+1)-name, name).append(cur-(tailSlash+1), tailSlash+1);
+                if (dirs)
+                    dirs->append((d+1)-name, name);
+                mname.append(cur-(tailSlash+1), tailSlash+1);


common with line 3881, could go outside if/else

jakesmith · 2025-03-21T11:40:09Z

dali/sasha/saxref.cpp

    }

    StringBuffer &getPartName(StringBuffer &buf,unsigned p)
    {
+        // In baremetal, buf can be prepoulated with replicate directory


The old code was:

    StringBuffer &getPartName(StringBuffer &buf,unsigned p)
    {
        StringBuffer mask;
        getName(mask);
        return expandMask(buf, mask, p, N);
    }

If buf is prepopulated, didn't that mean that as it was, it added the whole path again?

This feels like it's masking a problem, rather than the correct fix.

Something not quite right here..
the cFileDesc (and cDirDesc) contain the name only, not a path.

getPartName should be returning an expanded form of the name mask only, as it was before.
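
For context, a simplified sketch of what expanding a name mask involves. This is not jlib's expandMask: it uses std::string, the function name is hypothetical, and it assumes that `p` is a zero-based part index (so part `p` is written as `p+1`) and that the result is appended to whatever `buf` already holds.

```cpp
#include <cctype>
#include <string>

// Hypothetical sketch: appends 'mask' to 'buf', replacing $P$ with the
// one-based part number and $N$ with the total number of parts.
inline std::string &expandMaskSketch(std::string &buf, const std::string &mask, unsigned p, unsigned n)
{
    for (std::string::size_type i = 0; i < mask.size(); ++i)
    {
        char c = mask[i];
        if (c == '$' && i + 2 < mask.size() && mask[i + 2] == '$')
        {
            char token = static_cast<char>(std::toupper(static_cast<unsigned char>(mask[i + 1])));
            if (token == 'P')
            {
                buf += std::to_string(p + 1);
                i += 2; // skip the token and the closing '$'
                continue;
            }
            if (token == 'N')
            {
                buf += std::to_string(n);
                i += 2;
                continue;
            }
        }
        buf += c;
    }
    return buf;
}
```

Because the expansion only appends, a prepopulated `buf` keeps its existing contents in front of the expanded name, which is why the caller has to be clear about whether it is passing a directory or expecting a bare name back.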

jackdelv and others added 2 commits March 13, 2025 16:20

HPCC-30365 Add XREF Sasha service to K8s

524dee3

jackdelv requested a review from jakesmith March 19, 2025 13:20

jakesmith requested changes Mar 20, 2025

jackdelv requested a review from jakesmith March 20, 2025 20:48

jakesmith requested changes Mar 21, 2025
