Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HPCC-30365 Add XRef Sasha service to K8s #19639

Open
wants to merge 3 commits into
base: candidate-9.10.x
Choose a base branch
from

Conversation

jackdelv
Copy link
Contributor

@jackdelv jackdelv commented Mar 19, 2025

Type of change:

  • This change is a bug fix (non-breaking change which fixes an issue).
  • This change is a new feature (non-breaking change which adds functionality).
  • This change improves the code (refactor or other change that does not change the functionality)
  • This change fixes warnings (the fix does not alter the functionality or the generated code)
  • This change is a breaking change (fix or feature that will cause existing behavior to change).
  • This change alters the query API (existing queries will have to be recompiled)

Checklist:

  • My code follows the code style of this project.
    • My code does not create any new warnings from compiler, build system, or lint.
  • The commit message is properly formatted and free of typos.
    • The commit message title makes sense in a changelog, by itself.
    • The commit is signed.
  • My change requires a change to the documentation.
    • I have updated the documentation accordingly, or...
    • I have created a JIRA ticket to update the documentation.
    • Any new interfaces or exported functions are appropriately commented.
  • I have read the CONTRIBUTORS document.
  • The change has been fully tested:
    • I have added tests to cover my changes.
    • All new and existing tests passed.
    • I have checked that this change does not introduce memory leaks.
    • I have used Valgrind or similar tools to check for potential issues.
  • I have given due consideration to all of the following potential concerns:
    • Scalability
    • Performance
    • Security
    • Thread-safety
    • Cloud-compatibility
    • Premature optimization
    • Existing deployed queries will not be broken
    • This change fixes the problem, not just the symptom
    • The target branch of this pull request is appropriate for such a change.
  • There are no similar instances of the same problem that should be addressed
    • I have addressed them here
    • I have raised JIRA issues to address them separately
  • This is a user interface / front-end modification
    • I have tested my changes in multiple modern browsers
    • The component(s) render as expected

Smoketest:

  • Send notifications about my Pull Request position in Smoketest queue.
  • Test my draft Pull Request.

Testing:

jackdelv and others added 2 commits March 13, 2025 16:20
- expandMask wasn't taking stripeNum and dirPerPart into account when creating the physical file name from the part mask
- scanDirectory was calling itselg recursively on the dirPerPart directories causing files parts to be under different cDirDesc which prevented them from being logged as found.
@jackdelv jackdelv requested a review from jakesmith March 19, 2025 13:20
Copy link

Jira Issue: https://hpccsystems.atlassian.net//browse/HPCC-30365

Jirabot Action Result:
Workflow Transition To: Merge Pending
Additional PR: #19639

Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jackdelv - I think changes look logical, but there are various problems with the changes to expandMask.
Could do with some unittests added to prove for various input it is creating the expected output.

@@ -3890,6 +3890,7 @@ extern da_decl void parseFileName(const char *name,StringBuffer &mname,unsigned
mname.append(s);
num = pn;
max = mn;
break;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this change looks sensible, but what was happening before this change was made?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before the break was added, it would iterate over the tail of the string again i.e it would check each character in "_1_of_2" after it had already been processed for the part and max numbers.

}
if (isDirPerPart) {
// MORE: Should maybe check this doesn't contain any subdirectories to make
// sure it is really a dirPerPart directory. Is an all numbers subdirectory valid in ecl?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in practice yes, a scope must have a leading alpha char. So worth clarifying the comment.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Clarified comment.

isDirPerPart = false;
}
if (isDirPerPart) {
// MORE: Should maybe check this doesn't contain any subdirectories to make
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can check pdir.dirs after the scanDirectory, should be empty.
Let's add a check and throw an exception if not.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added check (pdir->dirs.ordinality()>0)

unsigned maxMb = serverConfig->getPropInt("DfuXRef/@memoryLimit", DEFAULT_MAXMEMORY);
unsigned maxMb;
if (isContainerized()) {
const char *resourcedMemory = getComponentConfigSP()->queryProp("resources/@memory");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now 'props' is saved, should used it instead of getComponentConfigSP() here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed to use new props member.

const char *s=mask;
if (s)
while (*s) {
char next = *(s++);
if (next=='$') {
if (dirPerPart)
{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

trivial/formatting: Allman vs K&R

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed change.

{
const char * start = buf.str();
const char * slash = start + buf.length();
while (slash > start && *slash != PATHSEPCHAR)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

'slash' at the start is potentially beyond the end of the allocated buffer I think.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted changed.

while (slash > start && *slash != PATHSEPCHAR)
slash--;
buf.insert(slash-start, PATHSEPCHAR);
buf.insert((slash+1)-start, p+1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not safe, buf may have reallocated after 1st insert, meaning 'slash' points to a free'd pointer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These insert also don't look right in that they're inserting "/", but the mask is typically something like "myfilename._$P$_of_10", so it's going to end up with e.g. "/5myfilename._5_of_10" afaics.

If it did find a slash, it would prob be okay, but there isn't necessarily a full path so there's no guarantee there's a slash.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, the mask can be "myfilename._$P$_of_$N$" (see 'N' handleing below)
This insert code is going to be hit twice in that case, and therefore insert the dir-per-part directory twice...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reverted changes to expandMask in favor of storing prefix and scope directory paths in cFileDesc and building up full paths in getPartName instead.

{
if (stripeNum>0)
addPathSepChar(buf.append('d').append(stripeNum));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hm, how is this strip dir going to be in right place?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed changes to expandMask.

- Change cFileDesc to track the prefix and scope paths separately from name
- When getPartName is called build up stripeNum and dirperpart in between prefix and scope paths
- Find matching file parts by hashing full path minus stripeNum and dirperpart numbers
@jackdelv jackdelv requested a review from jakesmith March 20, 2025 20:48
Copy link
Member

@jakesmith jakesmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed in our meeting, there's are some issues with the way the info. is being scanned and stored in the hierarchical structure built up during the physical file scan and used subsequently.
Currently too much information is being stored per cFileDesc - as before, the file path shouldn't be needed, only the deduced filename mask.
Lost files are probably being misreported (at least during initial phase), because the paths they walk do not match the physical representation on disk (due to striping and dir-per-part directories).

saxref should:

  • ensure runXRef is dealing with 1 plane a time (i.e. if multiple selected, process 1 at a time).
  • store plane details (IPT) so accessible to other xref tasks during scan.
  • during scanDirectories, if striped, detected when at stripe level, and 'skip'
  • detect dir-per-part directory and 'skip'. But, check after recursing the dir, that contains no subdirs.
  • Keep the filename mask only in cFileDesc
  • Add scope/lfn to listOrphans. Build up in same way as 'basedir' is now.
    Think we can get rid of baseDir altogether.
  • during scanLogicalFiles, parse file paths fetched from parts (add helper func), to remove stripe and dir-part-part, so can marry the pathing with the cDirDesc tree.
    NB: LATER- may be better to assume that the cDirDesc is a representation of scopes, and walk scopes of logical file (+part endpoints), instead of getting part directories.
  • In listOrphans(cFileDesc), deduce file path from lfn, partNum and plane. Add a utility func that deduces and uses stripe num and dir-per-part if relevant (based on plane details).

@jackdelv

@@ -3834,7 +3835,8 @@ extern da_decl void parseFileName(const char *name,StringBuffer &mname,unsigned
}
}

if (mname.isEmpty())
// Assume that if prefix is passed in a match is required
if (prefix && prefix->isEmpty())
throw makeStringExceptionV(-1, "Could not find matching prefix in plane definition for file %s", filename);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think throwing an exception here, will mean 1 path that fails to match, will cause the whole xref process to fail?

True of any exception in parseFilename. NB: addFile only issues warnings.

addFile should likely have a try/catch and issue warnings in case of any exception in parseFilename.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's something that doesn't make sense here in fact I think..

How would the files not match prefix?
And, why is it scanning planes every file being added, to determine which plane the prefix path is in?
Given, we are specifically xref'ing a given plane, and therefore start at the prefix path..
This relates to HPCC-33151.

Let's discuss.

mname.append((d+1)-name, name).append(cur-(tailSlash+1), tailSlash+1);
if (dirs)
dirs->append((d+1)-name, name);
mname.append(cur-(tailSlash+1), tailSlash+1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

common with line 3881, could go outside if/else

}

StringBuffer &getPartName(StringBuffer &buf,unsigned p)
{
// In baremetal, buf can be prepoulated with replicate directory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The old code was:

    StringBuffer &getPartName(StringBuffer &buf,unsigned p)
    {
        StringBuffer mask;
        getName(mask);
        return expandMask(buf, mask, p, N);
    }

If buf is prepopulated, didn't that mean that as it was, it added the whole path again?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like it's masking a problem, rather than the correct fix.

Something not quite right here..
the cFileDesc (and cDirDesc) contain the name only, not a path.

getPartName should be returning an expanded form of the name mask only, as it was before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants