Contributor

@ZoomRmc ZoomRmc commented Apr 2, 2025

Adds system.setLenUninit for the string type. Allows setting length without initializing new memory on growth.

  • sysstr housekeeping:
    • Removed redundant len and reserved sets already performed by prior rawNewStringNoInit calls.
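
For readers skimming the thread, here is a minimal usage sketch of the kind of call site this is aimed at. It is not taken from the PR diff. It assumes the proc is exposed as `setLenUninit(s: var string, newlen: Natural)`; the helper `fillWith` is hypothetical, and the `when compiles` guard falls back to plain `setLen` on compilers that lack the string overload.

```nim
# Hypothetical helper, for illustration only: resize a string and then
# overwrite every byte, so zero-initializing the grown part would be wasted work.
proc fillWith(s: var string; n: Natural; c: char) =
  when compiles(s.setLenUninit(n)):
    s.setLenUninit(n)    # grown bytes are left uninitialized
  else:
    s.setLen(n)          # fallback: grown bytes are zero-initialized
  for i in 0 ..< n:
    s[i] = c             # every byte is written before it is ever read

var buf = "ab"
fillWith(buf, 5, 'x')
doAssert buf.len == 5
doAssert buf == "xxxxx"
```

The point of the sketch is only that the caller must write every newly exposed byte before reading it; reading the grown region first would observe arbitrary memory.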

@ZoomRmc
Contributor Author

ZoomRmc commented Apr 3, 2025

sysstr.setLengthStrUninit fails with mm:refc on Linux.
Seems my naïve approach is wrong.
Can this be made to work without introducing new compiler procs?

@arnetheduck
Contributor

arnetheduck commented Apr 3, 2025

the same for seq would be nice, i.e. strings are pretty niche at the end of the day.
never mind, forgot that it was already fixed :)

@arnetheduck
Contributor

ping, would be nice to see this PR merged

@ZoomRmc
Contributor Author

ZoomRmc commented Sep 14, 2025

I think I got stuck because I wasn't sure whether it was done and whether both string versions actually work, since some of the CI errors at the time were unrelated.

Those string/seq impls are a mess of conditionals and I don't trust locally-run tests.

- Required for a followup to nim-lang#15951
- Accompanies nim-lang#19727 but for strings

+ `sysstr` housekeeping:
  - Removed redundant `len` and `reserved` sets already
     performed by prior `rawNewStringNoInit` calls.
@ZoomRmc ZoomRmc marked this pull request as draft September 16, 2025 15:22
Without `hasAlloc` `system/sysstr` is not included in system
so `setLengthStrUninit` is not in the scope.
@ZoomRmc
Contributor Author

ZoomRmc commented Sep 16, 2025

Nim/lib/system.nim

Lines 2316 to 2326 in 2e45f61

when notJSnotNims:
  when defined(nimSeqsV2):
    {.noSideEffect.}:
      let str = unsafeAddr s
      setLengthStrV2Uninit(cast[ptr NimStringV2](str)[], newlen)
  else:
    {.noSideEffect.}:
      when hasAlloc:
        setLengthStrUninit(s, newlen)
      else:
        s.setLen(n)

Adding a hasAlloc check and an else-downgrade to setLen for the old string branch was necessary for this to compile. I'm not sure it's correct, as I'm not sure what constitutes the string type at this point.

@ZoomRmc ZoomRmc marked this pull request as ready for review September 16, 2025 23:07
return s
else:
result = mnewString(n)
return if n == 0: s else: mnewString(n)
Member

@Araq Araq Sep 18, 2025

This is not what the original code did and a distraction. Bring back the old version.

Contributor Author

Again, if we don't return early from this case (and checking for nil is clearly an "early return" kind of branch),
we're setting len and the null-byte for the result again at lines 229-230

Nim/lib/system/sysstr.nim

Lines 229 to 230 in 2e45f61

result.len = n
result.data[n] = '\0'
.

mnewString sets the len and the null-byte of the returned string on every call (the null-byte implicitly, via the zeroMem call):

proc mnewString(len: int): NimString {.compilerproc.} =
  result = rawNewStringNoInit(len)
  result.len = len
  zeroMem(addr result.data[0], len + 1)

The original code path for s == nil and n != 0 went like this:

# start of the mnewString call
result = rawNewStringNoInit(n)
result.len = n
zeroMem(addr result.data[0], n + 1) # n (or len in the proc) + 1 sets the null-byte
# after-conditional tail of setLengthStr
result.len = n
result.data[n] = '\0'

So this is redundant.

How about I move the nil case into a separate if statement to signify it's an early-return case?

Contributor Author

@ZoomRmc ZoomRmc Sep 18, 2025

I mean, it could look like this:

proc setLengthStr(s: NimString, newLen: int): NimString {.compilerRtl.} =
  let n = max(newLen, 0)
  if s == nil: # early return check
    if n == 0:
      return s
    else:
      return mnewString(n) # sets everything required
  if n <= s.space:
    result = s
  else:
    let sp = max(resize(s.space), n)
    result = rawNewStringNoInit(sp) # len and null-byte not set
    copyMem(addr result.data[0], unsafeAddr(s.data[0]), s.len)
    zeroMem(addr result.data[s.len], n - s.len)
  result.len = n
  result.data[n] = '\0'

This is more explicit and, in terms of logic, completely equal to the original version, barring redundant field touches, but I'm not totally sure what's "distracting" to you.
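
To make the equivalence claim easier to check, here is a toy model, not the runtime code. `ToyStr` is a hypothetical stand-in for `NimString`, the `'X'` fill simulates uninitialized memory, the doubling in `grow` merely stands in for `resize`, and `fieldWrites` is an invented counter for writes to the length and terminator. It only illustrates that the early-return shape yields the same string with fewer field writes in the nil case.

```nim
type ToyStr = ref object       # hypothetical stand-in for NimString
  length, space: int
  data: seq[char]              # space + 1 slots; slot `length` holds '\0'

var fieldWrites = 0            # invented counter: length + terminator writes

proc rawNoInit(space: int): ToyStr =
  result = ToyStr(space: space, data: newSeq[char](space + 1))
  for c in result.data.mitems: c = 'X'     # simulate uninitialized bytes

proc mnew(n: int): ToyStr =
  result = rawNoInit(n)
  result.length = n
  for i in 0 .. n: result.data[i] = '\0'   # mirrors zeroMem(..., n + 1)
  inc fieldWrites, 2

proc grow(s: ToyStr, n: int): ToyStr =
  result = rawNoInit(max(s.space * 2, n))
  for i in 0 ..< s.length: result.data[i] = s.data[i]
  for i in s.length ..< n: result.data[i] = '\0'

proc finish(r: ToyStr, n: int) =
  r.length = n
  r.data[n] = '\0'
  inc fieldWrites, 2

proc original(s: ToyStr, newLen: int): ToyStr =
  let n = max(newLen, 0)
  if s == nil:
    if n == 0: return s
    else: result = mnew(n)
  elif n <= s.space: result = s
  else: result = grow(s, n)
  finish(result, n)            # repeats what mnew already did in the nil case

proc earlyReturn(s: ToyStr, newLen: int): ToyStr =
  let n = max(newLen, 0)
  if s == nil:
    return if n == 0: s else: mnew(n)
  if n <= s.space: result = s
  else: result = grow(s, n)
  finish(result, n)

fieldWrites = 0
let a = original(nil, 3)
let writesOriginal = fieldWrites
fieldWrites = 0
let b = earlyReturn(nil, 3)
let writesEarly = fieldWrites
doAssert a.length == b.length and a.data == b.data
doAssert writesOriginal == 4 and writesEarly == 2
```

For non-nil inputs both toy procs run the same statements, so the only observable difference in this model is the skipped pair of writes after `mnew`.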

Contributor Author

I think my previous version is more succinct, but it might indeed be better to turn the elif at line 227 into its own if, so the early-return case stands apart.

@Araq
Member

Araq commented Sep 18, 2025

Apart from my code review remarks, seems fine.

@ZoomRmc
Contributor Author

ZoomRmc commented Sep 18, 2025

So I went with expanding that early return conditional at line 232 a bit and added some internal docs. If it's still not up your alley I can change the code to whatever.

If it's ok as is, do you want me to squash it?

BTW, you currently can't build docs with --docInternal for system due to an assertion defect in semtypes:

...assertions.nim(34)       raiseAssert
Error: unhandled exception: semtypes.nim(2471, 7) `c.graph.sysTypes[tySequence] == nil`  [AssertionDefect]
