Contributor

@ZERICO2005 ZERICO2005 commented Sep 6, 2025

When __TICE__ is defined, calloc now zero-fills memory with an inlined implementation of bzero that uses the $E40000 address for speed.

When __TICE__ is undefined, it falls back to the previous memset implementation.
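
For readers unfamiliar with the fallback path, here is a minimal sketch of a calloc built from malloc plus memset. It is not the toolchain's source: the name `sketch_calloc` is hypothetical, and the multiplication overflow check is an assumption that this thread does not confirm.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical name: a portable calloc built on malloc + memset. */
void *sketch_calloc(size_t nmemb, size_t size)
{
    /* Assumption: reject element counts whose product overflows size_t. */
    if (size != 0 && nmemb > SIZE_MAX / size) {
        return NULL;
    }

    size_t total = nmemb * size;
    void *ptr = malloc(total);
    if (ptr != NULL) {
        /* The hardware-specific path would replace this call. */
        memset(ptr, 0, total);
    }
    return ptr;
}
```

Only the memset call differs between the two paths; the allocation and the null check stay the same.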

Comment on lines 25 to 27
add hl, bc
or a, a
sbc hl, bc
Member

Seems like the add here should never carry?

Suggested change
- add hl, bc
- or a, a
- sbc hl, bc
+ add hl, bc
+ sbc hl, bc

Contributor

Only if you also move up the pop of the size (the popped parameter is undefined data because malloc may clobber it).

Contributor Author

@ZERICO2005 ZERICO2005 Sep 6, 2025

Does this rely on the assumption that malloc won't return an address higher than 0xE40000 on the CE and that the allocation size is less than 0x1C0000 bytes, or something similar?

Member

This relies on the seemingly sound assumption that the pointer to the allocated memory plus the allocation size does not overflow.
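
To make the carry argument concrete, here is a small C model of the two-instruction sequence on 24-bit registers. It covers the arithmetic only, not the eZ80 flag register in full, and the helper names `add24`, `sbc24` and `add_then_sbc` are hypothetical. If `add hl, bc` does not carry, the following `sbc hl, bc` subtracts exactly `bc` and restores `hl`. If it does carry, the result is off by one, which is what `or a, a` guarded against.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define MASK24 0xFFFFFFu

/* Models `add hl, bc` on 24-bit registers; returns the carry flag. */
static bool add24(uint32_t *hl, uint32_t bc)
{
    uint32_t sum = (*hl & MASK24) + (bc & MASK24);
    *hl = sum & MASK24;
    return sum > MASK24;
}

/* Models `sbc hl, bc`: subtracts bc plus the incoming carry; returns borrow. */
static bool sbc24(uint32_t *hl, uint32_t bc, bool carry_in)
{
    uint32_t lhs = *hl & MASK24;
    uint32_t rhs = (bc & MASK24) + (carry_in ? 1u : 0u);
    *hl = (lhs - rhs) & MASK24;
    return lhs < rhs;
}

/* `add hl, bc` followed directly by `sbc hl, bc`, with no `or a, a`. */
static uint32_t add_then_sbc(uint32_t hl, uint32_t bc)
{
    bool carry = add24(&hl, bc);
    sbc24(&hl, bc, carry);
    return hl;
}
```

With a pointer of 0xD10000 and a size of 0x100 the sum stays below 2^24, so `add_then_sbc` returns the pointer unchanged. With 0xFFFF00 and 0x200 the add wraps and the result comes back one lower than the input.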

Comment on lines 33 to 37
; test if the size is zero
scf
sbc hl, hl
add hl, bc
jr nc, .finish
Member

@runer112 runer112 Sep 6, 2025

I went down a bit of a rabbit hole here... I thought: "What's the point of malloc'ing 0 bytes? Could we treat that as an allocation failure and return null?" And sure enough, the C standard allows for exactly this:

If the size of the space requested is zero, the behavior is implementation-defined: either a null pointer is returned to indicate an error, or the behavior is as if the size were some nonzero value, except that the returned pointer shall not be used to access an object.

If we push the zero-size check into malloc, we can eliminate the check here. We'd have to conditionally keep the check here to be safe if _custom_malloc is being used, but otherwise the check can be cheaply pushed into __simple_malloc:

 	ex	(sp),hl
 	push	bc
 	ld	de,(_heap_ptr)
+	dec hl
 	add	hl,de
 	jr	c,.null
- 	ld	bc,___heaptop
+ 	ld	bc,___heaptop-1
 	sbc	hl,bc
 	jr	nc,.null
 	add	hl,bc
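
As a rough illustration of why the extra `dec hl` folds the zero-size check into the existing overflow test, here is a C model of a 24-bit bump allocator. The heap addresses, the name `sketch_simple_malloc`, and the way the heap pointer is updated afterwards are all assumptions; only the two comparisons mirror the diff above.

```c
#include <assert.h>
#include <stdint.h>

#define MASK24 0xFFFFFFu

/* Hypothetical heap bounds, chosen only for illustration. */
static uint32_t heap_ptr = 0xD10000u;
static const uint32_t heap_top = 0xD20000u;

/* Returns the allocated address, or 0 (null) on failure or when size is 0. */
static uint32_t sketch_simple_malloc(uint32_t size)
{
    /* dec hl ; add hl,de : (size - 1) + heap_ptr in 24-bit arithmetic. */
    uint32_t sum = ((size - 1u) & MASK24) + heap_ptr;
    if (sum > MASK24) {
        /* Carry: size was 0 (wrapped to 0xFFFFFF) or the sum overflowed. */
        return 0;
    }

    /* sbc hl,bc with bc = heap_top - 1 : succeed only if sum < heap_top - 1. */
    if (sum >= heap_top - 1u) {
        return 0;
    }

    uint32_t result = heap_ptr;
    heap_ptr = sum + 1u; /* Assumed update: old pointer plus size. */
    return result;
}
```

A size of 0 wraps to 0xFFFFFF, so adding any nonzero heap pointer carries and the request takes the same null path as a genuine overflow.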

And _standard_malloc:

 
     /* add size of block header to real size */
     const size_t size = alloc_size + sizeof(block_t);
+    /* abort if alloc_size is 0 or size overflowed */
-    if (size < alloc_size)
+    if (size <= sizeof(block_t))
     {
         return NULL;
     }
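
A minimal sketch of a single comparison that rejects both a zero-size request and an overflowed sum. It compares against the header size: with a nonzero header, `alloc_size + sizeof(block_t)` equals the header size exactly when `alloc_size` is 0, and wraps below it on overflow. The `block_t` layout and the name `request_size_ok` are hypothetical stand-ins, not the toolchain's definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical block header; the real layout is not shown in this thread. */
typedef struct block {
    struct block *next;
    size_t size;
} block_t;

/* Returns true when the request is nonzero and the padded size did not wrap. */
static bool request_size_ok(size_t alloc_size)
{
    /* Unsigned addition wraps on overflow, which is well defined in C. */
    const size_t size = alloc_size + sizeof(block_t);

    /* Equal: alloc_size was 0. Smaller: the addition wrapped. */
    return size > sizeof(block_t);
}
```

For example, `request_size_ok(0)` and `request_size_ok(SIZE_MAX)` are both false, while `request_size_ok(1)` is true.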

The same idea can be applied to the version of calloc that makes no hardware assumptions.

@ZERICO2005 ZERICO2005 changed the title from "optimized the zero filling in calloc (__TICE__ only)" to "changed behaviour of malloc(0) and optimized zero filling in calloc" Sep 6, 2025
@ZERICO2005 ZERICO2005 changed the title from "changed behaviour of malloc(0) and optimized zero filling in calloc" to "changed behaviour of malloc(0) and optimized calloc" Sep 6, 2025
@runer112 runer112 self-requested a review September 6, 2025 22:31
; inlined memset/bzero
; assumes that malloc(0) returns NULL, so we can skip the check for zero size
add hl, bc
cpd
Contributor Author

I assume it is okay for cpd to read the byte at nonnull_ptr_from_malloc + size, which is one past the end of the allocation.
