Consider specifying WTF-8 variant when creating WTF-8 string views

Currently you can make a WTF-8 view on a string with `string.as_wtf8` and read the string contents with `stringview_wtf8.encode $wtf8_policy`, or indeed `stringview_wtf8.slice` (which doesn't take a policy). The intention is that you can process the WTF-8 contents of a string in a streaming way with a fixed-size buffer. However, might it make sense to pass the policy argument to `string.as_wtf8` instead? Or, in the spirit of #35, perhaps the names would be `string.as_utf8`, `string.as_wtf8`, and `string.as_lossy_utf8`, all resulting in the `stringview_wtf8` type.

I think the essential thing this allows is moving the point where any trap/assertion might take place, for the strict UTF-8 variant, to the point where you create the view. An encode would then never trap unless the target memory range is out of bounds.

For an implementation that doesn't use WTF-8 internally and which eagerly transcodes strings (or substrings of them) to WTF-8 when creating a `stringview_wtf8`, having the policy up-front would allow the policy to be applied once, when the view is created, so that `stringview_wtf8.encode` becomes a simple memcpy. This might not be important, though. I don't know how viable this "MVP" kind of implementation will be in the long term; perhaps breadcrumbs will be a comprehensively better solution.
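
To make the idea concrete, here is a rough Python model of an eagerly transcoding implementation with the policy applied at view creation. Everything in it is hypothetical and for illustration only: the `Policy`, `Trap`, and `StringViewWtf8` names, the `encode(memory, ptr, pos, nbytes)` signature, and the use of a `bytearray` as linear memory are assumptions, not the proposal's actual instructions. It also simplifies by treating the string as a sequence of code points (so it does not combine surrogate pairs) and by not aligning chunks to code point boundaries.

```python
from enum import Enum


class Policy(Enum):
    """Hypothetical stand-ins for the three variants discussed above."""
    UTF8 = "utf8"              # strict: trap on isolated surrogates
    WTF8 = "wtf8"              # isolated surrogates pass through
    LOSSY_UTF8 = "lossy_utf8"  # isolated surrogates become U+FFFD


class Trap(Exception):
    """Models a WebAssembly trap."""


def _is_surrogate(ch: str) -> bool:
    return 0xD800 <= ord(ch) <= 0xDFFF


class StringViewWtf8:
    """Eagerly transcoded view; the policy is applied once, at creation."""

    def __init__(self, s: str, policy: Policy):
        if policy is Policy.UTF8:
            # The only policy-related trap happens here, not in encode().
            if any(_is_surrogate(c) for c in s):
                raise Trap("isolated surrogate under strict UTF-8 policy")
            self._bytes = s.encode("utf-8")
        elif policy is Policy.LOSSY_UTF8:
            cleaned = "".join("\ufffd" if _is_surrogate(c) else c for c in s)
            self._bytes = cleaned.encode("utf-8")
        else:
            # "surrogatepass" emits the generalized 3-byte form for surrogates.
            self._bytes = s.encode("utf-8", "surrogatepass")

    def encode(self, memory: bytearray, ptr: int, pos: int, nbytes: int) -> int:
        """Copy up to nbytes from byte offset pos into memory at ptr.

        No policy check is needed: this is a bounds check plus a copy.
        Returns the number of bytes written.
        """
        chunk = self._bytes[pos:pos + nbytes]
        if ptr < 0 or ptr + len(chunk) > len(memory):
            raise Trap("memory access out of bounds")
        memory[ptr:ptr + len(chunk)] = chunk
        return len(chunk)
```

In this model, constructing a view with `Policy.UTF8` over a string containing an isolated surrogate raises `Trap` immediately, while `encode` on any successfully created view can only fail on the bounds check.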

Consider specifying WTF-8 variant when creating WTF-8 string views #38
