There are several projects aiming to make inference on CPU efficient.
The first part is research:
- Which project works better,
- And is compatible with the Refact license,
- And doesn't bloat the docker image too much,
- And allows using scratchpads similar to how `inference_hf.py` does it (it needs a callback that streams output and allows stopping; see the sketch below),
- Does it include Mac M1/M2 support, or does it make sense to address Mac separately?
Please finish the first part and get a "go-ahead" before starting the second part.
The second part is implementation:
- A script similar to `inference_hf.py` (see the sketch after this list),
- Little code,
- Not many dependencies,
- Demonstrate that it works with the Refact-1.6b model, as well as StarCoder (at least the smaller sizes),
- Integration with the UI and watchdog is a plus, but efficient inference is obviously the priority.
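
As an illustration of "little code, not many dependencies", here is a minimal sketch of how one candidate backend (llama-cpp-python, assuming a GGML/GGUF conversion of Refact-1.6b or StarCoder is available) could be wired to a streaming callback. This is not a decision in favor of that project, only an example of the shape such a script could take; the model path and parameters are placeholders.

```python
from llama_cpp import Llama  # one possible CPU backend; an assumption, not a decision


def stream_completion(model_path: str, prompt: str, on_token, max_tokens: int = 256) -> str:
    # n_ctx and n_threads are illustrative defaults, not tuned values
    llm = Llama(model_path=model_path, n_ctx=2048, n_threads=8)
    produced = []
    for chunk in llm(prompt, max_tokens=max_tokens, stream=True):
        piece = chunk["choices"][0]["text"]
        produced.append(piece)
        if not on_token(piece):   # the scratchpad-style callback can abort generation
            break                 # leaving the generator stops the backend early
    return "".join(produced)


if __name__ == "__main__":
    # placeholder path; a GGML/GGUF export of Refact-1.6b or StarCoder would go here
    text = stream_completion("model.gguf", "def fibonacci(n):", on_token=lambda t: True)
    print(text)
```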