
[bounty] CPU inference support, Mac M1/M2 inference support #77

@olegklimov

Description


There are several projects aiming to make inference on CPU efficient.

The first part is research:

  • Which project performs best,
  • Which is compatible with the Refact license,
  • Which doesn't bloat the Docker image too much,
  • Which allows using scratchpads the way inference_hf.py does (it needs a callback that streams output and allows stopping),
  • Whether it includes Mac M1/M2 support, or whether it makes sense to address Mac separately.
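The scratchpad requirement above can be sketched as a small callback contract. This is a minimal sketch, not code from the repository: the names `StreamCallback` and `stream_generate` are assumptions, chosen only to illustrate "a callback that streams output and allows to stop".

```python
from typing import Callable, Iterable

# Hypothetical contract (names are assumptions, not from the repo):
# the backend calls `on_token` for every generated token; returning
# False from the callback stops generation early.
StreamCallback = Callable[[str], bool]

def stream_generate(tokens: Iterable[str], on_token: StreamCallback) -> str:
    """Feed tokens to the callback, stopping as soon as it returns False."""
    produced = []
    for tok in tokens:
        produced.append(tok)
        if not on_token(tok):
            break
    return "".join(produced)

# A scratchpad could stop generation once it sees a stop token:
text = stream_generate(["a", "b", "c"], lambda t: t != "b")
```

Any candidate CPU backend would need to expose a hook of roughly this shape so the existing scratchpad logic can drive it.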

Please finish the first part and get a "go-ahead" before starting the second part.

The second part is implementation:

  • A script similar to inference_hf.py,
  • Little code,
  • Few dependencies,
  • A demonstration that it works with the Refact-1.6b model, as well as StarCoder (at least the smaller sizes),
  • Integration with the UI and watchdog is a plus, but efficient inference is clearly the priority.
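To make the requested shape concrete, here is a hedged skeleton of what such a script might look like: load a model once, then serve each request through the streaming callback. `MockBackend`, `run_request`, and the model path are all illustrative assumptions standing in for a real CPU backend (e.g. a llama.cpp-style binding), not the actual implementation.

```python
from typing import Callable, Iterator

class MockBackend:
    """Stand-in for a real CPU inference backend (names are assumptions)."""
    def __init__(self, model_path: str):
        self.model_path = model_path  # a real backend would load weights here

    def generate(self, prompt: str, max_tokens: int) -> Iterator[str]:
        # Stand-in token stream; a real backend yields sampled tokens.
        for word in prompt.split()[:max_tokens]:
            yield word + " "

def run_request(backend: MockBackend, prompt: str, max_tokens: int,
                on_token: Callable[[str], bool]) -> str:
    """Drive the backend's token stream through a scratchpad-style callback."""
    out = []
    for tok in backend.generate(prompt, max_tokens):
        out.append(tok)
        if not on_token(tok):  # scratchpad may stop generation early
            break
    return "".join(out)

backend = MockBackend("Refact-1.6b.gguf")  # path is illustrative only
text = run_request(backend, "def hello world", 8, lambda t: True)
```

A script structured this way keeps the backend swappable, so the same entrypoint could serve Refact-1.6b and the smaller StarCoder sizes.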
