Run on Apple Mac Silicon chip #43
Hey! This code was only tested using NVIDIA GPUs, I believe they have a guide for PyTorch here: https://developer.apple.com/metal/pytorch/ :) |
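For reference, a minimal availability check in the spirit of the snippet in Apple's guide (the import guard is my own addition so the sketch degrades gracefully where PyTorch isn't installed):

```python
# A sketch of verifying the MPS (Metal) backend, similar to the check shown
# in Apple's PyTorch-on-Metal guide. Guarded import so it runs without torch.
try:
    import torch
except ImportError:
    torch = None

def mps_available() -> bool:
    """True only when this PyTorch build exposes a usable MPS backend."""
    if torch is None:
        return False
    mps = getattr(torch.backends, "mps", None)
    return bool(mps and mps.is_available())

if mps_available():
    x = torch.ones(1, device="mps")  # tensor allocated on the Apple GPU
    print(x)
else:
    print("MPS device not found; falling back to CPU.")
```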
Created a pull request to address this |
I checked out the code from the pull request and tried to run it. It's been going most of the weekend on my Mac. :) Specifically these commands:

```shell
# Prepare NanoGPT data
python data/enwik8/prepare.py
python data/shakespeare_char/prepare.py
python data/text8/prepare.py
```

Do these even need to be run? I am curious what the actual minimal command set is, if most of the AI work is actually done through OpenAI, etc. |
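For anyone wondering what those scripts do: a hypothetical sketch of a character-level prepare step in the spirit of nanoGPT's `data/shakespeare_char/prepare.py` (the function name and in-memory return are my own; the real scripts write `train.bin`/`val.bin` to disk):

```python
import numpy as np

def prepare_char_dataset(text: str, val_fraction: float = 0.1):
    # Build the character vocabulary and an encoder table.
    chars = sorted(set(text))
    stoi = {ch: i for i, ch in enumerate(chars)}
    # Encode the whole corpus as uint16 token ids (nanoGPT's on-disk dtype).
    ids = np.array([stoi[ch] for ch in text], dtype=np.uint16)
    # Hold out the tail of the corpus for validation.
    split = int(len(ids) * (1 - val_fraction))
    train_ids, val_ids = ids[:split], ids[split:]
    # The real scripts end with train_ids.tofile("train.bin") and the like;
    # here we just return the arrays for illustration.
    return train_ids, val_ids, stoi

train_ids, val_ids, stoi = prepare_char_dataset("hello gpt " * 100)
print(len(train_ids), len(val_ids), len(stoi))  # 900 100 8
```

The resulting token files are what the later training runs consume, which is why skipping the prepare step breaks the pipeline.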
These prepare the data for the nanoGPT runs. The remaining commands run the baseline, which will be machine dependent for things like training speed! |
Right, but I am not using a machine with a GPU. I want to offload as much of this as possible to a third-party tool (Colab? Some other service that has a GPU?). So the fact that this dependency exists is kind of a blocker for usage. Maybe there's something else I am missing about what this is for? |
I think modern machine learning is quite hard without a GPU. Later parts of the pipeline will attempt dozens of runs, each of which could take hours without a GPU. I would recommend services like Lambda where you can rent GPUs per hour. The component you are referring to is comparatively cheap compared to what the AI scientist could choose to run. |
Why not use GPT4o-mini/Claude instead of a local nanoGPT? Not totally sure of the value of a hybrid approach here, given the cost of cutting off Mac users, since we're already providing API keys. |
Additionally, is it REALLY a dependency? It looks like it creates an artifact that is used later when you're using GPT4o-mini. Is that artifact actually a requirement for subsequent steps? |
GPT4o/Claude is the foundation model that proposes ideas. NanoGPT is the actual model that is modified and trained. This is a different model for different templates. |
The preparation steps create the training data and baseline for comparison. Very much essential. Different templates have different preparation steps. |
I found that the current project can only run in CPU mode on an Apple silicon chip. I don't know if the GPU backend, 'mps', can be added. I set the device parameter to mps, but it didn't work.
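For what it's worth, a device-selection sketch with a CPU fallback (assuming PyTorch >= 1.12, where the MPS backend first appeared; the import guard is only so the sketch runs anywhere). Even with mps selected, individual ops that Metal doesn't implement can still fail unless `PYTORCH_ENABLE_MPS_FALLBACK=1` is set in the environment:

```python
try:
    import torch
except ImportError:
    torch = None

def pick_device() -> str:
    """Prefer CUDA, then Apple's MPS backend, then plain CPU."""
    if torch is None:
        return "cpu"
    if torch.cuda.is_available():
        return "cuda"
    mps = getattr(torch.backends, "mps", None)
    if mps is not None and mps.is_available():
        return "mps"
    return "cpu"

device = pick_device()
print(f"training on: {device}")
```

Hard-coding `device = "mps"` in a config that the code later overrides (or that feeds ops unsupported on Metal) would match the "set it but it didn't work" symptom described above.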