
Commit d0308d4

Add post on local llms
1 parent 6a76119 commit d0308d4

File tree

2 files changed: +195 -0 lines changed


content-org/local-llama-models.org

@@ -0,0 +1,92 @@
#+hugo_base_dir: ~/development/web/jslmorrison.github.io
#+hugo_section: posts
#+options: author:nil

* Local Llama models
:PROPERTIES:
:EXPORT_FILE_NAME: local-llama-models
:EXPORT_DATE: 2024-02-11
:END:
How to install and run open source LLMs locally using [[https://ollama.com/][Ollama]] and integrate them into the VSCode editor for assisted code completion and more.

#+hugo: more
#+begin_quote
Ollama is a tool that allows you to run open-source large language models (LLMs) locally on your machine, providing flexibility in working with different models.
#+end_quote
You can view a list of [[https://ollama.com/library][supported models here]].

** Running Ollama
For me, running Ollama locally is as simple as executing the following in a terminal:
#+begin_src bash :noeval
nix shell nixpkgs#ollama --command ollama serve
#+end_src
This will download Ollama and start the server. If all is well at this point, you should see =Listening on 127.0.0.1:11434= in the terminal output. If you open that URL in a browser you should see =Ollama is running=.
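As a quick extra check (assuming the server is on its default port), the root endpoint returns the same status text:
#+begin_src bash :noeval
# should print: Ollama is running
curl http://127.0.0.1:11434/
#+end_src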
To run a specific model, browse to the [[https://ollama.com/library][Ollama models library]] and pick one that suits your needs. For example:
#+begin_src bash :noeval
nix shell nixpkgs#ollama --command ollama run llama2
#+end_src
#+begin_quote
Llama 2 is released by Meta Platforms, Inc. This model is trained on 2 trillion tokens, and by default supports a context length of 4096. Llama 2 Chat models are fine-tuned on over 1 million human annotations, and are made for chat.
#+end_quote

** Interaction
After running the model as shown above, once it finishes downloading it will present a prompt where you can start chatting:
#+begin_src bash :noeval
>>> hello
Hello! It's nice to meet you. Is there something I can help you with or would you like
to chat?

>>> Send a message (/? for help)
#+end_src
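Inside that prompt you can also use slash commands; a small sketch of the two I reach for most (=/?= lists the rest):
#+begin_src bash :noeval
>>> /?     # show the available slash commands
>>> /bye   # exit the chat session
#+end_src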
There is also an API available that you can send requests to:
#+begin_src bash :noeval
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello"
}'
#+end_src
You can [[https://github.com/ollama/ollama/blob/main/docs/api.md][view the API docs here]].
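By default the =generate= endpoint streams its reply as a series of JSON objects. If you prefer a single JSON response, the API docs describe a =stream= parameter you can set to false; a minimal sketch:
#+begin_src bash :noeval
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello",
  "stream": false
}'
#+end_src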

To install and run a different model, repeat the run command above with its name. For example:
#+begin_src bash :noeval
nix shell nixpkgs#ollama --command ollama run codellama "Write me a PHP function that outputs the fibonacci sequence"

Here is a PHP function that outputs the Fibonacci sequence:
```
function fibonacci($n) {
    if ($n <= 1) {
        return $n;
    } else {
        return fibonacci($n-1) + fibonacci($n-2);
    }
}
```
This function takes an integer `$n` as input and returns the `n`-th number in the
Fibonacci sequence. The function is based on the recurrence relation for the Fibonacci
sequence, which states that each number is equal to the sum of the previous two numbers.
The function uses a recursive approach, where it calls itself with the previous two
numbers as input until it reaches the desired output.

For example, if we call the function with `$n = 5`, it will return `8`, since `8` is the
fifth number in the Fibonacci sequence.
```
echo fibonacci(5); // Output: 8
```
Note that this function has a time complexity of O(`2^n`), which means that the running
time grows very quickly as the input increases. This is because each call to the
function creates a new stack frame, and the function calls itself with smaller inputs
until it reaches the base case. As a result, the function can become very slow for large
values of `n`.
#+end_src

To see a list of installed models:
#+begin_src bash :noeval
nix shell nixpkgs#ollama --command ollama list
#+end_src
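And to free up disk space, a model can be removed again (=ollama rm= followed by the model name):
#+begin_src bash :noeval
nix shell nixpkgs#ollama --command ollama rm llama2
#+end_src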

** Integration with VSCode
Install and configure the =llama-coder= extension from the [[https://marketplace.visualstudio.com/items?itemName=ex3ndr.llama-coder][VSCode marketplace]].
#+begin_quote
Llama Coder is a better and self-hosted Github Copilot replacement for VS Studio Code. Llama Coder uses Ollama and codellama to provide autocomplete that runs on your hardware.
#+end_quote
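A minimal setup sketch, assuming the extension talks to the local Ollama server on its default port: keep =ollama serve= running, and have the codellama model it relies on available locally (pulling it up front avoids waiting on a download at first completion).
#+begin_src bash :noeval
# leave the server running in one terminal
nix shell nixpkgs#ollama --command ollama serve

# in another terminal, download the model used for completions
nix shell nixpkgs#ollama --command ollama pull codellama
#+end_src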
I'll do a follow-up post with my findings once I've spent more time using it and compared other models.

content/posts/local-llama-models.md

@@ -0,0 +1,103 @@
+++
title = "Local Llama models"
date = 2024-02-11
draft = false
+++

How to install and run open source LLMs locally using [Ollama](https://ollama.com/) and integrate them into the VSCode editor for assisted code completion and more.

<!--more-->

> Ollama is a tool that allows you to run open-source large language models (LLMs) locally on your machine, providing flexibility in working with different models.

You can view a list of [supported models here](https://ollama.com/library).


## Running Ollama {#running-ollama}

For me, running Ollama locally is as simple as executing the following in a terminal:

```bash
nix shell nixpkgs#ollama --command ollama serve
```

This will download Ollama and start the server. If all is well at this point, you should see `Listening on 127.0.0.1:11434` in the terminal output. If you open that URL in a browser you should see `Ollama is running`.
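
As a quick extra check (assuming the server is on its default port), the root endpoint returns the same status text:

```bash
# should print: Ollama is running
curl http://127.0.0.1:11434/
```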

To run a specific model, browse to the [Ollama models library](https://ollama.com/library) and pick one that suits your needs. For example:

```bash
nix shell nixpkgs#ollama --command ollama run llama2
```

> Llama 2 is released by Meta Platforms, Inc. This model is trained on 2 trillion tokens, and by default supports a context length of 4096. Llama 2 Chat models are fine-tuned on over 1 million human annotations, and are made for chat.


## Interaction {#interaction}

After running the model as shown above, once it finishes downloading it will present a prompt where you can start chatting:

```bash
>>> hello
Hello! It's nice to meet you. Is there something I can help you with or would you like
to chat?

>>> Send a message (/? for help)
```
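
Inside that prompt you can also use slash commands; a small sketch of the two I reach for most (`/?` lists the rest):

```bash
>>> /?     # show the available slash commands
>>> /bye   # exit the chat session
```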

There is also an API available that you can send requests to:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello"
}'
```

You can [view the API docs here](https://github.com/ollama/ollama/blob/main/docs/api.md).
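
By default the `generate` endpoint streams its reply as a series of JSON objects. If you prefer a single JSON response, the API docs describe a `stream` parameter you can set to false; a minimal sketch:

```bash
curl http://localhost:11434/api/generate -d '{
  "model": "llama2",
  "prompt": "Hello",
  "stream": false
}'
```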

To install and run a different model, repeat the run command above with its name. For example:

````bash
nix shell nixpkgs#ollama --command ollama run codellama "Write me a PHP function that outputs the fibonacci sequence"

Here is a PHP function that outputs the Fibonacci sequence:
```
function fibonacci($n) {
    if ($n <= 1) {
        return $n;
    } else {
        return fibonacci($n-1) + fibonacci($n-2);
    }
}
```
This function takes an integer `$n` as input and returns the `n`-th number in the
Fibonacci sequence. The function is based on the recurrence relation for the Fibonacci
sequence, which states that each number is equal to the sum of the previous two numbers.
The function uses a recursive approach, where it calls itself with the previous two
numbers as input until it reaches the desired output.

For example, if we call the function with `$n = 5`, it will return `8`, since `8` is the
fifth number in the Fibonacci sequence.
```
echo fibonacci(5); // Output: 8
```
Note that this function has a time complexity of O(`2^n`), which means that the running
time grows very quickly as the input increases. This is because each call to the
function creates a new stack frame, and the function calls itself with smaller inputs
until it reaches the base case. As a result, the function can become very slow for large
values of `n`.
````

To see a list of installed models:

```bash
nix shell nixpkgs#ollama --command ollama list
```
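
And to free up disk space, a model can be removed again (`ollama rm` followed by the model name):

```bash
nix shell nixpkgs#ollama --command ollama rm llama2
```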


## Integration with VSCode {#integration-with-vscode}

Install and configure the `llama-coder` extension from the [VSCode marketplace](https://marketplace.visualstudio.com/items?itemName=ex3ndr.llama-coder).

> Llama Coder is a better and self-hosted Github Copilot replacement for VS Studio Code. Llama Coder uses Ollama and codellama to provide autocomplete that runs on your hardware.
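
A minimal setup sketch, assuming the extension talks to the local Ollama server on its default port: keep `ollama serve` running, and have the codellama model it relies on available locally (pulling it up front avoids waiting on a download at first completion).

```bash
# leave the server running in one terminal
nix shell nixpkgs#ollama --command ollama serve

# in another terminal, download the model used for completions
nix shell nixpkgs#ollama --command ollama pull codellama
```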

I'll do a follow-up post with my findings once I've spent more time using it and compared other models.
