## Where to set up Ollama

There are a couple of ways to run the Ollama API, depending on the purpose:
| purpose | where to set up | who can use it | active time | Docker or native install | access |
|---|---|---|---|---|---|
| self-test or small single job | any computer | local user only | unlimited | both work; native install is slightly easier | `curl localhost:11434` |
| large single job | BioHPC with GPU | any computer on the UTSW network, including VPN | 20 hours | already installed | `localhost:11434` from a BioHPC terminal, or `<cluster IP>:11434` on the UTSW network including VPN |
| production | any computer on campus with a wired connection, running 24/7, preferably with GPU | any computer on the UTSW network, including VPN | unlimited | Docker only; the port cannot be accessed via IP address if installed natively | `<IP address>:11434` |
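Whichever setup you use, you can quickly confirm the server is reachable: Ollama answers a plain GET on its root URL with the text `Ollama is running`. A minimal check (replace `localhost` with the server's IP address when testing from another machine):

```bash
# Should print "Ollama is running" if the server is reachable.
curl http://localhost:11434
```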
In many situations the `ollama` command itself does not work (e.g. `ollama pull llama2`); as an alternative, use `curl` against the API:

$ curl --noproxy '*' http://localhost:11434/api/pull -d '{"name": "llama2"}'

The `--noproxy '*'` flag is necessary on BioHPC to avoid the request being blocked by the firewall.
## How to set up Ollama

### Set up Ollama on BioHPC
- Start from a web terminal, an on-demand JupyterLab session (with GPU), or an on-demand RStudio session.
- In the terminal, check the available Ollama versions with `module avail ollama`; you should see something like `ollama/0.1.20-img`.
- Load Ollama: `module load ollama/0.1.20-img`
- Start Ollama with `ollama serve &`, or point it at a customized model storage location: `OLLAMA_MODELS=/work/radiology/yxi/ollama ollama serve &`
- Run `ifconfig` to get the node's IP address for use from other computers on campus/VPN.

Starting the server and pulling models can take a while. The steps are consolidated in the sketch below.
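Putting the steps above together, a minimal end-to-end sketch for a BioHPC session (the module version and the `/work/radiology/yxi/ollama` path are the examples from above; substitute your own):

```bash
#!/usr/bin/env bash
# Start Ollama on a BioHPC node and verify it responds.

module load ollama/0.1.20-img

# Optional: store models under /work instead of the home directory.
export OLLAMA_MODELS=/work/radiology/yxi/ollama

# Start the server in the background and give it a moment to come up.
ollama serve &
sleep 5

# Note the node's IP address for access from other campus/VPN machines.
ifconfig | grep 'inet '

# Verify the API answers; --noproxy '*' avoids the BioHPC proxy.
curl --noproxy '*' http://localhost:11434/api/tags
```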
### Set up Ollama on your own machine

For local use, just follow the official instructions. To run it as a server:

$ docker pull ollama/ollama
$ docker run -d -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama
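Once the container is up, you can drive the CLI inside it with `docker exec`; this is the usual pattern for pulling and smoke-testing a model (`llama3` is just an example name):

```bash
# Pull a model inside the running container.
docker exec -it ollama ollama pull llama3

# Quick interactive test from the host.
docker exec -it ollama ollama run llama3
```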
## Common usage

### Check available LLMs
$ curl http://localhost:11434/api/tags
$ curl --noproxy '*' http://localhost:11434/api/tags   # on BioHPC
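The response is a JSON object with a `models` array. Assuming `jq` is available, a quick way to list just the installed model names:

```bash
# Print one model name per line from the /api/tags response.
curl -s --noproxy '*' http://localhost:11434/api/tags | jq -r '.models[].name'
```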
### Pull models
$ curl http://localhost:11434/api/pull -d '{"name": "llama3"}'
$ curl --noproxy '*' http://localhost:11434/api/pull -d '{"name": "llama3"}'   # on BioHPC
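`/api/pull` streams JSON status objects while the download runs and ends with a `success` status. To follow the progress from the terminal (again assuming `jq`):

```bash
# Stream pull progress; the last line should read "success".
curl -s --noproxy '*' http://localhost:11434/api/pull \
  -d '{"name": "llama3"}' | jq -r '.status'
```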
### Install any GGUF model from Hugging Face

- Download the GGUF file from Hugging Face using `wget` and save it to the `blobs` directory of the Ollama model store; recommended quantization types are `q4_K_M` or `q8_0`.
- Run `sha256sum blablabla.gguf` to get the SHA-256 value, then rename `blablabla.gguf` to `sha256-<value>`.
- Alternatively, upload the file as a blob through the API:
  `curl --noproxy '*' -T blablabla.gguf -X POST localhost:11434/api/blobs/sha256:...`
- Create the model from the blob:
  `curl --noproxy '*' localhost:11434/api/create -d '{"model": "blablabla", "files": {"blablabla.gguf": "sha256-..."}}'`
- The model should then show up in `api/tags`. A consolidated sketch of the whole flow follows below.
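As a worked example, here is the whole flow in one sketch; the Hugging Face URL and the `blablabla.gguf` name are hypothetical placeholders, and the API calls mirror the steps above:

```bash
#!/usr/bin/env bash
# Register a downloaded GGUF file with a running Ollama server.

MODEL_GGUF=blablabla.gguf                                       # hypothetical file name
wget "https://huggingface.co/<repo>/resolve/main/$MODEL_GGUF"   # hypothetical URL

# Compute the digest Ollama uses to identify the blob.
DIGEST=$(sha256sum "$MODEL_GGUF" | awk '{print $1}')

# Upload the file as a blob ...
curl --noproxy '*' -T "$MODEL_GGUF" -X POST "localhost:11434/api/blobs/sha256:$DIGEST"

# ... then create a model that points at it.
curl --noproxy '*' localhost:11434/api/create \
  -d "{\"model\": \"blablabla\", \"files\": {\"$MODEL_GGUF\": \"sha256-$DIGEST\"}}"

# The new model should now appear in /api/tags.
curl --noproxy '*' http://localhost:11434/api/tags
```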
### Test model connection
$ curl 129.112.204.51:11434/api/generate -d '{"model": "llama3","prompt": "how are you", "stream": false}'
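With `"stream": false` the server returns a single JSON object whose `response` field holds the model's reply (the IP address is the example server from above):

```bash
# Print only the generated text.
curl -s 129.112.204.51:11434/api/generate \
  -d '{"model": "llama3", "prompt": "how are you", "stream": false}' | jq -r '.response'
```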
### Example in R
library(curl)
library(jsonlite)

# Build the JSON request body. Note that sprintf() does not escape
# special characters, so keep double quotes out of the prompt/system strings.
changePrompt <- function(new_system, new_prompt) {
  sprintf('{
    "model": "llama2",
    "prompt": "%s",
    "temperature": 0.1,
    "system": "%s",
    "stream": false
  }', new_prompt, new_system)
}

ask_llama2 <- function(text, sys) {
  url <- "http://pedrosa-all-series.dhcp.swmed.org:11434/api/generate"
  # Send a POST request using curl
  response <- curl_fetch_memory(
    url = url,
    handle = new_handle(
      post = TRUE,
      postfields = changePrompt(sys, text)
    )
  )
  # Extract the model's reply from the JSON response
  response_content <- rawToChar(response$content)
  fromJSON(response_content)$response
}

# example
ask_llama2(text = "how are you?", sys = "day dreamer")