Deepseek AI has been a trending topic in recent days. If you are not familiar with it yet, it is an Artificial Intelligence (AI) model that has shot up the mobile app store charts and shrunk the value of some tech stocks.
Deepseek suggested that their model cost only $6 million to train, while advanced models with reasoning capabilities from OpenAI and Anthropic run into the hundreds of millions.
However, this is not what I, and I suspect many other IT geeks, find most fascinating about Deepseek. The company has not only made the model freely accessible at http://chat.deepseek.com, but also released their models for everyone to download on Hugging Face. And while the largest models are generally out of reach for average hobbyists due to hardware limitations, the smaller ones can be run on nearly any hardware.
In this post I will cover the process of setting up the Deepseek-R1 model to run locally on any machine with Ollama.
Ollama
Ollama is a tool that allows you to run large language models (LLMs) on your own hardware. It is currently available for all the major operating systems: Linux, macOS and, most recently, Windows.
The installation process is quite straightforward. Just download the tool and follow the instructions for your platform of choice.
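On Linux, for example, the whole install is a single command as of this writing (check ollama.com for the up-to-date instructions for your platform, as the exact steps may change):

# install Ollama via the official script
curl -fsSL https://ollama.com/install.sh | sh

# confirm the installation
ollama --version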
Installing and running Deepseek-R1 locally
While we would love to go for the largest and most advanced option, the full-size Deepseek-R1 with 671 billion parameters (built on the Deepseek-V3 base), it takes a massive 404 GB of disk space and requires a huge amount of RAM plus a few dedicated GPUs. Someone managed to run it on an 8 x M4 Pro 64 GB Mac Mini cluster with 512 GB of total memory.
However, there are several distilled models, the smallest of which has 1.5B parameters and takes up only 1.1 GB of space.
To download and run it, simply enter the following command in your terminal:
1.5b parameter model
ollama run deepseek-r1:1.5b
This will pull the model and run it. The output will look something like this:
~$ ollama run deepseek-r1:1.5b
pulling manifest
pulling aabd4debf0c8... 100% ▕██████████████████████████████████████████████▏ 1.1 GB
pulling 369ca498f347... 100% ▕██████████████████████████████████████████████▏ 387 B
pulling 6e4c38e1172f... 100% ▕██████████████████████████████████████████████▏ 1.1 KB
pulling f4d24e9138dd... 100% ▕██████████████████████████████████████████████▏ 148 B
pulling a85fe2a2e58e... 100% ▕██████████████████████████████████████████████▏ 487 B
verifying sha256 digest
writing manifest
success
After that you will be greeted with a prompt where you can interact with the model. The model is blazingly fast on my Intel Core i7-1165G7 without a dedicated GPU.
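When you are done experimenting, you can leave the interactive session directly from the prompt (Ctrl+D works as well):

>>> /bye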
A cool thing about the Deepseek model is that you can actually see the reasoning happening within the <think></think> tags. Once the reasoning / “thinking” phase is over, the model produces the final answer.
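The interactive prompt is not the only way to talk to the model. Ollama also exposes a local HTTP API (on port 11434 by default), which is handy for scripting. Here is a minimal sketch with curl, assuming the 1.5b model has already been pulled as above:

# send a single, non-streaming request to the locally running model
curl http://localhost:11434/api/generate -d '{
  "model": "deepseek-r1:1.5b",
  "prompt": "Explain binary search in one sentence.",
  "stream": false
}'

The JSON response contains the generated text; depending on your Ollama version, the reasoning may appear inline in the response or in a separate field.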
7b parameter model
Next, I wanted to see how the bigger 7b parameter model would perform on my hardware.
So, again, to download and start the model:
ollama run deepseek-r1:7b
This time the model is noticeably slower, but still very much usable.
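With two models downloaded, a couple of standard Ollama housekeeping commands come in handy:

# list the models you have pulled and the disk space they occupy
ollama list

# remove a model you no longer need, e.g. the 7b variant
ollama rm deepseek-r1:7b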
Final thoughts
I was actually surprised by the speed/quality ratio these distilled models achieve on a basic laptop setup. It is also worth noting that there are other Deepseek models geared towards math, coding and so on that are worth trying.
I definitely encourage everyone to try running Deepseek and other LLMs such as Llama locally. They are very convenient and can handle many everyday tasks free of charge.
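Trying another model works exactly the same way, for example (the exact tag to use depends on what is currently listed in Ollama's model library):

# pull and run a small Llama model
ollama run llama3.2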