An LLM memory calculator is a tool that estimates how much memory a large language model (LLM) needs and how fast it can generate text on your device, helping you choose the right hardware and model size.
The code and formulas of this calculator are open source. Click here to check it out or contribute.
Select an LLM from the dropdown list, then select your Mac, the amount of memory available, and your task. The tool reads the model's metadata (size, layers, hidden dimensions, KV cache, etc.), estimates the memory required and the generation speed, and displays the results.
Please note that this tool provides estimates only; actual performance can vary depending on your hardware and configuration.
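To make the estimation steps described above more concrete, here is a minimal Python sketch of how such an estimate could be computed. It is not the calculator's actual code: it assumes the common rules of thumb that weight memory is the parameter count times bits per weight, that the KV cache stores one key and one value tensor per layer, and that generation is limited by memory bandwidth. The `ModelSpec` class, all function names, the example model, and the 100 GB/s bandwidth figure are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

GIB = 1024 ** 3  # bytes in one gibibyte


@dataclass
class ModelSpec:
    """Hypothetical container for the model metadata used in the estimate."""
    n_params: float     # total number of parameters
    n_layers: int       # number of transformer layers
    n_kv_heads: int     # number of key/value heads
    head_dim: int       # dimension of each attention head
    weight_bits: float  # bits per weight after quantization (16, 8, 4, ...)


def weights_bytes(spec: ModelSpec) -> float:
    """Memory taken by the model weights, in bytes."""
    return spec.n_params * spec.weight_bits / 8


def kv_cache_bytes(spec: ModelSpec, context_len: int, bytes_per_value: int = 2) -> int:
    """Memory taken by the KV cache: one K and one V tensor per layer."""
    return (2 * spec.n_layers * spec.n_kv_heads * spec.head_dim
            * context_len * bytes_per_value)


def estimate_memory_gib(spec: ModelSpec, context_len: int) -> float:
    """Rough total: weights plus KV cache (ignores activations and runtime overhead)."""
    return (weights_bytes(spec) + kv_cache_bytes(spec, context_len)) / GIB


def estimate_tokens_per_second(spec: ModelSpec, bandwidth_gb_s: float) -> float:
    """Assumes decoding is memory-bound: every token reads all weights once."""
    return bandwidth_gb_s * 1e9 / weights_bytes(spec)


def fits_in_memory(spec: ModelSpec, context_len: int, available_gib: float) -> bool:
    """True if the rough estimate is within the available unified memory."""
    return estimate_memory_gib(spec, context_len) <= available_gib


# Illustrative numbers only: a hypothetical 8-billion-parameter model at 4 bits per weight.
spec = ModelSpec(n_params=8e9, n_layers=32, n_kv_heads=8, head_dim=128, weight_bits=4)
memory_gib = estimate_memory_gib(spec, context_len=8192)
speed = estimate_tokens_per_second(spec, bandwidth_gb_s=100)  # assumed bandwidth
print(f"Estimated memory: {memory_gib:.2f} GiB, speed: {speed:.1f} tokens/s")
print("Fits in 16 GiB:", fits_in_memory(spec, 8192, 16))
```

A real tool would likely also account for activation buffers, runtime overhead, and the share of unified memory the operating system reserves, so treat these numbers as a lower bound on memory and an upper bound on speed.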