Key Takeaways

  • Sovereign Intelligence: Own the hardware and the model, so that no provider can cut off your access to AI.

  • Privacy: Data uploaded to ChatGPT may be used for training. If you are feeding it confidential documents or a private diary, a local LLM is the only choice.

  • Mac Studio (M2/M3 Ultra): Can be configured with up to 192GB of unified memory. What does this mean? It means you can run a huge 70B-class LLM (70 billion parameters) entirely in GPU-accessible memory. Inference is not explosively fast, but the size of model you can handle is in a different league.

  • NVIDIA GeForce RTX 4090: If you want not just inference but also training and fine-tuning, NVIDIA is the answer. The CUDA ecosystem is the undisputed king, and the latest research code is almost always written to run on NVIDIA first.

  • Latency: Responses are near-instant because nothing goes through the cloud. You can interact at the speed of thought.

Introduction: Ownership of Intelligence

Electricity, water, gas. And "intelligence." AI has become the fourth utility.

However, continuing to depend on OpenAI or Google is a risk. The moment they declare a "policy violation," your external brain stops. Put a server in your home and run an open-source LLM such as Llama 3 or Mistral. That is "armament" for the digital age.

1. The Inference Monster: Mac Studio

Apple Silicon revolutionized the AI industry. Thanks to unified memory, which shares VRAM (video memory) and main memory in a single pool, you can get a machine with 192GB of GPU-accessible memory for under one million yen. To do the same with NVIDIA, you would need an H100 (several million yen).

Apple Mac Studio (M2 Ultra)

The quietness is wonderful: the fan is barely audible even when running a 70B model, and the electricity bill is low. As a dedicated inference machine, it offers arguably the world's best cost performance.

LM Studio and Ollama

With "LM Studio" or "Ollama," a local AI starts up with a single command. You can chat with a capable AI even with Wi-Fi turned off.
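Once Ollama is running, the "one command" above exposes a local REST API on port 11434, so any script on the machine can talk to the model. A minimal sketch, assuming you have already pulled a model (e.g. `ollama pull llama3`); the model name here is just an example:

```python
import json
import urllib.request

# Ollama serves a local REST API on port 11434 by default.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for Ollama's /api/generate endpoint.

    stream=False requests a single JSON response instead of a
    token-by-token stream.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local_llm(model: str, prompt: str) -> str:
    """Send a prompt to the local Ollama server and return the reply text."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Example (requires a running Ollama server with the model pulled):
#   print(ask_local_llm("llama3", "Explain unified memory in one sentence."))
```

No API key, no account, no network round-trip beyond localhost: the request never leaves the machine.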

2. The Training Beast: RTX 4090 Workstation

If you want to raise the AI yourself (fine-tuning), you need the green giant, NVIDIA. With the brute-force compute of CUDA cores, a LoRA run (Low-Rank Adaptation, a lightweight fine-tuning method) finishes in tens of minutes.

GIGABYTE GeForce RTX 4090

The VRAM is 24GB: less than the Mac, but the speed is overwhelming. Running image-generation AI (Stable Diffusion), it is several times faster than a Mac.
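Why does LoRA finish so quickly? It freezes the base weight matrix W and trains only a low-rank correction BA, so the number of trainable parameters collapses. A toy NumPy sketch of the idea (a single linear layer with made-up sizes, not a real training loop):

```python
import numpy as np

rng = np.random.default_rng(0)

# Frozen base weight of one linear layer (toy dimensions).
d_out, d_in, r = 64, 64, 8           # r << d_in: the low-rank bottleneck
W = rng.standard_normal((d_out, d_in))

# LoRA adapters: only A and B are trained. B starts at zero, so the
# adapted layer behaves exactly like the base layer before training.
A = rng.standard_normal((r, d_in)) * 0.01
B = np.zeros((d_out, r))
alpha = 16                            # LoRA scaling hyperparameter

def adapted_forward(x: np.ndarray) -> np.ndarray:
    """Forward pass: base output plus the scaled low-rank correction."""
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Before any training, B == 0, so the adapter changes nothing.
assert np.allclose(adapted_forward(x), W @ x)

# Trainable-parameter count: two thin matrices instead of one dense one.
lora_params = A.size + B.size         # 8*64 + 64*8 = 1024
full_params = W.size                  # 64*64 = 4096
print(f"LoRA trains {lora_params} params vs {full_params} for full fine-tuning")
```

On a real 7B–70B model the ratio is far more extreme, which is why a 24GB card is enough to fine-tune models it could never fully train.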

3. Comparison: Capacity or Speed

Item             | Mac Studio (Ultra)  | RTX 4090 PC
VRAM (memory)    | Up to 192GB (huge)  | 24GB (small)
Runnable model   | Llama-3-70B (easy)  | Llama-3-8B (the limit)
Generation speed | Fast (~50 t/s)      | Explosive (~100 t/s)
Main use         | Inference / chat    | Training / image generation
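The "which model fits where" rows above are mostly a memory calculation: parameter count times bytes per weight. A rough sketch for Llama-3-70B at common quantization levels (weights only; KV cache and activations add more on top):

```python
def model_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate weight memory in GB: params * bits / 8 bits-per-byte / 1e9."""
    return n_params * bits_per_weight / 8 / 1e9

# Llama-3-70B at common quantization levels (weights only).
for bits in (16, 8, 4):
    gb = model_memory_gb(70e9, bits)
    fits_mac = gb < 192       # Mac Studio unified-memory ceiling
    fits_4090 = gb < 24       # RTX 4090 VRAM
    print(f"{bits}-bit: ~{gb:.0f} GB  Mac Studio: {fits_mac}  RTX 4090: {fits_4090}")
```

Even aggressively quantized to 4 bits, a 70B model needs roughly 35GB for its weights alone, which is why it fits comfortably on the Mac but not on a 24GB card.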

4. The Edge of the Network

Expose this "silicon brain" to the outside through a VPN (Tailscale). Connect from your iPhone on the go to the most powerful PC in your home and ask your private AI questions. No data is handed to the tech giants at all.
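With Tailscale's MagicDNS, the home machine gets a stable hostname on your private tailnet, so reaching it from a phone or laptop is just a URL change. A small sketch; the hostname `mac-studio` is hypothetical, and the port assumes Ollama's default:

```python
import json
import urllib.request

# Hypothetical MagicDNS name of the home server on your tailnet.
HOME_HOST = "mac-studio"

def tailnet_endpoint(host: str, port: int = 11434) -> str:
    """Build the Ollama endpoint URL for a machine on the tailnet.

    Traffic stays inside the encrypted WireGuard tunnel that Tailscale
    maintains; nothing is exposed to the public internet.
    """
    return f"http://{host}:{port}/api/generate"

# From any device on the same tailnet:
#   payload = {"model": "llama3", "prompt": "Summarize my notes", "stream": False}
#   req = urllib.request.Request(
#       tailnet_endpoint(HOME_HOST),
#       data=json.dumps(payload).encode(),
#       headers={"Content-Type": "application/json"},
#   )
#   print(json.loads(urllib.request.urlopen(req).read())["response"])
```

The client code is identical to the localhost case; only the hostname changes.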

Conclusion: A Butler That Is Yours Alone

You can say anything, because no one is watching. Ask any risky question and you will not be banned. And it knows all of your personal context: your diary, your finances, your codebase.

Having such an AI is one of the strongest privileges of the modern age. Buy a Mac Studio. It is not just a PC; it is a "vessel of intelligence."