Paper Abstract
Test-time scaling is a promising new approach to language modeling that uses extra test-time compute to improve performance. Recently, OpenAI's o1 model showed this capability but did not publicly share its methodology, leading to many replication efforts. We seek the simplest approach to achieve test-time scaling and strong reasoning performance. First, we curate a small dataset s1K of 1,000 questions paired with reasoning traces relying on three criteria we validate through ablations: difficulty, diversity, and quality. Second, we develop budget forcing to control test-time compute by forcefully terminating the model's thinking process or lengthening it by appending "Wait" multiple times to the model's generation when it tries to end. This can lead the model to double-check its answer, often fixing incorrect reasoning steps.
In this video, we turn DeepSeek-R1-Distill-Qwen-1.5B into a deep-thinking model that supports test-time scaling.
Note: this works with any model that generates thinking tokens!
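Below is a rough sketch of how budget forcing can be wired up with mlx-lm. It is not the code from the linked gist: the mlx-community repo id, the "</think>" delimiter handling, the sample question, and the token budgets are illustrative assumptions for R1-style models.

# Rough sketch of budget forcing with mlx-lm (pip install mlx-lm).
# Assumption: the model marks the end of its reasoning with "</think>".
from mlx_lm import load, generate

model, tokenizer = load("mlx-community/DeepSeek-R1-Distill-Qwen-1.5B")

question = "How many r's are in 'strawberry'?"
prompt = tokenizer.apply_chat_template(
    [{"role": "user", "content": question}],
    add_generation_prompt=True,
    tokenize=False,
)

THINK_END = "</think>"  # R1-style end-of-thinking delimiter
NUM_WAITS = 2           # how many times to push the model to keep thinking
BUDGET = 512            # max tokens per thinking round

# Lengthen thinking: whenever the model tries to stop, drop everything after
# the closing tag and append "Wait" so it re-examines its reasoning.
for _ in range(NUM_WAITS):
    completion = generate(model, tokenizer, prompt=prompt, max_tokens=BUDGET)
    prompt += completion
    if THINK_END in prompt:
        prompt = prompt.split(THINK_END)[0] + "\nWait"

# Terminate thinking: force the end-of-thinking delimiter so the model answers.
prompt += "\n" + THINK_END + "\n\n"
answer = generate(model, tokenizer, prompt=prompt, max_tokens=256)
print(answer)

Raising NUM_WAITS spends more test-time compute on thinking, which is exactly the scaling knob the paper studies; see Awni's gist linked below for the actual implementation.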
🔗 Links 🔗
s1: Simple test-time scaling
https://arxiv.org/pdf/2501.19393
MLX LM - https://pypi.org/project/mlx-lm/
Code by Awni Hannun - https://gist.github.com/awni/9d8b35ef...
❤️ If you want to support the channel ❤️
Support here:
Patreon - / 1littlecoder
Ko-Fi - https://ko-fi.com/1littlecoder
🧭 Follow me on 🧭
Twitter - / 1littlecoder