QwQ-32B is a 32-billion-parameter language model that achieves performance comparable to DeepSeek-R1, a 671-billion-parameter model, by scaling reinforcement learning.
QwQ-32B: Embracing the Power of Reinforcement Learning