
Update code-gpt article for W_o rollout and logits (no softmax) #5668

Open
ahmadbasyouni10 wants to merge 1 commit into feature/ml-solution-articles-new from ml/pr2-solution-articles

@ahmadbasyouni10
Collaborator

  • File(s) Modified: articles/code-gpt.md
  • Language(s) Used: python
  • Submission URL: N/A (ML course solution article update, not a LeetCode submission)

Summary

Updates the Code GPT solution article to match the W_o output projection rollout and the softmax → logits change:

  • Removes softmax from the solution; the model now returns raw logits, not probabilities
  • Adds output_proj = nn.Linear(model_dim, model_dim, bias=False) to the inner MultiHeadedSelfAttention class
  • Updates all explanatory text from "probabilities" to "logits"
  • Updates prerequisites to reference W^O instead of softmax
  • Removes the softmax row from the shape walkthrough table
  • Updates key takeaways to mention W^O
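
For reviewers who want to see where the new projection sits, here is a minimal sketch of a multi-headed self-attention module with the `output_proj` layer described above. Only the class name and the `output_proj = nn.Linear(model_dim, model_dim, bias=False)` line come from this PR; the fused `qkv` layer, the `num_heads` argument, the causal mask, and the variable names are assumptions made for illustration and may not match the article's actual code. Note that the softmax inside attention stays; the softmax this PR removes is the one applied to the model's final output.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class MultiHeadedSelfAttention(nn.Module):
    def __init__(self, model_dim: int, num_heads: int):
        super().__init__()
        assert model_dim % num_heads == 0, "model_dim must divide evenly across heads"
        self.num_heads = num_heads
        self.head_dim = model_dim // num_heads
        # Assumption: one fused projection for Q, K, V (the article may use per-head layers).
        self.qkv = nn.Linear(model_dim, 3 * model_dim, bias=False)
        # From the PR description: the W_o output projection.
        self.output_proj = nn.Linear(model_dim, model_dim, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        B, T, C = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)

        def split_heads(t: torch.Tensor) -> torch.Tensor:
            # (B, T, C) -> (B, num_heads, T, head_dim)
            return t.view(B, T, self.num_heads, self.head_dim).transpose(1, 2)

        q, k, v = split_heads(q), split_heads(k), split_heads(v)
        scores = q @ k.transpose(-2, -1) / self.head_dim ** 0.5
        # Causal mask: each position attends only to itself and earlier positions.
        mask = torch.tril(torch.ones(T, T, dtype=torch.bool, device=x.device))
        scores = scores.masked_fill(~mask, float("-inf"))
        weights = F.softmax(scores, dim=-1)  # attention softmax is unchanged
        heads = weights @ v  # (B, num_heads, T, head_dim)
        concat = heads.transpose(1, 2).reshape(B, T, C)
        # W_o mixes information across the concatenated heads.
        return self.output_proj(concat)
```

With this layout, the output keeps the input shape `(batch, context_length, model_dim)`, so adding W_o should not change any downstream shapes in the walkthrough table.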

This is the companion to the neetcode.io PR that rolls out W_o to the Code GPT solution/starter code and updates the expected test output.
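
The PR does not state why the model now returns logits, but one common reason is that PyTorch's `F.cross_entropy` expects unnormalized logits and applies log-softmax internally. The sketch below, using made-up tensor sizes, shows that the loss computed from raw logits matches the loss computed by applying softmax first, and that probabilities can still be recovered from logits when needed.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)

# Hypothetical sizes for illustration: (batch, context_length, vocab_size).
vocab_size = 10
logits = torch.randn(2, 4, vocab_size)
targets = torch.randint(0, vocab_size, (2, 4))

# Loss computed directly from raw logits.
loss = F.cross_entropy(logits.view(-1, vocab_size), targets.view(-1))

# Same loss computed the long way: softmax, then log, then negative log-likelihood.
probs = F.softmax(logits, dim=-1)
manual_loss = F.nll_loss(torch.log(probs).view(-1, vocab_size), targets.view(-1))

# Softmax preserves ordering, so the most likely token is the same either way.
same_prediction = torch.equal(logits.argmax(dim=-1), probs.argmax(dim=-1))
```

A caller that still needs probabilities, for example when sampling the next token, can apply `F.softmax(logits, dim=-1)` to the model output.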

- Remove softmax from solution, return raw logits
- Add W_o output projection to inner MHA class
- Update all explanatory text: probabilities → logits
- Update shape table and key takeaways

Made-with: Cursor