Ml/pr3 solution articles #5674

Open
ahmadbasyouni10 wants to merge 1 commit into ml/pr2-solution-articles from ml/pr3-solution-articles

@ahmadbasyouni10
Collaborator

  • File(s) Modified: articles/rms-normalization.md, articles/training-diagnostics.md
  • Language(s) Used: python
  • Submission URL: N/A (ML course solution articles, not LeetCode submissions)

Summary

Adds solution articles for 2 new ML problems:

RMS Normalization (rms-normalization.md):

  • Explains RMSNorm vs LayerNorm (no mean subtraction, no beta parameter)
  • Step-by-step formula walkthrough: RMS computation, normalize, scale by gamma
  • Common pitfalls: forgetting epsilon, adding a beta (shift) parameter, normalizing over the wrong axis
  • Project context: model/rms_normalization.py — used by modern LLMs (LLaMA, Mistral)
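For reviewers, the three steps above (RMS computation, normalize, scale by gamma) can be sketched in NumPy roughly as follows. This is a minimal illustration, not the contents of model/rms_normalization.py: the function name `rms_norm`, the default `eps`, placing epsilon inside the square root, and normalizing over the last axis are all assumptions made for the example.

```python
import numpy as np


def rms_norm(x, gamma, eps=1e-6):
    """Hypothetical RMSNorm sketch: no mean subtraction, no beta."""
    x = np.asarray(x, dtype=float)
    # Step 1: root mean square over the feature (last) axis.
    # keepdims=True keeps the result broadcastable against x.
    # eps guards against division by zero for all-zero rows.
    rms = np.sqrt(np.mean(x ** 2, axis=-1, keepdims=True) + eps)
    # Step 2: normalize. Step 3: scale by the learned gamma.
    return x / rms * gamma


x = np.array([[3.0, 4.0], [0.0, 0.0]])
gamma = np.ones(2)
out = rms_norm(x, gamma)
```

Using `axis=-1` with `keepdims=True` addresses the "wrong axis" pitfall for inputs shaped `(batch, features)` or `(batch, seq, features)`, and the all-zero row in the example shows why epsilon matters.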

Training Diagnostics (training-diagnostics.md):

  • Three diagnostic methods: compute_activation_stats, compute_gradient_stats, diagnose
  • Diagnosis priority: dead_neurons → exploding_gradients → vanishing_gradients → healthy
  • Thresholds and reasoning for each diagnosis category
  • Project context: foundations/training_diagnostics.py — debug tool for the GPT training pipeline
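The diagnosis priority above can be illustrated with a short sketch. The three method names come from this PR; everything else is hypothetical: the class name `TrainingDiagnostics`, the method signatures, the returned dictionary keys, and especially the threshold values, which are placeholders and not the ones documented in the article.

```python
import numpy as np


class TrainingDiagnostics:
    """Hypothetical sketch; thresholds are illustrative assumptions."""

    def __init__(self, dead_frac=0.5, explode_norm=1e3, vanish_norm=1e-7):
        self.dead_frac = dead_frac
        self.explode_norm = explode_norm
        self.vanish_norm = vanish_norm

    def compute_activation_stats(self, activations):
        a = np.asarray(activations, dtype=float)  # shape (batch, neurons)
        # Assumption: a neuron is "dead" if it is zero for every sample.
        dead = np.all(a == 0.0, axis=0)
        return {
            "mean": float(a.mean()),
            "std": float(a.std()),
            "dead_fraction": float(dead.mean()),
        }

    def compute_gradient_stats(self, gradients):
        g = np.asarray(gradients, dtype=float)
        return {
            "norm": float(np.linalg.norm(g)),
            "max_abs": float(np.abs(g).max()),
        }

    def diagnose(self, activations, gradients):
        act = self.compute_activation_stats(activations)
        grad = self.compute_gradient_stats(gradients)
        # Checks run in priority order; the first match wins.
        if act["dead_fraction"] > self.dead_frac:
            return "dead_neurons"
        # Assumption: a non-finite norm is treated as exploding.
        if not np.isfinite(grad["norm"]) or grad["norm"] > self.explode_norm:
            return "exploding_gradients"
        if grad["norm"] < self.vanish_norm:
            return "vanishing_gradients"
        return "healthy"


diag = TrainingDiagnostics()
live = np.ones((4, 3))
result = diag.diagnose(live, np.full((2, 2), 0.01))
```

Because the checks short-circuit, a layer with both dead neurons and exploding gradients is reported as `dead_neurons`, matching the ordering listed above.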

- rms-normalization.md — RMSNorm vs LayerNorm, numpy implementation, Llama/Mistral context
- training-diagnostics.md — 3-method class (activation stats, gradient stats, diagnose), Karpathy debugging recipe

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@ahmadbasyouni10 changed the base branch from main to ml/pr2-solution-articles on April 6, 2026, 23:25
