Jun 1, 2021
If I'm not wrong, there is no condition for the equivalence of weight decay and L2 regularization: as per your derivation, in both cases we end up with (1 - learning_rate * lambda) * w (interested only in the first term).
You have used alpha as the learning rate in one case and η in the other.
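A minimal numeric sketch of the point above, assuming plain (vanilla) SGD; the values of `lr`, `lam`, `w`, and `grad` are illustrative, not from the post:

```python
import numpy as np

# Illustrative values for one SGD step
lr, lam = 0.1, 0.01
w = np.array([1.0, -2.0, 3.0])
grad = np.array([0.5, 0.5, 0.5])   # gradient of the loss alone

# L2 regularization: add lam * w to the gradient, then step
w_l2 = w - lr * (grad + lam * w)

# Weight decay: shrink w by (1 - lr*lam), then take the gradient step
w_wd = (1 - lr * lam) * w - lr * grad

# For vanilla SGD both reduce to (1 - lr*lam)*w - lr*grad
print(np.allclose(w_l2, w_wd))
```

For plain SGD the two updates coincide (up to rescaling lambda), which is why the first term looks identical in both derivations; the distinction matters for adaptive optimizers like Adam, where the L2 gradient term gets rescaled by the adaptive step but decoupled weight decay does not.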