Open
Labels
question: Further information is requested
Description
This is a great paper, full of information and ideas. Thank you for this amazing work.
While reading, I came across this line: "we want to look at the momentum magnitude of “missing” or zero-valued weights, that is, we want to look at those weights which have been excluded from training before." Suppose a missing weight has a large momentum, and that this momentum has accumulated over several updates rather than from a single update. Why was that connection missing in the first place? Is it because the connection regrowth step is not performed frequently enough? And if that is the reason, could regrowing connections more frequently lead to faster convergence?
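To make the question concrete, here is a minimal sketch of the regrowth step as I understand it from the quoted sentence: among the missing (masked-out) weights, the ones with the largest momentum magnitude are re-enabled. The function name `regrow_by_momentum`, the flat-list layout, and the toy values are assumptions made for illustration only; this is not the paper's or the repository's implementation.

```python
def regrow_by_momentum(mask, momentum, k):
    """Return a copy of mask with up to k missing weights re-enabled.

    mask:     list of 0/1 flags (1 = weight is active, 0 = missing)
    momentum: list of momentum values, same length as mask
    k:        number of connections to regrow in this step
    """
    # Candidates are the currently missing (zero-valued) weights.
    missing = [i for i, m in enumerate(mask) if m == 0]
    # Rank the missing weights by momentum magnitude, largest first.
    missing.sort(key=lambda i: abs(momentum[i]), reverse=True)
    new_mask = list(mask)
    for i in missing[:k]:
        new_mask[i] = 1
    return new_mask


# Toy example (hypothetical values): weights 1, 2 and 4 are missing.
mask = [1, 0, 0, 1, 0]
momentum = [0.2, -0.9, 0.1, 0.05, 0.4]
new_mask = regrow_by_momentum(mask, momentum, k=2)
print(new_mask)
```

In this toy example, weights 1 and 4 are regrown because they have the largest momentum magnitude among the missing weights. My question is about how often such a step should run: calling it every few iterations instead of once per epoch would re-enable high-momentum connections sooner.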
Thank you for your time and attention.
Vibhas.