13. Regularization: Sparsity
Topic: Sparsity
Course: GMLC
Date: 17 March 2019
Professor: Not specified
- A model with many dimensions (such as wide feature vectors) takes up a lot of RAM
- Zeroing out weights that are close to 0 saves RAM, because exact zeros can be dropped from the model (see the sketch below)
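A minimal sketch (assuming NumPy and SciPy are available; the weight count and sparsity level are made-up numbers) of why exact zeros save RAM: weights that are exactly 0 can be dropped from a sparse representation, while merely small weights still have to be stored:

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(0)

# Hypothetical wide model: 1,000,000 weights, 99% driven to exactly 0.
weights = rng.normal(size=1_000_000)
weights[rng.random(weights.size) < 0.99] = 0.0

dense_bytes = weights.nbytes
# CSR storage keeps only the nonzero values (plus some index overhead).
sparse_values_bytes = sparse.csr_matrix(weights.reshape(1, -1)).data.nbytes

print(f"dense:  {dense_bytes / 1e6:.1f} MB")
print(f"sparse: {sparse_values_bytes / 1e6:.2f} MB (values only)")
```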
- L1 vs. L2
  - L2 penalizes weight²
  - L1 penalizes |weight|
  - The derivative of L2 is 2 * weight
  - The derivative of L1 is k (a constant whose value is independent of the weight)
  - Consequence: each L2 step removes a fraction of the weight, so weights shrink toward 0 but never reach it exactly; each L1 step subtracts a constant, so it can drive weights to exactly 0 (see the sketch below)
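A minimal sketch (plain NumPy; the weights, learning rate, and regularization strength are made-up values) of one regularization step under each penalty, showing why only L1 produces exact zeros. Clipping the L1 step through 0, as done here, is one common way to implement it:

```python
import numpy as np

w = np.array([0.5, 0.02, -0.01])  # hypothetical weights
lr, lam = 0.1, 0.3                # learning rate, regularization strength

# L2 step: derivative is 2 * lam * w, so each step removes a fraction
# of the weight; it decays toward 0 but never lands on it exactly.
w_l2 = w - lr * (2.0 * lam * w)

# L1 step: derivative is a constant, lam * sign(w); a small weight
# would overshoot past 0, so it is clipped to exactly 0 instead.
step = lr * lam * np.sign(w)
w_l1 = np.where(np.abs(w) <= np.abs(step), 0.0, w - step)

print("L2:", w_l2)  # every weight shrinks, none is exactly 0
print("L1:", w_l1)  # the small weights snap to exactly 0
```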
- Convex optimization: using techniques such as gradient descent to find the minimum of a convex function (see the sketch below)
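A minimal sketch (plain Python; the quadratic objective, starting point, and step size are arbitrary) of gradient descent finding the minimum of a convex function:

```python
# Convex objective f(w) = (w - 3)^2, whose single minimum is at w = 3.
def grad(w):
    return 2.0 * (w - 3.0)  # derivative of f

w = 0.0    # arbitrary starting point
lr = 0.1   # learning rate (step size)
for _ in range(100):
    w -= lr * grad(w)

print(w)  # ~3.0: gradient descent converges to the minimum
```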
 
 
- Know when to use L1 vs. L2 based on what each encourages the model's weights to do (see the comparison sketch below):
  - L1 is used to save RAM: it encourages weights that are close to 0 to become exactly 0
  - L2 is used to bring weight values close to 0, but not exactly 0
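A minimal sketch (assuming scikit-learn is installed; the synthetic data and alpha values are arbitrary choices) comparing the two penalties on the same regression problem, counting how many weights each drives to exactly 0:

```python
import numpy as np
from sklearn.linear_model import Lasso, Ridge

rng = np.random.default_rng(0)

# Synthetic data: 100 features, only the first 5 actually matter.
X = rng.normal(size=(500, 100))
true_w = np.zeros(100)
true_w[:5] = rng.normal(size=5)
y = X @ true_w + 0.1 * rng.normal(size=500)

lasso = Lasso(alpha=0.1).fit(X, y)  # L1-penalized linear regression
ridge = Ridge(alpha=0.1).fit(X, y)  # L2-penalized linear regression

print("exact zeros with L1:", int(np.sum(lasso.coef_ == 0.0)))  # many
print("exact zeros with L2:", int(np.sum(ridge.coef_ == 0.0)))  # typically 0
```

Ridge typically leaves all 100 coefficients small but nonzero, while Lasso zeroes out most of the irrelevant ones.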