Finding a good set of hyperparameters for a neural network is a complex problem that typically involves a time-consuming trial-and-error process. Often, the hyperparameters are chosen such that the network is larger than necessary for the specific task, leaving parts of the network idle. Many recent works therefore focus on automated architecture search [1, 2] or on the pruning [3, 4] of neural networks.
The goal of this project is to evaluate the information-theoretic entropy of trained models and to experiment with different compression techniques [5, 6] to reduce their storage size.
The project aims to reveal connections between
- the entropy in the training dataset
- the entropy in the trained model
- the pruning potential and the achievable compression ratio
- the theoretical capacity of the hyperparameter space
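To make the notion of entropy concrete, a plug-in estimate of the Shannon entropy of a discretized value distribution (of dataset samples or of flattened weight tensors) could look like the following sketch; the function name and the choice of 256 histogram bins are illustrative assumptions, not part of the project specification:

```python
import numpy as np

def empirical_entropy(values, bins=256):
    """Plug-in Shannon entropy estimate (in bits) of a value distribution,
    obtained by discretizing the values into a histogram."""
    hist, _ = np.histogram(values, bins=bins)
    p = hist / hist.sum()
    p = p[p > 0]  # 0 * log2(0) is defined as 0
    return float(-np.sum(p * np.log2(p)))

# Toy example: uniform 8-bit data carries close to 8 bits of entropy
# per symbol, while constant data carries none.
rng = np.random.default_rng(0)
uniform = rng.integers(0, 256, size=100_000)
constant = np.zeros(100_000)
print(empirical_entropy(uniform))
print(empirical_entropy(constant))
```

The same estimator can be applied to a training dataset and to the saved weights of a trained model, which gives a first handle on the connections listed above.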
When choosing the hyperparameters for a neural network, information-theoretic considerations are rarely taken into account. It does, however, seem reasonable to orient the capacity of the network towards the information content of the dataset it should represent. Still, accessible techniques for evaluating the saturation of neural network layers during training are missing to date.
- Get familiar with TensorFlow / PyTorch
- Train models of different complexity on example datasets
- Implement a custom way to save the weights of the trained models
- Evaluate the entropy of
- training dataset
- saved weights
- Try different (existing or new) compression techniques on the saved weights
- Analyze the connections between these information contents
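As a simple baseline for the compression experiments on saved weights, one could quantize the weights and run them through a generic entropy coder such as zlib (DEFLATE); the function name and the 8-bit uniform quantization below are illustrative assumptions:

```python
import zlib
import numpy as np

def compression_ratio(weights, num_levels=256):
    """Uniformly quantize weights to num_levels values, then measure how
    well a generic entropy coder (zlib/DEFLATE) compresses the result."""
    w = np.asarray(weights, dtype=np.float32).ravel()
    lo, hi = w.min(), w.max()
    q = np.round((w - lo) / (hi - lo + 1e-12) * (num_levels - 1)).astype(np.uint8)
    raw = q.tobytes()
    compressed = zlib.compress(raw, level=9)
    return len(raw) / len(compressed)

# Dense Gaussian weights compress only slightly, while a heavily
# pruned (mostly-zero) tensor of the same size compresses far better.
rng = np.random.default_rng(0)
dense = rng.normal(size=10_000)
pruned = dense * (rng.random(10_000) < 0.1)  # keep ~10 % of the weights
print(compression_ratio(dense), compression_ratio(pruned))
```

The gap between the two ratios illustrates exactly the kind of connection between pruning potential and compression ratio that the project sets out to analyze.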
- Programming skills in Python
(preferably also with PyTorch or TensorFlow)
- Basic knowledge of neural networks
- Scope: Bachelor/Master
- Duration: 6-month project, 3-month thesis (Bachelor) / 6-month thesis (Master)
- Start: immediately
- J. Bergstra, D. L. K. Yamins, and D. Cox. “Making a science of model search: Hyperparameter optimization in hundreds of dimensions for vision architectures”. In: International Conference on Machine Learning (2013), pp. 115–123.
- T. Elsken, J. H. Metzen, and F. Hutter. “Neural Architecture Search: A Survey”. In: Journal of Machine Learning Research 20 (Aug. 2018), pp. 1–21.
- J. Frankle and M. Carbin. “The Lottery Ticket Hypothesis: Training Pruned Neural Networks”. In: CoRR abs/1803.03635 (2018). arXiv: 1803.03635.
- The TensorFlow Authors. Magnitude-based weight pruning with Keras. https://github.com/tensorflow/model-optimization/blob/master/tensorflow_model_optimization/g3doc/guide/pruning/pruning_with_keras.ipynb [Online; accessed 24. Sep. 2019]. June 2019.
- S. Wiedemann et al. “DeepCABAC: A Universal Compression Algorithm for Deep Neural Networks”. In: CoRR abs/1907.11900 (2019). arXiv: 1907.11900.
- M. Zhu and S. Gupta. “To prune, or not to prune: exploring the efficacy of pruning for model compression”. In: arXiv e-prints (2017). arXiv: 1710.01878.