Deep Learning

Methods

External Memory

  • Vinyals, O., Fortunato, M. & Jaitly, N. Pointer networks. In Advances in Neural Information Processing Systems Vol. 28 (eds Cortes, C. et al.) 2692–2700 (Curran Associates, 2015)
  • Graves, A., Wayne, G. & Danihelka, I. Neural Turing machines. Preprint at http://arxiv.org/abs/1410.5401 (2014)
  • Graves, A., Wayne, G., Reynolds, M. et al. Hybrid computing using a neural network with dynamic external memory. Nature 538, 471–476 (2016) [differentiable neural computer (DNC)]
  • Peter Battaglia, Razvan Pascanu, Matthew Lai, Danilo Jimenez Rezende, et al. Interaction networks for learning about objects, relations and physics. In NIPS, 2016.

Theory

Techniques

Dropout

  • Dropout improves Recurrent Neural Networks for Handwriting Recognition. 2015
  • Recurrent Neural Network Regularization. 2015
  • Dropout: a simple way to prevent neural networks from overfitting. Srivastava, N., Hinton, G.E., Krizhevsky, A., Sutskever, I., & Salakhutdinov, R. (2014). Journal of Machine Learning Research, 15, 1929–1958. (See the sketch below.)
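
A minimal NumPy sketch of the technique from the Srivastava et al. (2014) paper above, in the common "inverted dropout" formulation (the paper itself instead rescales weights at test time); function and parameter names here are illustrative:

  import numpy as np

  def dropout(h, p_drop=0.5, train=True, rng=np.random):
      """Inverted dropout: zero each unit with probability p_drop during
      training and rescale survivors by 1/(1 - p_drop), so the forward
      pass needs no change at test time."""
      if not train or p_drop == 0.0:
          return h
      mask = (rng.uniform(size=h.shape) >= p_drop) / (1.0 - p_drop)
      return h * mask

  h = np.ones((4, 3))
  print(dropout(h))               # roughly half the units zeroed, survivors scaled by 2
  print(dropout(h, train=False))  # identity at test time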

Batch Normalization

Activation Function

Weight Initialization

The weight-initialization strategy currently recommended for neural networks is to sample values uniformly from the range $[-b, b]$, where \begin{equation} b=\sqrt{\frac{6}{H_k+H_{k+1}}} \end{equation} and $H_k$ and $H_{k+1}$ are the sizes of the hidden layers before and after the weight matrix.

Recommended by Hugo Larochelle; published by Glorot and Bengio (2010).
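
A minimal NumPy sketch of this rule (the function name is illustrative; fan_in and fan_out play the roles of $H_k$ and $H_{k+1}$):

  import numpy as np

  def glorot_uniform(fan_in, fan_out, rng=np.random):
      """Sample a (fan_in, fan_out) weight matrix uniformly from [-b, b],
      with b = sqrt(6 / (fan_in + fan_out)), per Glorot & Bengio (2010)."""
      b = np.sqrt(6.0 / (fan_in + fan_out))
      return rng.uniform(-b, b, size=(fan_in, fan_out))

  # Example: weights connecting a 256-unit layer to a 128-unit layer.
  W = glorot_uniform(256, 128)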

Gradient Checking

If you implement backpropagation by hand and it does not work, there is a 99% chance the bug is in the gradient computation; use gradient checking to locate it. The key idea is the definition of the gradient: how much does the model error change if we increase a given weight slightly? \begin{equation} \frac{\partial f(x)}{\partial x} \approx \frac{f(x+\varepsilon) -f(x-\varepsilon)}{2\varepsilon} \end{equation}
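
A minimal sketch of such a check in NumPy (function and parameter names are illustrative): compare the analytic gradient against the central difference above, one coordinate at a time, and report the worst relative error.

  import numpy as np

  def grad_check(f, grad_f, x, eps=1e-5, tol=1e-6):
      """Compare the analytic gradient grad_f(x) against the central-difference
      approximation (f(x+eps) - f(x-eps)) / (2*eps) for each coordinate of x."""
      analytic = grad_f(x)
      numeric = np.zeros_like(x)
      for i in range(x.size):
          x_plus, x_minus = x.copy(), x.copy()
          x_plus.flat[i] += eps
          x_minus.flat[i] -= eps
          numeric.flat[i] = (f(x_plus) - f(x_minus)) / (2 * eps)
      # Relative error is more robust than absolute error across scales.
      rel_err = np.abs(analytic - numeric) / np.maximum(
          1e-12, np.abs(analytic) + np.abs(numeric))
      return rel_err.max() < tol, rel_err.max()

  # Example: f(x) = sum(x**2) has gradient 2x, so the check should pass.
  ok, err = grad_check(lambda x: np.sum(x ** 2), lambda x: 2 * x,
                       np.random.randn(5))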

Optimization Methods

Hyperparameter Optimization

Training Tricks

Infrastructure

Tools

Framework

  • Deeppy: a highly extensible deep learning framework based on Theano.
  • Brainstorm: from IDSIA, the Swiss AI lab.
  • ConvNetJS: a JavaScript library for training neural networks in the browser, developed by Stanford PhD student Andrej Karpathy.
  • DeepDream
  • Idlf: the Intel Deep Learning Framework (IDLF) is an SDK library providing training and execution of deep neural networks. It includes APIs to build a neural network topology as a computational workflow, perform function-graph optimization, and execute it on hardware. The initial focus is object recognition (ImageNet topologies) on CPU (Xeon) and GPU (Gen); the API is designed so that more devices can easily be supported later, with the key principle of maximum performance on every Intel-supported platform.
  • Keras: a minimalist, highly modular neural network library written in Python that runs on top of TensorFlow or Theano. It was designed for fast experimentation, getting from idea to result with as little delay as possible, which is key to doing good research. (See the sketch after this list.)
  • Marvin: a C++ framework from the Princeton Vision Group.
  • Minerva
  • MXNetJS: a JavaScript package for DMLC/MXNet that brings state-of-the-art deep learning prediction APIs to the browser. It runs via Emscripten and Amalgamation, letting you run state-of-the-art deep learning prediction on the client side.
  • Neon: from the startup Nervana Systems.
  • Neural Style
  • OpenDeep: a deep learning framework for Python built on Theano, focused on flexibility and ease of use for industry data scientists and cutting-edge researchers. It is a modular, easily extensible architecture that can be used to build almost any neural network to solve your problem.
  • Purine
  • Reinforcejs: a reinforcement learning library that implements common RL algorithms with web demos. It currently includes dynamic programming methods, temporal-difference learning (SARSA/Q-learning), deep Q-learning, stochastic/deterministic policy gradients, and actor-critic architectures.
  • scikit-neuralnetwork: deep neural networks without a learning cliff. The library implements multi-layer perceptrons, auto-encoders, and recurrent neural networks behind a stable, future-proof interface that is compatible with the user-friendly scikit-learn Python API.
  • Sonnet: DeepMind's neural network library built on top of TensorFlow.
  • Theano-Lights: a research framework based on Theano providing implementations of several recent deep learning models, together with convenient training and testing functionality. The models are not hidden away but remain transparent and flexible for research and learning.
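
As a taste of the Keras style described in the list above, a minimal sketch of a feed-forward classifier (layer sizes, optimizer, and data shapes are arbitrary illustrations):

  from keras.models import Sequential
  from keras.layers import Dense

  # A small classifier: 100 input features, 10 output classes.
  model = Sequential()
  model.add(Dense(64, activation='relu', input_dim=100))
  model.add(Dense(10, activation='softmax'))
  model.compile(optimizer='sgd',
                loss='categorical_crossentropy',
                metrics=['accuracy'])
  # model.fit(x_train, y_train, epochs=10, batch_size=32)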

Dataset

Links

Tutorials

Courses

2017

2016

2015

2014

2013

2012

Links

Books

Groups

Conferences

  • ICLR
  • Deep Learning workshop at NIPS

Study

Links

References

2017

  • On the Origin of Deep Learning. Haohan Wang, Bhiksha Raj. 2017. https://128.84.21.199/abs/1702.07800
  • Alfredo Canziani, Adam Paszke, Eugenio Culurciello. An Analysis of Deep Neural Network Models for Practical Applications. https://arxiv.org/abs/1605.07678

2015

  • Spatial Transformer Networks
  • Semi-Supervised Learning with Ladder Networks
  • Neural Turing Machines
  • Deep Generative Image Models Using A Laplacian Pyramid Of Adversarial Networks
  • Natural Neural Networks
  • Early stopping is nonparametric variational inference
  • Dropout as a Bayesian approximation: Representing model uncertainty in deep learning
  • Dougal Maclaurin, David Duvenaud, Ryan P. Adams. Gradient-based Hyperparameter Optimization through Reversible Learning

2014

  • Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov. 2014. Dropout: A simple way to prevent neural networks from overfitting. The Journal of Machine Learning Research, 15(1):1929–1958.

2013

  • Yoshua Bengio et al., Representation Learning: A Review and New Perspectives. 2013 [survey]

2012

  • Hinton, G. E. et al. Improving neural networks by preventing co-adaptation of feature detectors. 2012 [Dropout]
  • L. Bottou, Stochastic gradient descent tricks, Neural Networks, Tricks of the Trade Reloaded, LNCS 2012.
  • Y. Bengio, Practical recommendations for gradient-based training of deep architectures, ArXiv 2012

2009

  • Bengio, Yoshua. Learning Deep Architectures for AI. Found. Trends Mach. Learn. 2009 [survey]

2007

  • Marc’Aurelio Ranzato, Christopher Poultney, Sumit Chopra and Yann LeCun, Efficient Learning of Sparse Representations with an Energy-Based Model, in J. Platt et al. (Eds), Advances in Neural Information Processing Systems (NIPS 2006), MIT Press, 2007
  • Yoshua Bengio, Pascal Lamblin, Dan Popovici and Hugo Larochelle, Greedy Layer-Wise Training of Deep Networks, in J. Platt et al. (Eds), Advances in Neural Information Processing Systems 19 (NIPS 2006), pp. 153-160, MIT Press, 2007

2006

  • Hinton, G. E. and Salakhutdinov, R. R. (2006). Reducing the dimensionality of data with neural networks. Science, 313:504-507.
  • Hinton, G. E., Osindero, S. and Teh, Y., A fast learning algorithm for deep belief nets Neural Computation 18:1527-1554, 2006

1998

  • Y. LeCun et al. Efficient BackProp, Neural Networks: Tricks of the Trade, 1998