{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 2.5 Neural Network Learning"
   ]
  },
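  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "A neural network learns by imitating the way neurons in the human brain are connected. Through layer-by-layer abstraction, it maps input data to high-level semantics such as concepts."
   ]
  },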
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5.1 Neural Mechanisms of the Human Brain"
   ]
  },
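  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "When a person recognizes an image, the visual system first extracts edge features, then identifies parts, and finally arrives at the highest-level pattern. In other words, high-level features are combinations of low-level features: moving from lower to higher layers, the representation becomes more abstract and expresses more of the semantics."
   ]
  },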
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"http://imgbed.momodel.cn//20200103102429.png\" width=300>"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5.2 The Perceptron Model"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**The perceptron model**:\n",
    "\n",
    "<img src=\"http://imgbed.momodel.cn/感知器模型.png\" width=300/>"
   ]
  },
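  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Inputs**: three, $x_1,x_2,x_3$  \n",
    "**Neuron**: one, drawn as a circle  \n",
    "**Weights**: each input is connected to the neuron through a weight (for example, $w_i$ is the weight connecting $x_i$ to the neuron)  \n",
    "**Output**: one\n",
    "\n",
    "\n",
    "**How it works**:\n",
    "+ Compute the weighted sum of the inputs passed to the neuron: $y_{sum} = w_1x_1+w_2x_2+w_3x_3$\n",
    "+ If $y_{sum}$ exceeds a preset threshold (for example, 0.5), the output is 1; otherwise it is 0.\n"
   ]
  },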
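  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The output does not have to be decided by a simple threshold comparison. It can instead be computed by a function, called the activation function. Common activation functions include sigmoid, tanh, and ReLU. The following cells plot the curve of each one."
   ]
  },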
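  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, the three activation functions named above are usually defined as follows. These are the standard textbook formulas, not something specific to this notebook:\n",
    "\n",
    "$$\\sigma(x) = \\frac{1}{1 + e^{-x}}, \\qquad \\tanh(x) = \\frac{e^{x} - e^{-x}}{e^{x} + e^{-x}}, \\qquad \\mathrm{ReLU}(x) = \\max(0, x)$$\n",
    "\n",
    "Sigmoid maps any real input into $(0, 1)$, tanh maps it into $(-1, 1)$, and ReLU passes positive inputs through unchanged while setting negative inputs to 0."
   ]
  },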
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "\n",
    "\n",
    "def plot_activation_function(activation_function):\n",
    "    \"\"\"\n",
    "    Plot an activation function.\n",
    "    :param activation_function: the activation function, a callable that accepts an np.array\n",
    "    :return: None\n",
    "    \"\"\"\n",
    "    x = np.arange(-10, 10, 0.1)\n",
    "    y_activation_function = activation_function(x)\n",
    "\n",
    "    # Draw the axes so that they cross at the origin\n",
    "    ax = plt.gca()\n",
    "    ax.spines['right'].set_color('none')\n",
    "    ax.spines['top'].set_color('none')\n",
    "    ax.xaxis.set_ticks_position('bottom')\n",
    "    ax.yaxis.set_ticks_position('left')\n",
    "    ax.spines['bottom'].set_position(('data', 0))\n",
    "    ax.spines['left'].set_position(('data', 0))\n",
    "\n",
    "    # Plot the curve\n",
    "    plt.plot(x, y_activation_function)\n",
    "\n",
    "    # Show the figure\n",
    "    plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def sigmoid(x):\n",
    "    \"\"\"\n",
    "    Sigmoid function.\n",
    "    :param x: input data as an np.array\n",
    "    :return: sigmoid applied element-wise to x\n",
    "    \"\"\"\n",
    "    return 1 / (1 + np.exp(-x))\n",
    "\n",
    "# Plot the sigmoid function\n",
    "plot_activation_function(sigmoid)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def tanh(x):\n",
    "    \"\"\"\n",
    "    Tanh function.\n",
    "    :param x: input data as an np.array\n",
    "    :return: tanh applied element-wise to x\n",
    "    \"\"\"\n",
    "    return (np.exp(x) - np.exp(-x)) / (np.exp(x) + np.exp(-x))\n",
    "\n",
    "# Plot the tanh function\n",
    "plot_activation_function(tanh)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def relu(x):\n",
    "    \"\"\"\n",
    "    ReLU function.\n",
    "    :param x: input data as an np.array\n",
    "    :return: ReLU applied element-wise to x\n",
    "    \"\"\"\n",
    "    # Keep positive values and replace negative values with 0\n",
    "    return np.maximum(x, 0)\n",
    "\n",
    "# Plot the ReLU function\n",
    "plot_activation_function(relu)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Based on the definition above, we can write a simple perceptron model."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def perceptron(x, w, threshold):\n",
    "    \"\"\"\n",
    "    Perceptron model.\n",
    "    :param x: input data, array-like\n",
    "    :param w: weights, array-like, matched one-to-one with x\n",
    "    :param threshold: threshold\n",
    "    :return: 0 or 1\n",
    "    \"\"\"\n",
    "    x = np.array(x)\n",
    "    w = np.array(w)\n",
    "    y_sum = np.sum(w * x)\n",
    "    # Return 1 if the weighted sum exceeds the threshold, otherwise 0\n",
    "    return 1 if y_sum > threshold else 0\n",
    "\n",
    "\n",
    "# Input data\n",
    "x = np.array([1, 1, 4])\n",
    "# Weights\n",
    "w = np.array([0.5, 0.2, 0.3])\n",
    "# The weighted sum is 0.5 + 0.2 + 1.2 = 1.9, which exceeds 0.8, so the result is 1\n",
    "perceptron(x, w, 0.8)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5.3 Neural Networks"
   ]
  },
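  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "<img src=\"http://imgbed.momodel.cn//20200103111837.png\" width=400>\n",
    "\n",
    "Unlike a perceptron, a neural network:\n",
    "+ Has one or more hidden layers between the input layer and the output layer.\n",
    "+ Has multiple neurons in each hidden layer.\n"
   ]
  },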
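  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough sketch of what each layer computes, a layer applies the perceptron's weighted sum to all of its neurons at once and then passes the result through an activation function. The symbols $W^{(l)}$, $b^{(l)}$, $h^{(l)}$, and $f$ are notation introduced here for illustration, and the bias term $b^{(l)}$ is an assumption that does not appear in the perceptron description above:\n",
    "\n",
    "$$h^{(l)} = f\\left(W^{(l)} h^{(l-1)} + b^{(l)}\\right), \\qquad h^{(0)} = x$$\n",
    "\n",
    "Row $i$ of $W^{(l)}$ holds the weights of neuron $i$ in layer $l$, and $b^{(l)}$ plays the role of a learnable threshold: comparing a weighted sum with a threshold $t$ is the same as adding a bias of $-t$ and comparing with 0."
   ]
  },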
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 2.5.4 Building a Neural Network"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "We use the **Keras** framework to build a neural network for handwritten digit recognition.  \n",
    "1. Import the required packages"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import keras\n",
    "from keras.datasets import mnist\n",
    "from keras.models import Sequential\n",
    "from keras.layers.core import Dense,Activation,Dropout\n",
    "from keras.utils import np_utils\n",
    "import warnings\n",
    "warnings.filterwarnings(\"ignore\")\n",
    "!mkdir -p ~/.keras/datasets\n",
    "!cp ./mnist.npz ~/.keras/datasets/mnist.npz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Load the **MNIST** dataset and convert it into a format the model can use."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Load the data\n",
    "(X_train, y_train), (X_test, y_test) = mnist.load_data()\n",
    "\n",
    "# Flatten each 28x28 image: X_train goes from (60000, 28, 28) to (60000, 784)\n",
    "X_train = X_train.reshape(len(X_train), -1)\n",
    "X_test = X_test.reshape(len(X_test), -1)\n",
    "\n",
    "# Convert the pixel values from uint8 to float32\n",
    "X_train = X_train.astype('float32')\n",
    "X_test = X_test.astype('float32')\n",
    "\n",
    "# Scale the pixel values from 0-255 to [-1, 1]\n",
    "X_train = (X_train - 127.5) / 127.5\n",
    "X_test = (X_test - 127.5) / 127.5\n",
    "\n",
    "# Number of classes in the dataset\n",
    "nb_classes = 10\n",
    "\n",
    "# Convert y_train and y_test to one-hot form: each label, previously an integer 0-9,\n",
    "# becomes a length-10 vector with 1 at the index of its digit and 0 everywhere else.\n",
    "y_train = np_utils.to_categorical(y_train, nb_classes)\n",
    "y_test = np_utils.to_categorical(y_test, nb_classes)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. Build the neural network model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_model():\n",
    "    \"\"\"\n",
    "    Build a neural network model with Keras.\n",
    "    :return: the neural network model\n",
    "    \"\"\"\n",
    "    # Use the Sequential model, a linear stack of layers\n",
    "    model = Sequential()\n",
    "\n",
    "    # Add a fully connected layer with 512 neurons\n",
    "    model.add(Dense(512, input_shape=(784,), kernel_initializer='he_normal'))\n",
    "\n",
    "    # Add an activation layer using relu\n",
    "    model.add(Activation('relu'))\n",
    "\n",
    "    # Add a fully connected layer with 512 neurons\n",
    "    model.add(Dense(512, kernel_initializer='he_normal'))\n",
    "\n",
    "    # Add an activation layer using relu\n",
    "    model.add(Activation('relu'))\n",
    "\n",
    "    # Add a fully connected layer with 10 neurons, one per class\n",
    "    model.add(Dense(nb_classes))\n",
    "\n",
    "    # Add an activation layer using softmax\n",
    "    model.add(Activation('softmax'))\n",
    "\n",
    "    return model\n",
    "\n",
    "model = create_model()"
   ]
  },
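  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick sanity check on the size of this network, the number of trainable parameters can be worked out by hand. The count below assumes each `Dense` layer keeps its default bias term, so a layer has one weight per input-neuron pair plus one bias per neuron:\n",
    "\n",
    "+ First hidden layer: $784 \\times 512 + 512 = 401920$\n",
    "+ Second hidden layer: $512 \\times 512 + 512 = 262656$\n",
    "+ Output layer: $512 \\times 10 + 10 = 5130$\n",
    "\n",
    "Total: $401920 + 262656 + 5130 = 669706$ parameters, which should match the total reported by `model.summary()`."
   ]
  },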
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "4. Train and test the neural network model"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def fit_and_predict(model, model_path):\n",
    "    \"\"\"\n",
    "    Train the model, save it, and evaluate it on the test set.\n",
    "    :param model: the model to train\n",
    "    :param model_path: path where the model is saved\n",
    "    :return: None\n",
    "    \"\"\"\n",
    "    # Compile the model\n",
    "    model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])\n",
    "\n",
    "    # Train the model, holding out 5% of the training data for validation\n",
    "    model.fit(X_train, y_train, epochs=5, batch_size=64, verbose=1, validation_split=0.05)\n",
    "\n",
    "    # Save the model\n",
    "    model.save(model_path)\n",
    "\n",
    "    # Evaluate the model: get the loss and accuracy on the test set\n",
    "    loss, accuracy = model.evaluate(X_test, y_test)\n",
    "\n",
    "    # Print the results\n",
    "    print('Test loss:', loss)\n",
    "    print(\"Accuracy:\", accuracy)\n",
    "\n",
    "# Train and evaluate the model\n",
    "fit_and_predict(model, model_path='./model.h5')"
   ]
  },
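  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For reference, the `softmax` output layer and the `categorical_crossentropy` loss used above follow the standard definitions below. Here $z_k$ denotes the raw output of the $k$-th neuron in the last `Dense` layer and $y_k$ the $k$-th entry of the one-hot label; both symbols are notation introduced here:\n",
    "\n",
    "$$p_k = \\frac{e^{z_k}}{\\sum_{j=1}^{10} e^{z_j}}, \\qquad L = -\\sum_{k=1}^{10} y_k \\log p_k$$\n",
    "\n",
    "Because $y$ is one-hot, the loss for a single image reduces to $-\\log p_c$, where $c$ is the true digit. The reported loss is the average of this value over the images."
   ]
  },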
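  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Practice and Exploration\n",
    "#### Tuning the network structure and parameters\n",
    "\n",
    "1. Reduce the two hidden layers to one, train the model, and evaluate it on the test set to obtain the accuracy.\n"
   ]
  },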
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_model1():\n",
    "    \"\"\"\n",
    "    Build the neural network model model1, which has one hidden layer fewer than model.\n",
    "    :return: the model model1\n",
    "    \"\"\"\n",
    "    # Use the Sequential model, a linear stack of layers\n",
    "    model = Sequential()\n",
    "\n",
    "    # Add a fully connected layer with 512 neurons\n",
    "    model.add(Dense(512, input_shape=(784,), kernel_initializer='he_normal'))\n",
    "\n",
    "    # Add an activation layer using relu\n",
    "    model.add(Activation('relu'))\n",
    "\n",
    "    # Add a fully connected layer with 10 neurons, one per class\n",
    "    model.add(Dense(nb_classes))\n",
    "\n",
    "    # Add an activation layer using softmax\n",
    "    model.add(Activation('softmax'))\n",
    "\n",
    "    return model\n",
    "\n",
    "# Build the neural network\n",
    "model1 = create_model1()\n",
    "\n",
    "# Train the model, save it, and evaluate it\n",
    "fit_and_predict(model1, model_path='./model1.h5')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "2. Change the number of neurons in the two hidden layers, then train the model to obtain the accuracy."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "def create_model2():\n",
    "    \"\"\"\n",
    "    Build the neural network model model2, whose hidden layers have half as many neurons as those of model.\n",
    "    :return: the neural network model model2\n",
    "    \"\"\"\n",
    "    # Use the Sequential model, a linear stack of layers\n",
    "    model = Sequential()\n",
    "\n",
    "    # Add a fully connected layer with 256 neurons\n",
    "    model.add(Dense(256, input_shape=(784,), kernel_initializer='he_normal'))\n",
    "\n",
    "    # Add an activation layer using relu\n",
    "    model.add(Activation('relu'))\n",
    "\n",
    "    # Add a fully connected layer with 256 neurons\n",
    "    model.add(Dense(256, kernel_initializer='he_normal'))\n",
    "\n",
    "    # Add an activation layer using relu\n",
    "    model.add(Activation('relu'))\n",
    "\n",
    "    # Add a fully connected layer with 10 neurons, one per class\n",
    "    model.add(Dense(nb_classes))\n",
    "\n",
    "    # Add an activation layer using softmax\n",
    "    model.add(Activation('softmax'))\n",
    "\n",
    "    return model\n",
    "\n",
    "# Build the neural network model\n",
    "model2 = create_model2()\n",
    "\n",
    "# Train the model, save it, and evaluate it\n",
    "fit_and_predict(model2, model_path='./model2.h5')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "3. Feed in one handwritten digit, compare the outputs of the three models, and analyze and explain the differences."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "# Print small probabilities in fixed-point rather than scientific notation\n",
    "np.set_printoptions(suppress=True)\n",
    "from keras.models import load_model\n",
    "\n",
    "# Load the saved models\n",
    "model = load_model('./model.h5')\n",
    "model1 = load_model('./model1.h5')\n",
    "model2 = load_model('./model2.h5')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Predict the class probabilities for the first test image only\n",
    "predict_results = np.round(model.predict(X_test[:1])[0], 3)\n",
    "predict_results1 = np.round(model1.predict(X_test[:1])[0], 3)\n",
    "predict_results2 = np.round(model2.predict(X_test[:1])[0], 3)\n",
    "\n",
    "# Print the predictions\n",
    "print('Original model\\nClass probabilities: %s, predicted: %s, true: %s\\n' % (predict_results, np.argmax(predict_results), np.argmax(y_test[0])))\n",
    "print('Model with a single hidden layer\\nClass probabilities: %s, predicted: %s, true: %s\\n' % (predict_results1, np.argmax(predict_results1), np.argmax(y_test[0])))\n",
    "print('Model with fewer hidden neurons\\nClass probabilities: %s, predicted: %s, true: %s' % (predict_results2, np.argmax(predict_results2), np.argmax(y_test[0])))"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.5.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
