{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# CS285 Fall 2019 Tensorflow Tutorial" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "This tutorial will provide a brief overview of the core concepts and functionality of Tensorflow. This tutorial will cover the following:\n", "\n", "0. What is Tensorflow\n", "1. How to input data\n", "2. How to perform computations\n", "3. How to create variables\n", "4. How to train a neural network for a simple regression problem\n", "5. Tips and tricks" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "# this just removes verbose warnings\n", "import os\n", "import warnings\n", "os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3' \n", "warnings.filterwarnings('ignore')" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import tensorflow as tf\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import matplotlib.cm as cm\n", "import matplotlib.patches as mpatches" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# for didactic purposes\n", "def tf_reset():\n", " try:\n", " sess.close()\n", " except:\n", " pass\n", " tf.reset_default_graph()\n", " return tf.Session()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 0. What is Tensorflow\n", "\n", "Tensorflow is a framework to define a series of computations. You define inputs, what operations should be performed, and then Tensorflow will compute the outputs for you.\n", "\n", "Below is a simple high-level example:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# create the session you'll work in\n", "# you can think of this as a \"blank piece of paper\" that you'll be writing math on\n", "sess = tf_reset()\n", "\n", "# define your inputs\n", "a = tf.constant(1.0)\n", "b = tf.constant(2.0)\n", "\n", "# do some operations\n", "c = a + b\n", "\n", "# get the result\n", "c_run = sess.run(c)\n", "\n", "print('c = {0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 1. How to input data\n", "\n", "Tensorflow has multiple ways for you to input data. One way is to have the inputs be constants:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# define your inputs\n", "a = tf.constant(1.0)\n", "b = tf.constant(2.0)\n", "\n", "# do some operations\n", "c = a + b\n", "\n", "# get the result\n", "c_run = sess.run(c)\n", "\n", "print('c = {0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "However, having our inputs be constants is inflexible. We want to be able to change what data we input at runtime. We can do this using placeholders:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# define your inputs\n", "a = tf.placeholder(dtype=tf.float32, shape=[1], name='a_placeholder')\n", "b = tf.placeholder(dtype=tf.float32, shape=[1], name='b_placeholder')\n", "\n", "# do some operations\n", "c = a + b\n", "\n", "# get the result\n", "c0_run = sess.run(c, feed_dict={a: [1.0], b: [2.0]})\n", "c1_run = sess.run(c, feed_dict={a: [2.0], b: [4.0]})\n", "\n", "print('c0 = {0}'.format(c0_run))\n", "print('c1 = {0}'.format(c1_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "But what if we don't know the size of our input beforehand? 
One dimension of a tensor is allowed to be 'None', which means it can be variable sized:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# inputs\n", "a = tf.placeholder(dtype=tf.float32, shape=[None], name='a_placeholder')\n", "b = tf.placeholder(dtype=tf.float32, shape=[None], name='b_placeholder')\n", "\n", "# do some operations\n", "c = a + b\n", "\n", "# get outputs\n", "c0_run = sess.run(c, feed_dict={a: [1.0], b: [2.0]})\n", "c1_run = sess.run(c, feed_dict={a: [1.0, 2.0], b: [2.0, 4.0]})\n", "\n", "print(a)\n", "print('a shape: {0}'.format(a.get_shape()))\n", "print(b)\n", "print('b shape: {0}'.format(b.get_shape()))\n", "print('c0 = {0}'.format(c0_run))\n", "print('c1 = {0}'.format(c1_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 2. How to perform computations\n", "\n", "Now that we can input data, we want to perform useful computations on the data." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's create some data to work with:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# inputs\n", "a = tf.constant([[-1.], [-2.], [-3.]], dtype=tf.float32)\n", "b = tf.constant([[1., 2., 3.]], dtype=tf.float32)\n", "\n", "a_run, b_run = sess.run([a, b])\n", "print('a:\\n{0}'.format(a_run))\n", "print('b:\\n{0}'.format(b_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can do simple operations, such as addition:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = b + b\n", "\n", "c_run = sess.run(c)\n", "print('b:\\n{0}'.format(b_run))\n", "print('c:\\n{0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Be careful about the dimensions of the tensors; some operations may work even when you think they shouldn't..."
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = a + b\n", "\n", "c_run = sess.run(c)\n", "print('a:\\n{0}'.format(a_run))\n", "print('b:\\n{0}'.format(b_run))\n", "print('c:\\n{0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Also, some operations may be different than what you expect:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c_elementwise = a * b\n", "c_matmul = tf.matmul(b, a)\n", "\n", "c_elementwise_run, c_matmul_run = sess.run([c_elementwise, c_matmul])\n", "print('a:\\n{0}'.format(a_run))\n", "print('b:\\n{0}'.format(b_run))\n", "print('c_elementwise:\\n{0}'.format(c_elementwise_run))\n", "print('c_matmul: \\n{0}'.format(c_matmul_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Operations can be chained together:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# operations can be chained together\n", "c0 = b + b\n", "c1 = c0 + 1\n", "\n", "c0_run, c1_run = sess.run([c0, c1])\n", "print('b:\\n{0}'.format(b_run))\n", "print('c0:\\n{0}'.format(c0_run))\n", "print('c1:\\n{0}'.format(c1_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Finally, Tensorflow has many useful built-in operations:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c = tf.reduce_mean(b)\n", "\n", "c_run = sess.run(c)\n", "print('b:\\n{0}'.format(b_run))\n", "print('c:\\n{0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 3. How to create variables\n", "\n", "Now that we can input data and perform computations, we want some of these operations to involve variables that are free parameters, and can be trained using an optimizer (e.g., gradient descent)." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, let's create some data to work with:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# inputs\n", "b = tf.constant([[1., 2., 3.]], dtype=tf.float32)\n", "\n", "sess = tf.Session()\n", "\n", "b_run = sess.run(b)\n", "print('b:\\n{0}'.format(b_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We'll now create a variable" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "var_init_value = [[2.0, 4.0, 6.0]]\n", "var = tf.get_variable(name='myvar',\n", " shape=[1, 3],\n", " dtype=tf.float32,\n", " initializer=tf.constant_initializer(var_init_value))\n", "\n", "print(var)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and check that it's been added to Tensorflow's variables list:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "print(tf.global_variables())" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "We can do operations with the variable just like any other tensor:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# can do operations\n", "c = b + var\n", "print(b)\n", "print(var)\n", "print(c)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Before we can run any of these operations, we must first initalize the variables" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "init_op = tf.global_variables_initializer()\n", "sess.run(init_op)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "and then we can run the operations just as we normally would." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "c_run = sess.run(c)\n", "\n", "print('b:\\n{0}'.format(b_run))\n", "print('var:\\n{0}'.format(var_init_value))\n", "print('c:\\n{0}'.format(c_run))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "So far we haven't said yet how to optimize these variables. We'll cover that next in the context of an example." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 4. How to train a neural network for a simple regression problem\n", "\n", "We've discussed how to input data, perform operations, and create variables. We'll now show how to combine all of these---with some minor additions---to train a neural network on a simple regression problem." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "First, we'll create data for a 1-dimensional regression problem:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# generate the data\n", "inputs = np.linspace(-2*np.pi, 2*np.pi, 10000)[:, None]\n", "outputs = np.sin(inputs) + 0.05 * np.random.normal(size=[len(inputs),1])\n", "\n", "plt.scatter(inputs[:, 0], outputs[:, 0], s=0.1, color='k', marker='o')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The below code creates the inputs, variables, neural network operations, mean-squared-error loss, gradient descent optimizer, and runs the optimizer using minibatches of the data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "def create_model():\n", " # create inputs\n", " input_ph = tf.placeholder(dtype=tf.float32, shape=[None, 1])\n", " output_ph = tf.placeholder(dtype=tf.float32, shape=[None, 1])\n", "\n", " # create variables\n", " W0 = tf.get_variable(name='W0', shape=[1, 20], initializer=tf.contrib.layers.xavier_initializer())\n", " W1 = tf.get_variable(name='W1', shape=[20, 20], initializer=tf.contrib.layers.xavier_initializer())\n", " W2 = tf.get_variable(name='W2', shape=[20, 1], initializer=tf.contrib.layers.xavier_initializer())\n", "\n", " b0 = tf.get_variable(name='b0', shape=[20], initializer=tf.constant_initializer(0.))\n", " b1 = tf.get_variable(name='b1', shape=[20], initializer=tf.constant_initializer(0.))\n", " b2 = tf.get_variable(name='b2', shape=[1], initializer=tf.constant_initializer(0.))\n", "\n", " weights = [W0, W1, W2]\n", " biases = [b0, b1, b2]\n", " activations = [tf.nn.relu, tf.nn.relu, None]\n", "\n", " # create computation graph\n", " layer = input_ph\n", " for W, b, activation in zip(weights, biases, activations):\n", " layer = tf.matmul(layer, W) + b\n", " if activation is not None:\n", " layer = activation(layer)\n", " output_pred = layer\n", " \n", " return input_ph, output_ph, output_pred\n", " \n", "input_ph, output_ph, output_pred = create_model()\n", " \n", "# create loss\n", "mse = tf.reduce_mean(0.5 * tf.square(output_pred - output_ph))\n", "\n", "# create optimizer\n", "opt = tf.train.AdamOptimizer().minimize(mse)\n", "\n", "# initialize variables\n", "sess.run(tf.global_variables_initializer())\n", "# create saver to save model variables\n", "saver = tf.train.Saver()\n", "\n", "# run training\n", "batch_size = 32\n", "for training_step in range(10000):\n", " # get a random subset of the training data\n", " indices = np.random.randint(low=0, high=len(inputs), size=batch_size)\n", " input_batch = inputs[indices]\n", " output_batch = outputs[indices]\n", " \n", " # run the optimizer and get the mse\n", " _, mse_run = sess.run([opt, mse], feed_dict={input_ph: input_batch, output_ph: output_batch})\n", " \n", " # print the mse every so often\n", " if training_step % 1000 == 0:\n", " print('{0:04d} mse: {1:.3f}'.format(training_step, mse_run))\n", " saver.save(sess, '/tmp/model.ckpt')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now that the neural network is trained, we can use it to make predictions:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# create the model\n", "input_ph, output_ph, output_pred = create_model()\n", "\n", "# restore the saved model\n", "saver = tf.train.Saver()\n", "saver.restore(sess, \"/tmp/model.ckpt\")\n", "\n", "output_pred_run = sess.run(output_pred, feed_dict={input_ph: inputs})\n", "\n", "plt.scatter(inputs[:, 0], outputs[:, 0], c='k', marker='o', s=0.1)\n", "plt.scatter(inputs[:, 0], output_pred_run[:, 0], c='r', marker='o', s=0.1)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Not so hard after all! There is much more functionality to Tensorflow besides what we've covered, but you now know the basics." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# 5. 
Tips and tricks" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (a) Check your dimensions" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "# example of \"surprising\" resulting dimensions due to broadcasting\n", "a = tf.constant(np.random.random((4, 1)))\n", "b = tf.constant(np.random.random((1, 4)))\n", "c = a * b\n", "assert c.get_shape() == (4, 4)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (b) Check what variables have been created" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "a = tf.get_variable('I_am_a_variable', shape=[4, 6])\n", "b = tf.get_variable('I_am_a_variable_too', shape=[2, 7])\n", "for var in tf.global_variables():\n", " print(var.name)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (c) Look at the [tensorflow API](https://www.tensorflow.org/api_docs/python/), or open up a python terminal and investigate!" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(tf.reduce_mean)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (d) Tensorflow has some built-in layers to simplify your code." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "help(tf.contrib.layers.fully_connected)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (e) Use [variable scope](https://www.tensorflow.org/guide/variables#sharing_variables) to keep your variables organized." ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "sess = tf_reset()\n", "\n", "# create variables\n", "with tf.variable_scope('layer_0'):\n", " W0 = tf.get_variable(name='W0', shape=[1, 20], initializer=tf.contrib.layers.xavier_initializer())\n", " b0 = tf.get_variable(name='b0', shape=[20], initializer=tf.constant_initializer(0.))\n", "\n", "with tf.variable_scope('layer_1'):\n", " W1 = tf.get_variable(name='W1', shape=[20, 20], initializer=tf.contrib.layers.xavier_initializer())\n", " b1 = tf.get_variable(name='b1', shape=[20], initializer=tf.constant_initializer(0.))\n", " \n", "with tf.variable_scope('layer_2'):\n", " W2 = tf.get_variable(name='W2', shape=[20, 1], initializer=tf.contrib.layers.xavier_initializer())\n", " b2 = tf.get_variable(name='b2', shape=[1], initializer=tf.constant_initializer(0.))\n", "\n", "# print the variables\n", "var_names = sorted([v.name for v in tf.global_variables()])\n", "print('\\n'.join(var_names))" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (f) You can specify which GPU you want to use and how much memory you want to use" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "gpu_device = 0\n", "gpu_frac = 0.5\n", "\n", "# make only one of the GPUs visible\n", "import os\n", "os.environ[\"CUDA_VISIBLE_DEVICES\"] = str(gpu_device)\n", "\n", "# only use part of the GPU memory\n", "gpu_options = tf.GPUOptions(per_process_gpu_memory_fraction=gpu_frac)\n", "config = tf.ConfigProto(gpu_options=gpu_options)\n", "\n", "# create the session\n", "tf_sess = tf.Session(graph=tf.Graph(), config=config)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "##### (g) You can use [tensorboard](https://www.tensorflow.org/guide/summaries_and_tensorboard) to visualize and monitor the training process." 
] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.6" } }, "nbformat": 4, "nbformat_minor": 2 }