diff --git a/homework2/Homework2.ipynb b/homework2/Homework2.ipynb
new file mode 100644
index 0000000..1c50bf3
--- /dev/null
+++ b/homework2/Homework2.ipynb
@@ -0,0 +1,213 @@
+{
+  "nbformat": 4,
+  "nbformat_minor": 0,
+  "metadata": {
+    "colab": {
+      "name": "Homework2.ipynb",
+      "provenance": [],
+      "collapsed_sections": []
+    },
+    "kernelspec": {
+      "name": "python3",
+      "display_name": "Python 3"
+    }
+  },
+  "cells": [
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "tkKY6us_cCg4",
+        "colab_type": "text"
+      },
+      "source": [
+        "# Homework 2\n",
+        "\n",
+        "**For the exercises in the week of 20-25.11.19**\n",
+        "\n",
+        "**Points: 6 + 1b**\n",
+        "\n",
+        "Please solve the problems at home and bring to class a [declaration form](http://ii.uni.wroc.pl/~jmi/Dydaktyka/misc/kupony-klasyczne.pdf) to indicate which problems you are willing to present on the blackboard.\n",
+        "\n"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "8rdqrcVvNTv0",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Problem 1 [1p]\n",
+        "\n",
+        "Let $(x^{(i)},y^{(i)})$ be a data sample with $x^{(i)}\in\mathbb{R}^D$, $y^{(i)}\in\mathbb{R}$. Let $\Theta\in\mathbb{R}^D$ be a parameter vector.\n",
+        "\n",
+        "Find the closed-form solution $\Theta^*$ to\n",
+        "\n",
+        "$$\n",
+        "\min_\Theta \left(\frac{1}{2}\sum_i (\Theta^Tx^{(i)} - y^{(i)})^2 + \frac{\lambda}{2}\sum_{d=1}^D \Theta_d^2\right).\n",
+        "$$"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "4PMPpxChQHnR",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Problem 2 [1p]\n",
+        "\n",
+        "Let $v\in\mathbb{R}^D$ be a vector. Define the gradient of $f(v)\in\mathbb{R}$ with respect to $v$ to be $\frac{\partial f}{\partial v} = \left[\frac{\partial f(v)}{\partial v_1}, \frac{\partial f(v)}{\partial v_2}, ..., \frac{\partial f(v)}{\partial v_D}\right]$.\n",
+        "\n",
+        "Find the following functions' gradients with respect to the vector $[x, y, z]^T$:\n",
+        "1. $f_1([x, y, z]^T) = x + y$\n",
+        "2. $f_2([x, y, z]^T) = xy$\n",
+        "3. $f_3([x, y, z]^T) = x^2y^2$\n",
+        "4. $f_4([x, y, z]^T) = (x + y)^2$\n",
+        "5. $f_5([x, y, z]^T) = x^4 + x^2 y z + x y^2 z + z^4$\n",
+        "6. $f_6([x, y, z]^T) = e^{x + 2y}$\n",
+        "7. $f_7([x, y, z]^T) = \frac{1}{x y^2}$\n",
+        "8. $f_8([x, y, z]^T) = ax + by + c$\n",
+        "9. $f_9([x, y, z]^T) = \tanh(ax + by + c)$"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "TH9nPvyCMu27",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Problem 3 [0.5p]\n",
+        "\n",
+        "Find the following functions' gradients or Jacobians with respect to the vector $\mathbf{x}$, where $\mathbf{x}, \mathbf{b} \in \mathbb{R}^{n}$, $\mathbf{W} \in \mathbb{R}^{n \times n}$:\n",
+        "\n",
+        "1. $\mathbf{W} \mathbf{x} + \mathbf{b}$\n",
+        "2. $\mathbf{x}^T \mathbf{W} \mathbf{x}$"
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "8sYUtEqUQWYC",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Problem 4 [1p]\n",
+        "\n",
+        "Find the derivative with respect to $\mathbf{x}$ of $-\log(S(\mathbf{x})_j)$, where $S$ is the softmax function (https://en.wikipedia.org/wiki/Softmax_function) and $S(\mathbf{x})_j$ denotes the $j$-th output of the softmax."
+      ]
+    },
+    {
+      "cell_type": "markdown",
+      "metadata": {
+        "id": "w_Nd4O2PP5mq",
+        "colab_type": "text"
+      },
+      "source": [
+        "## Problem 5 [0.5p]\n",
+        "\n",
+        "Consider a dataset with 400 examples of class C1 and 400 of class C2.\n",
+        "Let tree A have 2 leaves with class distributions:\n",
+        "\n",
+        "| Tree A | C1 | C2 |\n",
+        "|----------|-------|-----|\n",
+        "| Leaf 1 | 100 | 300 |\n",
+        "| Leaf 2 | 300 | 100 |\n",
+        "\n",
+        "and let tree B have 2 leaves with class distributions:\n",
+        "\n",
+        "| Tree B | C1 | C2 |\n",
+        "|----------|-------|-----|\n",
+        "| Leaf 1 | 200 | 400 |\n",
+        "| Leaf 2 | 200 | 0 |\n",
+        "\n",
+        "What is the misclassification rate of each tree? Which tree is purer according to the Gini impurity or information gain?"
+      ]
+    },
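+    {
+      "cell_type": "markdown",
+      "metadata": {},
+      "source": [
+        "*A minimal sketch for checking Problem 5 numerically (not part of the graded assignment): it computes the misclassification rate and the size-weighted Gini and entropy impurities of both trees. The helper names below are our own.*"
+      ]
+    },
+    {
+      "cell_type": "code",
+      "metadata": {},
+      "execution_count": null,
+      "outputs": [],
+      "source": [
+        "import numpy as np\n",
+        "\n",
+        "def misclassification(leaves):\n",
+        "    # Each leaf predicts its majority class; rate = misclassified / total.\n",
+        "    total = sum(sum(leaf) for leaf in leaves)\n",
+        "    return sum(sum(leaf) - max(leaf) for leaf in leaves) / total\n",
+        "\n",
+        "def weighted_impurity(leaves, impurity):\n",
+        "    # Impurity of a split: leaf impurities weighted by leaf size.\n",
+        "    total = sum(sum(leaf) for leaf in leaves)\n",
+        "    return sum(sum(leaf) / total * impurity(np.array(leaf) / sum(leaf))\n",
+        "               for leaf in leaves)\n",
+        "\n",
+        "def gini(p):\n",
+        "    return 1.0 - np.sum(p ** 2)\n",
+        "\n",
+        "def entropy(p):\n",
+        "    # Information gain compares the parent entropy to this weighted child entropy.\n",
+        "    p = p[p > 0]  # 0 * log(0) = 0 by convention\n",
+        "    return -np.sum(p * np.log2(p))\n",
+        "\n",
+        "trees = {'A': [(100, 300), (300, 100)], 'B': [(200, 400), (200, 0)]}\n",
+        "for name, leaves in trees.items():\n",
+        "    print(name,\n",
+        "          misclassification(leaves),\n",
+        "          weighted_impurity(leaves, gini),\n",
+        "          weighted_impurity(leaves, entropy))"
+      ]
+    },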
\n", + "Let tree A have 2 leaves with class distributions:\n", + "\n", + "| Tree A | C1 | C2 |\n", + "|----------|-------|-----|\n", + "| Leaf 1 | 100 | 300 |\n", + "| Leaf 2 | 300 | 100 |\n", + "\n", + "and let tree B have 2 leaves with class distribution:\n", + "\n", + "| Tree B | C1 | C2 |\n", + "|----------|-------|-----|\n", + "| Leaf 1 | 200 | 400 |\n", + "| Leaf 2 | 200 | 0 |\n", + "\n", + "What is the misclassification rate for both trees? Which tree is more pure according to Gini or Infogain?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "X0BwK5qnaxav", + "colab_type": "text" + }, + "source": [ + "## Problem 6 [1p]\n", + "\n", + "Consider regresion problem, with $M$ predictors $h_m(x)$ trained to aproximate a target $y$. Define the error to be $\\epsilon_m(x) = h_m(x) - y$.\n", + "\n", + "Suppose you train $M$ independent classifiers with average least squares error\n", + "$$\n", + "E_{AV} = \\frac{1}{M}\\sum_{m=1}^M \\mathbb{E}_{x}[\\epsilon_m(x)^2].\n", + "$$\n", + "\n", + "Further assume that the errors have zero mean and are uncorrelated:\n", + "$$\n", + "\\mathbb{E}_{x}[\\epsilon_m(x)] = 0\\qquad\\text{ and }\\qquad\\mathbb{E}_{x}[\\epsilon_m(x)\\epsilon_l(x)] = 0\\text{ for } m \\neq l\n", + "$$\n", + "\n", + "Let the mean predictor be\n", + "$$\n", + "h_M(x) = \\frac{1}{M}h_m(x).\n", + "$$\n", + "\n", + "What is the average error of $h_M(x)$?" + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "kVFSEXy_g_fo", + "colab_type": "text" + }, + "source": [ + "## Problem 7 [1p]\n", + "\n", + "Suppose you work on a binary classification problem and train 3 weak classifiers. You combine their prediction by voting. \n", + "\n", + "Can the training error rate of the voting ensemble smaller that the error rate of the individual weak predictors? Can it be larger? Show an example or prove infeasibility." + ] + }, + { + "cell_type": "markdown", + "metadata": { + "id": "8IzGTnZuMxTB", + "colab_type": "text" + }, + "source": [ + "## Problem 8 [1 bonus point]\n", + "\n", + "While on a walk, you notice that a locomotive has the serial number 50. Assuming that all locomotives used by PKP (the Polish railroad operator) are numbered using consecutive natural numbers, what is your estimate of $N$ the total number of locomotives operated by PKP?\n", + "\n", + "Tell why the Maximum Likelihood principle may not yield satisfactory results. \n", + "\n", + "Use the Bayesian approach to find the posterior distribution over\n", + " the number of locomotives. Then compute the expected count of\n", + " locomotives. For the prior use the power law:\n", + " \\begin{equation}\n", + " p(N) = \\frac{1}{N^\\alpha}\\frac{1}{\\zeta(\\alpha,1)},\n", + " \\end{equation}\n", + " where the $\\zeta(s,q)=\\sum_{n=0}^{\\infty}\\frac{1}{(q+n)^s}$ is the\n", + " Hurwitz Zeta function\n", + " (https://en.wikipedia.org/wiki/Hurwitz_zeta_function)\n", + " available in Python as `scipy.special.zeta`. The use of the\n", + " power law is motivated by the observation that the frequency of\n", + " occurrence of a company is inversely proportional to its size (see\n", + " also: R.L. 
+  ]
+}
\ No newline at end of file