forked from janchorowski/ml_uwr
Commit c6b3151 (1 parent: 98cedf2): 1 changed file with 213 additions and 0 deletions.
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "Homework2.ipynb",
"provenance": [],
"collapsed_sections": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "tkKY6us_cCg4",
"colab_type": "text"
},
"source": [
"# Homework 2\n",
"\n",
"**For the exercise sessions in the week of 20-25.11.19**\n",
"\n",
"**Points: 6 + 1b**\n",
"\n",
"Please solve the problems at home and bring to class a [declaration form](http://ii.uni.wroc.pl/~jmi/Dydaktyka/misc/kupony-klasyczne.pdf) indicating which problems you are willing to present on the blackboard.\n",
"\n"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "8rdqrcVvNTv0",
"colab_type": "text"
},
"source": [
"## Problem 1 [1p]\n",
"\n",
"Let $(x^{(i)},y^{(i)})$ be data samples with $x^{(i)}\\in\\mathbb{R}^D$, $y^{(i)}\\in\\mathbb{R}$, and let $\\Theta \\in\\mathbb{R}^D$ be a parameter vector.\n",
"\n",
"Find the closed-form solution $\\Theta^*$ to\n",
"\n",
"$$\n",
"\\min_\\Theta \\left(\\frac{1}{2}\\sum_i (\\Theta^Tx^{(i)} - y^{(i)})^2 + \\frac{\\lambda}{2}\\sum_{d=1}^D \\Theta_d^2\\right).\n",
"$$"
]
},
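{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal numerical sanity check (illustrative, not part of the graded solution): random `X` and `y` stand in for the data, and a candidate closed-form solution is compared against plain gradient descent on the same objective."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: check a candidate closed-form ridge solution against gradient descent.\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(0)\n",
"N, D, lam = 100, 5, 0.1\n",
"X = rng.normal(size=(N, D))\n",
"y = rng.normal(size=N)\n",
"\n",
"# Candidate closed form: solve (X^T X + lam I) Theta = X^T y.\n",
"theta_closed = np.linalg.solve(X.T @ X + lam * np.eye(D), X.T @ y)\n",
"\n",
"# Plain gradient descent on the same objective.\n",
"theta = np.zeros(D)\n",
"for _ in range(10000):\n",
"    grad = X.T @ (X @ theta - y) + lam * theta\n",
"    theta -= 1e-3 * grad\n",
"\n",
"print(np.allclose(theta, theta_closed, atol=1e-5))"
]
},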
{
"cell_type": "markdown",
"metadata": {
"id": "4PMPpxChQHnR",
"colab_type": "text"
},
"source": [
"## Problem 2 [1p]\n",
"Let $v\\in\\mathbb{R}^D$ be a vector. Define the gradient of $f(v)\\in\\mathbb{R}$ with respect to $v$ to be $\\frac{\\partial f}{\\partial v} = \\left[\\frac{\\partial f(v)}{\\partial v_1}, \\frac{\\partial f(v)}{\\partial v_2}, ..., \\frac{\\partial f(v)}{\\partial v_D}\\right]$.\n",
"\n",
"Find the following functions' gradients with respect to the vector $[x, y, z]^T$:\n",
"1. $f_1([x, y, z]^T) = x + y$\n",
"2. $f_2([x, y, z]^T) = xy$\n",
"3. $f_3([x, y, z]^T) = x^2y^2$\n",
"4. $f_4([x, y, z]^T) = (x + y)^2$\n",
"5. $f_5([x, y, z]^T) = x^4 + x^2 y z + x y^2 z + z^4$\n",
"6. $f_6([x, y, z]^T) = e^{x + 2y}$\n",
"7. $f_7([x, y, z]^T) = \\frac{1}{x y^2}$\n",
"8. $f_8([x, y, z]^T) = ax + by + c$\n",
"9. $f_9([x, y, z]^T) = \\tanh(ax + by + c)$"
]
},
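{
"cell_type": "markdown",
"metadata": {},
"source": [
"Hand-derived gradients can be cross-checked with central finite differences; the sketch below does this for $f_5$ at an arbitrary test point, with `grad_f5` standing in for a candidate answer."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: compare a hand-derived gradient of f5 with central differences.\n",
"import numpy as np\n",
"\n",
"def f5(v):\n",
"    x, y, z = v\n",
"    return x**4 + x**2 * y * z + x * y**2 * z + z**4\n",
"\n",
"def grad_f5(v):  # hand-derived candidate gradient\n",
"    x, y, z = v\n",
"    return np.array([4 * x**3 + 2 * x * y * z + y**2 * z,\n",
"                     x**2 * z + 2 * x * y * z,\n",
"                     x**2 * y + x * y**2 + 4 * z**3])\n",
"\n",
"v = np.array([0.5, -1.0, 2.0])\n",
"eps = 1e-5\n",
"num = np.array([(f5(v + eps * e) - f5(v - eps * e)) / (2 * eps)\n",
"                for e in np.eye(3)])\n",
"print(np.allclose(num, grad_f5(v), atol=1e-6))"
]
},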
{
"cell_type": "markdown",
"metadata": {
"id": "TH9nPvyCMu27",
"colab_type": "text"
},
"source": [
"## Problem 3 [0.5p]\n",
"\n",
"Find the following functions' gradients or Jacobians with respect to the vector $\\mathbf{x}$, where $\\mathbf{x}, \\mathbf{b} \\in \\mathbb{R}^{n}$, $\\mathbf{W} \\in \\mathbb{R}^{n \\times n}$:\n",
"\n",
"1. $\\mathbf{W} \\mathbf{x} + \\mathbf{b}$\n",
"2. $\\mathbf{x}^T \\mathbf{W} \\mathbf{x}$."
]
},
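{
"cell_type": "markdown",
"metadata": {},
"source": [
"A finite-difference check for candidate answers; the random `W`, `b`, `x`, the step `eps`, and the expressions compared against are arbitrary choices of this sketch."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: finite-difference Jacobian of W x + b and gradient of x^T W x.\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(1)\n",
"n = 4\n",
"W = rng.normal(size=(n, n))\n",
"b = rng.normal(size=n)\n",
"x = rng.normal(size=n)\n",
"eps = 1e-6\n",
"\n",
"# Jacobian of f(x) = W x + b, built column by column.\n",
"f = lambda v: W @ v + b\n",
"J = np.column_stack([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)\n",
"                     for e in np.eye(n)])\n",
"print(np.allclose(J, W))  # compare against a candidate answer\n",
"\n",
"# Gradient of g(x) = x^T W x.\n",
"g = lambda v: v @ W @ v\n",
"num = np.array([(g(x + eps * e) - g(x - eps * e)) / (2 * eps)\n",
"                for e in np.eye(n)])\n",
"print(np.allclose(num, (W + W.T) @ x))  # candidate expression to verify"
]
},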
{
"cell_type": "markdown",
"metadata": {
"id": "8sYUtEqUQWYC",
"colab_type": "text"
},
"source": [
"## Problem 4 [1p]\n",
"\n",
"Find the derivative of $-\\log(S(\\mathbf{x})_j)$ with respect to the input $\\mathbf{x}$, where $S$ is the softmax function (https://en.wikipedia.org/wiki/Softmax_function) and $S(\\mathbf{x})_j$ denotes its $j$-th output."
]
},
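{
"cell_type": "markdown",
"metadata": {},
"source": [
"A numerical cross-check (illustrative): the test vector and the choice $j = 2$ are arbitrary, and the `analytic` expression below is a candidate to be verified on paper, not the required derivation."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: check a candidate gradient of -log softmax_j numerically.\n",
"import numpy as np\n",
"\n",
"def softmax(x):\n",
"    z = np.exp(x - x.max())  # shift for numerical stability\n",
"    return z / z.sum()\n",
"\n",
"x = np.array([0.3, -1.2, 2.0, 0.5])\n",
"j = 2\n",
"f = lambda v: -np.log(softmax(v)[j])\n",
"\n",
"eps = 1e-6\n",
"num = np.array([(f(x + eps * e) - f(x - eps * e)) / (2 * eps)\n",
"                for e in np.eye(len(x))])\n",
"analytic = softmax(x) - np.eye(len(x))[j]  # candidate: S(x) - e_j\n",
"print(np.allclose(num, analytic))"
]
},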
{
"cell_type": "markdown",
"metadata": {
"id": "w_Nd4O2PP5mq",
"colab_type": "text"
},
"source": [
"## Problem 5 [0.5p]\n",
"\n",
"Consider a dataset with 400 examples of class C1 and 400 examples of class C2.\n",
"Let tree A have 2 leaves with class distributions:\n",
"\n",
"| Tree A | C1 | C2 |\n",
"|----------|-------|-----|\n",
"| Leaf 1 | 100 | 300 |\n",
"| Leaf 2 | 300 | 100 |\n",
"\n",
"and let tree B have 2 leaves with class distributions:\n",
"\n",
"| Tree B | C1 | C2 |\n",
"|----------|-------|-----|\n",
"| Leaf 1 | 200 | 400 |\n",
"| Leaf 2 | 200 | 0 |\n",
"\n",
"What is the misclassification rate of each tree? Which tree is purer according to the Gini index or the information gain?"
]
},
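{
"cell_type": "markdown",
"metadata": {},
"source": [
"The hand-derived numbers can be cross-checked with the standard weighted-average impurity formulas, as in this illustrative sketch; `impurities` is a helper name introduced here."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: weighted misclassification rate, Gini impurity, and entropy per tree.\n",
"import numpy as np\n",
"\n",
"def impurities(leaves):\n",
"    total = sum(sum(leaf) for leaf in leaves)\n",
"    err = gini = entropy = 0.0\n",
"    for leaf in leaves:\n",
"        n = sum(leaf)\n",
"        p = np.array(leaf) / n  # class proportions in this leaf\n",
"        w = n / total           # fraction of examples in this leaf\n",
"        err += w * (1 - p.max())\n",
"        gini += w * (1 - np.sum(p**2))\n",
"        entropy += w * -np.sum(p[p > 0] * np.log2(p[p > 0]))\n",
"    return err, gini, entropy\n",
"\n",
"for name, leaves in [('A', [(100, 300), (300, 100)]),\n",
"                     ('B', [(200, 400), (200, 0)])]:\n",
"    print(name, impurities(leaves))"
]
},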
{
"cell_type": "markdown",
"metadata": {
"id": "X0BwK5qnaxav",
"colab_type": "text"
},
"source": [
"## Problem 6 [1p]\n",
"\n",
"Consider a regression problem with $M$ predictors $h_m(x)$ trained to approximate a target $y$. Define the error to be $\\epsilon_m(x) = h_m(x) - y$.\n",
"\n",
"Suppose you train $M$ independent predictors with average least squares error\n",
"$$\n",
"E_{AV} = \\frac{1}{M}\\sum_{m=1}^M \\mathbb{E}_{x}[\\epsilon_m(x)^2].\n",
"$$\n",
"\n",
"Further assume that the errors have zero mean and are uncorrelated:\n",
"$$\n",
"\\mathbb{E}_{x}[\\epsilon_m(x)] = 0\\qquad\\text{ and }\\qquad\\mathbb{E}_{x}[\\epsilon_m(x)\\epsilon_l(x)] = 0\\text{ for } m \\neq l.\n",
"$$\n",
"\n",
"Let the mean predictor be\n",
"$$\n",
"h_M(x) = \\frac{1}{M}\\sum_{m=1}^M h_m(x).\n",
"$$\n",
"\n",
"What is the average error of $h_M(x)$?"
]
},
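{
"cell_type": "markdown",
"metadata": {},
"source": [
"A Monte Carlo cross-check under an assumed Gaussian, unit-variance error model (any zero-mean, uncorrelated errors would do); compare the printed ensemble error against a candidate closed-form answer."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: simulate zero-mean, uncorrelated errors and compare the squared\n",
"# error of the mean predictor with candidate formulas such as E_AV / M.\n",
"import numpy as np\n",
"\n",
"rng = np.random.default_rng(2)\n",
"M, samples = 10, 200000\n",
"eps = rng.normal(size=(M, samples))   # eps[m] plays the role of epsilon_m(x)\n",
"\n",
"E_av = np.mean(eps**2)                # average individual squared error\n",
"E_ens = np.mean(eps.mean(axis=0)**2)  # squared error of the mean predictor\n",
"print(E_av, E_ens, E_av / M)"
]
},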
{
"cell_type": "markdown",
"metadata": {
"id": "kVFSEXy_g_fo",
"colab_type": "text"
},
"source": [
"## Problem 7 [1p]\n",
"\n",
"Suppose you work on a binary classification problem and train 3 weak classifiers, combining their predictions by majority voting.\n",
"\n",
"Can the training error rate of the voting ensemble be smaller than the error rate of the individual weak predictors? Can it be larger? Show an example or prove infeasibility."
]
},
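{
"cell_type": "markdown",
"metadata": {},
"source": [
"A toy playground for the question (illustrative, not a proof): the labels and the two hand-picked error patterns below are assumptions of this sketch, contrasting classifiers whose errors fall on disjoint points with classifiers whose errors overlap."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: majority voting of 3 classifiers on a 6-point training set.\n",
"import numpy as np\n",
"\n",
"y = np.array([1, 1, 1, 0, 0, 0])\n",
"preds_disjoint = np.array([[1, 1, 0, 0, 0, 0],   # errs only on point 2\n",
"                           [1, 0, 1, 0, 0, 0],   # errs only on point 1\n",
"                           [0, 1, 1, 0, 0, 0]])  # errs only on point 0\n",
"preds_overlap = np.array([[0, 0, 1, 0, 0, 0],    # errs on points 0, 1\n",
"                          [0, 1, 0, 0, 0, 0],    # errs on points 0, 2\n",
"                          [1, 0, 0, 0, 0, 0]])   # errs on points 1, 2\n",
"\n",
"for preds in (preds_disjoint, preds_overlap):\n",
"    vote = (preds.sum(axis=0) >= 2).astype(int)  # majority of 3 votes\n",
"    print('individual:', (preds != y).mean(axis=1),\n",
"          'ensemble:', (vote != y).mean())"
]
},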
{
"cell_type": "markdown",
"metadata": {
"id": "8IzGTnZuMxTB",
"colab_type": "text"
},
"source": [
"## Problem 8 [1 bonus point]\n",
"\n",
"While on a walk, you notice a locomotive with the serial number 50. Assuming that all locomotives used by PKP (the Polish railroad operator) are numbered with consecutive natural numbers, what is your estimate of $N$, the total number of locomotives operated by PKP?\n",
"\n",
"Explain why the Maximum Likelihood principle may not yield satisfactory results.\n",
"\n",
"Use the Bayesian approach to find the posterior distribution over the number of locomotives, then compute the expected locomotive count. For the prior use the power law:\n",
"\\begin{equation}\n",
"p(N) = \\frac{1}{N^\\alpha}\\frac{1}{\\zeta(\\alpha,1)},\n",
"\\end{equation}\n",
"where $\\zeta(s,q)=\\sum_{n=0}^{\\infty}\\frac{1}{(q+n)^s}$ is the Hurwitz zeta function (https://en.wikipedia.org/wiki/Hurwitz_zeta_function), available in Python as `scipy.special.zeta`. The use of the power law is motivated by the observation that the frequency of occurrence of a company is inversely proportional to its size (see also: R.L. Axtell, Zipf distribution of US firm sizes, https://www.sciencemag.org/content/293/5536/1818).\n",
"\n",
"How would your estimate change after seeing 5 locomotives, the biggest serial number among them being 50?\n",
"\n",
"**Note**: During the Second World War, a similar problem was encountered when estimating the total German tank production from the serial numbers of captured machines. The statistical estimates turned out to be far more accurate than those from conventional intelligence!"
]
},
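{
"cell_type": "markdown",
"metadata": {},
"source": [
"A numerical sketch rather than the required derivation: it assumes each observed serial is drawn uniformly from $\\{1,\\dots,N\\}$, picks $\\alpha = 2$ so that $\\zeta(\\alpha, 1)$ is finite, and truncates the support at $10^6$ to keep the sums finite; the prior's normalizer cancels once the posterior is normalized numerically."
]
},
{
"cell_type": "code",
"metadata": {},
"execution_count": null,
"outputs": [],
"source": [
"# Sketch: posterior mean of N after k uniform serial-number sightings with\n",
"# observed maximum m, under a truncated power-law prior.\n",
"import numpy as np\n",
"from scipy.special import zeta\n",
"\n",
"def posterior_mean(m, k, alpha=2.0, n_max=10**6):\n",
"    N = np.arange(1, n_max + 1, dtype=float)\n",
"    prior = 1.0 / N**alpha / zeta(alpha, 1)\n",
"    # k independent uniform draws from {1..N} are consistent only if N >= m.\n",
"    like = np.where(N >= m, 1.0 / N**k, 0.0)\n",
"    post = prior * like\n",
"    post /= post.sum()\n",
"    return np.sum(N * post)\n",
"\n",
"print(posterior_mean(50, 1))  # one sighting, serial 50\n",
"print(posterior_mean(50, 5))  # five sightings, largest serial 50"
]
}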
]
}