diff --git a/homework3/Homework3.ipynb b/homework3/Homework3.ipynb
new file mode 100644
index 0000000..762f10e
--- /dev/null
+++ b/homework3/Homework3.ipynb
@@ -0,0 +1,169 @@
+{
+ "nbformat": 4,
+ "nbformat_minor": 0,
+ "metadata": {
+ "colab": {
+ "name": "Homework3.ipynb",
+ "provenance": [],
+ "toc_visible": true
+ },
+ "kernelspec": {
+ "name": "python3",
+ "display_name": "Python 3"
+ }
+ },
+ "cells": [
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Z3gmko6zus4J",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Homework 3\n",
+ "\n",
+ "**For exercises in the week of 11-16.12.19**\n",
+ "\n",
+ "**Points: 7 + 1b**\n",
+ "\n",
+ "Please solve the problems at home and bring to class a [declaration form](http://ii.uni.wroc.pl/~jmi/Dydaktyka/misc/kupony-klasyczne.pdf) to indicate which problems you are willing to present on the blackboard.\n",
+ "\n",
+ "$\def\R{{\mathbb R}} \def\i{^{(i)}} \def\sjt{\mathrm{s.t. }\ }$"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "2TNgwkXtvJLT",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 1 [2p]\n",
+ "\n",
+ "Let $X\in \R^{D\times N}$ be a data matrix containing $N$ $D$-dimensional points. Let $Y\in\R^{1\times N}$ be the targets.\n",
+ "\n",
+ "We have seen that the least squares problem\n",
+ "$$\n",
+ "\min_{\Theta} \frac{1}{2}(\Theta^T X - Y)(\Theta^T X - Y)^T\n",
+ "$$\n",
+ "has a closed-form solution\n",
+ "$$\n",
+ "{\Theta^*}^T = Y X^T(X X^T)^{-1},\n",
+ "$$\n",
+ "where $X^+ = X^T(X X^T)^{-1}$ is the right [Moore-Penrose pseudoinverse](https://en.wikipedia.org/wiki/Moore%E2%80%93Penrose_inverse) of $X$:\n",
+ "$$\n",
+ "\begin{split}\n",
+ "\Theta^T X &\approx Y \\\n",
+ "\Theta^T X X^+ &\approx Y X^{+} \\\n",
+ "\Theta^T &= Y X^{+}\n",
+ "\end{split}\n",
+ "$$\n",
+ "\n",
+ "The pseudoinverse also has another form (called the left inverse):\n",
+ "$$\n",
+ "X^+ = (X^T X)^{-1}X^T\n",
+ "$$\n",
+ "\n",
+ "## P1.1 [0.5p]\n",
+ "Say under which conditions the left and right pseudoinverses exist (when $X$ is a rectangular matrix, only one of them exists). Give examples of machine learning problems that could be solved using each inverse.\n",
+ "\n",
+ "## P1.2 [1p]\n",
+ "Derive the left inverse by solving the regularized least squares problem\n",
+ "$$\n",
+ "\min_\Theta \sum_i(\Theta^T x\i - y\i)^2 + \lambda\Theta^T\Theta\n",
+ "$$\n",
+ "with artificially introduced variables $\epsilon\i$ and constraints $\epsilon\i = \Theta^T x\i - y\i$, then see what happens when $\lambda\rightarrow 0$.\n",
+ "\n",
+ "## P1.3 [0.5p]\n",
+ "Show that the above dual formulation allows using kernel functions with linear regression. Express the optimal solution as a weighted average of data samples. How many \"support vectors\" are there?\n",
+ "\n",
+ "NB: some authors call kernelized linear regression the \"Least-Squares SVM\"."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "g6rmavsV2ifp",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 2 (Bishop) [1p]\n",
+ "\n",
+ "Recall the nearest neighbor classifier. Show that the squared Euclidean distance\n",
+ "$||x-y||^2$ can be expressed as a linear combination of dot-products. Using this \n",
+ "formulation of the distance, design a kernelized nearest neighbors method.\n",
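+ "\n",
+ "*Optional illustration, not part of the graded problem: a minimal sketch of a kernelized 1-NN classifier that touches the data only through kernel evaluations, using the identity $||\phi(x)-\phi(y)||^2 = K(x,x) - 2K(x,y) + K(y,y)$. The RBF kernel and all names below are illustrative assumptions, not a prescribed solution.*\n",
+ "\n",
+ "```python\n",
+ "import numpy as np\n",
+ "\n",
+ "def rbf_kernel(A, B, gamma=1.0):\n",
+ "    # Illustrative kernel choice: K(a, b) = exp(-gamma * ||a - b||^2),\n",
+ "    # evaluated for every pair of rows of A and B.\n",
+ "    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)\n",
+ "    return np.exp(-gamma * sq)\n",
+ "\n",
+ "def kernel_1nn_predict(X_train, y_train, X_test, kernel=rbf_kernel):\n",
+ "    # Squared feature-space distance expressed through kernel values only.\n",
+ "    k_tt = np.diag(kernel(X_test, X_test))    # K(x, x) for test points\n",
+ "    k_nn = np.diag(kernel(X_train, X_train))  # K(x', x') for training points\n",
+ "    k_tn = kernel(X_test, X_train)            # K(x, x') cross terms\n",
+ "    d2 = k_tt[:, None] - 2 * k_tn + k_nn[None, :]\n",
+ "    # Each test point takes the label of its nearest training point.\n",
+ "    return y_train[np.argmin(d2, axis=1)]\n",
+ "```"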
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "KxVvXJ70-JqP",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 3 (Bishop) [1p]\n",
+ "\n",
+ "Recall the SVM training problem\n",
+ "\n",
+ "$$\n",
+ "\begin{split}\n",
+ "\min_{w,b} & \frac{1}{2}w^T w \\\n",
+ "\sjt & y\i(w^Tx\i+b) \geq 1\qquad \textrm{for all } i.\n",
+ "\end{split}\n",
+ "$$\n",
+ "\n",
+ "Show that the solution for the maximum-margin hyperplane doesn't change when the $1$\n",
+ "on the right-hand side of the constraints is replaced by any $\gamma>0$."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "U5n2hso7_YJu",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 4 [2p]\n",
+ "\n",
+ "Show that if $\kappa$ is the kernel matrix of a dataset, i.e. $\kappa_{ij} = K(x\i, x^{(j)}) = \phi(x\i)^T\phi(x^{(j)})$ for some kernel function $K$ and an induced feature expansion function $\phi$, then:\n",
+ "\n",
+ "## 4.1 [0.5p]\n",
+ "$\kappa$ is symmetric, i.e. $\kappa = \kappa^T$.\n",
+ "\n",
+ "## 4.2 [1.5p]\n",
+ "$\kappa$ is positive semidefinite, i.e. for any vector $c$ we have\n",
+ "$$\n",
+ "c^T\kappa c\geq 0\n",
+ "$$\n"
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "LbP_4MV8UAIt",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 5 [1p]\n",
+ "\n",
+ "Let $a$ and $b$ be two strings defined over an alphabet. A string $c$ is a common substring of $a$ and $b$ if $a=xcz$ and $b=sct$ for some (possibly empty) strings $x, z, s, t$.\n",
+ "\n",
+ "Consider a function that counts the number of distinct substrings shared between two strings.\n",
+ "\n",
+ "Show that it is a kernel function by showing how the feature expansion function $\phi$ could be defined."
+ ]
+ },
+ {
+ "cell_type": "markdown",
+ "metadata": {
+ "id": "Rt0GFW6EU_Y3",
+ "colab_type": "text"
+ },
+ "source": [
+ "# Problem 6 [1p bonus]\n",
+ "\n",
+ "Show how to kernelize logistic regression."
+ ]
+ }
+ ]
+}
\ No newline at end of file