forked from SPC-education/numerics-2022
2057 lines
53 KiB
Plaintext
2057 lines
53 KiB
Plaintext
{
|
||
"cells": [
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# PART I: NumPy."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Programming in python"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Writing code\n",
|
||
"\n",
|
||
"* code should be readable\n",
|
||
"\n",
|
||
"* style guides\n",
|
||
" * PEP8 (PEP = Python Enhancement Proposal) http://legacy.python.org/dev/peps/pep-0008/\n",
|
||
" * writing idiomatic code https://david.goodger.org/projects/pycon/2007/idiomatic/handout.html ('idiomatic' means 'conforming to the natural mode of expression in a given context')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Avoiding worst practices\n",
|
||
"http://docs.quantifiedcode.com/python-anti-patterns/"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# NumPy: beginner's references\n",
|
||
"\n",
|
||
"* From Python to NumPy\n",
|
||
"https://www.labri.fr/perso/nrougier/from-python-to-numpy/\n",
|
||
" \n",
|
||
"* 100 NumPy Exercises\n",
|
||
"https://github.com/rougier/numpy-100/blob/master/100_Numpy_exercises.md"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# NumPy"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Constructing matrices and vectors "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 1,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"A = \n",
|
||
"[[3 3 0]\n",
|
||
" [2 1 1]\n",
|
||
" [1 1 0]\n",
|
||
" [3 0 2]]\n",
|
||
"\n",
|
||
" b = \n",
|
||
"[3 2 0]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"\n",
|
||
"A = np.array([[3, 3, 0],\n",
|
||
" [2, 1, 1],\n",
|
||
" [1, 1, 0],\n",
|
||
" [3, 0, 2]])\n",
|
||
"\n",
|
||
"b = np.array([3, 2, 0])\n",
|
||
"\n",
|
||
"print(f'A = \\n{A}\\n\\n b = \\n{b}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Initializers "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 2,
|
||
"metadata": {
|
||
"scrolled": true,
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[0. 0. 0.]\n",
|
||
"[1. 1. 1.]\n",
|
||
"[[1. 0. 0.]\n",
|
||
" [0. 1. 0.]\n",
|
||
" [0. 0. 1.]]\n",
|
||
"[0 1 2]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(np.zeros(3))\n",
|
||
"print(np.ones(3))\n",
|
||
"print(np.eye(3))\n",
|
||
"print(np.arange(3))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Use numpy arrays, not lists"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 3,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"4.75 µs ± 348 ns per loop (mean ± std. dev. of 7 runs, 100 loops each)\n",
|
||
"173 µs ± 3.06 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"n = 10000\n",
|
||
"%timeit -n 100 np.ones(n)\n",
|
||
"%timeit -n 100 [1 for i in range(n)]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Basic operations "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 4,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([15, 8, 5, 9])"
|
||
]
|
||
},
|
||
"execution_count": 4,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# matrix-vector multiplication: .dot()\n",
|
||
"A.dot(b)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 5,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([15, 8, 5, 9])"
|
||
]
|
||
},
|
||
"execution_count": 5,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# same as\n",
|
||
"A @ b"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Broadcasting\n",
|
||
"https://numpy.org/doc/stable/user/basics.broadcasting.html"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Broadcasting is very important for writing concise and efficient code. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Data types"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 6,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"dtype('float32')"
|
||
]
|
||
},
|
||
"execution_count": 6,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"vector32 = np.array([1.2, 3.2, 4.9, 1.8], dtype=np.single)\n",
|
||
"vector32.dtype"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 7,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"dtype('float64')"
|
||
]
|
||
},
|
||
"execution_count": 7,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"vector64 = np.array([1.2, 3.2, 4.9, 1.8], dtype=np.double)\n",
|
||
"vector64.dtype"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Double-precision calculations are usually MUCH slower on GPU. On CPU, float32 is a bit faster than float64 can be preferred for memory reasons. Normally, you should use single for machine learning applications and double for scientific computations (especially for not-so-well conditioned problems)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 8,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"404 µs ± 5.89 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"b = np.random.normal(size=(1000, 1000)).astype(np.float32)\n",
|
||
"%timeit -n 100 np.sum(b**2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 9,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"927 µs ± 7.27 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"b = np.random.normal(size=(1000, 1000)).astype(np.float64)\n",
|
||
"%timeit -n 100 np.sum(b**2)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Shapes"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 10,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"vector [[1 2 3]] has dimensionality (1, 3)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"b = np.array([[1,2,3]])\n",
|
||
"print(f'vector {b} has dimensionality {b.shape}')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 11,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[1 2 3]\n",
|
||
"(3,)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"# remove axis of length 1\n",
|
||
"b = b.squeeze()\n",
|
||
"print(b)\n",
|
||
"print(b.shape)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## More on broadcasting"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 12,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[1, 2, 3],\n",
|
||
" [2, 3, 4],\n",
|
||
" [3, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 12,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A = np.array([[1,2,3],[2,3,4],[3,4,5]])\n",
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 13,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([1, 2, 3])"
|
||
]
|
||
},
|
||
"execution_count": 13,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"b"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 14,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 1, 4, 9],\n",
|
||
" [ 2, 6, 12],\n",
|
||
" [ 3, 8, 15]])"
|
||
]
|
||
},
|
||
"execution_count": 14,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# multiply matrix by a vector row-wise \n",
|
||
"A*b"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 1, 4, 9],\n",
|
||
" [ 2, 6, 12],\n",
|
||
" [ 3, 8, 15]])"
|
||
]
|
||
},
|
||
"execution_count": 15,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# this is the same, of course\n",
|
||
"b*A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 16,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[ 1, 2, 3],\n",
|
||
" [ 4, 6, 8],\n",
|
||
" [ 9, 12, 15]])"
|
||
]
|
||
},
|
||
"execution_count": 16,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"# multiply matrix by a vector row-wise \n",
|
||
"A*b[:,None]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Indexing (slicing) vectors "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 17,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([0, 1, 2, 3, 4, 5])"
|
||
]
|
||
},
|
||
"execution_count": 17,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"b = np.arange(6)\n",
|
||
"b"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"What is the output?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 18,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[0 1 2]\n",
|
||
"5\n",
|
||
"[0 1 2 3]\n",
|
||
"[2 3]\n",
|
||
"[0 2 4]\n",
|
||
"[5 3 1]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(b[:3])\n",
|
||
"print(b[-1])\n",
|
||
"print(b[:-2])\n",
|
||
"print(b[2:4])\n",
|
||
"print(b[::2])\n",
|
||
"print(b[::-2])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Indexing (slicing) matrices"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 19,
|
||
"metadata": {
|
||
"scrolled": true,
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[0, 1, 2],\n",
|
||
" [3, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 19,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A = b.reshape(2, 3)\n",
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"What is the output?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[0, 1, 2],\n",
|
||
" [3, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 20,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 21,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"[0 3]\n",
|
||
"[3 4 5]\n",
|
||
"[[0 2]\n",
|
||
" [3 5]]\n",
|
||
"[[0 1]\n",
|
||
" [3 4]]\n",
|
||
"[]\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"print(A[:, 0])\n",
|
||
"print(A[-1, :])\n",
|
||
"print(A[:,::2])\n",
|
||
"print(A[:3, :-1])\n",
|
||
"print(A[3:, 1])"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[0, 1, 2],\n",
|
||
" [3, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 22,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[0, 1],\n",
|
||
" [3, 4]])"
|
||
]
|
||
},
|
||
"execution_count": 23,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A[:2, :2]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## View vs Copy"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 24,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[1, 2, 3],\n",
|
||
" [2, 3, 4],\n",
|
||
" [3, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 24,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A = np.array([[1,2,3],[2,3,4],[3,4,5]])\n",
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 25,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([1, 2, 3])"
|
||
]
|
||
},
|
||
"execution_count": 25,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"sub_A = A[:,0]\n",
|
||
"sub_A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 26,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([[1, 2, 3],\n",
|
||
" [1, 3, 4],\n",
|
||
" [1, 4, 5]])"
|
||
]
|
||
},
|
||
"execution_count": 26,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"A[:,0] = 1\n",
|
||
"A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 27,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"text/plain": [
|
||
"array([1, 1, 1])"
|
||
]
|
||
},
|
||
"execution_count": 27,
|
||
"metadata": {},
|
||
"output_type": "execute_result"
|
||
}
|
||
],
|
||
"source": [
|
||
"sub_A"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 28,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"# copies have to be created explicitly\n",
|
||
"A_copy = A.copy()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# More benchmarks"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Computing convolutions: einsum"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 29,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"n = 100\n",
|
||
"a = np.random.normal(size=(n, n))\n",
|
||
"b = np.random.normal(size=(n, n))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Consider computation of $\\sum_{ij}a_{ij}b_{ji}$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 30,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"The slowest run took 10.02 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
|
||
"62.8 µs ± 86.4 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit -n 100 np.trace(np.dot(a, b))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 31,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"17.9 µs ± 7.12 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit -n 100 np.einsum('ij,ji->', a, b)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Tensor convolutions can be very tricky and may admit real optimizations. Trivial example:\n",
|
||
"$$\n",
|
||
"\\sum_{ijkl}A_{ij}B_{ji}B_{kl}A_{lk}\n",
|
||
"$$\n",
|
||
"How many FLOPs does it take to compute this sum?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Memory access and performance"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 32,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"a = np.zeros(shape= 64 * 1024 * 1024, dtype=np.int64)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Which of the assignments should run faster?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 33,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"37.2 ms ± 376 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit a[::1] += 2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 34,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"37.2 ms ± 29 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit a[::8] += 2"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 35,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"from timeit import timeit\n",
|
||
"def t(k):\n",
|
||
" return timeit(f'a[::{k}]+=2', setup='import numpy as np; a = np.zeros(shape=64 * 1024 * 1024, dtype=np.int64)', number=32)/32"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 36,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"ks = 2**np.arange(10)\n",
|
||
"ts = [t(k) for k in ks]"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 37,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"data": {
|
||
"image/png": "iVBORw0KGgoAAAANSUhEUgAAAXoAAAD8CAYAAAB5Pm/hAAAAOXRFWHRTb2Z0d2FyZQBNYXRwbG90bGliIHZlcnNpb24zLjUuMSwgaHR0cHM6Ly9tYXRwbG90bGliLm9yZy/YYfK9AAAACXBIWXMAAAsTAAALEwEAmpwYAAAiNUlEQVR4nO3deXxU5dn/8c81kw2SECAJCAkYCAFFdgIEqIpLLfpYkboAVlGrIgJq+7N9aluttbWL9nEvLrjjviK4VIsVbWUPIDuBsIdFEpZA2ELI/fsjI8U0wCQkOTOT7/v1mpfOOffMXONtvjm55sx9zDmHiIhELp/XBYiISN1S0IuIRDgFvYhIhFPQi4hEOAW9iEiEU9CLiES4KK8LqEpKSorLyMjwugwRkbAxb968IudcalX7QjLoMzIyyM3N9boMEZGwYWbrj7UvqNaNmQ02szwzyzezO6vYb2b2WGD/IjPrddS+dWa22My+NjOlt4hIPTvhEb2Z+YHxwPeBAmCumU1xzi07atiFQFbg1g94MvDPb53jnCuqtapFRCRowRzR9wXynXNrnHOlwBvAkEpjhgATXYVZQFMza1XLtYqISA0EE/RpwMaj7hcEtgU7xgH/MLN5ZjbqWC9iZqPMLNfMcgsLC4MoS0REghFM0FsV2yqvhHa8MQOdc72oaO+MNbOzqnoR59wE51y2cy47NbXKD45FRKQGggn6AqDNUffTgc3BjnHOffvPbcAkKlpBIiJST4IJ+rlAlpm1M7MYYDgwpdKYKcDIwNk3OUCxc26LmcWbWSKAmcUDFwBLarH+kOOcY+HGXZSWlXtdiogIEMRZN865MjMbB3wK+IHnnXNLzWx0YP9TwMfARUA+sA+4PvDwlsAkM/v2tV5zzn1S6+8iRBwsO8zd7y/hrdwC+mY05+lretMsPsbrskSkgbNQvPBIdna2C7cvTBWVHOSWV+Yxd91OLu3Rmo+XbKV1UhzPX9eH9qkJXpcnIhHOzOY557Kr2qe1bmrB8i27GfK36SwqKObxET15ZHhPXr8phz0Hyhj6xAxmrdnudYki0oAp6E/SP5Zu5bInZ1BWXs7bo/vzw+6tAeh9ajPeHzuQ1MRYrnluNu/MK/C4UhFpqBT0NeSc44kv8rn5lXlktUhgyrjv0S296XfGtGnemHdvGUC/dsn8/O2F/N+neZSXh16rTEQim4K+Bg4cOszP3vyaBz7J44fdWvPmzf1p2SSuyrFJjaJ54fo+jOjbhr9Ny+fWNxZw4NDheq5YRBqykFy9MpRt232AUS/P4+uNu/jFDzoxZlAmgbOKjina7+NPQ7vSLiWeP/99BZt27ueZkdmkJsbWU9Ui0pDpiL4almwqZsj46az8Zg9PXd2bsed0OGHIf8vMGHVWJk/+uDcrtu5m6BMVzyMiUtcU9EH6aNEWLn9qBj4z3hk9gMFdTqnR8wzucgpv3dyfg2XlXPbEDP69Suv6iEjdUtCfQHm545HPVjL2tfl0aZ3E5HED6dy6yUk9Z7f0pkweO5C0Zo247oW5vDZ7Qy1VKyLy3xT0x7G/9DC3vr6ARz5bxeW903n1pn6kJNROX71100a8c8sAzsxK4deTFvPHj5ZxWGfkiEgd0Iexx7CleD83Tcxl6ebd/Oai07nxzHZB9+ODlRAbxbMjs7nvo+U88++1rN++j0eG96BxjKZFRGqPjuirsGDDTi7523TWFe3j+Wv7cNNZ7Ws95L8V5ffxu0vO4Hc/7Mxny79h2NOz+Gb3gTp5LRFpmBT0lUxaUMCwCbNoFO1n0pgBnHNai3p53esGtuPZa7NZU1jCpeOns2zz7np5XRGJfAr6gPJyx/2frOBnby6kV9uKD0uzWibWaw3nntaSt0cPAODyp2bw+Ypv6vX1RSQyKeiBkoNljHp5Hk9+sZqr+rXl5Rv6eba8cOfWTZg8diCZqQnc+FIuL0xf60kdIhI5GnzQb9yxj8ufnMG0vG3ce8kZ/PHSLkT7vf3P0qJJHG/enMP5p7fk3g+Wcc/kJZQd1oVMRKRmGnTQz1m7gyHjp7N5135evL4P1w7IqLMPXaurcUwUT13dm1Fnteelmeu5cWIuew4c8rosEQlDDTbo35y7gR8/O4umjaJ5f+xAzswKvQuS+3zGry86nT8N7cq/VxVxxVMz2bRrv9dliUiYaXBBX3a4nD98uIxfvruYnPbJTBozMOSvAHVVv7a8eH0fNu3cz6Xjp7OoYJfXJYlIGGlQQb/7wCFueCmX575ay3UDMnjhuj4kNY72uqygnJmVyntjBhAb5ePKp2fyyZKtXpckImGiwQT92qK9DB0/nen5RfxpaFd+d8kZRHn8oWt1ZbVMZNKYgZzeqgm3vDqPp79cTShe81dEQkt4JV0NTc8v4tLx09mxt5RXbuzHVf3ael1SjaUmxvL6TTlc1LUVf/77Cn713mIO6YwcETmOiF9U5eWZ6/jdB8vITI3n2ZF9aJvc2OuSTlpctJ/Hh/ekXXI8f5uWz8ad+3jix71JahQebSgRqV8Re0R/6HA5d72/mLsnL2VQx1TevWVARIT8t3w+4+c/6MT/XdGdOWt38KMnprNh+z6vyxKREBSRQb9rXynXPj+HV2Zt4Oaz2zNhZDaJcZF5tHt573Qm/qQfRSWlDH1iOvPW7/C6JBEJMREX9Pnb9jBk/HRy1+3kwSu686sLT8fvC40vQdWV/pnJTBozgMS4KEY8M5spCzd7XZKIhJCICvppedsYOn4Gew+W8fqoHC7rne51SfWmfWoC740ZSI/0ptz2+gIe/+cqnZEjIkAEBf2ufaWMe3U+bZo3ZvK479H71GZel1TvmsfH8PKNfRnaM40Hp67k9je+ZsbqInZr6QSRBs1C8agvOzvb5ebmVvtxc9ft4IzWTRr8FZqcczz+eT6PfLaSb69O2D4lnq7pSXRNS6JbelPOaN2E+NiG/d9JJJKY2TznXHaV+yIp6OW7du4tZfGmYhYV7GJRQTGLNxWzpbji6lU+gw4tEuia1pRu6Ul0S0/i9FZNiIv2e1y1iNSEgl6O2LbnAEs2FbNwY/GRXwJFJaUARPmMji0T6ZaeRNf0JLqlNaXTKYnEREVMh08kYino5Zicc2zdfSAQ/P858t+1r6KvH+P3cXqrxCPB3zU9iawWCWG3fIRIpFPQS7U45yjYuZ9FBf9p+yzZVMyeg2UAxEX7OKP1t/3+ilu7lISIP41VJJQp6OWklZc71m3fGwj/iqP/JZt2s//QYQDiY/x0CQR/1/SmdEtL4tTkxiFzIReRSKeglzpxuNyxurCEhRt3Bfr9xSzbspvSsopF1prERdEtvSmDOqVyw/faKfRF6tDxgl7n10mN+QMf3nZsmcgV2W2AijWG8rbuORL8Czbs5L6PlhMT5WNk/wxvCxZpoBT0Uqui/T66pCXRJS2JEX0rWj43vDSX+z5cTq+2zeiSluR1iSINjk6dkDrl8xkPXtmD5vExjH1tvr6lK+KBoILezAabWZ6Z5ZvZnVXsNzN7LLB/kZn1qrTfb2YLzOzD2ipcwkfz+Bgev6onBTv386t3F2sNHpF6dsKgNzM/MB64EOgMjDCzzpWGXQhkBW6jgCcr7b8dWH7S1UrY6pPRnJ9f0ImPFm/hldkbvC5HpEEJ5oi+L5DvnFvjnCsF3gCGVBozBJjoKswCmppZKwAzSwf+B3i2FuuWMHTzWe0Z1CmVP3ywjCWbir0uR6TBCCbo04CNR90vCGwLdswjwP8Cx72wqZmNMrNcM8stLCwMoiwJNz6f8eAV3Y/06/eoXy9SL4IJ+qpOfq7cZK1yjJldDGxzzs070Ys45yY457Kdc9mpqalBlCXhKDkh9ki//s731K8XqQ/BBH0B0Oao++lA5UsYHWvMQOASM1tHRcvnXDN7pcbVSkTok9GcOy7oyEeLtvCq+vUidS6YoJ8LZJlZOzOLAYYDUyqNmQKMDJx9kwMUO+e2OOd+5ZxLd85lBB73uXPu6tp8AxKeRp+VydkdU/n9h+rXi9S1Ewa9c64MGAd8SsWZM28555aa2WgzGx0Y9jGwBsgHngHG1FG9EiF8PuOhK7vTrHE049SvF6lTWutGPDVn7Q6GT5jJRV1b8fiInloPR6SGjrfWjb4ZK57q2645d1zQiQ8XbeG1OerXi9QFBb147pazMzkzK4V7P1jG0s3q14vUNgW9eM7nMx4e1iPQr19ASeACJyJSOxT0EhJSEmJ5bHhP1m/fy691fr1IrVLQS8jo1z6Z//f9jkxZuJnX52w88QNEJCgKegkpYwZ14MysFH73wVKWbd7tdTkiEUFBLyHl235900YV59erXy9y8hT0EnJSEmJ5bERP1m3fy28mqV8vcrIU9BKScton87PzOzL56828OVf9epGToaCXkDXmnIp+/T1TlrJ8i/r1IjWloJeQ5fcZD13ZgyaNohn7qvr1IjWloJeQlpoYy6PDe7Bu+17uUr9epEYU9BLyBmSm8NPzO/L+15t5K1f9epHqUtBLWBh7TgcGdkjmt5OXsmKr+vUi1aGgl7Dg9xmPDOtJk0bRjHl1PnvVrxcJmoJewsaRfn3RXu56f4n69SJBUtBLWBmQmcLt53Vk0oJNvJ1b4HU5ImFBQS9hZ9y5HRiQmcxvpywhb+ser8sRCXkKegk7fp/xyPAeJMRGM+bVeerXi5yAgl7CUovEOB4b3oM1RXu5W/16keNS0EvYGtAhhdvOzeK9BZt4e5769SLHoqCXsHbbeVn0b5/MbycvYeU36teLVEVBL2HN7zMeHfFtv34++0rVrxepTEEvYa9FYhyPDu/B6sIS7n5/qdfliIQcBb1EhIEdUrj13CzenV/A21oPR+Q7FPQSMW4/L4uc9s25W/16ke9Q0EvE8PuMx4b3JCE2irHq14scoaCXiNKiSRyPDOtJfmEJv52sfr0IKOglAn0vK4Vbz+nAO/MKeEfn14so6CUy3X5+x4p+/ftLWKV+vTRwCnqJSH6f8ejwnjSO8TP2tfnsLz3sdUkinlHQS8Rq2SSOh4f1YNW2Eu6ZssTrckQ8o6CXiHZWx1TGndOBt3ILeFf9emmgFPQS8W4/L4u+7Zpzz5SlbCne73U5IvVOQS8RL8rv4/8u705ZebmWNJYGSUEvDULb5Mbc8f1OfLZ8Gx8u2uJ1OSL1SkEvDcb1AzPolp7E76YsZefeUq/LEak3QQW9mQ02szwzyzezO6vYb2b2WGD/IjPrFdgeZ2ZzzGyhmS01s3tr+w2IBCvK7+MvP+pG8f5D3PfRcq/LEak3Jwx6M/MD44ELgc7ACDPrXGnYhUBW4DYKeDKw/SBwrnOuO9ADGGxmObVTukj1dW7dhJvPbs+78wv418pCr8sRqRfBHNH3BfKdc2ucc6XAG8CQSmOGABNdhVlAUzNrFbhfEhgTHbjpkzDx1K3nZtE+NZ5fT1qsC4tLgxBM0KcBRy/wXRDYFtQYM/Ob2dfANmCqc252VS9iZqPMLNfMcgsLdaQldScu2s9fftSNgp37efAfK70uR6TOBRP0VsW2ykflxxzjnDvsnOsBpAN9zaxLVS/inJvgnMt2zmWnpqYGUZZIzfVt15yrc9rywoy1LNiw0+tyROpUMEFfALQ56n46sLm6Y5xzu4AvgMHVLVKkLvxy8Gm0TIzjzncXU1pW7nU5InUmmKCfC2SZWTsziwGGA1MqjZkCjAycfZMDFDvntphZqpk1BTCzRsD5wIraK1+k5hLjornv0i7kfbOHp75c7XU5InXmhEHvnCsDxgGfAsuBt5xzS81stJmNDgz7GFgD5APPAGMC21sB08xsERW/MKY65z6s5fcgUmPnd27Jxd1a8bfP88nfpuWMJTJZKH4dPDs72+Xm5npdhjQQRSUHOf+hL+mQmsBbN/fH56vqIyeR0GZm85xz2VXt0zdjpcFLSYjlrv/pTO76nbwye73X5YjUOgW9CHBZrzTOzErh/r+vYNMurXApkUVBLwKYGX8a2pVyB3dNWqwVLiWiKOhFAto0b8wdF3RkWl4hUxZWPoNYJHwp6EWOcv3AdnRv05R7P1jGDq1wKRFCQS9yFL/PuP+yruzef4g/fLjM63JEaoWCXqSS005pwphBmUxasIkv8rZ5XY7ISVPQi1Rh7LkdyEyN5zeTlmiFSwl7CnqRKsRG+bn/sm5sLt7PXz/N87ockZOioBc5huyM5lyTcyovzVzHvPVa4VLCl4Je5Dj+d/BpnNIkjjvfXaQVLiVsKehFjiMhNoo/Du3Cqm0lPPFFvtfliNSIgl7kBM49rSWXdG/N+Gn5rPxGK1xK+FHQiwThnh92JiE2il++u4jD5VoeQcKLgl4kCMkJsdx9cWcWbNjFyzPXeV2OSLUo6EWCNLRnGmd1TOWBT/Mo2LnP63JEgqagFwlSxQqXFde2/82kJVrhUsKGgl6kGtKbNeYXP+jElysLmfy1VriU8KCgF6mmkf0z6NGmKfd+sJTtJQe9LkfkhBT0ItXk9xkPXN6NkoNl/F4rXEoYUNCL1EDHlomMGdSByV9v5vMV33hdjshxKehFamjMOZlktUjgrklLKNEKlxLCFPQiNRQb5ecvl3Vjy+4D/PWTFV6XI3JMCnqRk9D71GZc2z+DibPWM2/9Dq/LEamSgl7kJP38B51ondSIX767mINlh70uR+S/KOhFTlJCbBT3De1C/rYSxk9b7XU5Iv9FQS9SC87p1IJLe7TmyS/yyduqFS4ltCjoRWrJb394Bolx0VrhUkKOgl6kljSPj+GeH3bm6427eGnGOq/LETlCQS9Siy7p3ppBnVL566d5bNyhFS4lNCjoRWqRmfHHoV3xGfx60mKtcCkhQUEvUsvSmjbifwefxr9XFfHe/E1elyOioBepC9fknEqvtk35w0fLKNIKl+IxBb1IHfD5jPsv68a+g4e59wOtcCneUtCL1JGslomMPacDHyzczD+Xa4VL8Y6CXqQO3TIok44tE7jr/SXsOXDI63KkgVLQi9ShmCgff7msG1t3H+CBT/K8LkcaqKCC3swGm1memeWb2Z1V7Dczeyywf5GZ9Qpsb2Nm08xsuZktNbPba/sNiIS6Xm2bcd2ADF6etZ6567TCpdS/Ewa9mfmB8cCFQGdghJl1rjTsQiArcBsFPBnYXgbc4Zw7HcgBxlbxWJGI9/MLOpHWtBG/fHcRBw5phUupX8Ec0fcF8p1za5xzpcAbwJBKY4YAE12FWUBTM2vlnNvinJsP4JzbAywH0mqxfpGwEB8bxZ9+1JU1hXv588fL9UUqqVfBBH0asPGo+wX8d1ifcIyZZQA9gdlVvYiZjTKzXDPLLSwsDKIskfBydsdUfjKwHS/NXM/vP1ymsJd6ExXEGKtiW+X/Q487xswSgHeBnzrndlf1Is65CcAEgOzsbP0ESES6++LTcThemL6OQ4fL+f0lXfD5qvrxEak9wQR9AdDmqPvpwOZgx5hZNBUh/6pz7r2alyoS/syM317cmZgoH09/uYZDZY4//6irwl7qVDBBPxfIMrN2wCZgOHBVpTFTgHFm9gbQDyh2zm0xMwOeA5Y75x6qxbpFwpaZcefg04jx+3j883wOlZfz18u741fYSx05YdA758rMbBzwKeAHnnfOLTWz0YH9TwEfAxcB+cA+4PrAwwcC1wCLzezrwLZfO+c+rtV3IRJmzIw7LuhEtN/HQ1NXUnbY8dCV3Yny66stUvuCOaInEMwfV9r21FH/7oCxVTzuK6ru34sIcNt5WUT5jQc+yaOsvJxHh/ckWmEvtSyooBeRujNmUAdi/D7u+2g5hw7P529X9SQ2yu91WRJBdOggEgJuPLM9vx9yBlOXfcPol+fpS1VSqxT0IiFiZP8M/jS0K9PyCrlpYi77SxX2UjsU9CIh5Kp+bXng8m58lV/ET16cy77SMq9LkgigoBcJMVdmt+GhK7sze+12rn1+DiUHFfZychT0IiFoaM90HhvRk/kbdnHNc7PZrbXs5SQo6EVC1MXdWjP+ql4s2VTM1c/OZte+Uq9LkjCloBcJYYO7nMKTP+7Nii17uOqZ2ezYq7CX6lPQi4S48zu3ZMLI3qwuLGHEhFkUlRz0uiQJMwp6kTAwqFMLnr+uD+t37GX4hFls233A65IkjCjoRcLEwA4pvHh9Xzbv2s+wCbPYUrzf65IkTCjoRcJITvtkJv6kL4V7DjLs6VkU7NzndUkSBhT0ImEmO6M5r9zYj537Shn29Cw2bFfYy/Ep6EXCUI82TXn9phz2lpYxbMJM1hbt9bokCWEKepEw1SUtidduzOFgWTnDnp5J/rY9XpckIUpBLxLGOrduwhujcih3MHzCLPK2KuzlvynoRcJcx5aJvHlzDn6fMXzCTJZuLva6JAkxCnqRCJCZmsCbo/rTKNrPVc/MZnGBwl7+Q0EvEiEyUuJ58+b+JMRGcdWzs5i/YafXJUmIUNCLRJA2zRvz1uj+NI+PYeRzc5i7bofXJUkIUNCLRJi0po14c1R/WiTGcu3zc5i5ervXJYnHFPQiEeiUpDjeuDmHtKaNuP7FOXy1qsjrksRDCnqRCNUiMY7XR+WQkRzPT16ay7S8bV6XJB5R0ItEsJSEWF6/KYesFgncPHEeU5d943VJ4gEFvUiEaxYfw2s35nB6q0RueWUef1+8xeuSpJ4p6EUagKTG0bx8Yz+6pScx7vUFTFm42euSpB4p6EUaiCZx0Uy8oR+9T23GT99YwP2frNB1aBsIBb1IA5IQG8WL1/fhku6teerL1Xzv/mk8NHUlxfsPeV2a1CFzznldw3/Jzs52ubm5XpchEtFWbN3No5+t4u9LttIkLoqbzmzPdQMzSIyL9ro0qQEzm+ecy65yn4JepGFburmYRz5bxdRl39C0cTSjzmrPtf0ziI+N8ro0qQYFvYic0KKCXTw8dSXT8gppHh/D6LPbc01OBo1i/F6XJkFQ0ItI0OZv2MnDU1fy71VFpCTEMmZQJlf1a0tctAI/lCnoRaTa5q7bwcNTVzJj9XZaNoll7DkdGNanDbFRCvxQpKAXkRqbuXo7D09dyZx1O2idFMfYcztwRe82xETppL1QoqAXkZPinGN6/nYenJrHgg27SG/WiNvOzWJorzSi/Qr8UKCgF5Fa4Zzjy5WFPDx1JQsLijk1uTG3nZvFkB6tiVLge+p4Qa+ZEZGgmRmDOrXg/bEDee7abBJio7jj7YVc8PC/mPz1Jg6Xh96BowQZ9GY22MzyzCzfzO6sYr+Z2WOB/YvMrNdR+543s21mtqQ2CxcR75gZ553ekg9v/R5PXd2bmCgft7/xNYMf+RcfLdpCuQI/pJww6M3MD4wHLgQ6AyPMrHOlYRcCWYHbKODJo/a9CAyujWJFJLSYGYO7nMLHt53J+Kt64YCxr83nosf+zSdLthKKreGGKJgj+r5AvnNujXOuFHgDGFJpzBBgoqswC2hqZq0AnHP/AnThSpEI5vMZ/9OtFZ/+9CweHd6D0rJyRr8yj4sf/4rPln2jwPdYMEGfBmw86n5BYFt1xxyXmY0ys1wzyy0sLKzOQ0UkRPh9xpAeafzjZ2fx4BXd2XOgjBsn5nLp+Ol8kbdNge+RYILeqthWebaCGXNczrkJzrls51x2ampqdR4qIiEmyu/jst7p/POOs3ngsm4UlZRy3QtzuezJGXy1qkiBX8+CCfoCoM1R99OBylctCGaMiDQw0X4fV/Zpw7SfD+KPQ7uwpfgAVz83m2FPz2Lm6u1el9dgBBP0c4EsM2tnZjHAcGBKpTFTgJGBs29ygGLnnK5XJiIAxET5+HG/U/niF4P4/ZAzWLd9LyOemcXwCTN5eeY68reV6Ci/Dp1wHVLnXJmZjQM+BfzA8865pWY2OrD/KeBj4CIgH9gHXP/t483sdWAQkGJmBcA9zrnnavuNiEjoi43yM7J/Bldmt+G12Rt47qu13D15KQAtEmMZkJnMgMwU+mcm06Z5Y4+rjRz6ZqyIeMY5x/rt+5i5ZjszVm9n5uoiikoqLm+Y3qzRd4K/ZZM4j6sNbVoCQUTCgnOOVdtKmLl6OzNWFzFrzY4jlzlsnxp/JPhz2ifTPD7G42pDi4JeRMLS4XLH8i27jwT/nLU72Ft6GIDTTklkQGYKAzKT6du+OU0a+CUQFfQiEhEOHS5nUUExs9ZUBH/uup0cLCvHZ9A1LYn+geDPzmhG45iGdSlEBb2IRKQDhw6zYMMuZq6p6O8v2LCLsnJHtN/o2aYZOZnJDMhMpmfbphF/wRQFvYg0CHsPlpG7ficzAx/sLt5UTLmD2CgffTKa0z8zmf6ZyXRLS4q4ZZWPF/QN628bEYlo8bFRnN0xlbM7Vny7vnj/Ieas3cGM1UXMXL2dv36aB0BCbBR92zVnQGYyOe2T6dyqCT5fVV/wjwwKehGJWEmNovl+55Z8v3NLALaXHGTWmkDwr9nO5yu2AZAYG0WHlglkplbcOrRIIDM1nrbNG0fEkb9aNyLSYG0tPsDMNUXMW7+T1dv2srqwhG17Dh7ZH+03Tk2Op0NqApkt4o/8IshskUBCbGgdJ6tHLyISpOL9h1hTWMLqworgX72thPzCEtZv3/edK2id0iTuSPhX/AVQcWvZJBaz+m8DqUcvIhKkpEbR9GzbjJ5tm31ne2lZORt27KsI/8IS8rdV/DKYNH8Tew6WHRkXH+Mns0VC4K+AihZQZmoCpybHExPlTRtIQS8iEoSYKB8dWlQcvR/NOUfhnoOB4P/PXwIz12znvQWbjozz+4xTmzemfaAN9J9fBAkkNarbL3sp6EVEToKZ0aJJHC2axDGgQ8p39pUcLGPtty2gI38FlPCvlYWUHi4/Mi4lIZbM1Hg6nZLIvZecUeutHwW9iEgdSYiNomt6El3Tk76zvexwOQU7938n/FcX7mVhQXGd9PcV9CIi9SzK7yMjJZ6MlHjOO71lnb9e+J8gKiIix6WgFxGJcAp6EZEIp6AXEYlwCnoRkQinoBcRiXAKehGRCKegFxGJcCG5eqWZFQLrj9qUBBRXMbSq7SlAUR2VVh3Hqrk+n6s6jwtm7InGVGeejrU9EufvZJ6vPuewuvs0f7X7uJOdv1Odc6lV7nHOhfwNmBDsdiDX63qPV3N9Pld1HhfM2BONqc48NaT5C5c5rO4+zV9ozd/xbuHSuvmgmttDQW3WVtPnqs7jghl7ojHVnaeGMn8n83z1OYfV3af5q93H1cbPYJVCsnVzMsws1x1j8X0JfZq/8Kb5C03hckRfHRO8LkBOiuYvvGn+QlDEHdGLiMh3ReIRvYiIHEVBLyIS4RT0IiIRLqKD3szizewlM3vGzH7sdT1SfWbW3syeM7N3vK5Fqs/MLg38/E02swu8rqehCrugN7PnzWybmS2ptH2wmeWZWb6Z3RnY/CPgHefcTcAl9V6sVKk6c+icW+Ocu8GbSqUq1Zy/9wM/f9cBwzwoVwjDoAdeBAYfvcHM/MB44EKgMzDCzDoD6cDGwLDD9VijHN+LBD+HEnpepPrzd1dgv3gg7ILeOfcvYEelzX2B/MDRXynwBjAEKKAi7CEM32ukquYcSoipzvxZhfuBvzvn5td3rVIhUsIvjf8cuUNFwKcB7wGXmdmThPbXteUYc2hmyWb2FNDTzH7lTWkShGP9DN4KnA9cbmajvShMIMrrAmqJVbHNOef2AtfXdzFSI8eaw+2AAiL0HWv+HgMeq+9i5Lsi5Yi+AGhz1P10YLNHtUjNaA7Dm+YvhEVK0M8FssysnZnFAMOBKR7XJNWjOQxvmr8QFnZBb2avAzOBTmZWYGY3OOfKgHHAp8By4C3n3FIv65Rj0xyGN81f+NGiZiIiES7sjuhFRKR6FPQiIhFOQS8iEuEU9CIiEU5BLyIS4RT0IiIRTkEvIhLhFPQiIhFOQS8iEuH+P8IbSt4ZQ+FNAAAAAElFTkSuQmCC\n",
|
||
"text/plain": [
|
||
"<Figure size 432x288 with 1 Axes>"
|
||
]
|
||
},
|
||
"metadata": {
|
||
"needs_background": "light"
|
||
},
|
||
"output_type": "display_data"
|
||
}
|
||
],
|
||
"source": [
|
||
"import matplotlib.pyplot as plt\n",
|
||
"plt.plot(ks, ts)\n",
|
||
"plt.xscale('log')"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Modern CPUs do not access memory byte by byte. Instead, they fetch memory in chunks of (typically) 64 bytes, called cache lines. When you read a particular memory location, the entire cache line is fetched from the main memory into the cache. And, accessing other values from the same cache line is cheap!\n",
|
||
"\n",
|
||
"Since 16 ints take up 64 bytes (one cache line), for-loops with a step between 1 and 16 have to touch the same number of cache lines: all of the cache lines in the array. But once the step is 32, we’ll only touch roughly every other cache line, and once it is 64, only every fourth."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"More: http://igoro.com/archive/gallery-of-processor-cache-effects/"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## More performance tips"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"https://shihchinw.github.io/2019/03/performance-tips-of-numpy-ndarray.html"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## JIT"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"JIT = Just-In-Time compilation"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 38,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"def matmul(a, b):\n",
|
||
" n = a.shape[0]\n",
|
||
" k = a.shape[1]\n",
|
||
" m = b.shape[1] \n",
|
||
" c = np.zeros((n, m))\n",
|
||
" for i in range(n):\n",
|
||
" for j in range(m):\n",
|
||
" for s in range(k):\n",
|
||
" c[i, j] += a[i, s] * b[s, j]\n",
|
||
" \n",
|
||
" return c"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 39,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"from numba import jit\n",
|
||
"\n",
|
||
"@jit(nopython=True)\n",
|
||
"def numba_matmul(a, b):\n",
|
||
" n = a.shape[0]\n",
|
||
" k = a.shape[1]\n",
|
||
" m = b.shape[1]\n",
|
||
" c = np.zeros((n, m))\n",
|
||
" for i in range(n):\n",
|
||
" for j in range(m):\n",
|
||
" for s in range(k):\n",
|
||
" c[i, j] += a[i, s] * b[s, j]\n",
|
||
" return c"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"Let us compare computational times."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 40,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"392 ms ± 33.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)\n",
|
||
"The slowest run took 162.76 times longer than the fastest. This could mean that an intermediate result is being cached.\n",
|
||
"23.4 ms ± 55 ms per loop (mean ± std. dev. of 7 runs, 2 loops each)\n",
|
||
"59.4 µs ± 19.2 µs per loop (mean ± std. dev. of 7 runs, 4 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"n = 100\n",
|
||
"a = np.random.randn(n, n)\n",
|
||
"b = np.random.randn(n, n)\n",
|
||
"\n",
|
||
"%timeit -n 1 matmul(a, b)\n",
|
||
"%timeit -n 2 numba_matmul(a, b)\n",
|
||
"%timeit -n 4 np.dot(a, b)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Enter JAX"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"JAX $\\approx$ Just After eXecution"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 41,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"import jax.numpy as jnp\n",
|
||
"import numpy as np"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 42,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [],
|
||
"source": [
|
||
"size = (1000, 1000)\n",
|
||
"a = np.random.normal(size=size)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 43,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"17.5 ms ± 61 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit -n 10 np.sum(a**3)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 44,
|
||
"metadata": {},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"1.55 ms ± 120 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit -n 10 np.sum(a*a*a)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 45,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stderr",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"WARNING:absl:No GPU/TPU found, falling back to CPU. (Set TF_CPP_MIN_LOG_LEVEL=0 and rerun for more info.)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"a = jnp.asarray(a)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 46,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"544 µs ± 5.9 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"%timeit jnp.sum(a**3).block_until_ready()"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Simply replacing numpy with jax.numpy, we got a 20x speed-up. \n",
|
||
"jax.numpy can be even faster than pure C code due to very clever optimizations. We will discuss jax in more details in Lecture 7.\n",
|
||
"Meanwhile: https://github.com/google/jax/discussions/11078"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Part II: Vector and matrix norms. Orthogonal matrices."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Vector and matrix norms. Orthonogal matrices. "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Reading\n",
|
||
"- Trefethen NLA, Lectures 1 - 3\n",
|
||
"- Tyrtyshnikov, Lecture 1"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Notations\n",
|
||
"\n",
|
||
"We use notation (for matrix of $m$ rows and $n$ columns) \n",
|
||
"\n",
|
||
"$$A= \\begin{bmatrix} a_{11} & \\dots & a_{1n} \\\\ \\vdots & \\ddots & \\vdots \\\\ a_{m1} & \\dots & a_{mn} \\end{bmatrix}.$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Matrix-vector multiplication "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Let matrix $A$ be composed from columns $a_i$: $A = \\bigg[a_1\\bigg|a_2\\bigg|...\\bigg|a_n\\bigg]$ \n",
|
||
"\n",
|
||
"Then matrix-vector multiplication can be understood as follows:\n",
|
||
"$$\n",
|
||
"Ax = \\bigg[a_1\\bigg|a_2\\bigg|...\\bigg|a_n\\bigg]x = x_1 \\bigg[a_1\\bigg] + x_2 \\bigg[a_2\\bigg] + ... + x_n \\bigg[a_n\\bigg],\n",
|
||
"$$\n",
|
||
"that is column vector $b=Ax$ is a linear combination of columns of $A$."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Matrix-vector multiplication: complexity\n",
|
||
"\n",
|
||
"Multiplication of an $n\\times n$ matrix $A$ by a vector $x$ of size $n\\times 1$ ($y=Ax$):\n",
|
||
"\n",
|
||
"$$\n",
|
||
"y_{i} = \\sum_{i=1}^n a_{ij} x_j\n",
|
||
"$$\n",
|
||
"\n",
|
||
"requires $n^2$ mutliplications and $n(n-1)$ additions. Thus, the overall complexity is $\\mathcal{O}(n^2)$ "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Matrix-matrix product\n",
|
||
"A product of an $n \\times k$ matrix $A$ and a $k \\times m$ matrix $B$ is a $n \\times m$ matrix $C$ with the elements \n",
|
||
"$$\n",
|
||
" c_{ij} = \\sum_{s=1}^k a_{is} b_{sj}, \\quad i = 1, \\ldots, n, \\quad j = 1, \\ldots, m \n",
|
||
"$$\n",
|
||
"For $m=k=n$ complexity of a naive algorithm is $\\mathcal{O}(n^3)$. \n",
|
||
"\n",
|
||
"The rather elementary https://en.wikipedia.org/wiki/Strassen_algorithm allows to achieve $\\approx O(N^{2.81})$. Theoretically even faster algorithms (the best is $\\mathcal{O}(n^{2.37})$) exist, see a popular article https://www.quantamagazine.org/mathematicians-inch-closer-to-matrix-multiplication-goal-20210323/\n"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Range and nullspace "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- range(A) is the set of vectors, which can be expressed as $Ax$\n",
|
||
"- range(A) is the space spanned by the columns of $A$\n",
|
||
"- null(A) is the set of $x$ such that $Ax=0$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Column and row rank "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- column rank is dimension of the column space\n",
|
||
"- row rank is dimension of the row space\n",
|
||
"- column rank is always equal to row rank"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Full rank matrix and inverse"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- The highest rank for a $m\\times n$ matrix is $\\min(m, n)$\n",
|
||
"- The $m\\times n$ matrix of a rank $\\min(m, n)$ is called full rank matrix\n",
|
||
"- Square $m\\times m$ matrix of a full rank is called non-singular matrix\n",
|
||
"- Non-singular matrix has an inverse $A^{-1}$ such that $A A^{-1} = A^{-1} A = 1$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Inverse matrix times a vector "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Geometric interpretation: Consider $x=A^{-1}b$. Its equivalent to $b=Ax$, hence $x$ is a vector of coefficients of expansion of $b$ over columns of $A$.\n",
|
||
"In practice, if your algorithm needs to compute a matrix inverse, you are probably doing it wrong (for example, applying inverse of a matrix to a vector is better implemented as solution of the linear system)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Vector norm\n",
|
||
"\n",
|
||
"- The notions of size and distance in vector space are captured by norms \n",
|
||
"\n",
|
||
"- Norm is a measure of magnitude of a vector, denoted as $\\Vert x \\Vert$.\n",
|
||
"\n",
|
||
"The norm should satisfy certain properties:\n",
|
||
"\n",
|
||
"- $\\Vert \\alpha x \\Vert = |\\alpha| \\Vert x \\Vert$\n",
|
||
"- $\\Vert x + y \\Vert \\leq \\Vert x \\Vert + \\Vert y \\Vert$ (triangle inequality)\n",
|
||
"- If $\\Vert x \\Vert = 0$ then $x = 0$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"The distance between two vectors is then defined as\n",
|
||
"\n",
|
||
"$$ d(x, y) = \\Vert x - y \\Vert. $$\n",
|
||
"\n",
|
||
"The most well-known and widely used norm is euclidean norm:\n",
|
||
"\n",
|
||
"$$\\Vert x \\Vert_2 = \\sqrt{\\sum_{i=1}^n |x_i|^2},$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## $p$-norm\n",
|
||
"Euclidean norm, or $2$-norm, is a subclass of an important class of $p$-norms:\n",
|
||
"\n",
|
||
"$$ \\Vert x \\Vert_p = \\Big(\\sum_{i=1}^n |x_i|^p\\Big)^{1/p}. $$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Equivalence of the norms\n",
|
||
"All norms are equivalent in the sense that\n",
|
||
"\n",
|
||
"$$ C_1 \\Vert x \\Vert_* \\leq \\Vert x \\Vert_{**} \\leq C_2 \\Vert x \\Vert_* $$ \n",
|
||
"\n",
|
||
"for some positive constants $C_1(n), C_2(n)$, $x \\in \\mathbb{R}^n$ for any pairs of norms $\\Vert \\cdot \\Vert_*$ and $\\Vert \\cdot \\Vert_{**}$. The equivalence of the norms means that if the vector is small in one norm, it is small in another norm. However, the constants can be large."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Computing norms in Python"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "code",
|
||
"execution_count": 47,
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"outputs": [
|
||
{
|
||
"name": "stdout",
|
||
"output_type": "stream",
|
||
"text": [
|
||
"Relative error in L1 norm: 0.0007970883810244821\n",
|
||
"Relative error in L2 norm: 0.001010323277420861\n",
|
||
"Relative error in L_{inf} norm: 0.0033282301825079508\n"
|
||
]
|
||
}
|
||
],
|
||
"source": [
|
||
"import numpy as np\n",
|
||
"n = 100\n",
|
||
"a = np.ones(n)\n",
|
||
"b = a + 1e-3 * np.random.randn(n)\n",
|
||
"print('Relative error in L1 norm:', np.linalg.norm(a - b, 1) / np.linalg.norm(b, 1))\n",
|
||
"print('Relative error in L2 norm:', np.linalg.norm(a - b) / np.linalg.norm(b))\n",
|
||
"print('Relative error in L_{inf} norm:', np.linalg.norm(a - b, np.inf) / np.linalg.norm(b, np.inf))"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Why $L_1$-norm can be important?\n",
|
||
"\n",
|
||
"$L_1$ norm has sparsifying property. Consider underdetermined system of equations:\n",
|
||
"\n",
|
||
"$$Ax = f,$$ where $A$ is an $n \\times m$ matrix, $A$ is known and $n\\ll m$.\n",
|
||
"\n",
|
||
"The question: can we find the solution $x$?"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Solution is not unique; a natural approach is to find the solution that is 'minimal' in some sense:\n",
|
||
"\n",
|
||
"\\begin{align*}\n",
|
||
"& \\Vert x \\Vert \\rightarrow \\min_x \\mbox{ subject to } Ax = f\n",
|
||
"\\end{align*}\n",
|
||
"\n",
|
||
"- Typical choice of $\\Vert x \\Vert = \\Vert x \\Vert_2$ leads to the solution vector of a minimal length \n",
|
||
"\n",
|
||
"- The choice $\\Vert x \\Vert = \\Vert x \\Vert_1$ typically yields the most sparse solution vector"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Matrix norms\n",
|
||
"$\\Vert \\cdot \\Vert$ is called a matrix norm if it is a vector norm on the vector space of $m \\times n$ matrices:\n",
|
||
"1. $|A| \\geq 0$ and if $|A| = 0$ then $A = 0$\n",
|
||
"2. $|\\alpha A| = |\\alpha| |A|$\n",
|
||
"3. $|A+B| \\leq |A| + |B|$ (triangle inequality)"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"### Example: Frobenius norm"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"$$\n",
|
||
"\\|A\\|_{\\text{F}}={\\sqrt {\\sum _{i=1}^{m}\\sum _{j=1}^{n}|a_{ij}|^{2}}}={\\sqrt {\\operatorname {trace} \\left(A^{*}A\\right)}}\n",
|
||
"$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Induced norms\n",
|
||
"\n",
|
||
"- The most important class of the matrix norms is the class of induced norms. They are defined as\n",
|
||
"\n",
|
||
"$$ \\Vert A \\Vert_{*} = \\sup_{x \\ne 0} \\frac{\\Vert A x \\Vert_*}{\\Vert x \\Vert_*}, $$\n",
|
||
"\n",
|
||
"where $\\Vert \\cdot \\Vert_*$ is a certain vector norm.\n",
|
||
"\n",
|
||
"- Frobenius norm is a matrix norm, but not an induced norm, i.e. you can not find $\\Vert \\cdot \\Vert_*$ which induces it."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Induced $p$-norms\n",
|
||
"\n",
|
||
"Important case of operator norms are matrix $p$-norms, which are defined for $\\|\\cdot\\|_* = \\|\\cdot\\|_p$. <br>\n",
|
||
"\n",
|
||
"Among all $p$-norms three norms are the most common ones: \n",
|
||
"\n",
|
||
"- $p = 1, \\quad \\Vert A \\Vert_{1} = \\displaystyle{\\max_j \\sum_{i=1}^n} |a_{ij}|$.\n",
|
||
"\n",
|
||
"- $p = 2, \\quad$ spectral norm, denoted by $\\Vert A \\Vert_2$.\n",
|
||
"\n",
|
||
"- $p = \\infty, \\quad \\Vert A \\Vert_{\\infty} = \\displaystyle{\\max_i \\sum_{j=1}^m} |a_{ij}|$."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Spectral norm\n",
|
||
"\n",
|
||
"- Spectral norm, $\\Vert A \\Vert_2$ is one of the most used matrix norms (along with the Frobenius norm). \n",
|
||
"- It can not be computed directly from the entries using a simple formula, like the Frobenius norm, however, there are efficient algorithms to compute it. \n",
|
||
"- It is directly related to the singular value decomposition (SVD) of the matrix (next lectures)."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"# Orthogonal vectors and matrices "
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- A pair of vectors $x$ and $y$ is orthogonal if $x^*y=0$\n",
|
||
"- A set of non-zero vectors is orthogonal if its elements are pairwise orthogonal\n",
|
||
"- A set of non-zero vectors is orthonormal if it is orthogonal and all vectors have unit length $x^* x = 1$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- A square complex matrix $Q$ is called unitary (orthogonal in real case) if $Q^* Q = 1$ \n",
|
||
"- In terms of columns: $q_i^*q_j=\\delta_{ij}$ and the columns of $Q$ form an orthonormal basis\n",
|
||
"- Unitary matrices preserve vector length:\n",
|
||
"$$\n",
|
||
"||Qx||_2 = ||x||_2\n",
|
||
"$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"- For rectangular matrices of size $m\\times n$ ($n\\not= m$) only one of the equalities can hold\n",
|
||
"\n",
|
||
" - $ Q^*Q = I_n$ – left unitary for $m>n$\n",
|
||
" - $ QQ^* = I_m$ – right unitary for $m<n$\n",
|
||
"\n",
|
||
"- In the case of real matrices $Q^* = Q^T$ and matrices such that\n",
|
||
"\n",
|
||
"$$ Q^TQ = QQ^T = I $$\n",
|
||
"\n",
|
||
"are called orthogonal."
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "fragment"
|
||
}
|
||
},
|
||
"source": [
|
||
"Important property: a product of two unitary matrices is a unitary matrix: \n",
|
||
"\n",
|
||
"$$(UV)^* UV = V^* (U^* U) V = V^* V = I,$$"
|
||
]
|
||
},
|
||
{
|
||
"cell_type": "markdown",
|
||
"metadata": {
|
||
"slideshow": {
|
||
"slide_type": "slide"
|
||
}
|
||
},
|
||
"source": [
|
||
"## Unitary invariance of $\\|\\cdot\\|_2$ and $\\|\\cdot\\|_F$ norms\n",
|
||
"\n",
|
||
"- For vector 2-norm we have already discussed that $\\Vert U z \\Vert_2 = \\Vert z \\Vert_2$ for any unitary $U$.\n",
|
||
"\n",
|
||
"- One can show that unitary matrices also do not change matrix norms $\\|\\cdot\\|_2$ and $\\|\\cdot\\|_F$, i.e. for any square $A$ and unitary $U$,$V$: \n",
|
||
"\n",
|
||
"$$ \\| UAV\\|_2 = \\| A \\|_2 \\qquad \\| UAV\\|_F = \\| A \\|_F.$$"
|
||
]
|
||
}
|
||
],
|
||
"metadata": {
|
||
"celltoolbar": "Слайд-шоу",
|
||
"kernelspec": {
|
||
"display_name": "Python 3 (ipykernel)",
|
||
"language": "python",
|
||
"name": "python3"
|
||
},
|
||
"language_info": {
|
||
"codemirror_mode": {
|
||
"name": "ipython",
|
||
"version": 3
|
||
},
|
||
"file_extension": ".py",
|
||
"mimetype": "text/x-python",
|
||
"name": "python",
|
||
"nbconvert_exporter": "python",
|
||
"pygments_lexer": "ipython3",
|
||
"version": "3.9.12"
|
||
}
|
||
},
|
||
"nbformat": 4,
|
||
"nbformat_minor": 4
|
||
}
|