"The method shown before finds the neural network parameters which maximize the log-likelihood of the data. But not all parameters are equally likely and we can estimate an uncertainty for them.\n",
"The method shown before finds the neural network parameters which maximize the log-likelihood of the data. But not all parameters are equally likely and we can estimate an uncertainty for them.\n",
"\n",
"\n",
...
...
%% Cell type:markdown id:1bba0128 tags:
# Supervised regression
This is an example of how to build and optimize neural networks with PyTorch, with the objective of predicting a known feature in the training dataset.

The logic here is very similar to the classification problem and the code is also very close, but the final objective is to minimize the error in the prediction of the missing feature. This is achieved by minimizing the mean squared error between the prediction and the target feature. As noted during the presentation, by minimizing the mean squared error we assume that the underlying error distribution is Gaussian. One could use the mean absolute error instead when assuming the distribution is Laplacian.
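As a small illustration (not part of the original notebook), both loss choices are available in PyTorch; the tensors below are made up for the example:

``` python
import torch
import torch.nn.functional as F

prediction = torch.tensor([1.0, 2.0, 3.0])
target = torch.tensor([1.5, 1.5, 3.5])

mse = F.mse_loss(prediction, target)  # assumes Gaussian errors
mae = F.l1_loss(prediction, target)   # assumes Laplacian errors
print(mse.item(), mae.item())
```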
Collecting torchbnn
  Downloading torchbnn-1.2-py3-none-any.whl (12 kB)
Installing collected packages: torchbnn
Successfully installed torchbnn-1.2
%% Cell type:code id:23feddde tags:
``` python
%matplotlib notebook

from typing import Tuple

# import standard PyTorch modules
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchbnn as bnn

# import torchvision module to handle image manipulation
import torchvision
import torchvision.transforms as transforms

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from mpl_toolkits.mplot3d import axes3d
```
%% Cell type:markdown id:bb1286f0 tags:
We start by generating a synthetic dataset which is simple enough that we can visualize the results easily. For this reason, the dataset will contain a single independent variable and one dependent feature, which we want to predict for the test data.

The simulated example data will be $f(x) = (3 + \kappa) x^2 + \epsilon$, where $\epsilon \sim \mathcal{N}(\mu=0, \sigma=10)$ and $\kappa \sim \mathcal{N}(\mu=0, \sigma=0.03)$.

In this case we do know the true model, so it is interesting to take some time to pinpoint the roles of $\kappa$ and $\epsilon$. Both variables add fluctuations to the results: $\epsilon$ adds Gaussian noise that is completely independent of $x$ and cannot be traced back to a particular functional dependence, while $\kappa$ changes a specific parameter of the model, in this case the coefficient 3, by around 1%.
When fitting a model, the term *epistemic uncertainty* is often used to refer to uncertainties coming from effects related to different functional models. That is, one can imagine that there are different functions that may fit the data due to the effect of $\kappa$, such as $g(x) = 3x^2$ or $h(x) = 2.95x^2$.

The term *aleatoric uncertainty* is used to refer to whatever uncertainty cannot be traced back to a given model dependence. In this example, different constant offsets could be added to the model $g$ to account for the fluctuations caused by $\epsilon$.

We will come back to these two effects later on, when we discuss Bayesian Neural Networks, so that we can predict the effect of those uncertainties.
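The cell that generates the data is not reproduced above. A minimal sketch of what it could look like, assuming the function name `generate_data` used later in the notebook and an arbitrary range for $x$:

``` python
import numpy as np

def generate_data(N: int = 1000) -> dict:
    """Generate the synthetic dataset f(x) = (3 + kappa) * x**2 + epsilon."""
    x = np.random.uniform(-10.0, 10.0, size=(N, 1)).astype(np.float32)  # assumed x range
    kappa = np.random.normal(0.0, 0.03, size=(N, 1)).astype(np.float32)
    epsilon = np.random.normal(0.0, 10.0, size=(N, 1)).astype(np.float32)
    return {"data": x, "target": (3.0 + kappa) * x**2 + epsilon}
```

The dictionary layout mirrors the `"data"`/`"target"` keys used by the training loop below.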
PyTorch allows you to create a class that outputs a single data entry and use it to feed input to your neural network. We will use such a class to feed the data to the neural network. This looks pointless if all your data fits in a NumPy array, but if you have a lot of data and cannot load it all in memory, this approach lets you read data on demand, so that only the needed samples are kept in memory at any given time.
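A minimal sketch of such a dataset class, assuming the `"data"`/`"target"` dictionary layout used by the mini-batches in the training loop:

``` python
import numpy as np
import torch
from torch.utils.data import Dataset

class RegressionDataset(Dataset):
    """Serves one sample at a time, as required by PyTorch's DataLoader."""

    def __init__(self, data: np.ndarray, target: np.ndarray):
        self.data = torch.as_tensor(data, dtype=torch.float32)
        self.target = torch.as_tensor(target, dtype=torch.float32)

    def __len__(self) -> int:
        return len(self.data)

    def __getitem__(self, idx: int) -> dict:
        # only the requested sample is returned; with data on disk, this is
        # where one would read it on demand
        return {"data": self.data[idx], "target": self.target[idx]}
```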
And now let us define the neural network. In PyTorch, neural networks always extend `nn.Module`. They define their sub-parts in their constructor, which in this case are fully connected linear layers, and the method `forward` is expected to receive the input features and output the network target.

The network parameters are the weights of the `Linear` layers, which are conveniently hidden here, but can be accessed through their `weight` attributes, as in the short example below.

Unlike in the classification example, there are no label probabilities or logits here: the network directly outputs the predicted value of the target feature.
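For instance, the `weight` attribute of a single `Linear` layer can be inspected like this (a standalone example, not part of the original notebook):

``` python
import torch.nn as nn

layer = nn.Linear(in_features=1, out_features=10)
print(layer.weight.shape)  # torch.Size([10, 1])
print(layer.bias.shape)    # torch.Size([10])

# any nn.Module can also list all of its trainable parameters:
for name, parameter in layer.named_parameters():
    print(name, tuple(parameter.shape))
```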
%% Cell type:code id:d908ef86 tags:
``` python
class Network(nn.Module):
    """
    This is our parametrized function.
    It stores all the parametrized weights theta inside the model object.
    For such simple example data, it was not necessary to have such a complex model:
    this was only done here to show the interface provided by PyTorch.
    """

    def __init__(self):
        super().__init__()
        # The nn.Sequential object allows one to apply each step in the given list of layers
        # in a sequential way. An alternative would be to create each of these layers manually
        # and apply them one after the other in the forward method.
        # NOTE: the original cell was truncated; the layer sizes below are an assumption.
        self.model = nn.Sequential(
            nn.Linear(1, 50),
            nn.ReLU(),
            nn.Linear(50, 50),
            nn.ReLU(),
            nn.Linear(50, 1),
        )

    def forward(self, x):
        """
        The forward function receives the x values and outputs an estimate of the target.
        This function is called when one does my_network(x) and it represents the action
        of our parametrized function on the input.
        """
        return self.model(x)
```
%% Cell type:markdown id:9c5620dc tags:
Let us create one instance of this network. We also create an instance of PyTorch's `DataLoader`, which has the task of taking a given number of data elements and outputting them as a single object. This "mini-batch" of data is used during training, so that we do not need to load the entire dataset into memory during the optimization procedure.

We also create an instance of the Adam optimizer, which is used to tune the parameters of the network. The instantiation could look like the sketch below.
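The instantiation cell itself is not shown above; this sketch is consistent with the names used in the rest of the notebook (`network`, `loader`, `optimizer`, `my_dataset`), with an assumed batch size `B` and learning rate, and relies on the `generate_data` and `RegressionDataset` sketches given earlier:

``` python
import torch
from torch.utils.data import DataLoader

B = 32  # assumed mini-batch size

raw = generate_data(N=1000)
my_dataset = RegressionDataset(raw["data"], raw["target"])

network = Network()
loader = DataLoader(my_dataset, batch_size=B, shuffle=True)
optimizer = torch.optim.Adam(network.parameters(), lr=1e-3)  # assumed learning rate
```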
Now we repeatedly try to optimize the network parameters. Each pass through all the data we have is one "epoch". For each epoch, we take several "mini-batches" of data (given by the `DataLoader` in `loader`) and use each of them to make one training step.
%% Cell type:code id:d15d655d tags:
``` python
epochs = 100

# for each epoch
for epoch in range(epochs):
    losses = list()
    # for each mini-batch given by the loader:
    for batch in loader:
        # get the input in the mini-batch
        # this has size (B, C)
        # where B is the mini-batch size
        # C is the number of features (1 in this case)
        features = batch["data"]
        # get the targets in the mini-batch (there shall be B of them)
        target = batch["target"]
        # get the output of the neural network:
        prediction = network(features)
        # calculate the loss function being minimized
        # in this case, it is the mean-squared error between the prediction and the target values
        loss = F.mse_loss(prediction, target)
        # exactly equivalent to:
        # loss = ((prediction - target)**2).mean()

        # clean the optimizer temporary gradient storage
        optimizer.zero_grad()
        # calculate the gradient of the loss function with respect to the parameters
        loss.backward()
        # ask the Adam optimizer to change the parameters in the direction of - gradient
        # Adam scales the gradient by a constant which is adaptively tuned
        # take a look at the Adam paper for more details: https://arxiv.org/abs/1412.6980
        optimizer.step()

        losses.append(loss.detach().cpu().item())
    avg_loss = np.mean(np.array(losses))
    print(f"Epoch {epoch}/{epochs}: average loss {avg_loss:.5f}")
```
%% Output
Epoch 0/100: average loss 378.81588
Epoch 1/100: average loss 275.14966
Epoch 2/100: average loss 209.67680
Epoch 3/100: average loss 177.45487
Epoch 4/100: average loss 161.36641
...
Epoch 97/100: average loss 94.59652
Epoch 98/100: average loss 94.57301
Epoch 99/100: average loss 94.55085
%% Cell type:markdown id:a4980bf4 tags:
Let us check what the network says about some new data it has never seen before.
%% Cell type:code id:09646d29 tags:
``` python
test_data = generate_data(N=1000)
```
%% Cell type:markdown id:e315b5dc tags:
And now we can plot the new data again, this time showing what the network predicts for it.
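A sketch of how this evaluation could be done, assuming `test_data` follows the same `"data"`/`"target"` layout as the training data:

``` python
import matplotlib.pyplot as plt
import torch

x_test = torch.as_tensor(test_data["data"], dtype=torch.float32)

with torch.no_grad():  # no gradients are needed for inference
    y_pred = network(x_test).numpy()

plt.figure()
plt.scatter(x_test.numpy().ravel(), test_data["target"].ravel(), s=5, label="test data")
plt.scatter(x_test.numpy().ravel(), y_pred.ravel(), s=5, label="network prediction")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.show()
```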
The method shown before finds the neural network parameters which maximize the log-likelihood of the data. But not all parameter values are equally likely, and we can estimate an uncertainty for them.

With an uncertainty for the parameters, we can propagate that uncertainty through the neural network and obtain an uncertainty on the prediction of the regression output.

This can be done by assuming that each weight of the network function follows a given probability distribution and, instead of fitting a single value for the weight, fitting the parameters of this probability distribution. For the example shown here, we assume that the probability distribution of the weights is Gaussian and we aim to obtain the mean and variance of that Gaussian.

We are going to include the epistemic uncertainty through the variation of the weights. That is, the fact that the weights vary and lead to different effective functions allows us to model different $f(x)$ dependence relationships, and this is attributed to the epistemic uncertainty.
We additionally assume that the collected data has some aleatoric uncertainty, which means that every point is uncertain by some fixed, unknown amount. To model this effect, we assume that the likelihood function $p(\text{data}|\theta)$ can be modelled by a Gaussian distribution with a certain standard deviation $\sigma_a$. This standard deviation will be used to model the aleatoric uncertainty.
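Written out explicitly (the next paragraph refers to its two terms), the loss minimized for each mini-batch is the standard variational objective from the paper cited below:

$$\mathcal{L} = -\log p(\text{data} \mid \text{weights}) + \frac{1}{M}\, D_{\mathrm{KL}}\big(q(\text{weights}) \,\|\, p(\text{weights})\big),$$

where $q$ is the approximate Gaussian posterior of the weights and $p(\text{weights})$ is the Gaussian prior.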
The first term is the negative log-likelihood, assumed to be Gaussian with its standard deviation given by the aleatoric uncertainty (taken to be the same for every data point, although this could be made data-point specific as well!). The second term corresponds to a penalty for moving the weights away from the prior assumption that the weights are Gaussian with mean zero and standard deviation 0.1. In this equation, $M$ is the number of mini-batches used.
It can be shown that by minimizing this loss function, we obtain weight means and standard deviations that approximately maximize the posterior probability given by Bayes' rule: $p(\text{weights}|\text{data}) = \frac{p(\text{data}|\text{weights})\, p(\text{weights})}{p(\text{data})}$. The proof follows from minimizing the Kullback-Leibler divergence between the true posterior given by Bayes' rule and the approximate posterior, in which the weights are assumed to be Gaussian and the likelihood is assumed to be Gaussian.

The details of the derivation can be consulted in the following paper:
https://arxiv.org/pdf/1505.05424.pdf
%% Cell type:code id:f8d501ff tags:
``` python
class BayesianNetwork(nn.Module):
    """
    A model Bayesian Neural Network.
    Each weight is represented by a Gaussian with a mean and a standard deviation.
    Each evaluation of forward leads to a different choice of the weights, so running
    forward several times we can check the effect of the weights variation on the same input.
    The nll function implements the negative log-likelihood to be used as the first part of the loss
    function (the second shall be the Kullback-Leibler divergence).
    The negative log-likelihood is simply the negative log-likelihood of a Gaussian
    between the prediction and the true value. The standard deviation of the Gaussian is left as a
    free parameter, so that it can model the aleatoric uncertainty.
    """

    def __init__(self):
        super().__init__()
        # NOTE: the original cell was truncated; the layer sizes below are an assumption.
        # bnn.BayesLinear keeps a mean and a standard deviation per weight, with a
        # Gaussian prior of mean 0 and standard deviation 0.1, as described in the text.
        self.model = nn.Sequential(
            bnn.BayesLinear(prior_mu=0.0, prior_sigma=0.1, in_features=1, out_features=50),
            nn.ReLU(),
            bnn.BayesLinear(prior_mu=0.0, prior_sigma=0.1, in_features=50, out_features=1),
        )
        self.log_sigma = nn.Parameter(torch.zeros(1))  # log of the aleatoric sigma

    def forward(self, x):
        return self.model(x)

    def nll(self, prediction, target):
        # Gaussian negative log-likelihood with the fitted aleatoric sigma
        sigma = torch.exp(self.log_sigma)
        return (0.5 * ((prediction - target) / sigma) ** 2 + torch.log(sigma)).mean()


# the Kullback-Leibler divergence should be scaled by 1/number_of_batches
# see https://arxiv.org/abs/1505.05424 for more information on this
number_of_batches = len(my_dataset) / float(B)
weight_kl = 1.0 / float(number_of_batches)
```
%% Cell type:markdown id:c68ba2e2 tags:
The criterion for finding the optimal weights is based on Bayes' theorem, in which the posterior probability of the weights is proportional to the likelihood of the data given the weights times the prior probability of the weights. We assume the prior probability of the weights is a Gaussian centred at zero with standard deviation 0.1. This prior has a regularizing effect, preventing overtraining.

We can translate Bayes' theorem, together with the assumption that the posterior distribution is also Gaussian, into an optimization procedure to find the mean and variance of the posterior distribution. The function optimized to obtain the means and variances of the Gaussians for the weights is the sum of the mean squared error term (corresponding to the negative Gaussian log-likelihood of the data) and the Kullback-Leibler divergence between the weight distributions and the prior Gaussian.
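A sketch of what the corresponding training loop could look like (an assumption, not the original cell): it combines the Gaussian negative log-likelihood defined in the `BayesianNetwork` sketch above with the Kullback-Leibler term scaled by `weight_kl`, using `torchbnn`'s `BKLLoss` to compute the divergence of every Bayesian layer from its prior.

``` python
import torch
import torchbnn as bnn

bayesian_network = BayesianNetwork()
bayes_optimizer = torch.optim.Adam(bayesian_network.parameters(), lr=1e-2)  # assumed learning rate
kl_loss = bnn.BKLLoss(reduction="mean", last_layer_only=False)

for epoch in range(100):
    for batch in loader:
        prediction = bayesian_network(batch["data"])
        # negative log-likelihood with the fitted aleatoric sigma
        nll = bayesian_network.nll(prediction, batch["target"])
        # Kullback-Leibler penalty, scaled by 1/number_of_batches
        loss = nll + weight_kl * kl_loss(bayesian_network)
        bayes_optimizer.zero_grad()
        loss.backward()
        bayes_optimizer.step()
```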
To evaluate the effect of the uncertainty, we perform the prediction many times for the same data and take the average and the root-mean-square spread (standard deviation) of the predictions, since each prediction performed with the Bayesian Neural Network leads to a different result, using different weights sampled from the fitted Gaussians.

We can now take the average result for each sample, and its standard deviation, as estimates of the mean and epistemic uncertainty of the result, as sketched below.
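A sketch of this procedure, assuming the `bayesian_network` and `test_data` objects from the previous sketches:

``` python
import numpy as np
import torch

x_test = torch.as_tensor(test_data["data"], dtype=torch.float32)

# each forward pass samples a new set of weights from the fitted Gaussians
n_samples = 100
with torch.no_grad():
    samples = np.stack([bayesian_network(x_test).numpy() for _ in range(n_samples)])

mean_prediction = samples.mean(axis=0)  # average prediction per data point
epistemic_unc = samples.std(axis=0)     # spread caused by the weight uncertainty
```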
The aleatoric uncertainty is fitted as an independent parameter. Since we assume the aleatoric uncertainty is independent of the epistemic one, we can calculate the total uncertainty by adding the epistemic and aleatoric uncertainties in quadrature (the square root of the sum of their squares).
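In code, and assuming the `log_sigma` parametrization used in the `BayesianNetwork` sketch above:

``` python
import numpy as np
import torch

# aleatoric sigma fitted by the network (log-sigma parametrization assumed above)
aleatoric_unc = torch.exp(bayesian_network.log_sigma).item()

# independent uncertainties add in quadrature
total_unc = np.sqrt(epistemic_unc**2 + aleatoric_unc**2)
```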
Note that the aleatoric uncertainty is very close to the standard deviation of the $\epsilon$ component of the model we created in the beginning! Clearly the model could fit the uncertainty coming from that component of the noise.
It is not easy to estimate the effect of the epistemic uncertainty, as it is different for every data point (it scales with $x^2$), but we can plot it to take a look at its effect.
Note that the uncertainties are the standard deviations of Gaussian models, so they correspond to a $1\sigma$ band, which is approximately a 68% confidence band. The band corresponding to $2\sigma$ is approximately a 95% confidence band for a Gaussian model.
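A sketch of how such bands could be drawn, reusing the quantities computed in the previous sketches:

``` python
import matplotlib.pyplot as plt
import numpy as np

x = x_test.numpy().ravel()
order = np.argsort(x)
mean = mean_prediction.ravel()[order]
band = total_unc.ravel()[order]

plt.figure()
plt.fill_between(x[order], mean - 2 * band, mean + 2 * band, alpha=0.2, label=r"$2\sigma$ (~95%)")
plt.fill_between(x[order], mean - band, mean + band, alpha=0.4, label=r"$1\sigma$ (~68%)")
plt.plot(x[order], mean, color="k", label="mean prediction")
plt.xlabel("x")
plt.ylabel("f(x)")
plt.legend()
plt.show()
```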