{
"cells": [
{
"cell_type": "markdown",
"id": "11174c4d-dcb5-49d8-96fc-b83ad35a6193",
"metadata": {},
"source": [
"<div>\n",
"<img src=\"https://cps.unileoben.ac.at/wp/CPS_Logo_Black.png\" width=\"260\"/>\n",
"</div>\n",
"Chair of Cyber-Physical-Systems, Austria"
]
},
{
"cell_type": "markdown",
"id": "a04ead7b-4fce-4ea2-9501-d60a112aadcb",
"metadata": {},
"source": [
"<style>\n",
"td, th {\n",
" border: none!important;\n",
"}\n",
"</style>\n",
"\n",
"| Credentials | |\n",
"|----|---|\n",
"|Host | Montanuniversitaet Leoben |\n",
"|Web | https://cps.unileoben.ac.at |\n",
"|Mail | cps@unileoben.ac.at |\n",
"|Authors | Melanie Neubauer & Elmar Rückert|\n",
"|Corresponding Authors | melanie.neubauer@unileoben.ac.at, rueckert@unileoben.ac.at |\n",
"|Last edited | 07.06.2024 |\n"
]
},
{
"cell_type": "markdown",
"id": "35a0d7a0",
"metadata": {},
"source": [
"# Final Exam"
]
},
{
"cell_type": "raw",
"id": "3dfb754b-ee67-4090-aa8c-394880ede449",
"metadata": {},
"source": [
"Name:\n",
"Matriculation Number:\n",
"Jupyter Username:"
]
},
{
"cell_type": "raw",
"id": "36aa65d0-1304-413f-88e9-04a98e2add1a",
"metadata": {},
"source": [
"Point overview:\n",
"- Import Section /2\n",
"- Q1 Descr. Testing /4\n",
"- Q2 write function /4\n",
"- Q3 answer questions /2\n",
"- Q4 databases /6\n",
"- Q5 write class /4\n",
"- Q6 descr. code and answer /4\n",
"- Q7 linear reg basics /4\n",
"- Q8 linear reg coding /4\n",
"- Q9 gaussian processes /6\n",
"\n",
"You reached /40 Points!"
]
},
{
"cell_type": "markdown",
"id": "4b05e025-9843-410e-80cd-f2db24148da8",
"metadata": {},
"source": [
"<div class=\"alert alert-block alert-success\">\n",
"Please read the following tasks carefully. Good luck :)\n",
"</div>"
]
},
{
"cell_type": "markdown",
"id": "480d04cd-83c9-43e7-827f-2cf3d357ab7e",
"metadata": {},
"source": [
"### Install Section"
]
},
{
"cell_type": "code",
"execution_count": 56,
"id": "9c4525f3-aeab-4dcb-9cc5-eb79ac3e04a2",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: matplotlib in /opt/conda/lib/python3.10/site-packages (3.8.3)\n",
"Requirement already satisfied: contourpy>=1.0.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (1.2.0)\n",
"Requirement already satisfied: cycler>=0.10 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (0.12.1)\n",
"Requirement already satisfied: fonttools>=4.22.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (4.50.0)\n",
"Requirement already satisfied: kiwisolver>=1.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (1.4.5)\n",
"Requirement already satisfied: numpy<2,>=1.21 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (1.26.4)\n",
"Requirement already satisfied: packaging>=20.0 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (23.2)\n",
"Requirement already satisfied: pillow>=8 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (10.2.0)\n",
"Requirement already satisfied: pyparsing>=2.3.1 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (3.1.2)\n",
"Requirement already satisfied: python-dateutil>=2.7 in /opt/conda/lib/python3.10/site-packages (from matplotlib) (2.8.2)\n",
"Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.7->matplotlib) (1.16.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install matplotlib"
]
},
{
"cell_type": "code",
"execution_count": 57,
"id": "0e5aad9c-bbba-429a-b541-36a36591b68a",
"metadata": {
"tags": []
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"Requirement already satisfied: pandas in /opt/conda/lib/python3.10/site-packages (2.2.2)\n",
"Requirement already satisfied: numpy>=1.22.4 in /opt/conda/lib/python3.10/site-packages (from pandas) (1.26.4)\n",
"Requirement already satisfied: python-dateutil>=2.8.2 in /opt/conda/lib/python3.10/site-packages (from pandas) (2.8.2)\n",
"Requirement already satisfied: pytz>=2020.1 in /opt/conda/lib/python3.10/site-packages (from pandas) (2023.3)\n",
"Requirement already satisfied: tzdata>=2022.7 in /opt/conda/lib/python3.10/site-packages (from pandas) (2024.1)\n",
"Requirement already satisfied: six>=1.5 in /opt/conda/lib/python3.10/site-packages (from python-dateutil>=2.8.2->pandas) (1.16.0)\n",
"Note: you may need to restart the kernel to use updated packages.\n"
]
}
],
"source": [
"pip install pandas"
]
},
{
"cell_type": "markdown",
"id": "e807646d-e717-4825-a3d3-9d4a652d4851",
"metadata": {},
"source": [
"_________________________"
]
},
{
"cell_type": "markdown",
"id": "9208862e-7b8b-435a-8ef5-4bbf09fe9feb",
"metadata": {},
"source": [
"### Import Section (2 Points)\n",
"Import all the needed libraries here at the beginning! Do not import libraries below this section!"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "34bbdd51-4cd9-4dd3-a27e-3809b3705f60",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# import all necessary libraries here!\n",
"import os\n",
"# add code"
]
},
{
"cell_type": "markdown",
"id": "c7cac7dd-43b1-4c10-a186-950f3f826700",
"metadata": {},
"source": [
"__________________________________"
]
},
{
"cell_type": "markdown",
"id": "3b371f61-68f9-4f46-a624-c243f559027c",
"metadata": {},
"source": [
"### Question 1: Describing and Testing the Code (4 Points)\n",
"\n",
"a) Break down and explain each line of the provided code snippet. Additionally, describe the role of the `%` operator in this context. Why is it necessary in this code?"
]
},
{
"cell_type": "code",
"execution_count": 60,
"id": "6458bfad-fab3-48aa-b42c-b67329fc8332",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"def calc_median(dataset):\n",
"    tmp_dataset = dataset.copy()\n",
"    tmp_dataset.sort()\n",
"    if len(tmp_dataset) % 2 == 0:\n",
"        median = (tmp_dataset[len(tmp_dataset) // 2] + tmp_dataset[len(tmp_dataset) // 2 - 1]) / 2\n",
"    else:\n",
"        median = tmp_dataset[len(tmp_dataset) // 2]\n",
"    return median\n",
"\n",
"def get_dataset(path):\n",
"    with open(path, 'r') as f:\n",
"        reader = csv.reader(f)\n",
"        dataset = list(reader)\n",
"        dataset = [float(x[0]) for x in dataset]\n",
"    return dataset"
]
},
{
"cell_type": "raw",
"id": "654d26a8-ed35-4c99-b0ab-43c9371a7b21",
"metadata": {},
"source": [
"# add your Answer to a)"
]
},
{
"cell_type": "markdown",
"id": "afa185ff-bd11-4c2b-ae55-277217c63bf8",
"metadata": {},
"source": [
"b) Test the above code on *data.csv* and print the result."
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "5d7a7ac6-5a5c-422c-b310-82b2779cd7fe",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# your code for b)"
]
},
{
"cell_type": "markdown",
"id": "5b407d76-41d0-4834-bdc5-a71526d2f416",
"metadata": {},
"source": [
"___________________________"
]
},
{
"cell_type": "markdown",
"id": "686678ee-16ab-4dd9-a9a6-dddeb74ae0f9",
"metadata": {},
"source": [
"### Question 2: Write a short Function (4 Points)\n",
"a) Write **your own** short function that calculates the variance of a dataset and returns it. (Hint: to compute $x^2$, use `**` instead of `^`; only core Python functions are allowed.)"
]
},
{
"cell_type": "markdown",
"id": "399589fd-dc12-4c8f-8902-9bfe374add4a",
"metadata": {
"tags": []
},
"source": [
"The variance of a dataset can be calculated using the formula:\n",
"\n",
"$ \\text{mean} = \\frac{1}{n} \\sum_{i=1}^{n} x_i$\n",
"\n",
"$ \\text{variance} = \\frac{\\sum_{i=1}^{n} (x_i - \\text{mean})^2}{n}$\n",
"\n",
"\n",
"where:\n",
"- $n$ is the number of elements in the dataset,\n",
"- $x_i$ represents each individual value in the dataset, and\n",
"- $\\text{mean}$ is the mean of the dataset."
]
},
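{
"cell_type": "markdown",
"id": "worked-example-variance",
"metadata": {},
"source": [
"As a quick sanity check of the formula above, consider a small made-up dataset (an assumption for illustration, not taken from *data.csv*), $x = (1, 2, 3, 4)$ with $n = 4$:\n",
"\n",
"$ \\text{mean} = \\frac{1 + 2 + 3 + 4}{4} = 2.5$\n",
"\n",
"$ \\text{variance} = \\frac{(1 - 2.5)^2 + (2 - 2.5)^2 + (3 - 2.5)^2 + (4 - 2.5)^2}{4} = \\frac{2.25 + 0.25 + 0.25 + 2.25}{4} = 1.25$\n",
"\n",
"Note that the formula divides by $n$ (population variance), not by $n - 1$, so an implementation that follows it should return 1.25 for this example."
]
},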
{
"cell_type": "code",
"execution_count": 1,
"id": "8df48b65-79be-4a7f-8f47-cdecaa11d390",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# your function for a)\n",
"def calc_variance(dataset):\n",
"    # your Code\n",
"    pass  # placeholder so the cell runs; replace with your code"
]
},
{
"cell_type": "markdown",
"id": "c7ff344b-f9d2-4ee8-8ad9-ce3be8ecbb33",
"metadata": {},
"source": [
"b) Test your function on **data.csv** and print the result:"
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "705b0c3b-72a0-438d-a9b7-dbb5600aa9c3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# your test code for b)"
]
},
{
"cell_type": "markdown",
"id": "6eb9cbcb-7944-4779-a0f8-4666a08fba0e",
"metadata": {},
"source": [
"____________________________________________"
]
},
{
"cell_type": "markdown",
"id": "8058570f-45f7-447d-b956-ee2fd63042f3",
"metadata": {},
"source": [
"### Question 3: Answer the following Questions (2 Points)\n",
"a) Why is it necessary to preprocess the data? "
]
},
{
"cell_type": "raw",
"id": "36d73f9a-2716-4e39-81e8-bb906380f250",
"metadata": {},
"source": [
"Your Answer for a):"
]
},
{
"cell_type": "markdown",
"id": "d1ceee7e-8269-4cc0-af46-90886fb27cae",
"metadata": {},
"source": [
"b) How would you preprocess your data? Describe the steps."
]
},
{
"cell_type": "raw",
"id": "61969d27-0b98-410c-bffc-16a1de912972",
"metadata": {},
"source": [
"Your Answer for b):"
]
},
{
"cell_type": "markdown",
"id": "aae08c0a-5ffe-482f-8463-e67be8d93e41",
"metadata": {},
"source": [
"_____________________________"
]
},
{
"cell_type": "markdown",
"id": "e6bf5855-09f5-4bee-a42c-5360593945fb",
"metadata": {},
"source": [
"### Question 4: Databases (6 Points)\n",
"Put the following code cells into a **logically** correct order and run them. Describe each line with comments:"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "df3b243b-e660-4bb2-b1c8-9c556401ddbc",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"for i in my_text:\n",
"    cur.execute(\"INSERT INTO finalExam VALUES (?)\", (i,))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d10e84ab-8ae3-4bb5-a7f6-223e1026e443",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"conn.commit()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f7d0ed77-eb9f-47af-8213-876036175b0d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur.execute(\"DELETE FROM finalExam WHERE w = 'Bad'\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "9f3abf3c-c1b6-4fe1-a5ba-c4f42acd5279",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur = conn.cursor()"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "866766f6-6791-4b48-9162-cd0a27c78c51",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur.execute(\"INSERT INTO finalExam VALUES (?)\", ('Nice',))"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "31cf9cf5-2bef-4133-989e-c0f9e6db6b1d",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur.execute(\"SELECT * FROM finalExam\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5eb267d8-b049-4a26-ad4f-556c842921b3",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(cur.fetchall())\n",
"conn.close()\n",
"os.remove('new.db')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f43304ba-6b54-4615-896e-68c8c7250667",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"conn = sqlite3.connect('new.db')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "003e239e-a23f-4de5-92d1-f2130c43b500",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur.execute(\"SELECT COUNT(*) FROM finalExam\")"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "8ec6794f-5645-4fae-b31c-2c4403407470",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"my_text = 'My Final Exam Is Bad'\n",
"my_text = my_text.split(' ')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "19b1b7a0-f0fd-4405-88ef-e2c106f1b6f1",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"cur.execute('''CREATE TABLE IF NOT EXISTS finalExam\n",
" (w Text)''')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "69c2daf9-ec2e-40c6-9d11-8dab7ef31982",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"print(cur.fetchone())"
]
},
{
"cell_type": "markdown",
"id": "6f401683-92a3-4f14-8a34-b5c451fa79a7",
"metadata": {},
"source": [
"________________________________________________"
]
},
{
"cell_type": "markdown",
"id": "29a762ac-a1af-44e4-9eef-5e9617ec2ab2",
"metadata": {},
"source": [
"### Question 5: Data Analyzer (4 Points)\n",
"a) Design a Python class, `DataAnalyzer`, to facilitate the analysis and visualization of data stored in a database. The class should provide methods to create an SQLite database from the given x and y values and to plot the data using Matplotlib. Write down only the code for the **create_database** method."
]
},
{
"cell_type": "markdown",
"id": "e1f08f56-dd0c-4df3-95b8-cffed58b1d07",
"metadata": {},
"source": [
"The class should be initialized with the following parameters:\n",
"- database_name: A string representing the name of the SQLite database.\n",
"- x_values: A list with x-values.\n",
"- y_values: A list with y-values.\n",
"\n",
"**Method: create_database**\n",
"- Connect to the SQLite database specified by `database_name`.\n",
"- Create a table and insert the x and y values.\n",
"- Print the number of entries in the database after creating it.\n",
"\n",
"Method: plot_values\n",
"- Plot the data with Matplotlib.\n",
"- Include axis labels and title for better visualization."
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "cdd18556-9fd8-441d-9c1f-8f40bbd0ad01",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# your Class from a):\n",
"class DataAnalyzer:\n",
"    def __init__(self, database_name, x_values, y_values):\n",
"        self.database_name = database_name\n",
"        self.x_values = x_values\n",
"        self.y_values = y_values\n",
"\n",
"    def create_database(self):\n",
"        # Your Code\n",
"        pass  # placeholder so the class parses; replace with your code\n",
"\n",
"    def plot_values(self):\n",
"        \"\"\"\n",
"        Plots the X and Y values using Matplotlib.\n",
"        \"\"\"\n",
"        plt.plot(self.x_values, self.y_values, marker='o', linestyle='')\n",
"        plt.title('Data Plot')\n",
"        plt.xlabel('X')\n",
"        plt.ylabel('Y')\n",
"        plt.grid(True)\n",
"        plt.show()"
]
},
{
"cell_type": "markdown",
"id": "30605de1-8eb0-4403-a36b-d22418341e9e",
"metadata": {},
"source": [
"b) Test your code below on **coordinates.csv**. The name of the database should be your **matriculation number**. Test both methods! (For the import, you can use pandas or csv.)"
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "6ea4c395-1246-4530-994b-9c764a5c57ed",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"# your test code from b):"
]
},
{
"cell_type": "markdown",
"id": "8eec9931-ddac-4128-9b42-dbd349186c10",
"metadata": {},
"source": [
"____________________________"
]
},
{
"cell_type": "markdown",
"id": "7fccb26e-6d90-480b-b449-82d8d3c24a8e",
"metadata": {
"tags": []
},
"source": [
"### Question 6: Describe the code and answer the questions (4 Points)"
]
},
{
"cell_type": "markdown",
"id": "f110bc78-13dc-4251-af7a-32bff74ce5bc",
"metadata": {},
"source": [
"a) Describe the following code with comments and rename the function (directly in the code)."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "3f6d6805-5e97-4fbd-b202-4020e9485643",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"def func_q6(X, y, factor=0.2):\n",
"    idx = np.arange(X.shape[0])\n",
"    np.random.shuffle(idx)\n",
"    X = X[idx]\n",
"    y = y[idx]\n",
"    split = int((1 - factor) * X.shape[0])\n",
"    X_train, X_test = X[:split], X[split:]\n",
"    y_train, y_test = y[:split], y[split:]\n",
"    return X_train, X_test, y_train, y_test"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d99078f3-a670-4d04-bb7a-f93e9054b50d",
"metadata": {},
"outputs": [],
"source": [
"X = normalized_data.drop(target_column, axis=1).values\n",
"y = normalized_data[target_column].values\n",
"X_train, X_test, y_train, y_test = func_q6(X, y, factor=0.2)"
]
},
{
"cell_type": "markdown",
"id": "c9dbe438-5c74-4a02-8ea3-efc0e966e483",
"metadata": {},
"source": [
"b) Describe the purpose of the above code."
]
},
{
"cell_type": "raw",
"id": "161a52a4-79e2-4401-ba8a-6f680d47771e",
"metadata": {},
"source": [
"Your Answer for b):"
]
},
{
"cell_type": "markdown",
"id": "f117c05a-8d98-4dbd-a4c4-6c252bb10fb3",
"metadata": {
"tags": []
},
"source": [
"c) Why do we drop the `target_column`?"
]
},
{
"cell_type": "raw",
"id": "e998c866-416e-457d-aee8-d1034ae21f84",
"metadata": {},
"source": [
"Your Answer for c):"
]
},
{
"cell_type": "markdown",
"id": "1a5e037a-f9d7-4e13-88c0-3f88241e4398",
"metadata": {},
"source": [
"d) What would happen if we change the factor to 0.4?"
]
},
{
"cell_type": "raw",
"id": "8d3ec404-4508-413c-9d00-9ce1f24c492f",
"metadata": {},
"source": [
"Your Answer for d):"
]
},
{
"cell_type": "markdown",
"id": "7585983a-1ec5-48e2-930a-e8316604fc97",
"metadata": {},
"source": [
"_______________________"
]
},
{
"cell_type": "markdown",
"id": "99b839a4-c249-43c0-a035-0a240e1bb350",
"metadata": {},
"source": [
"### Question 7: Linear Regression Basics (4 Points)"
]
},
{
"cell_type": "code",
"execution_count": 48,
"id": "fb9fef79-41bb-4be3-8c40-57095e133c47",
"metadata": {
"tags": []
},
"outputs": [],
"source": [
"class LinearRegression:\n",
"    def __init__(self):\n",
"        self.coefficients = None\n",
"\n",
"    def train(self, X, y):\n",
"        X = np.hstack((np.ones((X.shape[0], 1)), X))\n",
"        self.coefficients = np.linalg.inv(X.T @ X) @ X.T @ y\n",
"\n",
"    def predict(self, X):\n",
"        X = np.hstack((np.ones((X.shape[0], 1)), X))\n",
"        return X @ self.coefficients"
},
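{
"cell_type": "markdown",
"id": "note-normal-equation",
"metadata": {},
"source": [
"For reference, the `train` method above appears to implement the closed-form least-squares solution (the normal equation). Writing $\\tilde{X}$ for the input matrix with a leading column of ones (our notation, not used elsewhere in this notebook), the code computes\n",
"\n",
"$ \\text{coefficients} = (\\tilde{X}^T \\tilde{X})^{-1} \\tilde{X}^T y$\n",
"\n",
"and `predict` returns $\\tilde{X} \\cdot \\text{coefficients}$, so the first coefficient acts as the intercept. This assumes that $\\tilde{X}^T \\tilde{X}$ is invertible."
]
},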
{
"cell_type": "markdown",
"id": "d67a67c9-b129-412c-b29b-2869892fcf88",
"metadata": {},
"source": [
"a) Describe the purpose of linear regression."
]
},
{
"cell_type": "raw",
"id": "29d16a49-69d4-4c59-921a-33201fa04d95",
"metadata": {},
"source": [
"Your Answer for a):"
]
},
{
"cell_type": "markdown",
"id": "43ef5af3-9da0-4dc0-84e1-4731f5687040",
"metadata": {},
"source": [
"b) How do the results of linear regression with the scikit-learn library differ from those of the class above?"
]
},
{
"cell_type": "raw",
"id": "b6bcafd8-8e89-4fa6-888f-f30a93b019ec",
"metadata": {},
"source": [
"Your Answer for b):"
]
},
{
"cell_type": "markdown",
"id": "fdb3bd28-59ca-47ec-a750-79652cbc7d74",
"metadata": {},
"source": [
"c) Write down the formula for the mean squared error (mathematically). When do you use it?"
]
},
{
"cell_type": "raw",
"id": "791be49f-75ea-43a0-825a-0567ecf72aa1",
"metadata": {},
"source": [
"Your Answer for c):"
]
},
{
"cell_type": "markdown",
"id": "24b784d2-005e-4537-b0ce-c534f0303ced",
"metadata": {},
"source": [
"d) Describe step by step how you would perform linear regression on a given dataset. (A more detailed answer is expected.)"
]
},
{
"cell_type": "raw",
"id": "c6b02c04-2ad1-48f2-9719-14df04846d8e",
"metadata": {},
"source": [
"Your Answer for d):"
]
},
{
"cell_type": "markdown",
"id": "43b95d6a-44ea-4631-bdad-524754d1d11a",
"metadata": {},
"source": [
"_________"
]
},
{
"cell_type": "markdown",
"id": "484c2dc0-67a1-48ab-87b4-db41ebfb6d02",
"metadata": {},
"source": [
"### Question 8: Linear Regression Coding (4 Points)\n",
"a) Perform linear regression on **winequality-red.csv** with the target **alcohol** (pandas is allowed). Calculate the **mean squared error** and **print** it. Perform the task without preprocessing your data, but make sure to **split the data**."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "8f36613b-ecb5-40b0-9918-4119cb8943e2",
"metadata": {},
"outputs": [],
"source": [
"# your Code for a)"
]
},
{
"cell_type": "markdown",
"id": "bda2c424-b619-4e88-8431-c2232fb7babc",
"metadata": {},
"source": [
"_______________"
]
},
{
"cell_type": "markdown",
"id": "ea23296e-ef5b-4268-9398-e36190b8ea96",
"metadata": {},
"source": [
"### Question 9: Gaussian Processes (6 Points)\n",
"a) Match the names to the different kernels below:"
]
},
{
"cell_type": "markdown",
"id": "e3680ed3-eb35-41a5-aa35-4a1b5bf45ec3",
"metadata": {
"tags": []
},
"source": [
"Add the names to the following lines:\n",
"\n",
"1. **... Kernel**: $k(x, x') = \\exp \\left( - \\frac{2 \\sin^2 \\left( \\pi ||x - x'|| / p \\right)}{\\sigma^2} \\right)$\n",
"2. **... Kernel**: $k(x, x') = x^T x' $\n",
"3. **... Kernel**: $k(x, x') = \\exp \\left( - \\frac{||x - x'||^2}{2\\sigma^2} \\right)$\n",
"4. **... Kernel**: $k(x, x') = (x^T x' + 1)^d $"
]
},
{
"cell_type": "markdown",
"id": "1759762b-5b0a-4e6f-9651-277bc4643fb4",
"metadata": {},
"source": [
"b) Match the **numbers** above to the classes below."
]
},
{
"cell_type": "raw",
"id": "0c6ddc30-f804-472c-92d1-8a03d3c3b020",
"metadata": {},
"source": [
"Your Answer for b):\n",
"    function1 = ...\n",
"    function2 = ...\n",
"    function3 = ...\n",
"    function4 = ..."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "530bc9da-a454-4de1-ba07-d4b62a4fae1e",
"metadata": {},
"outputs": [],
"source": [
"class function1:\n",
"    def __init__(self, theta=1.0):\n",
"        self.theta = theta\n",
"        self.bounds = ((1e-5, 1e5),)\n",
"        self.num_params = 1\n",
"\n",
"    def __call__(self, X1, X2):\n",
"        return self.theta * np.dot(X1, X2.T)\n",
"\n",
"    def set_params(self, params):\n",
"        self.theta = params[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "c7518ccc-f5f5-4401-8d1d-d9167cc6dde8",
"metadata": {},
"outputs": [],
"source": [
"class function2:\n",
"    def __init__(self, theta=1.0, d=3):\n",
"        self.theta = theta\n",
"        self.d = d\n",
"        self.bounds = ((1e-5, 1e5), (1, 10))\n",
"        self.num_params = 2\n",
"\n",
"    def __call__(self, X1, X2):\n",
"        return (self.theta * np.dot(X1, X2.T) + 1) ** self.d\n",
"\n",
"    def set_params(self, params):\n",
"        self.theta = params[0]\n",
"        self.d = params[1]\n",
"\n",
"    def gradient(self, X):\n",
"        gram = np.dot(X, X.T)\n",
"        base = self.theta * gram + 1\n",
"        return np.array(\n",
"            [\n",
"                # derivative of (theta * gram + 1) ** d with respect to theta\n",
"                self.d * gram * base ** (self.d - 1),\n",
"                # derivative with respect to d\n",
"                base ** self.d * np.log(base),\n",
"            ]\n",
"        )"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "ce47fa6e-038c-488f-b05e-a5ddddbc4e90",
"metadata": {},
"outputs": [],
"source": [
"class function3:\n",
"    def __init__(self, theta=1.0):\n",
"        self.theta = theta\n",
"        self.bounds = ((1e-5, 1e5),)\n",
"        self.num_params = 1\n",
"\n",
"    def __call__(self, X1, X2):\n",
"        return np.exp(-2 * np.sin(np.pi * cdist(X1, X2) / self.theta) ** 2)\n",
"\n",
"    def set_params(self, params):\n",
"        self.theta = params[0]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "07bad441-5c3c-42c2-8507-b884d827acf0",
"metadata": {},
"outputs": [],
"source": [
"class function4:\n",
"    def __init__(self, theta=1.0):\n",
"        self.theta = theta\n",
"        self.bounds = ((1e-5, 1e5),)\n",
"        self.num_params = 1\n",
"\n",
"    def __call__(self, X1, X2):\n",
"        return np.exp(-self.theta * cdist(X1, X2) ** 2)\n",
"\n",
"    def set_params(self, params):\n",
"        self.theta = params[0]"
]
},
{
"cell_type": "markdown",
"id": "df5f819b-98d9-4e76-bc42-d9c71ca7b076",
"metadata": {},
"source": [
"___________________________"
]
},
{
"cell_type": "markdown",
"id": "b5042b38-0752-4e23-aae3-3651a5f367cf",
"metadata": {},
"source": [
"### How to submit your file:\n",
"1) Save your file in the Jupyter Notebook (floppy disk symbol)\n",
"2) Download only your Jupyter notebook file (right-click and download)\n",
"3) Switch back to the Moodle window (bottom-left corner)\n",
"4) Upload your file to Moodle\n",
"5) Submit your full exam\n",
"    - Finish attempt ...\n",
"    - Submit all and finish (first time)\n",
"    - Submit all and finish (second time)"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5808cbd3-51da-40fd-a770-057156cd7465",
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3 (ipykernel)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.11"
}
},
"nbformat": 4,
"nbformat_minor": 5
}