Solve more statistics assignments

author: Julian T <julian@jtle.dk> 2021-03-19 12:05:05 +0100
committer: Julian T <julian@jtle.dk> 2021-03-19 12:05:05 +0100
commit: 501f3e928cf652e691853acad6fed4de25338f63 (patch)
tree: b10083cba1bcaa46c32d382ca68d329f4501c211
parent: 83ced2d4cee2e46fe8d47e3e192b34efaa37bf0f (diff)
1 files changed, 204 insertions, 0 deletions
diff --git a/sem6/prob/stat5/Opgaver.ipynb b/sem6/prob/stat5/Opgaver.ipynb
new file mode 100644
index 0000000..4942f92
--- /dev/null
+++ b/sem6/prob/stat5/Opgaver.ipynb
@@ -0,0 +1,204 @@
+{
+ "cells": [
+  {
+   "cell_type": "code",
+   "execution_count": 8,
+   "metadata": {},
+   "outputs": [],
+   "source": [
+    "import numpy as np\n",
+    "from scipy import stats"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Problem 1\n",
+    "\n",
+    "> It is claimed that a certain type of bipolar transistor has a mean value of current\n",
+    "> gain that is at least 210. A sample of these transistors is tested. If the sample mean\n",
+    "> value of current gain is 200 with a sample standard deviation of 35, would the\n",
+    "> claim be rejected at the 5 percent level of significance if\n",
+    "> \n",
+    "> - (a) the sample size is 25;\n",
+    "> - (b) the sample size is 64?\n",
+    "\n",
+    "The claim $\\mu \\geq 210$ is taken as $H_0$; it is rejected only if the sample mean lies significantly below 210.\n",
+    "\n",
+    "We don't know the variance of the distribution, only the sample standard deviation $S$,\n",
+    "so we use the one-sided *t-test*.\n",
+    "\n",
+    "$$H_0 : \\mu \\geq 210$$\n",
+    "$$H_1 : \\mu < 210$$\n",
+    "\n",
+    "First the test statistic $TS$ is calculated, with $\\mu_0 = 210$:\n",
+    "$$\n",
+    "TS = \\sqrt{n} (\\bar{X} - \\mu_0) / S\n",
+    "$$\n",
+    "\n",
+    "The code below computes the upper-tail probability (printed as `p_value`)\n",
+    "$$\n",
+    "q = P(T_{n-1} \\geq TS) = 1 - F_{n-1}(TS)\n",
+    "$$\n",
+    "\n",
+    "where $F_{n-1}$ is the CDF of the t-distribution with $n-1$ degrees of freedom; the lower-tail *p-value* is $1 - q$, so $H_0$ is kept exactly when $q < 0.95$.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 14,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "TS: -1.4285714285714286, p_value: 0.9169932070815955\n",
+      "Assignment A: True\n",
+      "TS: -2.2857142857142856, p_value: 0.987176989403574\n",
+      "Assignment B: False\n"
+     ]
+    }
+   ],
+   "source": [
+    "mu_min = 210\n",
+    "mu_sample = 200\n",
+    "sigma_sample = 35\n",
+    "alpha = 0.05\n",
+    "accept = 1 - alpha\n",
+    "\n",
+    "def test_with_n(n):\n",
+    "    TS = np.sqrt(n) * (mu_sample - mu_min) / sigma_sample\n",
+    "    # Upper-tail probability P(T >= TS); the lower-tail p-value is 1 - p_value\n",
+    "    p_value = 1 - stats.t.cdf(TS, n-1)\n",
+    "    print(f\"TS: {TS}, p_value: {p_value}\")\n",
+    "    return p_value < accept\n",
+    "\n",
+    "# Parts (a) and (b): True means the claim is not rejected\n",
+    "print(f\"Assignment A: {test_with_n(25)}\")\n",
+    "print(f\"Assignment B: {test_with_n(64)}\")"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Problem 2\n",
+    "\n",
+    "> A question of medical importance is whether jogging leads to a reduction in\n",
+    "one’s pulse rate. To test this hypothesis, 8 nonjogging volunteers agreed to begin\n",
+    "a 1-month jogging program. After the month their pulse rates were determined\n",
+    "and compared with their earlier values. If the data are as follows, can we conclude\n",
+    "that jogging has had an effect on the pulse rates?\n",
+    "\n",
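+    "I won't put the table from the book in :-(.\n",
+    "\n",
+    "Here the after measurement depends on the before measurement of the same person.\n",
+    "We therefore have to look at the differences (after minus before).\n",
+    "\n",
+    "To show a reduction we let $H_0$ be that the pulse is not lowered, $\\mu_d \\geq 0$, against $H_1 : \\mu_d < 0$; since $TS < 0$ here, the one-sided p-value $P(T_{n-1} \\leq TS)$ equals $1 - F_{n-1}(|TS|)$, with $F_{n-1}$ the CDF of the t-distribution.\n",
+    "\n",
+    "We assume $\\alpha = 0.05$"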
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 41,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "TS: -1.2629741003498156, p-value: 0.12351970736529827\n",
+      "H_0 will be accepted with all alpha=0.12351970736529827\n"
+     ]
+    }
+   ],
+   "source": [
+    "before = np.array([74, 86, 98, 102, 78, 84, 79, 70])\n",
+    "after = np.array([70, 85, 90, 110, 71, 80, 69, 74])\n",
+    "diff = after - before\n",
+    "n = len(diff)\n",
+    "\n",
+    "# Sample mean and sample standard deviation of the differences\n",
+    "mu_s = np.mean(diff)\n",
+    "std_s = np.sqrt(np.sum((diff - mu_s)**2 / (n - 1)))\n",
+    "\n",
+    "TS = np.sqrt(n) * (mu_s - 0) / std_s\n",
+    "p_value = 1 - stats.t.cdf(np.abs(TS), n-1)\n",
+    "print(f\"TS: {TS}, p-value: {p_value}\")\n",
+    "print(f\"H_0 will be accepted with all alpha={p_value}\")\n"
+   ]
+  },
+  {
+   "cell_type": "markdown",
+   "metadata": {},
+   "source": [
+    "# Problem 3\n",
+    "\n",
+    "> According to the U.S. Bureau of the Census, 25.5 percent of the population of\n",
+    "those age 18 or over smoked in 1990. A scientist has recently claimed that this\n",
+    "percentage has since increased, and to prove her claim she randomly sampled 500\n",
+    "individuals from this population. If 138 of them were smokers, is her claim proved?\n",
+    "Use the 5 percent level of significance.\n",
+    "\n",
+    "$H_0$ is that the new percentage is lower than or equal to 25.5 percent.\n",
+    "Each sampled person is a Bernoulli trial, so the number of smokers in the sample is binomially distributed.\n",
+    "\n",
+    "$H_0$ is therefore $p \\leq p_0$ where $p$ is the Bernoulli probability and $p_0 = 0.255$.\n",
+    "\n",
+    "We let $X$ be the number of smokers in the sample of $n = 500$, so we reject $H_0$ if $X$ is large enough.\n"
+   ]
+  },
+  {
+   "cell_type": "code",
+   "execution_count": 46,
+   "metadata": {},
+   "outputs": [
+    {
+     "name": "stdout",
+     "output_type": "stream",
+     "text": [
+      "p_value for accepting h_0: 0.12996099025442587\n",
+      "We accept H_0, thus the claim has not been proven\n"
+     ]
+    }
+   ],
+   "source": [
+    "p = 0.255\n",
+    "n = 500\n",
+    "smokers = 138\n",
+    "# 1 - cdf(k) gives P(X > k); the exact upper-tail p-value P(X >= k) is 1 - cdf(k - 1)\n",
+    "p_value = 1 - stats.binom.cdf(smokers, n, p)\n",
+    "print(f\"p_value for accepting h_0: {p_value}\")\n",
+    "if p_value > alpha:\n",
+    "    print(\"We accept H_0, thus the claim has not been proven\")\n",
+    "else:\n",
+    "    print(\"We do not accept H_0, thus the claim is proven\")"
+   ]
+  }
+ ],
+ "metadata": {
+  "kernelspec": {
+   "display_name": "Python 3",
+   "language": "python",
+   "name": "python3"
+  },
+  "language_info": {
+   "codemirror_mode": {
+    "name": "ipython",
+    "version": 3
+   },
+   "file_extension": ".py",
+   "mimetype": "text/x-python",
+   "name": "python",
+   "nbconvert_exporter": "python",
+   "pygments_lexer": "ipython3",
+   "version": "3.9.2"
+  }
+ },
+ "nbformat": 4,
+ "nbformat_minor": 4
+}
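
A minimal standalone sketch, not part of the commit, of the one-sided tests from Problems 1 and 3 with the tail probabilities taken directly. The helper names `lower_tail_t_test` and `upper_tail_binomial_test` are hypothetical and introduced only for illustration; the sketch assumes nothing beyond `stats.t.cdf` and `stats.binom.sf` from SciPy.

```python
import numpy as np
from scipy import stats


def lower_tail_t_test(sample_mean, sample_std, n, mu_0, alpha=0.05):
    """Test H_0: mu >= mu_0 against H_1: mu < mu_0 from summary statistics."""
    ts = np.sqrt(n) * (sample_mean - mu_0) / sample_std
    # Lower-tail p-value: small values of TS are evidence against H_0.
    p_value = stats.t.cdf(ts, n - 1)
    return ts, p_value, p_value < alpha


def upper_tail_binomial_test(successes, n, p_0, alpha=0.05):
    """Test H_0: p <= p_0 against H_1: p > p_0 with the exact binomial tail."""
    # sf(k) = P(X > k), so P(X >= successes) is sf(successes - 1).
    p_value = stats.binom.sf(successes - 1, n, p_0)
    return p_value, p_value < alpha


if __name__ == "__main__":
    # Problem 1: sample mean 200, sample standard deviation 35, claimed mean 210.
    for n in (25, 64):
        ts, p_value, reject = lower_tail_t_test(200, 35, n, 210)
        print(f"n={n}: TS={ts:.4f}, p-value={p_value:.4f}, reject H_0: {reject}")
    # Problem 3: 138 smokers among 500, p_0 = 0.255.
    p_value, reject = upper_tail_binomial_test(138, 500, 0.255)
    print(f"smokers: p-value={p_value:.4f}, reject H_0: {reject}")
```

With the numbers from the notebook this reaches the same conclusions as the recorded output: the claim in Problem 1 is not rejected for n = 25 but is rejected for n = 64, and the claim in Problem 3 is not proven. Comparing the p-value against alpha directly avoids the detour through `1 - alpha`.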