{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "from scipy import stats"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 1\n",
    "\n",
    "> It is claimed that a certain type of bipolar transistor has a mean value of current\n",
    "> gain that is at least 210. A sample of these transistors is tested. If the sample mean\n",
    "> value of current gain is 200 with a sample standard deviation of 35, would the\n",
    "> claim be rejected at the 5 percent level of signiﬁcance if\n",
    "> \n",
    "> - (a) the sample size is 25;\n",
    "> - (b) the sample size is 64?\n",
    "\n",
    "First we define our $H_0$, which we set to $\\mu < 10$.\n",
    "\n",
    "We dont know the varience of the distribution, but instead the sample varience $S$.\n",
    "Then we can use the one sided *t-test*.\n",
    "\n",
    "$$H_0 : \\mu < 210$$\n",
    "$$H_1 : \\mu \\geq 210$$\n",
    "\n",
    "First TS is calculated with \n",
    "$$\n",
    "TS = \\sqrt{n} (\\bar{X} - \\mu_0) / S\n",
    "$$\n",
    "\n",
    "Then the p value is calculated\n",
    "$$\n",
    "p = P{T_{n-1} \\geq TS} = 1 - T(TS)\n",
    "$$\n",
    "\n",
    "Then one can check if the *p-value* is smaller than $0.95$.\n"
   ]
  },
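  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a quick check of the arithmetic, the test statistic can be worked out by hand from the numbers in the problem statement ($\\bar{X} = 200$, $\\mu_0 = 210$, $S = 35$); values are rounded to four decimals.\n",
    "\n",
    "$$\n",
    "n = 25: \\quad TS = \\sqrt{25} \\cdot \\frac{200 - 210}{35} = \\frac{-50}{35} \\approx -1.4286\n",
    "$$\n",
    "\n",
    "$$\n",
    "n = 64: \\quad TS = \\sqrt{64} \\cdot \\frac{200 - 210}{35} = \\frac{-80}{35} \\approx -2.2857\n",
    "$$\n",
    "\n",
    "The larger sample puts the same 10-unit shortfall further out in the tail, because $TS$ scales with $\\sqrt{n}$."
   ]
  },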
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "TS: -1.4285714285714286, p_value: 0.9169932070815955\n",
      "Assignment A: True\n",
      "TS: -2.2857142857142856, p_value: 0.987176989403574\n",
      "Assignment B: False\n"
     ]
    }
   ],
   "source": [
    "mu_min = 210\n",
    "mu_sample = 200\n",
    "sigma_sample= 35\n",
    "alpha = 0.05\n",
    "accept = 1 - alpha\n",
    "\n",
    "def test_with_n(n):\n",
    "    TS = np.sqrt(n) * (mu_sample - mu_min) / sigma_sample\n",
    "    p_value = 1 - stats.t.cdf(TS, n-1)\n",
    "    \n",
    "    print(f\"TS: {TS}, p_value: {p_value}\")\n",
    "    return p_value < accept\n",
    "\n",
    "# Part A\n",
    "print(f\"Assignment A: {test_with_n(25)}\")\n",
    "print(f\"Assignment B: {test_with_n(64)}\")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 2\n",
    "\n",
    "> A question of medical importance is whether jogging leads to a reduction in\n",
    "one’s pulse rate. To test this hypothesis, 8 nonjogging volunteers agreed to begin\n",
    "a 1-month jogging program. After the month their pulse rates were determined\n",
    "and compared with their earlier values. If the data are as follows, can we conclude\n",
    "that jogging has had an effect on the pulse rates?\n",
    "\n",
    "I wont put the table from the book in :-(.\n",
    "\n",
    "Here the after is dependent of the before.\n",
    "We therefore have to look at the differences\n",
    "\n",
    "We let $H_0$ be that the pulse is lowered, thus the difference mean $\\mu_d < 0$.\n",
    "\n",
    "We assume $\\alpha = 0.05$"
   ]
  },
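  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "As a rough hand check, the paired statistic can be worked out from the before/after values entered in the code for this problem. Here $d$ lists the differences (after minus before), and $\\bar{d}$ and $s_d$ are notation introduced here for their sample mean and sample standard deviation; results are rounded to four decimals.\n",
    "\n",
    "$$\n",
    "d = (-4, -1, -8, 8, -7, -4, -10, 4), \\qquad \\bar{d} = \\frac{-22}{8} = -2.75\n",
    "$$\n",
    "\n",
    "$$\n",
    "s_d = \\sqrt{\\frac{\\sum_i (d_i - \\bar{d})^2}{n - 1}} = \\sqrt{\\frac{265.5}{7}} \\approx 6.1586\n",
    "$$\n",
    "\n",
    "$$\n",
    "TS = \\sqrt{8} \\cdot \\frac{-2.75}{6.1586} \\approx -1.2630\n",
    "$$"
   ]
  },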
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "TS: -1.2629741003498156, p-value: 0.12351970736529827\n",
      "H_0 will be accepted with all alpha=0.12351970736529827\n"
     ]
    }
   ],
   "source": [
    "before = np.array([74, 86, 98, 102, 78, 84, 79, 70])\n",
    "after = np.array([70, 85, 90, 110, 71, 80, 69, 74])\n",
    "diff = after - before\n",
    "n = len(diff)\n",
    "\n",
    "\n",
    "mu_s = np.mean(diff)\n",
    "var_s = np.sqrt(np.sum((diff - mu_s)**2 / (n - 1)))\n",
    "\n",
    "TS = np.sqrt(n) * (mu_s - 0) / var_s\n",
    "p_value = 1 - stats.t.cdf(np.abs(TS), n-1)\n",
    "print(f\"TS: {TS}, p-value: {p_value}\")\n",
    "print(f\"H_0 will be accepted with all alpha={p_value}\")\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Problem 3\n",
    "\n",
    "> According to the U.S. Bureau of the Census, 25.5 percent of the population of\n",
    "those age 18 or over smoked in 1990. A scientist has recently claimed that this\n",
    "percentage has since increased, and to prove her claim she randomly sampled 500\n",
    "individuals from this population. If 138 of them were smokers, is her claim proved?\n",
    "Use the 5 percent level of signiﬁcance.\n",
    "\n",
    "The $H_0$ is that the new percentage is lower of equal than 25.5.\n",
    "Because each person is a coin flip, this is a Bernoulli distribution.\n",
    "\n",
    "$H_0$ is therefore $p \\leq p_0$ where $p$ is the Bernoulli probability and $p_0 = 0.255$.\n",
    "\n",
    "We will let $X$ be the number of smokers in a population, so we will reject $H_0$ if $X$ is large enough.\n"
   ]
  },
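  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "For a discrete count, the upper-tail probability has to include the observed value itself. Written out for this problem, with $F$ denoting the CDF of the binomial distribution with $n = 500$ and $p_0 = 0.255$ (a symbol introduced here for illustration):\n",
    "\n",
    "$$\n",
    "P\\{X \\geq 138\\} = \\sum_{k=138}^{500} \\binom{500}{k} (0.255)^k (0.745)^{500-k} = 1 - F(137)\n",
    "$$\n",
    "\n",
    "Note the argument $137$: $1 - F(138)$ would give $P\\{X \\geq 139\\}$ instead."
   ]
  },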
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "p_value for accepting h_0: 0.12996099025442587\n",
      "We accept H_0, thus the claim has not been proven\n"
     ]
    }
   ],
   "source": [
    "p = 0.255\n",
    "n = 500\n",
    "smokers = 138\n",
    "\n",
    "p_value = 1 - stats.binom.cdf(138, 500, 0.255)\n",
    "print(f\"p_value for accepting h_0: {p_value}\")\n",
    "if p_value > alpha:\n",
    "    print(f\"We accept H_0, thus the claim has not been proven\")\n",
    "else:\n",
    "    print(f\"We do not accept H_0, thus the claim is proven\")"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.2"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}