{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exploitation-Exploration tradeoff\n",
"\n",
"## Multi-armed Bandits"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"- Read Chapter 2 from \"[Reinforcement Learing - An intruction](http://incompleteideas.net/book/RLbook2020trimmed.pdf)\" (Sutton and Barto, 2018), and focus on sections 2.1, 2.2, 2.3, 2.6, 2.7 and 2.10. [This is a legal copy from one of the authors.]\n",
"\n",
" - A high level overview of the chapter can be gained by watching this video: https://www.youtube-nocookie.com/embed/9LhNHK1ULxs?start=5\n",
" \n",
"- Study the Chapter 2 code, reproduced below, from \"[Re-implementations in Python by Shangtong Zhang](http://incompleteideas.net/book/code/code2nd.html)\"."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The tasks for assessment are:\n",
"1. Set the **random-numbe-generator seed** to be your student ID. (See `np.random.seed(.....)` below.)\n",
"2. Choose a value for $k$ from the set $\\{7, 8, 9, 10, 11, 12\\}$ (i.e. `k_arm=....`).\n",
"3. Devise and run 5 computational experiments to study the effect of the parameters `epsilon`, `initial`, `step_size`, `sample_averages`, `UCB_param`, `gradient`.\n",
"\n",
" For each experiment:\n",
" - Explain what its aim is.\n",
" - Explain what parameters are being used, and what they are meant to control.\n",
" - Use diagrams to show your results, then discuss them.\n",
" \n",
" Your experiments must be sufficiently distinct from those presented below (which reproduce experiments in the book). For example, you may try a wider range of $\\varepsilon$ values to try and find an optimal value. You should also look into the initial distributions of the bandits' values.\n",
" \n",
"**NB** Note that the values for `MAX_RUNS = 100` `MAX_TIME = 300` are low to make the code faster. You will need to increase these at the final run to get smoother results."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Marking scheme\n",
"\n",
"|Item|Mark|\n",
"|:----|---:|\n",
"|Experimet 1|/4|\n",
"|Experimet 2|/4|\n",
"|Experimet 3|/4|\n",
"|Experimet 4|/4|\n",
"|Experimet 5|/4|\n",
"|||\n",
"|**Total**: |/20|\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:17.835969Z",
"start_time": "2022-10-25T11:31:17.045793Z"
}
},
"outputs": [],
"source": [
"import numpy as np\n",
"from numpy.random import rand, randn, choice\n",
"\n",
"np.random.seed(123456) ## --- SET THIS TO YOUR SID --- ##"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:17.851583Z",
"start_time": "2022-10-25T11:31:17.835969Z"
}
},
"outputs": [],
"source": [
"## --- Increase these values at the last run to get smoother statistics --- ##\n",
"MAX_RUNS = 100\n",
"MAX_TIME = 300"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:19.782382Z",
"start_time": "2022-10-25T11:31:17.851583Z"
}
},
"outputs": [],
"source": [
"# Adapted by Kamal Bentahar (2022) from:\n",
"# https://github.com/ShangtongZhang/reinforcement-learning-an-introduction/blob/master/chapter02/ten_armed_testbed.py\n",
"\n",
"#######################################################################\n",
"# Copyright (C) #\n",
"# 2016-2018 Shangtong Zhang(zhangshangtong.cpp@gmail.com) #\n",
"# 2016 Tian Jun(tianjun.cpp@gmail.com) #\n",
"# 2016 Artem Oboturov(oboturov@gmail.com) #\n",
"# 2016 Kenta Shimada(hyperkentakun@gmail.com) #\n",
"# Permission given to modify the code as long as you keep this #\n",
"# declaration at the top #\n",
"#######################################################################\n",
"\n",
"import matplotlib\n",
"import matplotlib.pyplot as plt\n",
"from tqdm import trange"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:19.828263Z",
"start_time": "2022-10-25T11:31:19.782382Z"
}
},
"outputs": [],
"source": [
"class Bandit:\n",
" # @k_arm: number of arms\n",
" # @epsilon: probability for exploration in epsilon-greedy algorithm\n",
" # @initial: initial estimation for each action\n",
" # @step_size: constant step size for updating estimations\n",
" # @sample_averages: if True, use sample averages to update estimations instead of constant step size\n",
" # @UCB_param: if not None, use UCB algorithm to select action\n",
" # @gradient: if True, use gradient based bandit algorithm\n",
" # @gradient_baseline: if True, use average reward as baseline for gradient based bandit algorithm\n",
"\n",
" def __init__(self, k_arm=7, epsilon=0.0, initial=0.0, step_size=0.1, sample_averages=False,\n",
" UCB_param=None, gradient=False, gradient_baseline=False, true_reward=0.0):\n",
" self.k = k_arm\n",
" self.step_size = step_size\n",
" self.sample_averages = sample_averages\n",
" self.indices = np.arange(self.k)\n",
" self.time = 0\n",
" self.UCB_param = UCB_param\n",
" self.gradient = gradient\n",
" self.gradient_baseline = gradient_baseline\n",
" self.average_reward = 0\n",
" self.true_reward = true_reward\n",
" self.epsilon = epsilon\n",
" self.initial = initial\n",
"\n",
" def reset(self): \n",
" self.q_true = randn(self.k) + self.true_reward # real reward for each action\n",
" self.q_estimation = np.zeros(self.k) + self.initial # estimation for each action\n",
" self.action_count = np.zeros(self.k) # number of chosen times for each action\n",
" self.best_action = np.argmax(self.q_true)\n",
" self.time = 0\n",
"\n",
" def act(self):\n",
" ''' Get an action for this bandit '''\n",
" if rand() < self.epsilon:\n",
" return choice(self.indices)\n",
"\n",
" if self.UCB_param is not None:\n",
" UCB_estimation = self.q_estimation\n",
" UCB_estimation += self.UCB_param * np.sqrt(np.log(self.time + 1) / (self.action_count + 1e-5))\n",
" q_best = np.max(UCB_estimation)\n",
" return choice(np.where(UCB_estimation == q_best)[0])\n",
"\n",
" if self.gradient:\n",
" exp_est = np.exp(self.q_estimation)\n",
" self.action_prob = exp_est / np.sum(exp_est)\n",
" return choice(self.indices, p=self.action_prob)\n",
"\n",
" q_best = np.max(self.q_estimation)\n",
" return choice(np.where(self.q_estimation == q_best)[0])\n",
"\n",
" def step(self, action):\n",
" ''' Take an action, update estimation for this action '''\n",
" # generate the reward under N(real reward, 1)\n",
" reward = randn() + self.q_true[action]\n",
" self.time += 1\n",
" self.action_count[action] += 1\n",
" self.average_reward += (reward - self.average_reward) / self.time\n",
" if self.sample_averages: # update estimation using sample averages\n",
" self.q_estimation[action] += (reward - self.q_estimation[action]) / self.action_count[action]\n",
" elif self.gradient:\n",
" one_hot = np.zeros(self.k)\n",
" one_hot[action] = 1\n",
" baseline = self.average_reward if self.gradient_baseline else 0\n",
" self.q_estimation += self.step_size * (reward - baseline) * (one_hot - self.action_prob)\n",
" else: # update estimation with constant step size\n",
" self.q_estimation[action] += self.step_size * (reward - self.q_estimation[action])\n",
" return reward"
]
},
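{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, `reset` and `step` above define the testbed as follows: at the start of each run the true value of every action is drawn as $q_*(a) \sim \mathcal{N}(\mu_{\text{true}}, 1)$, where $\mu_{\text{true}}$ is `true_reward`, and each observed reward is then drawn as\n",
"\n",
"$$R_t \sim \mathcal{N}(q_*(A_t), 1).$$\n",
"\n",
"The estimates `q_estimation` are updated either by sample averages or with a constant step size, depending on the constructor flags."
]
},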
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:19.859949Z",
"start_time": "2022-10-25T11:31:19.833259Z"
}
},
"outputs": [],
"source": [
"def simulate(runs, time, bandits):\n",
" ''' Returns: mean_best_action_counts, mean_rewards '''\n",
" rewards = np.zeros((len(bandits), runs, time))\n",
" best_action_counts = np.zeros(rewards.shape)\n",
" for i, bandit in enumerate(bandits):\n",
" for r in trange(runs):\n",
" bandit.reset()\n",
" for t in range(time):\n",
" action = bandit.act()\n",
" reward = bandit.step(action)\n",
" rewards[i, r, t] = reward\n",
" if action == bandit.best_action:\n",
" best_action_counts[i, r, t] = 1\n",
" mean_best_action_counts = best_action_counts.mean(axis=1)\n",
" mean_rewards = rewards.mean(axis=1)\n",
" return mean_best_action_counts, mean_rewards"
]
},
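{
"cell_type": "markdown",
"metadata": {},
"source": [
"A minimal usage sketch of the `Bandit`/`simulate` API above (the settings here are illustrative only and are not one of the assessed experiments): `simulate` returns two arrays of shape `(len(bandits), time)`, holding the fraction of runs in which the best action was chosen and the mean reward at each time step."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Illustrative only: a tiny simulation to show the shapes returned by simulate().\n",
"demo_bandits = [Bandit(k_arm=7, epsilon=0.1, sample_averages=True)]\n",
"demo_best, demo_rewards = simulate(runs=5, time=50, bandits=demo_bandits)\n",
"print(demo_best.shape, demo_rewards.shape)  # both (1, 50): (number of bandits, time)"
]
},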
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_1.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 1 \n",
"\n",
"$$\n",
"\n",
"For K-armed bandit, this is the 7-armed test bed\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:31:20.525782Z",
"start_time": "2022-10-25T11:31:19.859949Z"
}
},
"outputs": [],
"source": [
"def figure_2_1(k=7):\n",
" plt.figure(figsize=(12, 3))\n",
" plt.violinplot(dataset=randn(200, k) + randn(k))\n",
" plt.xlabel(\"Action\")\n",
" plt.ylabel(\"Reward distribution\")\n",
" plt.show()\n",
"figure_2_1()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experiment 1\n",
"\n",
"The Aim of this experiment is to understand the K-Arm bandit problem and results of various Re-enforcement Learning Algorithms, in this experiment we study various greedy methods\n",
"\n",
"For K_arm = 7\n",
"\n",
"Max Time = 1000 and Max Runs = 300, Please see figure below. \n",
"\n",
"In the figure we see Average performance of ε-greedy action-value methods on the 10-armed testbed.\n",
"These data are averages over 300 runs with different bandit problems. We see comparisons of greedy method with the ε-greedy methods for ε = 0.10 and ε = 0.01, we see for optimum action the greedy approach took less time to find the optimum strategy."
]
},
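{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, with `sample_averages=True` the `Bandit` class updates the estimate of the selected action incrementally (Sutton and Barto, 2018):\n",
"\n",
"$$Q_{n+1} = Q_n + \frac{1}{n}\left(R_n - Q_n\right),$$\n",
"\n",
"where $n$ is the number of times that action has been selected so far. The ε-greedy rule then exploits (picks $\arg\max_a Q(a)$) with probability $1-\varepsilon$ and explores (picks a uniformly random action) with probability $\varepsilon$."
]
},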
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_2.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 2\n",
"\n",
"$$\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 200/200 [01:01<00:00, 3.25it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 200/200 [00:59<00:00, 3.39it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 200/200 [01:01<00:00, 3.24it/s]\n",
"\n",
"$$\n",
"\n",
"Terminal \\space Output\n",
"\n",
"$$\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:35:51.969695Z",
"start_time": "2022-10-25T11:31:20.525782Z"
}
},
"outputs": [],
"source": [
"def figure_2_2(runs=MAX_RUNS, time=MAX_TIME):\n",
" epsilons = [0, 0.1, 0.01]\n",
" bandits = [Bandit(epsilon=eps, sample_averages=True) for eps in epsilons]\n",
" best_action_counts, rewards = simulate(runs, time, bandits)\n",
"\n",
" plt.figure(figsize=(12, 6))\n",
"\n",
" plt.subplot(2, 1, 1)\n",
" for eps, rewards in zip(epsilons, rewards):\n",
" plt.plot(rewards, label=f'$\\epsilon = {eps:.02f}$')\n",
" plt.xlabel('steps')\n",
" plt.ylabel('average reward')\n",
" plt.legend()\n",
"\n",
" plt.subplot(2, 1, 2)\n",
" for eps, counts in zip(epsilons, best_action_counts):\n",
" plt.plot(counts, label=f'$\\epsilon = {eps:.02f}$')\n",
" plt.xlabel('steps')\n",
" plt.ylabel('% optimal action')\n",
" plt.legend()\n",
" plt.show()\n",
"figure_2_2(200, 10000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experiment 2\n",
"\n",
"In this experiment, we see The effect of optimistic initial action-value estimates on the 10-armed testbed.\n",
"Both methods used a constant step-size parameter = 0.1. we provide the optimistic initial values as q=5 to the greedy method and for ε=0.1 for ε-greedy method the initial value stays at 0. \n",
"\n",
"As seen in the figure below we see initially greedy method is better but over time of 1000, ε=0.1 method gives a lower % optimal action "
]
},
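{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, with a constant step size $\alpha$ the estimate is an exponentially weighted average of the past rewards and the initial estimate (Sutton and Barto, 2018):\n",
"\n",
"$$Q_{n+1} = (1-\alpha)^n Q_1 + \sum_{i=1}^{n} \alpha (1-\alpha)^{n-i} R_i,$$\n",
"\n",
"so the optimistic initial value $Q_1 = 5$ inflates the estimates only temporarily: it drives early exploration and its influence decays at the rate $(1-\alpha)^n$."
]
},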
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_3.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 3\n",
"\n",
"$$\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:11<00:00, 29.22it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:11<00:00, 29.77it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:15<00:00, 21.50it/s]\n",
"\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:35:54.919508Z",
"start_time": "2022-10-25T11:35:51.969695Z"
}
},
"outputs": [],
"source": [
"def figure_2_3(runs=MAX_RUNS, time=MAX_TIME):\n",
" bandits = []\n",
" bandits.append(Bandit(epsilon=0, initial=5, step_size=0.1))\n",
" bandits.append(Bandit(epsilon=0.1, initial=0, step_size=0.1))\n",
" best_action_counts, _ = simulate(runs, time, bandits)\n",
"\n",
" plt.figure(figsize=(12, 3))\n",
" plt.plot(best_action_counts[0], label='$\\epsilon = 0, q = 5$')\n",
" plt.plot(best_action_counts[1], label='$\\epsilon = 0.1, q = 0$')\n",
" plt.xlabel('Steps')\n",
" plt.ylabel('% optimal action')\n",
" plt.legend()\n",
" plt.show()\n",
"figure_2_3()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experiment 3\n",
"\n",
"In this experiment we repeat the same method as above, however we keep the step size as 0.1 for both methods with greedy and ε-greedy, on the bandit problem for a 10-armed bed\n",
"\n",
"As seen in the result below, ε-greedy approach reaches an optimum result quicker that the greedy method "
]
},
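{
"cell_type": "markdown",
"metadata": {},
"source": [
"A sketch of one possible way to configure the comparison described above with the `Bandit` class (both methods with a constant step size of 0.1; runs and time are left at the global defaults):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Sketch: greedy vs epsilon-greedy, both using a constant step size of 0.1.\n",
"sketch_bandits = [Bandit(epsilon=0, step_size=0.1),\n",
"                  Bandit(epsilon=0.1, step_size=0.1)]\n",
"sketch_counts, _ = simulate(MAX_RUNS, MAX_TIME, sketch_bandits)\n",
"\n",
"plt.figure(figsize=(12, 3))\n",
"plt.plot(sketch_counts[0], label='greedy, step size 0.1')\n",
"plt.plot(sketch_counts[1], label='$\epsilon = 0.1$, step size 0.1')\n",
"plt.xlabel('Steps')\n",
"plt.ylabel('% optimal action')\n",
"plt.legend()\n",
"plt.show()"
]
},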
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_4.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 4\n",
"\n",
"$$\n",
"\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:10<00:00, 31.91it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:10<00:00, 32.83it/s]\n",
"\n",
"\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:35:58.305039Z",
"start_time": "2022-10-25T11:35:54.919508Z"
}
},
"outputs": [],
"source": [
"def figure_2_4(runs=MAX_RUNS, time=MAX_TIME):\n",
" bandits = []\n",
" bandits.append(Bandit(epsilon=0, UCB_param=2, sample_averages=True))\n",
" bandits.append(Bandit(epsilon=0.1, sample_averages=True))\n",
" _, average_rewards = simulate(runs, time, bandits)\n",
"\n",
" plt.figure(figsize=(12, 3))\n",
" plt.plot(average_rewards[0], label='UCB $c = 2$')\n",
" plt.plot(average_rewards[1], label='epsilon greedy $\\epsilon = 0.1$')\n",
" plt.xlabel('Steps')\n",
" plt.ylabel('Average reward')\n",
" plt.legend()\n",
" plt.show()\n",
"figure_2_4()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experiment 4\n",
"\n",
"In this experiment, we compare the average performance of UCB action selection on the 10-armed testbed. \n",
"\n",
"As seen in the results below, UCB generally performs better than ε-greedy action selection, except in the first k steps, when it selects randomly among the actions that are not tried before"
]
},
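{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, the UCB rule implemented in `Bandit.act` selects\n",
"\n",
"$$A_t = \underset{a}{\arg\max}\left[ Q_t(a) + c\sqrt{\frac{\ln t}{N_t(a)}} \right],$$\n",
"\n",
"where $N_t(a)$ is the number of times action $a$ has been selected and $c$ (`UCB_param`) controls the size of the exploration bonus, so rarely tried actions are favoured early on. (The code adds 1 inside the logarithm and a small constant to $N_t(a)$ to avoid division by zero.)"
]
},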
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_5.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 5\n",
"\n",
"$$\n",
"\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:21<00:00, 15.41it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:19<00:00, 16.69it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:20<00:00, 16.49it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:18<00:00, 17.78it/s]"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:36:02.554756Z",
"start_time": "2022-10-25T11:35:58.305039Z"
}
},
"outputs": [],
"source": [
"def figure_2_5(runs=MAX_RUNS, time=MAX_TIME):\n",
" bandits = []\n",
" bandits.append(Bandit(gradient=True, step_size=0.1, gradient_baseline=True, true_reward=4))\n",
" bandits.append(Bandit(gradient=True, step_size=0.4, gradient_baseline=True, true_reward=4))\n",
" best_action_counts, _ = simulate(runs, time, bandits)\n",
" labels = [r'$\\alpha = 0.1$, with baseline',\n",
" r'$\\alpha = 0.4$, with baseline'\n",
" ]\n",
"\n",
" plt.figure(figsize=(12, 3))\n",
" for i in range(len(bandits)):\n",
" plt.plot(best_action_counts[i], label=labels[i])\n",
" plt.xlabel('Steps')\n",
" plt.ylabel('% Optimal action')\n",
" plt.legend()\n",
" plt.show()\n",
"figure_2_5()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Experiment 5\n",
"\n",
"In this experiment, we see average performance of the gradient bandit algorithm with and without a reward\n",
"baseline on the 10-armed testbed when the q(a) are chosen to be near 4 rather than near zero.\n",
"\n",
"As seen in the figure below , with a gradient baseline present , the optimal action is reached quicker\n"
]
},
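{
"cell_type": "markdown",
"metadata": {},
"source": [
"For reference, when `gradient=True` the `Bandit` class treats `q_estimation` as action preferences $H_t(a)$, selects actions with the softmax probabilities\n",
"\n",
"$$\pi_t(a) = \frac{e^{H_t(a)}}{\sum_b e^{H_t(b)}},$$\n",
"\n",
"and updates the preferences by\n",
"\n",
"$$H_{t+1}(a) = H_t(a) + \alpha\,(R_t - \bar{R}_t)\,\bigl(\mathbf{1}_{a=A_t} - \pi_t(a)\bigr),$$\n",
"\n",
"where $\bar{R}_t$ is the average reward so far, used as the baseline when `gradient_baseline=True` and taken to be 0 otherwise."
]
},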
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_6.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 6\n",
"\n",
"$$\n",
"\n",
"\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:18<00:00, 17.78it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:14<00:00, 22.15it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:25<00:00, 12.77it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:28<00:00, 11.64it/s]\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"ExecuteTime": {
"end_time": "2022-10-25T11:36:44.335468Z",
"start_time": "2022-10-25T11:36:02.554756Z"
}
},
"outputs": [],
"source": [
"def figure_2_6(runs=MAX_RUNS, time=MAX_TIME):\n",
" labels = ['epsilon-greedy', 'gradient bandit',\n",
" 'UCB', 'optimistic initialization']\n",
" generators = [lambda epsilon: Bandit(epsilon=epsilon, sample_averages=True),\n",
" lambda alpha: Bandit(gradient=True, step_size=alpha, gradient_baseline=True),\n",
" lambda coef: Bandit(epsilon=0, UCB_param=coef, sample_averages=True),\n",
" lambda initial: Bandit(epsilon=0, initial=initial, step_size=0.1)]\n",
" parameters = [np.arange(-7, -1, dtype=float),\n",
" np.arange(-5, 2, dtype=float),\n",
" np.arange(-4, 3, dtype=float),\n",
" np.arange(-2, 3, dtype=float)]\n",
"\n",
" bandits = []\n",
" for generator, parameter in zip(generators, parameters):\n",
" for param in parameter:\n",
" bandits.append(generator(2**param))\n",
"\n",
" _, average_rewards = simulate(runs, time, bandits)\n",
" rewards = np.mean(average_rewards, axis=1)\n",
"\n",
" plt.figure(figsize=(12, 3))\n",
" i = 0\n",
" for label, parameter in zip(labels, parameters):\n",
" l = len(parameter)\n",
" plt.plot(parameter, rewards[i:i+l], label=label)\n",
" i += l\n",
" plt.xlabel('Parameter ($2^x$)')\n",
" plt.ylabel('Average reward')\n",
" plt.legend()\n",
" plt.show()\n",
"figure_2_6()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Summary \n",
"\n",
"In the below figure we see a summary of the different types bandit algorithms which we tested in the the experiments (K_arm = 7, Max Run = 330, Max time = 1000) above.\n",
"Each point is the average reward obtained over 330 steps with a particular algorithm at a\n",
"particular setting of its parameter."
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"![title](RLImages/Figure_7.png)\n",
"\n",
"\n",
"$$\n",
"\n",
"Figure \\space 7\n",
"\n",
"$$\n",
"\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:21<00:00, 15.53it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:19<00:00, 16.65it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:15<00:00, 21.93it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:11<00:00, 28.16it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:11<00:00, 29.43it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:11<00:00, 29.15it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:13<00:00, 25.17it/s]\n",
"100%|███████████████████████████████████████████████████████████████████████| 330/330 [00:14<00:00, 22.75it/s\n"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Conclusion\n",
"\n",
"The various reinforcement learning application was studied using the bandit algorithms with various experiments. \n",
"\n",
"Please refer to RL.py for python code. "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# List of references\n",
"\n",
"Reinforcement learning : an introduction / Richard S. Sutton and Andrew G. Barto.\n",
"Description: Second edition. | Cambridge, MA : The MIT Press, [2018] http://incompleteideas.net/book/RLbook2020trimmed.pdf\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3.11.0 ('venv': venv)",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.0"
},
"toc": {
"base_numbering": 1,
"nav_menu": {},
"number_sections": false,
"sideBar": true,
"skip_h1_title": false,
"title_cell": "Table of Contents",
"title_sidebar": "Contents",
"toc_cell": false,
"toc_position": {},
"toc_section_display": true,
"toc_window_display": false
},
"vscode": {
"interpreter": {
"hash": "9646fcfabfca22912ce5fe7fa2239f453c97b6dafcc6a8d175371d4d5afbb8ca"
}
}
},
"nbformat": 4,
"nbformat_minor": 4
}