{
 "cells": [
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "6840e790",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import matplotlib.pyplot as plt\n",
    "plt.style.use('fivethirtyeight')\n",
    "import pandas as pd"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "2148a32a",
   "metadata": {},
   "source": [
    "# HW_00 - Generative AI writing analysis\n",
    "\n",
    "In this assignment, you will generate instructions on brushing your teeth. You can use [ChatGPT](https://chatgpt.com/)\n",
    "\n",
    "Some metrics to add to the generated instructions:\n",
    "\n",
    "- number of strokes per minute\n",
    "- total brushing time\n",
    "- time spent per tooth \n",
    "- number of teeth or total area to brush on teeth\n",
    "- deflection of brush bristles for proper cleaning\n",
    "- what else can you think of?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c3825564",
   "metadata": {},
   "source": [
    "## Prompt Input and Output\n",
    "\n",
    "-> _copy-paste your prompts and outputs here_"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "64f3d3ea",
   "metadata": {},
   "source": [
    "## Revised document\n",
    "\n",
    "-> _copy-paste the document here, then edit the output to remove passive phrasing and add specific ideas from your own research or experience (try quantifying any phrases such as 'many', 'fewer', 'more important', etc._"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ad473b69",
   "metadata": {},
   "source": [
    "_run the cell below to get your `tf_idf` functions ready to run_"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "dde4ef55-7b6b-4920-8b7e-fe6785f5ef9c",
   "metadata": {},
   "source": [
    "run this line to get the `tf_idf` functions\n",
    "```python \n",
    "! pip install tf-idf-cosimm==0.0.2\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "id": "a87cd6c7-5db8-43fc-a63b-c3d9fcb4a531",
   "metadata": {},
   "outputs": [
    {
     "name": "stderr",
     "output_type": "stream",
     "text": [
      "[nltk_data] Downloading package punkt to /home/ryan/nltk_data...\n",
      "[nltk_data]   Package punkt is already up-to-date!\n"
     ]
    }
   ],
   "source": [
    "import tf_idf.core as tf_idf"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "id": "a12beb9f-aee7-4e7a-962a-0c5bf22f32b3",
   "metadata": {},
   "outputs": [],
   "source": [
    "AI = '''text from chatGPT'''\n",
    "compare = tf_idf.preprocess_text(AI)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "id": "bf6f7dc1-3da0-4251-b5a3-a12121939d3b",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>DOCUMENT</th>\n",
       "      <th>LOWERCASE</th>\n",
       "      <th>CLEANING</th>\n",
       "      <th>TOKENIZATION</th>\n",
       "      <th>STOP-WORDS</th>\n",
       "      <th>STEMMING</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>text from chatGPT</td>\n",
       "      <td>text from chatgpt</td>\n",
       "      <td>text from chatgpt</td>\n",
       "      <td>[text, from, chatgpt]</td>\n",
       "      <td>[text, chatgpt]</td>\n",
       "      <td>[text, chatgpt]</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>my edited text</td>\n",
       "      <td>my edited text</td>\n",
       "      <td>my edited text</td>\n",
       "      <td>[my, edited, text]</td>\n",
       "      <td>[edited, text]</td>\n",
       "      <td>[edit, text]</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            DOCUMENT          LOWERCASE           CLEANING  \\\n",
       "0  text from chatGPT  text from chatgpt  text from chatgpt   \n",
       "1     my edited text     my edited text     my edited text   \n",
       "\n",
       "            TOKENIZATION       STOP-WORDS         STEMMING  \n",
       "0  [text, from, chatgpt]  [text, chatgpt]  [text, chatgpt]  \n",
       "1     [my, edited, text]   [edited, text]     [edit, text]  "
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "ME = '''my edited text'''\n",
    "compare = pd.concat([compare, tf_idf.preprocess_text(ME)], \n",
    "                    ignore_index=True)\n",
    "compare"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "id": "4b7f89e7-4272-472a-8f81-cbb5884207e7",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>DOCUMENT</th>\n",
       "      <th>STEMMING</th>\n",
       "      <th>COSIM</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>text from chatGPT</td>\n",
       "      <td>[text, chatgpt]</td>\n",
       "      <td>1.000000</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>my edited text</td>\n",
       "      <td>[edit, text]</td>\n",
       "      <td>0.336097</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "            DOCUMENT         STEMMING     COSIM\n",
       "0  text from chatGPT  [text, chatgpt]  1.000000\n",
       "1     my edited text     [edit, text]  0.336097"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "tf_idf.cosineSimilarity(compare)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a522a398",
   "metadata": {},
   "source": [
    "## Document analysis\n",
    "\n",
    "- Make a list of all the improvements and changes you made to document\n",
    "- use the `tf_idf.cosineSimilarity` function to compare the AI version to your own\n",
    "\n",
    "Write a report on your intellectual property  in the 'revised document'. \n",
    "- How much can you claim as yours?\n",
    "- How many ideas came from AI?\n",
    "- How many ideas came from you?\n",
    "- Is this a _new_ document?\n",
    "- If this work was made by you and another person-not AI-would you need to credit this person as a coauthor?\n",
    "- What else can you discuss about this comparison and this process?"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6e9a8131-a20e-4853-b8dc-9a044a20cfc6",
   "metadata": {},
   "source": [
    "## Submit your notebook\n",
    "- Click File\n",
    "- Choose Download -> 'Download .ipynb'\n",
    "- go to the [Ethical use of AI in writing](https://forms.gle/38FD931mXw8tC2cc8) form\n",
    "- submit the `HW_00.ipynb` file"
   ]
  }
 ],
 "metadata": {
  "jupytext": {
   "formats": "md:myst,ipynb"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.9.0"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}