import numpy as np
import matplotlib.pyplot as plt
plt.style.use('fivethirtyeight')
import pandas as pd
HW_01 - Generative AI coding analysis
In this assignment, you will generate fizz-buzz code using ChatGPT and then edit the resulting block of code.
The Fizz-Buzz question is common in coding interviews to test how a person thinks through a problem. The goal is to print out the following output for numbers 1-N:
if the number is divisible by 3, print fizz
if the number is divisible by 5, print buzz
if the number is divisible by 3 and 5, print fizzbuzz!
otherwise, print the number, e.g. 1
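The rules above can be sketched in a few lines of Python. This is one possible arrangement, not the required answer: the function name `fizzbuzz_line` and the choice `N = 15` are illustrative assumptions. The combined case is checked first, because a number divisible by both 3 and 5 would otherwise match the divisible-by-3 branch.

```python
def fizzbuzz_line(n):
    """Return the fizz-buzz output for a single integer n."""
    # Check the combined case first: 15 is the least common multiple of 3 and 5.
    if n % 15 == 0:
        return "fizzbuzz!"
    if n % 3 == 0:
        return "fizz"
    if n % 5 == 0:
        return "buzz"
    return str(n)


N = 15  # illustrative upper bound
for n in range(1, N + 1):
    print(fizzbuzz_line(n))
```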
We are changing the goal. Now we want:
if the number is divisible by 3, print UConn
if the number is divisible by 5, print Basketball
if the number is divisible by 3 and 5, print Champions!
otherwise, print the number, e.g. 1
Try to get ChatGPT to give you working fizz-buzz code, then edit the code to print our UConn Basketball Champions outputs.
Prompt Input and Output
-> copy-paste your prompts and outputs here
Revised document
-> copy-paste the document here, then edit the output to remove passive phrasing and add specific ideas from your own research or experience (try quantifying any phrases such as ‘many’, ‘fewer’, ‘more important’, etc.)
Run the cell below to get your tf_idf functions ready to run.
! pip install tf-idf-cosimm==0.0.2
Requirement already satisfied: tf-idf-cosimm==0.0.2 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (0.0.2)
Requirement already satisfied: numpy in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from tf-idf-cosimm==0.0.2) (2.1.1)
Requirement already satisfied: pandas in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from tf-idf-cosimm==0.0.2) (2.2.2)
Requirement already satisfied: nltk in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from tf-idf-cosimm==0.0.2) (3.9.1)
Requirement already satisfied: scikit-learn in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from tf-idf-cosimm==0.0.2) (1.5.2)
Requirement already satisfied: click in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from nltk->tf-idf-cosimm==0.0.2) (8.1.7)
Requirement already satisfied: joblib in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from nltk->tf-idf-cosimm==0.0.2) (1.4.2)
Requirement already satisfied: regex>=2021.8.3 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from nltk->tf-idf-cosimm==0.0.2) (2024.9.11)
Requirement already satisfied: tqdm in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from nltk->tf-idf-cosimm==0.0.2) (4.66.5)
Requirement already satisfied: python-dateutil>=2.8.2 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from pandas->tf-idf-cosimm==0.0.2) (2.9.0.post0)
Requirement already satisfied: pytz>=2020.1 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from pandas->tf-idf-cosimm==0.0.2) (2024.2)
Requirement already satisfied: tzdata>=2022.7 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from pandas->tf-idf-cosimm==0.0.2) (2024.1)
Requirement already satisfied: scipy>=1.6.0 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from scikit-learn->tf-idf-cosimm==0.0.2) (1.14.1)
Requirement already satisfied: threadpoolctl>=3.1.0 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from scikit-learn->tf-idf-cosimm==0.0.2) (3.5.0)
Requirement already satisfied: six>=1.5 in /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages (from python-dateutil>=2.8.2->pandas->tf-idf-cosimm==0.0.2) (1.16.0)
import tf_idf.core as tf_idf
[nltk_data] Downloading package punkt to /home/runner/nltk_data...
[nltk_data] Package punkt is already up-to-date!
AI = '''for i in range(10): print('fizzbuzz')'''
compare = tf_idf.preprocess_text(AI)
---------------------------------------------------------------------------
LookupError Traceback (most recent call last)
Cell In[4], line 2
1 AI = '''for i in range(10): print('fizzbuzz')'''
----> 2 compare = tf_idf.preprocess_text(AI)
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/tf_idf/core.py:33, in preprocess_text(text)
29 remove_white_space = remove_punctuation.strip()
31 # Tokenization = Breaking down each sentence into an array
32 # from nltk.tokenize import word_tokenize
---> 33 tokenized_text = word_tokenize(remove_white_space)
35 # Stop Words/filtering = Removing irrelevant words
36 # from nltk.corpus import stopwords
37 # stopwords = set(stopwords.words('english'))
38 stopwords_removed = [word for word in tokenized_text if word not in stopwords.words()]
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/tokenize/__init__.py:142, in word_tokenize(text, language, preserve_line)
127 def word_tokenize(text, language="english", preserve_line=False):
128 """
129 Return a tokenized copy of *text*,
130 using NLTK's recommended word tokenizer
(...)
140 :type preserve_line: bool
141 """
--> 142 sentences = [text] if preserve_line else sent_tokenize(text, language)
143 return [
144 token for sent in sentences for token in _treebank_word_tokenizer.tokenize(sent)
145 ]
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/tokenize/__init__.py:119, in sent_tokenize(text, language)
109 def sent_tokenize(text, language="english"):
110 """
111 Return a sentence-tokenized copy of *text*,
112 using NLTK's recommended sentence tokenizer
(...)
117 :param language: the model name in the Punkt corpus
118 """
--> 119 tokenizer = _get_punkt_tokenizer(language)
120 return tokenizer.tokenize(text)
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/tokenize/__init__.py:105, in _get_punkt_tokenizer(language)
96 @functools.lru_cache
97 def _get_punkt_tokenizer(language="english"):
98 """
99 A constructor for the PunktTokenizer that utilizes
100 a lru cache for performance.
(...)
103 :type language: str
104 """
--> 105 return PunktTokenizer(language)
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/tokenize/punkt.py:1744, in PunktTokenizer.__init__(self, lang)
1742 def __init__(self, lang="english"):
1743 PunktSentenceTokenizer.__init__(self)
-> 1744 self.load_lang(lang)
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/tokenize/punkt.py:1749, in PunktTokenizer.load_lang(self, lang)
1746 def load_lang(self, lang="english"):
1747 from nltk.data import find
-> 1749 lang_dir = find(f"tokenizers/punkt_tab/{lang}/")
1750 self._params = load_punkt_params(lang_dir)
1751 self._lang = lang
File /opt/hostedtoolcache/Python/3.11.10/x64/lib/python3.11/site-packages/nltk/data.py:579, in find(resource_name, paths)
577 sep = "*" * 70
578 resource_not_found = f"\n{sep}\n{msg}\n{sep}\n"
--> 579 raise LookupError(resource_not_found)
LookupError:
**********************************************************************
Resource punkt_tab not found.
Please use the NLTK Downloader to obtain the resource:
>>> import nltk
>>> nltk.download('punkt_tab')
For more information see: https://www.nltk.org/data.html
Attempted to load tokenizers/punkt_tab/english/
Searched in:
- '/home/runner/nltk_data'
- '/opt/hostedtoolcache/Python/3.11.10/x64/nltk_data'
- '/opt/hostedtoolcache/Python/3.11.10/x64/share/nltk_data'
- '/opt/hostedtoolcache/Python/3.11.10/x64/lib/nltk_data'
- '/usr/share/nltk_data'
- '/usr/local/share/nltk_data'
- '/usr/lib/nltk_data'
- '/usr/local/lib/nltk_data'
**********************************************************************
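The traceback above says that the `punkt_tab` resource is missing and suggests `nltk.download('punkt_tab')`; the import cell fetched only `punkt`. A minimal sketch of a guard cell that could run before `tf_idf.preprocess_text` follows. The helper name `ensure_punkt_tab` is hypothetical, and the sketch assumes the notebook has network access to download the resource.

```python
def ensure_punkt_tab():
    """Return True if NLTK's punkt_tab tokenizer data is available."""
    try:
        import nltk
    except ImportError:
        # NLTK is installed as a dependency of tf-idf-cosimm; this guard only
        # keeps the helper importable in environments without it.
        return False
    try:
        # Same resource path that the traceback reports as missing.
        nltk.data.find("tokenizers/punkt_tab/english/")
        return True
    except LookupError:
        # nltk.download returns True on success and False on failure.
        return bool(nltk.download("punkt_tab", quiet=True))


punkt_tab_ready = ensure_punkt_tab()
```

If `punkt_tab_ready` is `True`, rerunning the `compare = tf_idf.preprocess_text(AI)` cell should get past the `LookupError`.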
ME = '''for i in range(10): print('UConn')'''
compare = pd.concat([compare, tf_idf.preprocess_text(ME)],
                    ignore_index=True)
compare
| | DOCUMENT | LOWERCASE | CLEANING | TOKENIZATION | STOP-WORDS | STEMMING |
|---|---|---|---|---|---|---|
| 0 | for i in range(10): print('fizzbuzz') | for i in range(10): print('fizzbuzz') | for i in range10 printfizzbuzz | [for, i, in, range10, printfizzbuzz] | [range10, printfizzbuzz] | [range10, printfizzbuzz] |
| 1 | for i in range(10): print('UConn') | for i in range(10): print('uconn') | for i in range10 printuconn | [for, i, in, range10, printuconn] | [range10, printuconn] | [range10, printuconn] |
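The columns of the table above trace a pipeline: lowercase, strip punctuation, tokenize, remove stop words, stem. The sketch below imitates the first four steps in plain Python to show why `range(10)` becomes `range10`. It is not the package's implementation: `simple_preprocess` is a hypothetical name, the stop-word set is a three-word stand-in for NLTK's list, whitespace splitting replaces `word_tokenize`, and stemming is left out because it leaves these tokens unchanged in the table.

```python
import string

# Tiny illustrative stop-word set; the package uses NLTK's much larger list.
STOP_WORDS = {"for", "i", "in"}


def simple_preprocess(text):
    """Lowercase, drop punctuation, split on whitespace, remove stop words."""
    lowered = text.lower()
    # Punctuation is deleted, not replaced by spaces, so "range(10)" -> "range10".
    cleaned = lowered.translate(str.maketrans("", "", string.punctuation)).strip()
    tokens = cleaned.split()
    return [token for token in tokens if token not in STOP_WORDS]


print(simple_preprocess("for i in range(10): print('fizzbuzz')"))
print(simple_preprocess("for i in range(10): print('UConn')"))
```

Because punctuation is removed before tokenizing, each `print(...)` call collapses into a single token, so changing only the printed string changes half of the remaining tokens.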
tf_idf.cosineSimilarity(compare)
| | DOCUMENT | STEMMING | COSIM |
|---|---|---|---|
| 0 | for i in range(10): print('fizzbuzz') | [range10, printfizzbuzz] | 1.000000 |
| 1 | for i in range(10): print('UConn') | [range10, printuconn] | 0.336097 |
Document analysis
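To see where a value like 0.336097 can come from, here is a plain-Python sketch of TF-IDF weighting followed by cosine similarity. It assumes a smoothed inverse document frequency, `ln((1 + n) / (1 + df)) + 1`, which is the scikit-learn default; whether `tf_idf.cosineSimilarity` uses exactly this weighting is an assumption, although the formula does reproduce the reported number for these two token lists. The function names are illustrative.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Build TF-IDF vectors for tokenized documents using a smoothed IDF."""
    n_docs = len(docs)
    vocab = sorted({token for doc in docs for token in doc})
    # Document frequency: the number of documents that contain each token.
    doc_freq = Counter(token for doc in docs for token in set(doc))
    idf = {t: math.log((1 + n_docs) / (1 + doc_freq[t])) + 1 for t in vocab}
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append([term_freq[t] * idf[t] for t in vocab])
    return vectors


def cosine(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


ai_tokens = ["range10", "printfizzbuzz"]  # STEMMING column, row 0
me_tokens = ["range10", "printuconn"]     # STEMMING column, row 1
ai_vec, me_vec = tfidf_vectors([ai_tokens, me_tokens])
similarity = cosine(ai_vec, me_vec)
print(round(similarity, 6))
```

The only shared token is `range10`, and it gets the lowest weight because it appears in both documents, so the similarity is well below 0.5 even though the two programs differ by a single string.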
Make a list of all the improvements and changes you made to the document.
Use the tf_idf.cosineSimilarity function to compare the AI version to your own.
Write a report on your intellectual property in the ‘revised code’.
How much can you claim as yours?
How many ideas came from AI?
How many ideas came from you?
Is this a new code?
If this work were made by you and another person (not AI), would you need to credit this person as a co-coder?
What else can you discuss about this comparison and this process?
Submit your notebook
Click File
Choose Download -> ‘Download .ipynb’
Go to the Ethical use of AI in writing and coding form
Submit the HW_01.ipynb file