Artificial Intelligence

Fall 2019 -- Peter Brooks

The (ever-changing) Syllabus and readings

Extra credit report (optional) You can write a paper on an AI topic, relevant to real world applications, similar to the presentations that we've had.  Its grade will replace your lowest previous grade, if higher.  In 2 pages, cover the effect that AI has on the application, its advantages, and any socio-economic issues with it.  It cannot be something that was covered (concretely) in a past presentation.  Submit it to the homework server by Wed. night (midnight).  No paper submitted after that date/time will be graded.
Decision Trees Simple Datasets
Datasets
Simple Trees
Answering the "Waiting for a table" dataset
Skepticism I subjected y'all to a rant about skepticism today (Mon).  One of the reasons is that I used to teach a course here, at Stuy, on critical thinking, and I am on guard against dubious claims. I know that you are aware of much that is fake out there. Here, for instance, is a newsletter from an organization called the News Literacy Project that I get every week with a large set of viral hoax material, often doctored or old photos used to depict current events to make some political point.  It's unfortunate and sad that we are learning to trust less -- it's a new corrosive vulnerability to misinformation.  Fight back --  with research, with humor, and of course, with skepticism.
Information Theory Here are a set of notes that I am covering in class.  Please read them.
AI in education Behold:
Is Johnny concentrating in class?
Parent-Teacher Association -- a short story by Jessica Powell.
Real-Machine Learning! MENACE: A TicTacToe-playing machine, that learns to play TTT well, built out of matchboxes containing beads, invented by Donald Michie in 1961.  Video.
TicTacToe competitor, due
 Mon, Jan 6. midnight.
- Perfection is within your reach!  Create a tictactoe competitor that's the best in the world, in the tradition of AlphaGo (hey, why not?). 
- Below is a link to the User-interface in NetLogo that you can use to play it.  It requires version 6.1.1 (or 6.1.0, though this was not not tested) of NetLogo, because that version has a built-in Python interface.
- Here's a pair of programs in TTT-Intro.zip .  TTT-UI.nlogo is your playing interface, which requires NetLogo version 6.1.1 or later , while TTT-Random-NetLogo.py is a trivial competitor showing what the input arguments and output file should look like.
- Here's a skeleton of how to create your TTT competitor program.  Read it all the way through before setting finger to keyboard.
TicTacToe calculations, due, Thurs, 12/19 midnight Calculate the 6 requested numbers in TicTacToe games and put them into the Comments-to-Teacher.
Perceptrons
  • We covered the use of a this perceptron to calculate the effect of an AND gate on a pair of digital (logical) inputs. 
  • We used this diagram in our explanation for to show that a line can separate the (1,1) input from all the other inputs, and therefore the equation of that line can supply the weights and bias to implement the AND function. 
  • For the AND function, the weights were: w(1)11 = w(1)12 = 1 and b(1) = -1.5. 
  • We also found that by moving the dividing line down, we could separate the input (0,0) at the origin from the other 3 inputs, to accomplish an OR function, with the weights being w(1)11 = w(1)12 = 1 and b(1) = -0.5
Traveling Salesman Problem, due Tue, 12/17 midnight
This is a little project in algorithm design and testing.  I'm asking you to take a crack at finding a short path (not necessarily the shortest) through several sets of points in the plane.  Of course, if you find a really good, non-exponential algorithm for find the shortest path for the Traveling Salesman Problem, you'll get a million dollars (please see me before publishing).

The task's details are here.  Upload your code to the homework server when done.

Added: Since many of you used algorithms (mostly genetic) and found the actual shortest path, I've decided that the task was too easy.  So I've added two more datasets, A30 and A50, for which the best path is unknown (by me).  I'd like y'all to try your algorithms on these as well, and publish not the ratio but the length of the best path that you've found.  New versions:  points.csv  and ShowPath.nlogoThe best path that ShowPath now shows is not very good (the result of a Hub and Spoke algorithm.)
AI / Digital news One of the very best digests of AI and digital news is the free email newsletter The Download  from MIT's Technology Review.  I highly recommend it.
Algorithmic Fairness Play the Courtroom Algorithm Game by the end of this weekend.  Be prepared to talk about it on Wednesday.  A list of risk factors composing a risk assessment score.
An alternative view by Angèle Christin of Stanford: The Mistrials of Algorithmic Sentencing
Neural Networks: 3Blue1Brown videos

Calculations due Sat. night, 11/23

There are 4 excellent(!) Youtube videos on how neural networks work from 3Blue1Brown.  Watch at least the first one by the time we return, and more if possible.  Keep the following questions in mind:

  • How did he decide on the number of nodes in his network architecture: namely how many hidden layers and the number of nodes in each hidden layer?
  • How do you calculate the number of weights given the details (above) of the architecture?

  • Here are some terms I'll use below:
    • an Image-Cal is the process of showing the network a single image
    • a Cost-Calc is the process of showing the network every image from the entire set of training images: this process produces a single instance of the cost function's value(s)
    • Training-Calc is process of repeating CC many, many times to train the network and adjust its weights to an optimal value
  • Do the following calculation:
    • (Calc 1): if his network is presented with a single image, how many operations (each multiplication is an operation, as is an addition) does it take to calculate the single result (the list of 10 probabilities) from that image (the Image-Calc)?  Ignore the calculation of the sigmoid function, and we're only looking for approximate values (+- 10%).
    • (Calc 2): Having done that, now calculate that for a run 30,000 different images (the Cost-Calc). When finished, you've now calculated one data point measuring how well the network classifies (initially, it classifies very poorly). 
    • (Calc 3): Assume that it takes 100,000 data points (trials) to train the network.  If each operation takes a 10 nanoseconds (a single core processor), how long does does training take assuming you had only a single-core processor (the Training-Calc)?  Ignore the work of backpropagation, if you know about that.  Use the most reasonable humanly understandable time unit (secs or hours or days or fortnights or millenea...)
  • Put the answers into the Comments-to-Teacher

How costly is it to do such calculations?  Read this article from MIT's Technology Review.

Sudoku Stats Please add your Sudoku stats for A8 and A6 to this spreadsheet.
Monkeys and Typewriters The real experiment.  With the published result.  And, of course, there's a Wikipedia article.
Harder Sudoku puzzles Here's a set gound by Greg Zborovsky.  This is an example of a paticularly difficult one: ........8..3...4...9..2..6.....79.......612...6.5.2.7...8...5...1.....2.4.5.....3


Here's another set found by Ethan Morgan
Sudoku smarter
Now due Thurs, 11/14, midnight
Run your smarter version on the 4 Sudoku boards A6, A7, A8, and A9. 
Submit the program file and the following 6 stats (stats to the Comments-to-Teacher) for each of the 4 boards: 
Naive time, Naive backtracks, Smarter time, Smarter backtracks,  Ratio of Naive/Smarter time, Ratio of Naive/Smarter backtracks. 
Each of the ratios should be larger than 1, and if your code is much smarter, than the ratio should be much larger than 1.
Search techniques Read and follow the explanation of the 3 search techniques.
Read Chap. 2 by Fri. 11/1 classtime. Pay attention to the Turing test opinions and the various forecasts of when high-level intelligence will be achieved by machines.
Finally: the Naive Sudoku solver, due Sun, 11/3 midnight Once more, here are the instructions...

Here are sample times from you guys:
Sudoku warmup, due Thurs, 10/24, midnight Sudoku swapped elements.  Intstructions are here.  When finished, submit program file to homework server, and use the Program Tester to validate.
Sudoku! Here are 3 Sudoku boards, with the solution to the first one.  This is the form in which you'll receive the boards.
Here is some helpful Sudoku info and code.
Here are several versions of the easiest board, for testing.
And here are some pointers on the algorithm that you should (might) use.
Here are all the boards I've worked with, in roughly increasing order of difficulty.
The reality of simulations Computing Machines and Intelligence by Alan Turing (read parts 1, 2 and 6)
A Coffeehouse Conversation on the Turing Test, by Douglas Hofstadter  (you needn't read the Post Scriptum section at the end)
The Seventh Sally by Stanislaw Lem  (much shorter sci-fi short story)
Q and A Forums Folks,  Here are the links to the respective QAFs, one for each period.  You need to publish your Daily Shorts summary paragraph shortly before or after your presentation.
Period 3 QAF
Period 4 QAF
Priority Queue tester, due Tues, 10/8 midnight Here are the instructions for creating your tester.  Note: the tester input has changed.  If you had received a tester failure from the tester for the
Programming test on Fri, 10/11 - Here's the Python cribsheet that you may have seen during Intro-CS-spring.  I will give you a copy during the test.
- Know how to read and digest CSV files
- Know how to build data structures using and connecting nodes
- Know how to visualize priority queues (like I've done on the board)
Creating Priority queues (finally), due, Sun midnight, 10/6 Here's the design (the simpler version: based in a list).  Make it so in a Jupyter notebook -- submit the notebook to the homework server (make sure to submit it to the correct homework slot).  Be sure to include test code (better than the set of tests included in the instructions) in a cell below the Pqueue.
Daily AI Shorts, starts on Mon. Oct 7 How can we possibly keep up with the AI news ("Robot Eats a Dog after Botched Software Upgrade!!")?

Twice a week (typically Mondays and Thursdays), we'll be hearing a short presentation on a newsworthy AI topic.  Each of you will do this once.  Sign up for your preferred position in the lineup using the appropriate link below.  Everyone must sign up by Thurs night, after which I will sign up those who haven't.

Your presentation should be about 5 minutes, and can include visuals.  In addition to presenting the topic itself, tell us why it's important (even if just to you).  Within a day of the presentation (or before it), Write up a paragraph about the topic and submit it to the class QAF.

You may obtain information from any reputable source (not 4chan).  One very good one is he O'Reilly AI Newsletter (you have to sign up) or Andrew Ng's The Batch.
Signup lists:
Link for Period 3
Link for Period 4:
Homework: testing your CSV code with the ProgramTester, due, Wed., 10/2, 8:00am
  1. Extract your CSV code from the Jupyter notebook
  2. Use a text editor (not a word-processor) to remove any extraneous code, like testing code, etc. leaving only the DoIt() function.  If your function expects a single list parameter, with the 2nd element being the input-filename and the 3rd being the output-filename, then modify (if you haven't already) the top part of the DoIt() definition code to look like the code below (using your preferred parameter name)...

    import sys
    def DoIt(alist = None):
      if not alist:
         alist = sys.argv
  3. The last line of  your program should be (not indented):
    DoIt()
  4. Suppose your program filename is sumit.py and your test input file is fred.csv .  Test your program from your system's command-line by executing:
    python3 sumit.py fred.csv answer.csv
    ... and check answer.csv for correctness.  Fix if necessary.
  5. Once working, upload to the new CSV Tester homework slot.  Then go to View Homework and press the Test Program button, and wait breathlessly for the results.  If "Passed" then the rest of your life will be easy.  If not, fix and resubmit and test.  If frustrated and at wit's end, email me.
Homework: reading and writing CSV files, due Thurs night.

In a Jupyter notebook, create a function called DoIt(alist=None), that either takes a list as a parameter, or if no list is available, uses sys.argv as the list for incoming arguments (os.argv will be used when we extract this code from the notebook and run it on the command-line).  The incoming arguments are: 

  1. the name of the program (any name will do, this is ignored)
  2. the name of an input CSV file
  3. the name of an output CSV file

So, when you supply a list to DoIt(), it must contain 3 elements.

Create an input file for testing.  The file should have one line, consisting of one or more positive integers (only digits) and some optional non-integers, separated by commas (any number of them).  The function will read this file, compute the number of numbers and their sum and output those two results onto a single line CSV file.

So, if the input file contained:
1,2,3
the output file should contain:
3,6

Test this notebook and upload it to the homework server.

Reading files Reading files is simple in Python -- open with f=open(filename,'r') -- provided you can access the directory (requires authentication in Colab), and you can navigate to it.

Here's a notebook describing the file-reading procedures in Colab: File_Reading_2.ipynb. Here's a file called afile.csv.  Download the notebook and the file, and upload them into Colab.  Create a notebook that reads the file and print it out.
AI and unemployment essay, due Sun, 9/22, midnight

Also due on paper, in class, on Mon.

Create and submit an outline for the video and essay below. The outlines should be short restatements of the arguments presented in the video and in the essay.  In the essay, which of the two points of view, argued for by the video and article below, convinces you more?  Among your list of societal/political worries about the future, does this make the top 5? top 10? Submit to the homework server in .doc, .docx, pdf, or google-doc form.

1. "Humans Need Not Apply" video.
2. The Economist (magazine) has an excellent, though long, introduction to technological unemployment issues.  I'd like you to read the section on The Impact on Jobs (pp. 5 - 8). The rest of this long introduction is also very good.  I urge, though don't require, you to read the other sections.

Be ready to contrast that with the video.
Priority queues, due (shortly) Here is a YouTube video describing priority queues with the algorithms to insert and delete elements.

Create a priority queue, this time putting the minimum at the root.  Docs.
Hackathon at the Spence School 8:00am to 5:30 pm
Sat. Oct. 19
Create classes in a Jupyter notebook to implement a binary tree for sorting.  Due Mon, 9/16 midnight 

As a sample, here is the Notebook version of a singly-linked ordered list, as well as the HTML version.

Create a Node and BinTree class that implements an Ordered Binary Tree.  Implement it in a Jupyter notebook and submit to the homework server.

Class Node:
   def __init__(self,value):
      self.val = value
      self.parent = None
      self.smaller = None
      self.larger = None

Your nodes must have a value instance variable that is comparable with the "<", "<=" and "==", ">=" and ">" operators.  However, your BinTree will not contain duplicates, and they should be rejected if you attempt to insert them.

Your BinTree class is intended to hold nodes in positive order so that they can be used to produce an ordered list of node.values.
The BinTree must support the following methods:
__init__(A), where A is a list that might be empty (A is a unordered list of values that are comparable -- there may be duplicates).  This constructor will insert all of these values.
insert(v) (inserts a value v into the tree)
has(v) (returns as Boolean indicating whether v is in the tree)
get_ordered_list()  (produces an ordered list of the values in the nodes)

In another cell, create a test function that will test your binary tree.
Python Classes Yes, you can program classes and instances in Python, except it's a lot less bossy than Java.  Here's my quick intro to Classes.ipynb (download).
First homework, due Wed. night, midnight Here's an updated notebook talking about an old math problem, and asking you how to solve it in a variety of ways numerically.  Fill in the notebook with your answers and submit it to the homework server.
Python review notebooks Download these 2 notebooks, and load them either into your own laptop version of Jupyter or upload them to your Google drive, and open them there with Colab.  These are reviews and review exercises. Do them.  You'll need the ideas therein. 

PythonReview-1.ipynb
PythonReview-2.ipynb

You might was to sneak a look at how these notebooks are written internally (JSON format).  Of course, I don't think they're ever written directly by humans.
Python!  Jupyter Notebooks! We'll be using Python in this class.  Feel free to forget the arcane details of Java, but retain at least some knowledge of classes.

Learn about Jupyter Notebooks using Google's free Colab environment here.  And/or you can use Jupyter notebooks by downloading and installing the huge Anaconda distribution from here.

If you've never been introduced to Python or are pretty rusty, take a look at Python.org's suggestions for learning.  If you're willing to read, I recommend Head Frst Python.  Also CodingBat for quickie exercises.
First task: fill out your Profile on the Homework server.
  1. Go to the main class link page (http://bert.stuy.edu/pbrooks)
  2. Click on the Homework/Grade server link
  3. Choose the Profile menu item
  4. Choose your period and name
  5. Log in with your OSIS number as your password
  6. Fill out your email address and preferred first name only, DO NOT CHANGE YOUR PASSWORD YET.
  7. Save: (Change Profile button)...
  8. (optional, but recommended): now you can change your password for this Homework server, if you want.
  9. REMEMBER this password!
Help from Mr. Brooks Feel free to come and see me during periods 5, 6 or 7 in room 301 (just walk in) or by appointment beforehand just after school also in room 301.
Sending email to Mr. Brooks:
Send mail to: pbrooks@stuy.edu

You MUST include your name in the subject line or body of the message, otherwise I won't know who it's from.
Stuyvesant bell schedule  
Homework/grade server  
Support Wikipedia