CS230 Assignment 5

Assignment 5

 

LOGISTICS

  • Task 1 is individual.
  • Task 2 can be either individual, or paired: your choice. If you work on pairs, please make sure to document so at the top-of-the-file comments, and follow the appropriate submission guidelines.
  • Cover page
  • How to turn in your assignment. Submit to gradescope the .java and .txt files only. Upload to your cs directory a .zip of the whole directory.
  • Include testing transcripts of both parts - one text file per task. Please save your testing transcripts as .txt files to make them easier for us to read.
 

LEARNING GOALS

Doing this assignment you will be able to:

  • Get a better understanding of using interfaces
  • Practice input management that reads from the keyboard, a local file, or the Web
  • Understand a particular password-based encryption method (Vigenere)
  • Use a particular stack implementation (ArrayStack)
 

Task 1: Better Than Caesar's

Create a class called Vigenere that implements the Encryptable interface we discussed in class. In addition to any methods related to the interface implementation, Vigenere should contain a main method that allows a user to encrypt and decrypt messages using a password.

This is an encryption algorithm that Mary Queen of the Scots probably used when she was trying to overthrow her cousin Elizabeth I. The Vigenere encryption that you will implement here is better than Caesar's but still not great, because it was already broken 700 hundred years earlier by the Arab mathematician Al-Kindi. Unfortunately for Mary, Elizabeth's mathematicians were reading Arabic.

You can start by downloading the Caesar encryption code we studied in class and modifying it so that it is using a password, not just a shift as Caesar did. For example, if the password is "CAT", the shifts are not all the same, as in the textbook example, but are:

  • +2 for the first character (since C is in position 2 of the alphabet),
  • +0 for the second character (since A is in position 0 of the alphabet),
  • +19 for the third character (since T is in position 19 of the alphabet),

and then again:

  • +2 for the fourth,
  • +0 for the fifth,
  • +19 for the sixth, etc.

The letters of the alphabet are used modularly (in a circular fashion): after Z we find A. Your Vigenere class may assume that there are no non-alphabetic characters in the original message, aside from spaces, and it may fail, crash, or behave arbitrarily if provided with non-alphabetic messages.

For example, if you want to encrypt the phrase "Attack at Dawn" with password "CAT" you produce the String "CTMCCDCTWCWG". Run this encryption by hand to ensure you understand the algorithm. When you decrypt this String with the same password, you should get "ATTACKATDAWN".

When you come for help with this task, either to the TAs or the instructors, you should have understood the algorithm well enough to explain it conceptually. If you are not at that point yet, come with your notebooks and the examples you have tried in your effort to understand it. Working on only the provided example won't be sufficient.

Note: As a convention of classic cryptography, encryption capitalizes all letters, and removes spaces, before applying the encryption algorithm, to try to thwart those who want to break the cipher. You should do the same in this task.

Have fun encrypting!

Testing and driving your encryption code

As usual, add a main() method in the Vigenere class, to perform some testing. Be sure to test a variety of cases including edge cases of encryption and decryption. Think of inputs you expect to behave similarly and which you expect may fail similarly in a buggy program. As a single example: have you tested your code with a key that is longer than the message to be encrypted?

Your testing transcript viginereTest.txt should be a single file containing the output of the main() method in the Vigenere class. In addition to your own testing cases, we ask you to run your program with the follow inputs, and include the results of encrypting and decrypting in your testing transcript. Please make it clear and easy for the reader of your transcript to see how your program behaves in the provided set of testing data!

(Attack at Dawn, cat)

(cat, Attack)

(attack, at tack)

(Attack at dawn, a)

(aaa, cat)

 

Task 2: Take a (short) walk in Cyberspace

An additional, earlier deliverable is required for this task: See task 2A below

For this task, you will write a program that fetches pages from the Web, counts the lines of their html code and reports to the user information on those pages. We will name this program CyberspaceApp, and it will use two classes, which you will also define:

  • Webpage, which represents a single webpage, and
  • Cyberspace, which contains and maintains a collection of Webpage objects.

The input of the program will be a set of URLs (an example of a URL: http://cs.wellesley.edu/~cs230/index.html)

The output of the program should be the results of visiting these URLs. Present them in a LIFO order: The last to be visited should be presented first. For each URL visited, present the URL string (like "http://cs.wellesley.edu/~cs230/index.html"), the number of lines in its html code, and the first 30 characters of this HTML code.

Task 2A: Your plan in a diagram

This is the earlier deliverable

Three days before the assignment is due you must submit a diagram which shows your plan for this task. In this diagram note the classes involved in your solution, the instance variables and methods each class contains, and how the classes relate to each other. Think about this task, and prepare the diagram early; it will help you with your thinking and structuring your solution of the task. Feel free to use drawings, words and explanatory short sentences in your diagram. As long as you are confident that your diagram conveys your thinking and plan for this task, then it is a good one. Produce a PDF or JPG file of your diagram, and submit it by uploading it to your cs account, in the ps05 folder.

Task 2B: The CyberspaceApp class

  1. This class will read URLs as Strings, from the keyboard or from a file. The program can be invoked with no command-line arguments, or with one command-line argument. In the first case, the user will type the URLs in the keyboard (standard input), one per line, as input to the program. However, when a command-line argument is provided, it is considered to be the name of a file, which contains the URLs to be read into the program, one per line.

  2. Each input URL corresponds to a web page. Information about each web page is recorded in a newly created Webpage object. (In designing this program, decide whether it is best to perform the actual downloading of information from the web in CyberspaceApp or in Webpage.)
  3. The program uses a Cyberspace object to store all the Webpage objects created.
  4. The program then prints out all Webpages, including for each one, the URL, the number of lines in its content, and the first 30 characters of that contents. Pages should be printed in LIFO order, one per line.
  5. Finally, the program prints out the web page that had the largest number of lines.

Task 2C: The Webpage class

  • Webpage represents a single page on the web.
  • Look at the information used elsewhere in the program (in CyberspaceApp and Cyberspace) to decide what state and behaviors a Webpage should have to fulfill its purpose in this program.
  • Webpage should implement Comparable<Webpage>, since you will need to compare such objects.
  • As usual, add a main() method for testing purposes.

Task 2D: The Cyberspace class

  • Cyberspace is used to contain and maintain a collection of Webpage objects:
  • it provides functionality related to storing, retrieving, and/or printing webpages in a LIFO manner, one per line.
  • Its main() method will contain testing code.

Notes and Suggestions

There is no code skeleton for this assignment; you will need to start from scratch. The code from lectures (e.g., from the end of the Exceptions/IO lecture) or developed in the lab can be used as simple starting points.

The Tasks 2B, 2C and 2D above can be worked/developed in different orders. That is, it is up to you which one to work on before an other.

A few possibly useful comments:

  • Study before coding: Before you start this task you should have studied and understood the Exceptions and IO handout, and the sections we covered in class of chapter 10 of the LDC textbook. Without doing so, this task will be very hard! In the past, students who started this task without the appropriate preparation, reported spent times more than 6 hours.
  • When inputting URLs from the keyboard, use a certain token (like "q", or "quit") to indicate the end of input.
  • Your main() method of CyberspaceApp should be concise and consist mostly of calls to helper methods.
  • IO methods might throw exceptions, and IOException is a checked exception, therefore you need to use try/catch or throws in your code for it to compile. It is for you to decide how and where in your program to handle exceptions that might be produced. If you are not sure whether to use try/catch or throws, there is no single right answer! We are happy to talk about it to help you learn and explore the design of such programs, but you will want to bring some thoughts about the design of your program to that conversation.

Two sample runs are shown below. In the first one, the input is coming from the keyboard. In the second one from a file.

  • Note that the output below may look different than yours, since you will be running your program via BlueJ and I am running it via the command line, and because we might make somewhat different choices about the exact formatting of messages, the toString() methods, etc.
  • You may want to start by using the starter code that contains the relevant javafoundations package.

In the output below:

  • The command java CyberspaceApp runs the program with no command-line arguments.
  • The cat command shows the contents of file myURLs.txt.
  • The command java CyberspaceApp myURLs.txt runs the program using the file myURLs.txt as input.
  • The contents of my sample pages below may have changed since I ran my program; do not assume something must be wrong if your output for a specific URL is different than mine. (Hint: Ensure your URLs actually exist by first visiting them in a browser.)

$ java CyberspaceApp
Please enter URLs (without spaces) below; Type q to quit:
http://www.google.com
http://www.wellesley.edu/cs
http://www.bing.com
https://www.netflix.com
q
Results from visiting 4 pages (LIFO order):
https://www.netflix.com : 1 : <!doctype html><html...
http://www.bing.com : 23 : <!DOCTYPE html PUBLI...
http://www.wellesley.edu/cs : 1081 : <!DOCTYPE html PUBLI...
http://www.google.com : 6 : <!doctype html><html...

The largest Webpage was: http://www.wellesley.edu/cs : 1081 : <!DOCTYPE html PUBLI...

$ cat myURLs.txt
http://www.bing.com
http://www.google.com

$ java CyberspaceApp myURLs.txt
Results from visiting 2 pages:
http://www.google.com : 6 : <!doctype html><html...
http://www.bing.com : 23 : <!DOCTYPE html PUBLI...

The largest Webpage was: http://www.bing.com : 23 : <!DOCTYPE html PUBLI...

Enjoy!