
Assignment Description
CS 112 – Project 5 Dictionaries and File IO
Due Date: Sunday, November 13th, 11:59pm
• The purpose of this assignment is to explore dictionaries and file IO. We will be reading in some data about works of art, creating a structure that groups them by artist, and then checking for works by various metrics.
• You will turn in a single python file following our naming convention (example: gmason76_2XX_PX.py)
• Similar to previous projects, include your name, G#, Lecture/Lab sections, and any extra comments we ought to know, as a comment at the top of your file.
• If you have questions, use Piazza (and professor/TA office hours) to obtain assistance.
• Remember, do not publicly post code for assignments on the forum! Ask a general question in public, or ask a private question (addressed to all "instructors") when you're asking about your particular code. Also please have a specific question; instead of "my code doesn't work, please help", we need to see something like "I'm having trouble when I add that particular line, what am I misunderstanding?". If you are unsure whether a question may be public or not, just mark it as private to be sure. We can change a post to public afterwards if it could have been public.
Background:
Dictionaries give us an enriched way to store values by much more than just sequential indexes (as lists gave us); we identify key-value pairs, and treat keys like indexes of various other types. The only restriction on keys is that they are "hashable", which we can approximate by thinking of things that are "immutable all the way down". Though unordered, dictionaries help us simplify many tasks by keeping those key-value associations. Each key can only be paired with one value at a time in a dictionary.
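As a small illustration of the hashability rule, the sketch below uses a made-up dictionary named sizes (not part of the assignment). A tuple of strings is accepted as a key, while a list is rejected with a TypeError. The try/except is used only to demonstrate the error and is not on the project's allowed list.

```python
# Hypothetical example data; none of these names come from the assignment.
sizes = {}

# A tuple of immutable values is hashable, so it can serve as a key.
sizes[("Pablo Picasso", "Guernica")] = (349.0, 776.0)

# Assigning to an existing key replaces the old value: one value per key.
sizes[("Pablo Picasso", "Guernica")] = (350.0, 777.0)

# A list is mutable, hence unhashable; using it as a key raises TypeError.
try:
    sizes[["Pablo Picasso", "Guernica"]] = (349.0, 776.0)
    list_key_accepted = True
except TypeError:
    list_key_accepted = False
```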
When a file contains ASCII text in it, we can readily write programs to open the file and compute with its contents. It turns out that reading and writing text files gives our programs far more longevity than open-to-quit; we can store data and results for later, save user preferences, and all sorts of things. We will be reading text files that happen to be in the CSV format.
What's allowed?
Here is the exhaustive list of things you can use on the project. You can ask if we've omitted something, but the answer is probably no.
• all basic expressions/operators, indexing/slicing.
• all basic statements: assignment, selection, and loop statements, break/continue, return
• functions: len(), range(), int(), float(), str(), set(), dict(), bool(), tuple()
• file reading: open(), .close(), .read(), .readline(), .readlines(), with syntax
• dictionaries: all methods listed in our slides on that chart.
• methods: lists: .insert(), .append(), .extend(), .pop(), .remove(); strings: .strip(), .split(), .join(), .insert(), .lower()
• sorted(), .sort(), reversed(), .reverse()
This means that…
• you can't call anything not listed above. Focus on applying these functions to solve the task.
• you can't import any modules for this project. (can't import csv either – it isn't that helpful anyways.)
Procedure
Complete the function definitions as described; you can test your code with this testing file:
• https://cs.gmu.edu/~marks/112/projects/tester5p.py
• sample csv files: http://cs.gmu.edu/~marks/112/projects/p5_sample_csv_files.zip
• look for example file contents & databases, starting around line 70 or so in the tester.
• Invoke it as with prior assignments: python3 tester5p.py yourcode.py
• You can also test individual functions: python3 tester5p.py yourcode.py num_rows
• You can also run your code in interactive mode: python3 -i yourcode.py
• Note that there are 64 test cases worth 1.25 points each, and 5 extra credit tests worth 1 point each.
Scenario
Works of art have various attributes: the artist who created the work; the date of creation; the dimensions of the piece; etc. We've got some data stored in a comma-separated-values file; the only function you're writing that needs to interact with a file is read_file, and all others will use our required structure to describe the works of art.
CSV file: This is a file containing ASCII text where each line in the file represents one record of information, and each piece of info in the record is separated by a single comma. The very first line is the "header" row, which names the columns but is not part of the data. Note: the file extension has no effect on the contents; you can edit these files with your code editor, and you can give them any extension you want without affecting your program. It's best not to let MS Excel try to help out, as it often uses varying notions of what a CSV file should be (it turns out there's no single standard definition, so we're using our own). Here's a very small sample file that can be used in our project, as your text editor may show it.
"Artist","Title","Year","Total Height","Total Width","Media","Country"
"Pablo Picasso","Guernica","1937","349.0","776.0","oil paint","Spain"
"Vincent van Gogh","Cafe Terrace at Night","1888","81.0","65.5","oil paint","Netherlands"
"Leonardo da Vinci","Mona Lisa","1503","76.8","53.0","oil paint","France"
"Vincent van Gogh","Self-Portrait with Bandaged Ear","1889","51.0","45.0","oil paint","USA"
"Leonardo da Vinci","Portrait of Isabella d'Este","1499","63.0","46.0","chalk","France"
"Leonardo da Vinci","The Last Supper","1495","460.0","880.0","tempera","Italy"
Work: we will always use the following representation for a work of art inside our programs: a tuple containing these values in this order. Note that title, media, and country are strings, and year is an integer, and height and width are floats. (Also, though a work is by a specific artist, we don't see it here – that's because it will be represented elsewhere in a database, and duplicated information is rarely a good idea in a database).
# sample_work: ( title, year, height, width, media, country)
# sample_work = ('Guernica', 1937, 349.0, 776.0, 'oil paint', 'Spain')
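One possible way to turn a single data row of the file into this representation is sketched below. The helper name parse_line is hypothetical, and the sketch assumes every field is wrapped in double quotes exactly as in the sample file, with no quote characters inside a field. Splitting on the three-character separator "," rather than on a bare comma keeps a field such as "A, Jr." in one piece.

```python
def parse_line(line):
    """Turn one data row of the CSV into an (artist, work) pair."""
    # Drop the newline, then the opening and closing double quotes.
    inner = line.strip()[1:-1]
    # Split on quote-comma-quote so commas inside a field stay intact.
    fields = inner.split('","')
    artist = fields[0]
    # Convert year to int and the two dimensions to float; the rest stay strings.
    work = (fields[1], int(fields[2]), float(fields[3]),
            float(fields[4]), fields[5], fields[6])
    return artist, work
```

Calling parse_line on the Guernica row of the sample file gives the artist 'Pablo Picasso' together with the sample_work tuple shown above.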
Database: a "database" of works can store multiple works from multiple artists. Our database must be a dictionary whose keys are artist names, and whose values are lists of work values (as defined above). Only artists with stored works may be present. Works by the same artist must be stored asciibetically by title (same ordering that < observes when comparing strings).
sample_db = {
    "Pablo Picasso": [("Guernica", 1937, 349.0, 776.0, "oil paint", "Spain")],
    "Leonardo da Vinci": [("Mona Lisa", 1503, 76.8, 53.0, "oil paint", "France"),
                          ("Portrait of Isabella d'Este", 1499, 63.0, 46.0, "chalk", "France"),
                          ("The Last Supper", 1495, 460.0, 880.0, "tempera", "Italy")],
    "Vincent van Gogh": [("Cafe Terrace at Night", 1888, 81.0, 65.5, "oil paint", "Netherlands"),
                         ("Self-Portrait with Bandaged Ear", 1889, 51.0, 45.0, "oil paint", "USA")]
}
Functions
Implement the following functions. Look for examples in the Examples section below. Only add_work modifies the given database; other functions create new databases, but don't modify the argument-provided one.
read_file(filename): This is the *only* function that needs to deal with reading a file. It accepts the file name as a string and assumes it is a CSV file as described above. It opens the file, reads all the described works, and correctly creates the database. It returns that database.
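A minimal sketch of read_file follows, under some assumptions that the description does not spell out: every field is double-quoted as in the sample file, blank lines may be skipped, and a row that repeats an earlier one exactly is stored only once (mirroring the no-duplicates rule of add_work). Sorting the whole tuples orders works by title first, because tuples compare element by element.

```python
def read_file(filename):
    """Build the artist -> sorted list of works dictionary from a CSV file."""
    db = {}
    with open(filename) as f:
        lines = f.readlines()
    # lines[0] is the header row, which names the columns but is not data.
    for line in lines[1:]:
        line = line.strip()
        if line == "":
            continue  # assumption: tolerate blank lines
        # Remove the outer quotes, then split on quote-comma-quote.
        fields = line[1:-1].split('","')
        artist = fields[0]
        work = (fields[1], int(fields[2]), float(fields[3]),
                float(fields[4]), fields[5], fields[6])
        if artist not in db:
            db[artist] = []
        if work not in db[artist]:  # assumption: skip exact duplicates
            db[artist].append(work)
    # Title is the first tuple element, so this sorts each list by title.
    for artist in db:
        db[artist].sort()
    return db
```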
add_work(db,artist,title,year,height,width,media,country): This function accepts an existing database, details for a work, and then it successfully updates the database to include that work. Remember to sort works asciibetically by title! Observe the types of each attribute, as specified in the work definition. None is returned. If a work is already in the database (all attributes match), we mustn't add a duplicate.
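One way to keep the list ordered is to walk to the right position and use .insert(), as in the sketch below. It assumes the arguments already arrive with the types given in the work definition, so no conversion is done.

```python
def add_work(db, artist, title, year, height, width, media, country):
    """Insert a work into db in place, keeping each artist's list sorted."""
    work = (title, year, height, width, media, country)
    if artist not in db:
        db[artist] = []
    works = db[artist]
    if work in works:
        return  # exact duplicate: leave db unchanged
    # Find the first position whose title sorts after the new title.
    i = 0
    while i < len(works) and works[i][0] <= title:
        i += 1
    works.insert(i, work)
    # Falling off the end returns None, as the description requires.
```

Appending and then calling .sort() on the list would also work; the explicit walk just makes the ordering rule visible.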
merge_databases (db1,db2): This function accepts two existing databases, merges them together into a new database with all works in it (correctly associated with their artists), and returns this result. The original databases must not be modified.
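A sketch of the merge, assuming that a work present in both databases should appear only once in the result (consistent with the merge example in this handout, where The Last Supper is in both inputs). New lists are built for every artist, so neither argument is modified; the work tuples themselves are immutable and can be shared safely.

```python
def merge_databases(db1, db2):
    """Return a new database holding every work from db1 and db2."""
    merged = {}
    for db in (db1, db2):
        for artist in db:
            if artist not in merged:
                merged[artist] = []  # a fresh list, never one from db1 or db2
            for work in db[artist]:
                if work not in merged[artist]:
                    merged[artist].append(work)
    # Title is the first tuple element, so this sorts each list by title.
    for artist in merged:
        merged[artist].sort()
    return merged
```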
works_by_artists (db,artists): accepts a database and list of artist names. It searches for all works whose artist matches one of the values from the artists argument, and builds/returns another database that only contains those matching works.
works_by_years(db,start_year, end_year): accepts a database and two years. It builds/returns a new database with all works from db whose year is between the start and end years, inclusive. Assumes that both years are non-negative integers. If end_year<start_year, the returned dictionary would be empty.
works_by_media(db,media): accepts a database and a string describing one kind of media. It builds/returns a new database with all works from db whose media matches the given media argument.
works_by_country(db,country): accepts a database and a country name. It builds/returns a new database with all works from db whose country matches the given country argument.
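The four works_by_* functions share one shape: walk the database, keep the works that pass a test, and add an artist to the result only when at least one work was kept. The sketch below shows works_by_years as a representative; it is one possible approach, not the required one.

```python
def works_by_years(db, start_year, end_year):
    """Return a new database of works dated start_year..end_year inclusive."""
    result = {}
    for artist in db:
        kept = []
        for work in db[artist]:
            year = work[1]
            if start_year <= year <= end_year:
                kept.append(work)
        # Only artists with at least one matching work may appear as keys.
        if len(kept) > 0:
            result[artist] = kept
    return result
```

Because the works are visited in their stored order, each kept list stays sorted by title. The same loop would compare work[4] for works_by_media, work[5] for works_by_country, or test whether the artist is in the given list for works_by_artists.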
largest_work(db): accepts a database, finds the largest-area work in it. There may be ties, so it returns a list of tuples: [(artist,title), (artist,title),…]. The sorted() function can be quite helpful here.
earliest_work(db): accepts a database, finds the earliest work in it. There may be ties, so it returns a list of tuples: [(artist,title), (artist,title),…]. The sorted() function can be quite helpful here.
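A sketch of largest_work that tracks the best area seen so far and collects ties. Two points are assumptions rather than stated requirements: the result is returned in sorted order (inferred from the tie examples in this handout), and an empty database yields an empty list.

```python
def largest_work(db):
    """Return (artist, title) pairs for the work(s) with the greatest area."""
    best_area = None
    winners = []
    for artist in db:
        for work in db[artist]:
            area = work[2] * work[3]  # height * width
            if best_area is None or area > best_area:
                # A strictly larger work replaces everything collected so far.
                best_area = area
                winners = [(artist, work[0])]
            elif area == best_area:
                winners.append((artist, work[0]))
    return sorted(winners)
```

earliest_work could follow the same pattern, comparing work[1] and keeping the smaller year instead of the larger area.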
count_media_in_country (db, media, country): accepts a database db, one kind of media, and a country name. Counts how many works in that country there are of that media type, and returns this integer.
artists_with_the_most_works(db): accepts a database and finds the artists who have the most works. Returns a list of artist names who all had the most works (it could be a tie), sorted asciibetically by artist. If no artists were present (the database was empty), return None.
countries_with_the_most_works(db): accepts a database and finds the countries that have the most works in the database. Returns a list of country names that all had the most works (it could be a tie), sorted asciibetically by country. If no countries were listed (empty database), return None.
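Both functions can be written as "find the highest count, then collect everyone who reaches it". The sketch below is one possible approach; for countries it first tallies works per country in a helper dictionary named counts (a name chosen here for illustration).

```python
def artists_with_the_most_works(db):
    """Return the sorted list of artists tied for the most works, or None."""
    if len(db) == 0:
        return None
    most = 0
    for artist in db:
        if len(db[artist]) > most:
            most = len(db[artist])
    winners = []
    for artist in db:
        if len(db[artist]) == most:
            winners.append(artist)
    return sorted(winners)


def countries_with_the_most_works(db):
    """Return the sorted list of countries tied for the most works, or None."""
    counts = {}  # country name -> number of works located there
    for artist in db:
        for work in db[artist]:
            country = work[5]
            if country not in counts:
                counts[country] = 0
            counts[country] += 1
    if len(counts) == 0:
        return None
    most = 0
    for country in counts:
        if counts[country] > most:
            most = counts[country]
    winners = []
    for country in counts:
        if counts[country] == most:
            winners.append(country)
    return sorted(winners)
```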
Extra Credit
artists_by_area_of_work(db): accepts a database, calculates how much surface area each artist created, and then sorts them by ascending area. Break any ties by sorting them asciibetically.
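A sketch for the extra credit, assuming the function returns a list of artist names (as the examples in this handout suggest) and that an artist's surface area is the sum of height times width over all of that artist's works. Pairing each total with the artist name lets a single sort handle both the ascending order and the tie-break.

```python
def artists_by_area_of_work(db):
    """Return artist names ordered by total area of their works, ascending."""
    pairs = []
    for artist in db:
        total = 0.0
        for work in db[artist]:
            total += work[2] * work[3]  # height * width
        pairs.append((total, artist))
    # Tuples sort by area first, then by artist name when areas are equal.
    pairs.sort()
    names = []
    for pair in pairs:
        names.append(pair[1])
    return names
```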
Examples
Start by viewing the examples from the tester file, which start at about line 70.
>>> d1 = read_file("file0.csv")
>>> d1
{'Leonardo da Vinci': [('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> add_work(d1, "Leonardo da Vinci", "Portrait of Isabella d'Este", 1499, 63.0, 46.0, "chalk", "France")
>>> d1
{'Leonardo da Vinci': [('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
 ("Portrait of Isabella d'Este", 1499, 63.0, 46.0, 'chalk', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> add_work(d1, "Pablo Picasso", "Guernica", 1937, 349.0, 776.0, "oil paint", "Spain")
>>> d1
{'Pablo Picasso': [('Guernica', 1937, 349.0, 776.0, 'oil paint', 'Spain')],
 'Leonardo da Vinci': [('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
 ("Portrait of Isabella d'Este", 1499, 63.0, 46.0, 'chalk', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> d1 = read_file("file0.csv")
>>> d1
{'Leonardo da Vinci': [('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> d2 = read_file("file1.csv")
>>> d2
{'Pablo Picasso': [('Guernica', 1937, 349.0, 776.0, 'oil paint', 'Spain')],
 'Leonardo da Vinci': [("Portrait of Isabella d'Este", 1499, 63.0, 46.0, 'chalk', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> merge_databases(d1,d2)
{'Pablo Picasso': [('Guernica', 1937, 349.0, 776.0, 'oil paint', 'Spain')],
 'Leonardo da Vinci': [('Mona Lisa', 1503, 76.8, 53.0, 'oil paint', 'France'),
 ("Portrait of Isabella d'Este", 1499, 63.0, 46.0, 'chalk', 'France'),
 ('The Last Supper', 1495, 460.0, 880.0, 'tempera', 'Italy')]}
>>> works_by_artists(database2(),['C'])
{'C': [('Ten', 1496, 365.0, 389.0, 'tempera', 'Italy')]}
>>> works_by_artists(database2(),['C','V'])
{'C': [('Ten', 1496, 365.0, 389.0, 'tempera', 'Italy')],
 'V': [('Four', 1661, 148.0, 257.0, 'oil paint', 'Austria'),
 ('Two', 1630, 91.0, 77.0, 'oil paint', 'USA')]}
>>> works_by_years(database2(), 1464, 1496)
{'P': [('Six', 1465, 81.0, 127.1, 'tempera', 'Netherlands')],
 'C': [('Ten', 1496, 365.0, 389.0, 'tempera', 'Italy')]}
>>> works_by_media(database2(),'watercolor')
{'M': [('Three', 1430, 100.0, 102.0, 'watercolor', 'France')],
 'K': [('Five', 1922, 63.8, 48.1, 'watercolor', 'USA')]}
>>> works_by_country(database2(),'Austria')
{'V': [('Four', 1661, 148.0, 257.0, 'oil paint', 'Austria')],
 'M': [('One', 1400, 30.0, 20.5, 'oil paint', 'Austria')]}
>>> largest_work(database1())
[('Leonardo da Vinci', 'The Last Supper')]
>>> largest_work(database3())
[('A, Jr.', 'Three'), ('A, Jr.', 'Twenty')]
>>> earliest_work(database1())
[('Leonardo da Vinci', 'The Last Supper')]
>>> earliest_work(database4())
[('B', 'Self-Portrait'), ('K', 'Self-Portrait-1'), ('V', 'Self-Portrait')]
>>> count_media_in_country(database4(),'oil paint','Austria')
0
>>> count_media_in_country(database4(),'oil paint','Italy')
2
>>> artists_with_the_most_works(database1())
['Leonardo da Vinci']
>>> artists_with_the_most_works(database6())
['M', 'U']
>>> countries_with_the_most_works(database3())
['France']
>>> countries_with_the_most_works(database1())
['France', 'Italy', 'Spain']
>>> artists_by_area_of_work(database3())
['M', 'X', 'A, Jr.']
>>> artists_by_area_of_work(database4())
['B', 'K', 'M', 'V']
Grading Rubric
Code passes shared tests: 80% (64 tests @ 1.25 pts each == 80pts)
Well-documented/submitted: 10%
No globals used (just def's): 10%
---------------------------------
TOTAL: 100% +5 extra credit
Note
• You should always test your code not only with the provided testing script, but also by directly calling your functions. If you store sample message strings to variables after all the definitions, you can use them in these interactive calls. This is also how you might test your code out in the visualizer, which we highly recommend. Just be sure to remove them before turning in your work: they are globals, which are not allowed in your final submission. Consider the file below, named shouter.py, followed by a session that runs it in interactive mode (-i).
def shout(msg):
    print(msg.upper())

mystring1 = "hello"
mystring2 = "another one"

demo$ python3 -i shouter.py
>>> shout("i wrote this")
I WROTE THIS
>>> shout(mystring1)
HELLO
>>> shout(mystring2)
ANOTHER ONE
You will not earn test case points if you hard-code the answers. Your program should work for all possible inputs, not just the provided test cases. If you notice that one test case calls something(3), and you write the following to pass that particular test case, you'd be hardcoding.
def something(x):
    if x==3:  # hard-coding example
        return 8  # a memorized answer that avoids calculating the number directly
    ...
Notice how it's not actually calculating, it's merely regurgitating a memorized answer. Doing this for all used test cases might make you feel like you've completed the program, but there are really unlimited numbers of test cases; hardcoded programs only work on the inputs that were hardcoded. Nobody learns, and the program isn't really that useful. When it's a true corner case (often around zero, empty lists, etc), we might need to list a direct answer; this is not hardcoding.
Reminders on Turning It In:
No work is accepted more than 48 hours after the initial deadline, regardless of token usage. Tokens are automatically applied whenever they are available, based on your last valid submission's time stamp.
You can turn in your code as many times as you want; we only grade the last submission that is <=48 hours late. If you are getting perilously close to the deadline, it may be worth it to turn in an "almost-done" version about 30 minutes before the clock strikes midnight. If you don't solve anything substantial at the last moment, you don't need to worry about turning in code that may or may not be runnable, or worry about being late by just an infuriatingly small number of seconds – you've already got a good version turned in that you knew worked at least for part of the program.
You can (and should) check your submitted files. If you re-visit BlackBoard and navigate to your submission, you can double-check that you actually submitted a file (it's possible to skip that crucial step and turn in a no-files submission!), you can re-download that file, and then you can re-test that file to make sure you turned in the version you intended to turn in. It is your responsibility to turn in the correct file, on time, to the correct assignment.
Use a backup service. Do future you an enormous favor, and keep all of your code in an automatically synced location, such as a Dropbox or Google Drive folder. Each semester someone's computer is lost/drowned/dead, or their USB drive or hard drive fails. Don't give these situations the chance to doom your project work!