Lecture Date: Wednesday, October 12

Up to this point, your programs have gotten all of their input from the user at the keyboard. This is great, but what if you wanted to read in hundreds, or even thousands, of data points to run your program against? To say that typing them all in would get tedious is probably an understatement.

For the next few lectures, we'll learn how to read files as input instead of just the keyboard.

First, though, we'll make our own very simple data set. In your project, create a new file called names.csv. In that file, put the names of the five or six people sitting around you, one name per line.

Then try out this code:

def read_list_of_names(filename):
    names = []
    datafile = open(filename, "r")

    for line in datafile:
        line = line.strip()
        names.append(line)
    datafile.close()

    return names

print(read_list_of_names("names.csv"))

In this example, we have written a function that will read a file based upon a filename we provide. Let's make it even more generic:

filename = input("What file would you like to read? ")
print(read_list_of_names(filename))

Now, we can read any file of names that a user provides!
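One thing to watch for: if the user types the name of a file that doesn't exist, open() raises a FileNotFoundError and the program stops. Here is a minimal sketch of one way to handle that. The helper name read_names_or_empty is made up for this example, and returning an empty list is just one possible choice.

```python
def read_list_of_names(filename):
    names = []
    datafile = open(filename, "r")

    for line in datafile:
        names.append(line.strip())
    datafile.close()

    return names


# Hypothetical helper (not part of the lecture code): returns an empty
# list instead of crashing when the file can't be found.
def read_names_or_empty(filename):
    try:
        return read_list_of_names(filename)
    except FileNotFoundError:
        print("Could not find a file named", filename)
        return []


# Assumes no file with this name exists in the current folder
print(read_names_or_empty("no_such_file_here.txt"))
```

You could just as easily ask the user for another filename instead of returning an empty list.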

So, why did we write a function for this? We want to 1) be able to read any file, 2) move that code out of the main part of the program so it's easier to read, and 3) make the code much easier to test.
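To see what "easier to test" can look like, here is a rough sketch that writes a small throwaway file, reads it back with the function, and checks the result. The names in the sample file are placeholders, and using the tempfile module is an assumption of this sketch, not something we have covered.

```python
import os
import tempfile


def read_list_of_names(filename):
    names = []
    datafile = open(filename, "r")

    for line in datafile:
        names.append(line.strip())
    datafile.close()

    return names


# Write a throwaway file containing placeholder names, one per line
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as sample:
    sample.write("Ada\nGrace\nAlan\n")
    sample_path = sample.name

# Because the reading code is in a function, we can call it and check its answer
result = read_list_of_names(sample_path)
assert result == ["Ada", "Grace", "Alan"]

# Clean up the throwaway file
os.remove(sample_path)
print("read_list_of_names passed")
```

If the file reading were mixed in with the rest of the program, there would be no single piece to call and check like this.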

Where else can we get data? Let's look at some weather data! Source: Weather Underground Historical Weather

Some files:

And, if we're really brave:

More example code for today:

# input: name of file to read
# output: list of all names in the file
def read_name_list(file_name):
    names = []
    name_file = open(file_name, "r")

    for line in name_file:
        line = line.strip()
        names.append(line)
    name_file.close()

    return names

# input: name of file to read
# output: list of temperatures in the file
def read_temperature_data(file_name):
    temperatures = []
    data_file = open(file_name, "r")
    burn_line = True

    for line in data_file:
        if burn_line:
            burn_line = False
            continue
        entry = line.strip().split(",")
        temperatures.append(float(entry[1]))
    data_file.close()

    return temperatures

# input: file name of weather data to read
# output: average temp (float), high temp, low temp
def statistics(file_name):
    temperatures = read_temperature_data(file_name)

    length = len(temperatures)
    total = sum(temperatures)

    return total / length, max(temperatures), min(temperatures)


print(statistics("weather.csv"))
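Notice that statistics returns three values at once, which Python packs into a tuple. The sketch below shows one way to unpack them into separate variables. The sample file contents are made up so the example can run by itself, and the column layout (temperature in the second column, after one header line) is an assumption that matches the code above.

```python
import os
import tempfile


def read_temperature_data(file_name):
    temperatures = []
    data_file = open(file_name, "r")
    data_file.readline()  # burn the column header line

    for line in data_file:
        entry = line.strip().split(",")
        temperatures.append(float(entry[1]))
    data_file.close()

    return temperatures


def statistics(file_name):
    temperatures = read_temperature_data(file_name)
    return sum(temperatures) / len(temperatures), max(temperatures), min(temperatures)


# Made-up sample data: a header line, then day,temperature rows
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False) as sample:
    sample.write("day,temperature\n1,80\n2,70\n3,90\n")
    sample_path = sample.name

# Unpack the returned tuple into three separate variables
average, high, low = statistics(sample_path)
os.remove(sample_path)

print("Average:", average)  # Average: 80.0
print("High:", high)        # High: 90.0
print("Low:", low)          # Low: 70.0
```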

Even more code:

# Open the file in read mode
datafile = open("cville_weather_sept15.csv", "r")

# Burn the column header line
datafile.readline()

list_of_temps = []

# for each line in the file, READ IT!
for line in datafile:
    new_line = line.strip().split(",")
    list_of_temps.append(int(new_line[1]))

# Close the file now that we're done reading it
datafile.close()

print(sum(list_of_temps) / len(list_of_temps))
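Another way to write the same idea is with a with block and Python's built-in csv module. This is only a sketch: the function name average_temperature and the sample data are invented for the example, and it assumes the temperature is in the second column after one header line.

```python
import csv
import os
import tempfile


# Hypothetical function name, used only for this sketch
def average_temperature(file_name):
    temperatures = []
    # "with" closes the file for us, even if an error happens part way through
    with open(file_name, "r", newline="") as datafile:
        reader = csv.reader(datafile)  # splits each line on commas for us
        next(reader, None)             # burn the column header line
        for row in reader:
            temperatures.append(float(row[1]))
    return sum(temperatures) / len(temperatures)


# Made-up sample data, only so the sketch can run on its own
with tempfile.NamedTemporaryFile("w", suffix=".csv", delete=False, newline="") as sample:
    sample.write("day,temperature\n1,60\n2,70\n")
    sample_path = sample.name

sample_average = average_temperature(sample_path)
os.remove(sample_path)
print(sample_average)  # 65.0
```

Using float instead of int here means a value like 71.5 won't cause an error. Like the version above, this sketch will fail with a ZeroDivisionError if the file has no data rows.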

And yet, even more code:

def fav_cartoon(filename):
    datafile = open(filename, "r")

    cartoon_dict = {}

    # Burn the column header line
    datafile.readline()

    for line in datafile:
        split_line = line.strip().split(",")
        print(split_line)  # show each parsed row as we go
        # Count how many times each value in the second column appears
        if split_line[1] in cartoon_dict:
            cartoon_dict[split_line[1]] += 1
        else:
            cartoon_dict[split_line[1]] = 1
    datafile.close()
    return cartoon_dict

to_open = input("filename: ")
print(fav_cartoon(to_open))
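The if/else counting pattern in fav_cartoon comes up so often that Python's collections module has a Counter class for it. Here is a small sketch of the same tally using Counter. The function name count_column and the sample lines are made up for illustration, and it works on a list of lines rather than an open file so it can run without any data file.

```python
from collections import Counter


# Hypothetical helper: tallies how often each value appears in one column
def count_column(lines, column=1):
    counts = Counter()
    for line in lines:
        fields = line.strip().split(",")
        counts[fields[column]] += 1  # a missing key starts at 0 automatically
    return dict(counts)


# Made-up rows in the same shape as a line read from a CSV file
sample_lines = ["alice,Show A\n", "bob,Show B\n", "carol,Show A\n"]
print(count_column(sample_lines))  # {'Show A': 2, 'Show B': 1}
```

Since looping over an open file also gives you one line at a time, you could pass the file itself (after burning the header line) in place of the list.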