Date

Lecture Date: Wednesday, October 19

Let's start by finishing up our todo list program (completed code is below).

We are going to look more closely at how to parse text and find the information you want after you download or open a file. Text is often messy: it isn't laid out like a CSV file, where each data point is cleanly separated from the next. Sometimes you have to hunt through a lot of text to pull out the one nugget you want.

Let's look through the string API to see what we can find!

Python str API - https://docs.python.org/3.5/library/stdtypes.html#text-sequence-type-str

Python string API - https://docs.python.org/3.5/library/string.html

String methods to know:

  • startswith()
  • endswith()
  • strip(), rstrip(), lstrip()
  • count()
  • find(), rfind()
  • index(), rindex()
  • join()
  • replace()
  • split()
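To see what a few of these do, here is a small sketch you can paste into the interpreter. The line of text is made up for illustration (it is not from any file we use in class), and the variable names are just placeholders.

```python
# A made-up line of messy text, used only for illustration.
line = "  Name: Alice Liddell; Email: alice@wonderland.example  \n"

# strip() removes whitespace (including the newline) from both ends
cleaned = line.strip()

# startswith() / endswith() answer True or False
starts = cleaned.startswith("Name:")
ends = cleaned.endswith(".example")

# count() is case-sensitive, so "alice" in the email is not counted
alice_count = cleaned.count("Alice")

# find() returns the position of the match, or -1 if there is none
email_pos = cleaned.find("Email")
phone_pos = cleaned.find("Phone")

# index() works like find(), but raises ValueError when nothing matches
try:
    cleaned.index("Phone")
    index_failed = False
except ValueError:
    index_failed = True

# split() breaks a string into a list; join() glues a list back together
fields = cleaned.split("; ")
rejoined = " | ".join(fields)

# replace() returns a new string; the original is left unchanged
swapped = cleaned.replace("Alice", "Dinah")

print(starts, ends, alice_count)
print(email_pos, phone_pos, index_failed)
print(fields)
print(rejoined)
print(swapped)
```

The main thing to notice is the difference between find() and index(): both search for a substring, but find() hands back -1 on failure while index() stops your program with an error unless you catch it.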

Let's look at "Alice In Wonderland":

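import urllib.request

url = "http://cs1110.cs.virginia.edu/alice.txt"

# urlopen() gives back a stream of bytes that we can loop over line by line
stream = urllib.request.urlopen(url)
for line in stream:
    # each line arrives as bytes, so decode it to a str and strip whitespace
    decoded = line.decode("UTF-8").strip()
    if "Alice" in decoded:
        print(decoded)
stream.close()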
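Once you can pick out the lines you care about, count() lets you tally them. The sketch below is one possible way to do that. The function name count_word is hypothetical, and the sample lines are a made-up stand-in for the decoded lines of the file so that it runs without a network connection.

```python
def count_word(lines, word):
    """Return (number of lines containing word, total occurrences of word)."""
    matching_lines = 0
    total = 0
    for line in lines:
        text = line.strip()
        hits = text.count(word)   # case-sensitive, non-overlapping matches
        if hits > 0:
            matching_lines += 1
            total += hits
    return matching_lines, total


# Made-up sample lines standing in for the downloaded text.
sample = [
    "Alice was getting very tired\n",
    "of sitting on the bank,\n",
    "'Oh dear!' said Alice. 'Alice, wake up!'\n",
]

result = count_word(sample, "Alice")
print(result)
```

With the real file you would decode each line first, as in the loop above, and pass the decoded strings in.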

What if we wanted to find an email address?

text = '<a href="mailto:sherriff@virginia.edu">Email Me!</a>'

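# locate the '@' so we can look for the closing quote after it
at_sign = text.index('@')
# the address starts right after the first colon (the one in "mailto:")
colon = text.index(":")
# ...and ends at the first quotation mark that follows the '@'
end_quote = text.index('"', at_sign)

print(text[colon+1:end_quote])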
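The version above assumes the first colon in the text belongs to "mailto:" and that an address is actually present. Here is a slightly more defensive sketch of the same idea. The function name extract_mailto is hypothetical, and it uses find() instead of index() so that a missing match gives back None rather than an error.

```python
def extract_mailto(text):
    """Return the address after 'mailto:' in text, or None if there isn't one."""
    marker = "mailto:"
    start = text.find(marker)
    if start == -1:
        return None               # no mailto link in this text
    start += len(marker)          # skip past "mailto:" itself
    end = text.find('"', start)   # closing quote of the href attribute
    if end == -1:
        return None               # the link was never closed
    return text[start:end]


link = '<a href="mailto:sherriff@virginia.edu">Email Me!</a>'
other = '<a href="http://example.com">Not an email</a>'

print(extract_mailto(link))
print(extract_mailto(other))
```

Searching for the whole "mailto:" marker means an earlier colon, like the one in "http://", does not throw off the slice.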

Complete todo list program:

todo_list = []


def read_todo_list():
    datafile = open("todo_list.txt", "r")
    for line in datafile:
        todo_list.append(line.strip())
    datafile.close()


def add_to_list(item):
    todo_list.append(item)


def write_todo_list_file():
    datafile = open("todo_list.txt", "w")
    for item in todo_list:
        datafile.write(item)
        datafile.write("\n")
    datafile.close()


def print_todo_list():
    print()
    print()
    print("Your TODO List")
    print("--------------")
    for i in range(len(todo_list)):
        print(str(i) + ") " + todo_list[i])
    print()



def main():
    done = False
    read_todo_list()

    while not done:
        print_todo_list()
        print("Select an item to remove it, A to add a new item, Q to quit")
        choice = input("Choice?: ")
        if choice.isdigit():
            index = int(choice)
            # ignore numbers that don't match an item on the list
            if index < len(todo_list):
                del todo_list[index]
        elif choice == 'A':
            new_item = input("New item?: ")
            add_to_list(new_item)
        elif choice == 'Q':
            write_todo_list_file()
            done = True


main()
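One thing to watch for: the program assumes todo_list.txt already exists, so opening it for reading fails the very first time you run it. A possible way to handle that is sketched below. The function name load_items is hypothetical and is not part of the program above, and the temporary folder is only there so the example can run anywhere without touching your real file.

```python
import os
import tempfile


def load_items(path):
    """Read one item per line from path; return an empty list if the file is missing."""
    items = []
    try:
        with open(path, "r") as datafile:   # "with" closes the file for us
            for line in datafile:
                items.append(line.strip())
    except FileNotFoundError:
        pass                                # first run: no saved list yet
    return items


# Try it out in a throwaway folder.
with tempfile.TemporaryDirectory() as folder:
    path = os.path.join(folder, "todo_list.txt")

    first_run = load_items(path)            # file does not exist yet

    with open(path, "w") as datafile:
        datafile.write("buy milk\nfeed cat\n")

    second_run = load_items(path)           # now there is something to read

print(first_run)
print(second_run)
```

Returning the list instead of appending to a global also makes the function easier to test on its own, though either style works for a program this small.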