Shaping poems: Generalisation with methods

by Paul Curzon, Queen Mary University of London

Methods are an important way that computer scientists practice generalisation when programming. Whether you call them methods, routines, functions or procedures, they give a way to write generalised solutions to a problem rather than a one of solution that is no good for any other problem. I recently wrote a little method that was really just for fun, but illustrates some of the main ideas. It is all about printing a poem in a shape that matches a number sequence…

A Fibonacci Poem

BrianBilstonFibonacciPoem

Twitter recently doubled the number of characters allowed in tweets to 280 characters. Some think that this destroys the whole charm of tweets and the way they force conciseness. Brian Bilston tweeted this poem as a response which increases the number of words in each line according to a fibonacci number sequence. It is a famous sequence found often in nature such as in the numbers of petals on flowers and in bee breeding.  (See this cs4fn article for more) The sequence runs 1,1,2,3,5,8,13,21, … with each number obtained by adding the previous two. Brian’s poem made me think that shaping poems is a nice way to explore number sequences, so I started to play with a program.

 

I ended up with a really general program that could take different poems and different number sequences, but to start with I just tried to write a program that could simply print out Brian’s poem with line breaks following the fibonacci  sequence as he intended it to be formatted. Here is what I cam up with (using Python)…

First I needed to write a method that given a number n returns the nth fibonacci number. This is really easy using recursion as what you write corresponds closely to the way we described what it does in English. If n is 1 or 2 then the answer is 1, but otherwise it is the sum of the previous two fibonacci numbers. It works out those previous two fibonacci numbers using recursion – just calling the method fibonacci itself but with smaller numbers, so earlier positions in the sequence.

def fibonacci(n):
    if n == 1 :
        return 1
    elif n == 2 :
        return 1
    else :
        return (fibonacci(n-1) + fibonacci(n-2))

I can then write a method that calls that method and has the poem built in. The details of how this works don’t matter so much here as I’m focussing on generalisation – so ignore the detail if you wish.

Essentially what it does though is take the poem, and split it into a list of separate words. It then goes round a loop printing a line at a time. To do that it calculates the fibonacci number (I’ve made that bit bold below so you can see) for the line number we are up to and splits off that many words from the front of the list, joins them back in to a string and prints them. It then increases the counter that is keeping track of which words have been used, so that it is now pointing beyond those words. It then goes round the loop again to print the next line and the next …

def printpoem():
    poemtoprint = """I wrote a poem in a tweet but then each line grew to the word sum of the previous two until I began to worry  about all these words coming with such frequency because as you can see, it can be easy to run out of space when a poem gets all Fibonacci sequency"""
    splitPoem = poemtoprint.split(" ")
    line = 1
    startword = 0
    endword = 0
    
    while endword < len(splitPoem):
        currentlinelength = fibonacci(line)
        endword = endword + currentlinelength
        thisline = splitPoem[startword:endword]
        thislinetoprint = ‘ '.join(thisline)
        print(thislinetoprint)
        line = line + 1
        startword = startword + currentlinelength

    print()

All this method can do is print that one poem, as the poem is built in to it. We call the method

printpoem()

and that one poem gets formatted the right way… It can’t do any other poem.

I
wrote
a poem
in a tweet
but then each line grew
to the word sum of the previous two
until I began to worry about all these words coming with such frequency
because as you can see, it can be easy to run out of space when a poem gets all Fibonacci sequency

Generalise over data

That’s fine but its a bit restrictive. I’d quite like to be able to write my own poems and turn them in to fibonacci poems. I don’t want to have to write a method for every poem I write. We need to generalise the method to work for any poem we give it. Most programming languages provide a way to do this. You provide the data they need to do their job in the form of arguments to the method. You pass the data to the method and it stores it in its own local variable, but otherwise does exactly as it did before. For my method to format poems, the data is the poem. I generalise the method over all poems, by making the poem an argument. I get rid of the assignment to the variable poemtoprint at the start and instead put the variable poemtoprint in the brackets of the header line of the method to show it is an argument. It will be given a new value each time the program calls the method.

def printpoem(poemtoprint):
    splitPoem = poemtoprint.split(" ")
    line = 1
    startword = 0
    endword = 0
    
    while endword < len(splitPoem):
        currentlinelength = fibonacci(line)
        endword = endword + currentlinelength
        thisline = splitPoem[startword:endword]
        thislinetoprint = ‘ '.join(thisline)
        print(thislinetoprint)
        line = line + 1
        startword = startword + currentlinelength

    print()

 

With that small change, this new version of the method works for any poem to print. The poem is no longer hardwired in to the method. To get it to print the original poem, we provide the poem when we call the method by placing it in the brackets in the call as follows:

printpoem("""I wrote a poem in a tweet but then each line grew to the word sum of the previous two until I began to worry about all these words coming with such frequency because as you can see, it can be easy to run out of space when a poem gets all Fibonacci sequency""")

This prints the poem exactly as before. The change means, though, that if I write my own poem (ok, I know its not as good as the original) I can pass it to the method instead as follows:

printpoem("""My poem began well, completely under control. But it got carried away. Lines grew longer, how long I couldn’t say. They took on a life of their own, until I realised at last. They were following fibonacci’s sequence which meant that once they were expanding they just kept growing in length so, so fast.”"")

It gets formatted in the fibonacci way too – no new code needed!

My
poem
started off
completely under control.
But it got carried away.
Lines grew longer, how long I couldn’t say.
They took on a life of their own, until I realised at last.
They were following fibonacci’s sequence which meant that once they were expanding they just kept growing in length so, so fast.

The poem could be stored in a variable, then we just put the variable in the call eg

mypoem = """My poem began well, completely under control. But it got carried away. Lines grew longer, how long I couldn’t say. They took on a life of their own, until I realised at last. They were following fibonacci’s sequence which meant that once they were expanding they just kept growing in length so, so fast."""

printpoem(mypoem)

This gets the value of mypoem and passes it to printpoem.

Getting more advanced: Generalise over methods

At this point I start to get carried away. Why stop at Fibonacci poems. Why can’t we have Factorial poems, that get bigger really, really fast, or arithmetic ones, or triangular shaped poems, never mind, your common or garden rectangular shaped ones. Why should I write a new method for each! The only difference between any of them is the  number sequence. We need to generalise over the number sequence too.

In many modern languages, Python included, this isn’t a problem. You can generalise over methods. Methods themselves can be thought of as data and you can pass them as arguments to other methods. All I need to do is add another argument to the method that I will call numbersequence. It is just a variable, but one that can hold a method. The type of method it holds is one that, given a number representing a position in a sequence, returns the number found at that position in the sequence. We just need to replace the call to fibonacci that we had before with a call to whatever is stored in this variable:

def printpoem(poemtoprint, numbersequence):
    splitPoem = poemtoprint.split(" ")
    line = 1
    startword = 0
    endword = 0
    
    while endword < len(splitPoem):
        currentlinelength = numbersequence(line)
        endword = endword + currentlinelength
        thisline = splitPoem[startword:endword]
        thislinetoprint = ‘ '.join(thisline)
        print(thislinetoprint)
        line = line + 1
        startword = startword + currentlinelength

    print()

 

To call this method, you give it a poem such as Brian’s and a method that gives a number sequence and it formats the poem according to the method.

If we call it with Brian’s poem and our fibonacci function…

poem = """I wrote a poem in a tweet but then each line grew to the word sum of the previous two until I began to worry about all these words coming with such frequency because as you can see, it can be easy to run out of space when a poem gets all Fibonacci sequency”""

printpoem(poem, fibonacci)

We get his poem formatted as before. However, as we now have a generalised function to solve the problem, not a specific one, we can do lots more. We can pass it other functions defining different number sequences and so print it out in different ways. For example, if I want to make the poem’s lines just increase by one word at a time in a simple triangular shape, I just need a method that returns the length of each line. That is easy as all it requires is the nice simple number sequence 1,2,3,4, … It just returns the number it is passed, as the line number gives the required length of that line.

def righttriangle(n):
    return n

We can pass our general method my poem and this method.

printpoem(mypoem, righttriangle)

and the poem is automatically formatted a different way:

My
poem started
off completely under
control. But it got
carried away. Lines grew longer,
how long I couldn’t say. They
took on a life of their own,
until I realised at last. They were following
fibonacci’s sequence which meant that once they were expanding
they just kept growing in length so, so fast.

Of course we should really write poems that match the number sequence we are using so here is my first attempt. Writing the method was easy. The poem was a bit harder, but here goes…

trianglepoem = "I like triangles of all types, but square cornered ones are especially right for me."

printpoem(trianglepoem,righttriangle)

I get my formatted poem …

I
like triangles
of all types,
but square cornered ones
are especially right for me.

Getting REALLY advanced: Currying favour

We can now print out poems in any pattern we can write the number sequence for. The simplest shape of all, where all lines are the same length, raises a new problem, though, and to solve it we will need to be more sophisticated than ever in the way we think of methods. We need something called currying, which gives an amazing amount of flexibility.

Lets suppose we want to print out a poem in a rectangular way with 5 words per line. That is easy we just need the method that takes the position n and always returns 5 whatever line we are on.

def rectangle5 (n):
    return 5

However, we’ve just spent a lot of time praising the idea of generalisation so of course what we want really is to pass it the number of words in a line too. That’s easy. We just need two arguments.

def rectangleN (side, n):
    return side

The trouble is this doesn’t work with our poem-printing method. It expects a single argument. Remember, how it appears in the method is:

currentlinelength = numbersequence(line)

Ours has two arguments. It doesn’t fit. It has the wrong type.

What we need is a way to pass arguments to a method one at a time. That is called currying. You give a method one argument and it gives you back a new method that takes the next argument. Some languages give nice easy ways to do that. You can do it in Python, but it looks a bit convoluted as there is no nice syntax. In fact its doing exactly as described above. You define a method inside the method.

In our case we want a method that, when passed an argument that says the number of words you want to fix each line length as, gives back a method of the kind we need. It should return a method, rectangleN, that when passed a number always returns the same number – that fixed side.

def rectangle (words):
    def rectangleN(n):
        return words
    return rectangleN

So this method has a method defined in it, rectangleN, that is the specific method we want. Whatever value is passed as the first argument is the value the created method will return when called. The method as a whole returns the new method, rectangleN.

We then get the method to give to our printpoem method by calling this method with just one argument so, for example,

rectangle(5)

creates the specific method  we need to give the number sequence 5,5,5, …

We really need a poem about rectangles to try this out on, but my poem-writing skills are running  out of steam. So calling it on the original poem

printpoem(poem,rectangle(5))

gives us a more boring version of that poem (though at least this one fits on the page):

I wrote a poem in
a tweet but then each
line grew to the word
sum of the previous two
until I began to worry
about all these words coming
with such frequency because as
you can see, it can
be easy to run out
of space when a poem
gets all Fibonacci sequency

OK so it loses the point of the poem. We have at least solved the programming problem, though. Currying means we can now even pass generalised methods to a generalised method.

What’s the point of all this? Probably none, but I enjoyed myself writing poems and programs, so maybe you will. It’s possibly a fun way to explore number sequences by writing number-sequence programs with linked poems to visualise them. Or just take a favourite poem and visualise lots of number sequences using it. It does, however, show what generalisation in programs is all about  though. That aside we definitely should mix poems and computer science more. Roll on National Poetry Day

Strictly Private: Abstraction and the importance of ‘privacy’

by Paul Curzon, Queen Mary University of London

Celebrities want fame but they also want privacy. They want some things they do known to the public, but not everything. It turns out that privacy is important when writing good code too, and in object-oriented programming the personal body guards that provide it are built-in.

StrictlyLeaderboard

In the last blog post we explored the basics of object-oriented programming by looking at how to make game show judges for the Strictly Come Dancing of the future. Object-oriented programming is centred on the idea of abstraction: hiding information to make a problem (here coding) easier. In particular, it is built around the concept of an abstract data type. You create new types – collections of related values – for your program to use,  but hide their complicated inner details, their representation. We will illustrate the ideas by continuing our Strictly theme and look at how we might keep track of contestants’ scores using objects, and in doing so find out what abstract data types are, and what they give us.

We are going to think of Strictly Come Dancing contestants as objects just as we did with judges. We will create a new type of Contestant for our programs to use. We will then be able to create new Contestants, and record and change their details. However, we will do it in a special way, making those inner details private and only accessible by the privileged parts of our code (a bit like the way real celebrities’ lives are protected from the masses).

Properties of a Strictly contestant

AlexandraGorkaWe need to decide how to represent a contestant. First we ask what properties – what attributes – we need to record about contestants. How will we represent them? In Strictly contestants always come in pairs: a celebrity and a professional, so we will group them together and treat the contestant as the pair. Each person in the pair has a name. They are also given jointly, after they dance, a score from the judges. To keep things simple we will ignore the audience vote. We will also assume they just dance once for each show. Their score is just made up of four individual judge’s scores, but it is the total that matters and appears in the leader board. So that is what we will record. The properties of a contestant can be defined (in a simple pseudocode rather than a real language) as:

DESCRIPTION OF A Contestant:
    CelebrityName: String
    ProfessionalName: String
    TotalScore: integer

CelebrityName, ProfessionalName and TotalScore are the instance variables of the class we are defining. They hold the property values. This says a contestant is defined by the name of the celebrity, the name of their paired professional dancer and their score, which is an integer.

Keep it private

Now we do something that seems a bit perverse at first, but is the key to the power of objects. We mark all those instance variables as being private! They cannot be seen outside the Contestant class we are defining. The representation we just carefully created will be hidden.

DESCRIPTION OF A Contestant:
    PRIVATE CelebrityName: String
    PRIVATE ProfessionalName: String
    PRIVATE TotalScore: integer

Privacy is important to the talent of shows like this! Of course, this form of privacy is nothing to do with the stars’ private lives. Privacy in OOP is all about making the program easier to change in the future, as we will see. What marking them as private specifically means is that the rest of the program (outside the description of a Contestant we are defining) can not see any of these attributes of a contestant directly. We have set up a curtain around the representation of the contestants. We will only allow it to be accessed in carefully controlled ways via methods. In real life you can ask a star for an autograph or a selfie but they won’t go down the pub with you. Their behaviour is controlled. It is as though they come with personal bodyguards that keep all unwelcome advances away – very important in show business it seems for stars, but also for code. Access is strictly controlled to variables marked as PRIVATE. In practice, in compiled languages as least, it is the compiler that does all the checking – it ensures nothing marked PRIVATE is mentioned anywhere it shouldn’t be in the code.

Accessing marks

How then do we access the names and score? We are going to code a set of behaviours – a set of methods – whose job will be to allow those properties to be read and modified, but only in the controlled ways they define. These methods take the place of bodyguards for the code.

For example, we do not want the total score to be set arbitrarily. It MUST always be just the result of adding the four separate judges scores together. Therefore we will write a single method to do this and then enforce that it is used. It will be given the four different judges scores and add them up. No other way will be given to change the total.

So the behaviour we must define for a contestant pair gaining a score is that given four marks, which we will refer to as judge1Mark, judge2Mark, and so on, their sum is recorded. We define a method to do this as part of the description of a Contestant.

TO SetContestantScore:
    GIVEN judge1Mark AND judge2Mark AND judge3Mark AND judge4Mark:
        TotalScore = judge1Mark + judge2Mark + judge3Mark + judge4Mark

Whereas the actual property, TotalScore is private and so cannot be used directly in the rest of the program (just here), the method SetContestantScore is public. It is visible and can be used elsewhere. By setting this method as the only way TotalScore is accessed, we ensure it can ONLY by set by combining four judge’s marks together correctly. This protects the program from bugs caused by programmers getting confused or misunderstanding things elsewhere in the code.

We don’t want to just be able to set the score, we also need a way to access that score too, so it can be displayed on the leader board, for example. We again give a method to access it. By providing this method, we show we are happy for it to be accessed.

TO GetContestantScore:
         RETURN TotalScore

It just accesses the score from the Contestant object and returns it. This seems a little odd at first. Why use a method to do that? As we will see, accessing the values of a type through methods like this makes the program easier to change with less risk of mistakes being introduced. It is a good thing to do even in simple situations like this, just in case we later want to change them.

What’s your name?

What about the contestant’s names? We do the same sort of thing. We want them to be set only when the pairs are put together at the start (in real life with cries of “You were the partner I always wanted”). After being set they should not be changed, so we won’t give a way to change them. We do not want a bug elsewhere to be able to mess with their names so we make it impossible to do. We can build the setting of names in to a method called CreateContestant, say.

TO CreateContestant GIVEN celeb AND pro:
    CelebrityName = celeb
    ProfessionalName = pro
    TotalScore = 0;

We store a given celebrity name (celeb) and a given Pro’s name (pro) in to the corresponding instance variables of the object. We also set the TotalScore Instance variable to 0 as at the point they are created they have no score.

Methods like this that, when executed, create and initialise an actual object of a given class are called constructors. In many languages constructors are by convention called the same name as the type and automatically called as part of creating a new object, so we will do the same and change the name. Actually what they do instead of the above is something more like:

TO CREATE A NEW Contestant GIVEN celeb AND pro: 
    CREATE A NEW OBJECT WITH SPACE FOR THE ATTRIBUTES NAMED ABOVE
    THEN DO THE FOLLOWING
        CelebrityName = celeb
        ProfessionalName = pro
        TotalScore = 0;

We will use this version, called Contestant, that spells out that it creates an object of type Contestant first before setting the values, just to emphasise that constructors are a more complicated kind of method.

Let’s suppose once formed we ALWAYS want to refer to the celebrity-professional couple by combining the names, as in “Alexandra and Gorka”, with the celebrity first. We can enforce this by making the method that returns the name always return it in that form. Again if that is the only method we write, any code using our class can’t get it wrong. We glue the two names together with “ and “ between to make the form as above.

TO GetContestantName:
    RETURN (CelebrityName + “ and “ + ProfessionalName)

Abstract Data Types

At the start we talked about an abstract data type. Where does that come in? An abstract data type is a type that is defined by the things that can be done on values of that type. We have just defined what it means to be a contestant in exactly that way. We have specified that a value – an object – of type Contestant (as seen by the rest of the program) has four behaviours, and only four behaviours. You can

  • Create a contestant by providing the name of the celebrity and of the professional
  • Get the name of the contestant as a couple
  • Set a Contestant’s score as the total of four judge’s scores,
  • Get a Contestant’s score

You can do nothing else with a Contestant as all its inner details are hidden and inaccessible to any code using the class.

Our description above is a description of an abstract data type because it doesn’t say anything about how the names and scores are stored, and because there is nothing else you can do to manipulate the values other than those four actions.

If we put the parts together we get our whole class description for a contestant – the blueprint for a Contestant that can be used to create and manipulate them elsewhere in the program,

DESCRIPTION OF A Contestant:
  PRIVATE CelebrityName: String
  PRIVATE ProfessionalName: String
  PRIVATE TotalScore: integer

  TO CreateContestant GIVEN celeb AND pro:
      CelebrityName = celeb
      ProfessionalName = pro
      TotalScore = 0;

  TO SetContestantScore
    GIVEN judge1Mark AND judge2Mark AND judge3Mark AND judge4Mark:
      TotalScore = judge1Mark + judge2Mark + judge3Mark + judge4Mark

  TO GetContestantScore:
      RETURN TotalScore

  TO GetContestantName:
      RETURN (CelebrityName + “ and “ + ProfessionalName)

Using Contestants

So we have now created a new type of Contestant. How do we use it? We can now write code, for example, to create pairs of contestants as needed:

Pair1 IS A NEW Contestant USING “Alexandra” , “Gorka”

This is using our class blueprint to create our first object of type Contestant. We give it the names of “Alexandra”  and “Gorka”. It calls our constructor, our special method used to make Contestants, creates an empty blank of the object and then sets the CelebrityName to be “Alexandra” and the  ProfessionalName to be “Gorka”. It also sets their TotalScore to 0. Notice here we do not actually refer to any of those variable names. Writing the rest of the code we can forget about all that detail. We do not want or need to know the actual representation. We create more pairs as needed

Pair2 IS A NEW Contestant USING “Aston” , “Janette”
Pair3 IS A NEW Contestant USING “Jonnie“, “Oti”
Pair4 IS A NEW Contestant USING “Susan“, “Kevin”

and so on.

For each pair the program will also need to give them their marks as allocated by the judges eg by calling Pair1’s SetContestantScore method to set Alexandra and Gorka’s scores:

Pair1.SetContestantScore WITH 9, 10, 10, 10

Of course the actual marks will be provided by our judge objects. We do not need to know what happens to these values at this point as long as we trust it was the right thing. For all we know, writing this part of the program, each mark might just have been stored separately.

Similarly we can retrieve the total score when we need it. As long as that one method was written correctly, the right thing will always happen for all pairs. We will get the total (and in the following display it).

DISPLAY Pair1.GetContestantScore ()

We could do a similar thing to get the pair’s names. In the following we get both names and score and put space between the name and the score.

DISPLAY Pair1.GetContestantName() + “  “ + Pair1.GetContestantScore()

If we do this for each pair we will have displayed the leaderboard.

You can do no more …

As those method are the only ways our class provides for the code to access the details of the pairs, when writing the rest of the code, we can’t accidentally introduce bugs, printing the names in mixed up pairs, for example, or missing one of the judge’s scores when we total their marks. Programming in this style helps ensure we rule out the possibility of making these easy-to-make kinds of mistake. This is especially important as our whole program gets larger and larger, perhaps millions of lines of code, and as more people become involved in writing it. Those new people may not understand how we have coded a Contestant, but the point is they do not need to. All they need to know are what the methods we have provided to access it do not how they do it. They are the parts we have made public.

Next episode we will see more about the benefits of the abstract data type idea and so of object-oriented programming. Abstraction, i.e., hiding the implementation, allows us to modify the class implementation and yet the rest of the code still works without any further change.

Catch up with Paul’s other blog posts