abstract data type | Teaching London Computing: A RESOURCE HUB from CAS LONDON & CS4FN

by Paul Curzon, Queen Mary University of London

Celebrities want fame but they also want privacy. They want some of the things they do to be known to the public, but not everything. It turns out that privacy is important when writing good code too, and in object-oriented programming the personal bodyguards that provide it are built in.

In the last blog post we explored the basics of object-oriented programming by looking at how to make game show judges for the Strictly Come Dancing of the future. Object-oriented programming is centred on the idea of abstraction: hiding information to make a problem (here coding) easier. In particular, it is built around the concept of an abstract data type. You create new types – collections of related values – for your program to use, but hide their complicated inner details, their representation. We will illustrate the ideas by continuing our Strictly theme and looking at how we might keep track of contestants’ scores using objects, and in doing so find out what abstract data types are, and what they give us.

We are going to think of Strictly Come Dancing contestants as objects just as we did with judges. We will create a new type of Contestant for our programs to use. We will then be able to create new Contestants, and record and change their details. However, we will do it in a special way, making those inner details private and only accessible by the privileged parts of our code (a bit like the way real celebrities’ lives are protected from the masses).

Properties of a Strictly contestant

We need to decide how to represent a contestant. First we ask what properties – what attributes – we need to record about contestants. How will we represent them? In Strictly, contestants always come in pairs: a celebrity and a professional, so we will group them together and treat the contestant as the pair. Each person in the pair has a name. After they dance, they are jointly given a score from the judges. To keep things simple we will ignore the audience vote. We will also assume they just dance once for each show. Their score is made up of four individual judges’ scores, but it is the total that matters and appears on the leaderboard. So that is what we will record. The properties of a contestant can be defined (in a simple pseudocode rather than a real language) as:

DESCRIPTION OF A Contestant:
    CelebrityName: String
    ProfessionalName: String
    TotalScore: integer

CelebrityName, ProfessionalName and TotalScore are the instance variables of the class we are defining. They hold the property values. This says a contestant is defined by the name of the celebrity, the name of their paired professional dancer and their score, which is an integer.

Keep it private

Now we do something that seems a bit perverse at first, but is the key to the power of objects. We mark all those instance variables as being private! They cannot be seen outside the Contestant class we are defining. The representation we just carefully created will be hidden.

DESCRIPTION OF A Contestant:
    PRIVATE CelebrityName: String
    PRIVATE ProfessionalName: String
    PRIVATE TotalScore: integer

Privacy is important to the talent of shows like this! Of course, this form of privacy is nothing to do with the stars’ private lives. Privacy in OOP is all about making the program easier to change in the future, as we will see. What marking them as private specifically means is that the rest of the program (outside the description of a Contestant we are defining) cannot see any of these attributes of a contestant directly. We have set up a curtain around the representation of the contestants. We will only allow it to be accessed in carefully controlled ways via methods. In real life you can ask a star for an autograph or a selfie but they won’t go down the pub with you. Their behaviour is controlled. It is as though they come with personal bodyguards that keep all unwelcome advances away – very important, it seems, for stars in show business, but also for code. Access to variables marked as PRIVATE is strictly controlled. In practice, in compiled languages at least, it is the compiler that does all the checking – it ensures nothing marked PRIVATE is mentioned anywhere it shouldn’t be in the code.
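To see the curtain in a real language, here is a minimal sketch in Python. It rests on some assumptions: Python has no PRIVATE keyword and no compiler doing the checking, so the sketch uses two leading underscores, which is the closest Python gets (the attribute is renamed behind the scenes, and a determined programmer can still reach it). The snake_case attribute names are just our rendering of the pseudocode names, and the __init__ method is only there to give the attributes starting values.

```python
class Contestant:
    """A celebrity-professional pair whose details are kept private."""

    def __init__(self, celeb, pro):
        # Two leading underscores trigger Python's name mangling, the
        # nearest Python equivalent of the PRIVATE marker in the pseudocode.
        self.__celebrity_name = celeb
        self.__professional_name = pro
        self.__total_score = 0


pair1 = Contestant("Alexandra", "Gorka")

# Code outside the class cannot reach the attribute by its plain name.
try:
    pair1.__total_score
    blocked = False
except AttributeError:
    blocked = True

print(blocked)  # True
```

Unlike the compiler check described above, Python only notices the problem when the offending line actually runs.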

Accessing marks

How then do we access the names and score? We are going to code a set of behaviours – a set of methods – whose job will be to allow those properties to be read and modified, but only in the controlled ways they define. These methods take the place of bodyguards for the code.

For example, we do not want the total score to be set arbitrarily. It MUST always be just the result of adding the four separate judges’ scores together. Therefore we will write a single method to do this and then enforce that it is used. It will be given the four different judges’ scores and add them up. No other way will be given to change the total.

So the behaviour we must define for a contestant pair gaining a score is that given four marks, which we will refer to as judge1Mark, judge2Mark, and so on, their sum is recorded. We define a method to do this as part of the description of a Contestant.

TO SetContestantScore:
    GIVEN judge1Mark AND judge2Mark AND judge3Mark AND judge4Mark:
        TotalScore = judge1Mark + judge2Mark + judge3Mark + judge4Mark

Whereas the actual property, TotalScore, is private and so cannot be used directly in the rest of the program (just here), the method SetContestantScore is public. It is visible and can be used elsewhere. By making this method the only way TotalScore can be changed, we ensure it can ONLY be set by combining four judges’ marks together correctly. This protects the program from bugs caused by programmers getting confused or misunderstanding things elsewhere in the code.

We don’t just want to be able to set the score; we also need a way to access it, so it can be displayed on the leaderboard, for example. We again give a method to access it. By providing this method, we show we are happy for it to be accessed.

TO GetContestantScore:
    RETURN TotalScore

It just accesses the score from the Contestant object and returns it. This seems a little odd at first. Why use a method to do that? As we will see, accessing the values of a type through methods like this makes the program easier to change with less risk of mistakes being introduced. It is a good thing to do even in simple situations like this, just in case we later want to change them.
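The two score methods above might look something like this in Python. This is only a sketch: the method and parameter names are snake_case versions of the pseudocode names, the double underscore stands in for PRIVATE, and the contestants’ names are left out so the example can focus on the score.

```python
class Contestant:
    def __init__(self):
        # Private: only the methods of this class touch the total directly.
        self.__total_score = 0

    def set_contestant_score(self, judge1_mark, judge2_mark,
                             judge3_mark, judge4_mark):
        """Record the total of the four judges' marks."""
        self.__total_score = (judge1_mark + judge2_mark
                              + judge3_mark + judge4_mark)

    def get_contestant_score(self):
        """Return the recorded total."""
        return self.__total_score


pair1 = Contestant()
pair1.set_contestant_score(9, 10, 10, 10)
print(pair1.get_contestant_score())  # 39
```

Because set_contestant_score insists on exactly four marks, calling it with three or five is an error rather than a silently wrong total.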

What’s your name?

What about the contestants’ names? We do the same sort of thing. We want them to be set only when the pairs are put together at the start (in real life with cries of “You were the partner I always wanted”). After being set they should not be changed, so we won’t give a way to change them. We do not want a bug elsewhere to be able to mess with their names so we make it impossible to do. We can build the setting of names into a method called CreateContestant, say.

TO CreateContestant GIVEN celeb AND pro:
    CelebrityName = celeb
    ProfessionalName = pro
    TotalScore = 0

We store a given celebrity name (celeb) and a given professional’s name (pro) into the corresponding instance variables of the object. We also set the TotalScore instance variable to 0, as at the point they are created they have no score.

Methods like this that, when executed, create and initialise an actual object of a given class are called constructors. In many languages constructors are by convention given the same name as the type and are automatically called as part of creating a new object, so we will do the same and change the name. What they actually do, rather than the above, is something more like:

TO CREATE A NEW Contestant GIVEN celeb AND pro: 
    CREATE A NEW OBJECT WITH SPACE FOR THE ATTRIBUTES NAMED ABOVE
    THEN DO THE FOLLOWING
        CelebrityName = celeb
        ProfessionalName = pro
        TotalScore = 0

We will use this version, called Contestant, that spells out that it creates an object of type Contestant first before setting the values, just to emphasise that constructors are a more complicated kind of method.
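Python happens to make these two steps visible, so it gives a handy illustration. In the sketch below (names again our own snake_case rendering, double underscores standing in for PRIVATE), writing Contestant(...) first creates a blank object and then automatically calls the __init__ method to fill it in. The second pair spells out the same two steps by hand, which you would not normally do; it is there purely to show what is going on.

```python
class Contestant:
    def __init__(self, celeb, pro):
        # Step 2: fill in the attributes of the freshly created object.
        self.__celebrity_name = celeb
        self.__professional_name = pro
        self.__total_score = 0

    def get_contestant_score(self):
        return self.__total_score


# The usual way: Python creates the object, then calls __init__ for us.
pair1 = Contestant("Alexandra", "Gorka")

# The same two steps spelled out by hand, purely for illustration.
pair2 = object.__new__(Contestant)   # Step 1: create a blank object
pair2.__init__("Aston", "Janette")   # Step 2: initialise it

print(pair1.get_contestant_score(), pair2.get_contestant_score())  # 0 0
```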

Let’s suppose that, once formed, we ALWAYS want to refer to the celebrity-professional couple by combining the names, as in “Alexandra and Gorka”, with the celebrity first. We can enforce this by making the method that returns the name always return it in that form. Again, if that is the only method we write, any code using our class can’t get it wrong. We glue the two names together with “ and ” between them to make the form above.

TO GetContestantName:
    RETURN (CelebrityName + “ and ” + ProfessionalName)

Abstract Data Types

At the start we talked about an abstract data type. Where does that come in? An abstract data type is a type that is defined by the things that can be done to values of that type. We have just defined what it means to be a contestant in exactly that way. We have specified that a value – an object – of type Contestant (as seen by the rest of the program) has four behaviours, and only four behaviours. You can:

Create a contestant by providing the name of the celebrity and of the professional
Get the name of the contestant as a couple
Set a Contestant’s score as the total of four judges’ scores
Get a Contestant’s score

You can do nothing else with a Contestant as all its inner details are hidden and inaccessible to any code using the class.

Our description above is a description of an abstract data type because it doesn’t say anything about how the names and scores are stored, and because there is nothing else you can do to manipulate the values other than those four actions.

If we put the parts together we get our whole class description for a contestant – the blueprint for a Contestant that can be used to create and manipulate them elsewhere in the program:

DESCRIPTION OF A Contestant:
  PRIVATE CelebrityName: String
  PRIVATE ProfessionalName: String
  PRIVATE TotalScore: integer

  TO CREATE A NEW Contestant GIVEN celeb AND pro:
      CREATE A NEW OBJECT WITH SPACE FOR THE ATTRIBUTES NAMED ABOVE
      THEN DO THE FOLLOWING
          CelebrityName = celeb
          ProfessionalName = pro
          TotalScore = 0

  TO SetContestantScore:
    GIVEN judge1Mark AND judge2Mark AND judge3Mark AND judge4Mark:
      TotalScore = judge1Mark + judge2Mark + judge3Mark + judge4Mark

  TO GetContestantScore:
      RETURN TotalScore

  TO GetContestantName:
      RETURN (CelebrityName + “ and ” + ProfessionalName)
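For comparison, here is roughly how the whole blueprint might be written in Python. Treat it as one possible translation rather than the definitive one: the snake_case names are our own rendering of the pseudocode names, and the double underscores are Python’s approximation of PRIVATE (a naming trick rather than a compiler check).

```python
class Contestant:
    """A Strictly couple: a celebrity paired with a professional."""

    def __init__(self, celeb, pro):
        # The constructor: runs automatically when a Contestant is created.
        self.__celebrity_name = celeb
        self.__professional_name = pro
        self.__total_score = 0

    def set_contestant_score(self, judge1_mark, judge2_mark,
                             judge3_mark, judge4_mark):
        # The only way the total can ever change.
        self.__total_score = (judge1_mark + judge2_mark
                              + judge3_mark + judge4_mark)

    def get_contestant_score(self):
        return self.__total_score

    def get_contestant_name(self):
        # Always celebrity first, joined with " and ".
        return self.__celebrity_name + " and " + self.__professional_name


pair1 = Contestant("Alexandra", "Gorka")
print(pair1.get_contestant_name())  # Alexandra and Gorka
```

The four methods correspond one-to-one with the four behaviours listed earlier; there is deliberately no method for changing a name.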

Using Contestants

So we have now created a new type of Contestant. How do we use it? We can now write code, for example, to create pairs of contestants as needed:

Pair1 IS A NEW Contestant USING “Alexandra”, “Gorka”

This uses our class blueprint to create our first object of type Contestant. We give it the names “Alexandra” and “Gorka”. It calls our constructor, our special method used to make Contestants, which creates a blank object and then sets the CelebrityName to be “Alexandra” and the ProfessionalName to be “Gorka”. It also sets their TotalScore to 0. Notice that here we do not actually refer to any of those variable names. Writing the rest of the code, we can forget about all that detail. We do not want or need to know the actual representation. We create more pairs as needed:

Pair2 IS A NEW Contestant USING “Aston”, “Janette”
Pair3 IS A NEW Contestant USING “Jonnie”, “Oti”
Pair4 IS A NEW Contestant USING “Susan”, “Kevin”

and so on.

For each pair the program will also need to give them their marks as allocated by the judges, for example by calling Pair1’s SetContestantScore method to set Alexandra and Gorka’s score:

Pair1.SetContestantScore WITH 9, 10, 10, 10

Of course the actual marks will be provided by our judge objects. We do not need to know what happens to these values at this point as long as we trust it was the right thing. For all we know, writing this part of the program, each mark might just have been stored separately.

Similarly we can retrieve the total score when we need it. As long as that one method was written correctly, the right thing will always happen for all pairs. We will get the total (and in the following display it).

DISPLAY Pair1.GetContestantScore()

We could do a similar thing to get the pair’s names. In the following we get both names and score and put space between the name and the score.

DISPLAY Pair1.GetContestantName() + “  ” + Pair1.GetContestantScore()

If we do this for each pair we will have displayed the leaderboard.
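Here is a sketch of that leaderboard in Python, using only the public methods. Several things in it are our own assumptions: the snake_case names, the helper function leaderboard_lines, and the decision to sort with the highest score first. Only Alexandra and Gorka’s marks come from the example above; the other marks are made-up placeholders so there is something to display. One small difference from the pseudocode: Python will not glue a number onto a string, so the score is converted with str first.

```python
class Contestant:
    def __init__(self, celeb, pro):
        self.__celebrity_name = celeb
        self.__professional_name = pro
        self.__total_score = 0

    def set_contestant_score(self, judge1_mark, judge2_mark,
                             judge3_mark, judge4_mark):
        self.__total_score = (judge1_mark + judge2_mark
                              + judge3_mark + judge4_mark)

    def get_contestant_score(self):
        return self.__total_score

    def get_contestant_name(self):
        return self.__celebrity_name + " and " + self.__professional_name


def leaderboard_lines(pairs):
    """Return one display line per pair, highest total first (our choice)."""
    ranked = sorted(pairs, key=lambda p: p.get_contestant_score(),
                    reverse=True)
    return [p.get_contestant_name() + "  " + str(p.get_contestant_score())
            for p in ranked]


pairs = [Contestant("Alexandra", "Gorka"), Contestant("Aston", "Janette"),
         Contestant("Jonnie", "Oti"), Contestant("Susan", "Kevin")]

# Only the first set of marks is from the text; the rest are placeholders.
marks = [(9, 10, 10, 10), (8, 8, 9, 9), (7, 8, 8, 8), (6, 7, 7, 7)]
for pair, pair_marks in zip(pairs, marks):
    pair.set_contestant_score(*pair_marks)

lines = leaderboard_lines(pairs)
for line in lines:
    print(line)
```

Notice that the leaderboard code never mentions how a Contestant stores its names or score.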

You can do no more …

As those methods are the only ways our class provides for the code to access the details of the pairs, when writing the rest of the code we can’t accidentally introduce bugs such as printing the names in mixed-up pairs or missing one of the judges’ scores when we total their marks. Programming in this style helps rule out these easy-to-make kinds of mistake. This is especially important as our whole program gets larger and larger, perhaps millions of lines of code, and as more people become involved in writing it. Those new people may not understand how we have coded a Contestant, but the point is they do not need to. All they need to know is what the methods we have provided do, not how they do it. Those methods are the parts we have made public.

Next episode we will see more about the benefits of the abstract data type idea and so of object-oriented programming. Abstraction, i.e., hiding the implementation, allows us to modify the class implementation and yet the rest of the code still works without any further change.
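As a small taste of that, here is a hedged Python sketch of a Contestant with a different hidden representation. Instead of a running total it stores each judge’s mark separately, a possibility mentioned earlier, and adds them up only when asked. The private list __marks is our own invention for illustration; the public method names are the same snake_case renderings of the pseudocode names.

```python
class Contestant:
    def __init__(self, celeb, pro):
        self.__celebrity_name = celeb
        self.__professional_name = pro
        # New representation: keep every mark rather than a single total.
        self.__marks = [0, 0, 0, 0]

    def set_contestant_score(self, judge1_mark, judge2_mark,
                             judge3_mark, judge4_mark):
        self.__marks = [judge1_mark, judge2_mark, judge3_mark, judge4_mark]

    def get_contestant_score(self):
        # The total is now worked out on demand.
        return sum(self.__marks)

    def get_contestant_name(self):
        return self.__celebrity_name + " and " + self.__professional_name


# The calling code is exactly what it would be for the original version.
pair1 = Contestant("Alexandra", "Gorka")
pair1.set_contestant_score(9, 10, 10, 10)
print(pair1.get_contestant_name() + "  " + str(pair1.get_contestant_score()))
# Alexandra and Gorka  39
```

The code that uses a Contestant did not have to change at all, because it only ever relied on the public methods.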
