Programming Style and Structure

Overview

It will make life easier for all of us if we have a place to list out any style notes and guidelines we come up with. Style can be a very personal issue and I expect to hear complaints. I'm open to changes, we can discuss them as necessary.


In general, you will never go too far wrong by heeding the advice in "Smalltalk with Style" by Skublics, Klimas and Thomas. Fortunately it has been made freely available by Stephane Ducasse on his Free Smalltalk Books webpage. However there are some areas where it is less specific, not emphatic enough, or in my opinion plain wrong.


Another excellent source of Smalltalk information is Kent Beck's "Smalltalk Best Practice Patterns". However I think this may be out of print.


Specifics

Naming

Open Cobalt specific classes will start with either Cobalt or Co (the chemical abbreviation for Cobalt). Beyond that I'm open to suggestions. Clearly classes that aren't Cobalt specific are going to need something else.  Generic Croquet classes up until now have been prefixed with either T or Croquet, Wisconsin classes are prefixed with K, and Tweak classes are prefixed with C.  (And here I take a moment to blow raspberries at the people who thought a single character prefix was a good idea.)


As is usual in Smalltalk, we will camel case identifiers. Class names and Class pool variable names start with a capital. Instance variables, temporaries and method arguments start with lower case. Class-instance variables start with lower case.


When converting a phrase to an identifier and the phrase contains acronyms, either expand the acronym or only capitalize the first character of the acronym, modulo the rules above for the very first character. WAN IP Connection becomes (for a class) WanIpConnection, or WideAreaNetworkInternetProtocolConnection or WanInternetProtocolConnection or WideAreaNetworkIpConnection. NOT WANIPConnection.   This is particularly important for instance vars: wANIPConnection just looks ridiculous, wanIpConnection reads much better.


Flagging Problems

Sometime you don't have the time, or don't yet know how to write a given piece of code cleanly. When you run into this situation, write the code in the problematic fashion and mark the code with some marker so we can sweep it later.  I'm suggesting use of the convention which was used in Croquet previously, which was putting in a comment with a "@@@@: FIXME " or "@@@@: TODO " or "@@@@: Check this ". The important part is the 4 @s in a row as this never occurs in normal Smalltalk code so the 'Search all method source for a string' mechanism should easily pick these out.  When you do this ALWAYS annotate it with some description as to why you think it's a problem (DON'T just say it's 

bad) and your initials so someone else can ask later if they do not understand the issue.

Short Methods

If your method is longer than a screenful in the standard editors on a 1024x768 screen in a 12 point font, it's probably too long.  There are exceptions to this for performance reasons, but that is uncommon and is really a job for the profiler to find later.


Accessors

An accessors is defined as a method who's contract claims that you are accessing the internals of an instance directly by using it. Historically, there has been boatloads of arguments over when accessors should be used. I don't have a position on if you should use accessors, my rule is that if you do define accessors for a instance variable you should always use them to get/set the value. If you don't define accessors you obviously should never use the them (this implies the instance variable in question is hidden from outside access).  The reason for this is to make it easier to find all of the places where the variable is being accessed or changed. Instead of having to find all instance variable accesses AND all of the senders of both the accessors, I can just do one of them with confidence I have them all.  This makes refactoring considerably easier.


Hard Coded Classes and Constants

Don't, with the only real exceptions being a method who's whole purpose is to return some class or a call into the Smalltalk support system.  In general [self routerClass new] is better than [TSimpleRouter new]. Using [ ivar := OrderedCollection new ] and the like can be acceptable if the probability of change is low.


In most languages it is recommended that constant/literal values be assigned to a variable (or special constant declaration in languages that support it) and given a name that is suggestive of the purpose of the value. Smalltalk is similar, except it is usually better to define a method with a suggestive name that returns the value.

Example 

A simple example in the Croquet Hedgehog is the CroquetHarness>>createIsland:named: method.  This method is written as a monolithic lump and has been cut and pasted into subclasses duplicating functionality. It looks like in the Mar. 17/2008 version of Cobalt:


CroquetHarness>>createIsland: islandClass named: aString
    "Create a new island. islandClass must be a class that is used to generate the contents of the TIsland. aString is a name for the router/island that allows another user to identify it. aBlock is executed upon completion and existence of the island. This is used to do additional setup with the island."
    | p router controller id |
    id := islandClass islandID.

    "if a controller for this island has already been registered, just use that"
    allowControllerAliasing ifTrue:[
        controller := controllers detect:[:cc| cc sessionID = id and:[cc island == nil]] ifNone:[nil].
        controller ifNotNil:[^self createIsland: islandClass named: aString using: controller].
    ].

    "Create the dispatcher for any local islands we'd like to host the router for"
    dispatcher ifNil:[
        dispatcher := TExampleDispatcher new.
        dispatcher listenOn: 0. "wildcard port"
        dispatcher autoCreate: false.
    ].
    "@@@@ FIXME: This is clearly not the way to do it @@@@"
    router := TSimpleRouter new.
    false ifTrue:[router log: Transcript].
    router addUserName: 'foo' password: 'bar'.
    dispatcher addRouter: router id: id.

    contactPoint ifNil:[contactPoint := TContactPoint new].
    contactPoint addBroadcast:(TContact new 
        address: dispatcher address port: dispatcher port 
        id: id name: aString).

 

    "Create the controller"
    controller := self controllerClass new.
    controller connectTo: dispatcher address port: dispatcher port sessionID: id.
    self addController: controller.
    p := controller login: 'foo' password: 'bar'.
    p wait. "not good style but easier for the setup"
    p := controller join.            "receive messages"
    p wait. "see above"
    controller backDoor: (dispatcher routers at: id).
    ^self createIsland: islandClass named: aString using: controller


If you look carefully at the original method you'll notice that the variable referred to in the comment, aBlock, does not exist! This illustrates a common problem with comments in that they can easily fall out of sync with the code. Rewritten for extensibility and clarity:


CroquetHarness>>createIsland: islandClass named: aString
    "Create a new island. islandClass must be a class that is used to generate the contents of the TIsland. aString is a name for the router/island that allows another user to identify it."
    | id |
    id := islandClass islandID.

    self aliasableController

        ifNotNilDo: [ :aliasController |  ^self createIsland: islandClass named: aString using: aliasController ]. 

    self startDispatcher.
    self startRouterFor: id named: aString.
    self startContactPointFor: id named: aString.
    ^self createIsland: islandClass named: aString using: (self createAndConnectControllerTo: id)


And new we need some new methods:


CroquetHarness>>aliasableController

    allowControllerAliasing

        ifFalse: [ ^nil ].
    ^controllers

        detect:[:cc| cc sessionID = id and:[cc island == nil]]

        ifNone:[^nil]

 

CroquetHarness>>startDispatcher
    "Create the dispatcher for any local islands we'd like to host the router for"
    dispatcher ifNil:[
        dispatcher := self dispatcherClass new.
        dispatcher autoCreate: false.
        dispatcher listenOn: self dispatcherPort.
    ].


CroquetHarness>>dispatcherClass
    ^TExampleRouter

 

CroquetHarness>>dispatcherPort
    ^0 "Wildcard Port"

 

CroquetHarness>>startRouterFor: id named: aStrName
    | router |
    router := self newRouter.
    dispatcher addRouter: router id: id.
    ^router


CroquetHarness>>newRouter

    | router |

    router := self routerClass new.
    self initializeRouter: router.
     ^router

 

CroquetHarness>>routerClass
    ^TSimpleRouter

  

CroquetHarness>>initializeRouter: router

    router addUserName: (self routerUserName) password: (self routerPassword).

 

CroquetHarness>>routerUserName

    ^'foo'

 

CroquetHarness>>routerPassword

    ^'bar'


CroquetHarness>>createAndConnectControllerTo: id

    | p controller |

    controller := self controllerClass new.
    controller connectTo: (dispatcher address) port: (dispatcher port) sessionID: id.
    self addController: controller.

    p := controller login: (self routerUserName) password: (self routerPassword).
    p wait. "not good style but easier for the setup"
    p := controller join.            "receive messages"
    p wait. "see above"
    controller backDoor: (dispatcher routers at: id).

    ^controller 


Profiler

We have two time profilers in the class MessageTally: statistical and simulating.  They are wonderful. Please use them.  The statistical profiler is typically called by [MessageTally spyOn: [ ...Code to profile... ]]. We also have a memory usage profiler. It is also wonderful.  To get a simple millisecond resolution timing use  [ Time millisecondsToRun: [ ...Code to time...  ]  ].

Class Hierarchies (Proposed)

Up until now Croquet has suffered from an overuse of subclassing as the main code structuring mechanism.  There are two sets of things that the Croquet/Cobalt internals should be good at:  Supporting Cobalt and being a solid base for future Croquet API development, particularly in areas that have nothing to do with virtual worlds.


I'm thinking about having a three layer-ish structure as in this example:

  • CroquetHarness - Minimal functionality, rewritten for maximum extensibility.
    • CroquetVirtualWorldHarness - Extended for Virtual World specific support in a pluggable fashion
      • CobaltHarness - Specific Cobalt class we use.
    • CroquetBusHarness - Harness for hypothetical communications only croquet apps
      • DistributedRendererHarness - Harness for a hypothetical parallel frame renderer  

The top classes are essentially the existing Croquet classes, minimal functionality but rewritten for maximum extensibility.

The middle layer (and it could potentially be multiple subclasses) is where the real work is done. Application area functionality, as configurable as we can make it.

The lowest layer is application specific classes. If the middle layer is properly written, most of the action here will be configuration of the higher level extensions.

 


Law of Demeter (LoD)

http://c2.com/cgi/wiki?LawOfDemeter

Short form of the LoD is "only play with your friends". 

  • governs the communication structure within an object-oriented design
    • restricts message-sending statements in method implementations
    • only talk to your immediate friends
  • message target can only be one of the following objects:
    1. the method's object itself (self, super, thisContext)
    2. an object that is an argument
    3. an object referred to by the object's attribute
    4. an object created by the method
    5. an object referred to by a global variable

This tends to result in the creation of many small methods. It avoids fragility in the face of structural changes, for instance when you have a long chain of accessors such as [Processor activeIsland project ownerMorph harness.] . If any of those intermediate results should change, the whole statement could break. It is hard giving examples of correct usage of the LoD as it often depends heavily on context. In general though you want to emphasize the actions you want undertaken.

 

This example really isn't a very good one...this code is flawed because a lot of the code surrounding it is flawed. 


CPostcardDialog>>listOfFriends

    | harness aList |
    harness := Processor activeIsland project ownerMorph harness.

    aList := Dictionary new.

    "First, add the worlds we run"
    aList addAll: harness contactPoint broadcastContacts.

    "Now add the ones we hear on the local LAN"
    aList addAll: harness contactsByName.

    "Now, disable the one we are currently at."
    ^aList.


This should probably have the harness set in an instance variable on creation by the calling method and look more like:

 

listOfFriends

    | aList |
    aList := Dictionary new.
    aList addAll: harness myContacts.
    aList addAll: harness localNetContacts.
    ^aList.


Don't Repeat Yourself (DRY) / Once and Only Once

http://c2.com/cgi/wiki?DontRepeatYourself 

Every piece of knowledge must have a single, unambiguous, authoritative representation within a system. 

This form of the rule is from the Ruby community however it's been around longer than that. Database people will recognize this as the purpose of the higher normal forms in relational systems.


http://c2.com/cgi/wiki?OnceAndOnlyOnce

Like DRY, except it applies to code. If you find yourself writing essentially the same code twice, look at it again and see if it can be refactored intelligently.  This isn't so much a local rule as the whole purpose of code refactoring.  


Code formatting

On one hand, I'm not sure about this besides to say you should the built in formatter if you're in doubt.  There are all kinds of problems with it's output, but it's there.   On the other hand, my favoured style is a version of GNU style with a Smalltalk accent and greater consistency. The advantage is that in ST blocks are objects in their own right (and this is emphasized) while in C they're just syntax. :


someMethod: ballOrNil


    ballOrNil isBall

        ifTrue:[

            ball bounce.

        ]

        ifFalse: [

            self recordBadBall.

        ]
 

On the gripping hand, if we keep our methods short, the code formatting style shouldn't matter.


Miscellaneous

Macintosh OS X Application Signatures

Mac applications and file types have a 4 byte character code assigned to them. Apple keeps a registry of them and you can allocate new ones as required, as long as they don't collide with existing ones. I (jdougan) have registered a set of them for use with Cobalt, mainly so noobs and users don't have to worry about getting the right VM started when opening an image file. Of the allocated signatures below, I'm proposing that the cobalt vm have the signature CBLT and the image files have the signature COim.   The others either are for future expansion or have direct analogs with Squeak file types in case we need them.  It's better to overregister a bit now than find that the signatures you want are not available later.


Dear John,

Thank you for registering your creator code information with Apple Computer.
We appreciate your continued product development and support. The following
product information has been registered:

Company: Cobalt Collective
First Name: John
Last Name: Dougan
Address:
City:
State:
Postal/Zip Code:
Country: United States

Phone Number:
Email Address: jdougan@acm.org
Application: Cobalt Croquet

Application Signatures:
ASCII: CBLT HEX: 43424C54
ASCII: COim HEX: 434F696D
ASCII: COch HEX: 434F6368
ASCII: COpr HEX: 434F7072
ASCII: COap HEX: 434F6170
ASCII: CO3d HEX: 434F3364
ASCII: COav HEX: 434F6176
ASCII: COst HEX: 434F7374
ASCII: COso HEX: 434F736F
ASCII: Cblt HEX: 43626C74

This email serves as your registration confirmation. If you require any
changes to this information, please contact Apple Developer Support at
cfreg@apple.com with any requested changes.

Again, thank you for supporting Apple Computer.
Comments