Coding Standards G2

Introduction

The purpose of this document is to lay out a clear set of guidelines for developing code for the Geeklog 2 project. The scope of this document is limited mainly to PHP code, although Geeklog will occasionally use SQL and HTML.

Why have Code Conventions?

Code conventions are important to programmers for a number of reasons:

  • 80% of the lifetime cost of a piece of software goes to maintenance.
  • Hardly any software is maintained for its whole life by the original author.
  • Code conventions improve the readability of the software, allowing engineers to understand new code more quickly and thoroughly. If you ship your source code as a product, you need to make sure it is as well packaged and clean as any other product you create.

Acknowledgements

This document has been largely created by culling information from the following documents:

In some places we've copied it verbatim. Please don't sue us, we've got no money anyway. If you modify and redistribute this document please maintain the above credits and give us some also. The original Gallery version was mostly written by Bharat Mediratta with input from Chris Smith, Matthew McEachen and Jesse Mullan. Modifications for Geeklog were added by Vincent Furia.

Indentation

Four spaces should be used as the unit of indentation. The indentation should be built exclusively from spaces (i.e. no tabs).

If you use Emacs to edit code, you should set indent-tabs-mode to nil. Here is an example mode hook that will set up Emacs according to these guidelines (you will need to ensure that it is called when you are editing PHP files):

      (defun php-mode-hook ()
        ;; Indent with 4 spaces and never insert tab characters.
        (setq tab-width 4
              c-basic-offset 4
              c-hanging-comment-ender-p nil
              indent-tabs-mode nil))

Here are vim rules for the same thing:

      set expandtab
      set shiftwidth=4
      set softtabstop=4
      set tabstop=4

Line Length

Avoid lines longer than 100 characters. Yes, 100 not 80. We're in the new millennium now and we've got bigger screens.

Wrapping Lines

  • Break after a comma.
  • Break before an operator.
  • Prefer higher-level breaks to lower-level breaks.
  • Align the new line with the beginning of the expression at the same level on the previous line.
  • If the above rules lead to confusing code or to code that's squished up against the right margin, just indent 8 spaces instead (not 4 spaces).

Here are some examples of breaking method calls:

      someMethod(longExpression1, longExpression2, longExpression3, 
          longExpression4, longExpression5);
 
      var = someMethod1(longExpression1,
          someMethod2(longExpression2,
              longExpression3)); 

Following are two examples of breaking an arithmetic expression. The first is preferred, since the break occurs outside the parenthesized expression, which is at a higher level.

      longName1 = longName2 * (longName3 + longName4 - longName5)
              + 4 * longname6; // PREFER

      longName1 = longName2 * (longName3 + longName4
              - longName5) + 4 * longname6; // AVOID

Following are two examples of indenting function declarations. The first is the conventional case. The second would shift the second and third lines to the far right if it used conventional indentation, so instead it indents only 8 spaces.

      // CONVENTIONAL INDENTATION
      function someFunction($anArg, $anotherArg, $yetAnotherArg,
                            $andStillAnother) {
          ...
      }

      // INDENT 8 SPACES TO AVOID VERY DEEP INDENTS
      function superExtraHorkingLongMethodName($anArg1, $anArg2, $anArg3,
              $anotherArg, $yetAnotherArg, $yetMoreArg, $anArg4, $anArg5,
              $andStillAnother) {
          ...
      }

Line wrapping for if statements should generally use the 8-space rule, since conventional (4 space) indentation makes seeing the body difficult. For example:

      // DON'T USE THIS INDENTATION
      if ((condition1 && condition2)
          || (condition3 && condition4)
          || !(condition5 && condition6)) { // BAD WRAPS
          doSomethingAboutIt();            // MAKE THIS LINE EASY TO MISS
      } 

      // USE THIS INDENTATION INSTEAD
      if ((condition1 && condition2)
              || (condition3 && condition4)
              || !(condition5 && condition6)) {
          doSomethingAboutIt();
      } 

      // OR USE THIS
      if ((condition1 && condition2) || (condition3 && condition4)
              || !(condition5 && condition6)) {
          doSomethingAboutIt();
      }

Line Termination

Ensure that your editor is saving files in the Unix format. This means lines are terminated with a single line feed (LF), not with a CR/LF pair as on Win32 or a lone CR as on classic Mac OS. Any decent Win32 editor should be able to do this, but it might not always be the default. Know your editor.

Naming Convention

Naming conventions make programs more understandable by making them easier to read. They can also give information about the function of the identifier (for example, whether it's a constant, package, or class), which can be helpful in understanding the code.

Geeklog 2 uses the idea of namespaces to prevent different plugins from using the same class or table names. Plugins should prefix both classes and tables with the plugin name followed by an underscore.

In the block below, we'll refer to a capitalization strategy called StudlyCaps. In this strategy, multiple words are combined into one where the beginning of each internal word is capitalized. Acronyms are capitalized like regular words. For example a variable expressing the "maximum cpu speed" would be called $maximumCpuSpeed.

Type: Files
Description: File names are in StudlyCaps with the first letter capitalized. Try to keep your file names simple and descriptive. Files that contain PHP code (including PHP classes) should end in .php. Files that contain a PHP class definition should contain a class of the same name as the file (minus the namespace).
Examples:

      Geeklog.php

Type: Classes
Description: Class names are in StudlyCaps and should be nouns. Classes should be prefixed with the plugin name (or "Geeklog" for core classes) followed by an underscore (_). Try to keep your class names simple and descriptive. Use whole words and avoid acronyms and abbreviations (unless the abbreviation is much more widely used than the long form, such as URL or HTML).
Examples:

      class Geeklog_PasswordGenerator
      class Forum_Post
      class Geeklog_UrlGenerator

Type: Variables
Description: Variables are in StudlyCaps and should have a lowercase first letter. Variable names should not start with underscore (_) or dollar sign ($) characters, even though both are allowed. Variable names should be short yet meaningful. The choice of a variable name should be mnemonic, that is, designed to indicate to the casual observer the intent of its use. One-character variable names should be avoided except for temporary "throwaway" variables such as loop indices. Common names for temporary variables are i, j, k, m, and n for integers; and c, d, and e for characters.
Examples:

      var $fields;
      var $userInformation;
      var $linkUrl;

Type: Functions
Description: Functions are in StudlyCaps and should be verbs. Functions that return boolean values should be in the form of a question, as in "isEnabled". The noun-verb formation makes code easier to read, e.g. "Is the user enabled?" becomes "if ($user->isEnabled())".
Examples:

      function getField()
      function setField()
      function isEnabled()

Type: Constants
Description: The names of variables declared as class constants and of ANSI constants should be all uppercase with words separated by underscores ("_"). (ANSI constants should be avoided, for ease of debugging.) Constants used in localization should begin with an underscore.
Examples:

      $MIN_WIDTH = 4;
      $FULL_ONLY_MODE = 2;

Type: Tables
Description: The names of all tables should be prefixed by the plugin's name and the Geeklog table prefix. Table names should be all lower case and spaces should be represented by the underscore character (_).
Examples:

      <table_prefix>_<plugin_name>_table
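
The naming rules above can be seen together in one small class. This is only a sketch: the "Forum" plugin prefix comes from the examples above, but the class Forum_PostCounter, its constant, and its methods are hypothetical names invented for illustration and are not part of Geeklog.

```php
<?php
/**
 * Counts posts up to a fixed limit (hypothetical example class).
 */
class Forum_PostCounter {
    /**
     * Upper bound on counted posts; the name and value are illustrative.
     */
    const MAX_POSTS = 3;

    /**
     * Number of posts counted so far.
     */
    private $postCount = 0;

    /**
     * Counts one more post unless the limit has been reached.
     *
     * @return void
     */
    public function addPost() {
        if (!$this->isFull()) {
            $this->postCount++;
        }
    }

    /**
     * Boolean getter phrased as a question.
     *
     * @return bool TRUE once MAX_POSTS posts have been counted
     */
    public function isFull() {
        return ($this->postCount >= self::MAX_POSTS);
    }

    /**
     * @return int number of posts counted so far
     */
    public function getPostCount() {
        return $this->postCount;
    }
}
```

Following the file rule above, a class like this would live in a file named PostCounter.php (the class name minus the namespace prefix).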

Class and Function Declarations

When coding classes and functions, the following formatting rules should be followed:

  • No space between a method name and the parenthesis "(" starting its parameter list
  • Open brace "{" appears at the end of the same line as the declaration statement
  • Closing brace "}" starts a line by itself indented to match its corresponding opening statement, except when it is a null statement the "}" should appear immediately after the "{"

The above rules are sometimes referred to as the K&R style. Example:

      class Foo {
          function getBar($a, $b) {
          }
      }

Statements

Simple Statements

Each line should contain at most one statement. Example:

      $argv++;                // Correct
      $argc--;                // Correct

      $argv++; $argc--;       // AVOID!

Compound Statements

Compound statements are statements that contain lists of statements enclosed in braces "{ statements }".

The enclosed statements should be indented one more level than the compound statement. The opening brace should be at the end of the line that begins the compound statement; the closing brace should begin a line and be indented to the beginning of the compound statement. Braces are used around all statements, even single statements, when they are part of a control structure, such as an if-else or for statement. This makes it easier to add statements without accidentally introducing bugs due to forgetting to add braces.

return Statements

A return statement with a value should not use parentheses unless they make the return value more obvious in some way. Example:

      return;

      return $myDisk->size();

      return ($argv * $argc - $defaultValue);

if, if-else, if else-if else Statements

The if-else class of statements should have the following form:

      if (condition) {
          statements;
      }

      if (condition) {
          statements;
      } else {
          statements;
      }
      
      if (condition) {
          statements;
      } else if (condition) {
          statements;
      } else {
          statements;
      }

Note: if statements always use braces {}. Avoid the following error-prone form:

      if (condition) // AVOID! THIS OMITS THE BRACES {}!
          statement;

The ternary operator (example below) should not be used. Instead use an if-else statement.

      $alpha = (booleanExpression) ? $beta : $gamma;   // AVOID!  Use if statement instead!

for Statements

A for statement should have the following form:

      for (initialization; condition; update) {
          statements;
      }

An empty for statement (one in which all the work is done in the initialization, condition, and update clauses) should have the following form:

      for (initialization; condition; update);

When using the comma operator in the initialization or update clause of a for statement, avoid the complexity of using more than three variables. If needed, use separate statements before the for loop (for the initialization clause) or at the end of the loop (for the update clause).

while Statements

A while statement should have the following form:

      while (condition) {
          statements;
      }

An empty while statement should have the following form:

      while (condition);

do-while Statements

A do-while statement should have the following form:

      do {
          statements;
      } while (condition);

switch Statements

A switch statement should have the following form:

      switch (condition) {
      case ABC:
          statements;
          /* falls through */
      
      case DEF:
          statements;
          break;
      
      case XYZ:
          statements;
          break;
      
      default:
          statements;
          break;
      }

Every time a case falls through (doesn't include a break statement), add a comment where the break statement would normally be. This is shown in the preceding code example with the /* falls through */ comment. Every switch statement should include a default case. The break in the default case is redundant, but it prevents a fall-through error if later another case is added.
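
As a concrete sketch of the layout above, here is a small PHP function that uses a deliberate fall-through and a default case. The function getCategory() and the extension strings are hypothetical and exist only to illustrate the form.

```php
<?php
/**
 * Maps a file extension to a content category (hypothetical helper).
 *
 * @param string $extension file extension without the leading dot
 * @return string category name
 */
function getCategory($extension) {
    switch ($extension) {
    case 'jpg':
        /* falls through */

    case 'png':
        $category = 'image';
        break;

    case 'php':
        $category = 'code';
        break;

    default:
        $category = 'other';
        break;
    }

    return $category;
}
```

Because 'jpg' has no break, it shares the 'png' branch; the comment marks that as intentional rather than a forgotten break.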

White Space

Blank lines improve readability by setting off sections of code that are logically related. Two blank lines should always be used in the following circumstances:

  • Between sections of a source file
  • Between class and interface definitions

One blank line should always be used in the following circumstances:

  • Between methods
  • Between the local variables in a method and its first statement
  • Between logical sections inside a method to improve readability

Blank Spaces

Blank spaces should be used in the following circumstances: A keyword followed by a parenthesis should be separated by a space. Example:

      while (true) {
          ...
      }

Note that a blank space should not be used between a method name and its opening parenthesis. This helps to distinguish keywords from method calls. A blank space should appear after commas in argument lists. All binary operators except . should be separated from their operands by spaces. Blank spaces should never separate unary operators such as unary minus, increment ("++"), and decrement ("--") from their operands. Example:

      $a += $c + $d;
      $a = ($a + $b) / ($c * $d);
      while ($n < $s) {
          $n++;
      }
      print("size is " . $foo . "\n");

The expressions in a for statement should be separated by blank spaces. Example:

      for (expr1; expr2; expr3)

Casts should be followed by a blank space. Examples:

      myFunction((int) $a, (int) $b);

Comments

If you change code and/or write new code you should add appropriate comments. This applies to whole files (so called modules when we talk about procedural code, or classes when talking OO), function/method definitions, included files and important variables.

C style comments (/* */) and standard C++ comments (//) are both fine. Use of Perl/shell style comments (#) is discouraged. Multiple line C-style comments should see the *'s aligned in a column (including the first line).

In addition, commenting any tricky, obscure, or otherwise not-immediately-obvious code is clearly something we should be doing. Especially important to document are any assumptions your code makes, or preconditions for its proper operation. Any one of the developers should be able to look at any part of the application and figure out what's going on in a reasonable amount of time.

Class/Module Comments

These should appear at the beginning of every file, and help to explain the purpose of the file. Further, they give the file a name (module) and put the file in a certain group (a module group). All Geeklog2 core modules should belong to the package Geeklog2.core. This might be helpful in the future for identifying plugins and other non-core code. Module comments apply to classes as well; there they magically turn into class comments. Such a comment looks like this:

      /**
       * Short explanation (1 line!)
       *
       * Some more text which explains in more detail
       * what the file does and who might be interested
       * in understanding that.
       *
       * @author Name <email address>
       * @module modulename
       * @modulegroup group
       * @package Geeklog
       */

  • The first line should be short but meaningful.
  • The longer explanation may span several lines. Currently HTML markup is not retained, so try to avoid it.
  • The @module statement gives the file a more meaningful name. Usually it should be the filename without the suffix .php. Dots are not allowed in the name, convert them to underscore if needed.
  • The @modulegroup should be one of the following. If you think a new group is needed, please email me (better yet the developers mailing list). We will add it if appropriate.
  • TBD
  • The @package statement should be left unchanged.

Function/Method Comments

These explain in detail what a function does, what parameters it expects and what is returned by the function. Function comments apply to classes as well; there they magically turn into method comments. Such a comment appears directly above a function definition and looks like this:

      /**
       * Short explanation (1 line!)
       *
       * Some more text which explains in more detail what
       * the function does and who might be interested
       * in understanding that.
       *
       * @author Name <email address>
       * @author Name2 <other email address>
       * @param type description
       * @return type description
       */

  • The first line should be short but meaningful.
  • The longer explanation may span several lines. Currently HTML markup is not retained, so try to avoid it.
  • One @author statement may be present per author. One author may appear in each statement, consisting of his/her name and optionally the email address in < and > signs. If given, the email address will be converted into a hyperlink automagically.
  • One or more @param statements describing the arguments the function expects. They must be given in the order in which they appear in the function definition.
  • One @return statement, if the function returns something.
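
To show the comment block attached to real code, here is a minimal sketch. The function truncateTitle() is hypothetical, and writing the parameter name at the start of each @param description is an assumption about how the "type description" form would be filled in.

```php
<?php
/**
 * Shortens a title to a maximum length.
 *
 * Titles that already fit are returned unchanged. Longer titles
 * are cut at the limit and three dots are appended.
 *
 * @author Name <email address>
 * @param string $title the title to shorten
 * @param int $maxLength maximum number of characters to keep
 * @return string the shortened title
 */
function truncateTitle($title, $maxLength) {
    if (strlen($title) <= $maxLength) {
        return $title;
    }

    return substr($title, 0, $maxLength) . '...';
}
```

The @param lines appear in the same order as the arguments in the function definition, as required above.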

Class Variable and Include File Comments

These are simple: They just quickly explain what a class variable is used for, or what an included file does, or why we need it. These comments may be longer, if you have to explain more (e.g. the $dbp variable in the config file). They should appear just above the corresponding variable or include/require statement. They have to be one line and look like this:

      /**
       * Some explanation of the variable or file just below this comment.
       */

Document Header

All files should contain the following text in a form where it will not interfere with the purpose of the file (i.e., commented out). In this example, it's presented in a commented out form for inclusion into PHP files. The following header is tentative awaiting a decision on what licensing scheme to use for Geeklog2.

      <?php
      /**
       * Geeklog2 -- A web-based Content Management System for building communities
       *
       * This source file is subject to version 2.02 of the PHP license, that is bundled with this package
       * in the file LICENSE, and is available through the world-wide-web at
       * http://www.php.net/license/2_02.txt. If you did not receive a copy of the PHP license and are
       * unable to obtain it through the world-wide-web, please send a note to license@php.net so we can
       * mail you a copy immediately.
       *
       * @author Name <Name>
       * @copyright 2004
       * @version $Id:$
       *
       */

      ?>

Special Comments

Occasionally you wind up checking in code that's not totally satisfactory. Sometimes this is inevitable. In order to locate these bits of code so that we can find and resolve them later, use the following tags in a comment, above the code in question:

  • REVISIT - this is an optimization waiting to happen, or something that could be improved on later. Optionally. If we're bored. And have itchy C-x v v fingers.
  • TODO - this is missing functionality (so by definition, it's broken) that needs to be addressed at some point.
  • FIXME - this is stubbed/broken functionality. But I need to commit. And it can limp for now.

Keep in mind that you may not get back to this code for a while. You may not even be the one to fix the thing, so the more information that you provide while it's still fresh in your mind, the better. Potential solutions or workarounds are great, and may prove invaluable to whoever gets around to addressing the issue.

If the comment isn't clear it may be ignored and eventually deleted.

At some point in the future this will enable us to dictate the following:

  • No point release with FIXMEs
  • No major release with TODOs

PHP Specific Guidelines

Associative Array Keys

In PHP, it's legal to use a literal string as a key to an associative array without quoting that string. We don't want to do this -- the string should always be quoted to avoid confusion. Note that this is only when we're using a literal, not when we're using a variable. Examples:

      /* wrong */
      $foo = $assoc_array[blah];

      /* right */
      $foo = $assoc_array['blah'];

Quoting Strings

There are two different ways to quote strings in PHP - either with single quotes or with double quotes. The main difference is that the parser does variable interpolation in double-quoted strings, but not in single-quoted strings. Because of this, you should always use single quotes unless you specifically need variable interpolation to be done on that string. This way, we can save the parser the trouble of parsing a bunch of strings where no interpolation needs to be done. Also, if you are using a string variable as part of a function call, you do not need to enclose that variable in quotes. Again, this will just make unnecessary work for the parser. Note, however, that nearly all of the escape sequences that exist for double-quoted strings will not work with single-quoted strings. Be careful, and feel free to break this guideline if it's making your code harder to read. Examples:

      /* wrong */
      $str = "This is a really long string with no variables for the parser to find.";
      do_stuff("$str");
    
      /* right */
      $str = 'This is a really long string with no variables for the parser to find.';
      do_stuff($str);
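
The difference in interpolation and escape handling can be checked directly. This is a small sketch; the variable names are illustrative only.

```php
<?php
$name = 'Geeklog';

// Single quotes: $name is not interpolated and \n stays as a
// backslash followed by the letter n.
$literal = 'Hello $name\n';

// Double quotes: $name is replaced by its value and \n becomes
// a newline character.
$interpolated = "Hello $name\n";
```

Here $literal is 13 characters long and still contains the text $name, while $interpolated is "Hello Geeklog" followed by one newline character.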

Including Code

In most cases require_once() should be used over include_once(). The two constructs are identical in every way except how they handle failure: include_once() produces a Warning, while require_once() results in a Fatal Error. In other words, use require_once() if you want a missing file to halt processing of the page. With include_once(), the script continues regardless.

Note: include_once() and require_once() are statements, not functions. You don't need parentheses around the filename to be included.

Autoload

Since the classes for GL2 will be located in many different locations, it is still unknown whether GL2 will use PHP 5's autoload feature.

PHP Code Tags

Always use <?php ?> to delimit PHP code, not the <? ?> shorthand. This is the most portable way to include PHP code on differing operating systems and setups.

Uninitialized Variables

Don't use uninitialized variables. Geeklog will use a high level of run-time error reporting. This will mean that the use of an uninitialized variable will be reported as an error.

All variables should be initialized before use. Form variables should be referenced from the appropriate variable array. See the section on Register Globals.

      if ($var) {      // AVOID, uninitialized variable
          ...something
      }

      if (isset($var)) {  // OK
          ...something
      }

      $var = 1;
      if ($var) {      // OK
          ...something
      }

Register Globals

Geeklog2 will be designed to work with register globals disabled. To access GET, POST and COOKIE variables use the $_GET, $_POST, and $_COOKIE global arrays respectively. Alternatively the $_REQUEST global array variable can be used; it merges the $_GET, $_POST, and $_COOKIE arrays.

Note: Keep in mind that the precedence of $_REQUEST is defined in the php.ini file and may not be what you expect.
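
One way to combine this with the rule on uninitialized variables is to check a request array with isset() before reading from it. The helper below is a minimal sketch: getRequestValue() is a hypothetical function, not an existing Geeklog API, and the 'page' key is only an example.

```php
<?php
/**
 * Reads a value from a request array without touching unset keys.
 *
 * @param array $source request array such as $_GET or $_POST
 * @param string $key name of the request variable
 * @param mixed $default value to return when the key is not set
 * @return mixed the request value, or $default
 */
function getRequestValue($source, $key, $default) {
    if (isset($source[$key])) {
        return $source[$key];
    }

    return $default;
}

// Falls back to '1' when no "page" parameter was sent.
$page = getRequestValue($_GET, 'page', '1');
```

Passing the array explicitly keeps it visible at the call site whether a value came from $_GET, $_POST, or $_COOKIE.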

Sessions

GL2 will use PHP sessions for session management. The $_SESSION array will be available globally for storing session data. Since session information will be stored in a database, the size of data kept in the $_SESSION array should be kept as small as possible.

Miscellaneous

Magic Numbers

Don't use them. Use named constants for any literal value other than obvious special cases. Basically, it's OK to check if an array has 0 elements by using the literal 0. It's not OK to assign some special meaning to a number and then use it everywhere as a literal. This hurts readability AND maintainability. Included in this guideline is that we should be using the constants TRUE and FALSE in place of the literals 1 and 0 -- even though they have the same values, it's more obvious what the actual logic is when you use the named constants.
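
A short sketch of the idea, assuming a hypothetical limit on failed logins (the constant MAX_LOGIN_ATTEMPTS and the function isLockedOut() are invented for this example):

```php
<?php
/**
 * Number of failed logins allowed before an account is locked.
 */
define('MAX_LOGIN_ATTEMPTS', 3);

/**
 * Tells whether an account should be locked.
 *
 * @param int $failedAttempts number of failed logins so far
 * @return bool TRUE when the limit has been reached
 */
function isLockedOut($failedAttempts) {
    return ($failedAttempts >= MAX_LOGIN_ATTEMPTS);
}
```

If the limit changes, only the define() line needs editing, and every comparison against it stays readable.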

Shortcut operators

The only shortcut operators that cause readability problems are the shortcut increment ($i++) and decrement ($j--) operators. These operators should not be used as part of an expression. They can, however, be used on their own line. Using them in expressions is just not worth the headaches when debugging. Examples:

      /* wrong */
      $array[++$i] = $j;
      $array[$i++] = $k;

      /* right */
      $i++;
      $array[$i] = $j;
    
      $array[$i] = $k;
      $i++;

Operator Precedence

Do you know the exact precedence of all the operators in PHP? Neither do I. Don't guess. Always make it obvious by using parentheses to force the precedence of an expression so you know what it does.

      /* what's the result? who knows. */
      $bool = ($i < 7 && $j > 8 || $k == 4);
    
      /* now you can be certain what I'm doing here. */
      $bool = (($i < 7) && (($j > 8) || ($k == 4)));
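
To see that the grouping really matters, the sketch below evaluates the same three comparisons with and without explicit parentheses. The values of $i, $j, and $k are chosen only for illustration.

```php
<?php
$i = 9;
$j = 0;
$k = 4;

// && binds more tightly than ||, so PHP reads this as
// (($i < 7) && ($j > 8)) || ($k == 4).
$implicit = ($i < 7 && $j > 8 || $k == 4);

// Grouping the || first gives a different result for the same values.
$orFirst = (($i < 7) && (($j > 8) || ($k == 4)));
```

With these values $implicit is TRUE and $orFirst is FALSE, which is exactly the kind of surprise the explicit parentheses are meant to prevent.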

SQL code layout

Since we'll all be using different editor settings, don't try to do anything complex like aligning columns in SQL code. Do, however, break statements onto their own lines. Here's a sample of how SQL code should look. Note where the lines break, the capitalization, and the use of brackets. Examples:

      SELECT field1 AS something, field2, field3
      FROM table a, table b
      WHERE (this = that) AND (this2 = that2)

Appendix A: Document History

  • Version 0.2 (08/06/2004): Adapted to Wiki, incorporated comments from geeklog.net.
  • Version 0.1 (06/30/2004): initial release for comments, adapted from Gallery2 (G2) coding standards document.