Code is typically written once but read by many. Current or future team members will read the code. In an open source project, many more folks including end users may read the code. Therefore, code readability is important. Sometimes when the code alone doesn't provide context or clarify intent, the developer may write extra descriptions. These descriptions are called code comments.

Code comments enhance readability. They facilitate code reviews, refactoring, and maintenance. Moreover, code comments are ignored by compilers and interpreters when producing the final executable. Thus, they incur no runtime performance overhead. However, too many or unnecessary comments can reduce readability.
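
As a rough illustration of comments being discarded before execution, the sketch below parses a commented and an uncommented version of the same snippet and compares the resulting syntax trees. The snippet and the helper name `same_program` are made up for this example.

```python
import ast

# Two versions of the same hypothetical snippet: one commented, one bare.
commented = """
# Convert the price from cents to whole currency units.
price = 1999 / 100  # stored in cents upstream
"""
bare = """
price = 1999 / 100
"""


def same_program(src_a: str, src_b: str) -> bool:
    """Return True if both sources parse to the same syntax tree."""
    # ast.dump omits line and column attributes by default,
    # so only the structure of the two programs is compared.
    return ast.dump(ast.parse(src_a)) == ast.dump(ast.parse(src_b))


print(same_program(commented, bare))  # True
```

Since the parser never sees the comments, nothing about them can reach the executed program.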

All modern languages support code comments. Different languages have different syntaxes. Comments can be single-line or multi-line.

## Discussion

• What are the purposes for which code comments make sense?

Code comments are used for various purposes:

• Description: Used for describing the logic behind the code, not the code itself. Used for describing the algorithm or explaining why it was chosen. Even flowcharts in ASCII could be included.
• Metadata: Could include the author and date of the first version, and the name and contact of the current maintainer. Could include external references that provide context, such as a published paper or a StackOverflow answer.
• Pending Work: Comments tagged with FIXME or TODO indicate pending work. Editors/IDEs often recognize them.
• Debugging: Developers comment out redundant code or code causing problems. These comments are ideally deleted once the problem is found.
• Directives: Meant for editors and interpreters. For example, UNIX scripts use #! to specify which interpreter must be used. Python uses # -*- coding: UTF-8 -*- to declare the encoding of the source file.
• Documentation: For documenting interfaces and to assist editors/IDEs. Tools exist to automatically create documentation from these comments. Some developers treat these as documentation, not comments.
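
To give a feel for how a tool might recognize pending-work tags, here is a minimal sketch that uses Python's standard `tokenize` module to look at comment tokens only. The function name `find_pending_work`, the regular expression, and the sample source are assumptions made for this example; real editors and IDEs use their own mechanisms.

```python
import io
import re
import tokenize

# Matches a TODO or FIXME tag, an optional colon, and the rest of the comment.
TAG_PATTERN = re.compile(r"\b(TODO|FIXME)\b:?\s*(.*)")


def find_pending_work(source: str) -> list[tuple[int, str, str]]:
    """Return (line number, tag, message) for each TODO/FIXME comment."""
    found = []
    tokens = tokenize.generate_tokens(io.StringIO(source).readline)
    for tok in tokens:
        if tok.type != tokenize.COMMENT:
            continue  # string literals that mention TODO are not comments
        match = TAG_PATTERN.search(tok.string)
        if match:
            found.append((tok.start[0], match.group(1), match.group(2).strip()))
    return found


sample = '''\
def average(values):
    # TODO: handle an empty list
    label = "TODO is not a comment here"
    return sum(values) / len(values)  # FIXME: returns a float even for int input
'''

for line_number, tag, message in find_pending_work(sample):
    print(line_number, tag, message)
```

Working on tokens rather than raw text means the string literal on the third line of the sample is not reported.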

• What are some anti-patterns when writing code comments?

Writing lots of comments explaining every single detail of what the code does is an anti-pattern. Too many comments clutter the source file and reduce readability. Moreover, such comments are also hard to maintain as the code evolves.

Using comments to keep a historical record of how the code has evolved is an anti-pattern. This is best done by your version control system. Old code that's no longer used should not be left commented out. It's cleaner to delete the code. If required, we can always get it back from the version control system.

Magic comments are those that give editors and interpreters directives. These may be important for the correct working of the software. Mixing such comments with regular comments can be problematic. Someone may delete the magic comments by mistake.

Some developers believe that comments are not necessary if the code is clean. Comments don't fix bad code. It's been said that,

Don't document bad code – rewrite it.

From this perspective, there are ways to refactor bad code and get rid of unnecessary comments:

• Naming: A variable i could be renamed to numGoals, which clarifies intention. Do this for variables, methods and classes.
• Structure: If a code fragment can't be understood without comments, try to change the code structure. Design patterns help in this regard.
• Sub-expressions: Where there's a complex expression, split it into multiple sub-expressions.
• New Method: A comment that explains a block of code can be removed by extracting that code into its own method. The method's name should clarify intent.
• Assertion: A variable may be expected to obey some constraint. Don't state this in a comment. Instead, use an assertion.

When all the above are employed and code is refactored, only a few comments may remain. Most likely, these explain why the code is written in a certain way or describe complex algorithms that couldn't be simplified.
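
The refactorings listed above can be sketched with a small before-and-after pair. Both functions and all their names are hypothetical; the point is only that renaming, extracting a function, splitting an expression, and adding an assertion make the original comments unnecessary.

```python
# Before: comments compensate for unclear names and structure.
def calc(l, p):
    # p must be between 0 and 1
    # add up the prices, then subtract the discount
    t = 0
    for i in l:
        t += i
    return t - t * p


# After: names, a helper function, sub-expressions and an assertion
# carry the intent, so no explanatory comments are needed.
def subtotal(prices):
    return sum(prices)


def discounted_total(prices, discount_rate):
    assert 0 <= discount_rate <= 1, "discount_rate must be a fraction"
    full_price = subtotal(prices)
    discount = full_price * discount_rate
    return full_price - discount


print(discounted_total([10, 20], 0.5))  # 15.0
```

Unlike the comment in `calc`, the assertion in `discounted_total` fails loudly when the constraint is violated.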

• Could you share some tips for writing code comments?

Avoid commenting what's already obvious in the code. Comments should explain at a higher level of abstraction. In other words, "why" comments are good, "how" comments are bad.
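
A short sketch of the difference, using a hypothetical retry helper. The function `fetch_with_retry` and the assumption that the upstream service fails transiently are invented for illustration.

```python
import time


def fetch_with_retry(fetch, attempts=3, delay_seconds=0.0):
    """Call fetch() until it succeeds or the attempts run out."""
    assert attempts >= 1, "at least one attempt is required"

    # A "how" comment would only restate the loop: "try up to `attempts` times".
    # A "why" comment records what the code cannot say: the upstream service
    # is assumed to fail transiently, so retrying avoids surfacing short
    # outages to users.
    last_error = None
    for _ in range(attempts):
        try:
            return fetch()
        except ConnectionError as error:
            last_error = error
            time.sleep(delay_seconds)
    raise last_error
```

If the retry logic is later rewritten, the "why" comment is likely to remain valid, whereas a "how" comment would need rewriting along with the code.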

Organize code into blocks, each block performing a single task. A comment could precede each block. This makes it easier for others to follow the flow at a high level. In fact, some developers write these comments first before writing the code.
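
One way this might look in practice is shown below. The function, its input format, and the pass mark of 50 are assumptions for the example; the three block comments could have been written first as an outline, with the code filled in afterwards.

```python
def summarize_scores(raw_lines):
    # Parse: turn "name,score" lines into (name, score) pairs.
    records = []
    for line in raw_lines:
        name, score = line.split(",")
        records.append((name.strip(), int(score)))

    # Filter: keep passing scores only (pass mark of 50 is an assumption).
    passing = [(name, score) for name, score in records if score >= 50]

    # Report: highest score first.
    return sorted(passing, key=lambda record: record[1], reverse=True)


print(summarize_scores(["ann, 72", "bob,41", "cy, 90"]))
```

Reading only the three comments gives the flow of the function without studying any of the code.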

Get to the point. Avoid jokes, poetry and verbosity. Be polite in your comments. Write comments in a consistent style. Write them for other developers, although some write them for non-programmers. If you write TODO comments, remember to take action on these later. Update comments when updating code.

Take inspiration from guidelines defined by others. For example, Google Style Guides are available for many languages and most of these include guidelines for comments. There are also guidelines for documenting APIs.

• What are some patterns for writing code comments?

Experienced developers most likely know when and how to write good code comments. Chris Travers has attempted to formalize these practices by giving them names:

• Documentation Comment Pattern: For documenting interfaces and not for explaining the code itself.
• Section Heading Comment Pattern: Comments are used to separate sections of code, such as public vs private methods. Helpful to organize code and find things quickly later.
• Footnote Comment Pattern: Comments that describe why a particular approach was adopted. Short and informative. Generally used when such information can't be deduced from the code.
• Warning Comment Pattern: Comments that warn developers of some special needs, such as calling a function as a superuser. Warnings may be about security or design flaws. Comments may include TODO or FIXME annotations.
• Signed Comment Pattern: Useful in a team. Comments are accompanied by initials of the developer. Makes it easy to know who to approach for discussions.
• Woven Code Pattern: Code and documentation go together. Document first and then code to that documentation.

• Which are the syntax types for code comments?

Comment syntaxes vary across languages but basically there are two types:

• Inline Comment: Used for a single-line comment, which runs from a start comment token to the end of the line.
• Block Comment: Used for a multi-line comment, marked by a pair of start-end comment tokens. In some languages, block comments can be nested, that is, a block comment within another block comment is permitted.

A language may support one or both types of syntaxes. Some languages such as ALGOL, Mathematica, OCaml, Simula, and Smalltalk don't have a single-line comment syntax but the token pair for block comments may be used within the same line. Some languages such as R, Python, FORTRAN, Erlang and Visual Basic don't have block comments. Though Python has """ … """, these are meant for documentation, not comments.
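
The distinction in Python between a comment and a triple-quoted documentation string can be seen in a small sketch. The function `area` is a made-up example.

```python
def area(width, height):
    """Return the area of a rectangle."""
    # This comment is dropped when the source is tokenized;
    # nothing of it is available at runtime.
    return width * height


# The docstring, by contrast, is stored on the function object.
print(area.__doc__)  # Return the area of a rectangle.

"""
A bare triple-quoted string such as this one is an expression statement,
not a comment. It is sometimes used as a makeshift block comment.
"""
```

Tools such as the built-in `help()` read `__doc__`, which is why docstrings are treated as documentation rather than comments.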

PHP is an example that supports multiple syntaxes for inline comments (# and //).

Without being exhaustive, we list some start tokens used for inline comments:

• #: UNIX/Linux shells, Cobra, Perl, Python, Ruby, Seed7, Windows PowerShell, PHP, R, Make, Maple, Elixir, Nim
• //: ActionScript, C, C++, C#, D, F#, Go, Java, JavaScript, Kotlin, Objective-C, PHP, Rust, Scala, SASS, Swift, Xojo
• --: Euphoria, Haskell, SQL, Ada, AppleScript, Eiffel, Lua, VHDL, SGML, PureScript
• %: TeX, Prolog, MATLAB, Erlang, S-Lang, Visual Prolog
• ;: Assembly x86, Lisp, Common Lisp, Clojure, Scheme
• REM: BASIC, Batch files

We list some start-end token pairs for block comments:

• /* … */: ActionScript, C, C++, C#, D, Go, Java, JavaScript, Kotlin, Objective-C, PHP, PL/I, Prolog, Rexx, Rust, Scala, SAS, SASS, SQL, Swift, Visual Prolog, CSS
• (* … *): Delphi, ML, Mathematica, Object Pascal, Pascal, Seed7, Applescript, OCaml, Standard ML, Maple, Newspeak, F#
• #| … |#: Lisp, Scheme, Racket
• =begin … =end: Ruby

• Can data have comments attached to them?

Indeed, data can have comments too. Like code comments, these can be descriptions, metadata, references, etc. For example, static web content is served commonly in HTML. HTML (and XML) documents can have comments within <!-- … --> that are not displayed by web browsers.
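
As a small illustration, Python's standard `html.parser` module reports comments through a separate callback from visible text, which mirrors how browsers keep them out of the rendered page. The class name `CommentCollector` and the sample markup are assumptions for this sketch.

```python
from html.parser import HTMLParser


class CommentCollector(HTMLParser):
    """Collect comments and visible text separately."""

    def __init__(self):
        super().__init__()
        self.comments = []
        self.text = []

    def handle_comment(self, data):
        # Called with the text found between <!-- and -->.
        self.comments.append(data.strip())

    def handle_data(self, data):
        # Called with text that appears between tags.
        if data.strip():
            self.text.append(data.strip())


page = "<p>Hello<!-- reviewed by the docs team --> world</p>"
collector = CommentCollector()
collector.feed(page)
collector.close()

print(collector.text)      # ['Hello', 'world']
print(collector.comments)  # ['reviewed by the docs team']
```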

It's common for computer systems to use configuration files. Typical formats are Microsoft Windows INI, XML, JSON and YAML. Except for JSON, all of these allow comments. Although JSON has no comment syntax, developers have found a way to add comments from a semantic standpoint by following certain naming conventions. For example, "_comment1": "this is my comment" serves as a comment although JSON parsers will treat it as data. The reason JSON doesn't have a comment syntax is that developers were adding parsing directives in comments, thereby making the format less interoperable.
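
The naming convention can be handled in application code along these lines. This is a minimal sketch: the configuration content, the `_comment` prefix as the agreed convention, and the helper `strip_comments` are assumptions for illustration, since JSON itself attaches no meaning to such keys.

```python
import json

config_text = """
{
  "_comment1": "this is my comment",
  "port": 8080,
  "database": {
    "_comment": "credentials are read from the environment",
    "host": "localhost"
  }
}
"""


def strip_comments(value, prefix="_comment"):
    """Recursively drop object keys that start with the comment prefix."""
    if isinstance(value, dict):
        return {
            key: strip_comments(item, prefix)
            for key, item in value.items()
            if not key.startswith(prefix)
        }
    if isinstance(value, list):
        return [strip_comments(item, prefix) for item in value]
    return value


parsed = json.loads(config_text)   # the parser keeps the comment keys as data
settings = strip_comments(parsed)  # the application removes them by convention

print(settings)  # {'port': 8080, 'database': {'host': 'localhost'}}
```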

Comments in configuration files make sense for much the same reasons as comments in code. In a quote that's attributed to James Gosling, it's been said that,

Every configuration file becomes a programming language, so you might as well think that way.

## Milestones

1957

Designed at IBM, FORTRAN is released as a programming language along with a compiler. FORTRAN allows comments, thereby making the code accessible to non-programmers. A comment is a full-line comment marked by letter C at position 1 till the end of the line. COBOL and BASIC that follow soon after also support comments from a fixed position. The use of fixed position simplified compiler design. We should note that since the early 1950s assemblers have supported comments starting at any position, including trailing comments.

1960

ALGOL is defined by a committee as a "second-generation" language. It supports block comments. Pascal takes inspiration from ALGOL and is released in 1970. It also supports block comments.

1970

Through the 1970s, structured programming becomes more popular. The same can be said of object-oriented programming in the 1980s. Programs evolve to higher levels of abstraction. Programs become more organized. Computer systems have more memory and so variable names could be longer and more descriptive. Hungarian notation is less favoured. These developments imply a lesser need to write comments.

1981

In one of the earliest experiments, Woodfield et al. find that code comments improve readability. Tenny (1985) shows that comments improve the understanding of the Banker's Algorithm. Further experiments through the 1980s and early 1990s confirm these findings and highlight the importance of code comments in maintaining large software projects.

Oct 2007

Fluri et al. report results of a study on how code and its comments evolve. Based on three open source projects, they find that new code is barely accompanied by comments. Class and method declarations are commented more often than method calls. About 97% of comment changes are done in the same version as the code changes. Their research method associates code with comments based on proximity and a token-based string similarity measure.

2010

Since writing comments is extra work for developers, it's common to encounter code that's inadequately commented. During the period 2010-2014, tools emerge to automatically generate code comments. These use techniques from Information Retrieval (IR). From the mid-2010s, neural networks are more often employed for this problem.

2011

Haouari et al. present one of the first taxonomies for code comments. They identify four high-level categories.

2013

Steidl et al. identify seven high-level categories of comments. Their main aim is to understand developers' commenting habits. They find that commenting out code has a negative impact on code readability. Section comments such as // ---- Getter and Setter Methods --- improve readability. Method comments help callers understand the system design and how to call those methods.

Feb 2019

Apr 2019

Pascarella et al. manually classify 40,000 lines of code comments from 14 open source Java projects. Their goal is to empirically determine the types of comments developers are writing. In the process, they develop a taxonomy of code comments. They also develop machine learning features and algorithms to automate the classification.


Last updated on 2022-02-15 11:56:52