• One possible refactoring of Java's stream class hierarchy. Source: Murphy-Hill and Black 2008, fig. 1.
• Popularity of automated refactoring techniques in Eclipse for Java: Developer (row) vs. Technique (col). Source: Murphy-Hill and Black 2008, fig. 5.
• Some refactorings take more time than others. Source: Szőke et al. 2014, fig. 5.
• Refactoring is a distinct step in a TDD process flow. Source: Albano 2018.
• Refactoring techniques and their change scores. Source: Ouni et al. 2016, table I.
• Visual Studio 2017 lists six refactoring techniques for C# code. Source: iNoryaSoft 2018.

# Code Refactoring

Created by arvindpdmn on 2020-03-28 03:49:00
Last updated by arvindpdmn on 2020-04-29 06:02:59

## Summary

Software is rarely perfect. Bugs need to be fixed. Code and its structure can be improved. Even when no new features are added, restructuring code can make it easier to understand and maintain. Refactoring is thus about restructuring existing code: it changes internal structure while preserving external behaviour. Refactored code works just as before and passes existing tests, but is better in terms of structure and organization.

Refactoring shouldn't be a special task that needs to be scheduled. Refactoring should be a day-to-day programming discipline. Whenever developers see an opportunity to improve existing code, they should refactor.

IDEs can help in automated refactoring. Testing is an essential activity. It ensures that refactored code works as before and nothing is broken. Incremental refactoring is preferred over a full-scale code rewrite.

## Milestones

Aug
1981

In an article on Smalltalk in BYTE magazine, Ingalls uses the word factoring in a software context. He mentions factoring as one of the design principles behind Smalltalk, supported via class inheritance. He defines factoring thus, "Each independent component in a system should appear in only one place." Without factoring, it would be hard to keep interdependent components synchronized and consistent. Factoring makes it easier to locate and maintain the component. The concept of refactoring itself comes from mathematics where a complex algebraic expression might be factored into a simpler and equivalent expression.

1986

R.S. Arnold uses the term software restructuring that's about incrementally making changes to software internals as a way to manage its increasing complexity. It's been noted that the term refactoring came to be used a little later in the context of object-oriented software development.

1990

William Opdyke and Ralph Johnson present a conference paper titled Refactoring: An Aid in Designing Application Frameworks and Evolving Object-Oriented Systems. This is possibly the first use of the term refactoring in published literature. Indeed, refactoring is initially adopted for object-oriented programs.

1991

William Griswold publishes his PhD dissertation on the topic of refactoring functional and procedural programs. A year later, William Opdyke's own dissertation does the same for object-oriented programming.

Jul
1999

Fowler et al. publish their book titled Refactoring: Improving the Design of Existing Code. After introducing the topic, the book describes over 70 refactoring tips. The word refactoring is defined both as a noun and as a verb. In the years to come, this book influences software development. IDEs go on to implement many of the practices to automate refactoring. The second edition of the book appears in 2018.

2006

Although many tools exist, developers might not use them if the tools aren't aligned with the refactoring tactics they prefer. A survey of 41 users of Eclipse IDE for Java development finds that only a few of them use these tools. Simpler techniques such as Rename and Move are more often used than more complex ones such as IntroduceFactory and PushDown.

## Discussion

• When does it make sense for me to refactor working code?

Given any working code, developers may be hesitant to change code unnecessarily. But refactoring can still be useful. When program flow is not clear, refactoring improves clarity. Software becomes easier to update and maintain in the long run. We may also refactor to improve performance. Refactoring first can help developers add new features more easily. Sometimes refactoring is done to enable code reuse.

There's a common misconception that refactoring activities need to be scheduled into a project. While planned refactoring is possible, it's better to refactor continuously. Whenever developers detect bad code and they sense an opportunity to make it better, refactoring should be done. When small improvements to code are done continuously, there will hardly be a need to schedule refactoring as a separate task.

A common excuse is that refactoring takes time away from actual development. On the contrary, refactoring saves time in the long run. Software tends to degrade over time as complexity increases. Refactoring is a way to mitigate this.

• What are some indicators that my code may need refactoring?

As multiple developers work on the same codebase over many release cycles, complexity increases. There's less cohesion in terms of coding styles and design. Some call this code rot, characterized by duplicate code, myriad patches, and bad classifications. Other symptoms include unhealthy dependencies between classes or classes doing too many things.

In one project, 20 developers had created 65,000 lines of code over 5 years. The codebase had a lot of dead code, code pertaining to many older API versions, and classes with too many dependencies.

When a bug fix is required and the fix has to be made in multiple places, this indicates duplicated code. Refactoring can help here. Another example is when a bug fix itself introduces another bug. This implies fragile code that needs refactoring.

When a new feature request comes in, the current design might make it difficult to build this feature. This is another area where refactoring can help.

• What does it mean to say that refactoring shouldn't change external behaviour?

Preserving external behaviour means that, given an input and the current system state, the refactored software should produce the same output as it did before refactoring.

However, there are some specialized systems where input-output behaviour is insufficient. Other aspects of behaviour are just as important and must be preserved during refactoring. We note three such systems:

• Real-time software: Execution time is important. Refactoring should preserve all temporal constraints.
• Embedded software: Memory usage and power consumption are important. Refactoring should preserve these constraints.
• Safety-critical software: Concrete notions of safety should be preserved.
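
One way to check the input-output notion of behaviour preservation is to run the original and the refactored code over the same inputs and compare the results. The sketch below assumes a hypothetical `PriceCalculator` with a made-up bulk-discount rule; neither the class nor the rule comes from the sources cited here. It says nothing about timing, memory or safety constraints, which would need their own measurements.

```java
// Hypothetical example: the class, methods and discount rule are illustrative only.
final class PriceCalculator {
    // Original version: discount logic written inline.
    static double totalLegacy(double unitPrice, int quantity) {
        double total = unitPrice * quantity;
        if (quantity >= 10) {
            total = total - total * 0.1;
        }
        return total;
    }

    // Refactored version: the discount rule is extracted into its own method.
    static double totalRefactored(double unitPrice, int quantity) {
        double gross = unitPrice * quantity;
        return gross - discount(gross, quantity);
    }

    private static double discount(double gross, int quantity) {
        return quantity >= 10 ? gross * 0.1 : 0.0;
    }

    // Returns true if both versions agree on every sampled input.
    static boolean behaviourPreserved() {
        double[] prices = {0.0, 1.5, 20.0, 99.99};
        int[] quantities = {0, 1, 9, 10, 11, 100};
        for (double price : prices) {
            for (int quantity : quantities) {
                double before = totalLegacy(price, quantity);
                double after = totalRefactored(price, quantity);
                if (Math.abs(before - after) > 1e-9) {
                    return false;
                }
            }
        }
        return true;
    }
}
```

A sampled comparison like `PriceCalculator.behaviourPreserved()` can build confidence that behaviour is unchanged, but it is not a proof of equivalence.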

• Could you mention some case studies of code refactoring?

One team took a bottom-up approach to code refactoring by "extracting new classes or coercing existing classes to enforce single responsibility, loose coupling, testability, and low complexity". However, one area of the codebase had to be rewritten. They adopted an incremental TDD approach of writing unit tests, making some changes, and testing.

Another project team decided not to bring in new dependencies during refactoring. They identified concept boundaries (Admin, HR, User) and refactored within these boundaries.

A study from 2007 of a small team of mobile app developers showed that refactoring not only improves code quality but also improves productivity. Productivity was measured as lines of code written per hour.

One misconception is that refactoring can reduce performance. One study found that replacing conditional logic with polymorphism improved performance since compilers optimize polymorphic methods.
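
To make the refactoring named above concrete, here's a minimal sketch of replacing conditional logic with polymorphism. The `Shape`, `Circle` and `Square` names are hypothetical and chosen only for illustration. This is not the code measured in the study, and the sketch makes no performance claim of its own.

```java
// Before: behaviour is selected by a type code inside a conditional.
final class ShapeAreaBefore {
    static double area(String kind, double size) {
        switch (kind) {
            case "circle":
                return Math.PI * size * size;
            case "square":
                return size * size;
            default:
                throw new IllegalArgumentException("Unknown shape: " + kind);
        }
    }
}

// After: each subtype carries its own behaviour, so the conditional disappears.
interface Shape {
    double area();
}

final class Circle implements Shape {
    private final double radius;

    Circle(double radius) {
        this.radius = radius;
    }

    @Override
    public double area() {
        return Math.PI * radius * radius;
    }
}

final class Square implements Shape {
    private final double side;

    Square(double side) {
        this.side = side;
    }

    @Override
    public double area() {
        return side * side;
    }
}
```

With the polymorphic version, adding a new shape means adding a class rather than editing every `switch` that inspects the type code.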

Martin Fowler explains many use cases with original and refactored code. These are worth reading: loading JSON data returned by another service, calculating and formatting a bill for a video store, and refactoring to manage module dependencies.

• Is there a process that I can follow when refactoring my code?

Those who practice Agile methodology know that refactoring is part of Test-Driven Development (TDD). Refactoring is done only when current tests pass. This gives developers confidence to go ahead and refactor code. Should the refactored code fail some tests, developers can either fix the problem or revert to the working code. Tests ensure that code behaviour isn't affected by refactoring.

The essence of this approach is incremental refactoring and testing. Test after every small refactoring change. Testing after dozens of refactoring changes can be problematic. If tests fail, it will be hard to isolate the faulty refactoring. It's also clear that we shouldn't add new features or change behaviour while also refactoring. The process isolates adding functionality and refactoring.
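
The refactor-then-test loop can be sketched with a tiny regression suite that is re-run after every small change. Everything here is an assumption made for illustration: `RegressionSuite` is a hypothetical stand-in for a real test framework such as JUnit, and `Slug` is a made-up piece of code under refactoring.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;
import java.util.function.BooleanSupplier;

// Hypothetical stand-in for a unit-test framework.
final class RegressionSuite {
    private final Map<String, BooleanSupplier> checks = new LinkedHashMap<>();

    RegressionSuite add(String name, BooleanSupplier check) {
        checks.put(name, check);
        return this;
    }

    // Runs every check and returns the names of the ones that fail, in insertion order.
    List<String> failures() {
        List<String> failed = new ArrayList<>();
        for (Map.Entry<String, BooleanSupplier> entry : checks.entrySet()) {
            if (!entry.getValue().getAsBoolean()) {
                failed.add(entry.getKey());
            }
        }
        return failed;
    }
}

// Hypothetical code under refactoring: its observable behaviour must stay fixed.
final class Slug {
    static String of(String title) {
        return title.trim().toLowerCase(Locale.ROOT).replaceAll("\\s+", "-");
    }
}

final class RefactoringLoopDemo {
    static RegressionSuite suite() {
        return new RegressionSuite()
            .add("trims and lowercases", () -> Slug.of("  Clean Code ").equals("clean-code"))
            .add("collapses whitespace", () -> Slug.of("Code   Refactoring").equals("code-refactoring"));
    }

    public static void main(String[] args) {
        // Re-run after every small refactoring step; a non-empty list means fix or revert that step.
        System.out.println("Failing checks: " + suite().failures());
    }
}
```

Because the suite runs after each small step, a failing check points at the most recent change, which is what makes the faulty refactoring easy to isolate.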

Martin Fowler identifies different refactoring workflows: TDD Refactoring, Litter-Pickup Refactoring, Comprehension Refactoring, Preparatory Refactoring, Planned Refactoring, and Long-Term Refactoring. Frequent planned refactoring might indicate that the team isn't doing enough of the other workflows.

• Which are the different levels of code refactoring?

Refactoring can happen at three different levels:

• Code-level: Remove dead or unused code. Rename variables. Reduce number of method arguments by packaging them into a class. Convert global variables into class data members. Create access methods.
• Function-level: Merge and consolidate duplicated code. Merge similar code blocks into methods.
• Architecture-level: Create new class hierarchy. Reorganize responsibilities and encapsulation between subclasses and superclass. Refactor to make way for future changes. Refactor to interface better with databases, external services or frameworks.
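
As an illustration of one code-level item above, packaging method arguments into a class, here's a minimal sketch. The `ReportBefore`, `ReportInfo` and `ReportAfter` names and their fields are hypothetical and not taken from any cited source.

```java
// Before: five loosely related arguments that callers must pass in the right order.
final class ReportBefore {
    static String header(String title, String author, int year, int month, int day) {
        return title + " by " + author + " (" + year + "-" + month + "-" + day + ")";
    }
}

// After: the arguments travel together in one small immutable class.
final class ReportInfo {
    final String title;
    final String author;
    final int year;
    final int month;
    final int day;

    ReportInfo(String title, String author, int year, int month, int day) {
        this.title = title;
        this.author = author;
        this.year = year;
        this.month = month;
        this.day = day;
    }
}

final class ReportAfter {
    // Same output as ReportBefore.header, but with a single parameter.
    static String header(ReportInfo info) {
        return info.title + " by " + info.author
            + " (" + info.year + "-" + info.month + "-" + info.day + ")";
    }
}
```

Both versions build the same string. The parameter object mainly helps when the same group of values is passed to several methods.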

Some make the distinction between primitive refactoring and composite refactoring. The latter is a sequence of primitive refactorings. For example, Rename and Move are primitive refactorings. Others use the terms low-level refactoring and high-level refactoring, or equivalently in a dental metaphor, floss refactoring and root canal refactoring.

• What are some techniques to refactor code?

Refactoring Guru explains a number of refactoring techniques, of which we mention a few:

• Composing Methods: Refactor long methods into smaller method calls. Use local variables instead of modifying method parameters. Move duplicated code into a method.
• Moving Features between Objects: If a method/field is called more often by another class, move it into that class. Refactor to multiple single-purpose classes if a class is doing many different things. Remove a method that simply calls another method.
• Organizing Data: Know when to use references and when to use value objects. Make a public field private with access methods.
• Simplifying Conditional Expressions: Instead of many if-else statements or complex expressions, use a method call that returns a boolean result. In loops, use break, continue or return instead of control flags.
• Simplifying Method Calls: Remove unused method parameters. Replace methods fivePercentRaise() and tenPercentRaise() with the parameterized method raise(percentage).
• Dealing with Generalization: Move common methods/fields from subclasses to the superclass. Create a subclass for a feature that's used only in some scenarios.
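
Two of the techniques listed above can be sketched in a few lines: a parameterized `raise(percentage)` replacing `fivePercentRaise()` and `tenPercentRaise()`, and a boolean-returning method replacing a complex conditional expression. The `Employee` class and its bonus rule are assumptions made up for this example.

```java
// Hypothetical class: fields and the bonus rule are illustrative only.
final class Employee {
    private double salary;
    private final int yearsOfService;
    private final boolean onProbation;

    Employee(double salary, int yearsOfService, boolean onProbation) {
        this.salary = salary;
        this.yearsOfService = yearsOfService;
        this.onProbation = onProbation;
    }

    // Simplifying Method Calls: one parameterized method replaces
    // fivePercentRaise() and tenPercentRaise().
    void raise(double percentage) {
        salary *= 1 + percentage / 100.0;
    }

    // Simplifying Conditional Expressions: the compound condition gets a name,
    // so callers write if (employee.isEligibleForBonus()) instead of repeating it.
    boolean isEligibleForBonus() {
        return yearsOfService >= 2 && !onProbation;
    }

    double getSalary() {
        return salary;
    }
}
```

Callers that previously invoked `tenPercentRaise()` would call `raise(10)` instead, and any new percentage needs no new method.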

• Which are some best practices for code refactoring?

Refactor only if you have good regression tests and the tests are passing. If some tests are lacking, add extra tests. It's always better to refactor before adding new features since tests exist only for current features. Refactor after delivery and before starting the next release cycle. One practical problem is that some tests might depend on the old program structure and would need to be rewritten.

Use your best judgement to decide what to refactor now and what could be done later. Refactoring too much at once can impact software delivery. Always use an incremental approach.

In one research study, Analytic Hierarchy Process (AHP) was applied to rank different refactoring techniques and apply those that work best for the codebase. A multi-objective search-based approach has also been proposed.

When team members are working on different feature branches, refactoring can make merges difficult. If some members own some pieces of code, there will be reluctance to refactor their code. These are barriers to refactoring. Work as a team to see how these can be overcome.

Don't refactor just because you love writing new code or wish to use a coding pattern you've just learned.

• Are there tools to automate code refactoring?

Smalltalk's Refactoring Browser (released in 1997) was well-liked and adopted by Smalltalk developers. This was followed by refactoring tools for Java and C#.

By 2020, most IDEs supported code refactoring. These include Eclipse, Visual Studio, Xcode, Squeak, Visual Studio Code, IntelliJ-based IDEs (AppCode, IntelliJ IDEA, PyCharm, WebStorm), and more.

Back in 2003, it was noted that creating a refactoring tool for C++ was difficult. Refactoring makes use of the program's Abstract Syntax Tree (AST) but macros and templates add complexity. In fact, it's been said that any tool that works for the whole of C++ would be AI-complete. A subset of refactoring techniques for C++ is supported by Visual Studio.

Automatic refactoring is aided by static type information. A codebase's development history could be used to identify refactoring opportunities.

## Sample Code

```java
// Source: https://refactoring.guru/extract-method
// Accessed: 2020-04-28
// Before
void printOwing() {
    printBanner();

    // Print details.
    System.out.println("name: " + name);
    System.out.println("amount: " + getOutstanding());
}

// After refactoring (Extract Method)
void printOwing() {
    printBanner();
    printDetails(getOutstanding());
}

void printDetails(double outstanding) {
    System.out.println("name: " + name);
    System.out.println("amount: " + outstanding);
}
```

## Tags

• Code Refactoring for the Cloud
• Database Refactoring
• Software Design Patterns
• Clean Code
• Code Quality Metrics
• Static Code Analysis


## Cite As

Devopedia. 2020. "Code Refactoring." Version 3, April 29. Accessed 2020-10-21. https://devopedia.org/code-refactoring