Impact of Linguistic Antipatterns on Code Understanding

Context

Understanding a software system is a major activity during the maintenance and evolution of the system. Researchers have studied how developers understand source code and which factors reduce code understanding. They reported that shorter identifier names have a negative impact on code understanding. The use of short identifiers is a symptom of a broader naming problem. Consequently, researchers introduced linguistic antipatterns (LAs) to describe bad practices in the naming, documentation, and implementation of code entities, which can have a negative impact on code understanding.

Objective

We evaluated the impact of the presence of LAs and the effect of prior knowledge of LAs on code understanding. We also evaluated how different types of LAs affect the understanding of code entities.

Method

We examined seven different types of LAs through two experiments with 229 participants, who performed understanding tasks on code with and without occurrences of LAs, both before and after learning about LAs. We also asked participants to identify LAs in different code snippets. We assessed the participants' performance using three metrics: the correctness of their answers, the time they spent answering, and their perceived effort.

We performed two experiments to evaluate the impact of LAs on code understanding, first when participants had no knowledge of LAs and then after they had learned about them. Figure 1 shows our experimental process.

Figure 1: Experimental Process

The first experiment, called ExpBefore, involved participants without prior knowledge of LAs to assess the impact of LAs on code understanding. We then taught these participants about LAs and checked their knowledge with a quiz to ensure they understood LAs well. In the last step, the same participants took part in the second experiment, called ExpAfter, to investigate whether knowing LAs can improve code understanding in terms of correctness, time, and effort.

Results

We observed that LAs slightly reduce participants' understanding and that prior knowledge of LAs can mitigate this negative effect. We also observed that the LAs “Says one but contains many”, “Set method returns”, and “Attribute signature and comments are opposite” have the most negative impact on code understanding.
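
To make the three LAs named above concrete, the following Java sketch shows one plausible occurrence of each. The class Order and all of its members are hypothetical and written only for illustration; they are not taken from the code snippets used in the experiments.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical class used only to illustrate three LAs; not from the study material.
class Order {
    // LA "Says one but contains many": the singular name suggests a single
    // object, but the type shows that the attribute stores a collection.
    private final List<String> item = new ArrayList<>();

    // LA "Attribute signature and comments are opposite": the comment
    // contradicts what the attribute name says.
    /** True while the order is still open. */
    private boolean isClosed = false;

    private int quantity;

    // LA "Set method returns": readers usually expect a setter to return
    // void, but this one returns the previous value and does not document it.
    int setQuantity(int quantity) {
        int previous = this.quantity;
        this.quantity = quantity;
        return previous;
    }

    void addItem(String name) {
        item.add(name);
    }

    int itemCount() {
        return item.size();
    }

    boolean isClosed() {
        return isClosed;
    }
}
```

In each case the code compiles and runs, and the problem lies in the mismatch between what the name or comment promises and what the entity actually does. Under this reading, a repaired version would rename item to items, align the comment with isClosed, and either make setQuantity return void or document its return value.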

Conclusion

We conclude that LAs negatively affect code understanding and that actively learning about LAs improves it. It is therefore essential to raise awareness of LAs among students, as this can lead to improved code comprehension and correctness. Development teams should also consider (1) educating their team members about LAs, and (2) removing LAs from their software systems as soon as possible.