The Effects of Native Language on Requirements Quality

[Context and motivation] Software development projects increasingly involve participants of diverse nationalities and languages. Thus, software companies tend to use English as their business language. Moreover, to better prepare for future jobs, students consciously choose university courses in English. [Question/problem] As a result, there is an increasing number of software engineers who are working or studying in a language which is not their native language. The question arises whether native language has an effect on the quality of natural language requirements. [Principal ideas/results] From the analysis of the requirements formulated by 44 participants of our empirical study, it follows that native language may have a negative effect on requirements quality, e.g., ambiguity, verifiability, and grammar issues. Furthermore, different native languages might lead to different quality issues. [Contribution] In order to prevent quality issues, our findings might be used by educators to adjust their materials to cater to different language groups, while practitioners might use them to improve their requirements review process.


I. INTRODUCTION
Software engineering is a diverse field, both in terms of research areas and worker backgrounds. This diversity is present in the industry, and companies are increasingly using English as their business language, no matter what country they are based in. University students are also globally mobile, and many of those who have the means choose to study all or part of their higher education abroad in English. This means that there is an increasing number of software engineers who are working or studying in a language that is not their native language.
Software engineers often use requirements specifications, either writing them or developing systems from them, and the quality of the specification could determine the quality of the end product. The success of a software development project is said to depend on the quality of its requirements specification [1], [2]. Requirements are often written in natural language and, thus, the language used in a requirement could also have an effect on the quality of the specification.
The purpose of this study is to analyze natural language requirements written in English to determine (1) whether an author's native language has an effect on the quality of these requirements, and (2) which qualities are affected. In this paper, the term "native language" is defined as the language of the country in which a person is born, raised, and receives early years of education. In an agile context, natural language requirements can be written either in the Software Requirements Specification (SRS) style or as user stories.
The findings from this study could support industry practitioners, research, and requirements engineering education. Targeted teaching and training could be developed to improve not only the overall quality of requirements but also to focus on the qualities that native speakers of particular languages frequently have problems with. The study outcome could also help companies with requirements review processes and with the definition of quality checklists to identify or avoid requirements issues early on in development.

II. BACKGROUND AND RELATED WORK
The IEEE Recommended Practice for Software Requirements Specifications [3] presents guidelines on how to produce "good" natural language SRS-style requirements. The guidelines detail nine characteristics that individual requirements should possess and five characteristics that a set of requirements should have. The recommended practice states that individual requirements should be: necessary; appropriate; unambiguous; complete; singular; feasible; verifiable; correct; and conforming (when applicable). A set of requirements should be: complete; consistent; feasible; comprehensible; and able to be validated. If an individual requirement or set of requirements violates one or more of these qualities, then it is not considered to be "good".
The INVEST criteria, originally discussed by Wake in 2003 [4], are specifically for evaluating the quality of user stories, rather than SRS-style requirements. According to the criteria, a user story should be: independent, negotiable, valuable, estimable, small, and testable [5]. If the story does not meet one or more of these criteria, then it is not of good quality.
There is a large body of work on requirements quality, with some studies focusing on specific qualities of a requirements specification and others giving a broader overview of what quality might be. Kiyavitskaya et al. [1] and Fabbrini et al. [6] take a detailed linguistic approach to identify ambiguity in requirements specifications. Antinyan et al. [7] focus on a different requirements quality and developed a metric to measure the complexity of a requirement. Taking a broader look at all of the potential qualities of a requirements specification, Knauss et al. [8] developed a GQM approach to improving requirements quality. Genova et al. [2] also took a wider view of which requirements qualities to consider when creating their framework and tool for improving the quality of a requirements specification. However, while these studies were conducted in English, none of them looked at the linguistic background of the participants.

III. RESEARCH METHODOLOGY
Research Questions. Our study aims to answer the following research questions:
RQ1: Does the native language have an effect on the quality of natural language requirements?
• RQ1.1: Which requirements qualities are affected?
• RQ1.2: Do any particular languages have greater effects on requirements quality?
Participants and Data Collection. We aimed to find participants who had a software engineering background, and who could potentially be asked to write requirements. The participants were selected on the basis of convenience sampling. Survey participants were reached via the REFSQ 2022 conference, LinkedIn, Facebook, Twitter, Discord, and email, and via sharing the survey link with the students studying software engineering at the universities the authors work for. Thus, the participants were a mix of students, researchers, and industry practitioners within software engineering. We created an online survey hosted on sosci.de. The survey was piloted by two representatives of the study participants. We decided to ask two students (those who might have the lowest experience with requirements) who gave feedback which was used to refine the survey questions. The first five questions in the survey were demographic questions. The sixth question was a simple domain description after which the participant was asked to write five natural language requirements (either SRS style or user stories) for the example domain. The survey questions and study material are available online [9].
Data Analysis. The qualitative data was analyzed using thematic coding as per Saldana [10] with two coding iterations. The thematic coding process used a coding dictionary that we created, which covered violations of any of a selected subset of the IEEE characteristics of individual requirements [3] or four of the INVEST criteria for user stories [4], [5]. When analyzing SRS-style requirements, we used the 2018 IEEE guidelines [3] that detail the characteristics good individual requirements should possess: correct; unambiguous; verifiable; necessary; appropriate; complete; singular; and feasible. The characteristic of "conforming" was not included in the analysis as the participants in the case study were not given a set template or writing style to follow. We chose to exclude the five characteristics for a set of requirements as we only asked participants to provide a sample of requirements rather than a complete requirements specification, and we did the analysis on each individual requirement. We also looked at whether a requirement is vague because we felt that being imprecise might not necessarily mean the requirement is ambiguous or unverifiable; it may just need more details or explanation.
For user story analysis, we used the INVEST criteria [4], [5]. "Independent" was excluded as it would require evaluation of the user stories as a set, while analysis was conducted on individual user stories. We also made note of whether the user story was correctly formed according to the Agile Alliance user story template [5]. As SRS-style requirements and user stories have different purposes and quality criteria, we did not use the SRS-style characteristics to analyze user stories, and the INVEST criteria were not applied to SRS-style requirements. The requirements in this study are in written form, and so we also considered language quality as a contributor to the overall requirements quality. Therefore, we applied codes for typos and grammar issues.
After the first author completed the first analysis pass, a sample of 10 randomly chosen responses (a total of 50 requirements) was analyzed by the second author. Then, we came together to discuss any differences and how to improve the coding book. Coding was redone by the first author based on these discussions. Tab. I shows three examples of requirements received in the survey and the final codes that were applied. The final coding book with examples is available online [9]. Fig. 1 gives an overview of the thematic codes.
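The double-coding step described above can be illustrated with a small sketch. The paper does not report an agreement statistic, so the Jaccard overlap used here is an assumption chosen for illustration, as are the function names, the seed, and the example codes; the point is only to show how a random sample of responses might be drawn and how differing code sets could be flagged for discussion.

```python
import random


def sample_responses(response_ids, k, seed=0):
    """Draw k responses at random for the second coder (the seed is an assumption)."""
    rng = random.Random(seed)
    return sorted(rng.sample(list(response_ids), k))


def jaccard(a, b):
    """Overlap between two sets of codes; two empty sets count as full agreement."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_coders(first, second):
    """Return the mean Jaccard agreement and the requirement ids that need discussion."""
    shared = sorted(set(first) & set(second))
    scores = {rid: jaccard(first[rid], second[rid]) for rid in shared}
    disagreements = [rid for rid in shared if scores[rid] < 1.0]
    mean = sum(scores.values()) / len(scores) if scores else float("nan")
    return mean, disagreements


# Hypothetical codes for three requirements (not data from the study).
first_coder = {"R1": {"ambiguous"}, "R2": {"unverifiable", "grammar issue"}, "R3": set()}
second_coder = {"R1": {"ambiguous"}, "R2": {"unverifiable"}, "R3": set()}

mean_agreement, to_discuss = compare_coders(first_coder, second_coder)
print(round(mean_agreement, 2), to_discuss)  # 0.83 ['R2']
```

A set-based measure is used in this sketch because a single requirement can carry several codes at once; other agreement measures would be equally defensible.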

IV. RESULTS
Forty-seven people answered the survey. However, three respondents did not complete the requirements writing task sufficiently; therefore, 44 survey responses were considered for the analysis, with 220 requirements in total. For simplicity, and to aid comparison, we report percentages over all collected requirements (user stories and SRS-style ones), even though not all errors are applicable to all requirements.
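The reporting convention above (percentages taken over all collected requirements) can be sketched as follows. This is a minimal illustration with made-up data; `code_rates` and the sample codes are hypothetical names, not part of the study material. Because one requirement may carry several codes, the resulting percentages need not sum to 100.

```python
from collections import Counter


def code_rates(coded_requirements):
    """Percentage of requirements that carry each code (denominator: all requirements)."""
    total = len(coded_requirements)
    counts = Counter(code for codes in coded_requirements for code in set(codes))
    return {code: round(100 * n / total, 2) for code, n in counts.items()}


# Hypothetical coding of four requirements; an empty set means no issue was found.
sample = [
    {"unverifiable", "ambiguous"},
    {"unverifiable", "grammar issue"},
    set(),
    {"ambiguous"},
]

rates = code_rates(sample)
print(rates["unverifiable"], rates["ambiguous"], rates["grammar issue"])  # 50.0 50.0 25.0
```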
Respondent Demographics. Fig. 2 shows the native languages of our respondents. The majority of respondents had Polish as a native language, due to the third author sharing the survey link with Master's students specializing in software engineering. Swedish, Chinese, and English were the next most common native languages of respondents. Although there are many Chinese dialects and languages, the Chinese-speaking participants are known to be students of Beijing University of Technology, where the language of instruction is Beijing Mandarin.
In terms of roles within software engineering, 22/44 respondents were students of master-level studies who might be treated as novice requirements engineers. Industry practitioners were the next largest group with 8 participants, and there were also 5 researchers. 9/44 respondents had multiple roles within software engineering: 6 were both a student and an industry practitioner; 2 were both a student and a researcher; and one person was an industry practitioner and a researcher.
Among the 14 respondents who selected the industry practitioner role as either their only role or as one of their multiple job roles, 4 stated their roles as "Developer" and 3 "Software Developer". There was one answer each for the following roles: "Senior Software Engineer"; "software engineer"; "System Architect"; "Technical project manager";
914 PROCEEDINGS OF THE FEDCSIS. WARSAW, POLAND, 2023
Of the 220 requirements (five from each of the 44 participants) given in the online survey, 233 codes were applied. This means that multiple codes were applied to some requirements. The four codes that were applied the most were: unverifiable (25.91% of all requirements); ambiguous (21.82%); grammar issue (18.64%); and incorrect format (11.82%).
Looking at Table II, the native Chinese speakers had by far the highest percentage of occurrence of unverifiable codes (46.67%). The native Arabic speakers had the second highest percentage (30%), and the native Polish speakers had the third highest percentage of unverifiable requirements with 28.24%. Native Arabic speakers had the highest percentage of ambiguity occurrences, with 50% of the requirements given being coded as ambiguous. The Polish native speakers had the second highest percentage of ambiguous code occurrences with 28.24%.
Observation 2: The four requirements qualities that were affected the most are: verifiability, unambiguity, grammar correctness, and correct format.
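A per-language breakdown like the one discussed above can be computed by grouping requirements by the author's native language. The sketch below is an assumption about how such a table might be derived, not the authors' actual procedure; the language labels "A" and "B" and the records are invented for illustration.

```python
from collections import defaultdict


def rate_by_language(records, code):
    """Percentage of each language group's requirements that carry the given code."""
    totals = defaultdict(int)
    hits = defaultdict(int)
    for language, codes in records:
        totals[language] += 1
        if code in codes:
            hits[language] += 1
    return {lang: round(100 * hits[lang] / totals[lang], 2) for lang in totals}


# Hypothetical records: (native language of the author, codes applied to one requirement).
records = [
    ("A", {"unverifiable"}),
    ("A", set()),
    ("A", {"ambiguous"}),
    ("B", {"unverifiable", "ambiguous"}),
    ("B", set()),
]

print(rate_by_language(records, "unverifiable"))  # {'A': 33.33, 'B': 50.0}
```

With small language groups, a single coded requirement shifts the percentage considerably, which is one reason to read per-language figures with caution.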
Observation 3: Native speakers of Polish, Arabic, and Chinese introduced the highest number of errors.
Other Factors. In our survey, we collected data on other factors such as level of education, number of languages spoken, and mother tongue. We found that holding a Bachelor's degree as the highest level of education and speaking four or more languages had a negative effect on requirements quality. This data is omitted for space reasons, but results are available online [9].

V. DISCUSSION
All participants in the study made requirements quality errors, regardless of their native language. However, being a native speaker of Chinese, Arabic, or Polish may have a negative influence on the quality of requirements that are written by those speakers. Two of these three languages have a writing system that is entirely different from English, which uses the Roman alphabet.
Unverifiability was the most common error made by the study participants and is a quality that often concerns Non-Functional Requirements (NFRs). The second most common error was ambiguity. Although, as mentioned in Section II, ambiguity is a widely researched topic within software engineering [11], [12], [1], [13], [14], the results of the present study suggest that continued research and education in this area is still needed.
The third most common error, grammar issues, could also be considered to be connected to ambiguity in some cases. Introducing grammar-checking tools and proofreading into the requirements writing process might help in preventing these errors. The fourth was the incorrect format error type, as the survey participants did not use what is considered to be the standard user story format [5], [4]. Thus, using frameworks and tools for improving user story quality [15], [16] might be valuable.
Chinese, Arabic, and Polish appeared to have a greater negative effect on requirements quality than the rest of the languages in our study. However, we cannot determine the root cause of this observation. It is necessary to investigate whether requirements quality is affected by the native language itself (linguistic differences), the level of English education, education within software engineering, or other factors. Future studies that discover the root causes might deliver guidelines for requirements engineers and educators.

VI. THREATS TO VALIDITY
Internal: Thematic coding brings threats to validity because it is subjective in nature and subject to the bias and experience of the person doing the analysis. To mitigate this threat, the second author received the coding dictionary that we created and independently coded a sample of 20% of the requirements obtained in the study. The English level of participants was not treated as a variable in the study, but we had an inclusion criterion: the participants needed to have enough knowledge and skills to be able to either study or work in English.
External: The study may not have a large scope of generalisability because, even though the survey was shared with non-students, a large portion of the data collection was reliant on students. However, it could be argued that the results from student data could be indicative of the software engineering industry, as students frequently work and might be treated as novice employees.

VII. CONCLUSION
This study investigates whether native language has an effect on the quality of requirements. The results from the analysis of the online survey data suggest that native language may indeed have an effect on requirements quality as well as on the type of error introduced by the requirements writer. It follows from our study that more work and education are needed on improving verifiability and reducing ambiguity within requirements. Moreover, more training is also needed on how to write user stories so that they are well-formed. Grammar issues were also quite prevalent across all requirements. Our results might be used by practitioners to include quality checks for these errors in their review process and by educators to draw the attention of students to errors they might introduce and teach them how to prevent those errors. Moreover, researchers might use our results to investigate the root causes of why native speakers of some languages make more errors than native speakers of other languages.