The Interagency Language Roundtable Scale
The approach to speaking proficiency testing used in the US federal government today is the outgrowth of approximately forty years of conceptualization, theoretical and applied research, and cooperative development of measurement instruments. These instruments are intended to determine how well the examinee can function in a variety of oral communication contexts typical of real-life language use. Interagency work in this area is proceeding at the Center for the Advancement of Language Learning (CALL) through its Federal Language Testing Board (FLTB), which is composed of representatives from each of six participating agencies (CIA, DIA, DLI, FBI, FSI, and NSA) and from the Interagency Language Roundtable (ILR). It must be clearly acknowledged, however, that these efforts depend in very large measure on the wide array of speaking proficiency testing activities carried out within and across a number of government and private-sector organizations over that period. Those activities, in turn, derived their essential focus and impetus from "breakthrough" testing efforts undertaken by the Foreign Service Institute (FSI) some four decades ago. A brief chronological account of the salient activities within this large and productive testing arena will help to situate current initiatives within their larger conceptual and operational context.

By the early 1950s, the Foreign Service Institute had grown dissatisfied with its own and other available language tests, which relied on discrete-point, achievement-oriented formats and item types. Such tests did not indicate in any useful or meaningful way how well the examinee could be expected to function in the real-life representational and other situations that he or she would regularly encounter during service abroad. The testing "breakthrough" conceived by FSI in response to this problem was the elaboration of a set of verbal descriptions defining six levels of general language proficiency, ranging from no functional proficiency in the language (Level 0) up to proficiency equivalent in all respects to that of an educated native speaker (Level 5).

The testing technique used to place examinees at the proper level within this 0-5 scale (subsequently expanded to include additional "plus" categories for each of the "base" levels 0-4) was, quite appropriately, a carefully conducted face-to-face conversation. This was intended and designed to reflect, as closely as possible, the language behaviors at issue in real-life communicative interchange, and at the same time to gather sufficient relevant information about the examinee's speaking abilities in the target language to permit the accurate "mapping" of the observed performance onto the verbal level descriptions.

Following its development, the "oral proficiency interview" and its corresponding rating scale were adopted and used by a number of major government agencies in addition to FSI, including the Defense Language Institute (DLI) and the language school of the Central Intelligence Agency (CIA). The latter organization was particularly active in the development of tester training materials and other documentation to further explicate the new elicitation and scoring techniques.

Growing requirements to test the speaking ability of its volunteers on a world-wide basis prompted the Peace Corps, in the early 1970s, to enter into an agreement with the Educational Testing Service (ETS) to develop instructional materials and procedures for training in-country Peace Corps staff who were native speakers of the target language to conduct and rate the "FSI"-type interview. A further dissemination initiative launched approximately ten years later was a series of so-called Testing Kit Workshops planned and coordinated by the then-dean of the FSI School of Language Studies, Dr. James R. Frith. These workshops were attended by a number of college and university language instructors, and familiarized a large number of academic leaders in the foreign language education field with the basic concepts and procedures of oral interview testing.

A further major step in the dissemination process was the launching, by ETS, of the so-called Common Yardstick project. Under a U.S. Department of Education grant, this project convened both academic and government agency representatives to review and discuss the "FSI test" and rating procedure and to evaluate its potential as a commonly understood and uniform metric for speaking proficiency assessment within the academic language training community. Although the group fully endorsed the measurement concepts underlying the FSI interview process, it also recognized that the 0-5 level rating scale, even including the additional "plus" values, was not fine-grained enough to reflect the relatively modest improvements in overall language performance that could reasonably be expected over the course of one or two years of high-school, or even college-level, language study. To address this issue, the Common Yardstick participants proposed a modified scale, subsequently further elaborated and jointly adopted by the American Council on the Teaching of Foreign Languages (ACTFL) and ETS. Under this new "ACTFL/ETS" scale, the original level 0 to 0+ range of the FSI scale was broken into three categories (Novice-Low, Novice-Mid, Novice-High), with the level 1 range in turn segmented into Intermediate-Low and Intermediate-Mid categories. The basic functional meanings of Levels 1+ through 2+ were retained in the ACTFL/ETS scale but were re-designated as Intermediate-High, Advanced, and Advanced-Plus. A single "Superior" category on the ACTFL/ETS scale was adopted to include all levels of the FSI scale from 3 upward.
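
The correspondence just described between the FSI levels and the ACTFL/ETS categories can be summarized as a small lookup table. The following is only an illustrative sketch: the names FSI_LEVELS, FSI_TO_ACTFL, and actfl_categories are hypothetical, and because the text does not say how the three Novice categories divide between levels 0 and 0+, the sketch treats 0 and 0+ as a single band.

```python
# FSI speaking levels: base levels 0-5 plus a "plus" level for each of 0-4.
FSI_LEVELS = ["0", "0+", "1", "1+", "2", "2+", "3", "3+", "4", "4+", "5"]

# Each entry pairs a band of FSI levels with the ACTFL/ETS categories covering it.
# Assumption: 0 and 0+ are kept as one band, since the source text gives only
# the range, not a level-by-level split of the Novice categories.
FSI_TO_ACTFL = [
    (("0", "0+"), ("Novice-Low", "Novice-Mid", "Novice-High")),
    (("1",), ("Intermediate-Low", "Intermediate-Mid")),
    (("1+",), ("Intermediate-High",)),
    (("2",), ("Advanced",)),
    (("2+",), ("Advanced-Plus",)),
    (("3", "3+", "4", "4+", "5"), ("Superior",)),
]


def actfl_categories(fsi_level):
    """Return the ACTFL/ETS categories that cover the given FSI level."""
    for band, categories in FSI_TO_ACTFL:
        if fsi_level in band:
            return categories
    raise ValueError(f"unknown FSI level: {fsi_level!r}")


if __name__ == "__main__":
    # Print the full correspondence, one FSI level per line.
    for level in FSI_LEVELS:
        print(level, "->", ", ".join(actfl_categories(level)))
```

The table makes the trade-off visible: the ACTFL/ETS scale offers more distinctions at the low end, where classroom learners are concentrated, and fewer at the high end, where all FSI levels from 3 upward fall into one category.
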

In addition to broadly disseminating the generic (non-language-specific) scale through its publications and other mechanisms, ACTFL subsequently developed a series of language-specific level descriptions for both commonly and less commonly taught languages, with the assistance of committees of high school and college instructors and other resource persons in these languages.

Also in the early 1980s, the Testing Committee of the Interagency Language Roundtable (ILR), with representation from all government agencies concerned, undertook the considerable task of developing draft proficiency guidelines for the other three language skills of listening comprehension, reading, and writing. In addition, the committee took the opportunity to review and fine-tune the ILR speaking descriptions. Finalized descriptions of the proficiency levels for all four skills, including, in each instance, functional examples of characteristic examinee performance at each level, were officially adopted by the ILR membership in 1983 and disseminated as the "Interagency Language Roundtable Language Skill Level Descriptions". In July 1985, these descriptions were ratified and promulgated by the Office of Personnel Management (OPM) as the official standards for documenting language proficiency within the U.S. government.

Language proficiency tests are used by any US government agency or organization that has an operational need to classify individuals with respect to oral communication ability in a target language. Potential examinees include job applicants, personnel being considered for language-requiring positions, students at government language training schools, and any other individuals whose current speaking competence in a foreign language the requesting organization has an operational need to verify.

The tests are intended to serve as accurate and valid indicators of the nature, extent, and degree of effectiveness with which the examinee is able to communicate orally in the target language, as defined by the Interagency Language Roundtable (ILR) language skill level descriptions. In accordance with these proficiency descriptions, an examinee receiving, for example, a rating of ILR level 2 would be considered able to "satisfy routine social demands and limited work requirements," "handle routine work-related interactions that are limited in scope," and in addition accomplish the other specified level 2 language-use requirements--all with the degree of accuracy, facility, and overall communicative effect characteristic of that particular level description.

A major objective of CALL's efforts in proficiency testing is to develop and maintain a high degree of uniformity of testing approach and corresponding equivalency of reported scores across participating agencies, through such techniques as joint training of testers, mutually agreed-upon validation and quality-control procedures, and increased emphasis on interagency cooperation and resource sharing with regard to oral testing activities.


An Unabridged Version of the ILR Scale

