You may want to read Testing in the Teaching-Learning Process. Part One
Testing in the Teaching-Learning Process. Part Two
Consuela Popa, Romania
Consuela Popa is an English teacher. In the past she taught both English and French. She has taught in state schools, high schools and secondary schools, at all levels and in classes of different profiles. She is interested in linguistic research, cultural studies and writing in English, and in the study of other languages such as French and Spanish. Christian theology, sociology and psychology are her other fields of interest. She greatly cherishes the opportunity of writing for HLT, since the attitudes and values discovered this way help the spirit grow and feed it. As artists, linguists should be aware that interdisciplinary aspects are unavoidable and that we should touch a variety of fields through our writing. E-mail: email@example.com
Criterion referenced tests and norm referenced tests
Indirect test type items and direct test type items
Subjective and objective testing: Skills and communicative testing
Tests can be classified in different ways, and there are different types of tests, according to the aims they serve in fulfilling the goals of instruction. One classification that I would like to look at now is the distinction between criterion referenced and norm referenced tests.
Basically, a criterion based test, or “criterion referenced” test, as the term has been established, is a test that aims to check how well the students have mastered the content of the subject matter. This checking of content, which is quantitative as well as qualitative, can yield a score or grade for a specific student, or for groups of students, not by comparing one test with another, or one student’s result with another’s, but by applying previously agreed criteria concerning the subject matter. Students must possess the mastery of content, the skills, or the competence necessary for their future profession. Hence, the skills or proficiency that someone has to demonstrate must meet certain agreed standards, according to which that person can be considered, let us say, “qualified”, “proficient”, or competent in the field (which could be anything: languages, but also medicine, technical subjects, etc.).
Certain scores or grades are set in order to describe the performance or proficiency criteria that reflect whether a person does or does not have the skills, or whether someone masters the content of the field or domain under examination. The results can be of the “pass” or “fail” type, or they can be expressed as grades such as A, B, C, D, as in Cambridge examinations, for instance. Other exams have different numerical systems of evaluation, like summative exams ranging from 12 to 20, or from 1 to 20 (20 being the maximum score in summative exams in France), or from 1 to 5 (Russia). In each case the percentage that reflects the agreed standards, quantitatively and qualitatively, has to be carefully established beforehand by specialists, researchers, examination boards and evaluation forums.
In Romania we have a grading system that ranges from 1 to 10, and 5 is the minimum required in order to be considered a “pass”, but practice varies between the different types of examination or evaluation during the school period. They vary in whether the total score can be “rounded up”, that is, raised to the next higher grade, or not. For instance, a 4.50 may eventually mean a 5 in the teacher’s grading, but not always: in other cases exam final grades are expressed with decimal fractions (e.g. 4.50, 5.66, etc.) and are left like that in order to reflect the students’ achievement. Thus, in terms of grading, this can be a criterion that works to the student’s benefit or not; it depends again on the type of examination or graded paper, or on the specific requirements of the subject matter in question. Whether we “add” some small fraction to the final mark in order to help the student pass, or whether we do not “add” anything, according to previously accepted standards, may express the value that a certain grade or exam has during the instructional period, or may reflect the stage in which we are.
More important summative exams in Romania, for instance the baccalaureate exams, tend to express the final mark together with its decimal part, as in the above examples (4.55, 9.80), while during the school year subject grades are rounded up according to specific rules (Maths has different rules from English, and the rules depend upon the class profile, the number of classes allotted per week, etc.). It can also depend upon the school type. In language forms, or in schools that place high importance on foreign language teaching or the humanities, with more language classes per week, there must be a final term/semester paper, a summative type of examination that is supposed to “round things up” in terms of results. For the final score we must consider and balance the grades obtained during the semester against the final test paper of that semester. The proportion that determines the “final grade”, or end of semester grade, in Mathematics, for instance, has varied slightly in its method and established criteria. In universities, too, there can be different standards, and for some colleges and specialties the formative “weight”, or value, of some disciplines, of the so called “partial exams”, or of training, is also expressed through grades, which add up in a certain pre-established way towards the final grade, the one that includes the more exhaustive examination at the end of an instruction period.
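The arithmetic described above can be illustrated with a small sketch. It is only an illustration under stated assumptions: the 25% weight given to the term paper and the function names are hypothetical, not an official Romanian rule; only the 1 to 10 scale, the pass mark of 5, and the contrast between rounded and unrounded marks come from the text.

```python
from decimal import Decimal, ROUND_HALF_UP

PASS_MARK = Decimal("5")  # minimum pass on the 1-10 scale mentioned in the text


def weighted_average(coursework, term_paper, paper_weight="0.25"):
    """Balance the mean of the semester marks against the term paper mark.

    paper_weight is a hypothetical weighting chosen for illustration only.
    """
    w = Decimal(paper_weight)
    mean = sum(Decimal(str(m)) for m in coursework) / len(coursework)
    return mean * (1 - w) + Decimal(str(term_paper)) * w


def rounded_grade(average):
    """School-year style: round to the nearest whole grade, halves going up."""
    return int(average.quantize(Decimal("1"), rounding=ROUND_HALF_UP))


def exact_grade(average):
    """Exam style: keep two decimals, as in marks such as 4.55 or 9.80."""
    return average.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


average = weighted_average([4, 5], 4.5)
print(rounded_grade(average) >= PASS_MARK)  # rounded mark reaches the pass mark
print(exact_grade(average) >= PASS_MARK)    # unrounded mark does not
```

The same average of 4.50 is a pass when the rounding convention applies and a fail when the mark is left as it is, which is exactly why the choice of convention matters to the student.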
Colleges and universities can place an increased interest upon certain formative types of evaluation, like the portfolio type. The portfolio should reflect the individual student’s specific work or added work samples throughout their learning period, and it is actually aimed at reflecting in the best possible way the autonomous learner’s personality, experience and creativity.
However, this issue has been strongly debated lately in some schools and countries.
Portfolio assessment was replaced by more official or formal, summative types of examination, because it did not necessarily reflect the student’s personal contribution, and some students failed to understand and treat portfolio assessment as the right opportunity for them to be themselves and to draw the intended benefits from it.
I do think, though, that there may be other voices out there who feel this is the best formative type of examination for them, especially among students with a raised awareness of their own responsibility and agency in learning. This issue could be the subject of further, very passionate debate, since it is controversial and did produce different effects on learners. The process of adopting or abandoning this type of learning and evaluation reflects the turbulent itinerary that portfolio practice has followed at different times and for different schools, subjects, and pupils, so the variety of ways in which it has been appreciated and used, and the degree of its effectiveness, are very important features to observe. I think good analysis and statistics can help methodologists and assessors to figure out what works best and when, for what particular learner types and in what (complex) circumstances.
Norm referenced evaluation has different aims. It is supposed to “discriminate” between one student’s level and another’s in an examination or other form of evaluation/assessment: to compare a particular student’s achievement against another’s, for instance, and also to compare the proportion of low achievers with that of high achievers on a distribution curve that shows how many students have scored at the lowest level, how many have scored average, and how many have scored high. This type of statistical curve, however, is not the only norm referenced type of assessment. I would say it is a means of gathering information about achievement and assessment alike. At this point, one important remark is that, in my research sources, it is not recommended to give this curve too much importance when we think about individual students’ achievements. That is because the items used in norm referenced tests might fail to reflect accurately the actual “differences” in students’ individual levels. Students’ levels might be well discriminated, one against another, and yet the students may have different reactions to test conditions and stress, different levels of creativity, mastery of content, or aptitude for one issue or another under consideration; the items may not cover a broad enough range of data to allow a correct judgment. Moreover, when statistics like this are attempted, the variety of test types from one school to another, from one state to another, or simply the variety of school types and profiles, pupil profiles, etc., may lead to misinterpretation and bad feedback, with no good source or “open door” for real conclusions.
But, leaving aside the idea of such a statistical curve, norm referenced assessment is generally known to be intended for placement or advancement programs, and its comparisons are generally meant to make it possible to place students accurately into groups or classes according to shared characteristics, characteristics that are supposed to help further instructional plans, strategies, techniques, interaction, collaboration, etc. Whereas criterion referenced assessment has in view the objectives and the mastery standards against which each individual student should be challenged, and students are scored or graded according to assessment scales designed for this specific goal, norm referenced assessment operates with standardized comparisons and considers each individual in relation to others.
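The contrast between the two approaches can be made concrete with a short sketch. It is a simplified illustration, not a description of how any examination board computes results: the function names are hypothetical, the default cut score of 5 is borrowed from the Romanian pass mark mentioned earlier, and the percentile rank used here is one simple definition among several in use.

```python
def criterion_result(score, cut_score=5):
    """Criterion referenced: compare the score with a previously agreed standard."""
    return "pass" if score >= cut_score else "fail"


def percentile_rank(score, cohort_scores):
    """Norm referenced: percentage of the cohort that scored below this score."""
    below = sum(1 for s in cohort_scores if s < score)
    return 100 * below / len(cohort_scores)


cohort = [4, 5, 6, 7, 9]  # invented marks for a small group
print(criterion_result(6))         # judged against the standard only
print(percentile_rank(6, cohort))  # judged against the other students
```

The first function never looks at the other students, while the second gives a different answer for the same mark whenever the cohort changes.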
Since assessment is aimed at informing teachers and specialists so that they can assist students in their learning (and in this respect formative assessment, as good practice that leads to success, keeps returning as an important tool and prerequisite for good instruction), norm referenced assessment, through its student placement goal, should serve as an important source of information regarding students. As a useful resource for observation and analysis, which assessment specialists, test designers, examination forums, and any schools that organize exams should know how to handle well, norm referenced assessment is an important “during learning” factor for evolution, or involution, depending upon how well it is used. It is difficult to determine now how much of an intermediary role norm referenced testing plays, or to what extent norm referenced assessment also reflects what we might call an initial stage or “starting point” of the learning process.
Norm referenced assessment could be seen as a means of initial placement as well, even if students who are grouped together according to certain characteristics through this placement procedure still come with individual learning styles. They come with different and unique personalities, which are meant to complement one another efficiently within the groups and classes that are then formed. Some authors and methodologists consider placement tests a formative assessment type, while others take this notion somewhat separately and treat placement as a particular stage in itself. However these interpretations might differ, placement tests are a norm referenced way of assessing, being a standardized means of comparison that serves a certain purpose in further instruction.
In order to place students and help them progress, and in order to assist their further development through cooperative relationships and efficient interactive instructional strategies, learning should mean not only learning alone, but also “learning together”: making full use of the wonderful possibility of sharing knowledge and information with others, of interacting with and enriching one another. Norm referenced assessment helps us monitor teaching and learning, provided that criterion referenced assessment is also placed high among our concerns and both of these practices are understood and used properly.
There are many test types, and over time these test types have been defined with some varying nuances. It is important to mention the definitions that methodology specialists have tried to give, some of them simply stating matters of fact, while others acknowledge the interpretations that may exist. These nuanced formulations do not dogmatically insist on establishing the nature of tests in a strict, “mathematical” way, because on reflection we realize that more than one interpretation could be given. I shall also attempt to do so in the essay below. I think common sense should allow stricter definitions in cases where the nature of the test type is obvious, while allowing other definitions a margin of interpretation, according to practice and experience, and to the reflections that help us realize that there is more out there than a formulaic, generally accepted methodology book definition.
I think every one of us should try to use critical thinking and demonstrate it with good arguments. I have to mention, though, that to my mind, with respect to the subject of “Testing”, there are many very strong opinions, great test designers, and great theoreticians of testing. But at the same time this subject is among the most troubled ones, precisely because there are methodologists, and more than just a few, who interpret the testing issue in too narrowly scientific or mathematical a manner, to the point of frustration and narrow dogmatism, even when they act as test designers and as scorers. They do not realize that, however “mathematical” or objective these issues might be, we still find ourselves within the realm of practice and experience, of language teaching methodology, which is first of all a humane, if you do not wish to call it humanistic, way of working with people and of teaching and improving through testing. The purpose of testing, that of shaping people and not deforming them, was agreed upon before, and should be kept in mind as a key factor. Other test designers, of course, might be fiercely against the dogmatic way of defining testing and of imposing practices that become atemporal and anti-student in character. As shown before in the section regarding the vital component of assessment analysis and evaluation of results, in the case of low achievement in exams, it is the testing itself, whether formative or summative, and/or the teaching practices that need to be altered!
Since there are personalities out there, great test designers, who should recognize themselves as liberal and open minded, while at the same time making no compromise with regard to the high demands of evaluation/assessment, I hail all those voices that have tried to make a difference, who not only gave us good “practice” samples but also made us think deeper. And they do have a point, in terms of the philosophy of testing, besides being what I should call “lofty technicians of the great art of testing and magicians or masters of testing”. To master the “magic” of testing and its techniques, one needs to know how the minds of our students work, and one must also know well how to devise such tests. Thinking and practice should go hand in hand. I am sure that they will know what I mean when they think back to the excellent feedback, whether in print or during conferences, that they gave to all teachers and trainees, including myself, who were interested in testing! In this way we see that, behind a test designer’s personality that voices techniques, strategies, and testing discipline, there is, and should be, an abstract thinker as well, one who empathizes with the students and test takers and who anticipates their possible reactions and their psychology. (For instance, Cambridge and IELTS test specialists and coursebook authors like Bob Obee have explained during conferences how behind any test item that we devise and solve there must be techniques and strategies that prove that we know how the minds of our learners work.)
Since I have mentioned earlier the issue of norm referenced tests and criterion referenced tests, and this classification is only one among several, the classification of tests according to the particular skill or area of knowledge that is tested also arises; this time, the distinction is made at the level of skills evaluation/assessment.
Thus, a test is considered to be “direct”, if it tests well one communicative skill that we are interested in, for instance. But as many authors have argued, the skills question is much more complicated than it seems and it goes deeper into further questions that will be left out there for further research.
Receptive skills (listening and reading) and productive skills (speaking and writing) should be understood as functioning well, so that we gain language competence or proficiency, only when they are seen as integrated. However, when the student shows a weakness in one area (for instance listening, or reading), and we notice that this holds him back from acquiring mastery of the other skills involved (productive ones, such as speaking or writing), then setting activities and focusing our attention on better preparing the student in that particular receptive or productive skill will help him “integrate” the receptive-productive cycle well into his mastery of the language and will push him forward. Sometimes it is the receptive skills that are deficient, because of lack of practice (and through practice we can also heal or mend inborn deficiencies or weaknesses). Or we can discover that the productive skills (speaking, writing, or both) have been insufficiently developed and that this influences the overall performance, language competence and fluency of our learners.
Indeed, although listening is acknowledged as a rather primordial language skill, coming first in importance, since listening is the first thing language learners do, it may be that it is the productive skills that are deficient in our learners, thus generating a need for academic language practice (polished, academic speaking and also polished writing).
Nevertheless, the primary reason why a learner is not able to speak fluently may be that his listening sessions have not been sufficient, appropriate, and efficient. Similarly, a learner’s writing skills may suffer because he or she has not practiced reading well enough, and has not been exposed to the written word, by means of visual and mental contact, long enough and well enough to acquire the necessary writing skills. The influence of receptive skills over productive skills and vice versa can make up a vicious circle. The reality of our learners’ performance and linguistic competence is a complex sum of factors having to do with individual receptive skills and with individual productive skills; it also has to do with the way these receptive and productive skills have been integrated and perceived, on the whole, by the learner. The reception-production cycle reflects an integrative phenomenon.
It is known that for particular language learners the receptive skills can differ in their level of acquisition and development (listening skills can be better developed than reading skills for some students, for instance), and the level of productive skills can differ too. One person might perform better in writing than in speaking, or vice versa; he can make a good impression in speaking sessions, while academically we can discover that he lacks the training needed for good writing skills; or he can perform quite well in all these skills, seen as an integrated whole. I think there is never enough of what we might call a “good amount” of testing, in terms of improving and developing someone’s level. In this respect, as practice is always required when speaking or writing in a foreign language, the formative value of testing for maintaining a certain level and for improving it is a reality that cannot be denied.
We must challenge ourselves to test our level as often as we can. No matter how academic and proficient a person might be, there will always be situations in which new words or notions arise. Language being a “living body”, if I might say so, we are bound to encounter new notions and vocabulary often enough. I mean here neologisms, words from the technical, scientific and science-fiction fields that have been formed (especially in American English) as a mirror of what happens every day around us, of technological development, and of social and economic evolution in general. At other times we might hear or read words, notions, phrases, or expressions that are not “new” to us, which we have come across or heard before, and yet we perceive them as freshly arisen. This is simply because such language has been left somehow “in the back corners of our mind”: we have, consciously or less consciously, “acquired” those words before, but we have not used them, and so they seem unknown to us. Such language may seem challenging to us, fresh, or “dry”, depending upon individual perception. Such less frequent words, when they arise, can stir the interest of people who hear chunks or bits of language in new situations. Even native speakers can come across linguistic elements, vocabulary, chunks of language, etc., that are completely new to them, or that have been somehow “activated”, or brought back to life, after much time.
Formative testing may, as I was trying to state, take the form of sequential skills assessment. For instance, we have discovered that a student’s speaking difficulty derives from his poor exposure to language, and therefore we decide that what he needs is more listening practice activities, so we concentrate more on this “isolated” aspect. But this aspect will not remain isolated, since a person is complete, whether as a learner personality or simply as a subject. Listening and speaking faculties and aptitudes must be seen as an integrated whole. The “isolated” listening skill that we focus on improving over a certain period, through practice activities, will reasonably and unavoidably rejoin the cycle, to be completed by the other skills: speaking, reading, writing.
The abilities and the interest that are supposed to be developed and “stirred”, depend one upon another in a globality of factors and circumstances, including affect and atmosphere, factors of humanistic teaching, basic principles, and we shall have to address one person’s language training by taking all these aspects together, along with the person’s personality traits and behavior/temperament.
Thus, testing one isolated aspect at one time, because we need that matter solved in order to be able to progress, may be one formative stage in our teaching-learning-assessment cycle, and we may address that language deficiency and work upon it, correct it, through direct testing. Direct testing can involve some sequences, aspects that involve one linguistic skill that we have mentioned (it does not matter which one, it can be anything), or direct testing can also involve integrated skills, two of them or all of them.
And as I have said, more space is left for further reflection and research into this matter, since psychologically speaking, a learner personality with whom we work, is complex and whole, and relationships between all in-born “gifts”, aptitudes, or talents, and between all faculties of one person, raise into discussion a matter of great concern that should be taken into account and debated. This sends us to cognitive psychology, behaviorist issues, sociolinguistics, fields of interest that might have remained yet unexplored and have to do with the background and the universe of our learners.
Direct testing, which is aimed at communicative skills, is not the only method of testing that should lead to growth and language development, because we might wish to build knowledge in a more complex way. As was said, we might wish, as teachers and testers, to explore and put into practice much more than what lies at the communicative level of our students; we might wish to go deeper, into language study and grammar knowledge, beyond the communicative (surface) skills.
That is when indirect testing occurs: when we wish to check the language level of our students by taking into account grammar issues, vocabulary, discourse management, text comprehension and mastery, and also “tricky” or otherwise delicate clues and details that can escape our students’ knowledge or awareness.
Examples of indirect test type items that can be used for such language reinforcement mentioned before, are: multiple choice test items, cloze sentences, transformation/ paraphrase, fill in, completion, different kinds of exercises, sentence re-ordering, true-false sentences, etc. In this way we can spot if there is one language item or grammar issue that had not been properly learnt previously which poses problems of fluency and we can remediate these problems through the alternation of indirect testing and direct testing.
Different stages during the teaching and learning process generate different types of tests. Traditionally, there are some typically recognized types of tests (although we can find diverse research sources, even posted online, that argue slightly differently).
But generally agreed, the common types of tests that are known as such in methodology books can be easily acknowledged as being used everywhere and by everybody. Such test types are, according to the stages during the teaching and learning process: placement tests, diagnostic tests, progress tests/ achievement tests, proficiency tests. For instance, in “The Practice of English Language Teaching”, by Jeremy Harmer, progress tests are taken somehow, as being the same as what we could call “achievement tests”. Some authors, of course, and any of us, can surely distinguish nuances, depending upon the time period when we test, the amount of knowledge/content, test conditions and the situation, the country or specific school circumstances that have experienced various types of assessment.
Thus, as said before, placement tests are a norm referenced assessment category that helps a school or a training/educational group or organization to sort students internally by level and skills (general linguistic level, for languages), so that classes or groups work at approximately the same level and a teaching and learning program can start well, with students organized by level. One class with a certain group of students can be considerably different from another, and there are richly developed criteria by which evaluators/teachers can devise placement tests so that students who are similar in aptitudes and level, or who can be on the same wavelength in a projected instructional course, interact at their best and help the further instructional stages meet their outcomes. Some methodologists treat placement tests as a separate category, while others suggest that placement tests are part of the formative stage (an initial stage or phase), and include placement testing in the formative testing type.
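As a rough illustration of sorting students into classes of approximately the same level, the following sketch assigns each student to a level group by placement score. Everything specific in it is an assumption made for the example: the 0 to 100 scale, the cut points of 40 and 70, the level names, and the student names are all hypothetical, and a real placement procedure would weigh far more than a single score.

```python
from bisect import bisect_right

# Hypothetical cut points on a 0-100 placement test and hypothetical level names.
CUT_POINTS = [40, 70]
LEVELS = ["beginner", "intermediate", "advanced"]


def place_students(scores):
    """Group student names by level band, given a {name: score} mapping."""
    groups = {level: [] for level in LEVELS}
    for name, score in scores.items():
        # bisect_right counts how many cut points the score has reached.
        level = LEVELS[bisect_right(CUT_POINTS, score)]
        groups[level].append(name)
    return groups


print(place_students({"Ana": 35, "Dan": 40, "Ioana": 82}))
```

A score that falls exactly on a cut point is placed in the higher group here; that is a design choice of the sketch, and an institution could just as reasonably decide the opposite.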
However you might consider this, I think that debate that leads to observation and reflection on the current realities of practice should lead us not to pass “sudden-death” sentences upon the test types, for such a result might be futile, but to admit reasonably that, on the continuum of teaching practice and testing, the nature and structure of evaluation, and its chronology, are at times a “mixed”, inter-related phenomenon, and that this phenomenon cannot be treated as existing independently, but only as a complex element in relationship with others. This relationship is absolutely necessary, for just as in physics, any experiment or practice situation, longer or shorter, has stages that are intrinsically linked, and there is no cause without effect or vice versa.
Placement tests being taken, we can move on towards our teaching schedule and purposes, once we have made up our classes and groups. These classes or groups should be constituted presumably upon good principles which should ensure that we shall work with similar level students and with subjects whose skills and knowledge can lead towards good collaboration and global, as well as individual goals achievement.
Diagnostic assessment is, again, a notion that some authors have intertwined with that of progress tests, still with a high formative value. When we assess students by having them take diagnostic tests, we want to check their strengths and weaknesses, whether these are more obvious or less obvious. We certainly do not want our learners’ weaknesses to become real hindrances in our teaching-learning process.
Some diagnostic tests attempt to eliminate even the least of such hindrances in teaching and learning, and their principles and purposes aim at letting teachers know in detail about any aspect, even a secondary one, or any psychological issue concerning their students that could prove of great (in)formative value in future. Diagnostic tests are meant to detect individually students’ knowledge gaps, level, potential difficulties, strong points and weak points, again, not in order to sanction them with low grades, but with a remedial value in view.
Teachers wish to remedy their students’ faults and want to make them progress, and once they have “diagnosed” their students’ weaknesses, strengths, and problems, they can decide what the next steps are. Teachers will decide how to teach them further, and whether or not to change the teaching style or testing style. They will decide what strategies and techniques are to be employed later on, in order for the remedial plan to be achieved and in order to “heal” students, make them move on and advance, and help them attain their goals and learning outcomes.
Further on, if we just attempt to advance in our teaching and learning process, and make students “polish” their learning styles or their performance in a particular area (in our case linguistic area), we could use testing or progress testing in the proper sense, (here I do not mean “diagnostic testing”, intended as above).
We can use progress testing in the sense that students should, through testing, formatively draw a “lesson” out of such progress tests, and learn more, and learn better as well, or discover new interesting issues, develop their skills, knowledge, and learner personality, enrich their experience and practice, and move towards the achievement of their learning outcomes. This is why such progress tests are treated by many authors and methodologists as “achievement tests”: they might be extended and quantitatively consistent, able to link, over a longer period of time, what has been previously acquired and learned with what has been perfected during the later stages of instruction. As I have argued before, any progress or achievement test (like those at the end of a longer unit or units of instruction, at the end of a term, etc.) should formatively “shape”, and not demotivate or sadden, students with the menace of “official grading”. Whether we have transmitted to our students the values and attitudes they should hold towards these grades is a matter of high importance for every teacher.
Achievement tests move somewhat towards the more “summative” type of testing, because their results are officially recorded (most of the time, especially in bigger tests or tests covering longer periods). They do so also because achievement tests, besides formatively attempting to generate progress, “sum up”, or round up, a body of information about the learning and teaching process and about how it has been carried out so far.
As I have shown before, formative and summative testing/assessment are “intrinsically linked”. On the “continuum” of learning, along the instructional or formative process, achievement or progress tests, taken at different stages and with different values or purposes in mind, can move towards a more formative or a more summative type of testing, or can encompass both.
While we advance towards the desired goals of learning, progress and achievement testing, formatively used, can also be “summatively” transformed when we really need to obtain a score reflecting a certain reality or level. Summative testing not only indicates level and the overall efficiency of teaching and learning; it is also needed for further career or job purposes, for official entrance to universities, for different countries’ formal (yet reliable) requirements for professionals, etc. In terms of its “visible”, official side, summative testing thus also has a practical use: it is required as a “certified” quantifier for our students’ future careers.
Proficiency tests, or language proficiency tests, are an example of summative tests that check general language ability and achievement; these tests may or may not follow a certain syllabus or “officially communicated” content. Examples such as the Cambridge and IELTS exams, TOEFL, and all kinds of language tests required in order to enter a foreign university, environment or specialty field are quite important and are practiced on a large scale. Some proficiency tests tend to relate more to a required level of general, standard English, while others, like the IELTS exams for professionals, prepare students for the mastery of specialized terminology and discourse.
Assessment can, however, take other forms than those mentioned before. One example of modern assessment is portfolio assessment. Portfolio assessment can require a great variety of items from students, and different teachers can give different tasks. Together with their students, teachers can also negotiate and agree upon what other items to include and what sort of creative work would fit best. The students’ free spirit will choose what to add to the portfolio in order to give a better image of their personality as individual learners and to “feed” the teacher with precious information about their personalities and learning styles.
Portfolio assessment has some clear benefits. Its characteristics are multiple; among the most important ones, I could list the following:
- it makes use of one major principle in teaching and learning, which is learner autonomy: the student can individually choose, with more or less guidance from the teacher depending upon the situation, to add to the portfolio the works, pieces of information, solved problems, exercises, writings, tasks, etc., that reflect not only his strong points but also his weak points, his level of creativity, interests, passions, etc.;
- portfolio assessment can help students identify their goals in learning and can, through the previously mentioned principle of “learner autonomy”, drive students towards a better understanding of their learning styles, work, etc.;
- portfolio assessment is a way of driving students towards the goals and purposes of their learning as a whole, goals that include the acquisition of lifelong values and attitudes, as well as practical skills with respect to the subject studied, in our case language skills.
Language skills are directly connected to what we might call the communicative, real-life ability to interact, and to personalize interaction, in the target language.
For instance, improvement in communicative skills and creative work can be reflected in a communicative activity delivered in front of an audience once the portfolio work is finished. This is a way of presenting the portfolio, or items and sections of it, directly to people with whom students can interact through questions, debate, interviewing techniques, etc.
Portfolio assessment is a form of assessment that can help students become autonomous through self-evaluation. This principle is not without its pitfalls, of course, but it is supposed to bring teachers and students, both before and after the portfolio work, towards better communication and a mutual exchange of information. In other types of assessment, this type of information is restricted to a one-way flow from students to teachers, exclusively or almost exclusively. Portfolio work obliges students to enter the realm of self-assessment. As they evaluate themselves, they are bound to learn more about the principles of testing, about global assessment scales, and about the balanced elements that are needed when evaluating a piece of work that belongs to their portfolio. For instance, they might learn more about the demands and requirements of academic writing through their individual essay work (especially at higher levels, they can gain a lot of experience in all essay types, from narrative-descriptive to argumentative and opinion essays, to reflective and abstract essays). When they act as their own assessors, students are able to do their own research, through web quests as well as classical bibliographic research, in order to understand the demands and ways of structuring essays. When students are self-assessing, they also become, besides researchers and discoverers, methodologists and, in a way, teachers!
Teachers should take more time to train students to take up this posture of self-assessors of their own work, and explain to them how they could grade and assess themselves in an overall way in extended work, in academic essays, in creativity samples, or in other, varied forms of individual work. Students can thus not only get accustomed to different exam techniques; through individual portfolios they are also given the time and space to search for a better way of assessing themselves, to enter the realm of teachers as far as assessment is concerned, and to pick up some very important and interesting pedagogical insights into evaluation, its principles, and methodology.
The topic of self-assessment can give rise to challenging debates. Even if teachers and trainers come away with different feelings and opinions about how students understand and practice it, this does not mean that it should be left out just because it can lead to unanswered or hard-to-answer questions, if not touchy aspects on the whole. Just like individual learning or assessment, the principles of self-assessment that students should reflect upon and try to adopt should mirror the specific practices and philosophies of each learning environment, and the teaching culture of that specific place.
Among the very interesting issues in testing are the widely debated ones of objective and subjective testing, of testing approaches and their evolution, and of the relationship between the acquisition and development of skills and the types of tests/tasks we use to reinforce them. Teaching and learning should go hand in hand with testing, as I mentioned before, and using good and appropriate test samples is a very important aspect, influencing our teaching effectiveness and results as well as the learning process of our students.
The acquisition and development of good linguistic skills depend to a great extent upon how we design and use test/task samples. The development of receptive skills should go together with that of productive skills. Consequently, we have test types according to our need to focus on a particular receptive or productive skill. Listening and reading, as receptive skills, cannot lead to language proficiency, of course, unless they are properly combined with the reinforcement of the other two linguistic skills, the productive ones, which are speaking and writing.
Integrating skills is a must for learning a language well, since no one can acquire proficiency while concentrating upon one isolated language skill alone, or while ignoring one of them. While certain learners may be less endowed with respect to a particular skill and more gifted in others, when we train our students we still integrate all four skills and expect them to respond to all of them. Proficiency tests and achievement tests, as well as other types of tests, for instance those that aim at checking the communicative competence of our learners, usually integrate all four skills, or at least two of them in more isolated test samples. Objective tests are usually designed for receptive skills, while subjective tests are designed for productive skills. And since we have mentioned the subject of humanistic testing: from a formative point of view, if we use tests efficiently, we could structure our teaching and assessment by following a task-based learning model (the test being seen as a task). Many exercises and practice examples fit into this category of task-based learning/teaching, and we will need to explore this a bit more later on.
Since we need to integrate skills in order to acquire language proficiency, and we need both objective and subjective testing in order to assess mastery of the language in its entirety, another classification arises: that of discrete point testing (when we test one isolated aspect of language) and integrative (or global-integrative) testing, when we wish to capture a complete image of everything related to a thorough mastery of the language. Global integrative testing should aim at increased validity and reliability, since assessing all four skills demands professionalism both in creating test samples and in administering these tests to our real-life subjects. Global integrative testing is practiced at its best when the level of our students is more advanced, even though the practice and testing of all skills exists even at the incipient stages of language teaching. Discrete point testing, on the other hand, tends to acquire a formative value, being perceived as a necessary “on the track”, or “while learning”, ingredient.
For instance, we cannot move on to help our learners progress in their creative writing abilities unless we have made sure that they possess enough knowledge of grammar or discourse to allow their minds to run freely, not anxiously, and to use linguistic patterns and ideas with confidence, ease and fluency (at advanced stages). At such moments, we might need to check particular aspects that our learners do not know, or to dig in and investigate the particular areas in which they have difficulties. We need a complete picture of what exactly they are missing and of how their minds worked when they failed to acquire, “learn” or accomplish a particular aspect of a language area. These gaps hinder their further learning and language mastery. Discrete point testing should help us find out about their deficiencies, and provide a basis for us to explore and analyze their mistakes and to draw important information out of their test performance. This information is meant to help us in our teaching, when we want to help them recover, improve their level, and fill their previous gaps.
Checking productive skills through subjective testing means, for example, checking our students’ achievement in written productions such as essays or compositions, or checking their speaking level by setting a Cambridge-type speaking test, in which they have to perform orally by relating to their own experience a photo that we give them as a starting point for free debate, composition, etc. These are only examples, and the list can continue if we let our imagination enrich the range of test samples and of possible demands on our students.
Productive skills tasks/tests, for both speaking and writing, also require increased imagination and experience from the teacher, since when setting up such activities the teacher must do his best to avoid boring, repetitive, or unchallenging topics. When it comes to creative writing, for instance, especially in a humanistic class or during humanistic types of activities, the teacher must know that there is a risk of formulating composition or essay titles that some students will not like, and that some strong-willed students will probably resent some topics or speak out against one topic or another. It is the humanistic teacher’s task to observe the learners psychologically and respect their opinions, to change strategies and methods if necessary, to alternate activities, and to ensure variety and authenticity in the tasks and test demands.
I do not mean by this that the teacher should change everything just to follow the whims of some spoiled pupils with problem behavior, but we should responsibly make sure that we understand the causes of students’ negative reactions to some test demands. When low achievers express their opinions, we should make the most of this, trying to understand the real cause of their failure and of their lack of motivation, interest or attention in the task that we give them.
Subjective testing for checking productive skills is not without its flaws, however. That is why, to prevent teachers from acting negatively towards their students when they set such tests or tasks, teachers are taught how to objectify their subjective test scoring. From country to country, the means of objectifying such subjective tests might vary, depending upon the specifics or demands of the subjective test in question, the teaching culture, the situation in which students take the test, the marking system, and other factors. But generally speaking, there are two “tools” by which we can objectify our subjective testing, and these instruments are known in methodology as the “detailed marking scale” and the “yardstick”.
The detailed marking scale deserves a longer analysis, but I shall try to list a few of the most important aspects. When devising a detailed marking scale in order to objectify subjective testing, we should keep certain key words in mind. In assessing written productions, for example (different essay types, compositions, creative writing of various kinds), these include: relevance of content with respect to the required topic or title, main ideas, clarity of writing and of ideas, appropriateness, fluency, accuracy (not least!), clarity and persuasiveness in the presentation of ideas and arguments, the power of the arguments, negotiation skills, ease of discourse and mastery of discourse structure, and also factors such as layout, form, balance, the weight of the elements, etc. We should encourage our learners and give them agency so that they become writers even before starting their academic formation. For oral performance too, when devising a detailed marking scale we take into account numerous factors, such as content and clarity, fluency, stress and intonation, pace, appropriacy, emphasizing, summarizing, lexical range, discourse type again, and not least pronunciation (although some voices say that physical factors such as the shape of the mouth and mother-tongue influence are reasonable excuses); there are also other factors like confidence, body language, volume, rhythm, the so-called speaker’s charisma, etc.
Similarly, the yardstick, as an instrument for objectifying subjective scoring, may include elements for consideration generically named “descriptors”. Different scoring specialists may give them various names and add to the list of descriptors according to the test type, the country’s culture and the circumstances, but we could name those requirements that are likely to be encountered in most performance descriptors: fluency, accuracy, content, clarity, register, operational command of language, analysis and synthesis skills, emphasizing, etc.
Subjective testing can be seen from two angles. This is because the history of teaching and learning approaches, as well as of testing, has known numerous changes, and these changes have been more obvious in practice than in theory. One angle from which to consider and understand the concept of subjective testing is the traditional one. Traditional subjective test items, for instance a composition or an essay task, are considered to be at the extreme of the “subjective” range, since this rather old-fashioned method (although academically rich and still effective, in many well-stated opinions) depends too much upon the personal perception of the teacher-assessor. Likewise the grammar translation method, extremely rich from a lexical or grammatical point of view and still preferred by classical practitioners, could be seen as a subjective test/task that leans back towards the past in a rather inappropriate way. Even so, valuable academic institutions, and even schools at lower levels, place essay writing and writing in general high on their list of priorities because, even though students may be wronged by unfair assessors, there is nothing more constructive than encouraging writing and authorship in our students as early as possible. Writing, and free writing in general, can provide inexhaustible opportunities for language development and growth. The way out of the negative effects of subjective testing would be the humanistic approach: teachers acting as formatively as possible, as providers of positive feedback, maximizing their potential for building bridges and not for hunting for mistakes.
Moving further with respect to subjective testing, there is another angle, formally distinct from the first one I have commented upon above: that of communicative testing. This type of testing is seen as subjective but, being perceived within the context of the communicative trend, it is modern and keeps pace with the evolution of teaching approaches and practices. Communicative testing aims at assessing productive skills, that is, speaking and writing for specific purposes and with functional value, and in this sense its subjectivity is not seen as something negative, although there is a risk of devising a bad marking scale to grade such tests.
Communicative testing, as a subjective means of testing, is a reflection of the student’s reactions and preferences, of his personalized efforts to replicate real-life, authentic interactions and to simulate future professional linguistic encounters. It aims at replicating the functional value of the students’ real linguistic interaction, whether spoken or written.
A personalized approach, spontaneity, and the mirroring of the student’s preoccupations, anxieties and enthusiasm can all be found in communicative activities. All the feelings that someone is likely to encounter when facing, for instance, an interview or a debate can be “retrieved” through communicative items. Likewise, interaction with institutions and officials, and meetings and sessions for specific purposes, can be “played” or “acted” out in advance through subjective communicative test items. This is again an opportunity for humanistic teachers/trainers to reflect upon this type of assessment and to structure it in a formative way, one that should yield the best possible effects upon communicative teaching practice!
Harmer, J. (2007). The Practice of English Language Teaching, fourth edition. Pearson Longman.
Green, T. and Hawkey, R. (2004). Test washback and impact. Modern English Teacher 13/4.
Vizental, A. (2008). Metodica predarii limbii engleze: Strategies of Teaching and Testing English as a Foreign Language, second edition. Collegium, Polirom, Romania.
Hughes, A. (2003). Testing for Language Teachers, 2nd edition. Cambridge University Press.
Canale, M. and Swain, M. (1980a). Theory of Language Assessment. Oxford: Oxford University Press.
Canale, M. and Swain, M. (1980b). Theoretical Bases of Communicative Approaches to Second Language Teaching and Testing. Applied Linguistics.
Heaton, J.B. (1982). Language Testing. Modern English Publications.
Arnold, J. (1999). Affect in Language Learning. Cambridge University Press.
Arnold, J. (1998). Towards more humanistic English teaching. ELT Journal 52/3.