Self-Assessment of Foreign Language Reading and Writing Abilities among Adolescent Chinese Learners of English

Info: 9472 words (38 pages) Dissertation
Published: 9th Mar 2021

Tagged: Education; English as a Foreign Language

Abstract

A growing amount of research has delved into the role of self-assessment (SA) of language abilities; however, SA as a metacognitive tool has not been investigated extensively with adolescent Chinese learners of English. The present study aims to explore self-ratings of reading and writing abilities by adolescent Chinese learners and the relationship between self-ratings and subsequent reading comprehension and writing production. A total of 106 participants (ages 12 to 14) completed a Reading Comprehension Test (captured by three tasks – Free Recall, Sentence Completion and Multiple-Choice Questions), a Writing Task (a picture-based writing prompt), and criterion-referenced SA Items. Correlational analysis revealed that SA of reading ability was significantly correlated with subsequent reading comprehension. SA of writing ability was also significantly correlated with subsequent writing production. The study concluded that adolescent Chinese learners of English can accurately self-assess their strengths and weaknesses in reading and writing. Self-assessment could serve as a useful tool in the classroom to help identify strengths and weaknesses in English language learning among adolescent Chinese learners of English.

Keywords: Self-Assessment (SA); English reading and writing; Chinese adolescents

“I Know English”: Self-Assessment (SA) of Foreign Language (FL) Reading and Writing Abilities among Adolescent Chinese Learners of English

1. Introduction

More and more countries have started to teach foreign languages to young and adolescent learners at school (Rea-Dickins, 2000; Zangl, 2000). In China, English is a compulsory subject at all educational levels (Jin, Wu, Alderson & Song, 2017). However, becoming literate in a foreign language is not easy for young and adolescent learners. Reading and writing in a foreign language involve demanding cognitive and social processes (Koda, 2005; Grabe, 2009; Silva & Matsuda, 2002; Alderson, 2006). It is even more challenging if the target language has an orthography different from that of the learners’ native language (Huss, 1995). Considering these challenges as well as characteristics of young learners such as sensitivity to “failure” (Hasselgreen, 2000), there is a consensus that assessment should incorporate a variety of engaging tasks (Hasselgreen, 2000; Rea-Dickins & Gardner, 2000) and use multiple procedures to capture different aspects of language learning (Rea-Dickins, 2000). It should also serve as a tool to monitor the language learning process and promote positive attitudes towards language learning (Weigle, 2002).

Among a variety of assessments, alternative evaluations accompanied by a component of self-assessment have been highlighted for their appropriate use as a metacognitive tool. Self-assessment (SA), defined as “procedures by which learners themselves evaluate their language skills and knowledge” (Bailey, 1998, p. 227), is an “internal” assessment from learners’ own perspectives (Oscarson, 1989) in which they self-rate what they “can do” in the target language and self-identify their strengths and weaknesses. Research has found that SA engages learners in making decisions about their language ability and helps them set learning goals and objectives (Chapelle & Brindley, 2010; Chen, 2008). In practice, SA, as a critical component of alternative assessment, has been adopted by, for instance, the Common European Framework of Reference (CEFR), the European Language Portfolio (ELP), and the Bergen “Can-Do” project (see details in Hasselgreen, 2000) to capture and understand language performance.

To date, however, since the implementation of English learning in secondary schools in China in the late 20th century (Wang & Lam, 2009; Hu, 2002), Chinese students have been evaluated by numerous large-scale standardized exams, which can leave unsuccessful learners unenthusiastic and with negative English learning experiences (Carless & Wong, 2000, as cited in McKay, 2006). Students are not always given the opportunity to independently self-assess their own strengths and weaknesses in English learning. One reason may be a concern about the power relationship between teachers and students: teachers may view SA as a violation of their authority (Towler & Broadfoot, 1992). Another reason lies in the limited empirical research on SA with adolescent Chinese learners of English. Without adequate research justification, teachers may doubt the rationale for implementing SA in classrooms. To shed light on the use of SA in the context of English education at secondary schools in China, the present study was guided by two overarching research questions. First, how do adolescent Chinese learners of English self-assess or self-perceive their English reading and writing abilities? Second, what is the relationship between SA ratings of language abilities and subsequent English reading comprehension and writing production? In other words, can adolescent Chinese learners of English accurately self-assess their English reading and writing abilities? Results of the study will help uncover the self-perceived strengths and weaknesses in English learning of adolescent Chinese learners of English, and may provide justification for the implementation of SA as a metacognitive tool among adolescents in secondary schools in China.

2. Literature Review

2.1. FL reading and writing among young and adolescent language learners

Reading and writing in a foreign language are not easy for young learners, as both involve demanding cognitive processes and multifaceted social processes. Koda (2007) defines reading as “a product of a complex information-processing system” including three major components: decoding, text-information building, and reader-model construction. Decoding is the reader’s extraction of written text information based on linguistic knowledge. Text-information building is how readers organize the information they extract from the written text. Reader-model construction is the reader’s synthesis and interpretation of the written text based on background knowledge (Koda, 2007). With different background knowledge, different readers arrive at different understandings or interpretations of the same written text (Brantmeier, 2002; Koda, 2007; Grabe, 2009; Bernhardt, 2010). Alderson (2006) also specified that reading is a complex process affected by both text-level variables such as topic familiarity, genre, and text organization, and variables beyond the text such as linguistic skills, learning motivation, affect, and learner characteristics. In short, reading is an interactive process in which readers themselves, reading texts, and the language itself play critical roles (Bernhardt, 2010). Such a process, which integrates cognitive and social dimensions, challenges young learners, especially when they are still developing their first language literacy (McKay, 2006).

Similar to reading, the writing process interweaves cognitive, social and cultural dimensions. Writing may be a social act if it is goal-directed and serves to communicate with a particular audience (Grabe & Kaplan, 1996). It is also a cultural phenomenon, as researchers have found that cultural norms influence variations in writing patterns (Grabe & Kaplan, 1996) and the coherence of text (Leki, 1992). The cognitive load of writing is heavy, as writers need to engage in the time-consuming processes of pre-writing, writing, revising, and editing (Weigle, 2002). In addition, the influence of L1 rhetorical knowledge complicates the process and increases the intricacies of L2/FL writing. L2 writers’ knowledge of appropriate genres is constructed differently from L1 knowledge of genres in various aspects such as communicative purposes, register use, and intertextuality (Silva & Matsuda, 2002). Young writers of a foreign language have to make great efforts to learn this complex process while at the same time “[dealing] with the cognitive demands of early literacy, either in their first or the target language, or in both” (Weigle, 2002, p. 245).

2.2. Assessing young and adolescent language learners

Considering the challenges and complexities of FL reading and writing that young and adolescent learners face, assessment for this group of learners should not focus on assessing learning outcomes; instead, it needs to be a tool to develop their language ability, monitor their learning process, and promote positive emotion and motivation to learn the target language (Weigle, 2002). Multiple assessment formats need to be utilized to capture a full range of young learners’ language performance in diverse contexts (Rea-Dickins & Gardner, 2000) and to build a language profile for a better understanding of language performance (Rea-Dickins, 2000). Assessment tasks need to be varied and engaging to highlight what learners can do in the target language, which is especially important given young learners’ characteristics such as a short attention span and sensitivity to “failure” (Hasselgreen, 2000). Research agrees that assessment for young learners needs to recognize their cognitive development and take into consideration their motivation, vulnerability, and interests.

In the context of English teaching and learning in China, Chinese students are evaluated by numerous large-scale standardized exams developed by local (provincial or municipal) education authorities for different purposes (Jin, Wu, Alderson & Song, 2017). However, many studies have suggested that high-stakes standardized exams are inappropriate for young and adolescent students (e.g., Chik & Besser, 2011; McKay, 2006; Haggerty & Fox, 2015). Compared with adults, young and adolescent students are more vulnerable to assessment (Chik & Besser, 2011) and lose learning motivation more easily (Haggerty & Fox, 2015). Learners who perform unsuccessfully easily become unenthusiastic and have negative English learning experiences (Carless & Wong, 2000, as cited in McKay, 2006). In addition, from a pedagogical perspective, standardized exams for young learners do little to inform classroom teaching (McKay, 2006) and fail to bring the expected washback effects (Qi, 2004, 2005, 2007). In contrast, alternative assessment has great value in helping teachers understand young learners’ progress in language learning so that teachers can be responsive to individual learner differences, learner needs, and teaching pedagogies (Rea-Dickins & Gardner, 2000; Rea-Dickins, 2000).

2.3. Self-assessment (SA) of FL/L2 abilities

Self-assessment (SA) is defined as the “procedures by which learners themselves evaluate their language skills and knowledge” (Bailey, 1998, p. 227). Research has found that SA raises learners’ self-awareness of learning (Oscarson, 1989; Babaii, Taghaddomi & Pashmforoosh, 2016), guides learners to think about the learning process (McKay, 2006), promotes self-regulated learning and autonomy to self-identify their strengths and weaknesses (Butler & Lee, 2010; Oscarson, 1989; Dann, 2002), and engages learners to self-assess in an interactive and low-anxiety way (Bachman & Palmer, 1996). SA has also been found to be positively associated with learners’ self-confidence and performance (De Saint-Leger, 2009; Little, 2009; Butler & Lee, 2010). In addition, SA narrows the gap between learner perception and actual performance (Andrade & Valtcheva, 2009) and minimizes the mismatches between learner assessment and teacher assessment (Babaii et al., 2016). SA also serves a number of alternative purposes such as expanding the range of assessment (Oscarson, 1989), supporting a learner-centered curriculum (Little, 2005), and fostering the perception of assessment as a “shared responsibility” between teachers and learners (Oscarson, 1989; Little, 2005).

Self-assessment (SA) of FL/L2 abilities has been explored by researchers across four language learning skills: reading, writing, listening and speaking. As prior research indicates, whether or not SA is an accurate predictor of language abilities varies by the type of constructs of SA items (Brantmeier, Vanderplank & Strube, 2012; Ross, 1998), L2 proficiency level (Sahragard & Mallahi, 2014; Heilenman, 1990), specific language skills (Sahragard & Mallahi, 2014; Wan-a-rom, 2010; Brantmeier, 2005; Matsuno, 2009), and task types (Brantmeier et al., 2012; Butler & Lee, 2006). In spite of those variables, Ross’ (1998) meta-analysis indicated that the correlation coefficients between SA scores and objective performance ranged from .52 to .65 across the four language skills. Ross (1998) also found that, in general, the correlation between SA and receptive skills (reading and listening) was stronger than the correlation between SA and productive skills (writing and speaking). Higher proficiency learners assess their L2 abilities more accurately than lower proficiency learners (Alderson, 2006), a finding echoed by Brantmeier et al. (2012), who suggested that there is a threshold beyond which learners can self-assess their language abilities more accurately. Table 1 lists selected studies that specifically examine SA of L2/FL reading and writing abilities.

Table 1 Selected Studies on SA of L2/FL Reading and Writing Abilities (Modified from Brantmeier et al., 2012, p. 146)

Author(s)/Year	Participants	SA Skills	Findings
Sahragard and Mallahi (2014)	N=30 Upper-Intermediate Adult EFL learners in Iran	Writing	SA writing is not an accurate predictor of subsequent writing performance. More proficient writers underestimated their writing abilities whereas less proficient learners overestimated their writing abilities.
Brantmeier, Vanderplank, and Strube (2012)	N=276 Adult L2 Spanish learners	Reading Writing Listening Speaking	Criterion-referenced SA reading, writing, listening and speaking is an accurate predictor of subsequent performance among advanced L2 Spanish learners.
Baniabdelrahman (2010)	N=136 11th Graders Jordanian EFL learners	Reading	SA reading is positively associated with subsequent reading performance among young Jordanian learners of English.
Wan-a-rom (2010)	N=5 High school students Thai EFL learners	Reading	SA reading is not an accurate predictor of performance, but can be used as a tool to help learners find appropriate graded reading level.
Javaherbakhsh (2010)	N=73 Adult advanced EFL learners in Iran	Writing	SA writing using a checklist promotes learners’ autonomy and is positively associated with subsequent writing performance.
Matsuno (2009)	N=97 Adult Japanese EFLs in Japan	Writing	SA writing is not an accurate predictor of writing performance. Learners underestimated their writing abilities, especially for high-achieving students.
Brantmeier and Vanderplank (2008)	N=359 Adult advanced L2 Spanish learners in the U.S.	Reading	SA is a reliable predictor of subsequent performance measured by a computer-based test and classroom performance measured by sentence complete and multiple-choice tasks. However, SA is not an accurate predictor when measured by written recall task.
Brantmeier (2006)	N=71 Adult advanced L2 Spanish learners in the U.S.	Reading	SA reading is not an accurate predictor of advanced Spanish learners’ reading ability measured in both computer-based test and classroom performance.
Brantmeier (2005)	N=88 Adult advanced L2 Spanish learners in the U.S.	Reading	SA reading is an accurate predictor of enjoyment in reading and reading performance measured by written recall but not multiple choice among L2 Spanish learners in advanced level of instruction.

SA of L2 reading has been extensively examined by Brantmeier and her colleagues (2005, 2006, 2008, 2012), with the major finding that the accuracy of SA reading varies across reading comprehension task types and types of SA items. For example, with adult advanced L2 Spanish learners in the U.S., Brantmeier (2005) found that SA reading was an accurate predictor of enjoyment in reading and of reading performance measured by written recall but not multiple choice. Brantmeier and Vanderplank (2008) extended Brantmeier’s (2005) finding by including test format as a variable. With 359 adult advanced L2 Spanish learners, criterion-referenced SA was found to be a reliable predictor of reading performance when evaluated by both a computer-based test and a paper-based test measured by sentence completion and multiple-choice tasks, but not by written recall. With 276 adult Spanish learners across beginning, intermediate and advanced levels, Brantmeier, Vanderplank, and Strube (2012) found that skill-based SA was significantly correlated with an online reading test among learners at the advanced level, a result that validated the relationship between an SA instrument with specific criteria and subsequent reading comprehension.

Research on SA writing has shown mixed findings. Javaherbakhsh (2010) examined SA writing with adult Iranian EFL learners at the advanced level and found a positive association between SA of writing and writing performance. However, other findings suggest that SA writing is not reliable. Sahragard and Mallahi (2014) studied how 30 Iranian EFL learners at the upper-intermediate level would self-assess their writing ability using a comprehensive 30-item SA scale. They found that more proficient learners tended to underestimate all aspects of their writing ability whereas less proficient learners tended to overestimate them, suggesting a role for language proficiency in determining the accuracy of SA writing. Matsuno (2009) analyzed SA writing with adult Japanese EFL learners in Japan and noted that this group of learners self-assessed their writing ability harshly. Most learners (particularly higher proficiency learners) tended to underestimate their writing abilities. Matsuno (2009) argued that this underestimation was a cultural phenomenon: in Japanese culture, modesty is a valued virtue, and a tendency to self-assess higher than one’s actual ability is considered a violation of that virtue.

To date, studies on SA with young and adolescent language learners have been very limited. Two studies, by Baniabdelrahman (2010) and Wan-a-rom (2010), focused specifically on adolescent language learners’ self-assessment of reading ability. With 136 Jordanian EFL learners in 11th Grade, Baniabdelrahman (2010) found that SA reading was positively associated with subsequent reading performance. However, Wan-a-rom’s (2010) case study with five high school students who were Thai learners of English indicated that SA reading was not positively associated with reading performance, but could be used as a tool to help learners find an appropriate graded reading level. Two other studies (Dolosic, Brantmeier, Strube & Hogrebe, 2016; Butler & Lee, 2010) examined SA with young learners, though not with a focus on foreign language reading and writing. Dolosic et al. (2016) examined the relationship between criterion-referenced self-assessment and oral production in French with 24 adolescent learners in a French language summer camp in the U.S. Pre- and post-test SA of speaking ability revealed that learners who could not accurately self-assess their speaking ability at the beginning of the program were better able to self-assess at the end of the program. Dolosic et al. (2016) concluded that self-assessment could be used as a metacognitive tool for young learners to self-identify their strengths and weaknesses in language learning. Butler and Lee (2010) scrutinized SA among 254 young Korean learners of English in 6th Grade in South Korea, and found that these learners were able to self-assess their English language abilities, and could self-assess even better after a self-assessment intervention. Butler and Lee (2010) also found that SA was positively associated with learners’ English performance and English learning confidence. The researchers suggested that SA can play a very positive role in learning and teaching environments in Asian classrooms where “effort is a highly valued part of educational success” (p. 25).

3. The Present Study

3.1. Research Questions

An understanding of how adolescent Chinese learners of English self-assess their English reading and writing abilities, together with an examination of the relationship between self-ratings and actual performance, is critical to the rationale for implementing SA as a metacognitive tool in the English learning context at secondary schools in China. To provide insights on the issue, the present study attempts to address the following three research questions:

1) With adolescent Chinese learners of English, what are their self-ratings of English reading and writing abilities?

2) Is there a relationship between scores on three different reading comprehension tasks (free recall, sentence completion, and multiple choice, respectively) and SA scores of reading ability? Does the overall reading comprehension score correlate with SA scores of reading ability?

3) Does the writing performance score correlate with SA scores of writing ability?

3.2. Participants and Context

A sample of 106 adolescent Chinese EFL learners (56 males and 50 females) from two 7th Grade classes in a public middle school in a metropolitan area in mainland China participated in the study in July 2016. Participants ranged in age from 12 to 14 years (Mean = 12.93, SD = 0.51). No participants had any experience living in an English-speaking country. Participants received 90 minutes of formal English instruction each day from Monday to Friday, a total of 450 minutes per week. The English book series used for instruction is “Go For It!”, designed by People’s Education Press in China. The series contains two volumes, each with 12 learning units, and aims to develop four skills: listening, speaking, reading and writing. According to the 2011 English Curriculum Standards for Compulsory Education issued by the Department of Education in China, students should achieve an English proficiency of Level Three upon finishing 7th grade. At Level Three, students are able to read and comprehend simple stories in English, summarize reading texts, and read over 40,000 words outside the classroom. According to the national curriculum, the required vocabulary size for junior secondary is 1,500, among which there are 941 word families (Jin et al., 2017). Notably, these are all guiding principles for English education, as there are no national language proficiency scales in China (Zeng & Fan, 2017). Assessment at the junior secondary level is dominated by traditional standardized exams, including mid-term and final-term exams designed by the school itself to evaluate learning outcomes. In addition, there are standardized exams administered by the Education Department at the provincial or municipal level (Jin et al., 2017). All the exams are comprehensive exams including Listening, Multiple-Choice Questions (focusing on grammatical knowledge), Cloze Test, Reading Comprehension (all multiple-choice questions), Sentence/Conversation Completion, and Prompt Writing (usually a total of more than 70 words is required).

3.3. Materials

3.3.1. Reading Comprehension Instruments

The Reading Comprehension Instrument consisted of two passages. A summary of the two reading passages is presented in Table 2.

Table 2 Summary of Two Reading Passages

	Title	# of Words	# of Sentences	# of Embedded Clauses	# of Pausal Units
Passage 1	Ann and Frank	180			
Passage 2	A Restaurant	210			

A total of 1.5 hours was given to participants to complete the reading comprehension tests. Each reading passage was followed by three test types: Free Recall, Sentence Completion, and Multiple-Choice Questions. See Appendix A for the sample Reading Comprehension Test. Free Recall asked participants to write down in English as much as possible about the reading passage without looking back at the passage. The number of pausal units was used as the benchmark for scoring the free recall task (Brantmeier, Strube, & Yu, 2012). A pausal unit is a unit that has a “pause on each end of it during normally paced oral reading” (Bernhardt, 1991, p. 208). The first passage had a total of 14 pausal units, and the second passage also had 14 pausal units. The Sentence Completion task (5 items for each passage; one point for each item) asked participants to complete sentences according to the reading passages. Multiple-Choice Questions (5 items for each passage; one point for each item) asked participants to circle the correct answer to the question from four choices. Adopting these three task types allows participants to give “a full range of responses, i.e., a score and some insight about the reader” (Bernhardt, 2010, p. 103). Quantitative measures alone, such as multiple-choice questions, fail to elicit an insightful and complete understanding of learners’ reading comprehension, and the score itself has limited practical value (Bernhardt, 2010). However, free recall, which asks participants to write down as much as possible about the reading passage, can provide both quantitative and qualitative insights (Bernhardt, 2010).
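The scoring scheme described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the authors' scoring procedure: the record layout (PassageResult), the function name (score_reading), and the example scores are hypothetical, and the composite is assumed to be an unweighted sum of the three tasks, which the text does not specify.

```python
from dataclasses import dataclass

# Maximum points per passage, as described in the task outline above.
MAX_PAUSAL_UNITS = 14
MAX_SENTENCE_COMPLETION = 5
MAX_MULTIPLE_CHOICE = 5


@dataclass
class PassageResult:
    """One learner's raw results on one passage (hypothetical record layout)."""
    recalled_units: int       # pausal units credited in the free recall
    completion_correct: int   # sentence-completion items answered correctly
    choice_correct: int       # multiple-choice items answered correctly

    def check(self) -> None:
        """Reject values outside the range each task allows."""
        limits = (
            (self.recalled_units, MAX_PAUSAL_UNITS),
            (self.completion_correct, MAX_SENTENCE_COMPLETION),
            (self.choice_correct, MAX_MULTIPLE_CHOICE),
        )
        for value, maximum in limits:
            if not 0 <= value <= maximum:
                raise ValueError(f"score {value} outside 0..{maximum}")


def score_reading(passages):
    """Sum each task over passages; the composite is ASSUMED to be an unweighted sum."""
    for passage in passages:
        passage.check()
    totals = {
        "free_recall": sum(p.recalled_units for p in passages),
        "sentence_completion": sum(p.completion_correct for p in passages),
        "multiple_choice": sum(p.choice_correct for p in passages),
    }
    totals["composite"] = sum(totals.values())
    return totals


# Hypothetical learner: results on Passage 1 and Passage 2.
learner = [PassageResult(9, 4, 5), PassageResult(7, 3, 4)]
print(score_reading(learner))
```

Under this assumption the free recall task carries up to 28 of the 48 available points, so a weighted or standardized composite would rank learners differently.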

Brantmeier (2001, 2003a, 2003b, 2003c, 2004) has highlighted that gender and passage content interact in their effects on reading comprehension. Therefore, the present study chose reading topics in the hope of eliminating possible interaction effects between gender and passage content on reading comprehension. After participants finished all three tasks for each reading passage, one question about topic familiarity was presented. Participants were asked to self-rate how familiar they were with the topic of the reading passage on a 5-point Likert scale: (1) not familiar at all, (2) not very familiar, (3) somewhat familiar, (4) very familiar and (5) really familiar. Preliminary independent sample t-test analysis indicated that there was no significant gender difference in topic familiarity for either reading passage; thus, concerns about a possible impact of gender difference in topic familiarity can be set aside. Table 3 summarizes the independent sample t-test results.

Table 3 Independent sample t-test: gender difference in topic familiarity

		Mean	SD	t value	p value
Passage 1	Male	2.64	0.98	-0.21	> .05
Passage 1	Female	2.68	0.84	-0.21	> .05
Passage 2	Male	2.82	1.08	-0.61	> .05
Passage 2	Female	2.94	0.91	-0.61	> .05

N: male=56, female=50
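The t values in Table 3 can be checked against the reported means, standard deviations and group sizes. The Python sketch below is illustrative only and is not part of the original analysis; it assumes a pooled-variance (Student's) independent-samples t-test, which the text does not specify, and the function name pooled_t is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Student's independent-samples t (equal variances ASSUMED) and its df."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / standard_error, n1 + n2 - 2


# Summary statistics copied from Table 3 (male first, then female).
t1, df = pooled_t(2.64, 0.98, 56, 2.68, 0.84, 50)   # Passage 1
t2, _ = pooled_t(2.82, 1.08, 56, 2.94, 0.91, 50)    # Passage 2

print(f"Passage 1: t({df}) = {t1:.2f}")
print(f"Passage 2: t({df}) = {t2:.2f}")
```

Recomputed from the rounded values in the table, Passage 2 gives a t of about -0.61, matching the reported value, while Passage 1 gives about -0.22 against the reported -0.21, a gap small enough to arise from rounding of the reported means and standard deviations. With 104 degrees of freedom the two-tailed critical value at the .05 level is roughly 1.98, so both results are far from significance.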

3.3.2. Writing Task

The writing task asked participants to describe a picture. Participants were expected to describe the characters in the picture, develop a story about what might happen between the characters they saw, and then write the story down on the sheet of paper provided. The writing prompt was presented in both English and L1 Chinese. Two trained and qualified raters scored the writing tasks, and the average of the two raters’ scores was the final score of the writing task for each participant. Jacobs, Zinkgraf, Wormuth, Hartfield, and Hughey’s (1981) criteria were used to rate five subscales: content, organization, vocabulary, mechanics, and language use. The inter-rater reliability (Cronbach’s alpha) between raters was .90. A total of 30 minutes was given to participants to complete the writing task. A minimum of 100 words was required. The maximum score for the writing task was 20 points. See Appendix B for the writing task used in the present study.

3.3.3. Self-Assessment Items

Self-Assessment Items on reading and writing abilities were completed by participants before the reading comprehension and writing tests. Constructing SA items was a challenging process for the present study. In China, there are no national language proficiency scales (Zeng & Fan, 2017), and the education management structure at different provincial or municipal levels makes it hard to define learning objectives and outcomes consistently in the English education curriculum at different educational stages (Jin et al., 2017). In other words, there is no consensus on what constitutes English reading and writing abilities at different proficiency levels, which makes it challenging to construct SA items in the Chinese context. The SA items used in the present study were adapted and modified from multiple sources, including the self-assessment grid (Reading and Writing) of the Common European Framework of Reference for Languages (CEFR), the European Language Portfolio (ELP), and SA questionnaires from Brantmeier et al. (2012), which were modified from DIALANG. The self-assessment grids of the CEFR and ELP illustrate proficiency levels using “Can Do” statements; the descriptors focus on what language learners can do with the target language and have been widely recognized as a useful tool for identifying individuals’ different language skills. SA items from Brantmeier et al. (2012) have been tested and utilized in a number of empirical studies, and high validity has been reported.

To best tailor these SA items to the particular group of learners in the Chinese context, the present study referred to Hasselgreen’s (2000, 2005) proposal on how to best adapt the CEFR and ELP for specific use. Key considerations when creating the present SA descriptors included, for instance, the topics and text types that junior secondary students are familiar with, the curriculum guide for English education, English textbooks and materials, and the general teaching practice of English reading and writing at junior secondary schools in China. After consultation with English teachers and experts, the final SA items were constructed for use with the particular group of learners in the present study. All the SA items are criterion-referenced, which makes them appropriate for collecting descriptive data for an understanding of young and adolescent learners’ strengths and weaknesses across language skills (Bachman, 2000).

In total, SA of reading abilities consisted of 11 items and SA of writing abilities consisted of 14 items. For each item, participants were asked to indicate how they would rate their English reading or writing ability in the situation presented and to circle the appropriate rating accordingly. Each item was rated on a 5-point Likert Scale consisting of “1 (Strongly Disagree)”, “2 (Disagree)”, “3 (Neutral)”, “4 (Agree)” and “5 (Strongly Agree)”. All items were translated by a professional translator and presented to the participants in L1 Chinese. See Appendix C for sample SA items.

3.3.4. Demographic Questionnaire

Each participant completed a demographic questionnaire before the reading comprehension test. The questionnaire asked participants to self-report their name, age, gender, number of years studying English, years of experience living in an English-speaking country, experience being taught by a native English teacher, enjoyment with English language learning, reasons for learning English, and whether their parents speak English. In addition, participants were asked to self-rate their English proficiency level. Five proficiency levels were presented: Novice, Intermediate, Advanced, Superior (Native-like), and Distinguished (Native). The demographic questionnaire was translated by a professional translator and was presented to participants in L1 Chinese.

3.4. Data Collection

Data collection took place at the end of the Spring 2016 semester, in June, at a public school in mainland China. All data were collected on a weekday in participants’ regular classrooms. The total time commitment for data collection was two hours. An English teacher and a researcher of the present study monitored the entire process of data collection.

4. Data Analysis and Results

4.1. Preliminary Analysis – Internal Consistency

R Statistical Software (Version 0.99.903) was used for data analysis. The internal consistency of the SA Reading items and SA Writing items was checked using Cronbach’s alpha if item deleted, the item-total correlation, and the average inter-item correlation. Two criteria were applied: 1) drop an item if its deletion increased alpha by at least .01; and 2) drop an item if its item-total correlation was smaller than .30. In terms of the inter-item correlation, the ideal average inter-item correlation is between .20 and .40, indicating that the items, though reasonably homogeneous, still contain enough unique variance that they are not isomorphic with one another (Piedmont, 2014). Based on these criteria, no item in either questionnaire was dropped. Table 2 summarizes the statistics.

Table 2 Internal consistency

	Number of Items	Cronbach’s alpha	Average Item-Total Correlation	Average Inter-Item Correlation
SA Reading	11	.80	.57	.32
SA Writing	14	.88	.62	.38
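
As an illustration of the internal-consistency checks described above, the following Python sketch computes Cronbach’s alpha, alpha if item deleted, an item-total correlation, and the average inter-item correlation, and then applies the two drop criteria. It is not the study’s R code: the response matrix, the function names, and the choice of a corrected (item versus rest-of-scale) item-total correlation are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean, variance


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def cronbach_alpha(items):
    """items: one list of responses per item (columns of the response matrix)."""
    k = len(items)
    totals = [sum(row) for row in zip(*items)]
    item_var_sum = sum(variance(col) for col in items)
    return k / (k - 1) * (1 - item_var_sum / variance(totals))


def item_diagnostics(items):
    """Return overall alpha, per-item diagnostics, and the mean inter-item r."""
    alpha = cronbach_alpha(items)
    report = []
    for i, col in enumerate(items):
        rest = items[:i] + items[i + 1:]
        rest_totals = [sum(row) for row in zip(*rest)]
        report.append({
            "item": i + 1,
            "alpha_if_deleted": cronbach_alpha(rest),
            # Assumption: "corrected" item-total r (item vs. sum of the others).
            "item_total_r": pearson_r(col, rest_totals),
        })
    mean_inter_item_r = mean(pearson_r(a, b) for a, b in combinations(items, 2))
    return alpha, report, mean_inter_item_r


def flag_items(alpha, report, min_gain=0.01, min_r=0.30):
    """Items meeting either drop criterion: alpha gain >= .01 or r < .30."""
    return [
        row["item"]
        for row in report
        if row["alpha_if_deleted"] - alpha >= min_gain or row["item_total_r"] < min_r
    ]


# Hypothetical 5-point Likert responses: 6 respondents (rows) x 4 items (columns).
demo_responses = [
    [4, 5, 4, 3],
    [3, 3, 2, 3],
    [5, 4, 5, 4],
    [2, 2, 3, 1],
    [4, 4, 4, 5],
    [3, 2, 3, 2],
]
demo_items = [list(col) for col in zip(*demo_responses)]

alpha, report, mean_r = item_diagnostics(demo_items)
print(f"alpha = {alpha:.2f}, mean inter-item r = {mean_r:.2f}")
for row in report:
    print(row)
print("items flagged for removal:", flag_items(alpha, report))
```

With real data, each column would hold one SA item’s ratings across all participants, and an empty flagged list would correspond to the outcome reported here, in which no item was dropped.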

4.2. Descriptive Statistics

Table 3 shows the descriptive statistics for the scores of overall reading comprehension, free recall, sentence completion, multiple choice, writing production, SA reading, and SA writing. The overall reading comprehension score was the composite of the free recall, sentence completion, and multiple-choice scores. The writing production score was the average of the scores given by the two trained raters. The SA reading score was the average of the 11 SA reading items, and the SA writing score was the average of the 14 SA writing items. Reading comprehension, free recall, sentence completion, and multiple choice did not pass the Shapiro-Wilk normality test; therefore, a non-parametric statistic, Spearman’s correlation, was employed in some of the following correlation analyses.

Table 3 Descriptive statistics

	Mean	Standard Deviation	Min	Max	Skewness	Kurtosis	Shapiro Normality W Statistic	Shapiro Normality p value
RC	14.51	7.98	2	33	0.5	-0.86	0.94	< .001
FR	5.58	4.04	0	20	1.1	0.98	0.91	< .001
SC	3.76	2.72	0	10	0.49	-0.69	0.94	< .001
MC	5.16	2.39	1	11	0.24	-0.75	0.96	< .01
WP	10.75	2.86	3.5	17.5	-0.13	-0.21	0.99	> .05
SAR	3.64	0.56	1.55	4.82	-0.4	0.58	0.98	> .05
SAW	3.24	0.62	1.5	4.57	-0.23	-0.07	0.99	> .05

RC=Reading Comprehension; FR=Free Recall; SC=Sentence Completion; MC=Multiple Choice; WP=Writing Production; SAR=Self-Assessment of Reading; SAW=Self-Assessment of Writing

4.3. Results

4.3.1. RQ1: With adolescent Chinese learners of English, what are their self-ratings of English reading and writing abilities?

Table 4 details each SA reading item with its mean and standard deviation. The table also shows the results of one-sample t-tests comparing the mean of each item with the overall SA reading mean. Overall, the mean for SA reading was 3.64, with a standard deviation of 0.56. The one-sample t-tests revealed that participants rated Item 2, Item 9, Item 1, and Item 3 significantly higher than the overall SA reading mean, and Item 7, Item 6, and Item 5 significantly lower. From their own perspective, adolescent Chinese learners of English were able to understand the general idea of simple informational texts and short descriptions, to locate specific information for task completion, to identify the characters, settings, problems, and solutions occurring in a story, and to infer the meaning of new vocabulary based on the reading text. In contrast, they reported significant weaknesses in making predictions when reading, making connections between the text they read and their life experience, and making connections between the text they read and other texts they had read before.

Table 4 SA Reading ratings and one-sample t-test results

Item	Description	Mean	Standard Deviation	One-Sample t-test
2	I can understand the general idea of simple informational texts and short descriptions, especially if they contain pictures which help to explain the text.	4.32	0.74	p < .001
9	I can locate specific information I need to help me complete a task.	4.01	0.85	p < .001
1	I can identify the characters, settings, problems, and solutions occurring in a story.	4.00	0.83	p < .001
3	I can infer the meaning of new vocabulary based on the text I read.	3.90	0.95	p < .01
11	I can use some reading strategies (such as rereading) to help me understand the text.	3.75	1.06	p > .05
10	I can identify the main idea discussed in a text and how the idea is supported.	3.55	0.91	p > .05
4	I can choose a reading text appropriate to my reading ability for myself.	3.51	0.99	p > .05
8	I can come up with questions (such as why, what and how) by myself when I am reading.	3.51	1.08	p > .05
7	I can make predictions when I am reading.	3.34	1.05	p < .01
6	I can make connections between the text I read and my life experience.	3.11	1.04	p < .001
5	I can make connections between the text I read with other texts I have read.	3.09	1.11	p < .001
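
The item-level comparisons in Table 4 can be approximated from the reported summary statistics alone. The sketch below computes a one-sample t statistic for four of the items against the overall SA reading mean of 3.64. It rests on two assumptions that the text does not confirm: that all 106 participants answered every item, and that a normal approximation to the t distribution is acceptable for the p value (used here only because the Python standard library has no t distribution). The function names are illustrative.

```python
from math import sqrt
from statistics import NormalDist


def one_sample_t(item_mean, item_sd, n, mu):
    """t statistic for H0: population mean equals mu, from summary statistics."""
    return (item_mean - mu) / (item_sd / sqrt(n))


def approx_two_sided_p(t):
    """Two-sided p value from a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


N = 106                  # assumption: no missing responses on any item
OVERALL_SAR_MEAN = 3.64  # overall SA reading mean reported in the text

# (mean, standard deviation) for selected items, copied from Table 4.
table4 = {2: (4.32, 0.74), 3: (3.90, 0.95), 11: (3.75, 1.06), 7: (3.34, 1.05)}

for item, (m, sd) in table4.items():
    t = one_sample_t(m, sd, N, OVERALL_SAR_MEAN)
    p = approx_two_sided_p(t)
    print(f"Item {item}: t({N - 1}) = {t:.2f}, approximate p = {p:.4f}")
```

Under these assumptions, a positive t indicates an item rated above the overall mean and a negative t an item rated below it; an exact p value would come from a t distribution with N - 1 degrees of freedom.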

Table 5 lists each SA writing item with its mean and standard deviation. The table also includes the results of one-sample t-tests comparing the mean of each item with the overall SA writing mean. Overall, the mean for SA writing was 3.24, with a standard deviation of 0.62. The one-sample t-tests demonstrated that the means for Item 14, Item 5, and Item 10 were significantly higher than the overall SA writing mean, and the means for Item 12, Item 8, and Item 11 were significantly lower. Adolescent Chinese learners of English identified their relative strengths as using appropriate spelling, punctuation, and capitalization when writing, writing about personal feelings and emotions, and writing a story ending that readers can clearly understand. However, they perceived their English writing ability as insufficient to keep readers in mind when writing a story, to write an opening paragraph that attracts readers’ attention, and to use appropriate metaphors when writing.

Table 5 SA Writing ratings, and one-sample t-test results

Item	Description	Mean	Standard Deviation	One-Sample t-test
14	I can use appropriate spellings, punctuations and capitalization when I am writing.	3.81	1.01	p < .001
5	I can write my personal feelings and emotions when I am writing my personal story.	3.57	0.91	p < .001
10	I can write a story ending that makes readers clearly understand what I am writing about.	3.53	1.04	p < .01
6	I can write a story step-by-step, introducing the characters, the problem/conflict, and then the solution.	3.38	1.07	p > .05
4	I can add dialogues to images when I am writing a story.	3.36	1.03	p > .05
7	I can separate a story into several paragraphs and make the idea of each paragraph clear.	3.36	1.08	p > .05
3	I can write vivid details about a story.	3.29	0.93	p > .05
1	I can write about a personal story from my life experience.	3.27	0.90	p > .05
2	I can write about a story about other characters.	3.20	0.97	p > .05
13	I can use appropriate grammar and sentence structures when I am writing.	3.14	0.92	p > .05
9	I can use different word choice when I’m writing a story.	3.12	0.94	p > .05
12	I keep my readers in mind when I am writing a story.	2.93	1.05	p < .01
8	I can write an opening paragraph that attracts readers’ attention.	2.83	1.03	p < .001
11	I can use metaphors when I am writing a story.	2.62	1.05	p < .001

4.3.2. RQ2: Is there a relationship between scores on three different reading comprehension tasks (free recall, sentence completion, and multiple choice, respectively) and SA scores of reading ability? Does the overall reading comprehension score correlate with SA scores of reading ability?

Spearman’s correlation coefficient, a non-parametric statistic that measures the strength and direction of the association between two variables, was used to examine the correlations because the distributions of the free recall, sentence completion, multiple choice, and overall reading comprehension scores violated the assumption of normality (Field, 2013) and did not match any theoretical distribution. Spearman’s rho (rs) was interpreted to explain the correlation. Results indicated that SA reading was significantly related to free recall (rs = .41, p < .0001), sentence completion (rs = .46, p < .0001), multiple choice (rs = .43, p < .0001), and the overall reading comprehension score (rs = .51, p < .0001). According to Plonsky and Oswald’s (2014) proposed general scale for correlations and effect sizes in L2 research, all of these correlations represented medium to large effect sizes. It can be concluded that SA reading was significantly related to free recall, sentence completion, multiple choice, and overall reading comprehension. See Figure 1 for the scatterplots of the correlations.

Figure 1 Scatterplots of the correlations
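
For readers unfamiliar with Spearman’s rho, the sketch below shows one common way to compute it: convert each variable to ranks (tied values share the average of their positions) and take the Pearson correlation of the ranks. This is a minimal illustration, not the study’s R analysis; the function names and the small data set are hypothetical.

```python
from statistics import mean


def average_ranks(values):
    """1-based ranks; tied values receive the mean of the positions they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the average ranks."""
    return pearson_r(average_ranks(x), average_ranks(y))


# Hypothetical data: mean SA reading rating and reading comprehension score.
sa_reading = [3.2, 4.1, 2.8, 3.9, 4.5, 3.0]
reading_comp = [10, 22, 6, 15, 30, 12]
print(f"rs = {spearman_rho(sa_reading, reading_comp):.2f}")
```

Because only the ordering of scores enters the calculation, the coefficient does not depend on the skewed score distributions reported in Table 3.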

4.3.3. RQ3: Does the writing performance score correlate with SA scores of writing ability?

Pearson correlation was used to test the association between SA writing and the subsequent writing performance score. The result revealed a significant positive relationship between SA writing and writing production (r = .30, p < .01), with a small to medium effect size (Plonsky & Oswald, 2014). It can be concluded that SA writing was positively related to subsequent writing production. See Figure 2 for the correlation scatterplot.

Figure 2 Correlation between subsequent writing performance and SA writing scores
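
The significance of a Pearson correlation is conventionally tested by converting r to a t statistic with n - 2 degrees of freedom. The sketch below applies that conversion to the reported r = .30. It assumes that all 106 participants contributed to the correlation, and it substitutes a normal approximation for the t distribution when computing the p value; both are assumptions for illustration, and the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def t_from_r(r, n):
    """t statistic for H0: rho = 0, given a sample correlation r and sample size n."""
    return r * sqrt((n - 2) / (1 - r * r))


def approx_two_sided_p(t):
    """Two-sided p value from a normal approximation to the t distribution."""
    return 2 * (1 - NormalDist().cdf(abs(t)))


r, n = 0.30, 106  # r from the text; n assumes no missing data
t = t_from_r(r, n)
print(f"t({n - 2}) = {t:.2f}, approximate p = {approx_two_sided_p(t):.4f}")
```

An exact p value would use a t distribution with n - 2 degrees of freedom, which gives a slightly larger value than the normal approximation.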

5. Discussions and Implications

The present findings revealed that, among adolescent Chinese learners of English in China, self-assessed reading and writing abilities were significantly correlated with subsequent reading comprehension and writing production. The results echoed previous research (e.g., Dolosic et al., 2016; Butler and Lee, 2010) showing that adolescent language learners are able to self-assess their language abilities. The correlation between SA reading and reading comprehension had a medium to large effect size, while the correlation between SA writing and writing production had a small to medium effect size, which is consistent with Ross’ (1998) conclusion that, in general, the correlation between SA and receptive skills is stronger than the correlation between SA and productive skills. However, the present study did not echo the previous finding that lower-proficiency learners tend to overestimate their ability (Alderson, 2006); instead, the findings revealed that adolescent Chinese learners of English at the beginning level could accurately self-assess their reading and writing abilities. The findings support the implementation of SA in the classroom with this unique group of learners, especially considering the benefits of SA, such as narrowing the mismatches between learner assessment and teacher assessment, connecting teachers and learners, identifying learning goals, individualizing instruction, identifying strengths and weaknesses, and supporting reading comprehension (Babaii et al., 2016; Black & William, 2010; Solano-Flores & Trumbull, 2006; Shore, Wolf & Heritage, 2016).

Implementing self-assessment in the classroom has practical value in helping adolescent Chinese learners of English continue to grow as independent language learners who are able to monitor their own strengths and weaknesses. In most secondary classrooms in China, classroom practices emphasize teacher-centered pedagogy, teacher evaluation, teaching towards examinations, rote learning, and memorization (Brown & Gao, 2015). One problem throughout this process is that learner autonomy, defined by Benson (2007) as the ability to take charge of one’s own learning, is relatively low or is not fully activated. Without adequate learner autonomy, learners are not able to independently set goals for learning, identify their own strengths and weaknesses, make decisions about the learning process, and then implement relevant remedies. The practice of self-assessment has the potential to foster learner autonomy among adolescent Chinese learners of English so that they can direct their own learning effectively and become more independent and responsible as lifelong learners and users of the language.

Another practical value of self-assessment is associated with learner motivation in the Chinese context. Under the influence of English as a global language, English language teaching and learning was introduced as a compulsory course in primary and secondary schools at the national level in China in the late 20th century (Wang & Lam, 2009; Hu, 2002). Traditionally regarded as a fair and reliable way to measure students’ abilities and learning outcomes, large-scale standardized tests are widely recognized in China and implemented on a massive scale by schools and by local and national agencies (Cheng & Qi, 2006). Students have to take numerous exams once they start school, and only those who succeed in those competitive exams have the opportunity to continue their education (Qi, 2004). English language assessment at secondary schools, as in other subjects, consists predominantly of large-scale standardized tests (Cheng, 2008), from school-level tests to the very high-stakes National College Entrance Exam. However, such standardized large-scale tests lower learners’ motivation and enthusiasm for learning (Carless & Wong, 2000, as cited in McKay, 2006). Self-assessment, which situates learners in a natural setting where teachers can also consistently monitor progress and provide feedback, can be expected to help decrease learners’ anxiety and promote learning motivation. Previous research has found that self-assessment was positively related to learners’ motivation (Butler and Lee, 2010); however, more research is needed to examine the relationship between the implementation of self-assessment and learner motivation in the Chinese context. For instance, consistent use of SA within the classroom environment might promote higher motivation and enthusiasm for English learning in China. A further step might be to examine whether consistent use of SA contributes to learners’ performance on large-scale standardized tests in China.

6. Conclusion

In conclusion, the study revealed that adolescent Chinese learners of English demonstrate the ability to self-assess their reading and writing abilities when using criterion-referenced self-assessment instruments. Self-assessment could serve as a useful tool in the classroom to help these learners identify their strengths and weaknesses in English language learning. Future research is needed to uncover whether consistent use of self-assessment will promote learner motivation, learner autonomy, and performance on high-stakes exams in the Chinese context.