LEXICAL RICHNESS IN ENGLISH LANGUAGE AND CULTURE

The objective of this research is to determine the lexical richness of the students of the English Language and Culture Department by analyzing their undergraduate theses. The researcher limited the data to undergraduate theses produced by students of batch 2010/2011 of the English Language and Culture Department. The researcher used AntWordProfiler, a text-profiling program created by Laurence Anthony. The data were processed with this software to obtain the number of types and tokens in each text. The researcher then used the type-token ratio (TTR) to measure lexical richness. The closer the TTR score is to 1, the higher the lexical richness. The results show that the students' lexical richness is quite low, since none of the students reached even 0.5.


INTRODUCTION
An undergraduate thesis is a type of academic composition that university students have to complete in order to earn their degree. This type of composition requires much effort from the students since it demands not only their knowledge and analytical skills but also their ability to write and organize a cohesive and coherent piece of writing. Many aspects have to be considered during the process of writing an undergraduate thesis. One of them is the use of vocabulary.
In the field of foreign language learning, vocabulary is one of the most important aspects for L2 (second language) learners to learn. Wilkins (1972) states, "… while without grammar very little can be conveyed, without vocabulary nothing can be conveyed" (pp. 111-112). In other words, Wilkins emphasizes that without grammar, a person can still communicate his or her intentions and feelings if he or she has sufficient vocabulary. However, if the person does not know any vocabulary of the language, he or she will not be able to communicate anything, regardless of his or her knowledge of the grammatical structure of the language.
L2 learners who are able to use a variety of vocabulary have an advantage over L2 learners with limited vocabulary, especially in writing, because they can draw on their vocabulary stock to form effective sentences. Moreover, Laufer and Nation (1995) state that the use of varied and appropriate vocabulary is important in producing a good composition (p. 307).
The variety of vocabulary use is referred to as lexical richness. In relation to writing, Laufer and Nation (1995) also suggest that "lexical richness is only one of a variety of factors that affect the overall quality of a piece of writing" (p. 308). In other words, lexical richness is considered one of the criteria in determining the quality of a composition. For a thesis to be considered of good quality, it has to fulfill the criterion of lexical richness, as Laufer and Nation (1995) stated.
Therefore, in this study, the researcher analyzed undergraduate theses written by students of the English Language and Culture Department of Bunda Mulia University, particularly students from batch 2010/2011, in order to find out the lexical richness of each thesis.

Statement of Problems & Research Questions
The researcher was interested in finding the lexical richness in the students' undergraduate theses. Moreover, since undergraduate theses consist of several different parts, the researcher was also interested in finding out which part has the highest lexical richness. Therefore, the researcher formulated two research questions as specified below:
1. What is the students' lexical richness in their undergraduate theses, and what is the implication?
2. Which part of the theses has the highest average of lexical richness?

Research Objectives & Significance
The research aims to find out the lexical richness of the students of batch 2010/2011 from the English Language and Culture Department of Bunda Mulia University. This research is significant because its results will provide input for the practice of teaching and learning English in the English Language and Culture Department, especially in terms of English vocabulary and academic writing. If the students' theses do not yield lexical richness as high as expected, the researcher hopes that this research can serve as a basis for improving the teaching and learning of English vocabulary and academic writing. If the research yields a satisfying result in lexical richness, the researcher hopes that the achievement can be maintained or, even better, improved further.

Aspects of 'Word'
In the area of linguistics, "word" can be divided further into several terms, namely token, type, lemma, and word family. Token is the narrowest concept, and it is usually referred to as "running words", that is, the total number of words that occur in one particular text. For instance, in the sentence "I had a plate of rice while my mother had a bowl of soup", there are fourteen tokens in total.
Type is a broader term than token, and it refers to the number of "unique word forms in a particular text" (Šišková, 2012, p. 27). For example, in the sentence "I had a plate of rice while my mother had a bowl of soup", there are eleven types in total. The verb "had", the article "a", and the preposition "of" each occur twice in the sentence. However, as types, each of them is counted only once.
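The token and type counts for the example sentence can be checked with a minimal sketch. The tokenizer below (lowercasing and splitting on runs of letters and apostrophes) is an assumption made for illustration and is not necessarily how AntWordProfiler tokenizes text.

```python
import re

def tokenize(text):
    """Lowercase the text and return its word tokens.

    Assumption: a token is a run of letters or apostrophes.
    """
    return re.findall(r"[a-z']+", text.lower())

sentence = "I had a plate of rice while my mother had a bowl of soup"

tokens = tokenize(sentence)   # every running word, repeats included
types = set(tokens)           # unique word forms only

print(len(tokens))  # 14
print(len(types))   # 11
```

The three repeated forms ("had", "a", "of") account for the difference between fourteen tokens and eleven types.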
A lemma is a group containing a word and its other forms which still belong to the same part of speech. For example, the words "have", "has", "had" and "having" belong to the same lemma since they are inflected forms of the root word "have" and all of them belong to the same part of speech.
Word family, however, has the broadest scope, and it contains root words along with their inflected and derived forms. In other words, a word family consists of a base word and its other forms, whether they share its part of speech or not.
Based on the explanation above, it can be seen that each term has its own description and scope. In linguistic studies, especially in corpus linguistics, it is very important to specify which unit or aspect of word is being discussed in order to avoid confusion.

Lexical Richness
According to Read (2000) and Daller, Milton and Treffers-Daller (2007), "lexical richness" is used as a general, overarching term. Several aspects fall under the term "lexical richness", which are: "… lexical diversity (the proportion of individual words in a text, i.e. the proportion between types and tokens), lexical variation (the same as lexical diversity but focused only on lexical words), lexical sophistication (the proportion of advanced words in a text), lexical density (the proportion of lexical words in the whole text) and lexical individuality (the proportion of words used by only one person in a group…" (qtd. in Šišková, 2012, p. 26).
Lexical richness is the term used to describe the vocabulary size that learners possess and the vocabulary that they actually use. Learners who are able to use varied vocabulary possess high lexical richness, and they are able to apply their vocabulary knowledge and communicate more effectively. Moreover, they are able to form more complex and colorful structures. For example, native speakers of a language are able to use different terms to refer to one thing, which in turn affects the style of their language use. L2 learners usually use only one term to refer to one item in order to ensure that the hearers or the readers understand which item they are referring to (Tarone & Swierzbin, 2009, p. 85).
As stated in the Introduction, lexical richness is closely related to quality of writing. In writing a composition, there are numerous factors that have to be considered, such as grammar, cohesion and coherence, organization of writing, flow of ideas, and of course, vocabulary. Laufer and Nation (1995) state that "a well-written composition, among other things, makes effective use of vocabulary. This need not be reflected in a rich vocabulary, but a well-used rich vocabulary is likely to have a positive effect on the reader" (p. 307). In other words, it is important for learners to have a sufficient vocabulary size, but it is more essential that they are able to use that vocabulary knowledge to produce a higher-quality piece of writing. The main purpose of learning vocabulary is to activate the learners' vocabulary knowledge so that they can use it when they communicate with other people. In brief, a very large stock of vocabulary is of little use if the learners are unable to draw on it when they are communicating.
A study conducted by Engber in 1993 (cited in Laufer & Nation, 1995, p. 307) shows that there is a significant correlation between lexical variation and holistic ratings of the quality of writing. This suggests that learners need to enrich their vocabulary knowledge and enhance their lexical richness if they want to improve the quality of their writing. Without variety of vocabulary within the composition, the content would sound repetitive, monotonous, and uninteresting to read.

Type-Token Ratio (TTR)
Šišková (2012) states that "measuring lexical richness is generally concerned with how many different words are used in a text (spoken or written)" (p. 26). Basically, in order to find out the lexical richness of a text, the number of different words is counted. However, this number also depends on the length of the text. There are numerous ways of measuring lexical richness. The most well-known and frequently used measure of lexical richness is the Type-Token Ratio (TTR) created by Templin (1957).
Type-Token Ratio (TTR) is calculated by counting the number of different words in a text (types) and the total number of words in the text (tokens). The number of types is then divided by the number of tokens. The closer the result is to one, the higher the lexical richness of the text (Tarone & Swierzbin, 2009, p. 85). Below is the formula of Type-Token Ratio cited in Šišková (2012, p. 28).

\[ \mathrm{TTR} = \frac{\text{number of types}}{\text{number of tokens}} \]
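As a worked illustration of the ratio described above, the sketch below computes TTR for the example sentence used earlier (eleven types, fourteen tokens) and then for the same sentence repeated three times. The function name `type_token_ratio` is hypothetical, and the repeated text is constructed only to show why TTR depends on text length.

```python
def type_token_ratio(tokens):
    """Return the number of unique tokens divided by the total number of tokens."""
    if not tokens:
        raise ValueError("TTR is undefined for an empty text")
    return len(set(tokens)) / len(tokens)

short_text = "i had a plate of rice while my mother had a bowl of soup".split()
short_ttr = type_token_ratio(short_text)

# Repeating the text adds tokens but no new types, so the ratio falls.
long_text = short_text * 3
long_ttr = type_token_ratio(long_text)

print(round(short_ttr, 3))  # 0.786  (11 / 14)
print(round(long_ttr, 3))   # 0.262  (11 / 42)
```

Because longer texts inevitably reuse words, TTR values are best compared across texts of similar length.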

AntWordProfiler
AntWordProfiler is free software developed by Laurence Anthony for corpus linguistics research. It is a profiling program similar to Paul Nation's RANGE program. The program can be downloaded for free at www.laurenceanthony.net, and the version that the researcher used is version 1.4.0. There are two types of tools in AntWordProfiler: the first is the general vocabulary profiling tool, and the second is the file viewer and editor tool (Anthony, 2012).
The vocabulary profile tool allows users to obtain statistical and frequency information about a text. There are three built-in base word lists in AntWordProfiler: the first one thousand words and the second one thousand words of the General Service List by Michael West, and the Academic Word List by Averil Coxhead. The file viewer and editor tool allows the user to view the details of the vocabulary profile.
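The idea of profiling a text against base word lists can be illustrated with a minimal sketch. This is not AntWordProfiler's implementation; the three tiny lists below are hypothetical placeholders and do not reflect the actual membership of the General Service List or the Academic Word List.

```python
from collections import Counter

# Hypothetical stand-in lists for illustration only (NOT real GSL/AWL contents).
BASE_LISTS = {
    "level_1": {"the", "a", "of", "in", "is", "this"},
    "level_2": {"improve"},
    "level_3": {"analyze", "data"},
}

def profile(tokens, base_lists):
    """Count how many tokens fall in each base list; unmatched tokens are 'off_list'."""
    counts = Counter()
    for token in tokens:
        for level, words in base_lists.items():
            if token in words:
                counts[level] += 1
                break
        else:
            counts["off_list"] += 1
    return counts

tokens = "this research will analyze the data of the thesis".split()
counts = profile(tokens, BASE_LISTS)

for level in ("level_1", "level_2", "level_3", "off_list"):
    print(level, counts[level])
```

With real base word lists loaded from files, the same loop would give the per-list token coverage that a vocabulary profiler reports.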

Previous Studies
Laufer and Nation (1995) conducted a study on lexical richness in L2 writing. To measure lexical richness, they proposed another measure called the Lexical Frequency Profile (LFP). The value of LFP is obtained by investigating the number of general words and the number of academic words that are used in a text. Their research aimed to analyze the reliability and validity of the LFP and to justify why the LFP is more useful in measuring lexical richness. The result showed that the LFP is reliable and valid, and that it can be used to identify a person's vocabulary development.
There are several similarities between Laufer and Nation's research and the present study. The first similarity is the main topic, which in both cases is lexical richness and its measurement. The second similarity is that both studies used the same research approach, which is quantitative and descriptive. Both studies used numerical data to draw conclusions while at the same time describing the process and the result of the research.
However, there are also some differences between the two studies. The first difference is the aim of the study. While Laufer and Nation's study attempted to demonstrate the advantages of using the LFP over other measures of lexical richness, the present study focused on examining the lexical richness reflected in students' undergraduate theses. The second difference is the measure used. The researcher specifically used the Type-Token Ratio (TTR) to determine lexical richness, while Laufer and Nation made use of the Lexical Frequency Profile (LFP). The third difference is the research design and framework, which resulted in different procedures in conducting the research.

RESEARCH METHODOLOGY
The research is quantitative, since its purpose is to use a quantitative measure to draw conclusions about the students' lexical richness.
The data were gathered from undergraduate theses written by students of batch 2010/2011 of the English Language and Culture Department of Bunda Mulia University. In total, there were 20 undergraduate theses written by students of batch 2010/2011. The researcher accessed the Bunda Mulia University library in order to acquire the digital versions of the theses. Out of the 20 files gathered, one file was corrupted and could not be used. As a result, the researcher used the remaining 19 files as the source of data.

Data Collection Procedures
To collect the data, the researcher conducted several steps as follows:
1. Collecting the files of undergraduate theses from batch 2010/2011 in the form of .doc files
2. Separating the chapters of each thesis file into different .doc files
3. Selecting the material to be included in the analysis. For instance, headings such as "Chapter 1" and so on were not included since they are not the students' genuine vocabulary usage; they are labels that have to be included in the thesis because they are specified in the thesis guideline.
4. Converting the .doc files into .txt files
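The selection step, in which headings such as "Chapter 1" are excluded, could be automated along the following lines. This is only a sketch: the pattern, the function name `strip_chapter_headings`, and the sample text are assumptions for illustration, and the study does not say whether this step was done by hand or by script.

```python
import re

# Assumption: a heading is a line consisting only of "Chapter" plus an
# Arabic or Roman numeral, e.g. "CHAPTER 1" or "Chapter II".
HEADING_PATTERN = re.compile(r"^\s*chapter\s+(\d+|[ivxlc]+)\s*$", re.IGNORECASE)

def strip_chapter_headings(text):
    """Remove heading-only lines while keeping ordinary sentences intact."""
    kept = [line for line in text.splitlines() if not HEADING_PATTERN.match(line)]
    return "\n".join(kept)

sample = (
    "CHAPTER 1\n"
    "Vocabulary matters in writing.\n"
    "Chapter II\n"
    "Chapter 2 reviews the theory."
)

cleaned = strip_chapter_headings(sample)
print(cleaned)
```

Matching the whole line means that a sentence that merely begins with "Chapter 2" is kept, since it is part of the student's own wording.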

Data Analysis Procedures
To analyze the data, the researcher conducted several steps as follows:
1. Processing the .txt files in Laurence Anthony's AntWordProfiler
2. Recording the number of word types and word tokens for each text in Ms. Excel
3. Using Ms. Excel to calculate the TTR values
4. Analyzing the TTR results using relevant theories
5. Concluding the analysis
6. Proposing suggestions and ideas for further research
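The counting and TTR steps above were carried out with AntWordProfiler and Ms. Excel. As a rough scripted equivalent, the sketch below produces one row of types, tokens, and TTR per text. The tokenizer, the function names, the file names, and the two short demo texts are all hypothetical, and counts from a different tokenizer may not match AntWordProfiler's exactly.

```python
import re
from pathlib import Path

def count_types_tokens(text):
    """Return (number of types, number of tokens) for a text."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return len(set(tokens)), len(tokens)

def ttr_rows(texts):
    """Build (name, types, tokens, TTR) rows from a {name: text} mapping."""
    rows = []
    for name, text in sorted(texts.items()):
        n_types, n_tokens = count_types_tokens(text)
        ttr = n_types / n_tokens if n_tokens else 0.0
        rows.append((name, n_types, n_tokens, round(ttr, 3)))
    return rows

def load_txt_folder(folder):
    """Read every .txt file in a folder into a {file name: text} mapping."""
    return {p.name: p.read_text(encoding="utf-8") for p in Path(folder).glob("*.txt")}

# Illustrative in-memory texts; real use would call load_txt_folder(...) instead.
demo_texts = {
    "student01_abstract.txt": "the study analyzes the data and the results",
    "student01_analysis.txt": "the data show the data show the data",
}

rows = ttr_rows(demo_texts)
for row in rows:
    print(row)
```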

Findings
The researcher loaded the .txt files containing only the Abstract section of the students' undergraduate theses into AntWordProfiler. The result is described in the following table.

Discussion
The researcher calculated the average TTR value for each student and the average TTR value for each section. The result can be seen in the following table. According to Tarone and Swierzbin (2009, p. 85), the closer the result is to one, the higher the lexical richness of the text. However, in the case of the average TTR values per student, it can be seen from the figure below (Figure 1) that the average does not even reach 0.5.

Figure 1. Average TTR Values per Student
Therefore, it can be assumed that the students tend to repeat the same vocabulary over and over again without considering alternative words for the vocabulary that they used. The repeated use of certain vocabulary produces a large number of word tokens but a small number of word types, which leads to low TTR values.
Furthermore, it can be seen from the figure below (Figure 2) that the average TTR values for each section of the theses also do not reach 0.5.

Figure 2. Average TTR Values per Section
The chart above shows that the vocabulary used in the Abstract section is more varied than in the other sections. A possible reason why the Abstract section has the highest TTR values is that, in this section, the students were told to summarize their whole research in a concise manner. Therefore, they needed to use different key vocabulary from the whole thesis in order to complete this section.
On the other hand, the section with the lowest lexical richness is the Data Analysis section. A possible cause of the low TTR values in the Data Analysis section is the repetition of the same key vocabulary in order to maintain the cohesion and coherence of the theses.
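The per-student and per-section averages discussed above can be computed as in the following sketch. The records are made-up illustrative values, not the study's data, and the student labels and the helper `average_by` are hypothetical.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical (student, section, TTR) records for illustration only.
records = [
    ("student_A", "Abstract", 0.50),
    ("student_A", "Data Analysis", 0.20),
    ("student_B", "Abstract", 0.40),
    ("student_B", "Data Analysis", 0.10),
]

def average_by(records, key_index):
    """Average the TTR values grouped by student (index 0) or section (index 1)."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key_index]].append(record[2])
    return {key: round(mean(values), 3) for key, values in groups.items()}

per_student = average_by(records, 0)
per_section = average_by(records, 1)

print(per_student)
print(per_section)
```

Grouping the same records two ways yields both views reported in the Discussion: one average per student and one average per thesis section.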

CONCLUSION AND SUGGESTIONS
To answer the first research question ("What is the students' lexical richness in their theses, and what is the implication?"), it can be seen from the data that the students' TTR values are far from 1. As a result, it can be concluded that the students' lexical richness is low. In their theses, students tend to use the same vocabulary repeatedly, which indicates that their vocabulary usage is quite limited.
To answer the second research question ("Which part of the theses has the highest average of lexical richness?"), it can be seen from the data that the highest average TTR value belongs to the Abstract section. In this section, the students have to summarize the whole content of their theses. Therefore, they use different vocabulary from different parts of their theses, which results in a more varied choice of words.
Based on these results, the researcher would like to give several suggestions for improving lexical richness and for related topics for future research. Since the students' lexical richness was found to be low, lecturers are advised to place more emphasis on the teaching and learning of English vocabulary in order to improve the students' lexical richness. Moreover, it is important to encourage the students to learn new vocabulary and to use a variety of vocabulary in their writing. For further research, the researcher suggests analyzing the students' creative writing. Since creative writing is less constrained than thesis writing, such a study might yield a completely different result.