ABSTRACT
Over the last decades, several studies have investigated the effect new media
have on language. These studies have led to the
conception of a new form of communication: the so-called ‘Computer-Mediated
Communication’ (CMC). Scholars focusing on the relationship between language and
CMC have also acknowledged the existence of a new variety of English, which has been
called ‘Internet English’ (IE). This language variety, which has its own word-formation
processes, grammar, and syntax, is employed in the context of CMC only.
Nonetheless, little research has been carried out to investigate, from a
sociolinguistic perspective, one of the new media involving CMC: the social networking
platforms. Therefore, our essay will point out the main sociolinguistic features of one of
these social media, namely Twitter. Like the users of previous CMC technologies,
who developed IE, have the users of Twitter developed a particular ‘Twenglish’
variety, strictly related to Twitter and possibly deriving from IE? Who are its speakers,
and what is the habitat in which they act? Can this habitat be analysed using traditional
ways of describing World Englishes like, for example, Kachru’s ‘Three Circles’ model?
These issues will be investigated through a quantitative analysis of a
corpus of 3,000 tweets (140-character messages). These messages were
collected from among all English-speaking users, and they were later organized and
classified according to different social variables. In addition, the creation of sub-corpora
reflecting macro-areas based on the variety of English the users are ‘supposed’ to speak
helped to establish whether similarities between the different varieties
could be identified.
Apart from a few distinctive features of non-native varieties of English, the research
suggests that there is no notable differentiation among speakers. Nevertheless, a
‘Twenglish’ variety, i.e. a variety of English strictly related to this particular social
network, was not identified. What can be found, though, is a way of adapting the language
to the medium through the development of contractions, specialized vocabulary, and
symbols, which are the same as those of IE. Our conclusion is therefore that the
language of Twitter is Internet English.
1. INTRODUCTION
1.1. Overview
The revolution that the Internet would bring to our lives could be foreseen
as early as the first years of the 1980s, when this computer network was starting to
spread around the world. However, the Internet has not only changed the way we
collect and share information around the world. It has also changed the way we get
in contact with other people and, more importantly, the way we communicate with them.
The Internet, in fact, offers a wide range of services which are specifically designed to
connect people around the globe, twenty-four hours a day and seven days a week, and
even in real time. Services such as Instant Messaging and Web-based
sites such as chat rooms, virtual worlds, blogs and forums are chiefly
responsible for the creation and development of a new variety of the language. This
variety, which is strictly related to the habitat of online communication and, more
generally, of Computer-Mediated Communication (CMC), has been referred to by several
scholars as ‘Internet English’ (Sun 2010:99; Crystal 2011:2). We will give further details
about this new language variety in Chapter 1.2; however, it is important to observe that
the source language from which this variety derives is English. This is not
surprising, since both the first network of computers and the Internet were created and
developed in the United States of America. Hints of the existence of Internet English can
be found from the opening of the Internet to the public in 1979 onwards; however, the 1990s are
the most prolific years, as far as the development of this new language variety is
concerned (Crystal 2011).
During the last three decades, several studies have been carried out in order to
further investigate the linguistic phenomena concerning the use of the language on the
Internet, which has been referred to in the most diverse ways: Netspeak, Netlish,
Webspeak, Weblish, and many other ‘-speaks’, which are a clear reference to George
Orwell’s fictional Newspeak language. This is the field of study of Internet Linguistics,
Computational Linguistics, Human Language Technology, and IT-based learning and
teaching, all of which focus on the relationship between ICT
(Information Communication Technology) and human language (Bodomo 2009). In other
words, the main subject of their study is the so-called Computer-Mediated
Communication.
Since the late 2000s, however, another Internet service has become popular,
possibly forcing the variety of English used in online communication to develop further:
social networking websites, which are specifically created and designed
to enhance social interactions on the Web. These new media certainly also have
an effect from a sociolinguistic perspective, since it is now possible for people to get in
contact with other cultures and other languages in a way that is unprecedented; and, of
course, language contact usually leads to language change and variation as well.
In our essay we would like to contribute to the previous studies concerning the
relationship between social networking websites and language change and variation by
analysing from a quantitative perspective the phenomena affecting the sociolinguistics of
social media. To do so, we will consider one of the most popular online social
networks, namely Twitter, and we will try to understand whether we can identify hints of
a ‘Twenglish’ language, i.e. a hypothetical variety of English which would be typical of
this social platform. The core instrument for this study will be a corpus of messages which
will be analysed according to different social variables such as age, gender, and place of
residence of the users.
Our study is organised as follows: Chapter 1 offers a literature review concerning
the previous studies on CMC and on the main issues regarding this new field of studies,
as well as a detailed description of the social networking website we are going to analyse;
finally, we will question the appropriateness of traditional World English models in the
context of online communication.
Chapter 2 highlights the three main questions of this research,
regarding the identity of the users of Twitter, the language they write in, and how they
can be classified with respect to their own language.
Chapter 3 gives further details on the methodology with which our study
will be carried out, focusing on possible methodological constraints which may influence
its outcomes.
Chapter 4 goes into detail as far as the identity of the users is concerned:
an analysis of the three main social variables will be offered, giving an idea of who the
hypothetical writers of ‘Twenglish’ are.
Chapter 5 offers a further analysis of the relationship between the users
of Twitter and their language; furthermore, we will finally establish whether a
‘Twenglish’ variety deriving from English actually exists and, if not, what language
is used in the corpus, and how its speakers should be classified.
Finally, we will sum up the results of our research in Chapter 6, which may
serve as a starting point for further studies on the sociolinguistics of Twitter.
1.2. Computer-Mediated Communication and Internet English
The recent interest of Computer Science in the development of
electronic communication devices has brought about a change even in the name of this
branch of studies. In fact, Computer Science is nowadays also referred to as
ICT, namely Information Communication Technology, which reflects the awareness of the
importance of new ways of communicating that are strictly related to computers and
new media. The emergence of new fields of research
concerning the relationship between ICT and language is therefore no surprise. The central subject of these
studies is Computer-Mediated Communication (or CMC), a term which refers to any
communicative interaction which is mediated by the use of two or more electronic
devices.
One of the main definitions of CMC was provided by December (1996:24):
Internet-based, computer-mediated communication involves information exchange that takes
place on the global, cooperative collection of networks.
As can be observed, this is a broad definition which nonetheless grasps the
fundamental essence of CMC, since it points out the global dimension of this kind of
communication. However, this definition does not take into account the number of different
ways in which this kind of communication can take place.
A more recent definition is offered by Bodomo (2009:6), who points out that:
CMC is defined as the coding and decoding of linguistic and other symbolic systems between
sender and receiver for information processing in multiple formats through the medium of the
computer and allied technologies such as PDAs, mobile phones, and blackberries; and through
media like the internet, email, chat systems, text messaging, YouTube, Skype, and many more to
be invented.
This more recent definition takes into account both of December’s main points;
however, Bodomo points out that this kind of communication can happen both through
mobile devices such as mobile phones and tablets, and through personal computers.
Bodomo therefore considers as vehicles of CMC not only the Internet and its facilities,
but also portable devices and their services, such as the Short Message Service (SMS).