Advances in Social Sciences Research Journal – Vol. 9, No. 5
Publication Date: May 25, 2022
DOI:10.14738/assrj.95.12257. Kang, N. (2022). On Keep and Keep on: A Corpora-based Analysis. Advances in Social Sciences Research Journal, 9(5). 12-21.
Services for Science and Education – United Kingdom
On Keep and Keep on: A Corpora-based Analysis
Namkil Kang
Far East University, South Korea
ABSTRACT
The main goal of this paper is to provide a comparative analysis of keep and keep on
in the TV Corpus and the British National Corpus (BNC). In the TV Corpus, keep was
favored over keep on overall from the 1950s to the 2010s, and it was more frequent
than keep on in every decade of that period. Both keep and keep on reached a peak
(28,586 tokens vs. 1,303 tokens) in the 2010s. In the BNC, keep and keep on are
71.42% similar across the seven genres. In terms of the Euclidean distance between
keep and keep on, the two are nearest to each other in the fiction genre and furthest
from each other in the spoken genre. The BNC also shows that keep going and keep on
going are the most preferred combinations (430 tokens vs. 24 tokens) among British
speakers. Finally, 43.58% of the thirty-nine gerunds collocate with both keep and
keep on.
Keywords: TV Corpus, BNC, keep, keep on, type, token
INTRODUCTION
According to Murphy (2016, 2019), keep and keep on are used synonymously, as illustrated in
(1):
(1) Keep or keep on (=do something continuously or repeatedly)
(Murphy 2019: 106)
The main purpose of this paper is to provide a comparative analysis of keep and keep on in the
TV Corpus (TVC) and the British National Corpus (BNC). First, we consider which type is the
preferable one in the TV programs of America, the UK, Ireland, Canada, Australia, and New
Zealand, observe the diachronic use of keep and keep on, and compare them. Second, we
consider the genre frequency of keep and keep on in the BNC. By examining the use of keep and
keep on in seven genres, we can see how much alike they are. Additionally, by measuring the
Euclidean distance between keep and keep on, we can see how close they are in each genre.
Finally, after considering the collocations of keep and keep on, we investigate the similarity
between keep and keep on. This paper is organized as follows. In section 2, we show that
keep was favored over keep on from the 1950s to the 2010s, that keep and keep on reached a
peak (28,586 tokens vs. 1,303 tokens) in the 2010s, and that keep was more frequent than
keep on in every decade of that period. In section 3, we contend that keep and keep on are
71.42% similar across the seven genres of the BNC. We also maintain that keep is nearest to
keep on in the fiction genre and furthest from it in the spoken genre. In section 4, we
argue that keep going and keep on going are the most
preferred (430 tokens vs. 24 tokens) by British speakers. We further argue that 43.58% of the
thirty-nine gerunds collocate with both keep and keep on.
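The genre-level comparison outlined above can be illustrated with a short Python sketch. It is a minimal sketch under stated assumptions, not necessarily the procedure used in this paper: it assumes that each type's genre counts are first converted to percentage shares of that type's BNC total, and that the per-genre Euclidean distance is the absolute difference between the two shares. The counts are those reported for the BNC in Table 2; the names GENRES, KEEP, KEEP_ON, percent_shares, and genre_distances are hypothetical and introduced only for this example.

```python
# BNC genre counts copied from Table 2 of the paper.
GENRES = ["spoken", "fiction", "magazine", "newspaper",
          "non-academic", "academic", "misc"]
KEEP = [995, 607, 216, 294, 165, 53, 341]
KEEP_ON = [91, 72, 21, 51, 27, 11, 60]


def percent_shares(counts):
    """Convert raw genre counts to percentage shares of the type's total."""
    total = sum(counts)
    return [100 * c / total for c in counts]


def genre_distances(counts_a, counts_b):
    """Per-genre distance between two types.

    ASSUMPTION: the distance is taken on percentage shares. In one
    dimension the Euclidean distance reduces to the absolute difference.
    """
    shares_a = percent_shares(counts_a)
    shares_b = percent_shares(counts_b)
    return {g: abs(a - b) for g, a, b in zip(GENRES, shares_a, shares_b)}


distances = genre_distances(KEEP, KEEP_ON)
nearest = min(distances, key=distances.get)
furthest = max(distances, key=distances.get)

for genre, d in sorted(distances.items(), key=lambda item: item[1]):
    print(f"{genre:13s} {d:5.2f}")
print("nearest:", nearest, "| furthest:", furthest)
```

Under this assumption, the smallest distance falls in the fiction genre and the largest in the spoken genre, which agrees with the finding stated above. Working with shares rather than raw counts keeps the comparison from being dominated by the much larger overall frequency of keep.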
THE TV CORPUS
In what follows, we observe the diachronic use of keep and keep on from the 1950s to the 2010s
and compare them. Table 1 shows the use and frequency of keep and keep on from the 1950s to
the 2010s:
Table 1 Frequency of keep and keep on in the TV Corpus
Type      Keep      Keep on
All       49,298    2,467
1950s     158       21
1960s     1,022     109
1970s     1,014     95
1980s     1,677     172
1990s     3,862     197
2000s     12,979    570
2010s     28,586    1,303
US/CA     42,234    2,012
UK/IE     5,845     391
AU/NZ     966       50
MISC      253       14
An important question is “Which type is the preferable one in the TV programs of six countries?”
Table 1 indicates that keep was preferred over keep on from the 1950s to the 2010s. To be more
specific, the overall frequency of keep is 49,298 tokens, whereas that of keep on is 2,467
tokens, so keep is roughly twenty times as frequent as keep on. This in turn suggests that the
type keep was preferred over the type keep on by the celebrities of the six countries. The
following graph shows the diachronic use of keep and keep on from the 1950s to the 2010s:
Figure 1 Frequency of keep and keep on from the 1950s to the 2010s
[Figure 1: line graph; x-axis: Year (1950 to 2010); y-axis: Frequency (0 to 35,000); series: Keep, Keep on]
There was a marked increase (a rise of 864 tokens) in the frequency of keep from the 1950s to
the 1960s, followed by a slight decline (a fall of 8 tokens) from the 1960s to the 1970s. The
frequency then rose steadily (by 663 tokens) from the 1970s to the 1980s and sharply (by 2,185
tokens) from the 1980s to the 1990s. The most dramatic increase (a rise of 24,724 tokens) came
between the 1990s and the 2010s. Keep thus had its highest frequency (28,586 tokens) in the
2010s and its lowest frequency (158 tokens) in the 1950s, which indicates that it was most
preferred in the TV programs of the six countries in the 2010s and least preferred in the
1950s. By region, keep was the most preferred (42,234 tokens) by American and Canadian
celebrities, followed by British and Irish ones (5,845 tokens), and Australian and New Zealand
ones (966 tokens).
The frequency of keep on rose slightly (by 88 tokens) from the 1950s to the 1960s and then fell
slightly (by 14 tokens) from the 1960s to the 1970s. It increased gradually (by 102 tokens)
from the 1970s to the 1990s and rose dramatically (by 1,106 tokens) from the 1990s to the
2010s. Keep on reached a peak (1,303 tokens) in the 2010s, which in turn suggests that it was
most preferred in that period. Nevertheless, keep was more frequent than keep on in every
decade from the 1950s to the 2010s, and the gap between the two widened considerably after the
1990s. By region, keep on was the most preferred (2,012 tokens) by American and Canadian
celebrities, followed by British and Irish ones (391 tokens), and Australian and New Zealand
ones (50 tokens). We thus conclude that keep is preferable to keep on in the TV programs of six
countries (America, the UK, Canada, Australia, New Zealand, and Ireland) from the 1950s to the
2010s.
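The decade-to-decade changes discussed in this section can be recomputed from Table 1 with a short Python sketch. It is offered only as an illustration: the names DECADES, KEEP, KEEP_ON, and decade_changes are hypothetical, and the figures are the raw token counts reported in Table 1, with no normalization for the size of the corpus in each decade.

```python
# Raw token counts per decade, copied from Table 1 of the paper.
DECADES = ["1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s"]
KEEP = [158, 1022, 1014, 1677, 3862, 12979, 28586]
KEEP_ON = [21, 109, 95, 172, 197, 570, 1303]


def decade_changes(counts):
    """Change in raw token count from each decade to the next."""
    return [later - earlier for earlier, later in zip(counts, counts[1:])]


keep_changes = decade_changes(KEEP)        # [864, -8, 663, 2185, 9117, 15607]
keep_on_changes = decade_changes(KEEP_ON)  # [88, -14, 77, 25, 373, 733]

# Ratio of keep to keep on in each decade (raw counts, not normalized).
ratios = [k / o for k, o in zip(KEEP, KEEP_ON)]

for decade, k, o, r in zip(DECADES, KEEP, KEEP_ON, ratios):
    print(f"{decade}: keep={k:6d}  keep on={o:5d}  ratio={r:5.2f}")

# keep outnumbers keep on in every decade.
print(all(k > o for k, o in zip(KEEP, KEEP_ON)))
```

Summing adjacent entries reproduces the multi-decade figures cited above: the last two entries of keep_changes add up to 24,724 (1990s to 2010s), and the last two entries of keep_on_changes add up to 1,106.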
KEEP AND KEEP ON IN THE BNC
In the following, we provide a frequency analysis of keep and keep on in the seven genres of the
BNC. Table 2 shows the use and frequency of keep and keep on in the BNC:
Table 2 Frequency of keep and keep on in the BNC
GENRE     ALL      SPOKEN   FICTION   MAGAZINE   NEWSPAPER   NON-ACAD   ACADEMIC   MISC
Keep      2,671    995      607       216        294         165        53         341
Keep on   333      91       72        21         51          27         11         60
An important question is “Which type is the preferable one for the British?” Table 2 indicates
that the type keep is the preferred one in the UK. More specifically, the overall frequency of
keep is 2,671 tokens, whereas that of keep on is 333 tokens, so keep is about eight times as
frequent as keep on. This in turn implies that keep is preferable to keep on in the UK.
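The overall ratio can be checked against Table 2, and the same calculation can be repeated for each genre. The following Python sketch is illustrative only; BNC_COUNTS and the other variable names are hypothetical, and the figures are the raw counts from Table 2.

```python
# (keep, keep on) token counts per BNC genre, copied from Table 2.
BNC_COUNTS = {
    "spoken": (995, 91),
    "fiction": (607, 72),
    "magazine": (216, 21),
    "newspaper": (294, 51),
    "non-academic": (165, 27),
    "academic": (53, 11),
    "misc": (341, 60),
}

total_keep = sum(keep for keep, _ in BNC_COUNTS.values())
total_keep_on = sum(keep_on for _, keep_on in BNC_COUNTS.values())
overall_ratio = total_keep / total_keep_on

# Ratio of keep to keep on within each genre.
genre_ratios = {genre: keep / keep_on
                for genre, (keep, keep_on) in BNC_COUNTS.items()}

print(f"overall: {total_keep} / {total_keep_on} = {overall_ratio:.2f}")
for genre, ratio in genre_ratios.items():
    print(f"{genre:13s} {ratio:5.2f}")
```

The genre totals sum to the ALL column of Table 2 (2,671 and 333), and keep is more frequent than keep on in each of the seven genres.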
Notably, both keep and keep on are most frequent (995 tokens vs. 91 tokens) in the spoken
genre of the BNC. Quite interestingly, keep and keep on show the same property in rank-one,