Advances in Social Sciences Research Journal – Vol. 9, No. 5

Publication Date: May 25, 2022

DOI:10.14738/assrj.95.12257. Kang, N. (2022). On Keep and Keep on: A Corpora-based Analysis. Advances in Social Sciences Research Journal, 9(5). 12-21.

Services for Science and Education – United Kingdom

On Keep and Keep on: A Corpora-based Analysis

Namkil Kang

Far East University, South Korea

ABSTRACT

The main goal of this paper is to provide a comparative analysis of keep and keep on in the TV Corpus and the British National Corpus (BNC). In the TV Corpus, keep was more frequent than keep on in every decade from the 1950s to the 2010s, and both reached a peak (28,586 tokens vs. 1,303 tokens) in the 2010s. In the BNC, keep is 71.42% the same as keep on across the seven genres. In terms of the Euclidean distance between keep and keep on, the two are nearest to each other in the fiction genre and furthest apart in the spoken genre. The BNC also shows that keep going and keep on going are the most preferred collocations (430 tokens vs. 24 tokens) among the British. Finally, 43.58% of the thirty-nine gerunds are collocations of both keep and keep on.

Keywords: TV Corpus, BNC, keep, keep on, type, token

INTRODUCTION

According to Murphy (2016, 2019), keep and keep on are used synonymously, as illustrated in (1):

(1) Keep or keep on (=do something continuously or repeatedly)

(Murphy 2019: 106)

The main purpose of this paper is to provide a comparative analysis of keep and keep on in the TV Corpus (TVC) and the British National Corpus (BNC). First, we consider which type is preferred in the TV programs of America, the UK, Ireland, Canada, Australia, and New Zealand, observe the diachronic use of keep and keep on, and compare the two. Second, we consider the genre frequency of keep and keep on in the BNC. By examining their use in seven genres, we can see how much alike they are, and by measuring the Euclidean distance between them, we can see how close they are in each genre. Finally, after considering the collocations of keep and keep on, we investigate the similarity between the two.

This paper is organized as follows. In section 2, we show that keep was favored over keep on in every decade from the 1950s to the 2010s and that both reached a peak (28,586 tokens vs. 1,303 tokens) in the 2010s. In section 3, we contend that keep is 71.42% the same as keep on in the seven genres of the BNC. We also maintain that keep is nearest to keep on in the fiction genre and furthest from it in the spoken genre. In section 4, we argue that keep going and keep on going are the most

preferred (430 tokens vs. 24 tokens) by the British. We further argue that 43.58% of the thirty-nine gerunds are collocations of both keep and keep on.

THE TV CORPUS

In what follows, we observe the diachronic use of keep and keep on from the 1950s to the 2010s

and compare them. Table 1 shows the use and frequency of keep and keep on from the 1950s to

the 2010s:

Table 1 Frequency of keep and keep on in the TV Corpus

Type     Keep      Keep on
All      49,298    2,467
1950s    158       21
1960s    1,022     109
1970s    1,014     95
1980s    1,677     172
1990s    3,862     197
2000s    12,979    570
2010s    28,586    1,303
US/CA    42,234    2,012
UK/IE    5,845     391
AU/NZ    966       50
MISC     253       14

An important question is which type is preferred in the TV programs of the six countries. Table 1 indicates that keep was preferred over keep on from the 1950s to the 2010s: the overall frequency of keep is 49,298 tokens, whereas that of keep on is 2,467 tokens, so keep is roughly twenty times as frequent as keep on. The following graph shows the diachronic use of keep and keep on from the 1950s to the 2010s:

Figure 1 Frequency of keep and keep on from the 1950s to the 2010s

[Graph: x-axis Year (1950 to 2010), y-axis Frequency (0 to 35,000); series: Keep, Keep on]

The frequency of keep rose sharply from the 1950s to the 1960s (a rise of 864 tokens) and then dipped slightly from the 1960s to the 1970s (a fall of 8 tokens). It increased steadily from the 1970s to the 1980s (a rise of 663 tokens) and rose sharply from the 1980s to the 1990s (an increase of 2,185 tokens). The most dramatic increase came between the 1990s and the 2010s (an increase of 24,724 tokens). Keep had its highest frequency (28,586 tokens) in the 2010s and its lowest frequency (158 tokens) in the 1950s, which indicates that it was most preferred in the TV programs of the six countries in the 2010s and least preferred in the 1950s. By region, keep was most frequent (42,234 tokens) in American and Canadian programs, followed by British and Irish ones (5,845 tokens) and Australian and New Zealand ones (966 tokens).

The frequency of keep on rose slightly from the 1950s to the 1960s (a rise of 88 tokens) and fell slightly from the 1960s to the 1970s (a decrease of 14 tokens). It increased gradually from the 1970s to the 1990s (a rise of 102 tokens) and rose dramatically from the 1990s to the 2010s (a rise of 1,106 tokens), reaching a peak (1,303 tokens) in the 2010s. Keep was more frequent than keep on in every decade from the 1950s to the 2010s, and the gap between the two widened considerably after the 1990s. By region, keep on was most frequent (2,012 tokens) in American and Canadian programs, followed by British and Irish ones (391 tokens) and Australian and New Zealand ones (50 tokens). We thus conclude that keep is preferred over keep on in the TV programs of six countries (America, the UK, Canada, Australia, New Zealand, and Ireland) from the 1950s to the 2010s.
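
The decade-to-decade changes reported in this section can be re-derived from the counts in Table 1. The following is a minimal sketch, not the author's procedure: the variable names and the helper `change` are hypothetical, and the figures are raw token counts copied from Table 1.

```python
# Token counts copied from Table 1 (TV Corpus), ordered from the 1950s to the 2010s.
decades = ["1950s", "1960s", "1970s", "1980s", "1990s", "2000s", "2010s"]
keep = [158, 1022, 1014, 1677, 3862, 12979, 28586]
keep_on = [21, 109, 95, 172, 197, 570, 1303]


def change(series, start, end):
    """Token difference between two decades (positive means a rise)."""
    return series[decades.index(end)] - series[decades.index(start)]


# Changes between consecutive decades for each type.
keep_steps = [b - a for a, b in zip(keep, keep[1:])]
keep_on_steps = [b - a for a, b in zip(keep_on, keep_on[1:])]

# Decade in which each type peaks.
keep_peak = decades[keep.index(max(keep))]
keep_on_peak = decades[keep_on.index(max(keep_on))]

# True if keep outnumbers keep on in every decade.
keep_always_ahead = all(k > ko for k, ko in zip(keep, keep_on))

print(keep_steps)
print(keep_on_steps)
print(keep_peak, keep_on_peak, keep_always_ahead)
print(change(keep, "1990s", "2010s"), change(keep_on, "1990s", "2010s"))
```

Because the counts are raw tokens, the sketch only reproduces the differences quoted in the text; it says nothing about how large each decade's subcorpus is.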

KEEP AND KEEP ON IN THE BNC

In the following, we provide a frequency analysis of keep and keep on in the seven genres of the

BNC. Table 2 shows the use and frequency of keep and keep on in the BNC:

Table 2 Frequency of keep and keep on in the BNC

GENRE      ALL      SPOKEN   FICTION   MAGAZINE   NEWSPAPER   NON-ACADEMIC   ACADEMIC   MISC
Keep       2,671    995      607       216        294         165            53         341
Keep on    333      91       72        21         51          27             11         60

An important question is which type is preferred by the British. Table 2 indicates that keep is the preferred type in the UK: the overall frequency of keep is 2,671 tokens, whereas that of keep on is 333 tokens, so keep is about eight times as frequent as keep on.
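
The ratio above, and the per-genre Euclidean distance mentioned in the introduction, can be illustrated with the counts in Table 2. The sketch below is hedged: this section does not state how the distance was computed, so the code assumes that each genre count is first converted to a percentage of its type's total and that the distance is taken between those two percentages. The names `shares` and `distance` are hypothetical.

```python
import math

# Token counts copied from Table 2 (BNC), one entry per genre.
genres = ["spoken", "fiction", "magazine", "newspaper",
          "non-academic", "academic", "misc"]
keep = [995, 607, 216, 294, 165, 53, 341]
keep_on = [91, 72, 21, 51, 27, 11, 60]

# Overall ratio of keep to keep on (2,671 vs. 333 tokens).
ratio = sum(keep) / sum(keep_on)


def shares(counts):
    """Convert raw counts to percentages of their own total (an assumption)."""
    total = sum(counts)
    return [100 * c / total for c in counts]


keep_pct = shares(keep)
keep_on_pct = shares(keep_on)

# One-dimensional Euclidean distance per genre, i.e. the absolute
# difference between the two percentage shares.
distance = {
    g: math.dist([a], [b])
    for g, a, b in zip(genres, keep_pct, keep_on_pct)
}

nearest = min(distance, key=distance.get)
furthest = max(distance, key=distance.get)

print(round(ratio, 2))
for g in genres:
    print(g, round(distance[g], 2))
print(nearest, furthest)
```

Under this percentage-share assumption the smallest distance falls in the fiction genre and the largest in the spoken genre, which agrees with the claim made in the abstract. With raw token counts instead of percentages the ordering would differ, so the choice of normalization matters.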

Both keep and keep on are most frequent in the spoken genre of the BNC (995 tokens vs. 91 tokens). Quite interestingly, keep and keep on show the same property in rank-one,