% -*- TeX-engine: luatex; -*-
% Copyright (C) 2023 Wing Hei Chan
% This work is licensed under a Creative Commons
% Attribution-ShareAlike 4.0 International license. To view a copy of
% this license, visit
% <https://creativecommons.org/licenses/by-sa/4.0/>.
\documentclass[12pt, a4paper]{report}
\usepackage{imakeidx}
\usepackage{setspace}
\usepackage[type=CC, modifier=by-sa, version=4.0]{doclicense}
\usepackage[margin=1in]{geometry}
\usepackage{lua-widow-control}
\usepackage{microtype}
\usepackage{mathtools}
\usepackage{unicode-math}
\usepackage[style]{abstract}
\usepackage[style=unified]{biblatex}
\usepackage[style=british]{csquotes}
\usepackage{datetime}
\usepackage{enumitem}
\usepackage{titlesec}
\usepackage{ebproof}
\usepackage{tikz}
\usepackage[linguistics]{forest}
\usepackage{linguex}
\onehalfspacing
\setlength{\jot}{0pt}
\setlist{noitemsep}
\titleformat{\chapter}{\normalfont\huge\bfseries}{\thechapter}{1em}{}
\titlespacing*{\chapter}{
0pt}{3.5ex plus 1ex minus .2ex}{2.3ex plus .2ex}
\setmainfont{Libertinus Serif}
\setmathfont{Libertinus Math}
\setmonofont[Scale=MatchUppercase]{Iosevka}
\addbibresource{./reference.bib}
\makeindex[intoc]
\newdateformat{monthyyyy}{\monthname[\THEMONTH] \THEYEAR}
\DeclareNameWrapperFormat{labelname:poss}{#1's}
\newrobustcmd*{\posscitealias}{%
\AtNextCite{\DeclareNameWrapperAlias{labelname}{labelname:poss}}}
\newrobustcmd*{\posscite}{\posscitealias\textcite}
\usetikzlibrary{automata}
\tikzset{auto}
\forestset{auto/.style={for root=baseline, for tree={calign=first}}}
\newcommand{\context}{\mathrel{/}}
\newcommand{\cuhk}{The Chinese University of Hong Kong}
\newcommand{\gap}{\underline{\hspace{1em}}}
\newcommand{\shift}{\hspace*{1em}}
\newcommand{\textemph}[1]{\textsc{#1}}
\newcommand{\textfeat}[1]{\textsc{#1}}
\newcommand{\textfor}[1]{\textit{#1}}
\newcommand{\textform}[1]{\textit{#1}}
\newcommand{\textgls}[1]{`#1'}
\newcommand{\textphon}[1]{[#1]}
\newcommand{\textterm}[1]{\textsc{#1}\index{#1}}
\renewcommand{\implies}{\Rightarrow}
\renewcommand{\ExLBr}{(\thechapter.}
\renewenvironment{flushleft}{\raggedright}{}
\begin{document}
\pagenumbering{alph}
\begin{titlepage}
\vspace*{\fill}
\begin{center}
\begin{Large}
\bfseries
Subregular Complexity of Tonal Systems:
Case Studies of Chinese Languages
\end{Large}
by
CHAN Wing Hei
under
Professor LAI Yee King Regine
\bigskip
A Thesis Submitted in Partial Fulfilment of
the Requirement for the Degree of
Bachelor of Arts
in
Linguistics
\bigskip
\cuhk
\monthyyyy\formatdate{08}{05}{2023}
\end{center}
\pretocmd{\doclicenseLongText}{%
Copyright \copyright\ 2023 Wing Hei Chan\par}{}{}
\doclicenseThis
\vspace*{\fill}
\end{titlepage}
\cleardoublepage
\pagenumbering{roman}
\chapter*{Abstract}
\addcontentsline{toc}{chapter}{Abstract}
Phonological transformations based on finite-state transducers have
held an important status since their introduction under the guise of
rewrite rules. Finite-state transducers represent regular
relations, which are closed under composition. This means that an
apparently complex system of phonological rules can be compiled into a
single finite-state transducer. On the other hand, regular relations
are still too powerful as a probable upper bound on the complexity of
phonological transformations. For example, regular relations can
encode a transformation that applies only to inputs with an even
number of terms, which is obviously unattested as a phonological rule.
Recently, the so-called subregular classes, that is, classes even
weaker than regular relations, have proved to be a more probable upper
bound. Such classes are conceptualized as a hierarchy,
not unlike the traditional Chomsky hierarchy in formal language
theory. The remaining problem is how many phonological phenomena are
explainable and, more importantly, \textemph{not} explainable by each
class. In this inquiry, tonal systems pose an interesting challenge,
and thus require a more in-depth investigation.
Traditionally, tonal systems are accounted for by nonlinear phonology,
which involves the use of multiple phonological tiers and their
associations. This process can be carefully separated into two ideas:
%
\begin{enumerate}
\item Tones can be independently represented on a separate tier just
as segments are;
\item The associations between phonological tiers can be manipulated
in a nontrivial way.
\end{enumerate}
%
In this thesis, the two ideas are examined under the lens of attested
tonal systems found in Chinese languages with vastly different tonal
transformations. The implications apply not only to tonal systems,
but also to the interfaces between linguistic levels in general.
\tableofcontents
\cleardoublepage
\pagenumbering{arabic}
\chapter{Theoretical Background}
This chapter introduces the theoretical foundations needed for the
interpretation of tonal systems. The theories are presented from two
perspectives, namely theoretical and computational phonology, which
utilize different methodologies. Although this thesis uses a
computational approach, the theoretical implication requires an
understanding of relevant approaches as well.
Specifically, the focus is placed on phonological
\textterm{transformations} as opposed to \textterm{phonotactics}.
Their distinction corresponds to the distinction between transducers
and acceptors in automata theory, which reflect translation and
membership problems respectively. Loosely speaking, a transducer is a
function of type \(\alpha \to \beta \to \mathit{Boolean}\), while an
acceptor is one of type \(\alpha \to \mathit{Boolean}\), that is, a
predicate.\footnote{For deterministic transducers, the types are
isomorphic to \(\alpha \to \beta\).}\footnote{Unless a gradient
interpretation of phonology is required, in which case the output
type can be, for example, an interval type.} The latter will be
briefly mentioned as the theories are introduced.
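The type distinction can be sketched in Python; the particular
predicates below are invented for illustration and drawn from no
analysis in this thesis:

```python
# An acceptor decides membership of one sequence; a transducer, viewed
# relationally, decides whether an input-output pair is in the relation.

def acceptor(xs):
    # toy predicate: accept sequences containing no nasal segment "n"
    return all(x != "n" for x in xs)

def transducer(xs, ys):
    # toy relation: ys is xs with every "a" rewritten to "b"
    return list(ys) == ["b" if x == "a" else x for x in xs]

# the deterministic case collapses to a plain function of type α → β
def transduce(xs):
    return ["b" if x == "a" else x for x in xs]
```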
Given the theoretical background, this chapter then explains the
rationales behind the chosen methodology as well as the expected
results. In particular, this thesis has the goal of clarifying the
unique operations on autosegmental representations that empower them
to account for certain tonal transformations.
\section{Theoretical Phonology}
This section details the presentations of phonological transformations
in theoretical phonology. Phonological transformations have been
employed by two allegedly divergent formalisms, often named linear and
nonlinear phonology. As explained in the later section on the
computational perspective, this categorization fails to recognize the
actual properties that render computationally more complex formalisms
necessary. Nonetheless, this section will follow the traditional
categorization.
This section also discusses the organization of grammar. As will
become clear once computational models are introduced, organization
only matters in the compositions of phonological transformations,
contrary to the common conception that cyclicity and ordering are
inherent properties of phonological systems. Lexical phonology is
mentioned for its significance in the arguments against cyclicity.
The discussion of \textterm{optimality theory} as in
\textcite{ps93otcigg} is intentionally omitted. The reason is not
that optimality theory is a worse formalism, but merely that its
formalization results in computational devices out of the scope of
this thesis. Specifically, optimality theory, due to its
constraint-optimizing nature, requires constraint optimizers rather
than transducers, which are potentially powerful enough to
overgenerate. For such an implementation based on dynamic
programming, refer to \textcite{t95cot}. For discussions of the
computational complexity of optimality theory, refer to
\textcite{e97egpot, i06sptotci, hkr09ecot}, among others.
\subsection{Linear Phonology}
The earliest discussion of phonological transformations dates back to
\textcite{ch68spe}, where rules are in the form of context-sensitive
rewrite rules. For example, a vowel nasalization rule can be in the
form
\ex. \(\text{V} \to [+\text{nasal}]
\context \text{\gap}[+\text{nasal}]\)
read as \enquote{nasalize a vowel immediately before a nasal sound}.
Under the usual notation of rewrite rules, this is written as
\ex. \(\text{V}[+\text{nasal}] \to
(\text{V} \cup [+\text{nasal}])[+\text{nasal}]\)
understood as \enquote{rewrite a vowel immediately before a nasal
sound to the union of the vowel and the singleton set of the nasal
feature}. It should be noted that this apparently context-sensitive
form is by no means actually context-sensitive. What it means is
that, given the Chomsky hierarchy \parencite{c59cfpg}
\ex. \(\text{Recursively enumerable} \supset
\text{Context-sensitive} \supset
\text{Context-free} \supset
\text{Regular}\)
it is not true that phonological transformations form a superset of
context-free relations. This is shown in the later section on regular
relations.
Attention should be paid to the term \enquote{linear}. One interpretation is
that phonological transformations apply to a linear sequence of
segments. This assumes that segmental phonology, the sort that is
dealt with above, is disjoint from nonsegmental phonology. This
assumption is computationally unsound, as nonsegmental phonology does
not necessarily require a \enquote{nonlinear} account, and conversely
segmental phonology may require one. Another interpretation has to do
with the notion of locality, which is an often misunderstood notion in
theoretical phonology.
A relevant concern is the representations of tonal features. Indeed,
due to the \enquote{linear} nature of the representations, tonal
features have nowhere else to fit in but the same feature sets as the
segmental features. Consider, for example, \posscite{w67pft}
interesting characterization of Xiamen tone cycle:
\ex. \([\mathop{\alpha}\text{high}, \mathop{\beta}\text{fall}]
\to [\mathop{\beta}\text{high}, \mathop{-\alpha}\text{fall}]\)
This rule not only uses the same \textemph{form} as segmental rules,
but also the same \textemph{representation}. Of course, it is not a
problem if the rule explains the tone cycle well.\footnote{See,
however, \posscite{c00tspcd} criticism of the characterization.} On
the other hand, a nonlinear account allows more freedom both in the
form and representation of the rule. Readers are thus reminded that
representation is a separate matter from the computational complexity
of the formalism \textemph{independent} of representation.
\subsection{Nonlinear Phonology}
Despite the remarks made above, nonsegmental phonology is precisely
what has inspired the development of nonlinear phonology, also known
as autosegmental phonology. Early applications of nonlinear phonology
can be found in \textcite{c76vhngpam, g76ap}, among others. Under
nonlinear phonology, phonological representations consist of multiple
associated tiers, each tier with its own linear sequence of terms.
For example, a syllable \textphon{ma} with a high level tone can be
represented as
\ex.
\begin{forest}
auto, [ma [\(\sigma\) [H] [H]]]
\end{forest}
where \(\sigma\) stands for a syllable and H stands for a high tone.
Consequently, phonological transformations are generalized to operate
on associations. An unbounded tone spreading rule can be given as:
\ex. \(
\begin{forest}
auto, [T [\(\sigma\)] [\(\sigma\), no edge] [\(\sigma\), no edge]]
\end{forest}
\to
\begin{forest}
auto, [T [\(\sigma\)] [\(\sigma\)] [\(\sigma\), no edge]]
\end{forest}
\to
\begin{forest}
auto, [T [\(\sigma\)] [\(\sigma\)] [\(\sigma\)]]
\end{forest}\)
An implicit assumption is that the rule applies iteratively from left
to right. This intuition is, surprisingly, captured equally well by
linear transformations. An equally acceptable formulation is the
linear transformation
\ex. \(\text{T}\varepsilon\varepsilon
\to \text{TT}\varepsilon
\to \text{TTT}\)
where \(\varepsilon\) stands for the empty term, assuming the
transformation also applies iteratively from left to right.
The pair of examples above suggests that the power of nonlinear
phonology comes not from the representations, nor from the capability
of iterative application. Instead, it comes from the ability to
encode the associations in the transformations, as made explicit by the
computational models. Iterative application, on the other hand,
concerns the computational nature of the rules, which is reflected by
the mappings their corresponding processes produce.
A point should be made that iterative application has little to do
with the organization of grammar. An iteratively applied phonological
transformation is neither inherently associated with cyclicity nor
with ordering. In fact, it is not necessarily iterative at all under
an alternative representation. The only thing relevant is the
\textemph{process} the rule generates as embodied by the equivalent
computational device.
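The iterative left-to-right application of the linear spreading rule
can be sketched as follows, using \texttt{None} for the empty term
\(\varepsilon\); this is a hypothetical rendering, not the formal
model:

```python
def spread(seq):
    # apply T ε → T T iteratively from left to right: an empty slot
    # immediately after a tone receives a copy of that tone
    out = list(seq)
    for i in range(1, len(out)):
        if out[i] is None and out[i - 1] == "T":
            out[i] = "T"
    return out
```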
\subsection{Organization of Grammar}
What organization of grammar concerns is, then, \textemph{not} the
computational nature of phonological transformations, but rather the
use of linguistic formalisms to express computationally equivalent
models.\footnote{This is not to deny that linguistic formalisms differ
in their expressivity in a nontrivial manner. Expressivity and
computability are orthogonal, as the famous \enquote{Turing tarpit}
analogy puts it \parencite{p82ep}.} This is reminiscent of the use
of programming paradigms to express computationally equivalent
programs given their Turing completeness, except we expect a much
weaker upper bound in phonological models.
The concept of \textterm{cyclicity}, that is, repeatedly applying a
phonological transformation until it is no longer applicable, played
an important role in early formulations of nonsegmental phonology.
The idea is that a series of rules is linearly sequenced as
\(R_{1}, \ldots, R_{n}\), where \(R_{n}\) is a rule that creates new
contexts for the previous rules, often conceived as a \enquote{bracket
erasure} rule. This brings two problems:
%
\begin{enumerate}
\item The end result of rule applications is tightly coupled with how
contexts are represented;
\item It is not clear how to apply different sequences of rules at
each level of representation.
\end{enumerate}
%
The former is hardly an inherent or unique problem of cyclicity. The
latter, in comparison, is a more serious problem, as it prohibits an
otherwise modular organization of grammar by virtue of
\enquote{rule bloat}. Alternatively, \textterm{lexical phonology}
advocates for a \enquote{stratified} approach \parencite{k82cplp},
thus achieving a more modular organization. This thesis adopts the
general approach of lexical phonology without assuming any form of the
lexicalist hypothesis.
The \textterm{ordering} of phonological transformations, on the other
hand, is a more universal problem. Even simply modeling phonological
transformations as functions already leads to the ordering problem, as
\ex. \(\neg(\forall f, g\ldotp f \circ g = g \circ f)\)
That is, function composition is noncommutative. Therefore, this
thesis does not expect this problem to be easily solved.
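The ordering problem can be made concrete with two toy rewrite rules,
where one order of composition feeds the other; the rules are invented
for illustration:

```python
# rule_f rewrites a → b; rule_g rewrites b → c
def rule_f(s):
    return s.replace("a", "b")

def rule_g(s):
    return s.replace("b", "c")

# applying f first lets f feed g; applying g first does not,
# so the two compositions disagree on the same input
fed = rule_g(rule_f("ab"))      # "ab" → "bb" → "cc"
unfed = rule_f(rule_g("ab"))    # "ab" → "ac" → "bc"
```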
\section{Computational Phonology}
This section introduces computational formalization of phonological
transformations in terms of regular relations, in particular
subregular functions. The main method of computationally formalizing
phonological transformations concerns the use of machines in the sense
of automata theory. In particular, finite-state machines are used for
their limited computational complexity. They are able to encode
regular relations, but not context-sensitive or even context-free
relations due to their lack of a stack.
Logical encodings of regular relations are also introduced along with
their transductions. Encoding regular relations in terms of logical
formulae facilitates the understanding of their subparts as compared
to transducers and enables the evaluation of their computational
complexity as first-order, quantifier-free, among others.
\subsection{Regular Relation}
The apparently context-sensitive phonological rules have the important
constraint that all terms are \textterm{terminal}, that is, not able
to be further rewritten into themselves, directly or indirectly. As
\textcite{c59cfpg} points out, the distinguishing feature of
context-free languages is exactly the ability to self-embed, which
requires recursively rewriting \textterm{nonterminal} terms into
themselves preceded and followed by other terms. Formally, a grammar
is self-embedding iff
\ex. \(\exists A, \varphi, \psi\ldotp A \to^{+} \varphi A \psi\)
where neither \(\varphi\) nor \(\psi\) is the identity term and
\(\to^{+}\) is the transitive closure of the rewrite relation.
Clearly, phonology does not permit such a structure.
\textcite{kk94rmprs} therefore suggests modeling phonological
transformations as \textterm{regular} relations.
Regular relations are relations whose domains and ranges are regular
languages. An artificial regular relation, for example, is a relation
that rewrites any sequence that has an \textemph{even} number of \(a\)
to a sequence with each \(a\) replaced by \(b\). This is an
unattested phonological transformation, but it is a valid regular
relation, showcasing that regular relations are still computationally
too complex for modeling phonological transformations.
Regular relations can be represented as the equivalent
\textterm{finite-state} transducers. Finite-state transducers are
machines that transit between a finite number of states according to
the currently read term, where transitions emit the output terms. To
put it another way, \textterm{acceptors} are machines that accept
\textemph{one} sequence of terms, while \textterm{transducers} are
ones that accept \textemph{two} sequences of terms, namely the input
and output sequences. The artificial regular relation above requires
a \textterm{nondeterministic} finite-state transducer that either
rewrites \(a\) to \(b\) or does not, whose validity is only determined
after the whole input sequence is read:
\ex.
\begin{tikzpicture}[baseline=(bow.base)]
\node[state] at (0, 0) (bow) {\(q_{1}\)};
\node[state, accepting] at (2, 1) (noop) {\(q_{2a}\)};
\node[state] at (4, 1) (invalid) {\(q_{3a}\)};
\node[state] at (2, -1) (write) {\(q_{2b}\)};
\node[state, accepting] at (4, -1) (valid) {\(q_{3b}\)};
\path[->]
(bow) edge [loop left] node {\(\text{?}:\text{?}\)} ()
(bow) edge node {\(a:a\)} (noop)
(noop) edge [loop above] node {\(\text{?}:\text{?}\)} ()
(noop) edge [bend left] node {\(a:a\)} (invalid)
(invalid) edge [loop above] node {\(\text{?}:\text{?}\)} ()
(invalid) edge [bend left] node {\(a:a\)} (noop)
(bow) edge ['] node {\(a:b\)} (write)
(write) edge [loop below] node {\(\text{?}:\text{?}\)} ()
(write) edge [bend left] node {\(a:b\)} (valid)
(valid) edge [loop below] node {\(\text{?}:\text{?}\)} ()
(valid) edge [bend left] node {\(a:b\)} (write);
\end{tikzpicture}
This artificial regular relation is unattested, which means the
transducer overgenerates. Ideally, two additional constraints are
desired:
%
\begin{enumerate}
\item The transducers be \textterm{deterministic} to exclude multiple
outputs;
\item The transducers be \textterm{local} to limit the required memory
resource.
\end{enumerate}
%
These two constraints, in turn, lead to subregular functions.
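For concreteness, the relation computed by this transducer can be
simulated as a partial function, undefined (here \texttt{None}) on
inputs with an odd number of \(a\); this is a sketch of the relation,
not a transliteration of the state graph:

```python
def even_a_rewrite(xs):
    # the input is in the domain only if its number of "a" is even,
    # which a one-tape simulation can only know after a full pass
    if sum(1 for x in xs if x == "a") % 2 != 0:
        return None  # outside the domain of the relation
    return ["b" if x == "a" else x for x in xs]
```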
Alternatively, regular relations can be represented as logical
transductions, specifically \textterm{monadic second-order} ones as
those in \textcite{eh01mdsttft}, which are actually able to represent
nonregular relations.\footnote{An example is the first-order definable
total reduplication \(w \to ww\), whose output language is included
in the class of tree-adjoining languages \parencite{sw94efecg}.}
Monadic second-order logic allows quantification over sets, which is
needed for the definition of general precedence \(<\) based on
immediate precedence \(\vartriangleleft\):\footnote{This is to assume
the relations are nonrecursive.}
\ex.
\a. \(\mathit{Transitive}(X) \coloneq \forall x, y\ldotp
(x \in X \land x \vartriangleleft y) \implies y \in X\)
\b. \(x < y \coloneq \forall X\ldotp
(x \in X \land \mathit{Transitive}(X)) \implies y \in X\)
This is understood as \enquote{\(x\) precedes \(y\) iff every
transitive set with respect to immediate precedence that includes
\(x\) also includes \(y\)}, which means the transitive closure also
includes \(y\). A relation is then defined that expresses that an
\(a\) immediately precedes another \(a\) ignoring any non-\(a\)
between them:
\ex. \(x \vartriangleleft_{a} y \coloneq a(x) \land a(y) \land x < y
\land \neg(\exists z\ldotp a(z) \land x < z \land z < y)\)
The notion of the evenness of the number of \(a\) is finally defined:
\ex.
\a. \(\mathit{Odd}(x) \coloneq (a(x) \land (\forall y\ldotp
y < x \implies \neg a(y))) \lor (\exists y\ldotp
\mathit{Even}(y) \land y \vartriangleleft_{a} x)\)
\b. \(\mathit{Even}(x) \coloneq \exists y\ldotp
\mathit{Odd}(y) \land y \vartriangleleft_{a} x\)
Accordingly, the transduction is defined:\footnote{Observant readers
may notice that an efficient transducer implementing said
transduction reads the input \textemph{twice}, dispatching on the
number of \(a\) in the first pass. This is due to the
correspondence between monadic second-order logic and two-way
finite-state transducers.}
\ex. \(b'(x) \coloneq b(x) \lor (a(x) \land (\exists y\ldotp
\mathit{Even}(y) \land (\forall z\ldotp y < z \implies \neg a(z))))\)
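The mutually recursive \(\mathit{Odd}\) and \(\mathit{Even}\)
predicates amount to labeling each \(a\) by its parity among the
\(a\)s, which the following sketch computes directly:

```python
def parity_labels(xs):
    # Odd holds of the first "a" (nothing precedes it along <), and the
    # parity alternates along the derived precedence relation ⊲_a
    labels = {}
    parity_is_odd = True
    for i, x in enumerate(xs):
        if x == "a":
            labels[i] = "Odd" if parity_is_odd else "Even"
            parity_is_odd = not parity_is_odd
    return labels
```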
Similar to how finite-state transducers are constrained, logical
transductions of the sort above are avoided if
%
\begin{itemize}
\item Quantification over sets is disallowed; or
\item Any quantification is disallowed.
\end{itemize}
%
They result in \textterm{first-order} logic and its
\textterm{quantifier-free} version respectively, which correspond to
various subregular functions.
Despite their inadequate computational complexity, regular relations
are useful as they are closed under composition, meaning that
\ex. \(\forall P, Q\ldotp (\mathit{Regular}(P) \land
\mathit{Regular}(Q)) \implies \mathit{Regular}(P \circ Q)\)
This makes it possible to compile an apparently complex system
of phonological rules into a single finite-state transducer.
\subsection{Subregular Function}
The most trivial class of \textterm{subregular} functions is without a
doubt the class of finite functions, which is disfavored in this
thesis for the reasons mentioned in \textcite{s93wimpatlai}. Rather,
by enforcing the aforementioned constraints, several nontrivial
classes of subregular functions can be distinguished. If the
finite-state transducers are required to be deterministic, the
resulting subregular functions are \textterm{subsequential} functions,
further classified into left and right variants depending on the
direction the input sequence is read \parencite{ch12bcsimr, hl13vhs}.
This class is not particularly interesting for the purposes of this
thesis, as it lacks a meaningful notion of locality.
The concept of \textterm{strict locality} is defined for regular
languages, which concerns the recognition of substrings
\parencite{rp11apresh, rhfhlw13csc}. Borrowing this concept, input
and output strictly local functions can be defined over
\textterm{linear} representations, which \enquote{remember} a finite
length of input and output substrings respectively
\parencite{ceh14lslsf}. Similar to subsequential functions, output
strictly local functions can be further classified into left and right
variants \parencite{ceh15oslf}.
An example of an input strictly local function is the function that
rewrites \(a\) to \(b\) whenever the last input is \(a\). That is,
\ex. \(a^{n+1} \to ab^{n}\)
where \(n \in \mathbb{N}\), with the equivalent transducer
\ex.
\begin{tikzpicture}[baseline=(loop.base), initial text=\(a:a\)]
\node[state, initial] at (0, 0) (loop) {\(a\)};
\path[->] (loop) edge [loop right] node {\(a:b\)} ();
\end{tikzpicture}
where the remembered input term is encoded in the state. More
conveniently, it can also be represented as the logical transduction
\ex.
\a. \(a'(x) \coloneq a(x) \land \neg(a(\mathit{pred}(x)))\)
\b. \(b'(x) \coloneq a(x) \land a(\mathit{pred}(x))\)
where \(\mathit{pred}\) is the predecessor function.
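The input strictly local function can be sketched with a window that
remembers only the previous input symbol:

```python
def isl_rewrite(xs):
    out = []
    prev = None  # left word boundary
    for x in xs:
        # rewrite "a" to "b" whenever the last *input* symbol is "a"
        out.append("b" if x == "a" and prev == "a" else x)
        prev = x  # the window tracks the input, never the output
    return out
```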
Similarly, an example of a left output strictly local function is the
function that rewrites \(a\) to \(b\) whenever the last output is
\(a\). That is,
\ex.
\a. \(a^{2n} \to (ab)^{n}\)
\b. \(a^{2n+1} \to (ab)^{n}a\)
with the equivalent transducer
\ex.
\begin{tikzpicture}[baseline=(a.base), initial text=\(a:a\)]
\node[state, initial] at (0, 0) (a) {\(a\)};
\node[state] at (2, 0) (b) {\(b\)};
\path[->]
(a) edge [bend left] node {\(a:b\)} (b)
(b) edge [bend left] node {\(a:a\)} (a);
\end{tikzpicture}
where the remembered output terms are encoded in the states. When it
is represented as a logical transduction, however, there occur
\textterm{recursive} logical formulae,\footnote{Readers familiar with
fixed-point theorems may expect to see a fixed-point logic
underpinning such recursive logical formulae. The technical details
can be found in \textcite{koj18taolns, cj19qlfpfp}.} resulting in a
recursive strictly local function:
\ex.
\a. \(a'(x) \coloneq a(x) \land \neg(a'(\mathit{pred}(x)))\)
\b. \(b'(x) \coloneq a(x) \land a'(\mathit{pred}(x))\)
As \textcite{cj21iolr} conjectures, recursive logical transductions
comprise precisely the logical characterization of output strictly
local functions.
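The left output strictly local function differs from the input
strictly local one only in that its window tracks the output; a
parallel sketch:

```python
def osl_rewrite(xs):
    out = []
    for x in xs:
        prev = out[-1] if out else None  # last *output* symbol
        # rewrite "a" to "b" whenever the last output symbol is "a"
        out.append("b" if x == "a" and prev == "a" else x)
    return out
```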
Logical encodings enable the generalization of strict locality over
\textterm{autosegmental} representations \parencite{cj19aislf}. Under
a predicate logic, associations are understood as a binary relation
\(\mathit{Assoc}\) where \(\mathit{Assoc}(a, b)\) indicates that \(a\)
and \(b\) are associated, where \(a\) and \(b\) are on separate
tiers.\footnote{A question can be asked whether \(\mathit{Assoc}\) is
symmetric. Although this question has no particular bearing,
\(\mathit{Assoc}\) is assumed to be antisymmetric to allow the
interpretation of autosegmental representations as directed graphs.}
For example, the unbounded tone spreading rule can be defined as the
recursive logical transduction
\ex. \(\mathit{Assoc}'(x, y) \coloneq \mathit{Assoc}(x, y) \lor
\mathit{Assoc}'(x, \mathit{pred}(y))\)
It is then easy to see why it works equally well for linear
representations. \(\mathit{Assoc}\) is merely used to represent the
presence of tones:
\ex.
\a. \(\text{T}(x) = \exists y\ldotp \mathit{Assoc}(y, x)\)
\b. \(\text{T}'(x) = \exists y\ldotp \mathit{Assoc}'(y, x)\)
This makes it possible to project the autosegmental transformation
onto a linear one:
\ex.
\a. \(\text{T}'(x) \coloneq
\text{T}(x) \lor \text{T}'(\mathit{pred}(x))\)
\b.
\begin{prooftree}
\hypo{\mathit{Assoc}'(a, x) =
\mathit{Assoc}(a, x) \lor \mathit{Assoc}'(a, \mathit{pred}(x))}
\infer1{(\exists y\ldotp \mathit{Assoc}'(y, x)) =
(\exists y\ldotp \mathit{Assoc}(y, x) \lor
\mathit{Assoc}'(y, \mathit{pred}(x)))}
\infer1{(\exists y\ldotp \mathit{Assoc}'(y, x)) =
((\exists y\ldotp \mathit{Assoc}(y, x)) \lor
(\exists y\ldotp \mathit{Assoc}'(y, \mathit{pred}(x))))}
\infer1{\text{T}'(x) = \text{T}(x) \lor \text{T}'(\mathit{pred}(x))}
\end{prooftree}
In other words, the linear transformation is \textterm{embedded}. The
existential introduction reflects the fact that the transformation
only cares about the \textemph{existence} of any term on the tonal
tier that is associated with the input term.
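The recursive transduction for \(\mathit{Assoc}'\) can likewise be
simulated over a toy graph representation, with associations encoded
as pairs of a tone and a syllable position; the encoding is
illustrative only:

```python
def spread_assoc(assoc, n):
    # Assoc'(x, y) holds iff Assoc(x, y) holds or Assoc'(x, y-1) holds;
    # computed left to right over syllable positions 0 .. n-1
    out = set(assoc)
    for y in range(1, n):
        for x, p in list(out):
            if p == y - 1:
                out.add((x, y))
    return out
```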
Strictly local logical transductions, as shown thus far, satisfy both
first-orderness and quantifier-freeness. This is not a coincidence,
but rather the expected outcome. By disallowing quantification,
logical formulae are allowed to refer to terms other than the input
terms only by means of the predecessor and successor functions,
which in turn allude to the notion of locality.
\section{Preliminary Goal}
Chinese languages have been famous for their rich repertoire of tonal
systems and relevant transformations. In particular, tonal
transformations in Chinese languages often interact with the
morphosyntactic representations. This fact has inspired analyses that
require morphosyntactic encodings in phonological representations.
This thesis seeks to formulate such techniques in terms of strictly
local functions.
Another question this thesis aims to answer is why autosegmental
representations appear necessary. Indeed, as later shown,
autosegmental representations impose the requirement that phonological
representations be graphs as opposed to linear sequences of terms.
This question is impossible to answer without assessing what
autosegmental representations are useful for.
Finally, this thesis attempts to supply a theoretical interpretation
of the computational analyses. The theoretical implication of
strictly local functions, this thesis believes, is more general than
the study of computational complexity. Potentially, it relates to the
more difficult problem of the interfaces between linguistic levels. A
complete answer to this question requires a better understanding of
the primitives and operations each level allows.
\chapter{Boundary of Transformation}
With the computational preliminaries in place, we are now ready to
analyze the attested tonal transformations found in Chinese languages.
This chapter specifically discusses transformations involving
boundaries that can block their application. Although the
transformations \textfor{per se} are simple, the existence of boundaries
raises certain important questions, such as where, when, and how they
come into play. This is especially problematic under linear models,
as hierarchical interpretations of boundaries often require more than
linear sequences.
\section{Representation of Boundary}
Under the linear view, \textterm{boundaries} are simply certain
terms that never surface, yet block any transformation that may
otherwise apply. A boundary term is always output \textemph{as is},
therefore failing to satisfy any condition when it precedes or
succeeds the input term in question. Consider again the unbounded
tone spreading rule:
\ex. \(\acute{\sigma}'(x) \coloneq \acute{\sigma}(x) \lor
(\sigma(x) \land \acute{\sigma}'(\mathit{pred}(x)))\)
A boundary term certainly cannot satisfy \(\sigma(x)\), and thus
\(\acute{\sigma}'(\mathit{pred}(x))\) cannot be true when it precedes
the input term. This quickly fails to account for more complicated
rule interactions, where boundaries may \textemph{change} to create
new contexts. Recall that cyclicity assumes that each cycle creates
new contexts until a fixed point is reached---this intuition leads
to analyses that delete and insert new boundaries.
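The blocking effect of a boundary term can be sketched minimally as
follows; the symbol inventory (\texttt{'H'} for a high-toned syllable,
\texttt{'o'} for a toneless one, \texttt{'\#'} for a boundary) is an
illustrative assumption.

```python
def spread_blocked(seq):
    """High-tone spreading blocked by boundary terms.

    A boundary never satisfies sigma(x), so the carried high tone is
    reset at '#'; the boundary term itself is output as is.
    """
    out = []
    carry = False  # truth value of acute-sigma'(pred(x)) so far
    for s in seq:
        if s == '#':
            carry = False  # boundary fails sigma(x): spreading stops
            out.append('#')
        elif s == 'H':
            carry = True
            out.append('H')
        else:
            out.append('H' if carry else 'o')
    return out
```

The reset at \texttt{'\#'} is the whole content of the boundary: it
contributes nothing to the output except failing the spreading
condition.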
An alternative view of boundaries is \textterm{groupings}, which can
render the transformation nonregular. Even if we ignore this nature
of arbitrary groupings by restricting self-embedding in some manner,
there is still the issue of determining whether two terms are in the
same group. Suppose groupings are delimited by conventional brackets;
to determine whether two terms are in the same group, the transducer
potentially needs to traverse the whole input sequence. This is more
apparent in the case of autosegmental representations, where groupings
are represented as trees, which are a subset of directed
graphs.\footnote{Such trees with a limited depth are known as
\enquote{prosodic hierarchy}, for which see \textcite{nv86pp}.} The
relation that decides whether two terms are in the same group is then
\ex. \(\mathit{SameGroup}(x, y) \coloneq \exists z\ldotp
\mathit{Assoc}(z, x) \land \mathit{Assoc}(z, y)\)
assuming that groupings do not nest. As usual, nestable groupings
require the transitive closure of \(\mathit{Assoc}\), which is monadic
second-order definable.
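For the non-nesting case, the relation can be sketched directly over a
finite association set; the encoding of \(\mathit{Assoc}\) as a set of
(group, term) pairs is an illustrative assumption.

```python
def same_group(assoc, x, y):
    """SameGroup(x, y): there exists z with Assoc(z, x) and Assoc(z, y).

    assoc is a set of (z, term) pairs.  Groupings are assumed not to
    nest, so no transitive closure of Assoc is needed.
    """
    groups_x = {z for (z, t) in assoc if t == x}
    groups_y = {z for (z, t) in assoc if t == y}
    return bool(groups_x & groups_y)
```

With nesting, the two comprehensions would have to be replaced by a
reachability computation, mirroring the move to monadic second-order
logic in the text.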
If we entirely discard the idea of explicitly encoding boundaries in
input sequences, we need to derive the same processes as before,
thereby implicitly encoding boundaries in some sense. What boundary
terms achieve is essentially applying a single transformation to
multiple linear sequences and combining the outputs through some
associative operation, say concatenation. In other words, the
transformation is \enquote{lifted} to operate on multiple linear
sequences. By doing so, we modularize the transformation into a
simpler transformation and a universal lift operation.\footnote{This
operation is precisely a functor or, loosely speaking, a function of
type
\((\alpha \to \beta) \to \mathbb{F}\alpha \to \mathbb{F}\beta\),
commonly known as \(\mathit{map}\). \(\mathbb{F}\) should produce a
type whose values can be \enquote{folded} in the sense of
\textcite{h99tuemf} under a monoid.}
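The lift operation can be sketched as a plain map over the split
sequences followed by a monoidal fold, here concatenation; the
function names are illustrative.

```python
def lift(transform, chunks):
    """Lift a single-sequence transformation to a list of sequences.

    Maps `transform` over every chunk, then folds the outputs back
    together by concatenation (the monoid operation).
    """
    out = []
    for chunk in chunks:
        out.extend(transform(chunk))
    return out
```

For instance, \texttt{lift} applied to an uppercasing transformation
and the chunks \texttt{["ab", "cd"]} yields \texttt{"ABCD"}: the
transformation itself never sees a boundary.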
However, groupings are quite different, as they create input
\textemph{trees}. Input trees, by their inductive nature, can contain
more trees, which are in turn input trees to a potentially new
transducer with appropriate hierarchical notions encoded. This not only
generates a recursive process, but also creates a leaky abstraction as
the transducer must be aware of the whole input up to a certain level.
This defeats the purpose of having multiple linguistic levels in the
first place.
The ideal scenario is that each linguistic level drives phonological
realization by feeding the next level inputs and combining the
outputs. This eliminates the need for boundaries altogether, while
implicitly preserving their essence. This is also largely what
cyclicity aims to achieve, namely that each cycle further transforms the
input sequence with one layer of groupings removed. What this means
in practice will be demonstrated in the following sections as relevant
linguistic data are covered.
\section{Standard Chinese: Tone 3 Sandhi}
Tone 3 sandhi is a tonal transformation found in Standard Chinese,
where tone 3 is dissimilated to tone 2 before another tone 3. This
process can be described as simultaneously applied, as seen in the
following data:
\ex.
\a. \(\text{\textform{zhǎnlǎn}} \to
\text{\textform{zhánlǎn} \textgls{exhibit}}\)
\b. \(\text{\textform{zhǎnlǎnguǎn}} \to
\text{\textform{zhánlánguǎn} \textgls{exhibition hall}}\)
\b. \(\text{\textform{zhǎnlǎnguǎn lǐ}} \to
\text{\textform{zhánlánguán lǐ} \textgls{in exhibition hall}}\)
Therefore, a logical transduction
\ex. \(\acute{\sigma}'(x) \coloneq \acute{\sigma}(x) \lor
(\check{\sigma}(x) \land \check{\sigma}(\mathit{succ}(x)))\)
resulting in the mapping
\ex. \(\check{\sigma}^{n+1} \to \acute{\sigma}^{n}\check{\sigma}\)
should be enough to describe the process. It should be emphasized
that this process is linear in its computational nature,
\textemph{regardless} of the specific representation used here for
convenience. Indeed, even if the tones are represented on a separate
tier with multiple terms, the transformation is still linear and,
importantly, strictly local, albeit with a larger memory
size.\footnote{To decide the memory size, it suffices to calculate the
number \(n + m\), where \(n\) and \(m\) are respectively the highest
degrees of composition of the predecessor and successor functions in
the normalized transduction.} Namely, the mapping will then be
\ex. \((\text{LL})^{n+1} \to (\text{LR})^{n}\text{LL}\)
ignoring the optional alternative realization of tone 3 when it is
lengthened \parencite{d99mstems}. The fundamental reason is that the
associations are not affected by the transformation, that is,
\(\mathit{Assoc}\) is never mentioned in the transduction.
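The simultaneous rule can be sketched as a single pass that consults
only the \textemph{input} tone of the following syllable; the integer
encoding of tone categories is an illustrative assumption, not the
representation used above.

```python
def tone3_sandhi(tones):
    """Simultaneous tone 3 sandhi: 3 becomes 2 before another 3.

    The condition checks the *input* tone of succ(x), so application
    is simultaneous and the mapping is 3^(n+1) -> 2^n 3, as in
    zhanlanguan.
    """
    out = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            out[i] = 2
    return out
```

Because \texttt{tones} is never mutated during the scan, every
application sees the original input, which is exactly what
simultaneity amounts to.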
The transformation becomes more interesting when boundaries are
concerned. The data
\ex.
\a. \(\text{\textform{gǒu bǐ mǎ xiǎo}} \to
\text{\textform{góu bǐ má xiǎo}}\)
\bg. gǒu bǐ mǎ xiǎo\\
dog than horse small\\
\trans\textgls{Dogs are smaller than horses.}
show that some instances of tone 3 do \textemph{not} change to tone 2. This
seems to suggest that an alternative transduction
\ex. \(\acute{\sigma}'(x) \coloneq \acute{\sigma}(x) \lor
(\check{\sigma}(x) \land \check{\sigma}'(\mathit{succ}(x)))\)
resulting in the mapping
\ex.
\a. \(\check{\sigma}^{2n} \to (\acute{\sigma}\check{\sigma})^{n}\)
\b. \(\check{\sigma}^{2n+1} \to
\check{\sigma} (\acute{\sigma}\check{\sigma})^{n}\)
is in effect. This is not true, as the data
\ex.
\a. \(\text{\textform{xiǎogǒu bǐ xiǎomǎ xiǎo}} \to
\text{\textform{xiáogóu bǐ xiáomá xiǎo}}\)
\bg. xiǎo-gǒu bǐ xiǎo-mǎ xiǎo\\
small-dog than small-horse small\\
\trans\textgls{Puppies are smaller than ponies.}
clearly do not adhere to the mapping. Moreover, the data
\ex.
\a. \(\text{\textform{xiǎo zhǐlǎohǔ}} \to
\{\text{\textform{xiǎo zhíláohǔ}},
\text{\textform{xiáo zhǐláohǔ}},
\text{\textform{xiáo zhíláohǔ}}\}\)
\bg. xiǎo zhǐ-lǎohǔ\\
small paper-tiger\\
\trans\textgls{small paper tiger}
show that some instances of tone 3 \textemph{optionally} change to tone 2. To
account for such contrasts, various analyses have been developed. Two
such analyses will be discussed: one representative of
\enquote{syntactic} analyses, the other of \enquote{prosodic}
analyses.
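The rejected alternative transduction, which consults the
\textemph{output} of the following syllable, would correspond to a
right-to-left scan; the following sketch (with an illustrative integer
encoding) shows why it derives the alternating mapping rather than the
attested pattern.

```python
def tone3_sandhi_iterative(tones):
    """Right-to-left iterative sandhi: 3 becomes 2 only if the
    *output* of the next syllable is still 3.

    This yields the alternating mappings 3^(2n) -> (2 3)^n and
    3^(2n+1) -> 3 (2 3)^n, which the xiaogou data contradict.
    """
    out = [0] * len(tones)
    for i in range(len(tones) - 1, -1, -1):
        nxt = out[i + 1] if i + 1 < len(tones) else None
        out[i] = 2 if tones[i] == 3 and nxt == 3 else tones[i]
    return out
```

On a run of four tone-3 syllables this produces 2 3 2 3, whereas
\textform{xiáogóu bǐ xiáomá xiǎo} contains surface 2 2 sequences that
the iterative rule can never generate.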
\subsection{Syntactic Analysis}
\textcite{c00tspcd} argues for an analysis under which feet are built
based on syntactic structures, which in turn govern the application of
tone 3 sandhi. As \textcite{d07psc} notes, although this analysis
includes a notion of feet, the feet are nonetheless unrelated to
prosody given their insensitivity to stress. More crucially, the tone 3
sandhi rule makes reference to the syntactic structure for its
applicability. Therefore, it is considered a \enquote{syntactic}
analysis here.
The basic idea is that syntactic branching leads to different foot
structures. For example, a left-branching syntactic structure leads
to a left foot:
\ex.
\a. \(\text{(\textform{mǎi hǎo}) \textform{jiǔ}} \to
\text{(\textform{mái háo}) \textform{jiǔ}}\)
\bg. [[mǎi hǎo] jiǔ]\\
\phantom{[[}buy good wine\\
\trans\textgls{to finish buying wine}
Tone 3 sandhi applies to \textform{mǎi} as it is followed by another
tone 3 syllable in the same foot. The curious part is \textform{hǎo},
which is followed by another tone 3 syllable in a \textemph{higher}
group. A right-branching syntactic structure, on the other hand,
leads to a right foot:
\ex.
\a. \(\text{\textform{mǎi} (\textform{hǎo jiǔ})} \to
\{\text{\textform{mǎi} (\textform{háo jiǔ})},
\text{\textform{mái} (\textform{háo jiǔ})}\}\)
\bg. [mǎi [hǎo jiǔ]]\\
\phantom{[}buy \phantom{[}good wine\\
\trans\textgls{to buy good wine}
Tone 3 sandhi now only optionally applies to \textform{mǎi}, as it is
followed by another tone 3 syllable in a \textemph{lower} group.
Apparently, this implies that nestable groupings are required. As
aforementioned, nestable groupings lead to monadic second-order logic
over \textemph{trees}, which is an undesirable property.
Alternatively, we may consider that the trees be transformed to linear
sequences suitable as inputs to the later level. This can be a simple
recursive process that flattens the input trees using an associative
operation that prefixes boundaries.\footnote{This process can also be
represented as a strictly local tree transducer, as linear sequences
are isomorphic to single-pathed graphs, that is, unary-branching
trees. See \textcite{ioj20qtt, jh20isltt} for detailed discussions
of such transducers.} This tree transformation can be
nondeterministic to account for the optionality, but the level it
feeds, tone 3 sandhi, remains deterministic.\footnote{Under
nondeterminism, an output sequence that contains a
\(\check{\sigma}\check{\sigma}\) substring simply fails and
backtracks. Although \textcite{c00tspcd} reports several data that
contain such substrings, they are more likely due to emphatic
stresses.} Considering only the nontrivial cases, we have
\ex.
\a. \(
\begin{forest}
nice empty nodes,
[[\(v\), baseline] [[\(w\)] [\(z\)]]]
\end{forest}
\to
\#v\#wz\)
\b. \(
\begin{forest}
nice empty nodes,
[[[\(v\)] [\(w\)]] [\(z\), baseline]]
\end{forest}
\to
\#\#vwz\)
In some sense, this is as if only opening brackets were preserved.
Syntactic categories are insignificant, as the process only makes
reference to the hierarchical structure. This, of course, still
requires modification to tone 3 sandhi, as now the input sequences
contain boundaries. To completely suppress boundaries in the inputs,
we can further split linear sequences by boundaries, resulting in
multiple linear sequences.\footnote{Or, obviously, transform to
multiple linear sequences in a single pass. The presentation is
merely for easier understanding under a graph-transduction
perspective.} Tone 3 sandhi is then lifted to operate on these
linear sequences, whose outputs are combined by concatenation. To put
it simply, this means that
\ex.
\a. \(\#v\#wz \to \tau(v) \cdot \tau(w \cdot z)\)
\b. \(\#\#vwz \to \tau(v \cdot w \cdot z)\)
omitting empty sequences. Interestingly, the contrast now comes from
the fact that tone 3 sandhi does not preserve concatenation:
\ex. \(\neg(\forall x, y\ldotp
\tau(x \cdot y) = \tau(x) \cdot \tau(y))\)
If this \textemph{did} hold, we would be able to derive the
equivalence of the two mappings due to associativity.
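The failure of concatenation preservation can be checked directly on
the \textform{mǎi hǎo jiǔ} example, using the simultaneous sandhi
sketch as \(\tau\); the integer tone encoding is an illustrative
assumption.

```python
def tau(tones):
    """Simultaneous tone 3 sandhi on a single sequence."""
    out = list(tones)
    for i in range(len(tones) - 1):
        if tones[i] == 3 and tones[i + 1] == 3:
            out[i] = 2
    return out

# #v#wz (right-branching, mai (hao jiu)): tau per chunk, then concatenate
per_chunk = tau([3]) + tau([3, 3])  # [3, 2, 3]
# ##vwz (left-branching, (mai hao) jiu): tau over the whole sequence
whole = tau([3, 3, 3])              # [2, 2, 3]
assert per_chunk != whole           # tau does not preserve concatenation
```

Were \(\tau\) a concatenation-preserving homomorphism, the two
groupings would collapse into the same output by associativity,
contrary to the attested contrast.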
A note should be made regarding cyclicity. In \posscite{c00tspcd}
original analysis, cyclicity is used to explain both simultaneous and
optional applications, on the assumption that groupings can be
expanded at each level. However, we have seen how cyclicity is
\textemph{not} an essential property of the analysis by virtue of
modularization, where a tree transformation feeds tone 3 sandhi.
\subsection{Prosodic Analysis}
\textcite{d07psc} proposes an alternative analysis in light of several
perceived inadequacies of the previous syntactic analysis. The most
significant one is that supposedly identical syntactic structures can
lead to different groupings, as
\ex. \(\text{(\textform{něizhǒng}) (\textform{jiǔ hǎo})} \to
\text{(\textform{néizhǒng}) (\textform{jiú hǎo})}\)
must be derived from
\exg. [[[něi-zhǒng] jiǔ] hǎo]\\
\phantom{[[[}which-kind alcoholic.beverage good\\
\trans\textgls{Which kind of alcoholic beverage is good?}
supposing the syntactic structure is correct at all. This obviously
hinges on the specific syntactic analysis assumed, as one can imagine
\exg. [[[něi-zhǒng] [\gap{} jiǔ]] hǎo]\\
\phantom{[[[}which-kind \phantom{[}\textfeat{gap} alcoholic.beverage
good\\
\trans\textgls{Which kind of alcoholic beverage is good?}
being a totally valid alternative analysis that \textemph{does} derive
the correct grouping.\footnote{Justification for a similar analysis
based on type shifting can be found in \textcite{bs14cnl}, which