TODO.txt
$Header$
GMTK TODO List/Memory
Other things that need to be done, or just other ideas that might be
incorporated into the package at some point. Things that are already
done are marked with an "X" on the left of the item. When an item is
added, please add the date to the left of the list item. Note that
some items below might be redundant as they were added at very
different times.
--------------------------------------------
X Low level file parser should support
both integer and named pointers to
the parameters that they share.
(when they write out the parameters, they
should write out the form that was written in).
X simple +* equations in decision tree leaf nodes,
using variables vi and ci (e.g.,
v0, v1, v2, ... , and c0, c1, c2, ...) for values
and cardinalities of parent variables.
exs: c0*v0 + v1
c0 + 1
3
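Sketch (illustration only, not GMTK's parser): one way such leaf formulas could be evaluated, assuming only non-negative integer literals, v<i>, c<i>, '+' and '*' with the usual precedence and no parentheses. The name eval_leaf and its argument layout are hypothetical.

```python
import re

# One token: an integer literal, a variable (v<i> or c<i>), or an operator.
TOKEN = re.compile(r"\s*(?:(\d+)|([vc])(\d+)|([+*]))")

def eval_leaf(expr, values, cards):
    """Evaluate a leaf formula over parent values (v<i>) and cardinalities (c<i>)."""
    expr = expr.strip()
    tokens, pos = [], 0
    while pos < len(expr):
        m = TOKEN.match(expr, pos)
        if m is None:
            raise ValueError(f"bad leaf formula at column {pos}: {expr!r}")
        if m.group(1) is not None:
            tokens.append(int(m.group(1)))
        elif m.group(2) is not None:
            table = values if m.group(2) == "v" else cards
            tokens.append(table[int(m.group(3))])
        else:
            tokens.append(m.group(4))
        pos = m.end()

    # The formula is a sum of products: accumulate the current product,
    # and fold it into the running total at each '+'.
    total, product, expect_operand = 0, 1, True
    for tok in tokens:
        if expect_operand:
            if isinstance(tok, str):
                raise ValueError(f"operand expected in {expr!r}")
            product *= tok
        elif tok == "+":
            total, product = total + product, 1
        elif tok != "*":
            raise ValueError(f"operator expected in {expr!r}")
        expect_operand = not expect_operand
    if expect_operand:
        raise ValueError(f"formula is empty or ends with an operator: {expr!r}")
    return total + product
```

With parent values (2, 1) and cardinalities (3, 4), "c0*v0 + v1" evaluates to 3*2 + 1 = 7.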
X add a MTCPT type, for deterministic cpt
which just points to one of the decision trees
in the DT table.
X add error check message if two low level objects
have the same name and kill the program.
X discrete observed variables have only one cardinality,
card needs to be a vector.
X Include gaussian objects for special situations.
index -1 corresponds to, with probability 1, gaussian
index -2 corresponds to, with probability 0, gaussian
X An anytime clique finding algorithm that searches
for the best clique tree in an hour, day, weekend, etc.
Save the resulting clique (and model description) to
a file so that it can be reused multiple times.
(removed Fri Dec 31 01:55:48 2004)
X Produce messages to the user saying what the clique
sizes are and that they might want to adjust the
structure to reduce clique sizes if they become too large.
(removed Fri Dec 31 01:55:54 2004)
X the inference procedure should work both with clique chains and clique trees
- no, but basic data structures are ok.
- at some point, add tree stuff, for single frame applications
DONE: Wed Sep 1 11:29:25 2004
- the inference procedure should work with disconnected networks
X Allocate data structures for CPTS, etc. automatically, if not explicitly
specified in a file.
X use namedobject for randomvariable and any other object that
has a string name
DONE: Wed Sep 1 11:29:33 2004
X- include check to make sure that names for each section in parms are unique
X- todo, make sure that chunk n:m is valid.
X IMPORTANT: ability to train means and other dlinkmats in a dlinkmats structure
independently.
- go through and start using setBasicAllocatedBit() on read, and add
assertions to that effect on all routines that need it.
(later) started doing this, but need to do a sweep to make sure all use
this mechanism.
- IMPORTANT: look at and get rid of purify memory lost message
- write destructors
- write sampling code
X update make file so that -@ works for .o files.
X Check the results of all DT query() calls to make sure
the index it returns is valid. Report great error messages
if it doesn't.
Ultimately, some type of run-type check done once at the beginning
so this query doesn't need to take place.
X if an object gets no probability, do something
more than just issuing a warning, like setting
to uniform probs, removing the component,
re-randomizing, or something.
done: set to previous values.
X add a k-means mode for initialization.
- keep covariances at unity for a while.
- first iteration just use random assignments
(pick a random component who gives unity probability,
everyone else gives zero probability)
- done essentially since the splitting stuff
is working and we can start with a single Gaussian
and path counting.
X Add gaussian split program
X add linear BMM links
- add non-linear BMM links
X implement mixture component vanishing using the
shared Dense1DPMF. If one GM decides that
its component should vanish based on a MCVR
then all of them should (and should check this
as well).
X when reading in definitions for each utterance:
we could change the DTs with a particular name
(e.g., read in a DT file and have a read
mode that changes the pointer if it is there).
This would need to have objects that use DTs
use the integer index rather than the pointer directly.
Essentially done, DTs can now specify a file
containing a number of DTs one per segment.
X Re-think the 1 DT per utterance stuff so that
we can seamlessly do parallelism.
(dt file indexes?)
DONE: Wed Sep 1 11:30:09 2004
- rethink DT format numbering
- add a 'fail' tag to DT leaves.
X pass .str files through cpp before reading.
X make sure that, for all files passed through cpp,
line numbering in error messages is done right.
X GMTK_GMParms::read(), keeps open all files
until the end. If it encounters the same name again,
it keeps track of where it was and continues reading.
Also, this should append rather than delete.
X Allow the file parser to pass ASCII files through cpp
(but not binary).
- need to add a check that makes sure that the observation
range used by the cont. parents match that used by
the gaussian, perhaps add a length check as well.
X Fri Jun 15 12:43:49 2001,
MEMORY SAVINGS:
RandomVariables and their discrete and continuous children
have lots of redundant information when they are unrolled, thereby
wasting lots of memory. This could be changed so that
RV's keep a pointer to a RV common structure which when the RV is cloned
uses the same common data.
DONE: Wed Sep 1 11:30:38 2004
- Short term: evaluate float/double for logp on Aurora
X Long term:
have a log space algorithm option
- in file parser, add checks to make sure that all ints read in
are non-negative
X change parser to keep track of multiple #include files via line directives
X change sparse PMF so that it just contains a list
of values, and then a pointer to a dense pmf
X get the parameter writing stuff finalized.
(get working after workshop, use simple global write in binary for now).
X get MCVR working
- if last component (or entire mixture) has no probability,
either 1) die
2) do nothing, and do not change anything, reverting
to previous values.
3) force to uniform parameters
4) For now, **** force to impossible Gaussian ***
and issue a loud warning.
- prior counts
- need routine makeAccumulatorsPrior
- stand alone program for writing out accumulator file to be read in.
X IMPORTANT: finish load/store/accumulate accumulators
- Identify potential issues with release (i.e., bugs, slowness, obviously needed toolkit abilities,
things that are inconvenient, etc.).
X add VCID everywhere.
- allow deterministic relations to be enumerated out. in some cases, this is
easier than a decision tree.
X- add a binary/ascii parameter file conversion program.
- export all internal program variables to command line (e.g., var floor, etc)
X check on memory leak stuff
- make sure that all gaussians use means/variances that are the right dimensionality
in read file.
- rethink sparse CPT and make it such that sparseCPTs don't use the Dense1DPMF which
have become tailored to Gaussian mixtures (so lengths might change).
sparse CPTs should use dense CPTs somehow.
- write C++ program to print number of free parameters for a system.
X add simple multiplication onto decision tree leafs. (or better, make it
use real integer formulas with parens, etc.).
- export the optional training stuff (i.e., don't train means, just covars, etc)
to command line.
- create new objects, integer to name mappings
to map to decision tree leaves (corresponding to integers) to either
1) Gaussian mixture objects
2) Switching gaussian mixture objects
3) sparse PMFs
X make sure that dlinkmatrix precompute is being
called once the global observation matrix is ready.
- check for cardinalities in str file
- make sure dlinks are checked somewhere for validity wrt a file
- rethink the EMable thing with the virtual functions, might
be a speedup there, esp. with emIncrement.
- figure out a good way to get (save to disk) most viterbi assignment to
mixture variables.
- clean up source directories.
- fix unrolling bug, where it is possible to get an assertion
failure because of unrolling a network but having incompatible
RVs.
- dlinks, make sure we do not point to self
- decision trees, need to deal with the issue with reading them
in, parallelism, and so on.
- implement other forms of mappings from RV
- decision tree (done)
- hash table
- direct mapping
- write MDCPT parameters out in nice order with smart comments.
- dlinkmats, normalize by "previous" covariance matrix, GEM alg
- more triangulation procedures to reduce large clique sizes.
- get switching parents working with triangulation
- print message at start with largest clique (members, and upper bound on
joint state space)
- reading ascii feature files should not be by line.
X fix bug with small parallel chunks and accumulators being zero
- add option to pass definitions to cpp with cpp arg.
- add some way for DTs to refer to other DTs (i.e., a leaf
of a DT can continue on using another DT, to make sharing
easier, and save memory for big DTs).
X option to floor variances when they are read in.
(for all programs including ascii/binary conversion)
X ascii/binary conversion program can go both ways.
- dynamic DTs, error messages should print cur name as well as base name.
- make DT such that even if overlap exists, binsearch will occur.
(add option to search from the middle outward).
- add option to split/vanish top/bottom N mixtures regardless of that.
- for documentation:
- good idea to turn on conservative vanishing right after forced splitting.
This is to make sure that the splitting was a good idea. A forced
split might not be a good idea. The split Gaussian might
dwindle away during training after a split, so keeping conservative
vanishing will keep that from happening (and will keep the
variances from being large).
- option to turn off all warnings and notes.
- accumulators pretty printed
- don't allocate nextmean next covar until end of em iteration
since they are contained in component.
define a new bit in emable to support this (since need
to know the first time end of em thing is called).
- ability to produce viterbi paths with mixture variables
using the gaussian mixture objects.
- write a vector version of log(1 + exp(x))
Wed Aug 18 19:43:41 2004: write a specific version of
log(1 + exp(x)) in one function log1pexp(x).
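Sketch (an assumption about what such a routine might look like, not the GMTK implementation): a numerically stable log1pexp, plus a log-domain addition built on it. The helper name log_add is hypothetical, and finite inputs are assumed.

```python
import math

def log1pexp(x):
    """Return log(1 + exp(x)) without overflowing for large |x|."""
    if x > 0.0:
        # log(1 + e^x) = x + log(1 + e^-x); e^-x cannot overflow here.
        return x + math.log1p(math.exp(-x))
    return math.log1p(math.exp(x))

def log_add(log_a, log_b):
    """Return log(exp(log_a) + exp(log_b)) for finite inputs."""
    hi, lo = max(log_a, log_b), min(log_a, log_b)
    return hi + log1pexp(lo - hi)

def log1pexp_vector(xs):
    """Apply log1pexp to every element of a sequence."""
    return [log1pexp(x) for x in xs]
```

A naive math.log(1 + math.exp(x)) raises OverflowError for x around 710 and above, which is the case the x > 0 branch avoids.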
- clean up swap and end EM in gmparams
- add command line option "-format file-type" to the main programs that
will explain the formats of the various files. For example, if there are
no gaussian mixtures, does the master file still have to mention them?
- make DTS such that 'default' is not required, and that if
we have splits w/o a default, then it will have a run-time
error if we ever get a case that doesn't match the guys in the split
- change range error messages to indicate where the error is
in the file, etc. where the error occurs (add an extra
string argument, etc.).
- check that there are no self loops in dlink structures
- add startskip/endskip check
- add link checks
- when no-one left uses a component after vanishing,
get rid of it (add a 'used' bit perhaps in EMable.h).
- make vanishing stuff vanish w/o a trace (i.e., unused
component is gone).
- make more concise all the warning messages about vanishing
(don't need to report all of them, report single summary
message).
- check on error check messages, arguments out of order
in dlink matrix message??? (check with Geoff)
- remove all the using_files stuff in GM.cc
- clean up GM.cc with setExampleStream, and all of that.
- Wed Aug 8 21:58:36 2001 it is still the case that
the gaussian dimensions are not being checked (since
we don't know DT leaf values at start time).
Once tables are in place, we can make sure
that the tables point to all matching gaussians
at start time.
- write our own pre-processor (doesn't have space problem
that cpp has, and also will give standard #line/#file directives).
Perhaps in perl.
- add tag to command line to add to all cloned objects.
- include global missed increment count in accumulators.
- arguments print default values
- viterbi option so that it prints out max likelihood of
one variable summing over all others.
- clean up virtual functions in GMTK_EMable since some of
the EM ones need not be virtual.
x fix arg description of beam
- for docs, when stdfracs are zero for D and B, and
when we clonesharemeans, we might make a copy of
the gaussians when cloning that are exactly the
same as the parent leading to redundant copy of
Gaussians. Make sure to mention this in docs.
- support no training names such as gmMx* for foobar*
- state clustering, ala HTK, occupancy counts.
- when segment is too short during training, skip it
rather than exiting with an error.
TODO:
- file formats (table & output file)
- record phone numbers
- src id
- icassp
- accumulate multiple accumulators, give a list of accumulator
files when numeric range doesn't work to support accumulators
on different machines.
- go through and making sure all the tying logic and not-training options work.
- change internal class names from ??CPT to the SparseCPT, DenseCPT, etc.
- create a name index type so that DT's can be used for the following.
- in mappings to GM indexes, DT leaf specifies a relative
offset in a table rather than in the global collection of GMs.
- in MSCPTs, so that row elements of the MSCPT point to
offsets in a table for the 1dPDFs rather than in the global
table. Add entry in MSCPT definition in data file.
- a way of adding counts to discrete CPTs without needing to specify an entire accumulator
file.
- cpp program is determined by environment variable if it exists.
- remove from parser the integer index stuff since string names exist.
- remove all cin/cout and use printf/scanf
- add -version flag to everything.
- from Gang.
1. When I want to print the hidden variables I met the following:
suppose my varList file is the following
wordLatticeState
state
phoneTransition
wordTransition
and my fileList file is the following
wordLatticeState.log
state.log
phoneTransition.log
wordTransition.log
The output will put everything in the file wordLatticeState.log. In that
file the first integer will be wordLatticeState(0), second will be
state(0), so on and so forth. It didn't create other three files.
2. suggestion
in gmtkViterbi, cppCommand is very useful. But if I have a lot of macros,
the command line string will be very long. I wonder whether there is a way for
this to take a file of macros.
JB answer: but this can be done using cpp's #include in one of the
files. TODO: add this to documentation.
- add implicit approach to tutorial (from Yimin).
- Graphviz and Grappa from AT&T can visualize graphs very nicely.
-
>.
>WARNING: Ending EM iteration but 124 rows of MDCPT 'mannerMDCPT' had zero
>counts. Using previous values for those rows.
>
>Actually, one thing that would be useful would be if it were possible to have
>a "verbose" option where it tells you which rows had zero counts.
>
------------------------------------------------------------
Tue Jun 18 17:47:44 2002
> >(2) Does GMTK allow you to use models that aren't strictly probabilistic?
> >E.g. if I wanted to weight the acoustic model score relative to the language
> >model, would I be able to do that? I know that for some purposes I can "fool"
> >it, as I've done with the feature model, by having multiple observation
> >variables pointing to the same observations, but that is fairly limited.
>
> Not at the moment, but that would be really easy to add (and I'd be happy
> to do that for you).
------------------------------------------------------------
FIXED Wed Jun 19 19:27:26 2002
Tue Jun 18 17:51:42 2002
> >(1) I am training a new model, and am getting an error with the new version of gmtkEMtrain but not the old version. The error occurs
> >when loading accumulators (but not when training without accumulators), and the message I get is:
> >
> >error in accumulating accumulators: /t/klivescu/aurora/articulatory5/MISC/emtrain.1.log:
> >EOF occurred in readDouble, file '/t/klivescu/aurora/articulatory5/MISC/acc_file_1.data': MDCPT load accums
> >Loading accumulators from '/t/klivescu/aurora/articulatory5/MISC/acc_file_1.data'
>
> Hmm. Are you using accumulators that were generated with the old
> version with the new version? I don't think any of that code was
> changed, but it is possible there might have been a bug introduced.
No, the accumulators were generated with the new version. I ran a
few more experiments to try to figure this out, and it seems that
it has to do with the size of the accumulator files--i.e. it only
happens when the accum files have only a few utterances' worth of
data. E.g. if I train on 50 sentences broken up into 2 accum files,
it's fine; but when broken up into 10, I get the error. Perhaps it
is unhappy about some things not being observed in the accum files?
> That would be good. Could you set up the bug on music and/or orca?
I set up the files on both machines. On both machines, everything is in
~klivescu/aurora/articulatory5. The NOTES file contains the command
lines for the old/new versions (sorry about all the long path names--
the commands were copied directly from script-generated makefiles).
I think I made everything group-writable, so you should be able to
run the commands.
Thanks!
Karen
------------------------------------------------------------
FIXED Wed Jun 19 19:27:26 2002
Tue Jun 18 17:51:16 2002
(3) Not so much a question as just letting you know about another error
(different from the last one) that I got with the new version but not the old
one. This one also occurred when combining accumulators. The error message
was:
Loading accumulators from '/homes/klivescu/aurora/articulatory7/MISC_NEW/acc_file_1.data'
GMTK_MeanVector.cc:718: failed assertion `emEmAllocatedBitIsSet()'
IOT/Abort trap (core dumped)
This occurred while training a feature model with clustered features. The only special thing I can think of about this model is that there were a number of Gaussians that I wasn't training (via -objsNotToTrain) because they correspond to impossible combinations of feature values. I put all the files necessary to replicate the error on music in ~klivescu/aurora/articulatory7. The NOTES file in that dir has the commands that I ran for both the old and new versions.
-----------------------------------------------------------------
FIXED Wed Jun 19 19:27:26 2002
Wed Jun 19 11:34:25 2002
At the moment, we are not checking for emAmTrainingBitIsSet() in
GMTK_MeanVector.cc, GMTK_DlinkMatrix.cc, and GMTK_DiagCovarVector.cc
in all of the EM routines, except for the swap routine (i.e., if the
bit is not set, we don't swap). The reason for this is as follows.
When sharing is not on, it is fine to check this bit before each EM
routine, and if not set, do nothing. The problem is that with sharing,
when we compute the updates for the shared means, we'll need the
counts for not only the means but also for the covariances, and vice
versa. A similar situation arises when dlinks are involved. One
solution might be to activate the accumulation if it is seen that
sharing is occurring, but the problem with this is that, at the moment,
we don't know if sharing is occurring until after emIncrement is called
for the 2nd time on the same mean object. If the bit is off the first
time, we might miss the first accumulation.
The solution now is to compute all the counts for the
means/variances,etc. in all cases, even when the not_training bit is
on, but this can be very wasteful. Another problem is that we need to
save the accumulators when doing parallel training, even when the
means are not being trained. This means that the training bit could be
off, but we are accumulating accumulators for the shared object so the
accumulators need to be saved even when the training bit is off. This
logic therefore needs to be rethought.
-----------------------------------------------------------------
- get automatic allocation of DenseCPTs working again.
Tue Jul 2 19:35:29 2002
- errors when reading in parameter files should be more informative
and perhaps say where in the parameter files the problem is.
Tue Jul 2 19:38:36 2002
- no need for warning about accumulatedProb = 0 for MTCPT
Tue Jul 2 19:53:43 2002
add to documentation something about normalization features
and variance floor. Hints/tips on what to do here.
- possibly a diff. var floor for each feature vector element/
XX Tue Jul 2 20:24:18 2002
- add a 'noscore' CPT so that we can do true conditional discrete
observations, similar to what can be done for discrete observations.
Sun Jul 07 17:16:56 2002
- go through all parameter reading code to produce better error messages
XX Sun Jul 07 17:16:56 2002
(e.g., realarray, and all of that, probably eliminate realarray class
as it's not doing much).
Sun Jul 07 23:40:36 2002
- idea, for off-line triangulation heuristic, rather than
min-weight, choose "min dynamic weight." The dynamic
weight is obtained by running through all possible values
in the clique and choose the node to eliminate next which
results in the fewest number of clique instantiations.
Possible problem. Implementation of edges changes for
each utterance, but it is only the decision trees that
change.
Options:
1) Do this for one possible value of the DTs and
hope that this works for the rest
2) do this for all, and average over the different
DT values (or take min of max, etc.)
3) have user provide "canonical" DT examples
that are used for triangulation.
4)
Question of how much to expose user to triangulation issues:
Goal: try to make it possible for them to experiment with
better strategies w/o asking them to understand everything,
but still include it in the tutorial section.
Fri Jul 12 23:39:17 2002
- rewrite issue with di_xxCPT type of CPT objects being
declared. There should be one CPT object type used in
the program, and all guys inherit from it. Q: how
to do the separate namespace for MDCPTs, MTCPTS, and MSCPTs?
- this also an issue for mixGaussians. Need to re-think all
of this.
Sat Jul 13 02:05:27 2002
- ascii file reading should read in ascii files preprocessed
by CPP (optionally and by default). Also, the list of files
itself should be processed by CPP.
Tue Jul 16 18:38:44 2002
- when variances get floored, message should also include occupancy
probability (to see if counts are low)
Thu Jul 18 15:38:36 2002
- add comment to docs about problem with single quotes in GMTK comments
because CPP has trouble
XX Thu Jul 18 15:38:51 2002
- add neg values between -infty and -0 being log probabilities.
Mon Jul 22 13:39:02 2002
- add neg vals cpt and dpmf to docs,
- add to docs that normthres can be == 0 to turn it off.
Fri Jul 19 10:11:04 2002
DONE: add ability to specify initial counts on command line for discrete objects
(M?CPT, DPMF) during training, to do a form of Laplace smoothing.
Perhaps similar to obsnottotrain option.
(done Sat Jul 23 16:02:39 2005, general Dirichlet prior model for DenseCPTs and DPMFs)
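Sketch (illustration only): one common way initial counts can act as smoothing for a single row of a discrete table, by adding them to the accumulated counts before normalizing. With all prior counts equal to 1 this is Laplace smoothing. The name smoothed_row is hypothetical, and this is not necessarily how the Dirichlet prior mentioned above is implemented.

```python
def smoothed_row(counts, prior_counts):
    """Normalize accumulated counts plus prior (pseudo) counts into probabilities."""
    if len(counts) != len(prior_counts):
        raise ValueError("counts and prior_counts must have the same length")
    total = sum(counts) + sum(prior_counts)
    if total <= 0:
        raise ValueError("row has no counts and no prior counts")
    return [(c + a) / total for c, a in zip(counts, prior_counts)]
```

For accumulated counts (0, 3, 1) and prior counts (1, 1, 1), the row becomes (1/7, 4/7, 2/7), so a value never seen in training still gets non-zero probability.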
Fri Jul 19 10:13:22 2002
- idea. Each gmtk object should have a set of options that can
be set when the object is defined, rather than in a separate file???
Fri Jul 19 11:59:44 2002
- add karen's MCVR/MCSR comments to the docs
Fri Jul 19 18:48:57 2002
- use regex library for obs to not train, and other such
things.
Mon Jul 22 13:40:06 2002
- idea about triangulation heuristic
(when doing min weight, when a det. var and its parents
live in the clique, don't multiply the child's cardinality
into the clique weights). So, if there is only one
random parent, entire clique weight = cardinality of
that parent).
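Sketch (illustration only, not GMTK's triangulation code): a clique weight that skips a deterministic variable when all of its parents are also in the clique, as the idea above suggests. The names clique_weight, card, and det_parents are hypothetical.

```python
def clique_weight(clique, card, det_parents):
    """Product of cardinalities, ignoring deterministic nodes fixed by in-clique parents.

    clique:      iterable of variable names
    card:        dict mapping variable name -> cardinality
    det_parents: dict mapping each deterministic variable -> list of its parents
    """
    members = set(clique)
    weight = 1
    for var in members:
        parents = det_parents.get(var)
        if parents and set(parents) <= members:
            # The value of var is determined by its parents, so it adds
            # no new instantiations to the clique.
            continue
        weight *= card[var]
    return weight
```

For a clique {A, B} with cardinalities 10 and 50 where B is a deterministic function of A, the weight is 10 rather than 500.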
Mon Jul 22 15:15:04 2002
- when structure contains no observations in file, shouldn't
require obs file (but how to determine unrolling in that case?)
Thu Jul 25 19:57:48 2002
- add a per sentence like range option to all programs, like
in the pfile tools. This would pass directly down to the
file reading utilities.
- Update the pfile tools to use the new file format.
Fri Jul 26 14:52:53 2002
- add appropriate warning/informational message mechanism,
so that users can see more/less of the warning information, with
command line option (e.g., for dets hitting zero, and any
other funny hacks that occur).
Wed Jul 31 22:04:27 2002
DONE: actually, rather than laplace smoothing, add an object type
that can include initial counts to use for MDCPTs and MSCPTs
(use same file format as these, but just include pos integer or fp counts
rather than probabilities).
(done Sat Jul 23 16:03:05 2005)
Tue Aug 6 22:49:42 2002
- use email to chiaping on Tue Aug 6 22:49:46 2002 about unity
score Gaussian -> docs
Wed Aug 7 13:46:35 2002
- LM scale factors,
for vit decoding, give the option to raise the prob of
a rand var prob to a power (keep it static for now in
the structure file).
Fri Aug 9 10:33:17 2002
- change low level fileparser to when it has an error
message keep track of the line number & file name of file that it
is parsing for error messages. Need to parse CPP options
as well in that case.
Sun Aug 11 18:07:53 2002
- obsNotToTrain file formats
- should allow multiple objects on the same line
- should use a better format (regexpressions, etc.)
Wed Aug 14 14:08:19 2002
given a word-int map, allow the inclusion of word matrices
WidMatrix to map to internal GMTK matrices, using loadWordFactors
and Katrin's tag-word format. I.e., support observed data
in the form of words/utterances for a word map.
Mon Sep 30 01:21:02 2002
- next set of things might be redundant with above, but just a check
1) deterministic CPTs shouldn't have messages/notes, etc. about
not getting any accumulated probability, etc.
2) error messages with MSCPTs, MDCPTs, use new names
Wed Dec 4 19:39:41 2002
- docs: ascii file formats for data files, add bit in documentation that
one frame per line, and fix error message on line 1272 in GMTK_GM.cc
(see email from Au@hk).
Tue Dec 17 13:31:36 2002
>2. Usually LM scale factor is accompanied with the insertion penalty, i.e.
>
> scale * log P + penalty
>
> I think, including insertion penalty would be quite useful for
> incorporating LMs with GMTK.
(actually, do the switching variable scale/penalty stuff).
Email from Dec 18th
> A way, then, of getting an insertion penalty effect is to have a
> switching scale and penalty, syntax could be:
>
> weight:
> value 1.0 value 0.0 // scale = 1, penalty = 0
> | value 5.0 value 3.0 ; // scale = 5, penalty = 3
>
> which mimics the conditional parents notation. In this case, it uses
> the first scale,penalty when the switching parent is in its first
> region, and the 2nd when the switching parent is in its second region.
> So, when it switches in the true bigram (meaning a word transition
> occurred).
>
> We could have time-dependent scale,penalty by doing:
>
> weight:
> observed 0:0 observed 1:1
> | observed 2:2 observed 3:3 ;
>
> where it would be assumed that observations 0-3 contain only scales
> and penalties.
>
> Thoughts??
That's funny because I was about to ask you what happens when you have switching
parents and whether you can change the weight for the different branches of the
switch. I didn't think about the penalty but that would be very useful too.
The syntax above looks good to me, except I might make it more explicit about
which is the scale and which is the penalty, e.g.
weight: scale value 1.0 penalty observed 0:0 ;
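Sketch (illustration only; the syntax above was a proposal and this is not GMTK code): how a switching scale/penalty pair would change a log probability, following the scale * log P + penalty form quoted above. The function names and the region index are hypothetical.

```python
def weighted_log_score(log_p, scale, penalty):
    """Apply a scale factor and an additive penalty to a log probability."""
    return scale * log_p + penalty

def switching_weighted_score(log_p, region, weights):
    """Pick the (scale, penalty) pair for the switching parent's region, then apply it.

    weights: list of (scale, penalty) pairs, one per switching region
    region:  index of the region the switching parent is currently in
    """
    scale, penalty = weights[region]
    return weighted_log_score(log_p, scale, penalty)
```

With weights [(1.0, 0.0), (5.0, 3.0)] and log P = -2.0, the first region leaves the score at -2.0 and the second gives 5 * (-2.0) + 3 = -7.0.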
Thu Dec 19 17:36:22 2002
- topological sort loop detection should indicate
which node is involved in a loop.
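Sketch (illustration only, not the GMTK routine): a depth-first topological sort that, on finding a loop, reports the nodes on it. The name topological_order and the dict-of-children graph layout are hypothetical.

```python
def topological_order(children):
    """Return nodes in topological order, or raise ValueError naming a loop.

    children: dict mapping each node -> list of its child nodes
    """
    UNSEEN, ACTIVE, DONE = 0, 1, 2
    state = {node: UNSEEN for node in children}
    order, path = [], []

    def visit(node):
        state[node] = ACTIVE
        path.append(node)
        for child in children.get(node, ()):
            child_state = state.get(child, UNSEEN)
            if child_state == ACTIVE:
                # child is still on the current DFS path, so the path from
                # child back to child is a loop.
                loop = path[path.index(child):] + [child]
                raise ValueError("loop involving: " + " -> ".join(map(str, loop)))
            if child_state == UNSEEN:
                visit(child)
        path.pop()
        state[node] = DONE
        order.append(node)

    for node in list(children):
        if state[node] == UNSEEN:
            visit(node)
    order.reverse()
    return order
```

For the graph a -> b, b -> a the error message reads "loop involving: a -> b -> a", which points the user at the offending variables rather than only saying that a loop exists.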
Sun Dec 22 21:09:32 2002
- unrolling routine should not allow for P to reference into
E and E to reference into P (i.e., P and E can not have
a parent on the other side of a chunk).
Why?
- then the interface method no longer makes sense. This is
because that constrained triangulation for DBNs
is implicitly assumed by the interface algorithm (i.e.,
nodes up until the 'face' are eliminated first). With
links across chunks, then the face might live
in both the chunk and E at the same time, might
cause large cliques, and would make triangulation
more difficult.
Note that we could have a mode that allowed for this,
if it used the old strategy of first unrolling the
network, then moralizing, triangulating, and then
doing inference.
Idea: since we'll need such a mode for unrolling
by 0 cases anyway, we could use it for the
case when E links into P (and vice versa).
- Makes the concept of unrolling more difficult. If
links could span across a chunk, then depending
on the unrolling amount, a link with child in
say P might at one time
link into a portion of the unrolling chunk, at
another time, might link into a node in E.
- update: Thu Dec 26 03:19:08 2002
perhaps allow for this to occur when using
the unroll graph and then triangulation mode (e.g.,
the same mode that works for snakes will work for here as well).
- update: Wed Aug 18 20:11:08 2004
also, allow this when dealing with static template graph case.
Sun Dec 22 21:12:43 2002
- idea: wrap-around mode?
i.e., allow for negative indices at the beginning of a graph
to link into the end (i.e., P can link to E using negative)??
- but this would have same problems as linking from P to E
in note above.
- Wed Aug 18 20:11:34 2004: does this relate to loopy BP?
Tue Dec 24 00:25:18 2002
- find best {left,right} interface should have the option
to judge the quality of the interface not by size
but also by weight (which includes deterministic
variables).
Tue Dec 24 08:02:50 2002
for min weight heuristic
// TODO: should also look at if this is a sparse CPT,
// and if so, multiply by the average density (number of non-zeros)
// of the CPT rather than the entire cardinality. For a very
// sparse CPT, for example, using just the cardinality
// is *very* conservative.
a sparse CPT can provide this estimate itself of the
average cardinality. E.g., it can compute over
all parent values number of columns that are non-zero,
and take average, that is the 'effective' cardinality.
Can do this by looking at sparcePMF.length() field
on line 228 of GMTK_MSCPT.cc.
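A rough sketch of the 'effective' cardinality described above: the average,
over parent configurations, of the number of non-zero child entries. The row
layout here is a stand-in assumed for illustration, and effectiveCardinality
is a hypothetical name; the actual sparse CPT storage in GMTK_MSCPT.cc may
differ.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical layout: rows[i] holds only the non-zero child probabilities
// for the i-th parent configuration (zeros are not stored).
double effectiveCardinality(const std::vector<std::vector<double>>& rows) {
    if (rows.empty()) return 0.0;
    std::size_t nonZeros = 0;
    for (const auto& row : rows) nonZeros += row.size();
    // Average number of non-zero entries per parent configuration.
    return static_cast<double>(nonZeros) / static_cast<double>(rows.size());
}
```

The min weight heuristic could then use this value in place of the full
cardinality of the child.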
Tue Dec 24 08:26:10 2002
define a cardinality of a CPT based on the
average 2^entropy of the node, under the
assumption that pruning will remove stuff below
those values that are not significant.
This is another heuristic to add (and should be
controllable via command line parameter).
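A sketch of the entropy-based cardinality. The note is ambiguous between
averaging 2^H per parent configuration and taking 2^(average H); this assumes
the former. entropyBits and entropyCardinality are hypothetical names, and
rows[i] is assumed to be the child distribution for the i-th parent
configuration.

```cpp
#include <cmath>
#include <vector>

// Entropy in bits of one distribution over child values.
double entropyBits(const std::vector<double>& p) {
    double h = 0.0;
    for (double x : p)
        if (x > 0.0) h -= x * std::log2(x);  // 0 * log 0 is taken as 0
    return h;
}

// Assumption: "average 2^entropy" means the mean of 2^H over all parent
// configurations (rows), not 2 raised to the mean entropy.
double entropyCardinality(const std::vector<std::vector<double>>& rows) {
    if (rows.empty()) return 0.0;
    double sum = 0.0;
    for (const auto& row : rows) sum += std::exp2(entropyBits(row));
    return sum / static_cast<double>(rows.size());
}
```

A uniform row over k values contributes exactly k, and a deterministic row
contributes 1, so the result interpolates between the two extremes.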
Tue Dec 24 13:58:16 2002
- for snake structure, can have a module that
still does regular unconstrained triangulation and
unrolls and triangulates for each length (in that
case, snake will be better)
Tue Dec 24 13:58:48 2002
- for min-fill, min-weight, etc., when there
is a tie, back off to constrained triangulation
(e.g., eliminate the earliest node first, or
a random node selected from among the earliest
nodes)
Wed Dec 25 02:09:25 2002
- with triangulation heuristics, when a tie occurs
then use another heuristic (i.e., with a tie with
min weight, then use min fill in)
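The two tie-breaking notes above can be combined into one lexicographic
comparison, sketched here. The Candidate fields and the particular order
(min weight, then min fill-in, then earliest position) are assumptions for
illustration; any other heuristic chain would be expressed the same way.

```cpp
#include <cstddef>
#include <tuple>
#include <vector>

// Hypothetical per-node scores; lower is better for every field.
struct Candidate {
    int node;       // node id
    double weight;  // primary heuristic: min weight
    int fillIn;     // first tie-breaker: number of fill-in edges
    int position;   // second tie-breaker: earliest node first
};

// Returns the index of the candidate to eliminate next.
// Precondition: c is non-empty.
std::size_t pickNext(const std::vector<Candidate>& c) {
    std::size_t best = 0;
    for (std::size_t i = 1; i < c.size(); ++i) {
        // std::tie compares field by field, so later fields only matter
        // when all earlier ones are tied.
        if (std::tie(c[i].weight, c[i].fillIn, c[i].position) <
            std::tie(c[best].weight, c[best].fillIn, c[best].position))
            best = i;
    }
    return best;
}
```

Choosing a random node among the remaining ties would replace the final
position comparison with a random draw.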
Wed Dec 25 12:50:41 2002
- also include an anytime triangulation algorithm
exhaustive search. It should occasionally
print out the percent searched so far, and
should accept a SIGUSR2 to terminate search
so far and take answer. Also, should
include an argument that is the time (in days
and hours) to compute triangulation for, and
stop after that amount of time.
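A sketch of the stopping logic only (signal plus time budget), not of the
triangulation search itself. It assumes a POSIX system for SIGUSR2; the
candidate enumeration is reduced to an index range with a caller-supplied
score, and anytimeSearch, onStopSignal and g_stopRequested are hypothetical
names.

```cpp
#include <chrono>
#include <csignal>
#include <cstddef>
#include <cstdio>

// Written by the signal handler; sig_atomic_t is safe to set from a handler.
static volatile std::sig_atomic_t g_stopRequested = 0;

extern "C" void onStopSignal(int) { g_stopRequested = 1; }

// Evaluates candidates 0..numCandidates-1 and returns the index of the best
// (lowest score) one seen before the search was stopped by SIGUSR2, by the
// time budget, or by running out of candidates.
template <typename Score>
std::size_t anytimeSearch(std::size_t numCandidates, Score score,
                          std::chrono::seconds budget) {
    std::signal(SIGUSR2, onStopSignal);  // POSIX-only signal
    const auto deadline = std::chrono::steady_clock::now() + budget;
    std::size_t best = 0;
    double bestScore = 0.0;
    for (std::size_t i = 0; i < numCandidates; ++i) {
        const double s = score(i);
        if (i == 0 || s < bestScore) { best = i; bestScore = s; }
        if (i % 1000 == 0)  // occasional progress report
            std::fprintf(stderr, "searched %.1f%%\n",
                         100.0 * i / numCandidates);
        if (g_stopRequested || std::chrono::steady_clock::now() >= deadline)
            break;  // keep the best answer found so far
    }
    return best;
}
```

A command-line argument given in days and hours would be converted to
std::chrono::seconds before the call.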
Thu Dec 26 02:16:36 2002
- fix: core dump issue when graph is not connected.
(i.e., when chunk is not connected to itself)
Thu Dec 26 02:21:28 2002
- Chris Bartels Todo:
- write a class Anytime_Triangulation
that has same interface as basicTriangulation routine
- write an anytime exhaustive search triangulation
- get MCS working with current graphs, so we can
verify that graphs are indeed triangulated.
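For the MCS item, a sketch of a maximum cardinality search used as a
triangulation check. It relies on the standard result that a graph is chordal
exactly when, for every node visited by MCS, its already-visited neighbours
form a clique. The adjacency representation and the name isTriangulated are
assumptions, not the current GMTK graph classes.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// adj[v] is the set of neighbours of v in an undirected graph.
// Returns true if the graph is triangulated (chordal).
bool isTriangulated(const std::vector<std::set<int>>& adj) {
    const int n = static_cast<int>(adj.size());
    std::vector<int> weight(n, 0);         // count of visited neighbours
    std::vector<bool> visited(n, false);

    for (int step = 0; step < n; ++step) {
        // MCS: pick the unvisited node with the most visited neighbours.
        int v = -1;
        for (int u = 0; u < n; ++u)
            if (!visited[u] && (v < 0 || weight[u] > weight[v])) v = u;

        // The visited neighbours of v must be pairwise connected.
        std::vector<int> prev;
        for (int u : adj[v])
            if (visited[u]) prev.push_back(u);
        for (std::size_t i = 0; i < prev.size(); ++i)
            for (std::size_t j = i + 1; j < prev.size(); ++j)
                if (adj[prev[i]].count(prev[j]) == 0) return false;

        visited[v] = true;
        for (int u : adj[v])
            if (!visited[u]) ++weight[u];
    }
    return true;
}
```

This simple version is quadratic in the number of nodes, which should be fine
for a verification pass on template-sized graphs.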
Thu Dec 26 03:18:13 2002
- include something in documentation about
cpp0: ....
errors, and that these are generated from cpp not gmtk.
Mon Dec 30 05:16:04 2002
- this is probably mentioned somewhere above, but
change fileParser.{cc,h} to print out line numbers when
there is an error message, and also to understand
cpp's filename and line number mechanism (for ascii files
and when they are processed with cpp).
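A sketch of how a parser could follow cpp's filename and line number
mechanism. It handles the two forms cpp output can carry, '# 12 "file" flags'
and '#line 12 "file"'. SourcePos and consumeLineMarker are hypothetical names,
not part of fileParser.{cc,h}.

```cpp
#include <cstdio>
#include <string>

// Original-source position to use in error messages.
struct SourcePos {
    std::string file;
    int line = 0;
};

// If `text` is a cpp linemarker, updates `pos` and returns true.
// pos.line is set to one less than the marker's number, on the assumption
// that the caller increments pos.line for every ordinary line it reads.
bool consumeLineMarker(const std::string& text, SourcePos& pos) {
    int line = 0;
    char name[1024];
    if (std::sscanf(text.c_str(), "# %d \"%1023[^\"]\"", &line, name) == 2 ||
        std::sscanf(text.c_str(), "#line %d \"%1023[^\"]\"", &line, name) == 2) {
        pos.file = name;
        pos.line = line - 1;
        return true;
    }
    return false;  // ordinary input line; pos is left unchanged
}
```

The read loop would call this on each line, skip the line when it returns
true, and otherwise increment pos.line before tokenizing.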
Fri Jan 3 19:48:46 2003
- this is an old todo item taken out of code.
// snake structure with constrained elimination
// will ruin the clique_size = 2 property
// of the 'snake' structure. The todo is to get this working
// with that (and similar) structures.
/*
* idea: make complete components in C_l *only* if they
* are connected via nodes/edges within either (P + left_C_l)
* or preceding C.
* What we will then have is a collection of cliques for
* the interface(s). In this case, we glue together
* the corresponding sets of cliques. Right interface
* algorithm should be similar (and use E rather than P).
* But might both left and right interface need to be used in the
* same repeated chunk in this case to get clique_size=2???
* alternatively (and easier): for snake, just use the unconstrained
* triangulation method (which works perfectly for snake).
*
Sat Jan 4 17:39:20 2003
- build utilities to convert (at least partially) from
other standard GM network file formats to GMTK and
vice versa. See the
Bayesian Network Repository page.
Tue Jan 7 12:24:16 2003
- remove 'observation' as keyword for weight and just use
observed.
Mon Jan 13 13:31:34 2003
- BUG: when we have a sparse cpt in a parameter file that
refers to a name 'global' for a collection object, that
object does not exist since loadGlobal() was called in the
gmtk programs last (after other trainable and master files
have been called).
Mon Jan 13 13:32:31 2003
- get rid of MSCPT, MDCPT, and MTCPT messages. In sparse CPT
error messages, it says MTCPT when it should refer to sparse CPTs.
Fri Jan 23 13:17:05 2004: change all internal program variables
away from using MTCPT, MDCPT, etc. also
Mon Jan 13 16:54:33 2003
- make sure in documentation that it says that dlink feature index
values are for absolute locations with respect to the
feature files and not relative locations, relative
to the feature range given in the .str file (if this
is not done already).
- parent variables are absolute, with absolute feature locations
given in dlink definition
- child variables are relative, where relative location is given
starting with feature range given in .str file.
Mon Jan 13 17:18:17 2003
- Fix bad error message, 'Error: upper n >- limit n in range string' when training range does
not match the range of utterance numbers in the observation file.
DONE: Thu Jan 30 19:15:59 2003
- optimization
right now, updating of Gaussians during mean and variance updating is as follows:
*covars_p += (*f_p)*(*f_p)*fprob;
*nextMeans_p += (*f_p)*fprob;
but where the two lines of code occur in different files, and so
it can't re-use the common subexpression (*f_p)*fprob
- TODO, write a routine that can update both at the same time,
probably best way to do this is to add a bit of code to emIncrement
in GMTK_DiagGaussian and GMTK_LinMeanCondGaussian for this purpose.
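A sketch of the combined update, computing the shared product once per
feature element. The function name and the plain float arrays are assumptions
for illustration; the real accumulators live in the emIncrement code paths
named above.

```cpp
#include <cstddef>

// Accumulates first- and second-order statistics in a single pass.
// Note: f*(f*fprob) can round differently from (f*f)*fprob in floating
// point, so results may differ slightly from the two-statement version.
void accumulateMeanAndCovar(const float* f, std::size_t len, float fprob,
                            float* nextMeans, float* covars) {
    for (std::size_t i = 0; i < len; ++i) {
        const float tmp = f[i] * fprob;  // common subexpression (*f_p)*fprob
        nextMeans[i] += tmp;
        covars[i] += f[i] * tmp;
    }
}
```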
DONE: Fri Jan 31 00:30:30 2003
- add a '-debug' option that prints various status
messages depending on the '-debug' number.
DONE: Mon Feb 03 14:45:49 2003
- might be above, but add the option to
unroll more times of the C chunk and triangulate that
rather than just one time (this would put more constraints
on the number of allowable frames for a P,C,E graph).