%% R programming course notes --- Stephen Eglen
%% http://www.damtp.cam.ac.uk/user/sje30/
%% Modified by Laurent Gatto <lg390@cam.ac.uk>
%% - misc minor updates
%% - using knitr
%% - scientific/trustworthy software
%% - other misc slides
%% Copyright (C) 2009 Stephen Eglen
%% Permission is granted to copy, distribute and/or modify this document
%% under the terms of the GNU Free Documentation License, Version 1.3
%% or any later version published by the Free Software Foundation;
%% http://www.gnu.org/copyleft/fdl.html
%%
%% If you reuse these notes, please consider citing the
%% PloS Computational Biology article describing these lecture notes:
%% http://www.ploscompbiol.org/doi/pcbi.1000482
\documentclass[]{beamer}
%%\documentclass[notes]{beamer} %% include notes.
\newcommand{\Slang}{\texttt{S} }
\newcommand{\R}{\texttt{R} }
\newcommand{\Rfunction}[1]{{\texttt{#1}}}
\newcommand{\Robject}[1]{{\texttt{#1}}}
\newcommand{\Rpackage}[1]{{\mbox{\normalfont\textsf{#1}}}}
\definecolor{Red}{rgb}{0.7,0,0}
\definecolor{Blue}{rgb}{0,0,0.8}
\usepackage{hyperref}
\hypersetup{%
pdfusetitle,
bookmarks = {true},
bookmarksnumbered = {true},
bookmarksopen = {true},
bookmarksopenlevel = 2,
unicode = {true},
breaklinks = {false},
hyperindex = {true},
colorlinks = {true},
linktocpage = {true},
plainpages = {false},
linkcolor = {Blue},
citecolor = {Blue},
urlcolor = {Red},
pdfstartview = {Fit},
pdfpagemode = {UseOutlines},
pdfview = {XYZ null null null}
}
%% Lecture notes have been made using the Beamer class for LaTeX.
%% http://latex-beamer.sourceforge.net/
%%
%% You will also need textpos.sty, which comes via:
%% http://www.ctan.org/tex-archive/macros/latex/contrib/textpos/
%%
%% There is an extra makefile that will help with creating versions of
%% the lecture notes either for the lecturer:
%%
%% make spr.pdf
%% make spr-4up.pdf
%%
%% or the student (4-up, A4 paper):
%%
%% make h.pdf
%% If the 'Outline' slides are empty, try 'pdflatex rpc' after 'make
%% rpc.pdf' and they should appear. (Any hints on how to fix this in
%% the Makefile?)
%% This file also includes some notes, which are included in the
%% output if you pass the [notes] option to the beamer documentclass,
%% see above. (Look for \note{...} in slides below.)
%% Default will be false for this ifhandouts. Set handouts to true
%% and then you will get 4up handouts suitable for a4paper.
\usepackage{ifthen}
\providecommand*{\handouts}{false}
%% lecs:
%%\includeonlylecture{sat}
%% Following is useful for getting 4-up output directly to A4 pdf.
\ifthenelse{\equal{\handouts}{true}}
{\usepackage{pgfpages}
\pgfpagesuselayout{4 on 1}[a4paper,landscape]}
{}
\usepackage{bm} % Bold math allows Greek symbols in bold.
%% Listings package is used for including R commands etc.
\usepackage{listings}
\lstset{commentstyle=\color{red},keywordstyle=\color{black},
showstringspaces=false}
\lstnewenvironment{rc}[1][]{\lstset{language=R}}{}
%%\newenvironment{rc} {\begin{alltt}\small} {\end{alltt}}
\newcommand{\adv}{{\tiny (Advanced)}}
\newcommand{\ri}[1]{\lstinline{#1}} %% Short for 'R inline'
\lstnewenvironment{rc.out}[1][]{\lstset{language=R,%%
morecomment=[is]{/*}{*/},%
moredelim=[is][\itshape]{(-}{-)},frame=single}}{}
%%\usepackage{emaxima}
\usepackage[overlay]{textpos} %For using textblock
\setlength{\TPHorizModule}{10mm}
\setlength{\TPVertModule}{\TPHorizModule}
\newcommand{\ds}{\vspace*{5mm}}
\newcommand{\xstar}{\ensuremath{x^\ast}}
\newcommand{\vmu}{{\bm{\mu}}\xspace}
%%\usepackage{theapa}
\usepackage{amsmath,graphicx}
%%\usepackage{multimedia} %%Need for movies.
\newcommand{\smallref}[1]{{\small #1}}
%% \newcommand{\mybox}[1]{\fbox{#1}}x
%% \graphicspath{{../talk_figs/}{/home/anotherpath/}}
\graphicspath{{figs/}}
\setlength{\TPHorizModule}{10mm}
\setlength{\TPVertModule}{10mm}
\newcommand{\colhalf}{\column{0.49\textwidth}}
\author{
Stephen Eglen\\
Laurent Gatto%%\footnote{\url{lg390@cam.ac.uk}}\footnote{\url{http://proteome.sysbiol.cam.ac.uk/lgatto/teaching/spr.html}}\\
}
\date{\today}
\mode<presentation>
{
\setbeamersize{text margin left=0.25cm}
\setbeamersize{text margin right=0.25cm}
\beamertemplatedotitem
\beamertemplateheadempty %% Remove headline (at top of frame)
%% \beamertemplatefootempty %% Remove footline (at bottom of frame)
\beamertemplatefootpagenumber %% pagenumber only in footer.
%% Remove navigation icons.
\beamertemplatenavigationsymbolsempty
%% Show start of every lecture. Not available in article.
%% \AtBeginLecture{\begin{frame}{\Large Lecture \insertlecture}\end{frame}}
}
\mode<article>
{
\usepackage{fullpage}
\usepackage{pgf}
\usepackage{hyperref}
%%\setjobnamebeamerversion{aa}
}
%% This is run at the start of every section.
\AtBeginSection[] % Do nothing for \section*
{
\begin{frame}<beamer>
\frametitle{Outline}
\tableofcontents[currentsection]
%%\frametitle{currentsection}
\end{frame}
}
\title{Scientific Programming with \R}
\begin{document}
\lstset{language=R}
%% Switch off title page.
%% \begin{frame}
\mode<article>
{
\date{\today}
\maketitle
These are the lecture notes for the programming course.
}
\mode<presentation>
{
\date{\today}
\maketitle
}
<<external-src, cache=FALSE, echo=FALSE>>=
library(knitr)
options(width = 60)
opts_chunk$set(prompt = TRUE,
comment = '',
fig.align = 'center')
read_chunk('src/src.R')
@ %% $
\lecture{1: Introduction to \R}{intro}
\begin{frame}
\frametitle{Books and online help}
\begin{itemize}
\item Introductory Statistics with \R (Springer, Dalgaard).
\item A first course in statistical programming with \R (CUP, Braun
and Murdoch).
\item Computational Genome Analysis: An Introduction (Springer,
Deonier, Tavar{\'e} and Waterman).
\item \Slang programming (Springer, Venables and Ripley).
\item \R programming for Bioinformatics (CRC Press, Gentleman).
\item Scientific programming and simulation using \R (CRC, Jones,
Maillardet and Robinson).
\item The art of R programming (No Starch Press, Matloff).
\item Writing Scientific Software (WSS) (CUP, Oliveira and Stewart).
\item \url{www.r-project.org}, \url{www.rseek.org}, \url{www.r-bloggers.com}
\item \R-help mailing list.
\ds
\item Eglen (2009) \url{http://www.ploscompbiol.org/doi/pcbi.1000482}
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Aims of course}
This course aims to teach \R as a general-purpose programming language.
Issues specific to Computational Biology (e.g. Bioconductor packages) are covered in
other course modules.
In part 1, topics to be mastered in this course include:
\begin{itemize}
\item Interactive use of \R.
\item Basic data types: \Robject{vector}, \Robject{matrix}, \Robject{list},
\Robject{data.frame}, \Robject{factor}, \Robject{character}.
\item Writing scripts.
\item Graphical facilities.
\item Writing your own functions.
\item File input/output.
\item Control-flow statements, looping.
\item Vectorization.
\item Numerics issues.
\item Debugging.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Part 2: Scientific computing issues}
In part 2 of the course\footnote{tentative}, we will explore various other topics,
building on core knowledge of \R.
\begin{itemize}
\item Numerical integration
\item Phase plane analysis
\item Handling large files/databases
\item String processing (e.g. for genomic data)
\item Advanced graphing / presenting results
\item Reproducible research
\item Future directions (\R and generally)
\item (Object-oriented programming)
\item (Package development)
\item (\R profiling)
\end{itemize}
\end{frame}
\section{Introduction}
\begin{frame}
\begin{block}{Scientific programming/software}
\begin{itemize}
\item Different from software engineering (but should try to adhere to SE best practice, of course).
\item Moving target.
\item Domain scientists write the code (some argue this is a weakness).
\item Must be accurate, user-friendly
(no GUI vs CLI ranting here), usable and useful, flexible, efficient
and open; it should be owned by the community and facilitate
reproducible research.
\item Should contribute to user education (i.e. not be a black box)
in terms of data requirements,
data processing and
interpretation of results. \\
Importance of documentation.
\end{itemize}
See Gentleman et al. (2004) Genome Biology (Bioconductor paper) for an example of successful scientific software.
\end{block}
\end{frame}
\begin{frame}
\begin{block}{Trustworthy software}
The complexity of the data processes and of the computations applied
to them mean that those who receive the results of modern data analysis
have limited opportunity to verify the results by direct observation. Users of
the analysis have no option but to trust the analysis, and by extension the
software that produced it. Both the data analyst and the software provider
therefore have a strong responsibility to produce a result that is trustworthy,
and, if possible, one that can be \textit{shown} to be trustworthy. \ds
This places an obligation on all creators of software to program in such a
way that the computations can be understood and trusted. \ds
John M. Chambers, \textit{Software for Data Analysis} (Springer)
\end{block}
\end{frame}
\begin{frame}
\frametitle{What is \R?}
\begin{itemize}
\item Statistical computing environment, and programming language.
\item Very popular in many areas of statistics, computational biology.
\item ``Programming with data'' (Chambers)
\item Approach: command-line for one-liners; interactive usage; write
scripts/functions for larger work (edit/run cycle);
develop package for consolidation and distribution.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{History}
\begin{itemize}
\item \Slang language came from Bell Labs (Becker, Chambers and Wilks).
Commercial version S-plus (1988).
\item \R emerged as a combination of \Slang and Scheme: Ross Ihaka and
Robert Gentleman (NZ).
\item 1993: first announcement.
\item 1995: 0.60 release, now under GPL.
\item 2015-08-14: release 3.2.2.
Stable, multi-platform. Major release every April.
\item R-core now 21 people, key academics in field,
including John Chambers.
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Strengths of \R}
\begin{itemize}
\item GPL'd, available on many platforms.
\item Excellent development team with yearly release cycle.
\item Source always available to examine/edit.
\item Fast for vectorized calculations.
\item Foreign-language interface (C/Fortran) when speed crucial, or
for interfacing with existing code.
\item Good collection of numerical/statistical routines.
\item Comprehensive \R Archive Network (CRAN) $\sim$ 9280 packages
[2016-10-04] (cf 1000 in April 2007).
%% Number of packages listed near top of
%% http://cran.r-project.org/web/packages/
\item On-line doc, with examples.
\item High-quality graphics (pdf, postscript, quartz, x11, bitmaps).
Often used just for plotting \ldots
%%\item Passing arguments to functions is nice \ldots
\end{itemize}
\end{frame}
\begin{frame}
\frametitle{Graphics example}
\centerline{\includegraphics[width=11cm]{figures/gpQuality_jean}}
{\small Jean YH Yang; gpQuality \url{http://bioinf.wehi.edu.au/marray/ibc2004/lect1b-quality.pdf}}
\end{frame}
\begin{frame}
\frametitle{Weaknesses of \R}
\begin{itemize}
\item Loops are slow. Learn how to vectorize solutions. %% or use apply family of functions.
\item No fast compiler yet, and unlikely to happen due to nature of
language. Byte compiler available in \textbf{compiler} package.
\item No (decent) endorsed GUI built into \R.
Tk is available within base \R, and packages for other
graphical toolkits (e.g. Gtk2, Qt) are also
available. \\
``Programming Graphical User Interfaces with \R'',
M. F. Lawrence and J. Verzani
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Brief comparison to matlab}
\begin{itemize}
\item Flexible language, similar to matlab, but definitely not
``everything is a matrix''. Frames, lists, vectors \ldots
\item From matlab to \R:
\url{http://cran.r-project.org/doc/contrib/R-and-octave.txt}
\item Comprehensive matlab and \R guide:
\url{http://www.math.umaine.edu/faculty/hiebeler/comp/matlabR.html}
\item Use \Rfunction{x[i]} not \Rfunction{x(i)} for indexing vectors.
\item Making vectors: \Rfunction{x <- c(10, 9, 5, 1)}
\item Assignment: best to use \Rfunction{<-} rather than \Rfunction{=}.
%% Stay away from underscore!
\end{itemize}
%% \lstset{language=R}
%% \begin{lstlisting}
%% x <- 10
%% x = 10 ## equivalent, more readable?
%% lo.val <- 100 ## not lo_val <- 100
%% \end{lstlisting}
\end{frame}
\begin{frame}
\frametitle{Using \R}
\begin{itemize}
\item Start-up: type `R' at command line.
\item Type commands interactively, and get results.
\item Type commands into a file; \lstinline{source('myfile.R')}; edit
file \ldots
\item Mac/Win has a GUI for interactive use, with internal editors.
\item All platforms have a command-line interface.
\item Many external editors have support for \R,
including
\begin{itemize}
\item Emacs through ESS (\url{http://ess.r-project.org}),
\item Eclipse IDE (\url{http://www.walware.de/goto/statet}),
\item RStudio (\url{http://www.rstudio.org}),
\item \ldots
\end{itemize}
\end{itemize}
\end{frame}
%% TODO
%% \lecture{Basic data types}
%% Start the lectures with a simple example of how to work with R.
\begin{frame}[fragile]
\frametitle{My very first \R session}
\note{when viewing x for the first time, explain how the indices of
the vector are labelled, [1], [8] etc.}
<<firstsession, eval = FALSE, tidy = FALSE, prompt = FALSE>>=
x <- rnorm(50, mean=4)
x
mean(x)
range(x)
hist(x)
## check help -- how to change title?
?hist
hist(x, main="my first plot")
q()
@
\end{frame}
\begin{frame}
\frametitle{Interacting with \R}
\begin{itemize}
\item Can use up/down arrow keys to go through command history.
Within a command, use left/right arrow keys to edit.
\item History can be saved over sessions (\Rfunction{?history}).
\item Multiple commands can be put onto one line, using \Robject{;} as
separator between commands, e.g. \Rfunction{x<-10; y<-3; a <- 5}.
\item \texttt{TAB} can do object/file completion.
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Objects and Functions}
\R manipulates objects. Each object has a name and a type
(\Robject{vector}, \Robject{matrix}, \Robject{list}, \ldots).

Name of an object: letters (upper and lower case are distinct), digits,
period. Start with a letter.

Objects are set by way of assignment. Use the \Rfunction{<-} assignment operator
rather than \Rfunction{=} wherever possible.
(Does \Rfunction{i = i+1} make sense?)
\end{frame}
\begin{frame}[fragile]
\frametitle{Objects and Functions}
<<bjectsandfunctions>>=
x <- 200
half.x <- x/2
threshold <- 95.0
age <- c(15, 19, 30)
age[2] ## [] for accessing element.
length(age) ## () for calling function.
@
\end{frame}
\begin{frame}[fragile]
\frametitle{What's up with the assignment and underscore? \adv}
Historically, underscore was used in \Slang for assignment (because an old
system keyboard had a key equivalent to the ASCII underscore that
generated a back arrow). Hence underscore was not
used within variable names.

More recently, \Rfunction{=} has become available as an assignment operator
(similar to languages like \texttt{C}), but it is frowned upon as it can be
confusing.
What does \Rfunction{i = i+1} imply mathematically?

Better to stick to \Rfunction{i <- i + 1} and use equals just within calls to
functions, e.g. \Rfunction{runif(5, max=3)}.

Note also that assignments return values:
<<assignop>>=
y <- 1 + ( x <- 9 )
a <- b <- 0
@
\url{http://developer.r-project.org/equalAssign.html}
\end{frame}
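\begin{frame}[fragile]
\frametitle{Assignment inside function calls: a sketch \adv}
A minimal sketch of why the distinction matters inside a call (the
object names \Robject{m1}, \Robject{m2} and \Robject{v} are
illustrative only): \Rfunction{=} just names an argument, whereas
\Rfunction{<-} also assigns in the calling environment.
<<sketch-assign-in-call>>=
m1 <- mean(x = c(2, 4, 6))   ## '=' matches the formal argument x
v <- c(1, 3)
m2 <- mean(v <- c(2, 4, 6))  ## '<-' overwrites v as a side effect
c(m1, m2)
v
@
\end{frame}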
\section{Vectors}
\begin{frame}[fragile]
\frametitle{Vectors}
Vectors are a fundamental object for \R. Scalars are treated as vectors of length 1.
<<vector>>=
y <- c(10, 20, 40)
y[2]
length(y)
x <- 5
length(x)
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Vectors}
Some operations work element by element, others on the whole vector; compare:
<<vector2>>=
y <- c(20, 49, 16, 60, 100)
min(y)
range(y)
sqrt(y)
log(y)
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Generating vectors}
Many shorthand methods for regular sequences; \Rfunction{c()} for irregular.
<<gen>>=
x <- seq(from=1, to=9, by=2)
y <- seq(from=2, by=7, length=3)
z <- 4:8
a <- seq.int(5) ## fast for integers
b <- c(3, 9, 2)
d <- c(a, 10, b)
e <- rep( c(1,2), 3)
f <- integer(7)
@
\end{frame}
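\begin{frame}[fragile]
\frametitle{Generating vectors: a further sketch}
A few more base generators, shown as a sketch (variable names are
illustrative). Note that \Rfunction{1:n} counts down when \Robject{n}
is zero, which is an easy mistake to make in loops;
\Rfunction{seq\_len()} returns an empty vector instead.
\begin{scriptsize}
<<sketch-generate>>=
n <- 0
1:n                            ## counts down from 1 to 0
seq_len(n)                     ## empty integer vector
v <- c(10, 20, 30)
seq_along(v)                   ## one index per element of v
rep(c(1, 2), each = 3)         ## repeat each element in turn
rep(c(1, 2), times = c(3, 1))  ## per-element repeat counts
@
\end{scriptsize}
\end{frame}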
\begin{frame}[fragile]
\frametitle{Accessing and setting elements}
<<access>>=
x <- seq(from=100, by=1, length=20)
x[3] ## just element 3.
x[c(12,14)] ## elements 12 and 14
x[1:5]
bad <- 1:4
x[-bad] ## exclude elements
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Accessing and setting elements}
Can also provide a logical vector of same length as vector (logical
values explained later).
<<access2>>=
x <- c(5, 2, 9, 4)
v <- c(TRUE, FALSE, FALSE, TRUE)
x[v]
@
\end{frame}
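\begin{frame}[fragile]
\frametitle{Logical indexing: a sketch}
In practice the logical vector is usually computed from a comparison
rather than typed by hand. A short sketch, using an arbitrary
threshold of 4 and an illustrative name \Robject{big}:
\begin{scriptsize}
<<sketch-logical-index>>=
x <- c(5, 2, 9, 4)
big <- x > 4         ## logical vector, same length as x
big
x[big]
x[x > 4 & x < 9]     ## conditions can be combined
which(big)           ## positions of the TRUE values
sum(big)             ## number of TRUE values
@
\end{scriptsize}
\end{frame}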
\begin{frame}[fragile]
\frametitle{Accessing and setting elements}
\note{accessing by logical was seen as a little confusing}
Elements can be set in several ways:
<<set>>=
x <- rep(0,10)
x[1:3] <- 2
x[5:6] <- c(-5, NA)
x[7:10] <- c(1,9) ## recycling.
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Recycling rule \adv}
Recycling is convenient, but dangerous; when vectors are of different
lengths, the shorter one is often recycled to make a vector of the
same length.
<<recycling>>=
a <- c(1,5) + 2
x <- c(1,2); y <- c(5,3,9,2)
x + y
x + c(y,1) ## odd recycling, warning.
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Recycling rule \adv}
<<recyclingex, eval = FALSE>>=
x <- 1:10
y <- x * 2
z <- x^2
y + z
x + 1:2
x + 1:3
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Recycling rule \adv}
<<recyclingex2>>=
x <- 1:10
y <- x * 2
z <- x^2
y + z
x + 1:2
x + 1:3
@
\end{frame}
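\begin{frame}[fragile]
\frametitle{Guarding against recycling: a sketch \adv}
One possible way to avoid accidental recycling is to check lengths
before combining vectors. The helper \Rfunction{add\_strict()} below is
hypothetical (it is not a base function) and only illustrates the idea.
\begin{scriptsize}
<<sketch-recycling-guard>>=
add_strict <- function(a, b) {
    ## allow scalars, otherwise insist on equal lengths
    ok <- length(a) == length(b) ||
        length(a) == 1 || length(b) == 1
    if (!ok) stop("lengths differ: ", length(a),
                  " vs ", length(b))
    a + b
}
add_strict(1:4, 10)
## 1:4 + 1:2 would recycle silently; here we get an error message
r <- tryCatch(add_strict(1:4, 1:2),
              error = function(e) conditionMessage(e))
r
@
\end{scriptsize}
\end{frame}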
%% Note also that R does not discriminate usually between a row vector
%% and a column vector.
\begin{frame}[fragile]
\frametitle{Naming indexes of a vector}
<<naming>>=
joe <- c(24, 1.70)
joe
names(joe)
names(joe) <- c("age", "height") ## replacement function
joe
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Naming indexes of a vector}
<<indexing>>=
joe["height"] == joe[2]
@
Referring to an index by name rather than by position can make code more
readable and flexible. Cannot do things like \Rfunction{x[1:4]}
easily though, since you need to name all four elements you want. \\
Although extremely useful, names have a cost when processing large
objects. \ds
Note: in the second use of \Rfunction{names()} above, we are actually using
the \textit{replacement function} \Rfunction{names<-}, see later.
\end{frame}
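\begin{frame}[fragile]
\frametitle{Naming indexes of a vector: a sketch}
Names can also be given when the vector is created, and several
elements can be selected at once with a character vector of names.
A short sketch, reusing the \Robject{joe} example:
<<sketch-names>>=
joe <- c(age = 24, height = 1.70)   ## names set at creation
joe[c("height", "age")]             ## several elements, by name
"age" %in% names(joe)               ## is there such a name?
unname(joe)                         ## drop the names
@
\end{frame}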
\begin{frame}[fragile]
\frametitle{Common functions for vectors}
\begin{itemize}
\item \Rfunction{length()}
\item \Rfunction{rev()}
\item \Rfunction{sum()}, \Rfunction{cumsum()}, \Rfunction{prod()}, \Rfunction{cumprod()}
\item \Rfunction{mean()}, \Rfunction{sd()}, \Rfunction{var()}, \Rfunction{median()}
\item \Rfunction{min()}, \Rfunction{max()}, \Rfunction{range()}, \Rfunction{summary()}
\item \Rfunction{exp()}, \Rfunction{log()}, \Rfunction{sin()}, \Rfunction{cos()}, \Rfunction{tan()} (radians, not degrees)
\item \Rfunction{round()}, \Rfunction{ceiling()}, \Rfunction{floor()}, \Rfunction{signif()}
\item \Rfunction{sort()}, \Rfunction{order()}, \Rfunction{rank()}
\item \Rfunction{which()}, \Rfunction{which.max()}
\item \Rfunction{any()}, \Rfunction{all()}
\end{itemize}
\end{frame}
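\begin{frame}[fragile]
\frametitle{Common functions for vectors: a sketch}
The difference between \Rfunction{sort()}, \Rfunction{order()} and
\Rfunction{rank()} is easiest to see on a small vector. A sketch with
made-up values:
\begin{scriptsize}
<<sketch-sort-order-rank>>=
x <- c(30, 10, 20)
sort(x)         ## the values, in increasing order
order(x)        ## the indices that would sort x
x[order(x)]     ## same result as sort(x)
rank(x)         ## the rank of each element
which.max(x)    ## position of the largest value
cumsum(x)       ## running total
any(x > 25); all(x > 25)
@
\end{scriptsize}
\end{frame}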
\begin{frame}[fragile]
\frametitle{Functions as function args}
Functions can be called within function calls; the following are equivalent:
<<funinfun>>=
x <- c(3, 2, 9, 4)
y <- exp(x); z1 <- which(y > 20) ## case 1
z2 <- which ( exp(x) > 20) ## case 2
all.equal(z1, z2)
@
\end{frame}
\section{Calling functions}
\begin{frame}[fragile]
\frametitle{Default values for function arguments}
A function will error if not all required arguments are provided.
Some functions have both required and optional arguments. If the
optional arguments are not provided, they are either ignored, or they
take a default value.
\begin{verbatim}
Usage:
round(x, digits = 0)
\end{verbatim}
\end{frame}
\begin{frame}[fragile]
\frametitle{Default values for function arguments}
<<round>>=
x <- c(2.091, 4.126, 7.925)
round() ## required arg is missing
round(x)
round(x, digits = 2)
@
Let's see how this works in more detail.
\end{frame}
\begin{frame}[fragile]
\frametitle{Argument matching}
\R has a flexible method for specifying arguments to functions. We can
either provide an actual value for a formal argument, or give
arguments as \texttt{key=value} (or \texttt{formal=actual}). \\
As an example, let's look at help for seq:
\begin{verbatim}
seq(from = 1, to = 1, by = ((to - from)/(length.out - 1)),
length.out = NULL, along.with = NULL, ...)
\end{verbatim}
(NB: in \Rfunction{seq(from=x)}, \texttt{from} is the \textbf{formal argument} of the
function, and here \Robject{x} is the actual value.) \ds
The \texttt{...} notation allows further arguments to be passed on to
methods; \Rfunction{seq()} itself does not use them.
\end{frame}
\begin{frame}[fragile]
\frametitle{Argument matching}
Typical calls are as follows:
\begin{scriptsize}
<<seqexples>>=
seq(1, 3, 0.5) ## positional matching
seq(1, 5,length.out=3) ## can skip args (e.g. by)
seq(to=5, from=1) ## order of tagged args not important.
seq(f=5,t=1) ## abbrev tags.
seq(len=5, 1,2) ## tags removed before positional matching
@
\end{scriptsize}
\end{frame}
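\begin{frame}[fragile]
\frametitle{Argument matching: a sketch}
The same matching rules apply to functions you write yourself
(covered later). A sketch using a hypothetical function
\Rfunction{scale\_by()}, defined only for this slide, with two optional
arguments that have default values:
\begin{scriptsize}
<<sketch-arg-matching>>=
## hypothetical function, for illustration only
scale_by <- function(values, factor = 2, offset = 0) {
    values * factor + offset
}
scale_by(1:3)                       ## defaults used
scale_by(1:3, 10)                   ## positional matching
scale_by(offset = 1, values = 1:3)  ## tags, in any order
scale_by(1:3, off = 1)              ## abbreviated tag
@
\end{scriptsize}
Abbreviated tags save typing interactively, but full names are
safer in scripts.
\end{frame}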
\begin{frame}[fragile]
\frametitle{ \ldots in function calls \adv}
Why do some functions, like \Rfunction{sqrt}, require only one argument, yet others take many arguments?
Functions like \Rfunction{c}, \Rfunction{cbind}, have \Robject{...} in the arguments:
\begin{verbatim}
Usage:
c(..., recursive=FALSE)
Arguments:
...: objects to be concatenated.
\end{verbatim}
The \Robject{...} indicates that any number of objects may be passed, not just (say) one or two.
The result of \Rfunction{c()} is to combine them all into one long vector;
if \texttt{recursive=TRUE} is given, list arguments are first flattened.

The \texttt{...} can also indicate that other arguments can be provided which
are not processed directly by this function, but may be useful for
other functions (e.g. popular when plotting).
\end{frame}
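\begin{frame}[fragile]
\frametitle{ \ldots in function calls: a sketch \adv}
Both uses can be sketched with two small hypothetical functions
(\Rfunction{centre()} and \Rfunction{n\_args()} are made up for this
slide): the first passes extra arguments on to \Rfunction{mean()}, the
second accepts any number of objects.
\begin{scriptsize}
<<sketch-dots>>=
## hypothetical wrapper: extra arguments go on to mean()
centre <- function(x, ...) {
    x - mean(x, ...)
}
v <- c(1, 2, 3, NA)
centre(v)                 ## the NA propagates
centre(v, na.rm = TRUE)   ## na.rm is passed on through ...
## hypothetical function counting what it was given
n_args <- function(...) length(list(...))
n_args(1, "a", TRUE)
@
\end{scriptsize}
\end{frame}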
\begin{frame}[fragile]
\frametitle{Replacement functions \adv}
<<repl>>=
x <- 1:5
x
length(x)
length(x) <- 2
x
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Replacement functions \adv}
Normally \Rfunction{length(x)} would return a value, rather than you assigning a
value to the function! These are \textbf{replacement functions}, see
help page:
\begin{verbatim}
Usage:
length(x)
length(x) <- value
\end{verbatim}
\end{frame}
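\begin{frame}[fragile]
\frametitle{Replacement functions: a sketch \adv}
You can define your own replacement functions: the name ends in
\Rfunction{<-} and the last argument must be called \Robject{value}.
A sketch with a hypothetical function that sets the second element
(not a base function):
<<sketch-replacement>>=
## hypothetical replacement function, for illustration only
"second<-" <- function(x, value) {
    x[2] <- value
    x
}
y <- c(10, 20, 30)
second(y) <- 99    ## R calls 'second<-'(y, value = 99)
y
@
\end{frame}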
\begin{frame}
\frametitle{Getting help: key commands}
\begin{itemize}
\item \Rfunction{help(hist)} to see help file (or \Rfunction{?hist}).
\item \Rfunction{args(hist)} to see arguments of a function.
\item \Rfunction{example(boxplot)} run examples in help page.
\item \Rfunction{options(help\_type="html")} will then use web-browser for help.
\item \Rfunction{help.search("histogram")}
\item \Rfunction{demo()} to list all demos, e.g. \Rfunction{demo(graphics)}
\end{itemize}
NB: In the \R terminal \Rfunction{?command} works as shorthand
for \Rfunction{help("command")} except
for a small number of commands, e.g. \Rfunction{if}, \Rfunction{while}.
Use the longhand for these.
\end{frame}
\begin{frame}
\frametitle{Help pages}
\begin{itemize}
\item What you can expect to find:
\begin{itemize}
\item Description -- one line summary
\item Usage -- formal arguments
\item Arguments -- interpretation of arguments
\item Details -- what the function does
\item Value -- return value.
\item References -- documentation
\item See also -- helps you find related pages
\item Examples -- guaranteed to run: \Rfunction{example(hist)}
\end{itemize}
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Numbers and special values}
\begin{itemize}
\item \Robject{numeric} (floating-point, double): 12, 4.92, 1.5e3 --
\Rfunction{is.numeric()} (integers converted to f.p.)
\item \Robject{integer}: 1L -- \Rfunction{is.integer()}
\item \Robject{complex}: 3+2i -- \Rfunction{is.complex()}
\end{itemize}
\begin{scriptsize}
<<nbrs>>=
typeof(1)
typeof(1L)
is.integer(1)
is.integer(1L)
@
\end{scriptsize}
\end{frame}
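\begin{frame}[fragile]
\frametitle{Integers and doubles: a sketch}
Whether a number is stored as an integer or as a double depends on how
it was created. A short sketch of some common cases:
\begin{scriptsize}
<<sketch-int-double>>=
typeof(1:3)           ## the colon operator gives integers
typeof(c(1, 2, 3))    ## plain numbers are doubles
is.numeric(1L)        ## integers also count as numeric
typeof(5L / 2L)       ## ordinary division returns a double
5L %/% 2L             ## integer division stays integer
.Machine$integer.max  ## largest representable integer
@
\end{scriptsize}
\end{frame}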
\begin{frame}
\frametitle{Numbers and special values}
Special values:
\begin{itemize}
\item \Robject{NA}: not available. (Often used to represent missing data
point) -- \Rfunction{is.na()}
\item \Robject{NaN}: not a number. e.g. 0/0 -- \Rfunction{is.nan()}
\item \Robject{Inf}, \Robject{-Inf}: $\pm \infty$ --
\Rfunction{is.finite()}
\end{itemize}
You will also meet:
\begin{itemize}
\item \Robject{NULL}: often, list of zero length --
\Rfunction{is.null()}
\end{itemize}
\end{frame}
\begin{frame}[fragile]
\frametitle{Numbers and special values}
<<specials>>=
typeof(NA)
typeof(NaN)
typeof(Inf)
typeof(NULL)
@
\end{frame}
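\begin{frame}[fragile]
\frametitle{Testing for special values: a sketch}
Special values cannot be tested reliably with \Rfunction{==}; use the
\Rfunction{is.*()} functions instead. A sketch on a small made-up
vector:
\begin{scriptsize}
<<sketch-special-values>>=
x <- c(4, NA, 0/0, 1/0)
is.na(x)         ## TRUE for NA and also for NaN
is.nan(x)        ## TRUE only for NaN
is.finite(x)
NA == NA         ## gives NA, so test with is.na() instead
mean(x[1:2])
mean(x[1:2], na.rm = TRUE)
@
\end{scriptsize}
\end{frame}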
\begin{frame}[fragile]
\frametitle{Operator precedence \Rfunction{?Syntax}}
<<opprec, eval = FALSE>>=
3 * 4 + 2 != 3 * (4 + 2)
2^3+1 != 2^(3+1)
1:5-1
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Operator precedence \Rfunction{?Syntax}}
<<opprec2>>=
3 * 4 + 2 != 3 * (4 + 2)
2^3+1 != 2^(3+1)
1:5-1
@
\end{frame}
\begin{frame}[fragile]
\frametitle{Operator precedence \Rfunction{?Syntax}}
Subset taken from \Rfunction{?Syntax}, see that page for full list.
Highest precedence at top.
\begin{verbatim}
'[ [['             indexing
'$ @'              component / slot extraction
'^'                exponentiation (right to left)
'- +'              unary minus and plus
':'                sequence operator
'%any%'            special operators
'* /'              multiply, divide
'+ -'              (binary) add, subtract
'< > <= >= == !='  ordering and comparison
'!'                negation
'& &&'             and
'| ||'             or
'<- <<-'           assignment (right to left)
'?'                help (unary and binary)
\end{verbatim}
Bottom line: when in doubt, use parentheses to make the order of evaluation explicit.
\end{frame}
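\begin{frame}[fragile]
\frametitle{Operator precedence: a sketch}
A few cases where the precedence rules commonly surprise, each paired
with an explicitly parenthesised version (the value of \Robject{n} is
arbitrary):
\begin{scriptsize}
<<sketch-precedence>>=
-2^2            ## exponentiation binds tighter than unary minus
(-2)^2
n <- 4
1:n-1           ## evaluated as (1:n) - 1
1:(n-1)
2^3^2           ## right to left: 2^(3^2)
!TRUE & FALSE   ## negation is applied before &
!(TRUE & FALSE)
@
\end{scriptsize}
\end{frame}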
\begin{frame}[fragile]
\frametitle{Operators}
Most operators will be familiar, but some may not:
<<ops, eval=FALSE, prompt=FALSE, tidy=FALSE>>=
x <- 10
x == 4 ## test for equality
x != 10 ## not equal?
7 %/% 2 ## division, ignoring remainder. (3)
7 %% 2 ## remainder (1)
x <- 9 ## assignment
x <<- 9 ## assign 9 to x in an enclosing (often global) env. (BAD)
## Raising to a power can be done in two ways.
all.equal( 10.1 ** 2.5, 10.1^2.5 )
@
\end{frame}
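\begin{frame}[fragile]
\frametitle{Integer division and remainder: a sketch}
\Rfunction{\%/\%} and \Rfunction{\%\%} fit together: quotient times
divisor plus remainder gives back the original number. A sketch with
arbitrary values, including the less obvious negative case:
\begin{scriptsize}
<<sketch-intdiv>>=
x <- 17; y <- 5
q <- x %/% y     ## quotient
r <- x %% y      ## remainder
c(q, r)
q * y + r == x
-17 %% 5         ## remainder takes the sign of the divisor
-17 %/% 5        ## quotient rounds towards minus infinity
@
\end{scriptsize}
\end{frame}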
\begin{frame}[fragile]
\frametitle{When things go wrong}
Syntax errors are those where you've just made a typing mistake.