\chapter*{Preface}\normalsize
\addcontentsline{toc}{chapter}{Preface}
\pagestyle{plain}
\textbf{OpenIMAJ} is a set of libraries and tools for multimedia analysis.
Its scope is broad: it ranges from state-of-the-art computer vision
(e.g. SIFT descriptors, salient region detection and face detection)
and advanced data clustering through to software that analyses the
content, layout and structure of webpages.

OpenIMAJ is primarily written in pure Java and, as such, is platform
independent. The video capture and hardware libraries do contain some native code,
but Linux (currently x86 and x86\_64; ARM support is coming soon),
OS X and Windows are supported out of the box (under both 32- and 64-bit JVMs).
It is possible to write programs that use the libraries in any JVM language
that supports Java interoperability, such as Groovy, Jython, JRuby or
Scala. OpenIMAJ can even be run on Android phones and tablets.

The OpenIMAJ software is structured into a number of modules. The modules
can be used independently, so if, for instance, you were developing data
clustering software using OpenIMAJ you wouldn't need to acquire the modules related
to images or text. The list on the following page shows the modules
and summarises the functionality of each.

This tutorial aims to get the reader up and running
writing code with OpenIMAJ. Currently the tutorial covers the following areas:
\begin{enumerate}
\item Getting started with OpenIMAJ using Maven
\item Processing your first image
\item Introduction to clustering, segmentation and connected components
\item Processing video
\item Finding faces
\item Global image features
\item SIFT and feature matching
\end{enumerate}
In the future we hope to add more content to the tutorial covering the following:
\begin{itemize}
\item Basic text analysis
\item Image and video indexing using ImageTerrier
\item Compiling OpenIMAJ from source
\item Tracking features in video
\item Audio processing
\item Speech recognition
\item Hardware interfaces
\item Advanced local features
\item Scalable processing with OpenIMAJ/Hadoop
\item Machine learning
\item Building a bibliography of the techniques used in your code
\end{itemize}
\section*{The OpenIMAJ Modules}
% \begin{figure*}[h!]
\renewcommand*\DTstylecomment{\rmfamily{ }{ }}
\newcommand{\descwidth}{8.0cm}
\DTsetlength{0.2em}{0.4em}{0.2em}{0.4pt}{1.6pt}
\dirtree{%
.1 OpenIMAJ\DTcomment{\begin{minipage}[t]{\descwidth}
OpenIMAJ (Open Intelligent Multimedia in Java) is a collection of libraries and tools for multimedia analysis written in the Java programming language. OpenIMAJ intends to be the first truly complete multimedia analysis library and contains modules for analysing images, videos, text, audio and even webpages. The OpenIMAJ image and video analysis and feature extraction modules contain methods for processing visual content and extracting state-of-the-art features, including SIFT. The OpenIMAJ clustering and nearest-neighbour libraries contain efficient, multi-threaded implementations of clustering algorithms including Hierarchical K-Means and Approximate K-Means. The clustering library makes it possible to easily create visual-bag-of-words representations for images and video with very large vocabularies. The text-analysis modules contain implementations of a statistical language classifier and a low-level processing pipeline. A number of modules deal with content creation, including interactive slideshows and animations. The hardware integration modules allow cross-platform integration with devices including webcams, the Microsoft Kinect, and even GPS devices. OpenIMAJ also incorporates a number of tools to enable extremely-large-scale multimedia analysis using a distributed computing approach based on Apache Hadoop.
\end{minipage}}.
.2 archetypes\DTcomment{\begin{minipage}[t]{\descwidth}
Maven archetypes for OpenIMAJ
\end{minipage}}.
.3 openimaj-quickstart-archetype\DTcomment{\begin{minipage}[t]{\descwidth}
Maven quickstart archetype for OpenIMAJ
\end{minipage}}.
.2 core\DTcomment{\begin{minipage}[t]{\descwidth}
Submodule for modules containing functionality used across the OpenIMAJ libraries.
\end{minipage}}.
.3 core\DTcomment{\begin{minipage}[t]{\descwidth}
Core library functionality concerned with general programming problems rather than multimedia-specific functionality. Includes I/O utilities, randomisation, hashing and type conversion.
\end{minipage}}.
.3 core-image\DTcomment{\begin{minipage}[t]{\descwidth}
Core definitions of images, pixels and connected components. Also contains interfaces for processors for these basic types. Includes loading, saving and displaying images.
\end{minipage}}.
.3 core-video\DTcomment{\begin{minipage}[t]{\descwidth}
Core definitions of a video type and functionality for displaying and processing videos.
\end{minipage}}.
.3 core-audio\DTcomment{\begin{minipage}[t]{\descwidth}
Core definitions of audio streams and samples/chunks. Also contains interfaces for processors for these basic types.
\end{minipage}}.
.3 core-math\DTcomment{\begin{minipage}[t]{\descwidth}
Mathematical implementations including geometric, matrix and statistical operators.
\end{minipage}}.
.3 core-feature\DTcomment{\begin{minipage}[t]{\descwidth}
Core notion of features, usually denoted as arrays of data. Definitions of features for all primitive types, features with location and lists of features (both in memory and on disk).
\end{minipage}}.
.3 core-experiment\DTcomment{\begin{minipage}[t]{\descwidth}
Classes to formally describe experiments and evaluations, with support for automatically evaluating their results.
\end{minipage}}.
.3 core-citation\DTcomment{\begin{minipage}[t]{\descwidth}
Tools for annotating code with publication references and automatically generating bibliographies for your code.
\end{minipage}}.
.2 image\DTcomment{\begin{minipage}[t]{\descwidth}
Submodule for image related functionality.
\end{minipage}}.
.3 image-processing\DTcomment{\begin{minipage}[t]{\descwidth}
Implementations of various image, pixel and connected component processors (resizing, convolution, edge detection, ...).
\end{minipage}}.
.3 image-local-features\DTcomment{\begin{minipage}[t]{\descwidth}
Methods for the extraction of local features. Local features are descriptions of regions of images (SIFT, ...) selected by detectors (Difference of Gaussian, Harris, ...).
\end{minipage}}.
.3 image-feature-extraction\DTcomment{\begin{minipage}[t]{\descwidth}
Methods for the extraction of low-level image features, including global image features and pixel/patch classification models.
\end{minipage}}.
.3 faces\DTcomment{\begin{minipage}[t]{\descwidth}
Implementation of a flexible face-recognition pipeline, including pluggable detectors, aligners, feature extractors and recognisers.
\end{minipage}}.
.3 image-annotation\DTcomment{\begin{minipage}[t]{\descwidth}
Methods for describing automatic image annotators
\end{minipage}}.
.2 video\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-modules containing support for analysing and processing video.
\end{minipage}}.
.3 video-processing\DTcomment{\begin{minipage}[t]{\descwidth}
Various video processing algorithms, such as shot-boundary detection.
\end{minipage}}.
.3 video-analysis\DTcomment{\begin{minipage}[t]{\descwidth}
The OpenIMAJ Video Processing Library contains implementations of a variety of video analysis operators.
\end{minipage}}.
.3 xuggle-video\DTcomment{\begin{minipage}[t]{\descwidth}
Plugin to use Xuggler as a video source. Allows most video formats to be read into OpenIMAJ.
\end{minipage}}.
.2 audio\DTcomment{\begin{minipage}[t]{\descwidth}
Submodule for audio processing and analysis related functionality.
\end{minipage}}.
.3 audio-processing\DTcomment{\begin{minipage}[t]{\descwidth}
Implementations of various audio processors (e.g. multichannel conversion, volume change, ...).
\end{minipage}}.
.2 machine-learning\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-module for machine-learning libraries.
\end{minipage}}.
.3 clustering\DTcomment{\begin{minipage}[t]{\descwidth}
Various clustering algorithm implementations for all primitive types including random, random forest, K-Means (Exact, Hierarchical and Approximate), ...
\end{minipage}}.
.3 nearest-neighbour\DTcomment{\begin{minipage}[t]{\descwidth}
Implementations of K-Nearest-Neighbour methods, including approximate methods.
\end{minipage}}.
.3 machine-learning\DTcomment{\begin{minipage}[t]{\descwidth}
The OpenIMAJ Machine Learning Library contains implementations of optimised machine learning techniques that can be applied to OpenIMAJ structures and features.
\end{minipage}}.
.2 text\DTcomment{\begin{minipage}[t]{\descwidth}
Text Analysis functionality for OpenIMAJ.
\end{minipage}}.
.3 nlp\DTcomment{\begin{minipage}[t]{\descwidth}
The OpenIMAJ NLP Library contains a text pre-processing pipeline which goes from raw unstructured text to part-of-speech-tagged, stemmed text.
\end{minipage}}.
.2 thirdparty\DTcomment{\begin{minipage}[t]{\descwidth}
Useful third-party libraries (possibly originally written in other languages) that have been ported to Java and integrated with OpenIMAJ. Not all modules have the same license.
\end{minipage}}.
.3 klt-tracker\DTcomment{\begin{minipage}[t]{\descwidth}
A port of Stan Birchfield's Kanade-Lucas-Tomasi tracker to OpenIMAJ. See http://www.ces.clemson.edu/\textasciitilde{}stb/klt/.
\end{minipage}}.
.3 tld\DTcomment{\begin{minipage}[t]{\descwidth}
A port of Georg Nebehay's tracker (https://github.com/gnebehay/OpenTLD), originally created by Zdenek Kalal (https://github.com/zk00006/OpenTLD).
\end{minipage}}.
.3 ImprovedArgs4J\DTcomment{\begin{minipage}[t]{\descwidth}
\end{minipage}}.
.3 IREval\DTcomment{\begin{minipage}[t]{\descwidth}
A modified version of the IREval module (version 4.12) from the Lemur project, with extensions to better integrate with OpenIMAJ. See http://www.lemurproject.org
\end{minipage}}.
.3 CLMFaceTracker\DTcomment{\begin{minipage}[t]{\descwidth}
\end{minipage}}.
.2 demos\DTcomment{\begin{minipage}[t]{\descwidth}
Demos showing the functionality of OpenIMAJ.
\end{minipage}}.
.3 demos\DTcomment{\begin{minipage}[t]{\descwidth}
Demos showing the use of OpenIMAJ!
\end{minipage}}.
.3 sandbox\DTcomment{\begin{minipage}[t]{\descwidth}
A project for various tests that don't quite constitute demos but might be useful to look at.
\end{minipage}}.
.3 touchtable\DTcomment{\begin{minipage}[t]{\descwidth}
Work on Patrick's touchtable.
\end{minipage}}.
.3 SimpleMosaic\DTcomment{\begin{minipage}[t]{\descwidth}
Demo showing SIFT matching with a Homography model to achieve image mosaicing.
\end{minipage}}.
.3 CampusView\DTcomment{\begin{minipage}[t]{\descwidth}
Demo showing how we used OpenIMAJ to create a Street-View-esque capture system.
\end{minipage}}.
.3 ACMMM-Presentation\DTcomment{\begin{minipage}[t]{\descwidth}
The OpenIMAJ presentation for ACMMM 2011. Unlike a normal presentation, this one isn't PowerPoint, but is actually an OpenIMAJ Demo App!
\end{minipage}}.
.2 test-resources\DTcomment{\begin{minipage}[t]{\descwidth}
Resources for running OpenIMAJ JUnit tests.
\end{minipage}}.
.2 tools\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-modules containing command-line tools exposing OpenIMAJ functionality.
\end{minipage}}.
.3 core-tool\DTcomment{\begin{minipage}[t]{\descwidth}
Core of all tools
\end{minipage}}.
.3 GlobalFeaturesTool\DTcomment{\begin{minipage}[t]{\descwidth}
A tool for extracting various global features from images.
\end{minipage}}.
.3 ClusterQuantiserTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for clustering and quantising features.
\end{minipage}}.
.3 LocalFeaturesTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for extracting local image features.
\end{minipage}}.
.3 FaceTools\DTcomment{\begin{minipage}[t]{\descwidth}
Tools for detecting, extracting and comparing faces within images.
\end{minipage}}.
.3 FeatureVisualisation\DTcomment{\begin{minipage}[t]{\descwidth}
Tools for visualising certain types of image feature.
\end{minipage}}.
.3 CityLandscapeClassifier\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for classifying images as cityscapes (or images containing man-made objects) or landscapes. Based on the edge direction coherence vector.
\end{minipage}}.
.3 WebTools\DTcomment{\begin{minipage}[t]{\descwidth}
Tools and utilities for extracting information from web-pages.
\end{minipage}}.
.3 OCRTools\DTcomment{\begin{minipage}[t]{\descwidth}
Tools for training and testing OCR.
\end{minipage}}.
.3 ImageCollectionTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for extracting images from collections (zip, gallery, video etc.).
\end{minipage}}.
.3 SimilarityMatrixTool\DTcomment{\begin{minipage}[t]{\descwidth}
A tool for performing operations on Similarity Matrices.
\end{minipage}}.
.3 TwitterPreprocessingTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for applying a text preprocessing pipeline to tweets from Twitter.
\end{minipage}}.
.2 hadoop\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-modules for integrating OpenIMAJ with Apache Hadoop to allow Map-Reduce style distributed processing.
\end{minipage}}.
.3 core-hadoop\DTcomment{\begin{minipage}[t]{\descwidth}
Reusable wrappers and helpers to access and create sequence-files and map-reduce jobs.
\end{minipage}}.
.3 tools\DTcomment{\begin{minipage}[t]{\descwidth}
Tools that provide multimedia analysis algorithms expressed as Map-Reduce jobs that can be run on a Hadoop cluster.
\end{minipage}}.
.4 core-hadoop-tool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for clustering and quantising features using Map-Reduce jobs on a Hadoop cluster.
\end{minipage}}.
.4 HadoopFastKMeans\DTcomment{\begin{minipage}[t]{\descwidth}
Distributed feature clustering tool.
\end{minipage}}.
.4 HadoopImageDownload\DTcomment{\begin{minipage}[t]{\descwidth}
Distributed image download tool.
\end{minipage}}.
.4 HadoopLocalFeaturesTool\DTcomment{\begin{minipage}[t]{\descwidth}
Distributed local image feature extraction tool.
\end{minipage}}.
.4 SequenceFileTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for building, inspecting and extracting Hadoop SequenceFiles.
\end{minipage}}.
.4 HadoopGlobalFeaturesTool\DTcomment{\begin{minipage}[t]{\descwidth}
Distributed global image feature extraction tool.
\end{minipage}}.
.4 HadoopClusterQuantiserTool\DTcomment{\begin{minipage}[t]{\descwidth}
Distributed feature quantisation tool.
\end{minipage}}.
.4 HadoopTwitterPreprocessingTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for clustering and quantising features using Map-Reduce jobs on a Hadoop cluster.
\end{minipage}}.
.4 HadoopTwitterTokenTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for clustering and quantising features using Map-Reduce jobs on a Hadoop cluster.
\end{minipage}}.
.4 SequenceFileIndexer\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for building an index of the keys in a Hadoop SequenceFile.
\end{minipage}}.
.4 HadoopEXIFTool\DTcomment{\begin{minipage}[t]{\descwidth}
Tool for extracting EXIF information from images on a Hadoop cluster.
\end{minipage}}.
.2 web\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-modules containing support for analysing and processing web-pages.
\end{minipage}}.
.3 core-web\DTcomment{\begin{minipage}[t]{\descwidth}
Implementation of a programmatic offscreen web browser and utility functions.
\end{minipage}}.
.3 webpage-analysis\DTcomment{\begin{minipage}[t]{\descwidth}
Utilities for analysing the content and visual layout of a web-page.
\end{minipage}}.
.3 readability4j\DTcomment{\begin{minipage}[t]{\descwidth}
Readability4J is a partial re-implementation of the original readability.js script in Java. Many modifications have been made, however.
\end{minipage}}.
.3 twitter\DTcomment{\begin{minipage}[t]{\descwidth}
The twitter project contains tools for reading JSON data from the Twitter API and processing that data.
\end{minipage}}.
.2 hardware\DTcomment{\begin{minipage}[t]{\descwidth}
Sub-modules containing interfaces to hardware devices that we've used in projects built using OpenIMAJ.
\end{minipage}}.
.3 core-video-capture\DTcomment{\begin{minipage}[t]{\descwidth}
Cross-platform video capture interface using a lightweight native interface. Supports 32- and 64-bit JVMs under Linux, OS X and Windows.
\end{minipage}}.
.3 serial-driver\DTcomment{\begin{minipage}[t]{\descwidth}
Interface to hardware devices that connect to serial or USB-serial ports.
\end{minipage}}.
.3 gps\DTcomment{\begin{minipage}[t]{\descwidth}
Interface to GPS devices that support the NMEA protocol.
\end{minipage}}.
.3 compass\DTcomment{\begin{minipage}[t]{\descwidth}
Interface to an OceanServer OS5000 digital compass.
\end{minipage}}.
.3 nmea-parser\DTcomment{\begin{minipage}[t]{\descwidth}
Contains a parser for NMEA sentences written in Groovy.
\end{minipage}}.
.3 kinect\DTcomment{\begin{minipage}[t]{\descwidth}
The OpenIMAJ Core Video Capture Library contains the core classes and native code required to interface with the Kinect device.
\end{minipage}}.
.3 turntable\DTcomment{\begin{minipage}[t]{\descwidth}
Integration with our serially controlled turntable
\end{minipage}}.
.2 content\DTcomment{\begin{minipage}[t]{\descwidth}
Libraries for multimedia content creation.
\end{minipage}}.
.3 slideshow\DTcomment{\begin{minipage}[t]{\descwidth}
A library for creating slideshows and presentations that can contain interactive demos that utilise all OpenIMAJ components.
\end{minipage}}.
.3 animation\DTcomment{\begin{minipage}[t]{\descwidth}
Code to help make an animation of data/models/etc.
\end{minipage}}.
.3 visualisations\DTcomment{\begin{minipage}[t]{\descwidth}
A library that contains classes for visualising various features, such as audio and video.
\end{minipage}}.
.2 ide-integration\DTcomment{\begin{minipage}[t]{\descwidth}
Plugins to aid OpenIMAJ development in various IDEs.
\end{minipage}}.
}
% \end{figure*}
\setlength{\parskip}{1ex plus 0.5ex minus 0.2ex}