- ORCA
- Cite ORCA
- Install, tutorials and documentation
- Methods
- Performance metrics
- Utilities, classes and scripts
- Experiments parallelization with HTCondor
- External software
- Other contributors
- References
ORCA (Ordinal Regression and Classification Algorithms) is a MATLAB framework that includes a wide set of ordinal regression methods, associated with the paper "Ordinal regression methods: survey and experimental study", published in IEEE Transactions on Knowledge and Data Engineering. ORCA provides implementations and integration of ordinal classification algorithms and performance metrics for ordinal regression. In addition, it helps to accelerate the experimental comparison of classifiers with automatic fold execution, experiment parallelisation and performance reports. You can find a basic definition of ordinal regression on Wikipedia.
As a general experimental framework, ORCA has two main objectives:
- To run many experiments as easily as possible, in order to compare many algorithms on many datasets.
- To provide an easy way of including new algorithms into the framework by simply defining the parameters of the algorithms and the training and test methods.
To serve these purposes, ORCA is mainly used through scripts that describe experiments, but the methods can also be used directly through a common API.
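The idea of describing an algorithm by its parameters plus training and test methods, and then running it over many folds, can be pictured with a small sketch. The Python below is only an illustration under assumed names: `MajorityClass`, `run_experiment`, `fit` and `predict` are hypothetical and are not ORCA's MATLAB API.

```python
from collections import Counter


class MajorityClass:
    """Toy stand-in for an algorithm: parameters plus fit/predict methods."""

    def __init__(self, parameters=None):
        # Hypothetical parameter container; this toy method ignores it.
        self.parameters = parameters or {}
        self.label = None

    def fit(self, patterns, targets):
        # Remember the most frequent training label.
        self.label = Counter(targets).most_common(1)[0][0]
        return self

    def predict(self, patterns):
        # Predict the stored label for every test pattern.
        return [self.label for _ in patterns]


def run_experiment(algorithms, folds, metric):
    """Average one metric over all (train, test) folds for each algorithm."""
    results = {}
    for name, make_algorithm in algorithms.items():
        scores = []
        for (train_x, train_y), (test_x, test_y) in folds:
            model = make_algorithm().fit(train_x, train_y)
            scores.append(metric(test_y, model.predict(test_x)))
        results[name] = sum(scores) / len(scores)
    return results


def mean_absolute_error(true, pred):
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


# One fold: three training patterns and two test patterns.
folds = [(([[0], [1], [2]], [1, 1, 2]), ([[3], [4]], [1, 2]))]
print(run_experiment({"majority": MajorityClass}, folds, mean_absolute_error))
```

Because every algorithm exposes the same two methods, adding a new one to the comparison only requires a new entry in the `algorithms` dictionary.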
The initial code of ORCA was released together with the following work. If you use this framework, please cite it:
P.A. Gutiérrez, M. Pérez-Ortiz, J. Sánchez-Monedero, F. Fernández-Navarro and C. Hervás-Martínez (2016),
"Ordinal regression methods: survey and experimental study",
IEEE Transactions on Knowledge and Data Engineering. Vol. 28. Issue 1
Bibtex entry:
@Article{Gutierrez2015,
Title = {Ordinal regression methods: survey and experimental study},
Author = {P.A. Guti\'errez and M. P\'erez-Ortiz and J. S\'anchez-Monedero and F. Fern\'andez-Navarro and C. Herv\'as-Mart\'inez},
Journal = {IEEE Transactions on Knowledge and Data Engineering},
Year = {2016},
Url = {http://dx.doi.org/10.1109/TKDE.2015.2457911},
Volume = {28},
Number = {1}
}
For more information about the paper and the ordinal datasets used, please visit the associated website: http://www.uco.es/grupos/ayrna/orreview
For more information about our research group, please visit the Learning and Artificial Neural Networks (AYRNA) website at the University of Córdoba (Spain).
All the documentation is in the doc folder:
- A quick install guide of ORCA and the associated build troubleshooting.
- A first how-to tutorial to get started with ORCA.
- A specific tutorial for naive approaches and decompositions, covering the different considerations for these kinds of methods.
- A tutorial for threshold models centred on examining the differences of these models.
- A guide to parallelising ORCA experiments.
- A guide to using ORCA with HTCondor.
The Algorithms folder contains the MATLAB classes for the included algorithms and the original code (if applicable). The config-files folder contains configuration files for running all the algorithms. To use these files, you will need the datasets of our review paper.
The algorithms included in ORCA are:
- SVC1V1 [1]: Nominal Support Vector Machine performing the OneVsOne formulation (considered as a naïve approach for ordinal regression since it ignores the order information).
- SVC1VA [1]: Nominal Support Vector Machine with the OneVsAll paradigm (considered as a naïve approach for ordinal regression since it ignores the order information).
- SVR [2]: Standard Support Vector Regression with normalised targets (considered as a naïve approach for ordinal regression since it assumes equal distances between targets).
- CSSVC [1]: Nominal SVM with the OneVsAll decomposition, where absolute costs are included as different weights for the negative class of each decomposition (considered as a naïve approach for ordinal regression since it assumes equal distances between classes).
- SVMOP [3,4]: Binary ordinal decomposition methodology with SVM as the base method. It imposes explicit weights over the patterns and uses a probabilistic framework for prediction.
- ELMOP [5]: Standard Extreme Learning Machine imposing an ordinal structure in the coding scheme representing the target variable.
- POM [6]: Extension of the linear binary Logistic Regression methodology to Ordinal Classification by means of Cumulative Link Functions.
- SVOREX [7]: Ordinal formulation of the SVM paradigm, which computes discriminant parallel hyperplanes for the data and a set of thresholds by imposing explicit constraints in the optimization problem.
- SVORIM [7]: Ordinal formulation of the SVM paradigm, which computes discriminant parallel hyperplanes for the data and a set of thresholds by imposing implicit constraints in the optimization problem.
- SVORLin [7]: Linear version of the SVORIM method (using the linear kernel instead of the Gaussian one), included to check how the kernel trick affects the final performance.
- KDLOR [8]: Reformulation of the well-known Kernel Discriminant Analysis for Ordinal Regression by imposing an order constraint in the projection to compute.
- NNPOM [6,9]: Neural Network based on the Proportional Odds Model (NNPOM), implementing a neural network model for ordinal regression. The model has one hidden layer and one output layer with only one neuron but as many thresholds as the number of classes minus one. The standard POM model is applied in this neuron to obtain probabilistic outputs.
- NNOP [10]: Neural Network with Ordered Partitions (NNOP). This model considers the OrderedPartitions coding scheme for the labels and a decision rule based on the first node whose output is higher than a predefined threshold (T=0.5). The model has one hidden layer and one output layer with as many neurons as the number of classes minus one.
- REDSVM [11]: Augmented Binary Classification framework that solves the Ordinal Regression problem by a single binary model (SVM is applied in this case).
- ORBoost [12]: Ensemble model based on the threshold model structure, where normalised sigmoid functions are used as the base classifier. The weights parameter configures whether the All margins version (weights=true) or the Left-Right margin version (weights=false) is used.
- OPBE [13]: This method implements an ordinal projection based ensemble (OPBE) based on three-class decompositions, following the ordinal structure. A specific method for fusing the probabilities returned by the different three-class classifiers is implemented (product combiner, logit function and equal distribution of the probabilities). The base classifier is SVORIM, but potentially any of the methods in ORCA can be set up as the base classifier.
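Two of the mechanisms named in the list can be illustrated with a short sketch: the cumulative link structure behind POM and NNPOM, and the ordered partitions coding and decision rule described for NNOP. The Python below is not ORCA's code. It assumes the logit link with P(y <= q | x) = sigmoid(b_q - f(x)), and a coding in which output node j equals 1 when the label is at most j; both conventions, and all function names, are assumptions made for the example.

```python
import math


def pom_probabilities(projection, thresholds):
    """Class probabilities of a cumulative logit (proportional odds) model.

    projection: scalar model output f(x) for one pattern.
    thresholds: increasing values b_1 < ... < b_{Q-1}.
    Assumes P(y <= q | x) = sigmoid(b_q - projection).
    """
    cumulative = [1.0 / (1.0 + math.exp(-(b - projection))) for b in thresholds]
    cumulative.append(1.0)  # P(y <= Q | x) is always 1
    probabilities = [cumulative[0]]
    for q in range(1, len(cumulative)):
        # P(y = q) is the difference of consecutive cumulative probabilities.
        probabilities.append(cumulative[q] - cumulative[q - 1])
    return probabilities


def ordered_partitions_code(label, n_classes):
    """Assumed coding with Q-1 outputs: node j (1-based) is 1 when label <= j."""
    return [1 if label <= j else 0 for j in range(1, n_classes)]


def ordered_partitions_decode(outputs, threshold=0.5):
    """Return the index of the first node whose output exceeds the threshold.

    If no node exceeds it, the pattern is assigned to the last class.
    """
    for j, value in enumerate(outputs, start=1):
        if value > threshold:
            return j
    return len(outputs) + 1


# Three classes, two thresholds: the middle class is the most likely here.
print(pom_probabilities(0.0, [-1.0, 1.0]))
# Four classes: label 2 is coded over three output nodes.
print(ordered_partitions_code(2, 4))
print(ordered_partitions_decode([0.1, 0.8, 0.9]))
```

In both cases a model for Q classes needs only Q-1 thresholds or output nodes, which matches the "number of classes minus one" wording used in the NNPOM and NNOP entries.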
The Measures folder contains the MATLAB classes for the metrics used for evaluating the classifiers. The measures included in ORCA are the following (more details about the metrics can be found in [14,15]):
- MAE: Mean Absolute Error between predicted and expected categories, representing classes as integer numbers (1, 2, ...).
- MZE: Mean Zero-one Error or standard classification error (1-accuracy).
- AMAE: Average MAE, considering MAEs individually calculated for each class.
- CCR: Correctly Classified Ratio or percentage of correctly classified patterns.
- GM: Geometric Mean of the sensitivities individually calculated for each class.
- MMAE: Maximum MAE, considering MAEs individually calculated for each class.
- MS: Minimum Sensitivity, representing the ratio of correctly classified patterns for the worst classified class.
- Spearman: Spearman's Rho.
- Tkendall: Kendall's Tau.
- Wkappa: Weighted Kappa statistic, using ordinal weights.
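Several of the measures above follow directly from their definitions, so a small worked example may help. The following Python sketch is independent of ORCA's MATLAB classes; the function names are chosen for the example, labels are assumed to be integers (1, 2, ...), and CCR is returned as a fraction rather than a percentage.

```python
import math


def group_by_true_class(true, pred):
    """Map each true class to the list of labels predicted for its patterns."""
    groups = {}
    for t, p in zip(true, pred):
        groups.setdefault(t, []).append(p)
    return groups


def mae(true, pred):
    return sum(abs(t - p) for t, p in zip(true, pred)) / len(true)


def mze(true, pred):
    return sum(t != p for t, p in zip(true, pred)) / len(true)


def ccr(true, pred):
    return 1.0 - mze(true, pred)


def class_maes(true, pred):
    """MAE computed separately for the patterns of each true class."""
    groups = group_by_true_class(true, pred)
    return [sum(abs(c - p) for p in preds) / len(preds)
            for c, preds in sorted(groups.items())]


def sensitivities(true, pred):
    """Fraction of correctly classified patterns for each true class."""
    groups = group_by_true_class(true, pred)
    return [sum(p == c for p in preds) / len(preds)
            for c, preds in sorted(groups.items())]


def amae(true, pred):
    values = class_maes(true, pred)
    return sum(values) / len(values)


def mmae(true, pred):
    return max(class_maes(true, pred))


def ms(true, pred):
    return min(sensitivities(true, pred))


def gm(true, pred):
    values = sensitivities(true, pred)
    return math.prod(values) ** (1.0 / len(values))


true = [1, 1, 1, 1, 2, 2, 3, 3]
pred = [1, 1, 1, 2, 2, 3, 1, 3]
print(mae(true, pred), mze(true, pred), amae(true, pred), mmae(true, pred))
```

In this toy example MAE is 0.5 while MMAE is 1.0: the smallest class (class 3) is the worst predicted one, which the global average hides. This is the motivation for the per-class measures (AMAE, MMAE, GM and MS) on imbalanced ordinal problems.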
The framework also provides the following utilities, classes and scripts:
- DataSet.m: Class for data preprocessing.
- Experiment.m: Class that runs the different experiments.
- Utilities.m: Class that pre-processes the experiment files, runs the different algorithms and produces the results.
- runtests.m: Script to run all the methods in order to check that the installation is correct.
- runtestssingle.m: Script to run all the methods using the ORCA API. The performance obtained on a toy dataset is compared with reference values in order to check that the installation is correct.
- runtestscv.m: This script runs full experiment tests using the ORCA configuration files to describe experiments.
The condor folder contains the necessary files and steps for using HTCondor with our framework.
The ORCA framework makes use of the following external software implementations. For some of them, a MATLAB interface has been developed through the use of MEX files.
- libsvm-weights-3.12: we have used this framework for Support Vector Machine algorithms. The version considered was 3.12.
- libsvm-rank-2.81: this implementation was used for the method REDSVM. The version considered was 2.81.
- orensemble: this implementation was used for the method ORBoost.
- SVOR: this implementation was used for the methods SVOREX, SVORIM and SVORLin.
Apart from the authors of the paper and the authors of the implementations referenced in the "External software" section, the following people have also contributed to the ORCA framework:
- Juan Martín Jiménez Alcaide developed the MATLAB wrappers for the SVORIM and SVOREX algorithms.
- [1] C.-W. Hsu and C.-J. Lin, “A comparison of methods for multi-class support vector machines,” IEEE Transactions on Neural Networks, vol. 13, no. 2, pp. 415–425, 2002.
- [2] A. Smola and B. Schölkopf, “A tutorial on support vector regression,” Statistics and Computing, vol. 14, no. 3, pp. 199–222, 2004.
- [3] E. Frank and M. Hall, “A simple approach to ordinal classification,” in Proceedings of the 12th European Conference on Machine Learning, ser. ECML ’01. London, UK: Springer-Verlag, 2001, pp. 145–156.
- [4] W. Waegeman and L. Boullart, “An ensemble of weighted support vector machines for ordinal regression,” International Journal of Computer Systems Science and Engineering, vol. 3, no. 1, pp. 47–51, 2009.
- [5] W.-Y. Deng, Q.-H. Zheng, S. Lian, L. Chen, and X. Wang, “Ordinal extreme learning machine,” Neurocomputing, vol. 74, no. 1–3, pp. 447–456, 2010.
- [6] P. McCullagh, “Regression models for ordinal data,” Journal of the Royal Statistical Society. Series B (Methodological), vol. 42, no. 2, pp. 109–142, 1980.
- [7] W. Chu and S. S. Keerthi, “Support Vector Ordinal Regression,” Neural Computation, vol. 19, no. 3, pp. 792–815, 2007.
- [8] B.-Y. Sun, J. Li, D. D. Wu, X.-M. Zhang, and W.-B. Li, “Kernel discriminant learning for ordinal regression,” IEEE Transactions on Knowledge and Data Engineering, vol. 22, no. 6, pp. 906–910, 2010.
- [9] M. J. Mathieson, “Ordinal models for neural networks,” in Proc. 3rd Int. Conf. Neural Netw. Capital Markets, 1996, pp. 523–536.
- [10] J. Cheng, Z. Wang, and G. Pollastri, “A neural network approach to ordinal regression,” in Proc. IEEE Int. Joint Conf. Neural Netw. (IEEE World Congr. Comput. Intell.), 2008, pp. 1279–1284.
- [11] H.-T. Lin and L. Li, “Reduction from cost-sensitive ordinal ranking to weighted binary classification,” Neural Computation, vol. 24, no. 5, pp. 1329–1367, 2012.
- [12] H.-T. Lin and L. Li, “Large-margin thresholded ensembles for ordinal regression: Theory and practice,” in Proc. of the 17th Algorithmic Learning Theory International Conference, ser. Lecture Notes in Artificial Intelligence (LNAI), J. L. Balcazar, P. M. Long, and F. Stephan, Eds., vol. 4264. Springer-Verlag, October 2006, pp. 319–333.
- [13] M. Pérez-Ortiz, P. A. Gutiérrez and C. Hervás-Martínez, “Projection based ensemble learning for ordinal regression,” IEEE Transactions on Cybernetics, vol. 44, May 2014, pp. 681–694.
- [14] M. Cruz-Ramírez, C. Hervás-Martínez, J. Sánchez-Monedero and P. A. Gutiérrez. “Metrics to guide a multi-objective evolutionary algorithm for ordinal classification,” Neurocomputing, Vol. 135, July, 2014, pp. 21-31.
- [15] J. C. Fernandez-Caballero, F. J. Martínez-Estudillo, C. Hervás-Martínez and P. A. Gutiérrez, “Sensitivity Versus Accuracy in Multiclass Problems Using Memetic Pareto Evolutionary Neural Networks,” IEEE Transactions on Neural Networks, vol. 21, 2010, pp. 750–770.