This repository was archived by the owner on Jan 14, 2021. It is now read-only.
Hi Julien, I have committed the changes to optionally generate the vector in the same step and to expose the vector params to the plugin #1
Open
hugopinto wants to merge 5 commits into DigitalPebble:master from
added 3 commits
April 13, 2011 16:43
Now the plugin generates a vector file in libsvm format, ready for training, using 4 additional parameters:
minFreq and maxFreq filter out features whose frequency is below the min or above the max, respectively.
From the remaining ones, we keep the N best attributes (keepNBestAttributes). Before, it was possible to filter either by min/max or by n best, but not both.
Finally, we let the user control whether to compactLexicon or not, so that indices remain contiguous instead of having gaps due to the filtered-out features.
CREOLE now has:
<PARAMETER NAME="minFreq" RUNTIME="true" DEFAULT="1" OPTIONAL="true">java.lang.Integer</PARAMETER>
<PARAMETER NAME="maxFreq" RUNTIME="true" DEFAULT="2147483647" OPTIONAL="true">java.lang.Integer</PARAMETER>
<PARAMETER NAME="keepNBestAttributes" RUNTIME="true" DEFAULT="0" OPTIONAL="true">java.lang.Integer</PARAMETER>
<PARAMETER NAME="compactLexicon" RUNTIME="true" DEFAULT="True" OPTIONAL="true">java.lang.Boolean</PARAMETER>
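For readers who want to see how the four parameters could interact, here is a minimal Java sketch of the three steps described above (frequency filter, then N best, then optional compaction). It is not the plugin's or the TC API's actual code: the `LexiconFilterSketch` class, the `Entry` type and its `score` field are hypothetical, and it assumes that `keepNBestAttributes = 0` means "no limit", that "best" means highest score, and that compacted indices start at 1 as is usual for libsvm files.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; names and ranking criterion are assumptions. */
final class LexiconFilterSketch {

    /** One lexicon entry: label, original index, frequency, and an assumed ranking score. */
    static final class Entry {
        final String label;
        final int index;
        final int freq;
        final double score;

        Entry(String label, int index, int freq, double score) {
            this.label = label;
            this.index = index;
            this.freq = freq;
            this.score = score;
        }
    }

    /** Returns label -> index for the attributes that survive filtering. */
    static Map<String, Integer> filter(List<Entry> lexicon, int minFreq, int maxFreq,
                                       int keepNBestAttributes, boolean compactLexicon) {
        // Step 1: drop attributes whose frequency is outside [minFreq, maxFreq].
        List<Entry> kept = new ArrayList<>();
        for (Entry e : lexicon) {
            if (e.freq >= minFreq && e.freq <= maxFreq) {
                kept.add(e);
            }
        }

        // Step 2: keep the N best of what remains.
        // ASSUMPTION: 0 means "no limit" and "best" means highest score.
        if (keepNBestAttributes > 0 && kept.size() > keepNBestAttributes) {
            kept.sort(Comparator.comparingDouble((Entry e) -> e.score).reversed());
            kept = new ArrayList<>(kept.subList(0, keepNBestAttributes));
        }

        // Go back to original index order so the output is ascending by index.
        kept.sort(Comparator.comparingInt((Entry e) -> e.index));

        // Step 3: either renumber contiguously (ASSUMPTION: starting at 1)
        // or keep the original indices, gaps included.
        Map<String, Integer> result = new LinkedHashMap<>();
        int next = 1;
        for (Entry e : kept) {
            result.put(e.label, compactLexicon ? next++ : e.index);
        }
        return result;
    }

    public static void main(String[] args) {
        List<Entry> lexicon = Arrays.asList(
                new Entry("a", 1, 1, 0.9),
                new Entry("b", 2, 5, 0.2),
                new Entry("c", 3, 7, 0.8),
                new Entry("d", 4, 100, 0.5),
                new Entry("e", 5, 6, 0.6));
        System.out.println(filter(lexicon, 2, 50, 2, true));
        System.out.println(filter(lexicon, 2, 50, 2, false));
    }
}
```

With the CREOLE defaults listed above (minFreq 1, maxFreq 2147483647, keepNBestAttributes 0) and under these assumptions, nothing is filtered and compaction only renumbers the attributes.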
Author
Sorry, I messed up: I just realized that a simple commit does not commit all modified files. Now they are all in git.
Member
Hi Hugo, thanks for sharing this. A few comments below:
The idea behind the generation of the vectors from the PR is to do what most people do first, i.e. make no assumptions as to what works best and try without any filtering. Is there any reason not to use the latest stable version of the TC API (1.5?), or have you changed something on that front?
added 2 commits
April 19, 2011 12:54
…the end. For some reason, the default Lexicon.saveFile does not work. It has a dependency on lib-svm, and lib-svm was not available, so I added it. It seems that liblinear-with-deps is actually missing deps. The CREOLE was modified to account for the lib-svm dependency.