Add implementation
Implement the CTC beam search and update README

Signed-off-by: Yihong Wang <[email protected]>
yhwang committed Feb 14, 2020
1 parent d233e0e commit d6acb7f
Showing 8 changed files with 561 additions and 1 deletion.
5 changes: 5 additions & 0 deletions .npmignore
@@ -0,0 +1,5 @@
node_modules/
.npmignore
src/
.vscode/
package-lock.json
53 changes: 52 additions & 1 deletion README.md
@@ -1,2 +1,53 @@
# beam-search
Implement the Connectionist Temporal Classification(CTC) beam search in JavaScript
Implement the Connectionist Temporal Classification (CTC) beam search in
JavaScript. The input is an array of log probabilities whose length is the
number of CTC slots. Each item in the array is itself an array holding the
log probability of each character, including the blank character. Usually
the blank character is the last one. The implementation doesn't support
n-grams yet, but that is one of the to-dos.
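
As an illustration of the input layout described above, here is a minimal
sketch that turns per-slot probabilities into log probabilities. The helper
`toLogProbs`, the toy two-character alphabet, and the use of the natural
logarithm are assumptions made for this example; they are not part of the
package.

```javascript
// Hypothetical helper (not part of ctc-beam-search): convert per-slot
// probabilities into per-slot log probabilities.
// Assumption: natural log. Note that Math.log(0) is -Infinity.
function toLogProbs(probs) {
  return probs.map((slot) => slot.map((p) => Math.log(p)));
}

// Toy input: 3 CTC slots over a 2-character alphabet plus the blank.
// Assumed column order: ['a', 'b', blank], with the blank last.
const probs = [
  [0.7, 0.2, 0.1],
  [0.1, 0.1, 0.8],
  [0.2, 0.7, 0.1],
];

// data.length is the number of CTC slots; each row has one entry per
// character plus one for the blank.
const data = toLogProbs(probs);
```

Each outer entry is one CTC slot, and every inner array has the same length:
the vocabulary size plus one for the blank.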

## Usage
The following code is used to handle English CTC results:
``` javascript
const { CTCBeamSearch, EN_VOCABULARY } = require('ctc-beam-search');
const bs = new CTCBeamSearch(EN_VOCABULARY);
const data = ....; // log probabilities
const results = bs.search(data, 5); // beam width = 5
// dump the first result to console as a string
console.log(results[0].convertToStr(EN_VOCABULARY));
```
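
For comparison, the simplest way to decode the same kind of input is
best-path (greedy) decoding: take the most likely symbol in every slot,
collapse consecutive repeats, and drop blanks. The sketch below is a baseline
written for illustration only. It is not the beam search this package
implements, and `greedyDecode` and its parameters are hypothetical names.

```javascript
// Hypothetical baseline, NOT the package's beam search: best-path decoding.
function greedyDecode(logProbs, chars, blankIndex) {
  let previous = -1;
  let text = '';
  for (const slot of logProbs) {
    // Find the index of the most likely symbol in this slot.
    let best = 0;
    for (let i = 1; i < slot.length; i++) {
      if (slot[i] > slot[best]) best = i;
    }
    // Emit only when the symbol changes and is not the blank.
    if (best !== previous && best !== blankIndex) {
      text += chars.charAt(best);
    }
    previous = best;
  }
  return text;
}

// Toy alphabet 'ab' with the blank at index 2 (assumed to be last).
const L = Math.log;
const slots = [
  [L(0.8), L(0.1), L(0.1)], // 'a'
  [L(0.7), L(0.2), L(0.1)], // 'a' again, collapsed into the previous one
  [L(0.1), L(0.1), L(0.8)], // blank
  [L(0.6), L(0.3), L(0.1)], // 'a', kept because a blank separates it
  [L(0.1), L(0.8), L(0.1)], // 'b'
];
console.log(greedyDecode(slots, 'ab', 2)); // prints "aab"
```

A beam search keeps several candidate prefixes per slot instead of a single
best symbol, which is why `search` takes a beam width.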

The `EN_VOCABULARY` is like this:
``` javascript
const { Vocabulary } = require('ctc-beam-search');
const engV = new Vocabulary({
  ' ': 0,
  'a': 1,
  'b': 2,
  'c': 3,
  'd': 4,
  'e': 5,
  'f': 6,
  'g': 7,
  'h': 8,
  'i': 9,
  'j': 10,
  'k': 11,
  'l': 12,
  'm': 13,
  'n': 14,
  'o': 15,
  'p': 16,
  'q': 17,
  'r': 18,
  's': 19,
  't': 20,
  'u': 21,
  'v': 22,
  'w': 23,
  'x': 24,
  'y': 25,
  'z': 26,
  '\'': 27,
}, 28);
```
You can create your own Vocabulary.
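
A custom character map can be generated from a string instead of being typed
out by hand, the same way `src/constants.ts` fills `EN_CHAR_MAP`. The sketch
below is a minimal example. `buildCharMap` is a hypothetical helper, and
treating the second `Vocabulary` argument as the blank index is an assumption
based on the value 28 in the English example.

```javascript
// Hypothetical helper (not part of ctc-beam-search): map every character
// in `chars` to its position and reserve the next index for the blank.
function buildCharMap(chars) {
  const charMap = {};
  for (let index = 0; index < chars.length; index++) {
    charMap[chars.charAt(index)] = index;
  }
  return { charMap, blankIndex: chars.length };
}

// Example: a digits-only vocabulary with the blank as the last index.
const { charMap, blankIndex } = buildCharMap('0123456789');

// Assumption: following the EN_VOCABULARY example, these values would be
// passed on as `new Vocabulary(charMap, blankIndex)`.
```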
29 changes: 29 additions & 0 deletions package.json
@@ -0,0 +1,29 @@
{
  "name": "ctc-beam-search",
  "version": "0.1.0",
  "description": "Implement Connectionist Temporal Classification(CTC) beam search in JavaScript",
  "main": "dist/index.js",
  "scripts": {
    "build": "tsc"
  },
  "repository": {
    "type": "git",
    "url": "git+https://github.com/yhwang/beam-search.git"
  },
  "keywords": [
    "CTC",
    "beam",
    "search"
  ],
  "author": "[email protected]",
  "license": "Apache-2.0",
  "bugs": {
    "url": "https://github.com/yhwang/beam-search/issues"
  },
  "homepage": "https://github.com/yhwang/beam-search#readme",
  "devDependencies": {
    "@types/node": "^12.12.17",
    "tslint": "^5.20.1",
    "typescript": "3.5.3"
  }
}
13 changes: 13 additions & 0 deletions src/constants.ts
@@ -0,0 +1,13 @@
// Blank index (-) in the CHAR_MAP
export const EN_BLANK_INDEX = 28;
// Character list for English
export const EN_CHARS = ' abcdefghijklmnopqrstuvwxyz\'';
// Character map for English. string as key and index as value
export const EN_CHAR_MAP: {[key: string]: number} = {};
export const EPSILON = 1e-5;
export const IS_NODE = typeof(window) === 'undefined';

// Initialize the EN_CHAR_MAP
for (let index = 0, len = EN_CHARS.length; index < len; index++) {
  EN_CHAR_MAP[EN_CHARS.charAt(index)] = index;
}