Commit c96cd6b (1 parent: 3a47636), authored by Calum Bell on May 15, 2018.
Showing 1 changed file with 59 additions and 0 deletions.

# GoLang Keyword Density Analyser

Implemented in GoLang, this application uses concurrency to speed up keyword density analysis on large strings of between **50k and 500k** words.

Compared to a basic sequential implementation in C++, this application runs **500x faster** on a dataset of 70k words on an Intel i5 3.3GHz CPU.

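For readers new to the term, keyword density is commonly taken to be the share of all words in a text that a given keyword accounts for. The sketch below shows a plain sequential version of that calculation. The function names `wordFrequencies` and `density` are hypothetical and are not taken from `textprocessing.go`, and splitting on whitespace is a simplifying assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// wordFrequencies counts how often each lower-cased word occurs in text and
// also returns the total word count. Splitting on whitespace only is an
// assumption for this sketch; punctuation handling is left out.
func wordFrequencies(text string) (map[string]int, int) {
	words := strings.Fields(strings.ToLower(text))
	freq := make(map[string]int, len(words))
	for _, w := range words {
		freq[w]++
	}
	return freq, len(words)
}

// density returns the percentage of all words that match keyword.
func density(freq map[string]int, total int, keyword string) float64 {
	if total == 0 {
		return 0
	}
	return 100 * float64(freq[strings.ToLower(keyword)]) / float64(total)
}

func main() {
	freq, total := wordFrequencies("go is fast and go is fun")
	fmt.Printf("density of \"go\": %.2f%%\n", density(freq, total, "go"))
}
```

The concurrent application performs the counting step across many goroutines; the density arithmetic itself stays the same.
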
## About

This application was built in parallel with a CUDA (C++) implementation, also available on my GitHub, which takes advantage of GPU threads. Of the two, this solution ran faster and scaled further.

## Getting Started

**Prerequisites**

1. Install Go by following the instructions at https://golang.org/doc/install

----------

**Get Started:**

1. Clone the repo.
2. Ensure your server has a route that expects `{ keyword: "sample" }` and returns a string.
3. Open `textprocessing.go` in a text editor and change line 381 to your server API address.
4. Open a terminal/CMD and run `go run textprocessing.go` from within the directory.

## Files

- `Test Strings.zip` - Contains sample input files that show the expected data format.
- `results.txt` - Sample output from the application.
- `sampleresponse.txt` - An example of the response format this application expects to receive from the server.
- `textprocessing.go` - GoLang file containing the parseManager and parser goroutines.

## Performance
The performance of this application has been measured against a sequential C++ solution and a shared-memory solution in NVIDIA's CUDA.

![Performance Analysis](https://i.imgur.com/xddRvbN.png)

**Note that for particularly small datasets this solution is slower, due to increased overhead.**

## Data Flow Diagram

The data flow for this program is as follows:

**The diagram can be viewed in stackedit.io**

```mermaid
graph LR
A[parseManager] -- Keyword --> G[server]
G[server] -- Bulk String --> A[parseManager]
A[parseManager] -- Sub-string --> B((parser))
B((parser)) -- Frequency Map --> A[parseManager]
A[parseManager] -- Sub-string --> C((parser))
C((parser)) -- Frequency Map --> A[parseManager]
A[parseManager] -- Sub-string --> D((parser))
D((parser)) -- Frequency Map --> A[parseManager]
A[parseManager] -- Sub-string --> E((parser))
E((parser)) -- Frequency Map --> A[parseManager]
A[parseManager] -- Prints Sorted Map --> F(Screen)
```
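
The fan-out/fan-in pattern in the diagram can be sketched in Go roughly as follows. This is an illustrative sketch, not the code in `textprocessing.go`: it borrows the names `parseManager` and `parser` from the diagram, but the signatures, the chunking by word count, the `workers` parameter and the `sortedKeywords` helper are all assumptions. The server round trip is left out, so the text is passed in directly.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// parser counts word frequencies in one chunk and sends the map back.
// The diagram passes sub-strings; a pre-split word slice is used here
// to keep the sketch short.
func parser(chunk []string, out chan<- map[string]int, wg *sync.WaitGroup) {
	defer wg.Done()
	freq := make(map[string]int)
	for _, w := range chunk {
		freq[w]++
	}
	out <- freq
}

// parseManager splits the text into at most `workers` chunks, starts one
// parser goroutine per chunk, and merges the returned frequency maps.
func parseManager(text string, workers int) map[string]int {
	words := strings.Fields(strings.ToLower(text))
	if workers < 1 {
		workers = 1
	}
	chunkSize := (len(words) + workers - 1) / workers // ceiling division
	out := make(chan map[string]int, workers)         // buffered: sends never block
	var wg sync.WaitGroup
	for start := 0; start < len(words); start += chunkSize {
		end := start + chunkSize
		if end > len(words) {
			end = len(words)
		}
		wg.Add(1)
		go parser(words[start:end], out, &wg)
	}
	wg.Wait()
	close(out)

	merged := make(map[string]int)
	for freq := range out {
		for w, n := range freq {
			merged[w] += n
		}
	}
	return merged
}

// sortedKeywords orders words by descending count, then alphabetically.
func sortedKeywords(freq map[string]int) []string {
	keys := make([]string, 0, len(freq))
	for w := range freq {
		keys = append(keys, w)
	}
	sort.Slice(keys, func(i, j int) bool {
		if freq[keys[i]] != freq[keys[j]] {
			return freq[keys[i]] > freq[keys[j]]
		}
		return keys[i] < keys[j]
	})
	return keys
}

func main() {
	freq := parseManager("go is fast and go is fun and go scales", 4)
	for _, w := range sortedKeywords(freq) {
		fmt.Printf("%-6s %d\n", w, freq[w])
	}
}
```

Each parser builds its own map, so no locking is needed while counting; the only shared step is the merge in the manager once all parsers have finished.
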