Ciic 2016 challenge #23

Merged: 20 commits, May 12, 2016

breznak
Member

@breznak breznak commented May 1, 2016

CinC challenge

https://physionet.org/challenge/2016/

A prestigious challenge/conference with nice data!

🔥 UPDATE: game's still ON! 🎸

Looking for hackers to help me set something up, if it's feasible. Then there will be the whole summer to tune the app.

Blocked by: Add encoders #22

Plan of attack

  • audio
    • for now use wav2vect from Matlab
    • implement wavEncoder - IN PROGRESS Wav encoder #26
    • evaluate if functionality of the WAVEncoder (internal scipy) is the same as matlab's
    • try Cochlea encoder
    • implement sound encoders for nupic.audio Create sound encoders #22
  • training
    • records are Normal/Anomaly/Unknown
    • aggregate all NORMAL records to a 2 column file (reset, PCG)
      • how aggressive should the subsampling be? NuPIC is too slow to process the whole dataset, but we can only go down to 1000 Hz (from 2000 Hz) because of the sampling theorem (Fs >= 2*F)
    • commit the training data files (because the preprocessing takes long)
    • train an HTM model + serialize it
    • try param swarming
  • evaluation
    • load the model, disable learning
    • two tasks in description.py? Or some other way to train/load/evaluate a model on datasets
    • compute average anomaly score for all datapoints of a record
    • implement the anomaly metric in nupic
    • create a model (for nupic?) that does this classification based on avg. anomaly scores?
    • threshold to Normal/Anomaly/Unknown
  • submission
    • modify examples sample2016*
    • nupic is installed, so setup will just source a virtualenv
    • each evaluation in next will call matlab (wav2csv), python (writes anomaly scores to CSV), then matlab again (loads anomalies and decides classification)
    • this is problematic; better to go full-Python if possible!
  • improvements:
    • try bag (multi model) voting
      • model trained on full normal data
      • model on FHS parts
      • model on anomalous data
      • model pretrained on ECG data from other sources! https://github.com/breznak/nupic.biodat
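
The training step above (aggregate all Normal records into a two-column file and halve the sampling rate) could be sketched roughly as follows. This is only an illustration under stated assumptions: the function names are hypothetical, the records are assumed to be already loaded as lists of samples, and averaging adjacent pairs is a crude stand-in for a proper anti-aliasing filter. The output is a plain CSV, not necessarily the exact file layout NuPIC expects.

```python
import csv
import io


def downsample_by_two(samples):
    """Halve the sampling rate by averaging adjacent pairs.

    Assumption: pair averaging is a crude low-pass filter; a real
    pipeline would likely use a proper anti-aliasing filter instead.
    """
    return [(samples[i] + samples[i + 1]) / 2.0
            for i in range(0, len(samples) - 1, 2)]


def aggregate_records(records, out):
    """Write all records to one CSV stream with columns (reset, PCG).

    reset is 1 on the first sample of each record and 0 otherwise,
    so a sequence model can tell where one recording ends.
    """
    writer = csv.writer(out)
    writer.writerow(["reset", "PCG"])
    for samples in records:
        for i, value in enumerate(downsample_by_two(samples)):
            writer.writerow([1 if i == 0 else 0, value])


if __name__ == "__main__":
    # Two tiny made-up "recordings", just to show the layout.
    buf = io.StringIO()
    aggregate_records([[0, 2, 4, 6], [10, 10]], buf)
    print(buf.getvalue())
```

Marking the record boundary with a reset column keeps the whole training set in one file while still letting the model treat each recording as its own sequence.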

Fixes #29

@breznak
Member Author

breznak commented May 1, 2016

@nupic-community/core

@rhyolight
Contributor

@breznak Looks cool, I wish we would have started this a month ago. 😕

@breznak
Member Author

breznak commented May 2, 2016

@rhyolight yep :/ I found out about it just "5 mins to deadline". But the actual competition is at the end of the summer. I have to find out, but it would make sense if we could still participate! 🙏 I'm going to work on this over the summer; actually, I have a working prototype now.

@rhyolight
Contributor

@breznak Cool! Hey @rcrowder do you mind if we merge this?

@breznak breznak force-pushed the ciic_2016_challenge branch from 54b995e to d9b490e Compare May 2, 2016 15:52

"tasks":[
{
"taskLabel": "train",
Member Author

@rhyolight can you help me with this? Basically I need to train a model, disable learning, and run it on evaluation data.
I'd like to do that with the run_opf_experiment.py script. I found an example with the tasks, but it does not work. I'm not sure if it's still supported, or how else to approach this.

Contributor

I'm sorry @breznak but I never use run_opf_experiment.py. I found that interface too confusing. Maybe @scottpurdy can help you.

Member Author

...cool, thanks. If not, do you have an example using the Regions/Network API and calling run() and save() from code?
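
For reference, the train / disable learning / evaluate flow asked about here could look roughly like the sketch below when driven from code instead of run_opf_experiment.py. It is duck-typed and makes assumptions: the model object is assumed to expose run(row) returning a result with an inferences dict containing "anomalyScore", plus disableLearning(), which is how the NuPIC OPF model interface is usually described; check this against the installed NuPIC version. The helper name train_then_score is hypothetical.

```python
def train_then_score(model, train_rows, eval_rows):
    """Train on train_rows, then score eval_rows with learning off.

    Assumption: `model` follows the OPF-style interface, i.e. it has
    run(row) -> result with result.inferences["anomalyScore"], and
    disableLearning(). Any object with those methods will work here.
    """
    # Learning is on by default: every row updates the model.
    for row in train_rows:
        model.run(row)

    # Freeze the model so evaluation data does not change it.
    model.disableLearning()

    # Collect one anomaly score per evaluation row.
    return [model.run(row).inferences["anomalyScore"] for row in eval_rows]
```

With real NuPIC, the model would presumably be created through the OPF model factory and saved or reloaded between the two phases with the model's checkpoint methods; those calls are left out here because their exact names should be verified rather than assumed.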

@rhyolight
Contributor

This is a cool idea, but I don't have any experience using NuPIC with the route you are taking. Let's see if anyone else is interested in helping. @nupic-community/core

@rcrowder
Member

rcrowder commented May 8, 2016

@breznak Looks like a really good start.

@breznak
Member Author

breznak commented May 12, 2016

Working plan to get some validation results ASAP:

  • training data
    • will train only on Normal data and select (FHS) subsequences of it
    • data extracted from Matlab (@breznak will do that)
  • train HTM model
    • on the provided data
    • just one HTM model (with an RDSE encoder? which settings are best? probably no time to swarm)
    • able to serialize the model and load it to run on eval. data (learning off)
      • the OPF approach is not working reliably; can someone post code to do that? (@rhyolight or someone..?)
  • write simple classification function: classify(anScores[])
    • should decide classification from the anomaly scores for the whole sequence/sample
    • can be something like the average: Normal iff avg < 0.4; Unknown iff avg in [0.4, 0.7]; Anomaly iff avg > 0.7; ETA ~10 mins
  • score
    • process validation data (@breznak will commit a file)
    • classify & compute score -> submit! 🙏
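
The classify(anScores[]) step from the plan above might be sketched like this. The 0.4 and 0.7 thresholds come from the plan; the parameter names, the label strings, and the choice to return "Unknown" for an empty score list are assumptions made for this illustration.

```python
def classify(an_scores, low=0.4, high=0.7):
    """Classify one record from its per-sample anomaly scores.

    Uses the average score: below `low` is Normal, above `high` is
    Anomaly, anything in between is Unknown. An empty list is treated
    as Unknown (an assumption, since there is nothing to average).
    """
    if not an_scores:
        return "Unknown"
    avg = sum(an_scores) / float(len(an_scores))
    if avg < low:
        return "Normal"
    if avg > high:
        return "Anomaly"
    return "Unknown"


if __name__ == "__main__":
    print(classify([0.1, 0.2, 0.15]))
    print(classify([0.5, 0.6]))
    print(classify([0.9, 0.8]))
```

Keeping the thresholds as parameters makes it easy to tune them against the validation data later without touching the function body.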

CC @fergalbyrne

@breznak
Member Author

breznak commented May 12, 2016

Merging to make collaboration easier

@breznak breznak merged commit f14f3f1 into htm-community:master May 12, 2016
@breznak breznak deleted the ciic_2016_challenge branch May 12, 2016 17:10
@breznak breznak restored the ciic_2016_challenge branch May 12, 2016 21:17