From 439ffb518f89188754977d5a986124722aa23a17 Mon Sep 17 00:00:00 2001
From: werner kassens
Date: Sun, 20 May 2012 15:44:18 +0200
Subject: [PATCH] changed link to picture in readme so that Serge's github site is independent of mine

---
 README-Math-KDTREE.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README-Math-KDTREE.md b/README-Math-KDTREE.md
index 1d1d671..0f9d5fd 100644
--- a/README-Math-KDTREE.md
+++ b/README-Math-KDTREE.md
@@ -1 +1 @@
A kd-tree is a data structure that stores multidimensional points in such a way that it is easy and fast to *repeatedly* find nearest neighbours.
For further explanation, see [http://en.wikipedia.org/wiki/Kd-tree](http://en.wikipedia.org/wiki/Kd-tree).

To load this package, follow the general SciSmalltalk README and set the repo variable to the local copy of this directory (e.g. something like repo := '/home/user/SciSmalltalk/Math-KDTree'.). Then you can load it with:

```Smalltalk
Gofer new
    repository: (MCFileTreeRepository new directory: (FileDirectory on: repo));
    package: 'Math-KDTree';
    load.
```

You construct a KDTree using #withAll:

```Smalltalk
kd := KDTree withAll: aCollectionOfVectors.
```

where a vector can be any collection of numbers that understands #at:. You can then find the nearest neighbours of anotherVector in aCollectionOfVectors with #nnSearch:i:

```Smalltalk
kd nnSearch: anotherVector i: numberOfNearestNeighbours.
```

Let me build a simple application, just as an example of how it can be used.

```Smalltalk
"let's say we have some simple 1-dimensional data"
sine := (1 to: 50 by: 0.2) collect: [:i | i sin].
"let's blur that a little bit"
dirtysine := sine collect: [:i | i + Float random - 0.5].
"now suppose we want to clean up the mess by partitioning it and then
averaging the most similar partitions. ok, first we calculate the partition"
partitionLength := 30.
partition := (1 to: dirtysine size + 1 - partitionLength)
    collect: [:i | dirtysine copyFrom: i to: i + partitionLength - 1].
"then we make a kd-tree so that it does not take too long to find the most similar parts"
tree := KDTree withAll: partition.
"say we want to average 30 partitions"
numberOfNeighbours := 30.
"then we go through all partitions, find each one's neighbours and calculate their average"
averagedPartition := (1 to: partition size) collect: [:i | | neighbours |
    neighbours := (tree nnSearch: (partition at: i) i: numberOfNeighbours)
        reduce: [:a :b | a + b].
    neighbours / numberOfNeighbours].
"what we have now with 'averagedPartition' is an averaged version of 'partition'.
we can reconstruct a somewhat smoother version of the original mess by undoing
the partitioning, in other words by reversing the calculation of the partition
we did in the first part"
reconstruction := Array new: sine size.
(1 to: sine size) do: [:i | | n sum |
    n := 0. sum := 0.
    1 to: averagedPartition first size do: [:k |
        (i between: k and: averagedPartition size + k - 1)
            ifTrue: [n := n + 1. sum := sum + ((averagedPartition at: i + 1 - k) at: k)]].
    reconstruction at: i put: sum / n].
"ok, let's calculate the errors"
((sine - dirtysine) squared sum / sine size) sqrt.       " 0.2901228130544897"
((sine - reconstruction) squared sum / sine size) sqrt.  " 0.07493816828762763"
"well, it seems the code did its thing"
```

-![plot plot](https://github.com/WernerK/SciSmalltalk/raw/master/Math-KDTree/NNPlot.jpg)
+![plot plot](https://github.com/SergeStinckwich/SciSmalltalk/raw/master/NNPlot.jpg)
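As a quick first experiment, here is a minimal sketch of the same two selectors on a handful of 2-dimensional points. The point values and variable names are only illustrative, and it assumes, as the averaging example above suggests, that #nnSearch:i: answers a collection holding the neighbouring vectors themselves:

```Smalltalk
"a handful of 2-dimensional points; any collection of numbers that understands #at: will do"
points := #( #(1 1) #(2 2) #(3 3) #(10 10) #(11 11) ).
tree := KDTree withAll: points.
"ask for the two stored points nearest to (2.5, 2.4); the result is assumed to be
a collection of the neighbouring vectors, as in the averaging example above"
neighbours := tree nnSearch: #(2.5 2.4) i: 2.
"whatever their order, the two answered points should be #(2 2) and #(3 3)"
```

Because the tree is built once and then queried many times, this is exactly the *repeatedly* part of the introduction where a kd-tree pays off over a plain linear scan.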