Re: Is there a maximum vector length other than a memory limit?
Quote:
Originally Posted by
JohannesMunk
If Q={1,2,3} then sum(Qi)=6. if P={0,0,1} sum(Pi*Qi)=3. I was only talking of the denominator of your formula. There are no mixed (p and q) sum expressions there.
No, thats not correct. The calculation is only performed using pixels in Q and P that whose values are non zero in BOTH P and Q. Imagine you have a for loop before the correlation calculation that modifies Q based on P. Say Q = {1,2,3} and P = {0,0,1} with the following for loop:
Code:
for(ii=0; ii < 3; ii++) {
if(P[ii] == 0)
Q[ii]=0;
elseif(Q[ii] == 0)
P[ii]=0;
then Q = {0,0,3} and sum(Qi) = 3 not 6.
Then the equation is used, but since you don't know P when the training stack is created you cannot precalculate the sums. My point was that time required to calculate the correlation is important not the time to calculate the training set. So if I precalculate the sums when I generate the training set and then subtract the values at Q[ii] when P[ii]==0 when calculating the correlation I would reduce the computation in calculating the correlation although I increased the total number of calculations.
Re: Is there a maximum vector length other than a memory limit?
Mmh.. thats a strange correlation, if you look only at those pixels that are present in both images. You loose most of the 'geometric' non-matching information. But maybe you are well aware of that, and intend to do so.
Going by your definition. If two of your frames were checkerboards, but one of them shifted by just one pixel, your way of calculating the correlation, would result in 0. Whereas there is a maximum anticorrelation in those two frames => R(P,Q) should be -1.
I would expect that the nominator of a correlation-term, contains only those values that match (when both Pi and Qi are nonzero = Pi*Qi) and the denominator (=normalization) contains all possible values... And then you definitely can precalculate the sums for the individual frames and calculate the denominator for each pair very fast. The nominator of course is a different story!
See http://en.wikipedia.org/wiki/Correlation .
To make this clear: Your Pi and Qi should just be the untouched rawdata. Not some Pi*Qi as you suggested. Otherwise this is not really a correlation.
If you do it your way: Have you checked your normalization? If you compare different R(P,Q) is it even comparable the way you do it?
Joh
Re: Is there a maximum vector length other than a memory limit?
I'll try to explain a little bit more on what the image actually is and that should help explain why I do things this way. These images are "spin-images" imagine if you took a 3D surface mesh, spun a finite plane about a randomly selected vertex's normal and record where each other vertex in the mesh intersects the plane. This can then be discreatized into bins and is actually a 2D histogram representing the accumulation of mesh vertices on the plane. When training the 3D mesh is an isolated model of your object of interest, but when you observe a real world scene of that object it is typically not isolated (there is clutter) and other objects would then be included when calculating spin-images for that surface. By only using pixel (or more correctly bins in the histogram) that are nonzero in both the training, Q, and the observed scene, P, you are then only considering the overlap of the images. Even though the modified correlation coefficient is robust to clutter, it can erroneously report high or complete correlation between images with little overlap. However, with more overlapping pixels, there is more confidence in the value of the correlation coefficient obtained. The correlation coefficient can then be used to calculate a similarity measure that combines the correlation coefficient and its variance into a single function. This can be done by performing a change of variables of the correlation coeff with the hyperbolic arctangent, where in which the variance becomes a direct function of the number of non zero pixels used in the calculation.
Normalization is still fine since the W^2 in the equation becomes the number of nonzero bins in both P and Q.
Re: Is there a maximum vector length other than a memory limit?
The spin-images sounds like an interesting concept. But I don't understand your "overlap" reasoning. But you've obviously put a lot of thought into it, so it'll be allright :->
Back to computation:
1) When you calculate a correlation, make sure you iterate through your pixel data only once and gather all required variables in one pass. Memory fetching is always time consuming.
2) Inside your loop, calculate Pi*Qi first. That goes fast if one of them is zero. And you can use the result to decide if you add the pi and qi to their respective sums..
3) You can very easily parallize the computation of the different correlation coefficients => Multithreading. See QtConcurrent.
4) If you need this REALLY fast, consider going for OpenCL (GPU).
5) Make sure you really need doubles. Depending on your platform that could speed up computation a lot if you used singles or even integers. On GPUs double performance is about half of single performance. On CPUs the difference between f64 and f32 will probably be insignificant.
6) Make sure you compile with some high stream enables (SSE2, ... ) optimization (O3 ..) Depending on your code this might resolve in fast (SIMD) SSE code.
Good luck with your project!
Joh
Re: Is there a maximum vector length other than a memory limit?
I'm way ahead of you! I have this all working on the GPU, not OpenCL but NVIDIA's CUDA. I have all the algorithms working in Matlab and I can call the CUDA kernals through mex functions. I also have a rough c++ mex of just the correlation calculation portion of the algorithm. I get about a 22x speedup when going from Matlab to c++ and over 1000x speed-up when going from Matlab to CUDA! For all those implementations I have using floats instead of doubles, which is probably what I'll use here as well.
Thanks for your ideas, you've been a huge help.
One quick question. Would I be better off using QVector instead of QList since I know the desired length of the array and can allocate it on initialization with QVector? Also, lets say I make a class called siStore to store the pixel and any precalculated values. I then have
Code:
QList<siStore> *siStack = new QList<siStore>();
how then do I tell the QList how many elements it has and how do I tell each siStore in the QList how many elements it has? All my programming experience is in Matlab, so classes, pointers, and templates are all new to me!
Re: Is there a maximum vector length other than a memory limit?
That's great!
QList vs QVector will not make much of a difference. See http://doc.qt.nokia.com/4.6/containers.html#container-classes
Code:
#include <QtCore>
#include <QtGui>
struct siFrame {
// default constructor - initialize sums..
siFrame() {sum = 0;sqsum = 0;}
float pi[256];
float sum;
float sqsum;
};
int main(int argc, char *argv[])
{
qDebug() << "Start Vector";
QVector<siFrame* > siStore(714838);
for (int i=0;i<siStore.count();++i)
{
siFrame* frame = new siFrame();
for (int j=0;j<256;++j) {frame->pi[j] = j;frame->sum += j;frame->sqsum += j*j;}
// if you have the data already in memory use: memcpy()
siStore[i] = frame;
}
qDebug() << "Start List";
QList<siFrame* > siStoreL;
for (int i=0;i<714838;++i)
{
siFrame* frame = new siFrame();
for (int j=0;j<256;++j) {frame->pi[j] = j;frame->sum += j;frame->sqsum += j*j;}
// if you have the data already in memory use: memcpy()
siStoreL.append(frame);
}
qDebug() << "End";
// Just a widget
wdg.show();
int result = a.exec();
// Clean up - just loop through all elements and call delete on the pointers.
qDeleteAll(siStore);
siStore.clear();
qDeleteAll(siStoreL);
siStoreL.clear();
//int result = a.exec();
return result;
}
This requires 1.5 GB RAM and works without problems in Win32.
I don't know if saving sum and sqsum for your kind of correlation really makes sense. Your implementation will look a lot uglier, if you substract only some of the values. And you still need to iterate through all pixels.
Joh
Re: Is there a maximum vector length other than a memory limit?
Thanks that works. However, how would I set this up in a class? Both the 256 and 714838 are dynamic. Instead of a struc for siFrame() I made a class with a constructor that allows me to set the size of the array by using QVector in siFrame instead of float[];
I have a class called siTrainer that needs to load a queue of things to train and processes that queue by allocating a siStore list, calculating each siFrame, populating the siStore list, writing the siStore list to harddisk, delete the siStore list and then repeat. So I need the class to be able to initialize and clear the siStore list. I need something like this:
Code:
class siTrainer
{
public:
siTrainer();
~siTrainer();
private:
void initStore();
void deleteStore();
int numPts; //The number of images, 714838 in my example
int siWidth; //The size of each image, 256 in my example
QVector<siFrame *> *siStore;
}
How would I implement initStore()?
QVector<siFrame* > siStore(714838); won't work since it will just create a local variable that cannot
latter be accessed by cleatStore(). When I try
Code:
siStore = new QVector<siFrame *>(numPts)
I get the following error
Quote:
binary '=': no operator found which takes a right-hand operand of type 'siFrame *'
for the following line
Code:
siStore[i] = frame;
Re: Is there a maximum vector length other than a memory limit?
I'm not sure of your trainer logic. But you should get everything out of here. I had already shown you how to insert a frame into a container.
Code:
#include <QtCore>
#include <QtGui>
struct siFrame {
siFrame(int numpixels) {sum = 0;sqsum = 0;pi = new float[numpixels];}
float* pi;
float sum;
float sqsum;
};
class siTrainer {
public:
siTrainer(int numframes,int numpixels) {
for (int i=0;i<numframes;++i)
{
siFrame* frame = new siFrame(numpixels);
// initialize your frame..
for (int j=0;j < numpixels;++j)
{
frame->pi[j] = j;
frame->sum += j;
frame->sqsum += j*j;
}
// store the frame
siStore.append(frame);
}
}
~siTrainer() {
qDeleteAll(siStore);
siStore.clear();
}
private:
QList<siFrame* > siStore;
};
int main(int argc, char *argv[])
{
//for each training ..
{
siTrainer* s = new siTrainer(714838,256);
// do something with your trainer..
// free it up
delete s;
}
return 0;
}
To make this clear:
siFrame* frame = new siFrame(numpixels);
defines and initializes a reference to a frame instance. (BTW you could just change struct to class if you need more functionality. Both are the same thing except that struct members are public by default)
siStore.append(frame);
Saves the reference to the list. When frame goes out of scope (in each iteration) only the pointer variable itself is freed, not the object it points to. So the other reference in the list still points to the valid object.
Joh
Re: Is there a maximum vector length other than a memory limit?
Your siStore is a pointer to QVector so try:
(*siStore)[i] = frame;
If you insist on staying with a Vector. But if you ran my little app, you would see though, that there is no performance difference between QList and QVector. QList is internally an array too.. which resizes according to some growing strategy.. Append is Amort(1).. The memory overhead for an array of pointers that is slightly to big is insignificant. It justs handles better.
Joh
Re: Is there a maximum vector length other than a memory limit?
Thanks again you've been a huge help. I just didn't know if there would be something better than to use .append() since I know the length needed. I understand that QList allocates more memory than it needs to make the .append() operation fast, but I figured that I would be doing so many appends that it would have a performance effect, but I don't see any when I run it.
Re: Is there a maximum vector length other than a memory limit?
OK everything works but the qDeleteAll() doesn't work the way I would expect. In my siTrainer class I have a method run() that gets called to process the training queue and it bassically works like this:
Code:
void siTrainer::run()
{
foreach(item in queue)
{
this->initStore();
this->doWork();
this->deleteStore();
{
}
Where the initStore() an ddeleteStore() is implemented with the QList as you suggest above.
qDeleteAll(siStore) works in that I don't have a memory leak but the counter in siStore never resets so when I perform initStore() the second time in the loop the siFrames get appened to a list with a count of 714838.
Re: Is there a maximum vector length other than a memory limit?
my fault! qDeleteAll just iterates through each item an deletes it. But the now invalid references stay in the container. You have to clear the container!
qDeleteAll(c);
c.clear();
Another question is, if you really need to delete all the frames. Is it a whole new set every time, or do you recreate all of them for each incoming frame?
Joh
Re: Is there a maximum vector length other than a memory limit?
That works perfect!
It is a whole new set everytime. The gui for this allows a user to load a 3D model, set the desire spin-image parameters and then add it to a queue. The user then can load another completely different model set its parameters and append it to the queue. Once the user has added all the models the code goes through the queue and trains each model independently.
Since training takes so long it nice to add a bunch of models and then be able to walk away from the machine!
Re: Is there a maximum vector length other than a memory limit?
Depending on what your trainer class does on top of what I've seen, it could be a better design to just create a new trainer for a completly different problem. But thats just esthetics.
Good luck with your project!
Joh