I am considering whether object detection can be done with a graph neural network.
The basic idea is to represent the correlation among pixels as a graph.
An (M x N)-pixel image can be represented by an (M x N) x (M x N) pixel correlation matrix, just like a complete (fully connected) network topology. Pixels that are close together and both have high intensity (255 for 8-bit monochrome, for example) should be strongly correlated, so the corresponding matrix element should have a higher value. Pixels that are far apart and have low intensity should be weakly correlated, so the corresponding element should have a lower value.
To simplify the problem, let us consider a monochrome (single-channel) image whose pixels take values from 0 to 255. The correlation matrix can then be obtained by the following procedure:
- apply an edge filter or a convolution to the source image
- evaluate the distance between each pair of pixels and the intensities of those pixels
- fill the matrix with the resulting correlation values
A particular object should produce a particular pattern in the matrix, which serves as its feature.
Approximately similar patterns should then correspond to the same object.
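The procedure above can be sketched in a few lines of NumPy. This is only a minimal sketch under my own assumptions: the edge filter is a plain gradient magnitude, and the weight is a hypothetical choice, w[i, j] = I[i] * I[j] * exp(-d(i, j)^2 / (2 * sigma^2)), which is high for nearby bright pixels and low for distant or dark ones. The names `edge_magnitude`, `pixel_correlation_matrix`, and `sigma` are made up for illustration.

```python
import numpy as np

def edge_magnitude(image):
    """Gradient-magnitude edge map rescaled to 0..1 (stand-in for any edge filter)."""
    gy, gx = np.gradient(image.astype(np.float64))
    mag = np.hypot(gx, gy)
    peak = mag.max()
    return mag / peak if peak > 0 else mag

def pixel_correlation_matrix(intensity, sigma=1.5):
    """Dense (M*N) x (M*N) matrix for an M x N array with values in 0..1.

    Assumed weight (not a standard definition):
        w[i, j] = I[i] * I[j] * exp(-d(i, j)^2 / (2 * sigma^2))
    where d is the Euclidean distance between pixel positions.
    """
    m, n = intensity.shape
    rows, cols = np.indices((m, n))
    # One (row, col) coordinate per pixel, in row-major order.
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(np.float64)
    diff = coords[:, None, :] - coords[None, :, :]
    dist_sq = (diff ** 2).sum(axis=-1)
    flat = intensity.ravel().astype(np.float64)
    return np.outer(flat, flat) * np.exp(-dist_sq / (2.0 * sigma ** 2))

# Toy input: a bright square on a dark background, 8-bit monochrome.
image = np.zeros((6, 6), dtype=np.uint8)
image[1:5, 1:5] = 255

edges = edge_magnitude(image)            # step 1: edge filter
corr = pixel_correlation_matrix(edges)   # steps 2 and 3: distance, intensity, fill
```

For this 6 x 6 image the matrix is 36 x 36. The dense form grows with the square of the pixel count, so it is only practical for very small images.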
An ordinary graph network has correlations only between neighboring nodes, but in this case each node affects all other nodes. The computational complexity is therefore very high, so a sparse matrix representation (CSR, for example) is required to skip unnecessary operations.
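To give a feeling for the cost: a 256 x 256 image has 65,536 pixels, so the dense matrix would have about 4.3 billion elements. A minimal sketch of the sparse version follows, assuming that pairs farther apart than a cutoff `radius` are simply treated as zero and using a hypothetical weight w[i, j] = I[i] * I[j] * exp(-d^2 / (2 * sigma^2)). The function name and parameters are my own; only `scipy.sparse.coo_matrix` and `tocsr` come from SciPy.

```python
import numpy as np
from scipy import sparse

def sparse_pixel_correlation(intensity, radius=2, sigma=1.5):
    """CSR matrix that stores only pixel pairs within `radius` of each other.

    Assumed weight (illustrative only):
        w[i, j] = I[i] * I[j] * exp(-d(i, j)^2 / (2 * sigma^2))
    """
    m, n = intensity.shape
    flat = intensity.ravel().astype(np.float64)
    idx = np.arange(m * n).reshape(m, n)   # node index of each pixel
    row_parts, col_parts, val_parts = [], [], []
    for dr in range(-radius, radius + 1):
        for dc in range(-radius, radius + 1):
            d_sq = dr * dr + dc * dc
            if d_sq > radius * radius:
                continue
            # Pixels (r, c) whose partner (r + dr, c + dc) is inside the image.
            r0, r1 = max(0, -dr), min(m, m - dr)
            c0, c1 = max(0, -dc), min(n, n - dc)
            src = idx[r0:r1, c0:c1].ravel()
            dst = idx[r0 + dr:r1 + dr, c0 + dc:c1 + dc].ravel()
            weight = flat[src] * flat[dst] * np.exp(-d_sq / (2.0 * sigma ** 2))
            row_parts.append(src)
            col_parts.append(dst)
            val_parts.append(weight)
    rows = np.concatenate(row_parts)
    cols = np.concatenate(col_parts)
    vals = np.concatenate(val_parts)
    mat = sparse.coo_matrix((vals, (rows, cols)), shape=(m * n, m * n)).tocsr()
    mat.eliminate_zeros()   # drop pairs whose weight is exactly zero
    return mat

rng = np.random.default_rng(0)
intensity = rng.random((8, 8))             # toy image, values in 0..1
mat = sparse_pixel_correlation(intensity)
fill_ratio = mat.nnz / (mat.shape[0] * mat.shape[1])
```

The number of stored elements grows roughly with (number of pixels) x (neighbors inside the radius) instead of (number of pixels) squared. The price is that long-range correlation beyond the radius is lost, so the cutoff is a trade-off and not a free optimization.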
Since this only checks the correlation between pixels, the matrix elements should be independent of the geometry. So this should be robust to rotation, shifting (striding), and partially missing objects, without relying on pooling.
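The rotation part of this claim can be checked on a toy case. Rotating the image reorders the pixels, so the matrix itself changes by a permutation of rows and columns, but quantities that ignore node order, such as the eigenvalues, should stay the same. The sketch below assumes a hypothetical weight w[i, j] = I[i] * I[j] * exp(-d^2 / (2 * sigma^2)) and uses a 90-degree rotation, which maps the pixel grid exactly onto itself. Other angles need interpolation and are not covered here.

```python
import numpy as np

def correlation_matrix(intensity, sigma=1.5):
    """Dense pixel correlation matrix with an assumed distance/intensity weight."""
    m, n = intensity.shape
    rows, cols = np.indices((m, n))
    coords = np.stack([rows.ravel(), cols.ravel()], axis=1).astype(np.float64)
    dist_sq = ((coords[:, None, :] - coords[None, :, :]) ** 2).sum(axis=-1)
    flat = intensity.ravel().astype(np.float64)
    return np.outer(flat, flat) * np.exp(-dist_sq / (2.0 * sigma ** 2))

rng = np.random.default_rng(0)
image = rng.random((5, 7))                       # toy image, values in 0..1

w_original = correlation_matrix(image)
w_rotated = correlation_matrix(np.rot90(image))  # 90-degree rotation

# Element-by-element comparison depends on how the pixels are numbered.
same_entries = np.allclose(w_original, w_rotated)

# The eigenvalue spectrum does not depend on the numbering.
spectrum_original = np.linalg.eigvalsh(w_original)
spectrum_rotated = np.linalg.eigvalsh(w_rotated)
same_spectrum = np.allclose(spectrum_original, spectrum_rotated)
```

Here `same_entries` should come out false and `same_spectrum` true. This suggests that the comparison between patterns, or whatever is learned on top of the matrix, should not depend on node order if the rotation robustness is to hold in practice.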
By obtaining the matrix through learning, it may be possible to detect objects.
This is probably not really a graph network, but I want to explore the possibility of this method.
Any suggestions are welcome.
I am also going to attend the study event on the 19th of this month.