Image Based Modeling and Rendering Difficulties and Theory
There is very little actual information available on this topic. From what I can tell, most researchers are still perplexed by image based modeling. The vast majority of the work being done is limited to very simple surfaces, specifically flat surfaces. I have not been able to find any reference whatsoever to finding the camera location and orientation. For this reason and this reason alone, I am currently refraining from laying out my algorithms publicly.
From my knowledge of mathematics, I have devised a list of what I consider to be all possible types of 3d modeling. There exist 4 degrees of 3d modeling, each of which has several interpretations. All uses of continuity assume the d2 (Euclidean, or Pythagorean theorem) measurement of distance. All sets are assumed to be closed. Admitting non-closed sets would permit certain inconsistencies in objects that never happen in reality. All real objects can be represented by a closed set of some form.
0th degree: This is point based 3d modeling. Objects of this degree consist merely of a finite collection of discrete points. Star field screen savers fall into this category.
1st degree: This is line based 3d modeling. Objects of this degree consist of a finite collection of finite-length continuous paths. A path can be interpreted as a line segment that is not required to be straight. It is, however, required to be continuous. If the path is not continuous, it will either define multiple paths, or be of 0th degree. Wireframe 3d modeling falls into this category.
2nd degree: This is surface based 3d modeling. Objects of this degree consist of a finite collection of finite area continuous surfaces. The vast majority of current 3d modeling is of 2nd degree, but is limited to simple 2-dimensional objects, or polygons. As hardware speed increases, polygon-based 3d modeling is becoming more and more like using 3d surfaces to represent models.
3rd degree: This is solid based 3d modeling. Objects of this degree consist of a finite collection of finite volume continuous solids. The only benefit of this over 2nd degree models is that it allows for taking accurate cross sections of objects without redefining the model. This type of modeling requires a significant amount of information to be stored for the model, as it stores values for every point inside of the object, and so requires knowledge of every such point. Therefore, this is not likely to be heavily used for many decades to come. A CAT scan or an MRI is an example of this type of modeling.
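The four degrees can be illustrated with a minimal Python sketch. The class names, the polyline stand-in for continuous paths, the triangle mesh, and the voxel grid are all assumptions made for the example; they are one possible encoding of each degree, not the only one. The scalar_count helper is a hypothetical, rough proxy for storage cost.

```python
from dataclasses import dataclass
from typing import List, Tuple

Point = Tuple[float, float, float]

@dataclass
class PointModel:            # 0th degree: discrete points
    points: List[Point]

@dataclass
class PathModel:             # 1st degree: paths, approximated here by polylines
    paths: List[List[Point]]

@dataclass
class SurfaceModel:          # 2nd degree: surfaces, approximated by triangles
    vertices: List[Point]
    triangles: List[Tuple[int, int, int]]   # indices into vertices

@dataclass
class SolidModel:            # 3rd degree: one value per cell of a voxel grid
    size: Tuple[int, int, int]
    values: List[float]      # length is nx * ny * nz

def scalar_count(model) -> int:
    """Count the stored numbers in a model, as a rough proxy for memory use."""
    if isinstance(model, PointModel):
        return 3 * len(model.points)
    if isinstance(model, PathModel):
        return 3 * sum(len(path) for path in model.paths)
    if isinstance(model, SurfaceModel):
        return 3 * len(model.vertices) + 3 * len(model.triangles)
    if isinstance(model, SolidModel):
        nx, ny, nz = model.size
        return nx * ny * nz
    raise TypeError("unknown model type")

# A unit square described at each degree that applies to it.
corners = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0), (0.0, 1.0, 0.0)]
as_points = PointModel(corners)
as_wireframe = PathModel([corners + [corners[0]]])          # closed outline
as_surface = SurfaceModel(corners, [(0, 1, 2), (0, 2, 3)])  # two triangles
as_solid = SolidModel((64, 64, 64), [0.0] * 64 ** 3)        # modest voxel grid

for model in (as_points, as_wireframe, as_surface, as_solid):
    print(type(model).__name__, scalar_count(model))
```

Even at a modest 64 cells per side, the voxel grid stores far more values than the other three representations, which is the storage cost described for 3rd degree models.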
Issues with types of models:
This theory has a rather significant impact on image based modeling. As digital images are a 0th degree representation of the objects we wish to model, we will never be able to create perfect 2nd degree representations of the objects. Also, we will never be able to create 3rd degree models from photographs. This latter fact is not a surprising detail, and it is not likely to ever be a problem.
The difficulty in creating 2nd degree representations is as follows: The reality that we live in is a 4th degree model (adding time as the 4th dimension of change to a 3rd degree model). A digital photograph is a 0th degree representation of a 2nd degree model. The reality that we see (ignoring time and quantum physics) is a rather large collection of highly complex 2nd degree objects.
Through algorithms such as multi-dimensional Lagrange Interpolation, we can find what the original object is likely to have been, based on the information given; however, this is a best guess method. The guesswork will be highly evident when objects photographed from a distance are brought near.
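To make the "best guess" point concrete, here is a minimal Python sketch of two-dimensional Lagrange interpolation on a rectangular grid of sample heights. The tensor-product form, the function names, and the test surfaces are assumptions chosen for illustration; this is not a description of any particular modeling algorithm.

```python
def lagrange_basis(nodes, i, x):
    """Value at x of the i-th Lagrange basis polynomial for the given nodes."""
    result = 1.0
    for j, node in enumerate(nodes):
        if j != i:
            result *= (x - node) / (nodes[i] - node)
    return result

def lagrange_2d(xs, ys, heights, x, y):
    """Interpolate heights[i][j], sampled at (xs[i], ys[j]), to the point (x, y)."""
    total = 0.0
    for i in range(len(xs)):
        bx = lagrange_basis(xs, i, x)
        for j in range(len(ys)):
            total += heights[i][j] * bx * lagrange_basis(ys, j, y)
    return total

xs = [0.0, 1.0, 2.0]
ys = [0.0, 1.0, 2.0]

# A smooth surface (x^2 + y) is recovered exactly from 3 x 3 samples.
smooth = [[x * x + y for y in ys] for x in xs]
smooth_guess = lagrange_2d(xs, ys, smooth, 0.5, 1.5)   # true value: 1.75

# A surface with a sharp crease (|x - 1|) is only guessed at between samples.
crease = [[abs(x - 1.0) for y in ys] for x in xs]
crease_guess = lagrange_2d(xs, ys, crease, 0.5, 1.5)   # true value: 0.5

print(smooth_guess, crease_guess)
```

The interpolant passes through every sample in both cases, but between samples it replaces the crease with a smooth parabola and returns roughly 0.25 where the true surface is 0.5. That kind of error is invisible from far away and obvious up close.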
Suppose, for example, that I wanted to create a virtual tree climbing experience, allowing the user to become a squirrel. For this to be accurate, an incredibly significant number of pictures must be taken. The difficulty in this is that the tree moves with the wind. To create the 3d model, a large number of high-resolution photographs must be taken at exactly the same moment, without interfering with each other. There are possible methods to avoid this, but this would involve sacrificing quality for ease of creation.
With these limitations, there are three possible types of models that could be used in Image Based Modeling. A 0th degree approach would at first seem the easiest and most effective. This will create highly accurate 3d models of arbitrary shape. However, this also allows "glitches" to arise: random points floating in space that shouldn't be there. A 2nd degree type 1 approach involves creating polygons to represent all surfaces. A 2nd degree type 2 approach involves using more difficult mathematical theories to create "smooth" surfaces. This will improve the quality of models when viewed closely.
Issues with accuracy:
Another issue arises when using the first method. In creating point based 3d models, there will be a loss of image quality in the 3d models, as each point match results in only a single modeled point. In using 2nd degree modeling, objects are interpreted as surfaces, not as a collection of points. Type 2 models push this abstraction a step further.
The issue is as follows: 0th degree objects are likely to REDUCE the quality of rendered images when compared to the original images. This is not a necessary side effect of image based modeling, though if not treated properly from the beginning, it can be an unavoidable side effect. Models created should not be of lower quality than the original images.
In fact, they should be of HIGHER quality than the highest quality image given. The reasoning for this: A human is able to decipher the shape of an object when viewed with both eyes. When using only one eye, this becomes more difficult, regardless of the eye being used. This is quite evident when receiving eye exams. When viewing with one eye, the chart is difficult to read all the way to the bottom. However, when using both eyes, several more lines become apparent.
Each pixel in the input image is actually an average of the colors available at that specific location. This is true with the cells in the retina as well. When multiple images of a single object are presented, we are given multiple averages of different areas. With this, we can compute a higher resolution representation of the surface than any of the individual images gives. If the images contain significant differences in quality, this cannot be done. The images must be similar resolution representations of the original surface.
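The idea can be sketched in one dimension with a few lines of Python. The sketch rests on strong assumptions that real photographs will not meet exactly: each coarse pixel is the plain average of exactly two fine samples, the second image is shifted by exactly half a coarse pixel, the first fine sample is known (for example, a known background at the border), and there is no noise. The function names are hypothetical.

```python
def box_average(fine, offset):
    """Average adjacent pairs of fine samples, starting at index `offset`."""
    return [(fine[i] + fine[i + 1]) / 2
            for i in range(offset, len(fine) - 1, 2)]

def reconstruct(a, b, first):
    """Recover fine samples from two averaged images shifted by half a pixel.

    a[k] = (s[2k] + s[2k+1]) / 2, b[k] = (s[2k+1] + s[2k+2]) / 2,
    and `first` is the assumed known value s[0].
    """
    fine = [first]
    for k in range(len(a)):
        fine.append(2 * a[k] - fine[-1])        # s[2k+1]
        if k < len(b):
            fine.append(2 * b[k] - fine[-1])    # s[2k+2]
    return fine

surface = [0, 4, 8, 2, 6, 10, 0]          # the "true" fine detail
image_a = box_average(surface, 0)          # 3 coarse pixels
image_b = box_average(surface, 1)          # 3 coarse pixels, shifted
recovered = reconstruct(image_a, image_b, first=surface[0])

print(image_a, image_b, recovered)
```

Neither three-pixel image contains the seven-sample detail on its own, but together, under the assumptions above, they determine it. With noisy pixels the errors in this simple recursion accumulate from left to right, so a practical method would more likely solve for all samples jointly, for example by least squares.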
Issues of parallelization:
Any model creation algorithm should be an embarrassingly parallel algorithm. Therefore, using clusters of computers, it should be possible to create highly accurate models of scenes and even motion sequences in real-time. The effect on entertainment as we know it would be revolutionary. Imagine set-top boxes that allow the viewer to watch a football game from any location desired, including watching the game as the football. This would not require incredibly expensive technological footballs, simply 3d models created in real-time and transmitted to the viewer. As bandwidth increases, this can become a reality, and not just some science fiction scene to add to a futuristic movie.
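What "embarrassingly parallel" means in practice can be shown with a small Python sketch: the work is split into pieces that never need to talk to each other, and each piece is handed to a separate worker. The tiling scheme and the process_tile function are hypothetical stand-ins (here it only computes mean brightness); real model creation would need per-tile work that is truly independent for this to apply.

```python
from concurrent.futures import ThreadPoolExecutor

def split_into_tiles(image, tile_rows):
    """Cut an image (a list of pixel rows) into horizontal strips."""
    return [image[r:r + tile_rows] for r in range(0, len(image), tile_rows)]

def process_tile(tile):
    """Hypothetical per-tile work: here, just the mean brightness of the tile."""
    values = [value for row in tile for value in row]
    return sum(values) / len(values)

def process_image(image, tile_rows=2, workers=4):
    """Process every tile independently; results come back in tile order."""
    tiles = split_into_tiles(image, tile_rows)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, tiles))

image = [
    [1, 1, 1, 1],
    [1, 1, 1, 1],
    [3, 3, 3, 3],
    [3, 3, 3, 3],
]
tile_means = process_image(image)
print(tile_means)
```

Threads keep the sketch short. For CPU-heavy work in Python, separate processes or separate machines in a cluster would be the more likely choice, and because no tile depends on another, the structure of the code would stay the same.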
Do you want your company to be the pioneer in this technological revolution?
**All theories and speculation in this article are the work of the author and the author alone**