Making Stuff Up and Filling in the Blanks
The resolution of modern digital cameras is truly amazing. Even more so when you stop to consider that so much of the image data is guesswork. It may be educated guesses, but it's still guesswork.
If there's a screened window near you right now, get up and look out through it at the world for a moment. Don't worry, I'll wait right here for you to get back. If there isn't one handy, follow along with me anyway and imagine what you'd see. The world outside would be sliced up into squares based on the wire spacing of the screen mesh. If you stand close enough to the screen, the holes would appear large in relation to the size of objects outside that are visible through them. Whether you're looking at a tree or a building, you'd see only a section of it through each screen hole if you get right up near the window. But if you stand far enough away, the individual holes would be small enough that you could all but ignore the screen, had I not called your attention to it in the first place. From this distance, the individual holes would be negligible in size compared to the tree or building, but they'd still be there.
The number of screen holes required to cover that tree or building could be considered a measure of resolution. Imagine that what you see through any given hole were averaged out to a single color, with no other information able to pass through the screen. If only a few holes were required to cover something on the other side, the resulting image built from those color values would have very little discernible detail. You might not even be able to tell what you were looking at, since most anything would be rendered merely as a pattern of large color blocks. Make the holes small enough, though, while imposing the same color-averaging filter on what passes through them, and the resulting image would have enough discrete values to remain recognizable, appearing almost as if the screen weren't there at all.
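The window-screen idea can be sketched in a few lines of code. This is just a toy illustration of the analogy (the function name and the block size are my own invention, not anything from a real camera pipeline): average each "hole" of an image down to one value and see how much detail survives.

```python
import numpy as np

def screen_average(image, hole_size):
    """Average non-overlapping hole_size x hole_size blocks down to one value each,
    like a screen hole passing only a single averaged color."""
    h, w = image.shape
    h -= h % hole_size  # trim edges that don't fill a whole block
    w -= w % hole_size
    blocks = image[:h, :w].reshape(h // hole_size, hole_size,
                                   w // hole_size, hole_size)
    return blocks.mean(axis=(1, 3))

# A 6x6 gradient "scene" viewed through 3x3 holes collapses to just 2x2 values.
scene = np.arange(36, dtype=float).reshape(6, 6)
coarse = screen_average(scene, 3)
print(coarse)  # [[ 7. 10.] [25. 28.]]
```

Shrink `hole_size` relative to the scene and the averaged image starts to look like the original again, which is exactly the fine-mesh case above.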
This isn't too dissimilar from how most of us conceive digital camera resolution with the screen holes standing in for the sensor photosites and the averaged color from each representing the RGB value for each resulting image pixel. But the reality of digital image capture goes far beyond this simple analogy.
To start with, consider the wires that make up the window screen. Regardless of how many screen holes (resolution) may be needed to cover a given subject, the wires themselves aren't entirely inconsequential. If the wires are thin enough it can be easy to consider them such, but if they are thick enough they'll start to block some of the view, and the holes taken collectively won't show you everything on the other side of the window. A color average taken from any screen hole (pixel photosite) can only include what you see through that hole, not whatever may be blocked by the wires that bound that hole. Make the wires thick enough, and the array of color values from the entire window screen (sensor) will be based on only partial information. Some of what is on the other side won't be visible through any hole.
The surface of a digital camera sensor actually has such wires since the whole sensor is electrically powered and the data captured by each photosite has to get from the sensor to your memory card somehow. Millions of photosites need millions of wires to make the whole thing work. On early compact cameras, all those support wires could obscure as much as half of the surface area of the sensor yet the resulting images looked as if none of that extra stuff existed at all. The color from each pixel was extended to fill in the area behind the surrounding wires where nothing got recorded. The camera and software guessed that what it couldn't see probably looked pretty much like what was nearby that it could see.
Larger sensors have an inherent advantage in that, for the same pixel count (resolution), more surface area will be made up of photosites that record information than would be the case with smaller sensors. The same wires will still be needed, but once we subtract these from the total surface area, more would remain to be divided up into the photosites. A greater percentage of the total area would be made up of useful photosites than wires on a larger sensor. Of course, newer sensors have improved a lot across all sensor sizes when compared to earlier sensors. And newer Back Side Illuminated (BSI) sensor designs have moved much of the support wiring behind the substrate rather than in front of it, but not everything can be moved, so there's still some nonvisible area to be filled in and guessed at based on what does get recorded.
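A quick back-of-the-envelope calculation shows why the larger photosite wins. The numbers here are made up for illustration, not measurements from any real sensor; the only assumption is that the wiring claims a fixed-width border around each photosite regardless of photosite size.

```python
def fill_factor(pitch_um, wiring_um):
    """Fraction of each photosite cell that actually gathers light,
    treating the wiring as a fixed-width border around the cell."""
    light_side = pitch_um - wiring_um
    return (light_side / pitch_um) ** 2

# Hypothetical numbers: a small-sensor photosite of 1.5 microns and a
# large-sensor photosite of 6 microns, each losing 0.5 microns to wiring.
small = fill_factor(1.5, 0.5)
large = fill_factor(6.0, 0.5)
print(round(small, 2), round(large, 2))  # 0.44 0.84
```

The same wiring that eats more than half of the small cell barely dents the large one, which is the whole advantage in a nutshell.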
But even more important to the topic at hand is the fact that each photosite doesn't actually record a full range of color but is in fact sensitive to just a single color. A camera never sees cyan, mauve or burnt umber. It sees only pure red, green and blue. Or to be more correct, each photosite sees only red, green or blue, not all three. Each photosite can record more or less intensity of its designated color, but it can't see other colors at all.
The photosites in a typical digital camera sensor are arranged in a rectangular array known as a Bayer matrix. Each photosite has a colored filter over it that blocks everything other than one specific color. One row of a Bayer matrix array will have photosites with filters colored red, green, red, green, and so on. The next row will have green, blue, green, blue and so on. Imagine what your window screen would look like if you added this sort of colored filter to it.
If you add all these up you can see that green has twice the representation that either red or blue does. This may seem odd but makes sense when you consider that green is in the middle of the color spectrum and the human eye is itself twice as sensitive to green light as it is to red or blue. It's all very scientific.
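Here's a small sketch of that layout, assuming the common RGGB arrangement described above (other orderings of the same 2x2 tile exist on real sensors). Counting the entries confirms the two-to-one weighting toward green.

```python
import numpy as np

def bayer_pattern(rows, cols):
    """Build an array of 'R'/'G'/'B' labels showing which color each photosite sees."""
    pattern = np.empty((rows, cols), dtype='<U1')
    pattern[0::2, 0::2] = 'R'  # red on even rows, even columns
    pattern[0::2, 1::2] = 'G'  # green alternates with red on those rows...
    pattern[1::2, 0::2] = 'G'  # ...and with blue on the rows in between
    pattern[1::2, 1::2] = 'B'
    return pattern

p = bayer_pattern(4, 4)
# Green photosites outnumber red and blue two to one.
print((p == 'G').sum(), (p == 'R').sum(), (p == 'B').sum())  # 8 4 4
```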
Given that each photosite sees just a single color, the camera and the software have to use values from adjacent photosites that see the other two colors to make up the missing values and create an RGB color for the final image. As such, fully two thirds of the overall data needed for the resulting RGB image has to be guessed at, or interpolated, based on data that is available. Raw converter software employs sophisticated algorithms to do the best it can to figure out what probably was there, but two out of three colors that make up an RGB color simply weren't recorded for any given pixel and have to be filled in as best they can be.
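The crudest possible version of that fill-in-the-blanks process looks something like this. To be clear, no real raw converter works this simply; this is my own toy sketch of the underlying idea: keep the one recorded color at each pixel and average the two missing colors from nearby photosites that did record them.

```python
import numpy as np

def demosaic_nearest_average(raw, pattern):
    """Toy demosaic: for each pixel, fill each RGB channel by averaging the
    photosites of that color within the 3x3 neighborhood (including itself)."""
    h, w = raw.shape
    rgb = np.zeros((h, w, 3))
    channels = {'R': 0, 'G': 1, 'B': 2}
    for y in range(h):
        for x in range(w):
            for color, c in channels.items():
                samples = [raw[j, i]
                           for j in range(max(0, y - 1), min(h, y + 2))
                           for i in range(max(0, x - 1), min(w, x + 2))
                           if pattern[j, i] == color]
                rgb[y, x, c] = sum(samples) / len(samples)
    return rgb

# A flat gray scene: every photosite reads 100 regardless of its filter color.
pattern = np.tile(np.array([['R', 'G'], ['G', 'B']]), (2, 2))
raw = np.full((4, 4), 100.0)
rgb = demosaic_nearest_average(raw, pattern)
```

With identical readings everywhere, every interpolated pixel comes out a uniform (100, 100, 100), which is the right answer here; real scenes with edges and fine texture are where the guessing gets hard and the sophisticated algorithms earn their keep.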
The final pixel count of an image produced by a given camera likely won't match exactly the photosite count of its sensor, since data from some of the edge photosites contributes only to this interpolation process and doesn't result in actual pixels in its own right. But if the camera just combined groupings of three photosites (one of each filter color) to create pixels, the final resolution would be only a third of the photosite count. Small alignment errors from each color not being recorded at exactly the same point would soften images even more and likely result in color fringing. Instead, the camera and software do their level best to guess what would have been recorded at each point by filling in the blanks for the other two colors.
As I said at the outset, these are indeed educated guesses, but guesses nonetheless. Something to think about. All things considered, it's really cool what the camera and raw converter software can do given how little actual data is on hand to work with.