UltravioletPhotography

Revealing the faded text on an old building ad with ICA



Andy Perrin

I have a longstanding interest in local history, and lately this has crossed over a number of times with my interest in multispectral imaging because of the possibilities of revealing seemingly lost information. These possibilities have been well-investigated by the historical community, so I'm not doing anything new and exciting by their standards here, at least from a technical standpoint. One of the better-known examples is Christina Duffy's multispectral imaging of a burnt Magna Carta.

 

This post covers a commonly used method for combining multiple images from different (sometimes overlapping) spectral bands, which I will use to extract the text of the advertisement below. The method is called Independent Component Analysis (ICA), and it is one approach to the more general problem of Blind Signal Separation (BSS). The main text of the ad is reasonably clear, but there is smaller text that is nearly illegible, and it would be nice to recover it.

post-94-0-68146600-1530418953.jpg

 

Independent Component Analysis originated in the audio community. The original problem it was meant to solve is known as the "cocktail party problem": you are at a party and two people are talking at once, so how do you figure out what each person is saying? Each ear hears something slightly different (because it is facing a different direction, at a different distance from each speaker, etc.), so your brain can untangle the resulting mess somehow, but what if you want a computer to do it? In the computer version, you have two recordings from different mics (representing your ears) and the computer's task is to spit out the two original audio streams. The way it was solved was to imagine that each recording is a weighted sum (a linear combination, in math-speak) of the original sources, but you don't know the weights. The problem then becomes recovering the weights, because once you have them you can undo the mixing. Different ICA methods take different approaches to finding the weights. The method I used is called fastICA.
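To make the cocktail party picture concrete, here is a minimal sketch in Python with NumPy. It is not my actual MATLAB code. The two "speakers" (a sine and a square wave), the mixing weights, and the function name `fastica` are all made up for illustration, and the routine is a bare-bones symmetric fixed-point iteration with a tanh nonlinearity rather than a full fastICA implementation.

```python
import numpy as np

def fastica(x, n_iter=200, seed=0):
    """Bare-bones symmetric FastICA-style iteration (tanh nonlinearity).

    x has shape (n_signals, n_samples); returns estimated sources, same shape.
    """
    rng = np.random.default_rng(seed)
    x = x - x.mean(axis=1, keepdims=True)            # centre each recording
    # Whiten: make the recordings uncorrelated with unit variance.
    eigvals, eigvecs = np.linalg.eigh(np.cov(x))
    z = (eigvecs / np.sqrt(eigvals)).T @ x
    n = x.shape[0]
    w = np.linalg.qr(rng.standard_normal((n, n)))[0]  # random orthogonal start
    for _ in range(n_iter):
        g = np.tanh(w @ z)
        # Fixed-point update: E[z g(w.z)] - E[g'(w.z)] w, one row per component.
        w_new = (g @ z.T) / z.shape[1] - np.diag((1.0 - g**2).mean(axis=1)) @ w
        # Symmetric decorrelation keeps the rows of w orthonormal.
        u, _, vt = np.linalg.svd(w_new)
        w = u @ vt
    return w @ z

# Hypothetical "speakers" and mixing weights, for illustration only.
t = np.linspace(0, 40, 4000)
sources = np.vstack([np.sin(2 * t), np.sign(np.sin(3 * t))])
mixing = np.array([[1.0, 0.5],
                   [0.6, 1.0]])        # unknown to the algorithm
recordings = mixing @ sources          # what the two mics pick up

estimated = fastica(recordings)

# Absolute correlation between each true source and each estimate.
match = np.abs(np.corrcoef(np.vstack([sources, estimated]))[:2, 2:])
```

Each row of `match` should have one entry close to 1, meaning each speaker has been recovered. The order, sign, and scale of the recovered signals are arbitrary.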

 

In the context of image analysis, we imagine that each channel of our multispectral image (not just the R, G, and B, but also additional channels for UV and IR) contains some information about the hidden letters, but differently colored letters might reflect in different parts of the spectrum. The orange text, for example, is not visible in UV. This means that the ICA algorithm (which is just forming cleverly chosen weighted sums and differences of the original channels) should, in principle, be able to subtract off the brick background, making the text easier to read.
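A toy version of the background subtraction idea, sketched in Python with NumPy. Everything here is synthetic: the "brick" is random texture, the "text" is a rectangle, and the weights 1.0, 0.8 and 0.2 are invented so that the arithmetic is easy to follow. In the real problem the weights are unknown, and finding them is ICA's job.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins (assumptions, not real data).
brick = rng.uniform(0.3, 0.7, size=(40, 60))   # background texture
text = np.zeros((40, 60))
text[15:25, 10:50] = 1.0                       # block standing in for letters

# Two hypothetical channels: the text shows faintly in one and not at all
# in the other, while the brick shows in both with different strength.
visible = 1.0 * brick + 0.2 * text
uv = 0.8 * brick + 0.0 * text

# A weighted difference chosen so that the brick cancels exactly.
recovered = visible - (1.0 / 0.8) * uv
```

Here `recovered` equals `0.2 * text`: the brick is gone and only the letters remain.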

 

Could you do this by hand? Technically yes. It is just adding and subtracting channels, after all. But in the current example there are 12 channels coming from 4 images, and determining the correct weights (all 144 of them, since each of the 12 output components is a weighted sum of all 12 input channels) by trial and error would be very laborious indeed.
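For anyone wondering where the 12 channels and 144 weights come from in practice, here is a sketch of the bookkeeping in Python with NumPy. The tiny random arrays stand in for the four photographs, and `stack_channels` is a name I made up for this example.

```python
import numpy as np

def stack_channels(images):
    """Flatten a list of (H, W, 3) images into an (n_channels, H*W) matrix."""
    rows = [img.reshape(-1, 3).T for img in images]   # each is (3, H*W)
    return np.concatenate(rows, axis=0).astype(float)

# Tiny random stand-ins for the UV, visible, IR/vis and IR photos.
height, width = 4, 5
rng = np.random.default_rng(0)
images = [rng.random((height, width, 3)) for _ in range(4)]

x = stack_channels(images)      # 12 rows, one per channel
n_weights = x.shape[0] ** 2     # a 12 x 12 unmixing matrix has 144 entries
```

Each row of `x` is one channel with its pixels laid end to end, so a component image can be turned back into a picture with `row.reshape(height, width)`.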

 

The images used here were taken with the following filters/stacks using the Novoflex Noflexar 3.5/35mm:

UV- 2mm UG11 and 1.5mm S8612

Visible- Hoya UV/IR Cut

IR/vis- Tiffen #12

IR- Hoya R72

 

UV:

post-94-0-47720000-1530420590.jpg

 

Visible (for reference):

post-94-0-68146600-1530418953.jpg

 

IR/vis (Tiffen #12). This has had the Aerochrome treatment described in the other post on my helicopter flight:

post-94-0-83139400-1530420643.jpg

 

IR:

post-94-0-70565600-1530420667.jpg

 

The IR does the best by itself in revealing the smaller text, but it does not make it fully readable. Now we run the ICA. If you give the ICA 12 channels (4 photos x 3 channels/photo) then it will give back 12 "independent components" - images that are statistically independent and therefore should hopefully reveal unique information. Here is what you actually get back (with some contrast adjustment):
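ICA components come back with an arbitrary scale, so some contrast adjustment is needed before they can be viewed. I am not reproducing my exact adjustment here. The following is one simple option, a percentile stretch, sketched in Python with NumPy. The function name `stretch` and the 1% and 99% cutoffs are assumptions chosen for illustration.

```python
import numpy as np

def stretch(component, low=1.0, high=99.0):
    """Map a component to [0, 1], clipping the darkest and brightest tails."""
    lo, hi = np.percentile(component, [low, high])
    if hi <= lo:                          # flat image: nothing to stretch
        return np.zeros_like(component, dtype=float)
    return np.clip((component - lo) / (hi - lo), 0.0, 1.0)

# Random stand-in for one ICA component image.
rng = np.random.default_rng(0)
component = 5.0 * rng.standard_normal((50, 50)) - 3.0
display = stretch(component)
```

Clipping the tails stops a handful of extreme pixels from squashing the rest of the image into a narrow gray range.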

post-94-0-87202800-1530421435.jpg

 

The ICA result is not 100% unique. The components will come out randomly inverted (because statistically, changing the sign from + to - does not affect whether a channel is correlated with any of the others), so it is permissible to invert them back to normal. What we see above is that (as predicted) the ICA managed to wipe the text off the wall altogether in some cases, and in others bits of the text remain. The images fall into 4 groups: (1) images showing the main text ("Royal Crown Cola"/"Mansfield Market"), (2) images of plain brick wall, (3) images with bits of the smaller orange text that we are interested in, and (4) one entirely blank image (noise). That last one arises because the Tiffen and the Hoya R72 images both contain the same infrared info, so the ICA subtracts IR - IR and gets noise.
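The sign ambiguity is easy to check numerically. In this Python/NumPy sketch the two signals are random stand-ins, and `invert_for_display` is a hypothetical helper for flipping a component that has already been scaled to the range 0 to 1.

```python
import numpy as np

rng = np.random.default_rng(0)
a = rng.standard_normal(1000)
b = 0.5 * a + rng.standard_normal(1000)   # partly correlated with a

corr = np.corrcoef(a, b)[0, 1]
corr_flipped = np.corrcoef(-a, b)[0, 1]
# Flipping a signal flips the sign of its correlation but not the size,
# so an uncorrelated pair stays uncorrelated. The statistics cannot tell
# a component from its negative.

def invert_for_display(image01):
    """Invert an image already scaled to [0, 1]."""
    return 1.0 - image01

flipped_back = invert_for_display(invert_for_display(np.array([0.0, 0.25, 1.0])))
```

Since the algorithm has no way to prefer one sign over the other, choosing which components to invert is a judgment call made by eye.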

 

I took an average of each of the first three groups and then put the results in the channels of an Lab file. This was the final result:
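A rough sketch of that last step in Python with NumPy, with loud caveats: the group indices below are invented, and which group goes into which Lab channel, as well as the scaling ranges, are assumptions made for this example rather than a record of what I did.

```python
import numpy as np

def group_mean(components, indices):
    """Average the component images at the given indices."""
    return np.mean([components[i] for i in indices], axis=0)

def to_unit(image):
    """Rescale an image linearly to [0, 1]."""
    lo, hi = image.min(), image.max()
    if hi <= lo:
        return np.zeros_like(image, dtype=float)
    return (image - lo) / (hi - lo)

def groups_to_lab(main_text, brick, small_text):
    """Pack three group averages into an (H, W, 3) Lab array.

    Assumed mapping: main text -> L (0..100), brick -> a, small text -> b
    (both -100..100).
    """
    lightness = 100.0 * to_unit(main_text)
    a_chan = 200.0 * to_unit(brick) - 100.0
    b_chan = 200.0 * to_unit(small_text) - 100.0
    return np.stack([lightness, a_chan, b_chan], axis=-1)

# Random stand-ins for 12 component images; the groupings are hypothetical.
rng = np.random.default_rng(0)
components = rng.standard_normal((12, 6, 8))
lab = groups_to_lab(
    group_mean(components, [0, 1, 2]),
    group_mean(components, [3, 4, 5, 6]),
    group_mean(components, [7, 8, 9, 10]),
)
```

An array like `lab` could then be converted to RGB for viewing with a color library, for example `skimage.color.lab2rgb` from scikit-image, if that is installed.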

post-94-0-91708300-1530422029.jpg

 

The smaller text is now readable (barely)! The ads read,

MANSFIELD

MARKET

FRESH KILLED POULTRY

MEAT GROCERY FRUITS VEGETABLE

(something, probably FRESH) EVERYDAY

 

and

 

Drink

ROYAL CROWN

COLA

(unreadable)

 

References

I learned a lot from this review article, and if you are interested in writing your own ICA routine, I recommend it highly, especially for its comments on the pros and cons of different methods:

Choi, S., Cichocki, A., Park, H.M. and Lee, S.Y., 2004. Blind Source Separation and Independent Component Analysis: A Review.

 

The Wikipedia article on fastICA has a nice overview of that particular method.

https://en.wikipedia.org/wiki/FastICA

 

A fairly advanced book on the topic (read the review above before diving into this):

Cichocki, A. and Amari, S.I., 2002. Adaptive blind signal and image processing: learning algorithms and applications (Vol. 1). John Wiley & Sons.

 

Christina Duffy's piece on the burned Magna Carta is fun reading.

https://www.bl.uk/ma...rnt-magna-carta


Andy, you are amazing !!!!! This is utterly fascinating. I am going to read the Wiki reference.

 

OK, now on a practical note. Go immediately and print up your business cards for Perrin's Independent Uvvisir Forensic Analysis. I don't know what you might make as an engineering tutor, but somehow there is a business lurking behind what you do combining alternate wavelength photography and Matlab programming. Art, historical documents, insurance claims, old buildings -- there are so many fields where things need to be investigated photographically, yes?

Were I just a few years younger meself I'd give this a try! It would be just as much work as the telecom engineering I did at the Labs, but perhaps the point is about who is in control of one's life. Sorry, I'm rambling on here...

Andy Perrin

Andrea, a more accessible (less mathy) intro is here:

https://en.wikipedia...ponent_analysis

and

https://en.wikipedia.org/wiki/Blind_signal_separation

 

Note that "less mathy" does not mean "no math at all." It comes with the territory here.

--

 

OK, now on a practical note. Go immediately and print up your business cards for Perrin's Independent Uvvisir Forensic Analysis. I don't know what you might make as an engineering tutor, but somehow there is a business lurking behind what you do combining alternate wavelength photography and Matlab programming. Art, historical documents, insurance claims, old buildings -- there are so many fields where things need to be investigated photographically, yes?

Would that I could! I have no idea how I would get business, Andrea. Who would I even talk to? I have no credentials in this area.

  • 3 years later...

Nice work! Have you tried dstretch imagej?

I didn't bother, since my algorithm above is closely related to dstretch (which is just PCA!). I work exclusively in MATLAB and have never liked ImageJ very much. I have my own PCA implementation in MATLAB, and also AMUSE and color deconvolution.

