Hi
I have developed and algorithm for chroma subsampling that
keeps both constant
luminance and constant chrominance. It's called Twibright
Luminaplex. I am
sending a description for two purposes. Maybe Gstreamer will
find it useful and
will be able to implement it (there is a reference
implementation under GPL). I
am also opposed to the idea of software patents and want to
publish prior art
on an independent mailing list, which is a recommended
practice to ensure that
if someone tries to patent the idea, the patent will fail on
the first court
trial.
I have also developed another algorithm that produces
slightly worse result,
but is faster and the execution time doesn't depend on the
content.
The following description has been published at
http://ronja
.twibright.com/hyperluma.php and on Freshmeat under the
name
Luminaplex (still a new pending entry). Who wants to testify
against a possible
software patent save this e-mail together with the
aforementioned HTML
somewhere with your received e-mail, thanks.
Regards,
-- Karel Kulhavy
<p>I am publishing the algorithm description at the
moment of release on an
independent server (Freshmeat) to have a reasonable proof of
prior art and
resulting patent invalidity in case someone tries to patent
my idea.
<p><b>Hyperluma and Luminaplex - algorithms for
improved video
encoding</b></p>
<p>Twibright Labs have developed a family of
algorithms to
encode ordinary 4:2:0 and 4:2:2 Y'CbCr video with
unprecedented colour detail
quality. There are three algorithms - Hyperluma 1, Hyperluma
2 and Luminaplex,
in order of increasing picture quality.
The algorithms solve the so called <a href=
"http://en.wikipedia.org/wiki/Chroma_subsampling&q
uot;>chroma bleed</a> problem
which is a result of the mathematically incorrect idea of
chroma subsampling in
analog and digital video broadcast, computer video encoding
and similar
applications. The chroma bleed is best visible as dark
contours along edges of
bright colours in the picture. It can be seen on the edge
between green and
magenta and between red and blue on a TV testcard.
However it exists for any kind
of colour detail, and contributes to overall corruption of
the image
details.
<p>The corruption - the black bar between green and
magenta and between red and blue
- is very visible on TV testcards.</p>
<p>5 vertical strips of the screenshot show the
quality of
individual encoding agorithms compared with the original.
Note how the edge
corruption decreases from left to right. Use a CRT to view
this image properly,
LCDs usually display false artifacts on the eges. Nearest
neighbour chroma
interpolation is assumed in the decoder. </p>
<p><b>Twibright Hyperluma 1</b></p>
<p>Hyperluma 1 internally simulates the receiver
(decoder) of the signal
and then adjusts the luma channel until the effect is
canceled. The result
- these artifacts, which could confuse the viewer in
resolving details,
are gone.</p>
<p>During chroma subsampling we are presented with 3
pixels, each containing R, G,
and B values. We are supposed to produce four Y' (luma)
values, one Cb
(blue chroma) and one Cr (red chroma value). The original
ancient idea
of the TV system was that the
luma values would describe perceived brightness (luminance),
which is perceived
by human eye with twice as sharp resolution than the colour
information.
However, as the luma and chroma are calculated from
gamma-corrected (nonlinear)
R' G' B' values, luma doesn't represent just luminance
(perceived brightness),
but is also influenced by the chroma values. To reconstruct
luminance properly,
both luma and chroma values are necessary - luminance
depends on both.</p>
<p>When the chroma values are subsampled and therefore
changed, the luminance
then suffers too, because depends partially also on the
chroma. The first step
in ordinary encoding system is to convert R' G' B' values
into Y' Cb Cr, the
second one is to subsample the Cb Cr. Hyperluma instead
converts R' G' B' into
gL Cb Cr, where gL is gamma-transformed luminance. gL
represents accurately the
brightness as perceived by the viewer. In the second step,
the Cb and Cr are
subsampled as usually. Hyperluma then adds third step, where
gL Cb Cr are used
to calculate Y' - a luma that would produce with the given
Cb Cr the brightness
perception that was in the picture at the given pixel
position.</p>
<p><b>The core of Hyperluma - the Hyperluma
Table</b></p>
<p>The core to the Hyperluma calculation is to solve
the following mathematical
problem: given Cb, Cr, and gL, what is the Y', that with Cb
and Cr produces
the given gL?</p>
<p>Because the function transforming Y'CbCr into gL
(gamma-transformed
luminance) cannot be easily mathematically reversed, the
software uses binary
division to search for the solution. As this is 8 times
slower than the forward
transformation and would slow down the video encoding, it
precalculates the
reverse transformation into an <a href="http://ronja.twibright.com/utils/hyperluma_ta
ble">universal
lookup table 12.96 megabytes big</a>. This table can
be compressed down to
just 29kB with bzip and then easily distributed with the
program. When the
program starts, it loads the table into the RAM, allowing
fast video processing
without a need to wait on program startup until the lookup
table is
precalculated.</p>
<p>After I developed the algorithm, published it and
consulted with specialists,
it came out that the algorithm is patented. That was an
impulse to develop
Hyperluma 2 and ultimately Luminaplex, which improve the
quality even more
and don't seem to be covered by any patent.</p>
<p>The problem of Hyperluma 1 is that it removes the
problem of nonconstant
luminance by providing a constant luminance, however the
chrominance stays
the same. By the process of chroma filtering and
nonlinearities in chroma
definition, the chrominance in the picture suffers. This is
partially solved
by Hyperluma 2.</p>
<p><b>Twibright Hyperluma 2</b></p>
This produces 4:2:0 Y'CbCr encoded with Hyperluma 2:
<br><br>
<br> * Input: 3x 16 bits RGB
<br> * Calculate a subsampled RGB,
in both dimensions 1:2 subsampling
<br> * Calculate subsampled R'G'B'
from the subsampled RGB by applying the
gamma function.
<br> * Calculate Cb and Cr from the
R'G'B'
<br> * Calculate 16-bit luminance Y
for each from 4 pixels from the original
high-resolution RGB
<br> * Calculate Yc, a companded
luminance by applying gamma function on the
luminance Y (Yc has nothing to do with Y'!)
<br> * Use Yc, Cb and Cr as an
input for the hyperluma table to calculate luma Y'
for each of four pixels.
<br> * Send the resulting 6 numbers
- Y', Y', Y', Y', Cb, Cr - to the output
<p>4:2:2 encoding is analogical. Note that the
algorithm doesn't depend on
what subsampling filter you use.</p>
<p>Hyperluma 2 improves the chrominance by calculating
the chroma not by
chroma filtering, but by physically correct photon
averaging. The luminance
is corrected completely like in Hyperluma 1, the chrominance
is corrected
partially. But the chrominance correction is not perfect -
the luma shift
caused by the luma correction by the hyperluma table causes
the colours to
drift from their ideal values.</p>
<p>Hyperluma 1 and Hyperluma 2 are algorithms of
constant luminance. But the
viewer doesn't watch the luminance only. He cares also about
the chrominance!
I took a completely different approach and set a goal to get
both constant
luminance and constant chrominance. The result is the
Twibright Luminaplex
algorithm, an algorithm that produces the ultimately correct
result.</p>
<p><b>Twibright Luminaplex</b></p>
<p>The ultimate idea of encoding RGB or 4:4:4 Y'CbCr
into 4:2:0 or 4:2:2 Y'CbCr
is strikingly simple. You want the viewer to see the same on
the TV as what is in
the input signal. And that's what lead me to the Luminaplex
algorithm,
the algorithm that provides the ultimate quality.</p>
<p><b>How to tell whether the viewer sees the
same</b></p>
<p>Sees the same doesn't mean the TV shows the same.
The human eye is alleged
to have half resolution in colour than in brightness. But
how to quantify this,
when the error is perceived according to a power function,
while the colour
mixing goes linearly? I cut this Gordic knot with a simple
algorithm that I still
believe is accurate. This is what the perceived_error
function in the source
code calculates:</p>
<p><b>The perceived error
measure</b></p>
<br><br>
<br> * Operates on 2x2 pixel
blocks
<br> * Decodes the Y'CbCr data like
a decoder with nearest neighbourhood
interpolation would do
<br> * Calculates 16-bit linear RGB
that appear on the screen phosphors
<br> * Calculates luminance Y from
RGB for each of the 4 pixels.
<br> * Calculates gamma-companded
luminance Yc from each Y by just applying
the gamma function (do not confuse Yc and Y'!)
<br> * Calculates gamma-companded
luminance Yc for each pixel (that't the brightness
perception
<br> * Averages the RGB (linear)
values of all 4 pixels together. Then applies
gamma so R'G'B' come out. That's the colour perception.
<br> * We are having 7 numbers -
Yc, Yc, Yc, Yc, R', G', B'. Calculate
the Yc, Yc, Yc, Yc, R', G', B' that would be displayed from
the original
RGB signal without 4:2:0 Y'CbCr encoding. Take a difference
for each
of the 7 channels, square it and sum the squares. Take a
square root. This
is called <a href="http://en.wikipedia.org/wiki/Root_mean_square"&g
t;Root Mean
Square (RMS)</a> and that says how crappy the viewer
considers the picture.
<br> * I defined it as RMS because
it has a property that when the same absolute
value of difference is spread into multiple pixels, it is
considered less
crappy. I think this corresponds to the reality of
perception. A single
screaming bug in the image is worse than some small
imperfection uniformly
spread in the picture.
<p><b>The Luminaplex algorithm
itself</b></p>
<br><br>
<br> * Operates on blocks of 2x2
pixels
<br> * Calculate 6 values - Y', Y',
Y', Y', Cb, Cr as for Hyperluma 2. This is
our starting point.
<br> * Now we are going to
continuously monitor the quality of our data by calculating
the perceived error from these 6 numbers - Y', Y', Y', Y',
Cb, Cr
<br> * Take some dimension (for
example the first Y') and try to increase
the value. Does the error go up? If yes, try going the other
way.
If no, increase until the error would go up.
<br> * Take another dimension, then
another until you exhaust all the 6 dimensions.
<br> * Go another round and another
until you make a round where you actually didn't
do any adjustment. This is the final result.
<br> * You can see the algorithm
proceeds like a satellite dish viewer who wants
to get the best picture. He starts with some picture. Then
he adjust the
azimuth for the best picture. Then he adjust the elevation
for the best picture
while keeping the improved azimuth. Then he repeats until he
realizes there is
nothing more to improve. Then he sits down into the couch
and
watches.
<p>Note that this algorithm, unlike Hyperluma 1 and 2,
are defined with the
assumption that the receiver uses a nearest neighbour filter
for the chroma
upsampling. There is actually no need to use any more
complicated filter than
this. The transmission chain with Twibright Luminaplex
encoder and
nearest neighbour decoder preserves the luminance
and the chrominance in the passpand of human eye. Some
spatial
frequency mirrors are generated in the chrominance area
above the capability
of human eye to discern colours, because the
average-4-pixels filter doesn't
cut off above the Nyquist frequency well. However it doesn't
matter, because
human eye cannot see these mirrors anyway.</p>
<p>Running a decoder with more sophisticated chroma
filter on a Luminaplex
output is not only a waste of CPU time, but also degrades
the picture somewhat.
The Luminaplex output was optimized for the simplest filter.
For example mplayer
on png output (where I can study the output exactly) uses
this simplest filter.</p>
<p><b>How to optimize Luminaplex output for
decoders with sophisticated
filters?</b></p>
<p>But what if we need to encode with Twibrigtht
Luminaplex for a decoder that
uses a more sophisticated filter, like bilinear or even
bicubic? Then the
task becomes more complicated, but not
impossible.</p>
<p>Instead of optimizing blocks 2x2 now we have to
optimize the whole picture
as a single block, because changing one luma or chroma value
influences the
errors in several neighbouring pixels. It's the same
algorithm, but instead
of 6 dimensions we have several hundreds thousands or
millions dimensions.</p>
<p>Each iteration we don't have to minimize the global
image error, but we need
to calculate the error only for those pixels that are
influenced in the output
by the changed value. The size of this area is given by the
size of the filter
core in the decoder. It will surely slow down the
calculation, several times,
depending on the core size. But it can still be practical
for one-time encoding
of videos for distribution for example on a website or on a
DVD. I didn't
implement this variant into pix-y4m.c because it would be
more complicated and
slower and I didn't intend to use it on the Ronja
project.</p>
<p>The Luminaplex algorithm is not suitable for
broadcast because theoretically
one could feed a poisonous picture that would contain a lot
of difficult
squares that would slow down the decoder and cause a stream
dropout. The
Hyperluma 2 doesn't have this problem, the execution always
takes the same
time there.</p>
<p>Some 8-bit R'G'B' triplets are not representable by
8-bit Y'CbCr -
example seems to be 192,64,192. In such case Luminaplex
sometimes produces some
kind of 2x2 dithering to minimize the error. Very slight
dithering artifacts on
constant edges or constant colours result. If this is not
acceptable for some
reason, Hyperluma 2 must be used instead. Sometimes the
colour transition is so
sharp that Y'CbCr would have to go out of range to produce
the correct
result. It was originally programmed this way but then it
came out that mplayer
seems to clip Y'CbCr into the 16-235,16-240,16-240 range
before processing and
the result was horrible - a green tint on black. So I
removed this and allowed
only Y'CbCr values within the range to be used. This
degraded the SNR on the
numeric test (below) only by about
a decibel, from 45dB to 44dB.</p>
<p><b>Numeric quality
comparison</b></p>
<p>I have made a test with randomly generated pixel
content. Such content exercises
all possible colour transitions in the linear RGB cube
homogenously. The RMS
error is calculated for LSB per pixel - if it comes out as
N, it means you
would need to increase all 8-bit R'G'B' channels in a
picture by N to get
the same level of perceived error. I also provide a SNR
value in dB from
this SNR. 0 dB means error excursion of 127.5LSB - the same
excursion from the
average as the strongest possible signal - white/black bars.
400,000 randomly
filled pixel squares were used for each run. Iterations per
pixel are
indicated, that counts how many times the perceived_error
calculation is
performed per pixel (from which 0.25 is for evaluation
purpose only). The
time/pixel is measured on 1.5GHz Pentium M with OpenBSD 4.0,
gcc 3.3.5, and
optimization flags -g -O2 -fomit-frame-pointer
-funroll-loops -fstrength-reduce.</p>
<br><b>Algorithm - RMS error [LSB] - SNR [dB] -
Time/pixel [usec]
- Iterations/pixel</b>
<br>
Ordinary - 9.075 LSB - 22.95 dB - 0.20 usec - 0.25 iter.
<br>
Hyperluma 1 - 5.635 LSB - 27.09 dB - 0.18 usec - 0.25 iter.
<br>
Hyperluma 2 - 2.800 LSB - 33.17 dB - 0.18 usec - 0.25 iter.
<br>
Luminaplex - 0.731 LSB - 44.83 dB - 2.01 usec - 6.96 iter.
<br>
<p><b>Source codes</b></p>
<p><a href="http://ronja.twibright.com/utils/pix-y4m.
c">pix-y4m.c</a> is the source code. The
Hyperluma 1
is disabled in the code. As it is now the code implements
Twibright Luminaplex
and the inaccessible Hyperluma 1 code is there only for
illustration.</p>
<p><b>Usage in real world</b></p>
<p><a href="http://
ronja.twibright.com/3d/">Ronja 3D
videos</a> which are used to illustrate how to put
<a href="
http://ronja.twibright.com/">Ronja</a>
together, are now all encoded with the Luminaplex
algorithm. It is indicated by a small Luminaplex logo in the
upper left corner.
I think it was a good idea to develop the Luminaplex
algorithm for this application,
because the models are brightly coloured for
comprehensibility and then the
chroma bleed problem is the strongest. The chroma bleed
disease attacks
details of the picture and people could have problem
identifying all those
small nuts, bolts and holes in the Ronja models.</p>
<p><b>Patents</b></p>
<p>As I already mentioned, the Hyperluma 1 algorithm
seems to be covered by
some patents. Hyperluma 2 and Luminaplex are not, I hope. On
January 28, 2007 I
published these algorithms and their GPL free software
implementation to
constitute prior art and to prevent someone from stealing
them by patenting
them and then being able to prevent people from using this
algorithm. I have
sent a note to various mailing lists and respectable
individuals so a plausible
proof of prior art can be constructed to challenge a patent
if someone tried to
abuse the patent system this way.</p>
<p>This of course applies only in the case the
Hyperluma 2 or Luminaplex
algorithms doesn't accidentally happen to be already covered
by a patent as
happened in the case of Hyperluma 1. I didn't conduct a
patent search because
these are expensive and checked just the patents connected
with Hyperluma 1. If
you are curious you can conduct a patent search yourself.
Should you find a
patent that infringes either algorithm, let me
know.</p>
<p>Software patents are harmful for software
development, especially free
software. If you look into some public patent database you
can see the patent
system is full of simple algorithms performing simple tasks,
effectively
banning free usage of some independently invented innovative
ideas.</p>
<p><b>Bibliography</b></p>
<br><br>
<br> * <a href="http://www.poynton.com/PDFs/Mag_of_noncons
t_luminance.pdf">The magnitude of nonconstant
luminance errors (Acrobat PDF format)</a> by Charles
Poynton
<br> * <a href="http://www.poynton.com/ColorFAQ.html">Poynton's
Color FAQ</a>
<br> * <a href="http://en.wikipedia.org/wiki/Chroma_subsampling&q
uot;>Chroma Subsampling</a> on Wikipedia
<br> * <a
href="http://poynton.com/papers/YUV_and_luminance
_harmful.html">http://poynton.com/PDFs/YUV_and_lumin
ance_harmful.pdf</a>
by Charles Poynton
<br> * <a href="http://en.wikipedia.org/wiki/Mach_bands">Mach
Bands</a>
<br> * <a href="http://en.wikipedia.org/wiki/Cornsweet_illusio
n">Cornsweet illusion</a>
<br> * <a
href="http://v3.espace
net.com/textdoc?DB=EPODOC&IDX=GB2293514&F=0">
;GB2293514</a>
"Luminance signal coding; correcting for failure of
constant
luminance"
<br> * <a href="http://v3.espacenet.com/te
xtdoc?DB=EPODOC&IDX=WO9609724&F=0">WO9609724
</a>
"Luminance signal coding; correcting for failure of
constant
luminance"
<br> * <a href="http://www.google.com/patents?vid=USPAT49997
02&id=ZFIfAAAAEBAJ&dq=4999702&ie=ISO-8859-1"
;>US4999702</a>
"Method and apparatus for processing component
signals to preserve high frequency intensity"
------------------------------------------------------------
-------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the
chance to share your
opinions on IT & business topics through brief surveys -
and earn cash
http://www.techsay.com/default.
php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
gstreamer-devel mailing list
gstreamer-devel lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/gstream
er-devel
|