Thread: : r613 - docs/nutissues.txt
|
|
| : r613 - docs/nutissues.txt |
  Switzerland |
2008-02-12 09:00:10 |
Author: michael
Date: Tue Feb 12 16:00:09 2008
New Revision: 613
Log:
More about interleaving.
Modified:
docs/nutissues.txt
Modified: docs/nutissues.txt
==============================================================================
--- docs/nutissues.txt (original)
+++ docs/nutissues.txt Tue Feb 12 16:00:09 2008
@@ -162,3 +162,8 @@ How do we identify the interleaving
 A. fourcc
 B. extradata
 C. New field in the stream header
+D. Only allow 1 standard interleaving
+
+What about the interleaving of non raw codecs, do all specify the
+interleaving, or does any leave it to the container? If so, our options
+would be down to only C.
_______________________________________________
NUT-devel mailing list
NUT-devel mplayerhq.hu
https://lists.mplayerhq.hu/mailman/listinfo/nut-devel
|
|
| Re: : r613 - docs/nutissues.txt |
  Germany |
2008-02-12 10:47:13 |
On Tue, 12 Feb 2008 16:00:10 +0100 (CET)
michael <subversion mplayerhq.hu> wrote:
> Modified: docs/nutissues.txt
> ==============================================================================
> --- docs/nutissues.txt (original)
> +++ docs/nutissues.txt Tue Feb 12 16:00:09 2008
> @@ -162,3 +162,8 @@ How do we identify the interleaving
> A. fourcc
> B. extradata
I would vote for this, with a single fourcc for PCM and a single fourcc
for raw video. Having info about the data format packed into the fourcc
is ugly and useless. That just leads to inflexible lookup tables and the
like. Instead we should just define the format in a way similar to what
mp_image provides for video (colorspace, packed or not, shift used for
the subsampled planes, etc.). That would allow implementations to simply
support all definable formats, instead of a selection of what happened
to be commonly used formats at the time the implementation was written.
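
To make the idea concrete, here is a minimal sketch in C of what such a
descriptor could look like. Every name in it (raw_video_format,
chroma_h_shift, chroma_dim, ...) is hypothetical, loosely modelled on the
kind of fields mp_image carries; nothing here is taken from the NUT spec or
from MPlayer.

```c
#include <stdint.h>

/* Hypothetical descriptor for raw video; none of these names exist in NUT. */
enum raw_colorspace { RAW_CS_RGB, RAW_CS_YUV };

struct raw_video_format {
    enum raw_colorspace colorspace;
    int     packed;          /* 1 = components interleaved, 0 = planar */
    uint8_t bits_per_comp;   /* bits per component */
    uint8_t chroma_h_shift;  /* log2 of the horizontal chroma subsampling */
    uint8_t chroma_v_shift;  /* log2 of the vertical chroma subsampling */
};

/* Size of one chroma dimension, rounding up for odd luma sizes. */
static unsigned chroma_dim(unsigned luma_dim, unsigned shift)
{
    return (luma_dim + (1u << shift) - 1) >> shift;
}
```

With something like this a reader can derive plane sizes from the shifts
instead of looking the fourcc up in a table of known formats.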
> C. New field in the stream header
> +D. Only allow 1 standard interleaving
> +
> +What about the interleaving of non raw codecs, do all specify the
> +interleaving, or does any leave it to the container? If so, our options
> +would be down to only C.

On a related subject, it might also be useful to define the channel
disposition when there is more than one. Mono and stereo can get by with
the classical default, but as soon as there are more channels it is
really unclear. And imho such info could still be useful with 1 or 2
channels. Something like the position of each channel in polar
coordinates (2D or 3D?) should be enough.
Albeu
|
|
| Re: : r613 - docs/nutissues.txt |
  Austria |
2008-02-12 10:57:03 |
On Tue, Feb 12, 2008 at 05:47:13PM +0100, Alban Bedel wrote:
> On Tue, 12 Feb 2008 16:00:10 +0100 (CET)
> michael <subversion mplayerhq.hu> wrote:
>
> > Modified: docs/nutissues.txt
> > ==============================================================================
> > --- docs/nutissues.txt (original)
> > +++ docs/nutissues.txt Tue Feb 12 16:00:09 2008
> > @@ -162,3 +162,8 @@ How do we identify the interleaving
> > A. fourcc
> > B. extradata
>
> I would vote for this with a single fourcc for pcm and a single fourcc
> for raw video. Having infos about the data format packed in the fourcc
> is ugly and useless. That just lead to inflexible lookup tables and the
> like.
> Instead we should just define the format in a way similar to what
> mp_image provide for video (colorspace, packed or not, shift used for
> the subsampled planes, etc). That would allow implementations simply
> supporting all definable format, instead of a selection of what happened
> to be commonly used formats at the time the implementation was written.
The key points here are that
* colorspace/shift for subsampled planes, etc. is not specific to RAW, it's
  more like sample_rate or width/height
* non-raw codecs have clearly defined global headers (sometimes at least)
  -> thus we can't really use extradata for it
extradata would only be OK for things we definitely don't ever need for
non-raw

>
> > C. New field in the stream header
> > +D. Only allow 1 standard interleaving
> > +
> > +What about the interleaving of non raw codecs, do all specify the
> > +interleaving, or does any leave it to the container? If so, our options
> > +would be down to only C.
>
> On a related subject, it might also be useful to define the channel
> disposition when there is more than one. Mono and stereo can go by with
> the classical default, but as soon as there is more channels it is
> really unclear. And imho such info could still be usefull with 1 or 2
> channels. Something like the position of each channel in polar
> coordinate (2D or 3D?) should be enouth.

I agree.
What about that LFE channel thing? And where do we put this info? The
stream header seems the logical place if you ask me ...
[...]
--
Michael GnuPG fingerprint:
9FF2128B147EF6730BADF133611EC787040B0FAB
Democracy is the form of government in which you can choose
your dictator
|
|
| Re: : r613 - docs/nutissues.txt |
  Germany |
2008-02-12 12:37:53 |
On Tue, 12 Feb 2008 17:57:03 +0100
Michael Niedermayer <michaelni gmx.at> wrote:
> The key points here are that
> * colorspace/shift for subsampled planes, etc is not specific to RAW,
> its more like sample_rate or width/height

Sure, but when a "real" codec is used, it's the decoder's business to tell
the app what output format it will use. NUT can provide info about the
internal format used by the codec; that would help when dealing with
decoders that include slow colorspace conversions. But that's definitely
non-essential information, any player should be able to do without it.

However, for RAW data the "decoder" needs to know the exact format used,
just like some other decoders need some Huffman tables or whatever.
And the logical place for such information is the extradata imho.

> * non raw codecs have clearly defined global headers (sometimes at
> least) -> thus we cant really use extradata for it
> extradata would only be ok for things we definitly dont ever need
> for non raw

imho the 2 cases are completely different. For raw codecs we are talking
about information essential to the decoder initialization. For non-raw
codecs we are talking about some extra information only useful in some
applications. Both need to encode the same type of information but imho
they should be stored in different places.
> I agree
> What about that LFE channel thing?
I was thinking about simply setting the distance to 0; however, a flag
for "non-directional" channels might be better.

> And where do we put this info, The stream header seems the logic
> place if you ask me ...

I agree, this is essential information for proper presentation; it
definitely belongs there.
Albeu
|
|
| Re: : r613 - docs/nutissues.txt |
  United States |
2008-02-12 12:35:47 |
On Tue, Feb 12, 2008 at 07:37:53PM +0100, Alban Bedel wrote:
> Sure, but when a "real" codec is used, it's the decoder business to tell
> the app what output format it will use. NUT can provide infos about the
> internal format used by the codec, that would help dealing with decoder
> including slow colorspace conversions. But that's definitly
> non-essential information, any player should be able to do without it.
>
> However for RAW data the "decoder" need to know the exact format used,
> just like some other decoder need some huffman tables or whatever.
> And the logical place for such information is the extradata imho.
Agree.
Rich
|
|
| Re: : r613 - docs/nutissues.txt |
  Austria |
2008-02-12 12:43:13 |
On Tue, Feb 12, 2008 at 07:37:53PM +0100, Alban Bedel wrote:
> Sure, but when a "real" codec is used, it's the decoder business to tell
> the app what output format it will use. NUT can provide infos about the
> internal format used by the codec,

Only very few codecs have headers which store information about things like
shift for subsampled planes. Thus if this information is desired it has to
come from the container more often than not. If it's not desired then we also
don't need it for raw IMHO.

> that would help dealing with decoder
> including slow colorspace conversions.

I have no interest in supporting or helping this case, and I suspect I am not
alone here.

> But that's definitly
> non-essential information, any player should be able to do without it.
>
> However for RAW data the "decoder" need to know the exact format used,
> just like some other decoder need some huffman tables or whatever.
> And the logical place for such information is the extradata imho.

See above. Also, there really are 2 things:
1. How things are stored (packed vs. planar, the precise byte packing, ...)
2. What is stored (colorspace details, YUV BT123 vs BT567, chroma shift,
   channel positions)

1. defines the format, that is, the packing of the raw bytes. This is somewhat
   similar to mpeg4 vs h261, thus I think it should be specified by the fourcc.
2. is needed for non-raw as well, which makes fourcc and extradata unusable.
[...]
> > I agree
> > What about that LFE channel thing?
>
> I was thinking about simply setting the distance to 0, however a flag
> for "non-directional" channels might be better.

This is wrong, LFE is not about direction but about the type of speaker.
LFE stands for "Low-frequency effects".
If I'd move some other random speaker to distance 0 and the LFE one out and
switch channels, it won't sound correct ...
>
> > And where do we put this info, The stream header seems the logic
> > place if you ask me ...
>
> I agree, this is essential information for proper presentation it
> definitly belong there.

Good, now we just need to agree on some half sane way to store it.

for(i=0; i<num_channels; i++){
    x_position                          s
    y_position                          s
    z_position                          s
    channel_flags                       v
}

CHANNEL_FLAG_LFE 1

seems ok?
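
For illustration, a rough sketch in C of how a demuxer might hold that loop
in memory once it has been read. The struct and function names are
hypothetical, it assumes s and v end up as signed and unsigned 64-bit
integers after decoding, and the 5.1 positions are made-up example values
(assuming +x is right and +y is forward, with the listener at the origin),
not a proposed standard layout.

```c
#include <stddef.h>
#include <stdint.h>

#define CHANNEL_FLAG_LFE 1

/* In-memory form of one iteration of the proposed loop (hypothetical). */
struct channel_pos {
    int64_t  x, y, z;  /* x_position, y_position, z_position */
    uint64_t flags;    /* channel_flags */
};

/* Example values only: +x right, +y forward, listener at (0,0,0). */
static const struct channel_pos layout_5_1[] = {
    { -1,  1, 0, 0 },                /* front left  */
    {  1,  1, 0, 0 },                /* front right */
    {  0,  1, 0, 0 },                /* centre      */
    {  0,  1, 0, CHANNEL_FLAG_LFE }, /* LFE         */
    { -1, -1, 0, 0 },                /* rear left   */
    {  1, -1, 0, 0 },                /* rear right  */
};

/* Count the channels that carry the LFE flag. */
static size_t count_lfe(const struct channel_pos *ch, size_t n)
{
    size_t i, lfe = 0;
    for (i = 0; i < n; i++)
        if (ch[i].flags & CHANNEL_FLAG_LFE)
            lfe++;
    return lfe;
}
```

Note that the LFE entry still has a position; the flag only says what kind
of channel it is, which is the distinction made above.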
[...]
--
Michael GnuPG fingerprint:
9FF2128B147EF6730BADF133611EC787040B0FAB
While the State exists there can be no freedom; when there
is freedom there
will be no State. -- Vladimir Lenin
|
|
| Re: : r613 - docs/nutissues.txt |

|
2008-02-12 13:17:07 |
Michael Niedermayer <michaelni gmx.at> writes:
> Only very few codecs have headers which store informations about things like
> shift for subsampled planes. Thus if this information is desired it has to
> come from the container more often than not. If its not desired then we also
> dont need it for raw IMHO.

With compressed video, the decoder informs the caller of the pixel
format. With raw video, this information must come from the
container, one way or another.
> see above, also there really are 2 things
> 1. How things are stored (packed vs. planar, the precisse byte packing, ...
> 2. What is stored (colorspace details YUV BT123 vs BT567, chroma shift,
> channel positions
>
> 1. defines the format that is packing of raw bytes this is somehow similar
> to mpeg4 vs h261 thus i think it should be specified by the fourcc
> 2. is needed for non raw as well which makes fourcc and extradata unuseable

The colourspace and whatnot are only needed if the compressed data is
actually decoded, and in this case the decoder should be extracting
this information from whatever headers the format uses.
> Good, now we just need to agree on some half sane way to store it.
> for(i=0; i<num_channels; i++){
>     x_position s
>     y_position s
>     z_position s
>     channel_flags v
> }
>
> CHANNEL_FLAG_LFE 1
>
> seems ok?

I'm not convinced this is the right way to go. Consider a recording
made with several directional microphones in the same location. Using
spherical coordinates could be a solution.

Whatever the coordinate system, the location and orientation of the
listener must be specified, even if there is only one logical choice.
--
Måns Rullgård
mans mansr.com
|
|
| Re: : r613 - docs/nutissues.txt |
  Austria |
2008-02-12 13:56:01 |
On Tue, Feb 12, 2008 at 07:17:07PM +0000, Måns Rullgård wrote:
> With compressed video, the decoder informs the caller of the pixel
> format. With raw video, this information must come from the
> container, one way or other.

Yes, I agree for pixel format.
But the decoder often does not know the fine details, like the
aforementioned "shift for subsampled planes", the precise definition of
YUV, or whether it uses the full luma range or not. MPEG stores these, yes,
but huffyuv, for example, does not. So it would make some sense if this
information could be stored for non-raw as well.
[...]
> I'm not convinced this is the right way to go. Consider a recording
> made with several directional microphones in the same location. Using
> spherical coordinates could be a solution.

The above was intended to specify the location of the speakers, not
microphones.
And spherical coordinates would just drop the distance; that's the same
as setting the distance to 1 and storing that as xyz.
Actually the main reason why I didn't use spherical is that with integers
there's a precision to decide on, or you end up with rationals. And this
somehow starts looking messy ...

> Whatever the coordinate system, the location and orientation of the
> listener must be specified, even if there is only one logical choice.

of course

right_position                      s
forward_position                    s
up_position                         s

And
"the listener is at (0,0,0), (1,0,0) is right, (0,1,0) is forward,
(0,0,1) is up"
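
As a small sketch of what that convention buys a player, the C fragment
below classifies a speaker relative to the listener purely from the signs of
the coordinates. The struct and helper names are hypothetical and only the
three field names come from the syntax above; 32-bit storage is an arbitrary
assumption.

```c
#include <stdint.h>

/* Position in the listener-centred frame: listener at (0,0,0),
 * (1,0,0) is right, (0,1,0) is forward, (0,0,1) is up.
 * Hypothetical struct; only the field names follow the proposal. */
struct speaker_pos {
    int32_t right, forward, up;
};

/* -1 = left of the listener, 0 = on the median plane, 1 = right. */
static int side_of(const struct speaker_pos *p)
{
    return (p->right > 0) - (p->right < 0);
}

/* 1 if the speaker is in front of the listener, 0 otherwise. */
static int is_front(const struct speaker_pos *p)
{
    return p->forward > 0;
}
```

A front-left speaker would then be any position with right < 0 and
forward > 0, whatever the unit or scale turns out to be.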
[...]
--
Michael GnuPG fingerprint:
9FF2128B147EF6730BADF133611EC787040B0FAB
No human being will ever know the Truth, for even if they
happen to say it
by chance, they would not even known they had done so. --
Xenophanes
|
|
| Re: : r613 - docs/nutissues.txt |

|
2008-02-12 14:24:01 |
Michael Niedermayer <michaelni gmx.at> writes:
> Yes, I agree for pixel format.
> But the decoder often does not know the fine details. Like as
> mentioned "shift for subsampled plane" or the precisse definition of
> YUV or if it uses full luma range or not. MPEG stores these yes, but
> for example huffyuv does not. So it would make some sense if this
> information could be stored for non raw as well.

Point taken, and I agree being able to transmit this information could
be useful. Using extradata is obviously out of the question, which
leaves either stream headers or info packets.
> The above was intended to specify the location of the speakers not
> microphones.

I'm having a hard time imagining a player moving my speakers around
depending on the file being played.

> And spherical coordinates would just drop the distance, thats the same
> as setting the distance to 1 and storing that as xyz.

Spherical coordinates without radius need only two fields.

> Actually the main reason why i didnt use spherical is that with integers
> theres a precission to decide on or you end up with rationals. And this
> somehow starts looking messy ...

I don't see any fundamental difference. If restricted to integer
coordinates, an arbitrary point can be described only with a certain
precision, regardless of coordinate system.

>> Whatever the coordinate system, the location and orientation of the
>> listener must be specified, even if there is only one logical choice.
>
> of course
> right_position s
> forward_position s
> up_position s
>
> And
> "the listener is at (0,0,0), (1,0,0) is right, (0,1,0) is forward,
> (0,0,1) is up"

You're forgetting the measurement unit, i.e. metres, feet, etc.
--
Måns Rullgård
mans mansr.com
|
|
| Re: : r613 - docs/nutissues.txt |
  Austria |
2008-02-12 15:38:28 |
On Tue, Feb 12, 2008 at 08:24:01PM +0000, Måns Rullgård wrote:
> Michael Niedermayer <michaelni gmx.at> writes:
>
> > On Tue, Feb 12, 2008 at 07:17:07PM +0000, Måns
Rullgård wrote:
> >> Michael Niedermayer <michaelni gmx.at> writes:
> >>
> >> > On Tue, Feb 12, 2008 at 07:37:53PM +0100,
Alban Bedel wrote:
> >> >> On Tue, 12 Feb 2008 17:57:03 +0100
> >> >> Michael Niedermayer <michaelni gmx.at> wrote:
> >> >>
> >> >> > On Tue, Feb 12, 2008 at
05:47:13PM +0100, Alban Bedel wrote:
> >> >> > > On Tue, 12 Feb 2008
16:00:10 +0100 (CET)
> >> >> > > michael <subversion mplayerhq.hu> wrote:
> >> >> > >
> >> >> > > > Modified:
docs/nutissues.txt
> >> >> > > >
============================================================
==================
> >> >> > > > ---
docs/nutissues.txt (original)
> >> >> > > > +++
docs/nutissues.txt Tue Feb 12 16:00:09 2008
> >> >> > > >  -162,3 +162,8  How do
we identify the interleaving
> >> >> > > > A. fourcc
> >> >> > > > B. extradata
> >> >> > >
> >> >> > > I would vote for this with a single fourcc for pcm and a single
> >> >> > > fourcc for raw video. Having infos about the data format packed in
> >> >> > > the fourcc is ugly and useless. That just lead to inflexible lookup
> >> >> > > tables and the like.
> >> >> >
> >> >> > > Instead we should just define the format in a way similar to what
> >> >> > > mp_image provide for video (colorspace, packed or not, shift used
> >> >> > > for the subsampled planes, etc). That would allow implementations
> >> >> > > simply supporting all definable format, instead of a selection of
> >> >> > > what happened to be commonly used formats at the time the
> >> >> > > implementation was written.
> >> >> >
> >> >> > The key points here are that
> >> >> > * colorspace/shift for subsampled planes, etc is not specific to RAW,
> >> >> > its more like sample_rate or width/height
> >> >>
> >> >> Sure, but when a "real" codec is used, it's the decoder business to tell
> >> >> the app what output format it will use. NUT can provide infos about the
> >> >> internal format used by the codec,
> >> >
> >> > Only very few codecs have headers which store informations about
> >> > things like shift for subsampled planes. Thus if this information
> >> > is desired it has to come from the container more often than
> >> > not. If its not desired then we also dont need it for raw IMHO.
> >>
> >> With compressed video, the decoder informs the caller of the pixel
> >> format. With raw video, this information must come from the
> >> container, one way or other.
> >
> > Yes, I agree for pixel format.
> > But the decoder often does not know the fine details. Like as
> > mentioned "shift for subsampled plane" or the precisse definition of
> > YUV or if it uses full luma range or not. MPEG stores these yes, but
> > for example huffyuv does not. So it would make some sense if this
> > information could be stored for non raw as well.
>
> Point taken, and I agree being able to transmit this information could
> be useful. Using extradata is obviously out of the question, which
> leaves either stream headers or info packets.
And looking at the stream headers, there is colorspace_type,
which I've apparently half forgotten ...
Does anyone mind if I add chroma_x/y_pos there as well?
Rich?
>
> >> >> > > On a related subject, it might also be useful to define the channel
> >> >> > > disposition when there is more than one. Mono and stereo can go by
> >> >> > > with the classical default, but as soon as there is more channels
> >> >> > > it is really unclear. And imho such info could still be usefull
> >> >> > > with 1 or 2 channels. Something like the position of each channel
> >> >> > > in polar coordinate (2D or 3D?) should be enouth.
> >> >> >
> >> >> > I agree
> >> >> > What about that LFE channel thing?
> >> >>
> >> >> I was thinking about simply setting the distance to 0, however a flag
> >> >> for "non-directional" channels might be better.
> >> >
> >> > This is wrong, LFE is not about direction but about the type of speaker.
> >> > LFE stands for "Low-frequency effects".
> >> > If id move a other random speaker at disatnce 0 and the LFE one out and
> >> > switch channels it wont sound correct ...
> >> >
> >> >>
> >> >> > And where do we put this info, The stream header seems the logic
> >> >> > place if you ask me ...
> >> >>
> >> >> I agree, this is essential information for proper presentation it
> >> >> definitly belong there.
> >> >
> >> > Good, now we just need to agree on some half sane way to store it.
> >> > for(i=0; i<num_channels; i++){
> >> >     x_position s
> >> >     y_position s
> >> >     z_position s
> >> >     channel_flags v
> >> > }
> >> >
> >> > CHANNEL_FLAG_LFE 1
> >> >
> >> > seems ok?
> >>
> >> I'm not convinced this is the right way to go. Consider a recording
> >> made with several directional microphones in the same location. Using
> >> spherical coordinates could be a solution.
> >
> > The above was intended to specify the location of the speakers not
> > microphones.
>
> I'm having a hard time imagining a player moving my speakers around
> depending on the file being played.
>
> > And spherical coordinates would just drop the distance, thats the same
> > as setting the distance to 1 and storing that as xyz.
>
> Spherical coordinates without radius needs only two fields.
True, but that gets tricky with integers and precision.
>
> > Actually the main reason why i didnt use spherical is that with integers
> > theres a precission to decide on or you end up with rationals. And this
> > somehow starts looking messy ...
>
> I don't see any fundamental difference. If restricted to integer
> coordinates, an arbitrary point can be described only with a certain
> precision, regardless of coordinate system.
True, but if you map the points to a sphere, then x,y,z gives you arbitrary
precision on the surface of the sphere, while with spherical coordinates
this needs some additional "tricks".
Thus x,y,z give you arbitrary directional precision at quite low complexity.
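As a rough illustration of the point above (a sketch only, not part of the
NUT spec; IntDir and same_direction are hypothetical names), integer x,y,z
triples encode a direction purely through their ratios. Scaling a triple
leaves the direction unchanged, and larger integers allow finer angular
steps, all with plain integer arithmetic:

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical type, not from the NUT spec: a direction stored as integers. */
typedef struct { int32_t x, y, z; } IntDir;

/* Returns 1 if a and b point in the same direction, i.e. they differ only
 * by a positive scale factor (zero cross product, positive dot product).
 * Uses 64-bit intermediates; assumes each component has magnitude below
 * 2^30 so the products and sums cannot overflow. */
static int same_direction(IntDir a, IntDir b)
{
    int64_t cx  = (int64_t)a.y * b.z - (int64_t)a.z * b.y;
    int64_t cy  = (int64_t)a.z * b.x - (int64_t)a.x * b.z;
    int64_t cz  = (int64_t)a.x * b.y - (int64_t)a.y * b.x;
    int64_t dot = (int64_t)a.x * b.x + (int64_t)a.y * b.y
                + (int64_t)a.z * b.z;
    return cx == 0 && cy == 0 && cz == 0 && dot > 0;
}
```

For example, (1,1,0) and (1000,1000,0) compare as the same direction, while
(1000,1001,0) is a slightly different one that a coarser integer grid could
not express.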
>
> >> Whatever the coordinate system, the location and orientation of the
> >> listener must be specified, even if there is only one logical choice.
> >
> > of course
> > right_position s
> > forward_position s
> > up_position s
> >
> > And
> > "the listener is at (0,0,0), (1,0,0) is right, (0,1,0) is forward,
> > (0,0,1) is up"
>
> You're forgetting the measurement unit, i.e. metres, feet, etc.
Hmm, I was thinking that only x/y and x/z, that is, the direction, would matter.
If there's some sense in also storing distance, then we would need a 4th
variable to specify the precision, like:
(x/p, y/p, z/p) meter
We can surely do this if someone thinks this is useful.
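A minimal sketch of how such a 4-variable position could be interpreted,
assuming signed x/y/z, a non-zero unsigned divisor p, and the
CHANNEL_FLAG_LFE value from the proposal quoted earlier. The struct layout,
the field widths and the name channel_pos_meters are assumptions made for
illustration, not something NUT defines:

```c
#include <assert.h>
#include <stdint.h>

#define CHANNEL_FLAG_LFE 1  /* flag value from the proposal quoted earlier */

/* Hypothetical in-memory form of one channel entry; widths are assumed. */
typedef struct {
    int32_t  x, y, z;  /* right, forward, up, in units of 1/p meter */
    uint32_t p;        /* precision divisor, must be non-zero */
    uint32_t flags;    /* e.g. CHANNEL_FLAG_LFE */
} ChannelPos;

/* Converts the stored integers to meters as (x/p, y/p, z/p).
 * Returns 0 on success, -1 if p is zero (no valid scale). */
static int channel_pos_meters(const ChannelPos *c, double out[3])
{
    if (c->p == 0)
        return -1;
    out[0] = (double)c->x / c->p;
    out[1] = (double)c->y / c->p;
    out[2] = (double)c->z / c->p;
    return 0;
}
```

With p = 2, the stored triple (-3, 5, 0) would mean a speaker 1.5 m to the
left and 2.5 m in front of the listener. If only the direction matters, p
can be ignored and the triple read as a plain direction vector.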
[...]
--
Michael GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB
When you are offended at any man's fault, turn to yourself and study your
own failings. Then you will forget your anger. -- Epictetus