Thread: Info packets in NUT stream (spec bugs?)




Info packets in NUT stream (spec bugs?)
user name
2006-11-21 19:32:14
On Tue, Nov 21, 2006 at 06:32:03PM +0100, Michael Niedermayer wrote:
> Hi
> 
> On Mon, Nov 20, 2006 at 08:57:36PM -0500, Rich Felker wrote:
> [...]
> > > > I've never actually tested it, but AFAIK libnut is completely safe and
> > > > non-breaking on this issue.
> > > 
> > > theres at least one issue with random start timestamps
> > > try 1e9999 as start timestamp and tell me if that worked
> > > while the fileformat of course has no problem with arbitrary integers,
> > > implementations will ...
> > > making it clear that 0 should be used as start where possible reduces
> > > the issue but doesnt solve it
> > 
> > I think it's clear that if you use idiotic time values you'll have
> > problems with implementation support. IMO it's fine to say just that
> > implementations SHOULD NOT go out of their way to support excessively
> > large values for any field in a NUT file.
> 
> what is excessively large? whats idiotic? thats not a good way to specify
> the valid range of a value
> >32bit is idiotic for many people iam pretty sure, still its not enough
> if your input data is in nanosecond precision ...
> 
> and its neither reasonable to assume that everyone has to spend an hour
> per field to guess what range of values would have to be supported to handle
> all non idiotic cases

My idea is that what's idiotic changes with time. That's why we use
vlc rather than fixed-size fields. Unlike other potential areas of
abuse in the spec, I don't see any realistic issue with people
intentionally choosing initial timestamps that will cause trouble with
some implementations. Generally the only things people would choose
for starting timestamps would be 0, the end timestamp of another file,
or the current unix time (seconds since the epoch). All of these will
fit ok in 64bit as long as a sane timebase is used.
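
A rough check of the 64bit claim can be sketched as follows. This is not
part of the original message: the helper names `to_ticks` and `fits_int64`
and the list of timebases are assumptions picked for illustration, and the
start value is the send time of this message, read as UTC.

```python
from datetime import datetime, timezone
from fractions import Fraction

INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

def to_ticks(seconds, timebase):
    """Convert whole seconds to ticks; one tick lasts `timebase` seconds."""
    return int(Fraction(seconds) / timebase)

def fits_int64(ticks):
    """True if `ticks` is representable as a signed 64-bit integer."""
    return INT64_MIN <= ticks <= INT64_MAX

# "Current unix time" at the moment this message was sent (assumed UTC).
unix_start = int(datetime(2006, 11, 21, 19, 32, 14, tzinfo=timezone.utc).timestamp())

# Hypothetical timebases, from coarse to very fine.
for label, timebase in [
    ("1/1000 s", Fraction(1, 1000)),
    ("1/90000 s", Fraction(1, 90000)),
    ("1 ns", Fraction(1, 10**9)),
    ("1 ps", Fraction(1, 10**12)),
]:
    ticks = to_ticks(unix_start, timebase)
    print(f"{label:>9}: {ticks} fits in 64bit: {fits_int64(ticks)}")
```

Under these assumptions a nanosecond timebase still fits, while a
picosecond timebase does not, which is one way to read "sane timebase".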

> > It is always possible via linear search. If the demuxer SHOULD NOT
> > search for them then we should not go out of our way to make it easy
> > to search... Just my 2¢...
> 
> well there really are 2 cases IMHO
> A. midstream info packets are not allowed in normal nut files
> B. midstream info packets are allowed in normal nut files
> 
> for A i agree that the pointers and repeating shouldnt be required, there may
> be other reasons though why repeating the info makes sense ...
> 
> for B i dont agree, simply because if info is there, then there are cases
> where the user will want to have that info, think of some capture of odeds
> radio stream, its not unlikely to think that the user would want to seek to
> a specific song (she knows the song title but not the time to seek to)

Arrg, this is what I was saying way back about info streams and I got
flamed to death. Anyway such a file has no index already, so it's not
meant to be searchable by index, and searching by chapter _name_ does
not work with binary search so linear search is the natural
requirement anyway.

If a user wants to be able to search this file they'll probably remux
it to make a complete nut file (one with index and repeated headers).
While there are some questions to answer I'd be generally ok with
mandating that a "complete" nut file must not have info except after
the headers and that it must be repeated and identical. On the other
hand I think it still might be best to just always mandate this, so
that a complete nut file can be generated just by appending to an
incomplete one. I fear if we don't allow this people will make
"append-based" completion utilities anyway because it's so much more
convenient.

Rich

_______________________________________________
NUT-devel mailing list
NUT-devel@mplayerhq.hu

http://lists.mplayerhq.hu/mailman/listinfo/nut-devel
Info packets in NUT stream (spec bugs?)
user name
2006-11-21 21:35:57
Hi

On Tue, Nov 21, 2006 at 02:32:14PM -0500, Rich Felker wrote:
> On Tue, Nov 21, 2006 at 06:32:03PM +0100, Michael Niedermayer wrote:
> > Hi
> > 
> > On Mon, Nov 20, 2006 at 08:57:36PM -0500, Rich Felker wrote:
> > [...]
> > > > > I've never actually tested it, but AFAIK libnut is completely safe and
> > > > > non-breaking on this issue.
> > > > 
> > > > theres at least one issue with random start timestamps
> > > > try 1e9999 as start timestamp and tell me if that worked
> > > > while the fileformat of course has no problem with arbitrary integers,
> > > > implementations will ...
> > > > making it clear that 0 should be used as start where possible reduces
> > > > the issue but doesnt solve it
> > > 
> > > I think it's clear that if you use idiotic time values you'll have
> > > problems with implementation support. IMO it's fine to say just that
> > > implementations SHOULD NOT go out of their way to support excessively
> > > large values for any field in a NUT file.
> > 
> > what is excessively large? whats idiotic? thats not a good way to specify
> > the valid range of a value
> > >32bit is idiotic for many people iam pretty sure, still its not enough
> > if your input data is in nanosecond precision ...
> > 
> > and its neither reasonable to assume that everyone has to spend an hour
> > per field to guess what range of values would have to be supported to handle
> > all non idiotic cases
> 
> My idea is that what's idiotic changes with time. That's why we use
> vlc rather than fixed-size fields. Unlike other potential areas of
> abuse in the spec, I don't see any realistic issue with people
> intentionally choosing initial timestamps that will cause trouble with
> some implementations. Generally the only things people would choose
> for starting timestamps would be 0, the end timestamp of another file,
> or the current unix time (seconds since the epoch). All of these will
> fit ok in 64bit as long as a sane timebase is used.

for rtp the start timestamp is recommended to be random() IIRC and for
transcoding people might choose to keep the start timestamp, also when
taking some seconds since x and converting that to an "insane" timebase
problems will happen ...


> 
> > > It is always possible via linear search. If the demuxer SHOULD NOT
> > > search for them then we should not go out of our way to make it easy
> > > to search... Just my 2¢...
> > 
> > well there really are 2 cases IMHO
> > A. midstream info packets are not allowed in normal nut files
> > B. midstream info packets are allowed in normal nut files
> > 
> > for A i agree that the pointers and repeating shouldnt be required, there may
> > be other reasons though why repeating the info makes sense ...
> > 
> > for B i dont agree, simply because if info is there, then there are cases
> > where the user will want to have that info, think of some capture of odeds
> > radio stream, its not unlikely to think that the user would want to seek to
> > a specific song (she knows the song title but not the time to seek to)
> 
> Arrg, this is what I was saying way back about info streams and I got
> flamed to death. Anyway such a file has no index already, so it's not
> meant to be searchable by index, and searching by chapter _name_ does
> not work with binary search so linear search is the natural
> requirement anyway.

it is possible to extend info packets so that O(log n) search for
names can be done, i'll explain how (and no iam not saying i actually propose
doing that, its just a random thought)
1. add a hash table to every info packet which contains pointers to info
   packets for all names in the X previous info packets
2. X is the largest power of 2 which divides n which is the number of the
   current info packet, (n=5 -> X=1, n=6 -> X=2, n=8 -> X=8)
3. add a pointer to the Xth previous info packet

the space requirement for this is O(n log n) for n info packets
now to search for your favorite name, start with the last info packet
search its hash table, if theres a match you have your name, if not follow
the pointer from 3. and retry (X must be at least twice as large after each
retry so this is guaranteed to terminate after log n steps)
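
The three steps above can be sketched in memory as follows. This is only
an illustration and not a proposal for the on-disk layout: `build_tables`
and `find` are hypothetical names, and the sketch assumes one reading of
step 1, namely that the table of packet n covers packets n-X+1 .. n
(the packet itself plus the X-1 before it), with the back pointer going to
packet n-X.

```python
def lowbit(n):
    """Largest power of 2 that divides n (n >= 1)."""
    return n & -n

def build_tables(names):
    """names[i] is the name carried by info packet i+1.

    Returns {n: {"table": name -> packet number, "back": n - X}},
    where X = lowbit(n) and back == 0 means "no earlier packet".
    """
    packets = {}
    for n in range(1, len(names) + 1):
        x = lowbit(n)
        table = {names[k - 1]: k for k in range(n - x + 1, n + 1)}
        packets[n] = {"table": table, "back": n - x}
    return packets

def find(packets, last, name):
    """Search backwards from packet `last`.

    Returns (packet number or None, number of info packets read).
    """
    n, reads = last, 0
    while n > 0:
        reads += 1
        hit = packets[n]["table"].get(name)
        if hit is not None:
            return hit, reads
        n = packets[n]["back"]  # step 3: jump X packets back
    return None, reads

names = [f"song{i}" for i in range(1, 101)]
packets = build_tables(names)
print(find(packets, 100, "song1"))
print("table entries in total:", sum(len(p["table"]) for p in packets.values()))
```

Each jump clears the lowest set bit of n, so the number of packets read is
bounded by the number of set bits in the last packet number.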

another random thought
100 music videos with midstream info back pointers need 100 seeks to read all
with 10ms per seek thats 1 second, if we assume 3min playtime per music video
the whole would be 300min and at a realistic bitrate that will take much
longer to search without the pointers
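
The arithmetic can be spelled out as below. The seek count, seek time and
playtime come from the message; the bitrate and the sequential read
throughput are made-up figures (assumptions) that only serve to put a
number on "much longer".

```python
# Figures from the message.
videos = 100
seek_ms = 10
minutes_per_video = 3

# Hypothetical figures, not from the thread.
bitrate_bits_per_s = 1_000_000      # assumed average stream bitrate
read_bytes_per_s = 50_000_000       # assumed sequential read throughput

# With back pointers: one seek per info packet.
seek_total_s = videos * seek_ms / 1000

# Without pointers: the whole file has to be read linearly.
playtime_s = videos * minutes_per_video * 60
file_bytes = playtime_s * bitrate_bits_per_s // 8
linear_total_s = file_bytes / read_bytes_per_s

print(f"with pointers:    {seek_total_s} s")
print(f"playtime:         {playtime_s // 60} min, about {file_bytes} bytes")
print(f"without pointers: {linear_total_s} s")
```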

and yet another random thought
if we now repeat the last X different info packets with each info packet
similarly to the hash table mess then we could read all with log n seeks
(and n log n space instead of n for n info packets) while the complexity
would be very low on the demuxer side, just linear search + follow the
pointer
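
A small sketch of the seek and space counts for this variant, under the
same assumptions as before: X is the largest power of 2 dividing n, packet
n is taken to carry copies of packets n-X+1 .. n, and the pointer leads to
packet n-X. The function names are hypothetical.

```python
def lowbit(n):
    """Largest power of 2 that divides n (n >= 1)."""
    return n & -n

def read_all(n_packets):
    """Walk back from the last info packet.

    Returns (seeks, sorted list of info packet numbers recovered).
    """
    n, seeks, seen = n_packets, 0, []
    while n > 0:
        x = lowbit(n)
        seeks += 1
        # Packet n is assumed to repeat packets n-x+1 .. n.
        seen.extend(range(n - x + 1, n + 1))
        n -= x  # follow the pointer to the Xth previous packet
    return seeks, sorted(seen)

def total_copies(n_packets):
    """Number of info packet copies stored in the whole file."""
    return sum(lowbit(n) for n in range(1, n_packets + 1))

seeks, seen = read_all(100)
print("seeks:", seeks)
print("all recovered:", seen == list(range(1, 101)))
print("copies stored:", total_copies(100))
```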

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is
Info packets in NUT stream (spec bugs?)
user name
2006-11-22 01:13:27
On Tue, Nov 21, 2006 at 10:35:57PM +0100, Michael Niedermayer wrote:
> On Tue, Nov 21, 2006 at 02:32:14PM -0500, Rich Felker wrote:
> > On Tue, Nov 21, 2006 at 06:32:03PM +0100, Michael Niedermayer wrote:
> > > what is excessively large? whats idiotic? thats not a good way to specify
> > > the valid range of a value
> > > >32bit is idiotic for many people iam pretty sure, still its not enough
> > > if your input data is in nanosecond precision ...
> > > 
> > > and its neither reasonable to assume that everyone has to spend an hour
> > > per field to guess what range of values would have to be supported to handle
> > > all non idiotic cases
> > 
> > My idea is that what's idiotic changes with time. That's why we use
> > vlc rather than fixed-size fields. Unlike other potential areas of
> > abuse in the spec, I don't see any realistic issue with people
> > intentionally choosing initial timestamps that will cause trouble with
> > some implementations. Generally the only things people would choose
> > for starting timestamps would be 0, the end timestamp of another file,
> > or the current unix time (seconds since the epoch). All of these will
> > fit ok in 64bit as long as a sane timebase is used.
> 
> for rtp the start timestamp is recommended to be random() IIRC and for
> transcoding people might choose to keep the start timestamp, also when
> taking some seconds since x and converting that to an "insane" timebase
> problems will happen ...

Just wanted to note one more possible scenario for this - stupid people
transcoding directly with pts from mpeg (i've seen mpegs start at
timestamps of 24352...)

- ods15
Info packets in NUT stream (spec bugs?)
user name
2006-11-22 01:37:42
Hi

On Wed, Nov 22, 2006 at 03:13:27AM +0200, Oded Shimon wrote:
> On Tue, Nov 21, 2006 at 10:35:57PM +0100, Michael Niedermayer wrote:
> > On Tue, Nov 21, 2006 at 02:32:14PM -0500, Rich Felker wrote:
> > > On Tue, Nov 21, 2006 at 06:32:03PM +0100, Michael Niedermayer wrote:
> > > > what is excessively large? whats idiotic? thats not a good way to specify
> > > > the valid range of a value
> > > > >32bit is idiotic for many people iam pretty sure, still its not enough
> > > > if your input data is in nanosecond precision ...
> > > > 
> > > > and its neither reasonable to assume that everyone has to spend an hour
> > > > per field to guess what range of values would have to be supported to handle
> > > > all non idiotic cases
> > > 
> > > My idea is that what's idiotic changes with time. That's why we use
> > > vlc rather than fixed-size fields. Unlike other potential areas of
> > > abuse in the spec, I don't see any realistic issue with people
> > > intentionally choosing initial timestamps that will cause trouble with
> > > some implementations. Generally the only things people would choose
> > > for starting timestamps would be 0, the end timestamp of another file,
> > > or the current unix time (seconds since the epoch). All of these will
> > > fit ok in 64bit as long as a sane timebase is used.
> > 
> > for rtp the start timestamp is recommended to be random() IIRC and for
> > transcoding people might choose to keep the start timestamp, also when
> > taking some seconds since x and converting that to an "insane" timebase
> > problems will happen ...

additional note, i dont know if there really is a problem in the rtp case
as i dont know rtp well enough, maybe the timestamps have few enough bits
so this wouldnt be a problem but still a "choose a sane value" is not a
good way to specify the valid range of something ...


> 
> Just wanted to note one more possible scenario for this - stupid people
> transcoding directly with pts from mpeg (i've seen mpegs start at
> timestamps of 24352...)

please add a note to the spec saying that people MUST NOT use timestamps
from mpeg* blindly while transcoding as mpeg can have timestamp
discontinuities ...
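
One possible way to avoid copying such timestamps blindly is sketched
below. It is an assumption-laden illustration, not something the spec or
libnut prescribes: `rebase` and `max_jump` are hypothetical, the input is
assumed to be non-decreasing within each continuous segment, and a
discontinuity is bridged by reusing the last regular frame delta.

```python
def rebase(pts_list, max_jump):
    """Map source timestamps to a continuous sequence starting at 0.

    A step that goes backwards or exceeds `max_jump` ticks is treated
    as a discontinuity and replaced by the last regular step.
    """
    out = []
    offset = 0
    prev_src = None
    last_delta = 0
    for pts in pts_list:
        if prev_src is None:
            offset = -pts  # first timestamp becomes 0
        else:
            delta = pts - prev_src
            if delta < 0 or delta > max_jump:
                # Discontinuity: shift so the output advances by last_delta.
                offset += last_delta - delta
            else:
                last_delta = delta
        out.append(pts + offset)
        prev_src = pts
    return out

# Source pts in 90 kHz ticks, starting at 24352 and jumping back to 900.
src = [24352, 27952, 31552, 900, 4500]
print(rebase(src, max_jump=90000))
```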

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is
Info packets in NUT stream (spec bugs?)
user name
2006-11-22 10:40:48
Hi

On Tue, Nov 21, 2006 at 10:35:57PM +0100, Michael Niedermayer wrote:
[...]
> > > > It is always possible via linear search. If the demuxer SHOULD NOT
> > > > search for them then we should not go out of our way to make it easy
> > > > to search... Just my 2¢...
> > > 
> > > well there really are 2 cases IMHO
> > > A. midstream info packets are not allowed in normal nut files
> > > B. midstream info packets are allowed in normal nut files
> > > 
> > > for A i agree that the pointers and repeating shouldnt be required, there may
> > > be other reasons though why repeating the info makes sense ...
> > > 
> > > for B i dont agree, simply because if info is there, then there are cases
> > > where the user will want to have that info, think of some capture of odeds
> > > radio stream, its not unlikely to think that the user would want to seek to
> > > a specific song (she knows the song title but not the time to seek to)
> > 
> > Arrg, this is what I was saying way back about info streams and I got
> > flamed to death. Anyway such a file has no index already, so it's not
> > meant to be searchable by index, and searching by chapter _name_ does
> > not work with binary search so linear search is the natural
> > requirement anyway.
> 
> it is possible to extend info packets so that O(log n) search for
> names can be done, i'll explain how (and no iam not saying i actually propose
> doing that, its just a random thought)
> 1. add a hash table to every info packet which contains pointers to info
>    packets for all names in the X previous info packets
> 2. X is the largest power of 2 which divides n which is the number of the
>    current info packet, (n=5 -> X=1, n=6 -> X=2, n=8 -> X=8)
> 3. add a pointer to the Xth previous info packet

another idea, if we choose Y=sqrt( number of info packets )
and then duplicate the last Y info packets in every Yth info packet
and give the small info packets a pointer to the last and the large ones
a pointer to the last large one, then it should look like:
1 2 3 1234 5 6 7 5678 9 A B 9ABC D E

and needs O(n) space (actually exactly 2*n) and O(sqrt(n)) seeks for
finding an info packet

the same can be done with the xth root, O(n^(1/x)) time and x*n overhead
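
The layout above can be reproduced with a short sketch. The names
`layout`, `fmt` and `seeks_to_read_all` are hypothetical, and the sketch
follows the diagram literally: every Yth info packet is a large one that
holds the last Y infos (including its own), all others are small.

```python
import math

def layout(n, y):
    """Return the stream as a list of packets, each a list of info numbers."""
    stream = []
    for k in range(1, n + 1):
        if k % y == 0:
            stream.append(list(range(k - y + 1, k + 1)))  # large packet
        else:
            stream.append([k])                            # small packet
    return stream

def fmt(stream):
    """Render info numbers as single characters, as in the diagram."""
    digits = "0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return " ".join("".join(digits[i] for i in pkt) for pkt in stream)

def seeks_to_read_all(n, y):
    """Seeks needed from the end: trailing small packets, then each large one."""
    return n % y + n // y

print(fmt(layout(14, 4)))
print("seeks for n=14, Y=4:", seeks_to_read_all(14, 4))

n = 16
y = math.isqrt(n)
print("stored infos for n=16:", sum(len(pkt) for pkt in layout(n, y)))
```

In this literal reading the stored count comes out slightly below 2*n,
because the info of a large packet is not also stored as a small one.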

[...]
-- 
Michael     GnuPG fingerprint: 9FF2128B147EF6730BADF133611EC787040B0FAB

In the past you could go to a library and read, borrow or copy any book
Today you'd get arrested for mere telling someone where the library is