Thread: Why does "-" read from stdin?




Why does "-" read from stdin?
user name
2007-02-01 20:50:55
Hi,

Why does xmlReadFile read from stdin if "-" is specified as a filename,
and is there any way to disable this behaviour?

It seems that this will create potential bugs wherever a program passes
a filename to libxml2 without checking it first; if the filename is "-"
then libxml2 will attempt to read from stdin and the program may block
indefinitely.

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
xmlgnome.org
http://mail.gnome.org/mailman/listinfo/xml

Re: Why does "-" read from stdin?
user name
2007-02-02 01:16:35
Michael Day wrote:
> Hi,
> 
> Why does xmlReadFile read from stdin if "-" is specified as a filename,
> and is there any way to disable this behaviour?

It's fairly typical behaviour on Unix systems (may even be POSIX, not
sure). And no, there isn't.

> It seems that this will create potential bugs wherever a program passes
> a filename to libxml2 without checking it first; if the filename is "-"
> then libxml2 will attempt to read from stdin and the program may block
> indefinitely.

Sure - but so would CON under Windows, or /dev/stdin, or /dev/ttys7, ...
Reading input in a blocking condition is not a problem for libxml2; it's
up to the users to prevent such cases if it's important for them.
If you're using the library, you could always freopen stdin from
/dev/null (or dup fd 0, not sure if libxml2 uses the stdin FILE*) - that
would prevent libxml2 from blocking if someone passed "-" (but the other
cases listed above would still cause trouble).
Or you could filter user-supplied names and reject "-" and "/dev/*",
which should catch most cases of blocking input.
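
The two workarounds above can be sketched in C roughly as follows. This
is only a sketch: looks_like_blocking_input() and detach_stdin() are
hypothetical helper names, not libxml2 functions, and whether redirecting
the stdin stream is enough depends on how libxml2 reads standard input
internally, which is left open above. The /dev/null path assumes a
Unix-like system.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the name is "-" or starts with
 * "/dev/", the two patterns suggested above for rejection; 0 otherwise.
 * The caller must pass a non-NULL string. */
static int looks_like_blocking_input(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "-") == 0)
        return 1;
    return strncmp(name, "/dev/", 5) == 0;
}

/* Hypothetical helper: reopens the stdin stream on /dev/null so that a
 * read from it reaches end-of-file immediately instead of blocking.
 * Returns 0 on success, -1 on failure. */
static int detach_stdin(void)
{
    return freopen("/dev/null", "r", stdin) != NULL ? 0 : -1;
}
```

A caller would test looks_like_blocking_input(name) before handing the
name to xmlReadFile(), or call detach_stdin() once at startup. Note that
the filter also rejects a regular file that happens to be named "-".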

Re: Why does "-" read from stdin?
user name
2007-02-02 02:59:09
Hi Tim,

> It's fairly typical behaviour on Unix systems (may even be POSIX, not
> sure). And no, there isn't.

It's typical behaviour for programs, not libraries; e.g. fopen("-") does
not return stdin. The issue I'm concerned with is the way that an API
function taking a filename treats one value specially, requiring a check
and escaping by the caller, when it seems that aliasing "-" to mean
stdin is a decision to be made at a higher level, for example when
processing command line arguments.

> Sure - but so would CON under Windows, or /dev/stdin, or /dev/ttys7, ...

These situations are different in that you can actually have a regular
file called "-", and the xmlReadFile() function won't load it. This is
different from calling xmlReadFile() on a filename that turns out to be
bound to a socket or a pipe or some other blocking input (or an NFS
filesystem for that matter). Blocking after trying to load "/dev/stdin"
is not surprising, blocking while trying to load "-" is.

The header file and the doxygen comments do not even mention that "-" is
treated specially; xmlParseFile() takes an argument called filename and
xmlReadFile() takes an argument called URL, when the actual meaning of
the value is more subtle: URL/filename unless the value is "-", in which
case it means stdin. Again, this requires every caller to check for "-"
and substitute "./-".
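
A minimal sketch of that caller-side check, assuming the application
always means a regular file. literal_filename() is a hypothetical helper,
not part of libxml2; it only rewrites the one special value and leaves
every other name untouched.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper: returns a name that refers to a regular file even
 * when the input is a lone "-". That value is rewritten to "./-", which
 * names the same file in the current directory without matching the
 * special case. The caller must pass a non-NULL string. */
static const char *literal_filename(const char *name)
{
    assert(name != NULL);
    if (strcmp(name, "-") == 0)
        return "./-";
    return name;
}
```

The result would then be passed in place of the original name, for
example xmlReadFile(literal_filename(name), NULL, 0).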

To be honest the special treatment for "-" seems more like a hack to
simplify xmllint than a sensible API choice for a generic XML library. I
understand that now is probably too late to change this kind of stuff,
as libxml2 was frozen in stone years ago. But perhaps it's not too late
to document it. How about changing:

/**
  * xmlParseFile:
  * filename:  the filename

to this:

  * filename:  the filename, or "-" to parse from standard input

similarly for xmlReadFile:

  * filename:  a file or URL, or "-" to parse from standard input

That would at least place a warning sign in the documentation for
application developers to be aware of what the argument really means.

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com

Re: Why does "-" read from stdin?
user name
2007-02-02 03:14:19
Michael Day wrote:
> Hi Tim,
> 
>> It's fairly typical behaviour on Unix systems (may even be POSIX, not
>> sure). And no, there isn't.
>
> It's typical behaviour for programs, not libraries; e.g. fopen("-") does
> not return stdin. The issue I'm concerned with is the way that an API
> function taking a filename treats one value specially, requiring a check
> and escaping by the caller, when it seems that aliasing "-" to mean
> stdin is a decision to be made at a higher level, for example when
> processing command line arguments.

Fair point.

>> Sure - but so would CON under Windows, or /dev/stdin, or /dev/ttys7, ...
>
> These situations are different in that you can actually have a regular
> file called "-", and the xmlReadFile() function won't load it. This is
> different from calling xmlReadFile() on a filename that turns out to be
> bound to a socket or a pipe or some other blocking input (or an NFS
> filesystem for that matter). Blocking after trying to load "/dev/stdin"
> is not surprising, blocking while trying to load "-" is.
>
> The header file and the doxygen comments do not even mention that "-" is
> treated specially; xmlParseFile() takes an argument called filename and
> xmlReadFile() takes an argument called URL, when the actual meaning of
> the value is more subtle: URL/filename unless the value is "-", in which
> case it means stdin. Again, this requires every caller to check for "-"
> and substitute "./-".
>
> To be honest the special treatment for "-" seems more like a hack to
> simplify xmllint than a sensible API choice for a generic XML library. I
> understand that now is probably too late to change this kind of stuff,
> as libxml2 was frozen in stone years ago. But perhaps it's not too late

Well if it's not documented, it's not set in stone - but I'm inclined to
agree it's probably safest to just document it.

> to document it. How about changing:
> 
> /**
>   * xmlParseFile:
>   * filename:  the filename
> 
> to this:
> 
>   * filename:  the filename, or "-" to parse from standard input
>
> similarly for xmlReadFile:
>
>   * filename:  a file or URL, or "-" to parse from standard input
>
> That would at least place a warning sign in the documentation for
> application developers to be aware of what the argument really means.
