Thread: Requested Expert Review Request: draft-ietf-mediactrl-vxml-00




Requested Expert Review Request: draft-ietf-mediactrl-vxml-00
2007-12-17 14:32:06
I was asked by Dean to provide a SIP Expert Review of this draft, with
special note to section 2.3. In general I like this draft. It begins to
exploit the power of sip, and to unify it with other protocols. I did
find some issues, all of which should be fairly straightforward to address.

I've broken my comments down by section. In many cases I have included
snips from the draft for reference and followed those with pertinent
comments.

	Thanks,
	Paul

Section 2.1:

The ABNF of the URI seems excessively rigid in the ordering of
parameters - first the init-parameters, then the vxml-parameters, then
any other uri-parameters. It also doesn't note that
'uri-parameters' is defined in 3261.

I would suggest something like:

   dialog-parameters = *(";" dialog-parameter)

   dialog-parameter  = init-param /
                       vxml-param /
                       uri-parameter ; defined in RFC 3261

   init-param        = (dialog-param /
                        maxage-param /
                        maxstale-param /
                        method-param /
                        postbody-param)

   vxml-param        = vxml-keyword "=" vxml-value

That achieves the same effect but allows all the params in any order.
This is almost a nit, but the rigid ordering is likely to be a source of
interop problems.
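
To illustrate how little a receiver needs a fixed order, here is a minimal
Python sketch that sorts Request-URI parameters into init parameters and
everything else, wherever they appear. The function name and the keyword
strings (voicexml, maxage, maxstale, method, postbody) are assumptions made
for illustration; they are not normative text from the draft.

```python
# Hypothetical keyword set for the init parameters (assumed names).
INIT_PARAMS = {"voicexml", "maxage", "maxstale", "method", "postbody"}


def split_dialog_parameters(param_string):
    """Split ';name=value' pairs into (init, other) dicts, ignoring order.

    Splitting on ';' is safe only because parameter values are expected to
    arrive with ';' already escaped as %3b.
    """
    init, other = {}, {}
    for item in param_string.split(";"):
        if not item:
            continue  # skip the empty piece before the leading ';'
        name, _, value = item.partition("=")
        target = init if name.lower() in INIT_PARAMS else other
        target[name.lower()] = value
    return init, other
```

Two URIs that carry the same parameters in a different order produce the
same result, which is the interop property the relaxed ABNF is after.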

I'm confused about postbody:

    postbody:  Used to set the application/x-www-form-urlencoded encoded
       [HTML4] HTTP body for "post" requests (or is otherwise ignored).
       The postbody value is the prepared application/
       x-www-form-urlencoded content, subsequently URL-encoded (see note
       below).

...

    Note: Special characters in Request-URI parameter values need to be
    URL-encoded as required by the SIP URI syntax, for example '?' (%3f),
    '=' (%3d), and ';' (%3b).  The VoiceXML Media Server MUST therefore
    unescape Request-URI parameter values before making use of them or
    exposing them to running VoiceXML applications.  It is important that
    the VoiceXML Media Server only unescape the parameter values once
    since the desired VoiceXML URI value could itself be URL encoded, for
    example.  When a postbody is included, its entire content including
    any line breaks (represented by a CR LF pair) is encoded as a single
    parameter value following the above rules (such that the line breaks
    would be replaced by '%0D%0A', for example).

[HTML4] says:

    application/x-www-form-urlencoded

    This is the default content type. Forms submitted with this content
    type must be encoded as follows:

    1. Control names and values are escaped. Space characters are
       replaced by `+', and then reserved characters are escaped as
       described in [RFC1738], section 2.2: Non-alphanumeric characters
       are replaced by `%HH', a percent sign and two hexadecimal digits
       representing the ASCII code of the character. Line breaks are
       represented as "CR LF" pairs (i.e., `%0D%0A').
    2. The control names/values are listed in the order they appear in
       the document. The name is separated from the value by `=' and
       name/value pairs are separated from each other by `&'.

The interaction between these two escaping rules seems potentially
confusing. I *think* when this is all put together it means that the
body must first be encoded according to the [HTML4] section above. At
that point it will already be almost conformant to the 3261 syntax of a
token, except for the use of '&'. Then it needs to be encoded again,
which will take care of the ampersands, but which will re-encode the '%'
characters of the first encoding.

Exactly what, if anything, I would recommend changing depends on whether
I understood what is expected. I guess I might just recommend that you
clarify further. (Perhaps I'm just being dense. If so please just tell
me so.)

Section 2.2:

    The Application Server SHOULD insert its own URI in the Record-Route
    header so that it remains in the signaling path for subsequent
    signaling related to the session.  This is of particular importance
    for call transfers so that upstream Application Servers or proxy
    servers see signaling originating from the Application Server and not
    the VoiceXML Media Server itself.

I don't understand the purpose of the above. The SHOULD strength of this
requirement suggests to me an assumption of a particular operating
environment. In the general case, why should this be more than MAY strength?

Section 2.3:

IMO the use of a media-less session is an entirely valid sip usage. The
only concern I might potentially have is if it were to catch some UA
unaware, because some UAs just aren't prepared to handle this case. But
the recommended usage here always puts the choice of doing this in the
hands of *other* UA, not the media server. So I see no problem.

I do find it disconcerting that the initial invite and subsequent
reinvites are handled in different ways. For the initial one you stall
the VXML awaiting a media stream, but on subsequent ones you assume the
absence of a stream is equivalent to having a stream that doesn't send
anything. Why isn't the behavior consistent in these cases? If you need
to support both behaviors, then it might be better to use explicit and
unique signaling for each. For instance, you might define that a media
stream with a=inactive or a=sendonly (from the client's perspective -
client putting media server "on hold") could be treated as the absence
of input but the absence of the stream means you should wait for a
stream to be negotiated.

Section 2.4:

I don't feel qualified to comment on the validity of the mappings of
History-Info. It might be good to get somebody else who knows it well to
comment on that.

       In addition, the array's toString() function returns the full SIP
       Request-URI.  For example, assuming a Request-URI of sip:dialog@
       example.com;voicexml=http://example.com;obj={"x":1,"y":true} then

I don't believe the above URI is valid. The ',' '{' and '}' aren't
syntactically correct in a URI pvalue. You would need to escape them.
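
As a rough illustration of the escaping in question, the Python sketch below
percent-encodes a JSON value so that only characters permitted in an RFC 3261
paramchar remain unescaped. The helper name escape_pvalue and the constant
PVALUE_SAFE are hypothetical; the safe set reflects my reading of the
param-unreserved and mark productions in RFC 3261.

```python
from urllib.parse import quote, unquote

# Characters allowed unescaped in a SIP URI parameter value, besides
# letters and digits: param-unreserved ("[]/:&+$") plus mark ("-_.!~*'()").
PVALUE_SAFE = "[]/:&+$" + "-_.!~*'()"


def escape_pvalue(value):
    """Percent-encode everything that is not a legal paramchar."""
    return quote(value, safe=PVALUE_SAFE)


escaped = escape_pvalue('{"x":1,"y":true}')
# The braces, double quotes and comma are escaped; the colons are not.
assert unquote(escaped) == '{"x":1,"y":true}'
```

A single unquote() on the receiving side restores the original JSON text.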

Section 2.6.2:

IIUC, message (1) contains no offer, (5) contains an offer with media,
and (6) accepts the call but rejects the media. If so, then (9) will
most likely be invalid. To make it valid, the o-line from (8) needs to
be replaced with one consistent to that used in (6), with the version
number incremented. And if (5) has more m-lines than (8), then (9) needs
to be padded with extra (rejected) m-lines.
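
The fix-up described above could be sketched roughly as follows in Python.
The function name rebase_offer is hypothetical, and the padding m-line
("m=audio 0 RTP/AVP 0") is a placeholder assumption: a real implementation
would keep each rejected stream in its original slot with its original media
type, which this sketch does not attempt.

```python
def rebase_offer(new_sdp, last_origin, min_m_lines):
    """Build the SDP for (9) from the SDP received in (8).

    new_sdp      SDP text received in (8)
    last_origin  the o= line this UA sent earlier in the dialog, as in (6)
    min_m_lines  number of m-lines already established by (5)
    """
    # o=<username> <sess-id> <sess-version> <nettype> <addrtype> <address>
    fields = last_origin[len("o="):].split(" ")
    fields[2] = str(int(fields[2]) + 1)  # increment the session version
    new_origin = "o=" + " ".join(fields)

    lines = [new_origin if line.startswith("o=") else line
             for line in new_sdp.splitlines()]

    # Pad with rejected (port 0) m-lines so the m-line count never shrinks.
    have = sum(1 for line in lines if line.startswith("m="))
    lines += ["m=audio 0 RTP/AVP 0"] * max(0, min_m_lines - have)
    return "\r\n".join(lines) + "\r\n"
```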

Section 5.2:

    On receipt of the REFER request, the VoiceXML Media Server MUST issue
    a provisional response, 100 Trying.  The 202 Accepted response
    indicates that the VoiceXML document has been fetched and parsed
    correctly.  The VoiceXML Media Server proceeds to place the outbound
    INVITE and will execute the application after the ACK is sent.

The rules of RFC 4320 need to be followed here. REFER is a non-invite
transaction and so the timing of the 100 must be as specified in 4320.

In the call flow, the sending of the initial NOTIFY before the 202 for
the REFER, and especially before determining if REFER is going to
succeed or fail, is at best unusual and almost certainly incorrect.
Sending the NOTIFY and then sending a failure response would certainly
be incorrect.

I think you have two choices:

- wait until the get is complete before sending the NOTIFY, and probably
send it after the 202.

- send a 202 for the REFER before doing the GET. Inform of a GET failure
via a NOTIFY.

Section 6.3:

In the call flow I think you probably need another NOTIFY between
messages (6) and (7). It's potentially too long until (13).



Re: Requested Expert Review Request: draft-ietf-mediactrl-vxml-00
2007-12-21 18:17:36
Hi Paul,

Thanks a lot for the review. Mark and I have discussed each of these and I've responded to the comments below.

Cheers,

Dave

[snip]

Section 2.1:

The ABNF of the URI seems excessively rigid in the ordering of
parameters - first the init-parameters, then the vxml-parameters, then
any other uri-parameters. It also doesn't note that
'uri-parameters' is defined in 3261.


DB> Good observation - ordering of the parameters was never intended. Your proposal looks good. However, while looking over this again, we found another issue which crept in late in the process related to the addition of JSON support. Specifically, the problem is that it is impossible to differentiate where vxml-params end and uri-parameters begin and hence which parameter values to interpret as JSON values and which as plain strings. We propose a change to limit JSON values to just the named VoiceXML session variables 'ccxml' and 'aai' (these were the two variables for which JSON support was originally sought).

  dialog-parameters = *(";" dialog-parameter)

  dialog-parameter  = init-param / url-parameter ; defined in [RFC 3261]

  init-param        = ";" (dialog-param /
                           maxage-param /
                           maxstale-param /
                           method-param /
                           postbody-param /
                           ccxml-param /
                           aai-param)

  dialog-param      = "voicexml=" vxml-url ; vxml-url follows the URI
                                           ; syntax defined in [RFC3986]

  maxage-param      = "maxage=" 1*DIGIT

  maxstale-param    = "maxstale=" 1*DIGIT

  method-param      = "method=" ("get" / "post")

  postbody-param    = "postbody=" token

  ccxml-param       = "ccxml=" json-value

  aai-param         = "aai=" json-value

  json-value        = false /
                      null /
                      true /
                      object /
                      array /
                      number /
                      string ; see RFC4627


I'm confused about postbody:

[snip]

The interaction between these two escaping rules seems potentially
confusing. I *think* when this is all put together it means that the
body must first be encoded according to the [HTML4] section above. At
that point it will already be almost conformant to the 3261 syntax of a
token, except for the use of '&'. Then it needs to be encoded again,
which will take care of the ampersands, but which will re-encode the '%'
characters of the first encoding.

[snip]

DB> Your interpretation of the intent is correct. The postbody value is an application/x-www-form-urlencoded string which we subsequently re-encode (the latter step affecting each & and % character). We will clarify the text to say that this is the case. Moreover, we'll be more specific and call out which parameter values are subject to this encoding step, namely vxml-url, the postbody token, and the json-value.
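
A minimal Python sketch of the two-pass encoding discussed here, using only
the standard urllib.parse module. The function names and sample field values
are invented for illustration. The second pass escapes every character
outside letters, digits and "_.-~", which is more conservative than the
draft strictly requires but round-trips cleanly with a single unescape.

```python
from urllib.parse import parse_qs, quote, unquote, urlencode


def encode_postbody(fields):
    """Form-encode the fields, then escape the result as one parameter value."""
    form_body = urlencode(fields)      # pass 1: application/x-www-form-urlencoded
    return quote(form_body, safe="")   # pass 2: '&' -> %26, '%' -> %25, and so on


def decode_postbody(param_value):
    """Unescape exactly once, then parse the recovered form body."""
    form_body = unquote(param_value)
    return form_body, parse_qs(form_body)


param = encode_postbody({"name": "A B", "note": "50% & more"})
form_body, parsed = decode_postbody(param)
# Unescaping a second time would corrupt the body, which is why the draft
# insists that parameter values be unescaped only once.
assert unquote(form_body) != form_body
```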

Section 2.2:

   The Application Server SHOULD insert its own URI in the Record-Route
   header so that it remains in the signaling path for subsequent
   signaling related to the session.  This is of particular importance
   for call transfers so that upstream Application Servers or proxy
   servers see signaling originating from the Application Server and not
   the VoiceXML Media Server itself.

I don't understand the purpose of the above. The SHOULD strength of this
requirement suggests to me an assumption of a particular operating
environment. In the general case, why should this be more than MAY strength?

DB> Actually, this paragraph is quite problematic. We shouldn't be mandating how the AS behaves (and indeed a B2BUA doesn't Record-Route anyway). This paragraph is also a hangover from before we recommended against using VoiceXML transfer features with application servers. We propose to delete it.

Section 2.3:

IMO the use of a media-less session is an entirely valid sip usage. The
only concern I might potentially have is if it were to catch some UA
unaware, because some UAs just aren't prepared to handle this case. But
the recommended usage here always puts the choice of doing this in the
hands of *other* UA, not the media server. So I see no problem.

I do find it disconcerting that the initial invite and subsequent
reinvites are handled in different ways. For the initial one you stall
the VXML awaiting a media stream, but on subsequent ones you assume the
absence of a stream is equivalent to having a stream that doesn't send
anything. Why isn't the behavior consistent in these cases? If you need
to support both behaviors, then it might be better to use explicit and
unique signaling for each. For instance, you might define that a media
stream with a=inactive or a=sendonly (from the client's perspective -
client putting media server "on hold") could be treated as the absence
of input but the absence of the stream means you should wait for a
stream to be negotiated.

DB> We're conscious of the inconsistency but we made a judgement call to live with it. Preparing a session is hugely valuable for latency hiding. However, supporting this kind of pause/resume feature once the dialog is executing is not practical: VoiceXML has no concept of such semantics, and it is difficult to implement, especially when external speech recognition and synthesis systems are used and operate on their own timers without pausing capabilities.

[snip]
      In addition, the array's toString() function returns the full SIP
      Request-URI.  For example, assuming a Request-URI of sip:dialog@
      example.com;voicexml=http://example.com;obj={"x":1,"y":true} then

I don't believe the above URI is valid. The ',' '{' and '}' aren't
syntactically correct in a URI pvalue. You would need to escape them.

DB> Good catch, will fix

Section 2.6.2:

IIUC, message (1) contains no offer, (5) contains an offer with media,
and (6) accepts the call but rejects the media. If so, then (9) will
most likely be invalid. To make it valid, the o-line from (8) needs to
be replaced with one consistent to that used in (6), with the version
number incremented. And if (5) has more m-lines than (8), then (9) needs
to be padded with extra (rejected) m-lines.

DB> Good catch. The intent here is that the similar SDP messages share the same m- and a- lines only. We propose to update the diagram, perhaps using a prime to denote a derivative SDP message, e.g. [offer2'], and to clarify this in the text.


Section 5.2:

   On receipt of the REFER request, the VoiceXML Media Server MUST issue
   a provisional response, 100 Trying.  The 202 Accepted response
   indicates that the VoiceXML document has been fetched and parsed
   correctly.  The VoiceXML Media Server proceeds to place the outbound
   INVITE and will execute the application after the ACK is sent.

The rules of RFC 4320 need to be followed here. REFER is a non-invite
transaction and so the timing of the 100 must be as specified in 4320.

DB> Will fix

In the call flow, the sending of the initial NOTIFY before the 202 for
the REFER, and especially before determining if REFER is going to
succeed or fail, is at best unusual and almost certainly incorrect.
Sending the NOTIFY and then sending a failure response would certainly
be incorrect.

I think you have two choices:

- wait until the get is complete before sending the NOTIFY, and probably
send it after the 202.

- send a 202 for the REFER before doing the GET. Inform of a GET failure
via a NOTIFY.

DB> There is currently an ongoing thread on the list about whether we remove this feature (and hence section) altogether. For the sake of argument, I'll assume it stays, in which case we think option 2 is the most conventional (especially since most systems respond to the REFER instantaneously). A GET failure would be conveyed by a NOTIFY with a 500 Server Internal Failure sipfrag message (same status used for the INVITE case).
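
The ordering in option 2 can be sketched in Python as below. This is only a
simulation that returns the messages in the order they would be sent; the
function name, the message strings, and the "100 Trying" sipfrag in the
success branch are assumptions, and no real SIP stack API is implied.

```python
def handle_refer(fetch_document):
    """Return, in order, the messages emitted for one REFER under option 2."""
    sent = ["202 Accepted"]  # answer the REFER before attempting the GET
    try:
        fetch_document()     # HTTP GET of the VoiceXML document
    except Exception:
        # Failure is reported in a NOTIFY, never as a late error response.
        # The reason phrase follows the wording used in this reply.
        sent.append("NOTIFY sipfrag: SIP/2.0 500 Server Internal Failure")
        return sent
    sent.append("NOTIFY sipfrag: SIP/2.0 100 Trying")  # assumed success body
    sent.append("INVITE (outbound call)")
    return sent
```

Because the 202 is always first in the list, a NOTIFY can never precede a
final response to the REFER, which is the problem raised in the review.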
 

Section 6.3:

In the call flow I think you probably need another NOTIFY between
messages (6) and (7). It's potentially too long until (13).

DB> This is correct but we note in the text "(provisional responses and NOTIFY
   messages corresponding to provisional responses have been omitted for
   clarity)"
