Thread: wxInputStream and multibyte character sets
wxInputStream and multibyte character sets
United Kingdom | 2007-05-12 10:40:05
Hi,
I have a problem with module Wx::Perl::ProcessStream.
This reads the STDOUT from an external process executed
using Wx::Process and Wx::ExecuteCommand.
To do this, it does a 'readline' on the wxInputStream
available via Wx::Process.
My problem is that the implementation of READLINE in
Wx::InputStream works as follows:
    read a char from the stream
    append char to a wxString
    return wxString if char == '\n'
This appears not to work if the output stream from the
external process is, for example, UTF-8.
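To illustrate why a byte-at-a-time conversion can go wrong, here is a minimal Perl sketch. It assumes each byte is widened to one character on its own (as a Latin-1 style conversion would do), which may not be exactly what Wx::InputStream does internally; the variable names are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# "à\n" encoded as UTF-8 is three bytes: 0xC3 0xA0 0x0A.
my $bytes = encode('UTF-8', "\x{E0}\n");

# Per-byte approach (assumed behaviour): every byte becomes one character,
# so the two-byte sequence for "à" turns into two unrelated characters.
my $per_byte = '';
$per_byte .= chr(ord($_)) for split //, $bytes;

# Whole-buffer approach: collect the bytes first, then decode once.
my $decoded = decode('UTF-8', $bytes);

printf "per-byte: %d chars, decoded: %d chars\n",
    length($per_byte), length($decoded);
```

The per-byte string ends up one character longer than the decoded one, because the multibyte sequence was split before it could be interpreted.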
I *think* what perhaps should happen is
    read a char from the stream
    add it to a charbuffer
    if char == '\n' {
        convert charbuffer to a wxString ( method determined by
        wxWidgets unicode/ansi build macros )
        return wxString
    }
Alas, my C is too poor to implement / test this (I've tried
:-( ) - I suspect it would be v.simple for anyone with
adequate C skills.
I have a workaround for the Wx::Perl::ProcessStream module:
I have stopped using readline and instead do 'read's on the
wxInputStream in a perl loop with localised byte mode.
This seems to work.
I'm not sure if Wx::InputStream::READLINE needs changing or
not. Any thoughts?
Mark
-------------------------------------------------------------------------
This SF.net email is sponsored by DB2 Express
Download DB2 Express C - the FREE version of DB2 express and take
control of your XML. No limits. Just data. Click to get it now.
http://sourceforge.net/powerbar/db2/
_______________________________________________
wxperl-users mailing list
wxperl-users lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/wxperl-users
Re: wxInputStream and multibyte character sets
2007-05-13 11:18:47
On Sat, 12 May 2007 16:40:05 +0100
Mark Dootson <mark.dootson znix.com> wrote:
Hi,
> I have a problem with module Wx::Perl::ProcessStream.
> This reads the STDOUT from an external process executed
> using Wx::Process and Wx::ExecuteCommand.
> This appears not to work if the output stream from the
> external process is, for example, UTF-8.
I see.
> I *think* what perhaps should happen is
>
>     read a char from the stream
>     add it to a charbuffer
>     if char == '\n' {
>         convert charbuffer to a wxString ( method
>         determined by wxWidgets unicode/ansi build macros )
>         return wxString
>     }
Seems reasonable.
> I'm not sure if Wx::InputStream::READLINE needs
> changing or not. Any thoughts?
I believe (at least) an option to do so would be a good idea. Adding an
optional 'encoding' parameter is likely the best option.
For a test case, is
#!/usr/bin/perl -w
use Encode;
print Encode::encode_utf8( "à\n" );
sleep 1;
print Encode::encode_utf8( "à\n" );
sleep 1;
print Encode::encode_utf8( "à\n" );
print Encode::encode_utf8( "à\n" );
a good enough test case?
Regards.
Mattia
Re: wxInputStream and multibyte character sets
United Kingdom | 2007-05-13 17:11:04
Mattia Barbon wrote:
>
> use Encode;
> print Encode::encode_utf8( "à\n" );
> sleep 1;
> print Encode::encode_utf8( "à\n" );
> sleep 1;
> print Encode::encode_utf8( "à\n" );
> print Encode::encode_utf8( "à\n" );
>
> a good enough test case?
>
I am not sure. I must confess a high (but reducing) level
of ignorance where multibyte char sets are concerned.
For my own particular case with Wx::Perl::ProcessStream, I
decided that the thing to do was to make sure I got an exact
byte for byte representation of the output stream returned
into the perl code. Then, whatever happens when the bytes
are treated as a string, at least it can be controlled
within your perl.
I found it difficult to predict the effects of different
operating systems and locale settings so reverted to
ensuring I output a known series of bytes to read in and
compare. The simplest way I found of ensuring no intervening
encoding layers were applied was to use test output in an
encoded file and then read it in and output it in binmode
with 'use bytes;'. I used the attached utf8.dat and
latin1.dat files sent to me by a user of
Wx::Perl::ProcessStream.
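The approach described above (emit a known series of bytes with no intervening encoding layers) might look roughly like the following Perl sketch. The temporary file is only a stand-in for a data file such as utf8.dat, whose real contents are not shown in this thread, and the variable names are illustrative.

```perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Stand-in data file (hypothetical contents): the UTF-8 bytes for "à\n".
my ($tmp, $path) = tempfile(UNLINK => 1);
binmode $tmp;
print {$tmp} "\xC3\xA0\n";
close $tmp or die "close failed: $!";

# Read the file back with no encoding or newline layers applied.
open my $in, '<:raw', $path or die "open failed: $!";
my $data = do { local $/; <$in> };
close $in;

# Emit the bytes unchanged; binmode stops STDOUT from translating them.
binmode STDOUT;
print $data;
```

Because both the read and the write are done in raw mode, the reading side can compare what it receives byte for byte against the file, whatever the operating system or locale.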
If you have better things to spend time on than this, and you
can put together a quick untested code change that more or less
does the job, I will be happy to learn how to construct
adequate test cases and test it. With the basic idea in
place, I should be able to plod through any changes flagged
up by testing.
Regards
Mark
Re: wxInputStream and multibyte character sets
2007-06-19 15:15:32
On Sat, 12 May 2007 16:40:05 +0100
Mark Dootson <mark.dootson znix.com> wrote:
Hi,
> I have a problem with module Wx::Perl::ProcessStream.
> This reads the STDOUT from an external process executed
> using Wx::Process and Wx::ExecuteCommand.
>
> To do this, it does a 'readline' on the
> wxInputStream available via Wx::Process.
>
> My problem is that the implementation of READLINE in
> Wx::InputStream works as follows:
>
>     read a char from the stream
>     append char to a wxString
>     return wxString if char == '\n'
>
>
> This appears not to work if the output stream from the
> external process is, for example, UTF-8.
>
> I *think* what perhaps should happen is
>
>     read a char from the stream
>     add it to a charbuffer
>     if char == '\n' {
>         convert charbuffer to a wxString ( method
>         determined by wxWidgets unicode/ansi build macros )
>         return wxString
>     }
I do not agree with what you write above (note: I am not
saying the current implementation is correct!).
What I think readline should do is work with bytes:
    read a char from the stream
    add it to a charbuffer
    if char == '\n' {
        return the buffer as a byte string, without performing
        any conversion
    }
I believe that automatically interpreting program output based upon
wxWidgets ideas (which usually means using the current locale) will cause
trouble. Returning bytes leaves the interpretation to the calling program,
which is always a safe choice.
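A byte-oriented readline of this kind could be sketched in Perl as follows. This is only an illustration of the idea, not the actual Wx::InputStream code: read_line_bytes is a hypothetical helper, and an in-memory filehandle stands in for the process output stream.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: collect raw bytes up to and including "\n",
# returning them without any conversion (undef at end of stream).
sub read_line_bytes {
    my ($fh) = @_;
    my $buffer = '';
    while (read($fh, my $byte, 1)) {
        $buffer .= $byte;
        last if $byte eq "\n";
    }
    return length($buffer) ? $buffer : undef;
}

# Stand-in for process output: the UTF-8 bytes for "à\n", twice.
my $output = "\xC3\xA0\n\xC3\xA0\n";
open my $fh, '<:raw', \$output or die "open failed: $!";

my @lines;
while (defined(my $raw = read_line_bytes($fh))) {
    # The caller knows (or chooses) the encoding, so the caller decodes.
    push @lines, decode('UTF-8', $raw);
}
close $fh;

printf "%d lines read\n", scalar @lines;
```

The reader never guesses an encoding; a caller expecting Latin-1 output would simply pass a different encoding name to decode, or keep the bytes as they are.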
Regards
Mattia
Re: wxInputStream and multibyte character sets
United Kingdom | 2007-06-19 15:39:29
Mattia Barbon wrote:
> I do not agree with what you write above (note: I am
> not saying the current implementation is correct!)
> what I think readline should do is work with bytes:
>
>     read a char from the stream
>     add it to a charbuffer
>     if char == '\n' {
>         return the buffer as a byte string, without
>         performing any conversion
>     }
>
> I believe that automatically interpreting program
> output based upon
> wxWidgets ideas (which usually means using the current
> locale) will cause
> trouble. Returning bytes leaves the interpretation to
> the calling program
> which is always a safe choice.
I think you are right. Bytes makes much more sense.
Regards
Mark
Re: wxInputStream and multibyte character sets
2007-06-20 15:07:30
On Tue, 19 Jun 2007 21:39:29 +0100
Mark Dootson <mark.dootson znix.com> wrote:
> Mattia Barbon wrote:
> > I do not agree with what you write above (note:
> > I am not saying the current implementation is correct!)
> > what I think readline should do is work with
> > bytes:
> >
> >     read a char from the stream
> >     add it to a charbuffer
> >     if char == '\n' {
> >         return the buffer as a byte string, without
> >         performing any conversion
> >     }
> >
> > I believe that automatically interpreting
> > program output based upon
> > wxWidgets ideas (which usually means using the
> > current locale) will cause
> > trouble. Returning bytes leaves the
> > interpretation to the calling program
> > which is always a safe choice.
>
> I think you are right. Bytes makes much more sense.
Changed in Subversion. I tried it with Wx::Perl::ProcessStream 0.09
and it works as I expect. Please let me know if it works for
you too.
Regards,
Mattia
Re: wxInputStream and multibyte character sets
2007-06-21 13:50:20
On Thu, 21 Jun 2007 04:01:22 +0100
Mark Dootson <mark.dootson znix.com> wrote:
> It doesn't work as I expected on Win32. (but I might be
> expecting the wrong thing).
You're expecting the right one.
Fixed, thanks!
Mattia