Thread: win32-dir, unicode




win32-dir, unicode
user name
2006-05-28 07:50:33
Hi,

2006/5/28, Daniel Berger <djberg96gmail.com>:
> Heesob Park wrote:
> > Hi,
> <snip>
>
> Thanks Heesob, but I'm getting some weird segfaults with wide character
> functions and buffers over 245 characters.  I posted about this to
> ruby-talk as well.  Here's some sample code that demonstrates the problem:
>
> require 'Win32API'
>
> GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')
>
> path = "C:\\test"
> buf  = 0.chr * 260 # 245 or less works ok
>
> if GetFullPathNameW.call(path, buf.size, buf, 0) == 0
>    puts "Failed"
>    exit
> end
>
> p buf.split("\0\0").first # BOOM!
>
> I'm not sure what the significance of 245 or less is.  I can inspect
> 'buf', copy and paste it to a separate editor as a string and run ops on
> it with no problem, so I'm very curious as to what's making Ruby segfault.
>
I'm very curious too.

Your sample code doesn't segfault for me.
But I ran into segfaults several times while modifying the
create_junction method.
Its behaviour is very unstable: when I insert a p call, it
sometimes runs OK.
The crash was mainly located in the split method or the
get_last_error method.

I guess it's not related to 245 or 260; it looks more like an
underlying C pointer memory access problem.
Can you give me sample code that segfaults reliably?

Regards,

Park Heesob

_______________________________________________
win32utils-devel mailing list
win32utils-develrubyforge.org
http://rubyforge.org/mailman/listinfo/win32utils-devel

win32-dir, unicode
user name
2006-05-28 13:39:52
Heesob Park wrote:
> <snip>

To add even more mystery to this problem, that code isn't segfaulting
for me at the moment, though it was regularly last night (I still have
the console window open that shows the segfaults to prove it to myself).

I saw the same behavior you mentioned - inserting a 'puts' would
sometimes cause code that was previously segfaulting to suddenly work.

Well, let's put the Unicode stuff on hold for now.  Perhaps deep
inspection of string.c some day will reveal what the potential problem is.

Many thanks,

Dan

win32-dir, unicode
user name
2006-05-29 02:20:14
Hi,

2006/5/28, Daniel Berger <djberg96gmail.com>:
> <snip>
>
I have found out what the problem is.
It's not a bug in Ruby or Windows; it is only a bug in the code.

First try this:

require 'Win32API'
GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')
for i in 1..100
  path = "c:\\test"
  buf  = 0.chr * 260
  if GetFullPathNameW.call(path, buf.size, buf, 0) == 0
    puts "Failed"
  end
  p buf.split("\0\0").first
end

It will cause various errors, such as
uninitialized constant GetFullPathNameW (NameError)
or a segfault.
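
One way to see why this first version can corrupt memory, assuming (as the Win32 documentation for GetFullPathNameW states) that the second argument counts wide characters rather than bytes. The snippet below is plain arithmetic, makes no API call, and its variable names are only illustrative:

```ruby
# Illustrative arithmetic only; no Win32 call is made here.
buf = 0.chr * 260                      # 260 bytes actually allocated

claimed_chars = buf.bytesize           # what the first example passes as the length
claimed_bytes = claimed_chars * 2      # each wide character occupies 2 bytes
overrun_bytes = claimed_bytes - buf.bytesize

safe_chars = buf.bytesize / 2          # real capacity in wide characters

puts "allocated:          #{buf.bytesize} bytes"
puts "claimed:            #{claimed_bytes} bytes (#{claimed_chars} wide chars)"
puts "possible overrun:   #{overrun_bytes} bytes"
puts "safe length to pass: #{safe_chars}"
```

If the API trusts the claimed length, it may write up to 260 bytes past the end of the Ruby string, which would fit the seemingly random failures described above.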

Next, try this:

require 'Win32API'
GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')
for i in 1..100
  path = "c:\\test"
  buf  = 0.chr * 260
  # buf.size/2 -> actual capacity of buf in wide (2-byte) characters
  if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
    puts "Failed"
  end
  p buf.split("\0\0").first
end

It runs OK, but the result is not correct.

Next, try this:

require 'Win32API'
GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')
for i in 1..100
  # append \0 to path
  path = "c:\\test\0"
  buf  = 0.chr * 260
  # buf.size/2 -> actual capacity of buf in wide (2-byte) characters
  if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
    puts "Failed"
  end
  p buf.split("\0\0").first
end

It runs OK. The result is correct.

Finally, the complete and correct code is like this:

require 'Win32API'
GetFullPathNameW = Win32API.new('kernel32', 'GetFullPathNameW', 'PLPP', 'L')
for i in 1..100
  # UTF-16LE bytes of "c:\test"; Ruby's implicit "\0" completes the terminator
  path = "c\0:\0\\\0t\0e\0s\0t\0\0"
  buf  = 0.chr * 260
  # buf.size/2 -> actual capacity of buf in wide (2-byte) characters
  if GetFullPathNameW.call(path, buf.size/2, buf, 0) == 0
    puts "Failed"
  end
  buf = buf.split("\0\0").first
  # restore the zero high byte that split removed from the last character
  buf = (buf.size % 2).zero? ? buf : buf + "\0"
  p buf
end

Remember, a Ruby string is implicitly terminated with a single "\0",
but a UTF-16 string requires a double "\0" terminator.
For ASCII characters, this results in three trailing "\0" bytes:
one belonging to the last ASCII character and two for the string
terminator.
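
The byte layout described above can be checked without calling any Windows API. This is only a sketch: it assumes a Ruby version that has String#encode (newer than the one used in this thread), and to_wide and from_wide are hypothetical helper names, not part of win32-dir or windows-pr.

```ruby
# Hypothetical helpers for illustration only.

# Encode a string as UTF-16LE bytes and append the two-byte NUL terminator.
def to_wide(str)
  (str.encode('UTF-16LE') + "\0".encode('UTF-16LE')).force_encoding('BINARY')
end

# Read a buffer as 16-bit little-endian code units up to the first NUL unit.
def from_wide(buf)
  units = buf.unpack('v*')
  len   = units.index(0) || units.size
  buf.byteslice(0, len * 2).force_encoding('UTF-16LE').encode('UTF-8')
end

wide = to_wide("c:\\test")
p wide.bytesize                    # 16: seven 2-byte characters plus terminator
p wide[/\0+\z/].bytesize           # 3: high byte of "t" plus the 2-byte terminator
p from_wide(wide + ("\0" * 20))    # "c:\\test"
```

Reading whole 16-bit units, as from_wide does, avoids one weakness of split("\0\0"): a character such as U+0100 is stored as the bytes 00 01, so when it follows an ASCII character the byte sequence contains "\0\0" in the middle of the text.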

Regards,

Park Heesob


win32-dir, unicode
user name
2006-05-29 04:22:36
Heesob Park wrote:
> <snip>

While I understand why this code works, I'm still not entirely clear why
the previous code would cause the interpreter to segfault.  Bad pointer
address?

In any case, excellent work, thank you!

Now I'm trying to work out a general approach for the windows-pr stuff.
Given a method like this:

def GetFullPathName(file, buf, buf_size, part)
    # argument order follows the 'PLPP' prototype: path, length, buffer, part
    if $KCODE != 'NONE'
       GetFullPathNameW.call(file, buf_size, buf, part)
    else
       GetFullPathName.call(file, buf_size, buf, part)
    end
end

Should I modify it to try to do a best-guess?

if $KCODE != 'NONE'
    GetFullPathNameW.call(file, buf_size/2, buf, part)
end

Or do you think that's the user's job?

Thanks,

Dan


win32-dir, unicode
user name
2006-05-29 04:50:34
Hi,

2006/5/29, Daniel Berger <djberg96gmail.com>:
<snip>
> While I understand why this code works, I'm still not entirely clear why
> the previous code would cause the interpreter to segfault.  Bad pointer
> address?
>
Yes. A Ruby string is not just a character array; it is actually a
structure, and a corrupted structure causes unexpected behaviour.

> In any case, excellent work, thank you!
>
You are welcome.

> Now I'm trying to work out a general approach for the windows-pr stuff.
> Given a method like this:
>
> def GetFullPathName(file, buf, buf_size, part)
>    if $KCODE != 'NONE'
>       GetFullPathNameW.call(file, buf_size, buf, part)
>    else
>       GetFullPathName.call(file, buf_size, buf, part)
>    end
> end
>
> Should I modify it to try to do a best-guess?
>
> if $KCODE != 'NONE'
>    GetFullPathNameW.call(file, buf_size/2, buf, part)
> end
>

Be careful: before calling GetFullPathNameW, "file" must be a UTF-16 string.
I recommend calling the W function everywhere and, if the string is not
UTF-16, converting it to UTF-16 first.
$KCODE should be used only to determine whether the string is UTF-8, not
to decide whether to call the ANSI function or the W function.
Because the Ruby interpreter cannot handle UTF-16 source files, the user
has practically no chance of using a UTF-16 string in real-world Ruby code.

Sample code is like this:

     from = multi_to_wide(from)
     to = multi_to_wide(to)
     # the buffer length is passed in wide characters, hence size / 2
     if GetFullPathNameW.call(from, from_path.size / 2, from_path, 0) == 0
        raise StandardError, 'GetFullPathName() failed: ' + get_last_error
     end
     if GetFullPathNameW.call(to, to_path.size / 2, to_path, 0) == 0
        raise StandardError, 'GetFullPathName() failed: ' + get_last_error
     end

> Or do you think that's the user's job?
>
I think the user wouldn't care whether the file name is a Unicode or an
ANSI string.

Regards,

Park Heesob


win32-dir, unicode
user name
2006-05-29 05:04:27
Heesob Park wrote:
> Hi,
>
> 2006/5/29, Daniel Berger <djberg96gmail.com>:
> <snip>
>> While I understand why this code works, I'm still not entirely clear why
>> the previous code would cause the interpreter to segfault.  Bad pointer
>> address?
>>
> Yes. A Ruby string is not just a character array; it is actually a
> structure, and a corrupted structure causes unexpected behaviour.

Ah, right.

>
> <snip>
>
> Be careful: before calling GetFullPathNameW, "file" must be a UTF-16 string.
> I recommend calling the W function everywhere and, if the string is not
> UTF-16, converting it to UTF-16 first.
> $KCODE should be used only to determine whether the string is UTF-8, not
> to decide whether to call the ANSI function or the W function.

What would you recommend then?  How should I determine within Ruby if
the string being passed to a function is UTF16?  IsTextUnicode()?
Something else?

How would you define GetFullPathName within file.rb (from windows-pr)
then, for example?

Thanks,

Dan

win32-dir, unicode
user name
2006-05-29 05:29:00
2006/5/29, Daniel Berger <djberg96gmail.com>:
> Heesob Park wrote:
> <snip>
>
> What would you recommend then?  How should I determine within Ruby if
> the string being passed to a function is UTF16?  IsTextUnicode()?
> Something else?
>
A user is unlikely to call the function with a UTF-16 string by accident.
A user who wants to pass a UTF-16 string on purpose can set a utf16 flag.

> How would you define GetFullPathName within file.rb (from windows-pr)
> then, for example?
>
How about this?

def GetFullPathName(file, buf_size, buf, part, utf16 = false)
   file = multi_to_wide(file) unless utf16
   GetFullPathNameW.call(file, buf.size/2, buf, part)
end
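
A rough sketch of how the flag-based wrapper behaves, written so it can run without Windows. Everything here is an assumption for illustration: `api` is a stand-in for the Win32API object, to_wide is a hypothetical substitute for multi_to_wide (and needs a Ruby with String#encode), and the unused buf_size parameter is dropped for brevity.

```ruby
# Hypothetical stand-in for multi_to_wide: UTF-16LE bytes plus 2-byte terminator.
def to_wide(str)
  (str.encode('UTF-16LE') + "\0".encode('UTF-16LE')).force_encoding('BINARY')
end

# Convert the path unless the caller says it is already UTF-16, then pass
# the buffer capacity in wide characters (bytes / 2), not in bytes.
def get_full_path_name(api, file, buf, part, utf16 = false)
  file = to_wide(file) unless utf16
  api.call(file, buf.bytesize / 2, buf, part)
end

# Fake API object that only records the path and length it receives.
seen = []
fake_api = lambda { |file, len, _buf, _part| seen << [file, len]; 1 }

buf = "\0" * 520
get_full_path_name(fake_api, "c:\\test", buf, 0)                 # converted for us
get_full_path_name(fake_api, to_wide("c:\\test"), buf, 0, true)  # already wide

p seen.map { |file, len| [file.bytesize, len] }   # [[16, 260], [16, 260]]
```

Both calls hand the same 16 bytes to the API, which is the point of the flag: the caller decides once whether conversion is still needed.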

Regards,

Park Heesob


win32-dir, unicode
user name
2006-05-29 07:19:16
Heesob Park wrote:

<snip>

> How about this?
> 
> def GetFullPathName(file, buf_size, buf, part, utf16 = false)
>    file = multi_to_wide(file) unless utf16
>    GetFullPathNameW.call(file, buf.size/2, buf, part)
> end

Except that means adding an extra argument to a lot of methods.

Hm....what about:

def GetFullPathName(file, buf_size, buf, part)
    file = multi_to_wide(file) unless IsTextUnicode(file)
    GetFullPathNameW.call(file, buf.size/2, buf, part)
end

A little more work for me, but less for the user to remember.

Will that work? Or do you think IsTextUnicode() is too unreliable?
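
On the reliability question, a small experiment shows why any guess based on the bytes alone is ambiguous. This is not IsTextUnicode itself (its statistical tests are not reproduced here), only a sketch that assumes a Ruby with String#force_encoding: an even-length run of plain ASCII bytes is also well-formed UTF-16LE.

```ruby
ascii = "c:\\tests"                                  # 8 plain ASCII bytes
reinterpreted = ascii.dup.force_encoding('UTF-16LE') # same bytes, read as UTF-16LE

p reinterpreted.valid_encoding?             # true: the bytes are well-formed UTF-16LE
p reinterpreted.length                      # 4: each byte pair reads as one character
p reinterpreted.encode('UTF-8') == ascii    # false: the text is something else entirely
```

So a detector has to rely on heuristics such as how often the high byte is zero, and short path strings give it very little to work with.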

(Sorry if you answered this previously)

Regards,

Dan




win32-dir, unicode
user name
2006-05-29 07:53:04
Hi,

2006/5/29, Daniel Berger <djberg96gmail.com>:
> Heesob Park wrote:
>
> <snip>
>
> > How about this?
> >
> > def GetFullPathName(file, buf_size, buf, part, utf16 = false)
> >    file = multi_to_wide(file) unless utf16
> >    GetFullPathNameW.call(file, buf.size/2, buf, part)
> > end
>
> Except that means adding an extra argument to a lot of methods.
>
> Hm....what about:
>
> def GetFullPathName(file, buf_size, buf, part)
>    file = multi_to_wide(file) unless IsTextUnicode(file)
>    GetFullPathNameW.call(file, buf.size/2, buf, part)
> end
>
> A little more work for me, but less for the user to remember.
>
That's OK for internal use.
But what if the user wants the result of the function?
Does every function then need an a->w; call W; w->a conversion
like this?

 def GetFullPathName(file, buf_size, buf, part)
    file = multi_to_wide(file) unless IsTextUnicode(file)
    GetFullPathNameW.call(file, buf.size/2, buf, part)
    buf = wide_to_multi(buf)
 end

I recommend separating them into two functions like this:

 def GetFullPathName(file, buf_size, buf, part)
    # ANSI version: the length is a byte count, so no division by 2
    GetFullPathName.call(file, buf.size, buf, part)
 end

 def GetFullPathNameW(file, buf_size, buf, part)
    file = multi_to_wide(file) unless IsTextUnicode(file)
    GetFullPathNameW.call(file, buf.size/2, buf, part)
 end
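
For comparison, the a->w; call W; w->a round trip mentioned above might look roughly like this. It is a sketch under stated assumptions: fake_w is an invented stand-in that merely echoes the path into the buffer (the real GetFullPathNameW resolves it), and to_wide, from_wide and full_path are hypothetical names that rely on String#encode.

```ruby
# Hypothetical conversion helpers, standing in for multi_to_wide / wide_to_multi.
def to_wide(str)
  (str.encode('UTF-16LE') + "\0".encode('UTF-16LE')).force_encoding('BINARY')
end

def from_wide(buf)
  units = buf.unpack('v*')                  # 16-bit little-endian code units
  len   = units.index(0) || units.size      # stop at the first NUL unit
  buf.byteslice(0, len * 2).force_encoding('UTF-16LE').encode('UTF-8')
end

# Fake W function: copies the wide path into buf and returns the number of
# wide characters written (excluding the terminator), or 0 if it does not fit.
fake_w = lambda do |file, len, buf, _part|
  if file.bytesize / 2 > len
    0
  else
    buf[0, file.bytesize] = file
    file.bytesize / 2 - 1
  end
end

def full_path(api, path)
  buf  = ("\0" * 520).b
  wide = to_wide(path)                                   # a -> w
  if api.call(wide, buf.bytesize / 2, buf, 0) == 0       # call W, length in wide chars
    raise 'GetFullPathNameW failed'
  end
  from_wide(buf)                                         # w -> a
end

p full_path(fake_w, "c:\\test")   # "c:\\test"
```

Doing this in every wrapper costs two conversions per call, which is the overhead the two-function split avoids for callers who only need the ANSI version.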


> Will that work? Or do you think IsTextUnicode() is too unreliable?
>
I think it is a useful function.
I just don't want to slow the code down by calling another API
function.

Regards,

Park Heesob
