Thread: More weirdness with PHP bindings
|
|
| More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 09:22:14 |
Hi all,
There's been an odd bug reported to us by Daniel Menard while working on
the PHP bindings:
"I then tried to run the dotest target... All tests passed, except the
one about get_matching_terms (smoketest.php line 94).
I added this line before the exit:
for ($i=0; $i<strlen($terms); $i++) echo $c=ord($terms[$i]), ' ', ($c>31?$terms[$i]:''), "\n";
and it appears that the first letter of each term is replaced with a
null char. I tried to run the same test on a Debian box, and the test
passes, so perhaps this is a Windows-related problem (more on this below).
...[he then manages to get Xapian working in a real situation]...
I was surprised it works so well because my script also uses
get_matching_terms, but it doesn't reproduce the bug above.
In fact, I don't use a "join(get_matching_terms())" as smoketest does, but
iterate with get_matching_terms_begin and get_matching_terms_end.
Just out of curiosity, I added the following lines to smoketest.php:
$hit=$mset->get_hit(0);
$it=$enq->get_matching_terms_begin($hit);
while (!$it->equals($enq->get_matching_terms_end($hit)))
{
    echo $it->get_term(), ' ';
    $it->next();
}
and with that code, we get the correct terms. So the bug only concerns
the way get_matching_terms is wrapped (and only appears under
Windows)... strange.
"
Anyway, I thought it might be worth raising in case anyone with a better
knowledge of PHP might have a brainwave!
Cheers
Charlie
_______________________________________________
Xapian-devel mailing list
Xapian-devel lists.xapian.org
http://lists.xapian.org/mailman/listinfo/xapian-devel
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 09:35:32 |
On Tue, Apr 03, 2007 at 03:22:14PM +0100, Charlie Hull wrote:
> and with that code, we get the correct terms. So the bug only concerns
> the way get_matching_terms is wrapped (and only appears under
> Windows)... strange.
How odd. What if you take Xapian out of the equation and run something
like:
<?php
function foo() { return array("is", "there"); }
$terms = join(" ", foo());
for ($i=0; $i<strlen($terms); $i++) echo $c=ord($terms[$i]), ' ', ($c>31?$terms[$i]:''), "\n";
?>
Cheers,
Olly
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 09:39:29 |
On Tue, Apr 03, 2007 at 03:22:14PM +0100, Charlie Hull wrote:
> So the bug only concerns the way get_matching_terms is wrapped (and
> only appears under Windows)... strange.
>
> Anyway, I thought it might be worth raising in case anyone with a better
> knowledge of PHP might have a brainwave!
get_matching_terms() isn't wrapped - it's a synthetic method generated
only in the SWIG layer. Note that it relies on a typemap to convert
std::pair<> into a list in the target language, which is done at the
bottom of php/util.i - I'm wondering if there's a Windows-specific bug
in that in some way? I don't know the PHP internals well enough to know
for sure, and I'm a little short on time right now or I'd prod further
myself...
J
--
/--------------------------------------------------------------------------
James Aylett
xapian.org
james tartarus.org
uncertaintydivision.org
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 09:40:50 |
Olly Betts wrote:
> On Tue, Apr 03, 2007 at 03:22:14PM +0100, Charlie Hull wrote:
>> and with that code, we get the correct terms. So the bug only concerns
>> the way get_matching_terms is wrapped (and only appears under
>> Windows)... strange.
>
> How odd. What if you take Xapian out of the equation and run something
> like:
>
> <?php
> function foo() { return array("is", "there"); }
> $terms = join(" ", foo());
> for ($i=0; $i<strlen($terms); $i++) echo $c=ord($terms[$i]), ' ', ($c>31?$terms[$i]:''), "\n";
> ?>
>
> Cheers,
> Olly
>
I get:
105 i
115 s
32
116 t
104 h
101 e
114 r
101 e
C
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 09:57:48 |
On Tue, Apr 03, 2007 at 03:40:50PM +0100, Charlie Hull wrote:
> 105 i
> 115 s
> 32
> 116 t
> 104 h
> 101 e
> 114 r
> 101 e
OK, thanks for checking. That suggests it's probably the bindings
rather than the PHP join function. As James says, we build the return
array by hand, so it's possible something is amiss there but we get away
with it on UNIX for some reason (perhaps ZTS related!)
I'll give that code a detailed inspection later (currently I'm having a
stab at implementing the prog variant of the remote backend for Windows).
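One way to narrow this down from the PHP side might be to check each element of the returned array before join() gets involved. A minimal sketch, assuming the corruption shows up as a NUL byte inside individual terms; terms_with_nul() is a hypothetical helper (not part of Xapian or smoketest.php), and the sample array only mimics the reported symptom:

```php
<?php
// Hypothetical helper (not part of Xapian or smoketest.php): return the
// indexes of all array elements that contain a NUL byte.
function terms_with_nul(array $terms) {
    $bad = array();
    foreach ($terms as $i => $term) {
        // strpos() returns false when no NUL byte is present, and may
        // return 0 when it is the first byte, hence the strict comparison.
        if (strpos($term, "\0") !== false) {
            $bad[] = $i;
        }
    }
    return $bad;
}

// Stand-in data mimicking the symptom: first letter replaced by NUL.
$sample = array("\0s", "\0here");
print_r(terms_with_nul($sample));
```

If the array returned by get_matching_terms already fails this check on Windows, that would point at the hand-built return array rather than at join().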
Cheers,
Olly
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 10:29:26 |
Charlie Hull wrote:
> I was surprised it works so well because my script also uses
> get_matching_terms, but it doesn't reproduce the bug above.
> In fact, I don't use a "join(get_matching_terms())" as smoketest does, but
> iterate with get_matching_terms_begin and get_matching_terms_end.
> Just out of curiosity, I added the following lines to smoketest.php:
> $hit=$mset->get_hit(0);
> $it=$enq->get_matching_terms_begin($hit);
> while (!$it->equals($enq->get_matching_terms_end($hit)))
> {
>     echo $it->get_term(), ' ';
>     $it->next();
> }
> and with that code, we get the correct terms. So the bug only concerns
> the way get_matching_terms is wrapped (and only appears under
> Windows)... strange.
> "
What were the terms involved in this test? Is it possible that there's
an issue with character set conversion?
Alternatively, it may be a memory management problem: the handling for
get_matching_terms() is special-cased for PHP at the end of
xapian-bindings/php/util.i, which contains the code that allows a term
list to be returned as a PHP list of terms. This works by copying each
term into the list with "add_next_index_stringl". Perhaps this isn't
copying the contents of the string correctly, or is failing to allocate
space for the contents correctly.
I've checked through the sources for my version of PHP, and it looks
like the allocation should happen correctly - but there are many layers
of code here, any of which could be hiding a problem.
In particular, PHP can use the native allocation routines, or its own
memory allocation system. It would be interesting to try using the
native allocation routines instead of the PHP ones to check whether
there's a bug in its allocator. (I think you can turn off the PHP
allocator by setting the "USE_ZEND_ALLOC" environment variable to "0").
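To confirm which setting a given test run actually saw, the script itself could report the variable. A minimal sketch, assuming USE_ZEND_ALLOC behaves as described above; zend_alloc_disable_requested() is a hypothetical helper, and it only recognises the literal value "0":

```php
<?php
// Hypothetical helper: true when the environment asks PHP to bypass its
// own allocator, i.e. USE_ZEND_ALLOC is set to the literal string "0".
// Assumption: any other value is simply reported as "not requested".
function zend_alloc_disable_requested() {
    // getenv() returns false when the variable is not set at all.
    return getenv('USE_ZEND_ALLOC') === '0';
}

echo 'USE_ZEND_ALLOC=0 requested: ',
     (zend_alloc_disable_requested() ? 'yes' : 'no'), "\n";
```

The variable presumably has to be set in the environment before PHP starts; changing it from inside a running script would not be expected to switch allocators.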
--
Richard
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 10:45:09 |
On Tue, Apr 03, 2007 at 04:29:26PM +0100, Richard Boulton wrote:
> What were the terms involved in this test? Is it possible that there's
> an issue with character set conversion?
The terms are "is" and "there", so it seems unlikely. The PHP bindings
don't do character set conversion, and PHP doesn't have much in the way
of explicit Unicode support currently anyway.
Cheers,
Olly
|
|
| Re: More weirdness with PHP bindings |
  United Kingdom |
2007-04-03 11:27:12 |
Olly Betts wrote:
> On Tue, Apr 03, 2007 at 04:29:26PM +0100, Richard Boulton wrote:
>> What were the terms involved in this test? Is it possible that there's
>> an issue with character set conversion?
>
> The terms are "is" and "there", so it seems unlikely. The PHP bindings
> don't do character set conversion, and PHP doesn't have much in the way
> of explicit Unicode support currently anyway.
>
> Cheers,
> Olly
>
Just in case it's relevant, this problem also occurs with the php4
bindings...
Charlie
|
|
|
|