Thread: Unicode Encode Error

Unicode Encode Error
2006-04-27 14:32:50
Frank Moore wrote:
> The later code (where I use codecs) is not giving me an error (I must
> have got that error during an intermediate step I performed).
> However, it's also not writing anything away. It seems to be silently
> failing as html_text definitely has content.

Do you explicitly close the output file? If not, the data may not be
actually written.

Kent

_______________________________________________
Tutor maillist  -  Tutorpython.org
http://mail.python.org/mailman/listinfo/tutor
Unicode Encode Error
2006-04-27 16:15:06
Kent Johnson wrote:

> Do you explicitly close the output file? If not, the data may not be
> actually written.

Kent,

You're right. I realised after playing with Tim's example that the
problem was that I wasn't calling close() on the codecs file.
Adding this after the f.write(html_text) seems to flush the buffer,
which means that the content now gets written to the file.

Thanks for your help,
Frank.
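
The fix Frank describes can be sketched roughly as follows. This is a
minimal sketch, assuming Python 3, where the built-in open() accepts an
encoding argument (the thread itself used a codecs file object). The
html_text value and the output path are hypothetical stand-ins, not
Frank's actual data.

```python
import os
import tempfile

# Hypothetical stand-in for Frank's html_text.
html_text = "<p>caf\u00e9</p>"

# Hypothetical output path inside a fresh temporary directory.
out_path = os.path.join(tempfile.mkdtemp(), "out.html")

# Explicit close, as in the thread: close() flushes buffered data to disk.
f = open(out_path, "w", encoding="utf-8")
f.write(html_text)
f.close()

# Same effect, but harder to forget: the with-block closes the file on exit.
with open(out_path, "w", encoding="utf-8") as f:
    f.write(html_text)

# Read the file back to confirm the content really was written.
with open(out_path, encoding="utf-8") as f:
    written = f.read()
```

The with-statement form is usually the safer habit, since the file is
closed even if write() raises an exception.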




Unicode Encode Error
2006-04-27 19:07:14
> You're right, I realised after playing with Tim's example that the
> problem was that I wasn't calling close() on the codecs file. Adding
> this after the f.write(html_text) seems to flush the buffer which means
> that the content now gets written to the file.

Hi Frank,

Quick note: it may be important to write and read from the file using
binary mode "b".  It's not so significant under Unix, but it is more
significant under Windows, because otherwise we may get some weird
results.

Good luck!

Unicode Encode Error
2006-04-27 19:24:34
Danny Yoo wrote:
>> You're right, I realised after playing with Tim's example that the
>> problem was that I wasn't calling close() on the codecs file. Adding
>> this after the f.write(html_text) seems to flush the buffer which means
>> that the content now gets written to the file.
>
> Hi Frank,
>
> Quick note: it may be important to write and read from the file using
> binary mode "b".  It's not so significant under Unix, but it is more
> significant under Windows, because otherwise we may get some weird
> results.

But the file is utf-8 text; ISTM it should be written as text, not
binary. Why do you recommend binary mode?

Kent
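
Kent's point can be checked with a small experiment. This is only a
sketch, assuming Python 3 and a hypothetical sample string and path: it
writes text through a text-mode file with an explicit utf-8 encoding,
then reads the raw bytes back in binary mode and compares them with a
manual encode. Passing newline="" switches off newline translation so
the comparison holds on both Unix and Windows.

```python
import os
import tempfile

# Hypothetical sample text containing non-ASCII characters.
text = "na\u00efve \u20ac\n"

# Hypothetical path inside a fresh temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample.txt")

# Text mode: the file object encodes str to utf-8 bytes for us.
# newline="" disables the "\n" -> os.linesep translation.
with open(path, "w", encoding="utf-8", newline="") as f:
    f.write(text)

# Binary mode: read the bytes exactly as they are stored on disk.
with open(path, "rb") as f:
    raw = f.read()

# True when text mode produced the same bytes as encoding by hand.
same_bytes = raw == text.encode("utf-8")
```

Without newline="", text mode on Windows would store "\r\n" for each
"\n", which is the only difference from the binary-mode bytes here.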

Unicode Encode Error
2006-04-27 21:12:40
>>> You're right, I realised after playing with Tim's example that the
>>> problem was that I wasn't calling close() on the codecs file. Adding
>>> this after the f.write(html_text) seems to flush the buffer which
>>> means that the content now gets written to the file.
>>
>> Quick note: it may be important to write and read from the file using
>> binary mode "b".  It's not so significant under Unix, but it is more
>> significant under Windows, because otherwise we may get some weird
>> results.
>
> But the file is utf-8 text, ISTM it should be written as text, not
> binary. Why do you recommend binary mode?

Hi Kent,

Oh!  I just wrote that out because I had a vague and fuzzy feeling that
utf-8, having high-order binary bits, needed to be written carefully.
But let me examine that unexamined assumption...

No, you're right, we don't have to be so careful here, for carriage
returns and newlines have their standard interpretation under utf-8 too.
Ok, good to know.  Thank you!


I've seen so many problems with Windows and binary data that I use 'rb'
out of habit whenever dealing with high-order binary data.  For example,
chr(26) (Ctrl-Z) causes Windows to prematurely truncate the reading of a
file in text mode:

     http://mail.python.org/pipermail/python-list/2003-March/154659.html

On a close reading of the utf-8 encoding standard, though, I see that
it does say that utf-8 avoids encoding high Unicode code points with
control characters, so my caution is unfounded.
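
The property Danny refers to can be illustrated with a short check. This
is a sketch assuming Python 3; the sample characters are arbitrary
choices, not taken from the thread. Every byte in a multi-byte utf-8
sequence has its high bit set, so a non-ASCII character can never
produce a control byte such as 0x1A (Ctrl-Z). Only the Ctrl-Z character
itself encodes to that byte.

```python
# Arbitrary non-ASCII samples: e-acute, the euro sign, and an emoji.
samples = ["\u00e9", "\u20ac", "\U0001F600"]

# Every byte of each encoded sample is 0x80 or above (high bit set).
high_bits_only = all(
    byte >= 0x80
    for ch in samples
    for byte in ch.encode("utf-8")
)

# So none of the encoded samples can contain the Ctrl-Z byte.
no_ctrl_z = all(0x1A not in ch.encode("utf-8") for ch in samples)

# The Ctrl-Z character itself is plain ASCII and encodes to one byte.
ctrl_z_bytes = chr(26).encode("utf-8")
```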