
Thread: On @rebus=<<`HIC`:Latin1 (Was: OK, but what about me?)




user name
2007-04-23 11:29:45
> "Dr.Ruud" <rvtol+newsisolution.nl> writes:

>> Brain fart:
>>
>>     my $foo = <<'FOO' :koi8-r;
>> raw koi8-r data here
>> FOO

> Brilliant, except for the complication that the end-of-heredoc string
> must be encoded in the foreign encoding (which I think is a solvable
> problem).

Do you mean that "FOO" would have to be in koi8-r rather than in the
script's own encoding, if any?

Yes, I agree it looks nifty; it's nice that it should use the existing
though little-used :attributes syntactic slot.  But I think it leads
to some curious questions.

First of all, one wonders whether this wouldn't lead to more general,
per-literal encoding specs, such as C<'string':enc>, C<q!string!:enc>,
(or even C<q:string:enc> skipping the dup colon?), and similar ilk?

Even without :enc being applicable to general literals, though, what
about interpolated data from C< <<"FOO" >?  That is, would something
like C<"str1 $var str2":euc-tw>, written heredockishly as

	my $foo = <<"FOO" :euc-tw;
    str1 $var str2
    FOO

mean:  (line endings aside)

     decode(euc_tw => 'str1 ') 
   . $var 
   . decode(euc_tw => ' str2') 

Or would it instead mean:

     decode euc_tw => ('str1 ' . $var . ' str2')

Which goes first?
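
To see that the two readings really can differ, here is a minimal sketch using
today's Encode module.  It assumes koi8-r instead of euc-tw (only because
koi8-r ships with Encode), and it assumes $var already holds a decoded
character; the byte values are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumption: $var is already a character string holding U+00E9.
my $var = "\x{E9}";

# Reading 1: decode only the literal pieces, splice $var in untouched.
my $piecewise = decode('koi8-r', "\xC1 ") . $var . decode('koi8-r', " \xC2");

# Reading 2: interpolate first, then decode the whole thing as koi8-r bytes,
# which reinterprets whatever $var contributed as well.
my $whole = decode('koi8-r', "\xC1 " . $var . " \xC2");

# Show the code points of each result, dot-separated.
printf "piecewise: %vX\n", $piecewise;
printf "whole:     %vX\n", $whole;
```

Under the first reading $var survives as is; under the second its contents
get treated as koi8-r bytes along with the rest.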

Mmm, doesn't this mean we'd get to specify an encoding on readpipe?
I think it does!

	my $rebus = <<`HIC`:Latin1;
    cmd1 $var | cmd2
    cmd3
    HIC

Yum!  

You know, that's almost even somewhat appealing--compared with
the alternative:

    my $rebus = do { 
	open(my $rdpipe, "-| :encoding(Latin1)", "cmd1 $var | cmd2; cmd3");
	local $/;
	<$rdpipe>; 
    };

Although certainly the simpler 

    my $rebus = `cmd1 $var | cmd2; cmd3` :Latin1;

or, if you must, 

    my $rebus = qx(cmd1 $var | cmd2; cmd3) :Latin1;

would be easier on the eye and mind than C< <<`HIC`:Latin1 > would.

Hm, looking at the command-interpolated version, it now seems pretty
obvious that variable interpolation must occur before "de-"encoding
(er, "en-"decoding? I just can't keep those two straight in my head!),
so that would mean

     my $rebus = decode Latin1 => qx(cmd1 $var | cmd2; cmd3);
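
For anyone who wants to try the existing spellings, here is a rough sketch
comparing an :encoding layer on the pipe with decoding the captured bytes
afterward.  The child command is a stand-in I made up (the running perl
printing Latin-1 bytes), and the list form of pipe open is used only to
dodge shell quoting.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical child command: this same perl, emitting Latin-1 bytes.
my @cmd = ($^X, '-e', 'binmode STDOUT; print "caf", chr(0xE9), "\n"');

# Way 1: an :encoding layer on the read pipe decodes as we read.
open(my $layered, '-|:encoding(Latin1)', @cmd) or die "can't run child: $!";
my $via_layer = do { local $/; <$layered> };
close($layered);

# Way 2: slurp raw bytes, then decode the finished string in one go,
# which is what decode Latin1 => qx(...) amounts to.
open(my $raw, '-|', @cmd) or die "can't run child: $!";
binmode($raw);
my $bytes = do { local $/; <$raw> };
close($raw);
my $via_decode = decode(Latin1 => $bytes);

print $via_layer eq $via_decode ? "same\n" : "different\n";
```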

So I guess that clears up the order of operations on the prospective
C< <<"HIC":Latin1 > case, doesn't it?


	my $rebus = <<"HIC" :Latin1;
    str1 $var str2
    HIC

would be

     my $rebus = decode Latin1 => "str1 $var str2";

Hm...

	my @rebus = <<`HIC` :Latin1;
    str1 $var str2
    HIC

In Latin1, there's no trouble, but I'd have to unwrap that to
see when the implicit line-breaking split ran.

    my @rebus = split( /(?=\n)/, decode(Latin1 => `str1 $var str2`) );

I wonder a little about other line terminators in very funky encodings.
Let's say Jis0212-RAW had \v stuff far beyond \n.  Would

	my @lines = <<`FOO` :jis0212-raw;
    str1 $var str2
    FOO

be therefore

    my @lines = split( /(?=\n)/, decode(jis0212_raw => `str1 $var str2`) );

Hm, looks like I'm relying on split losing the trailing null field there.
I guess I could write the regex as /(?=\n.)/s so split doesn't have to go
to the extra work of splitting the last thing and then throwing it away.
Hm, maybe using \R might be better:

    my @lines = map { decode jis0212_raw => $_ } 
		split( /(?=\R.)/s, `str1 $var str2`);

Oh, never mind; the qx// implicit split doesn't use \n; it uses $/
(which is a bit of a bother to put in a m//).  So that's just:

    my @lines = split( m[(?=\Q$/\E.)]s, 
		       decode(jis0212_raw => `str1 $var str2`) );
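
A runnable approximation with what exists today, assuming the aim is to
reproduce what list-context qx returns (each line with its terminator still
attached).  Note that this sketch uses a lookbehind rather than the lookahead
above, so the separator stays on the end of each line, and it quotes $/ into
a separate variable to avoid putting it in the pattern directly.  The sample
bytes are invented and stand in for command output.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Stand-in for command output: two Latin-1 encoded lines (bytes).
my $bytes = "caf\xE9\nna\xEFve\n";

# Quote the record separator once so it is safe inside a pattern.
my $sep = quotemeta $/;

# Decode the whole output first, then cut after each record separator.
# split drops the empty trailing field left after the final separator.
my @lines = split /(?<=$sep)/, decode(Latin1 => $bytes);

printf "%d lines\n", scalar @lines;
```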

> I'm surprised that no one responded to this suggestion.

I'd noticed only Juerd's original, not the <<FOO:koi reply, because
in my hastiness, I carelessly ran % scan `pick -subj Smack` and
so missed the intriguing reply.

Thanks, Johan!  Glad you flagged it.  Fun stuff, eh? 

--tom

PS: Now that I think of it, those pod markups would be better
    written as C<<< <<"HIC" >>> instead of C< <<"HIC" >, because
    the space you get around the string varies.  This:

	% ( echo "=head1 WITNESS" ; echo preamble 'I<<< <<"HIC":Latin1 >>>' postamble ) | pod2text
	WITNESS
	preamble *<<"HIC":Latin1* postamble

    is probably better than this:

	% ( echo "=head1 WITNESS" ; echo preamble 'I< <<"HIC":Latin1 >' postamble ) | pod2text
	WITNESS
	preamble * <<"HIC":Latin1 * postamble


