Thread: note 78044 added to function.wordwrap




note 78044 added to function.wordwrap
user name
2007-09-25 09:04:40
wordwrap() counts bytes rather than characters, so it does not wrap
multibyte (UTF-8) strings correctly. I wrote mb_wordwrap(), which
counts UTF-8 characters instead:

function mb_wordwrap($str, $width = 70, $break = "\n", $cut = false)
{
	$return = '';
	$str_bytes = strlen($str);
	$first_char = true;
	
	$current_line = '';
	$current_line_char_count = 0;
	$current_word = '';
	$current_word_char_count = 0;
	
	for ($i=0; $i < $str_bytes; $i++)
	{
		//get the next char (unicode or ascii)
		$char = $str[$i];
		$h = ord($char);
		if ($h <= 0x7F) 
		{ $char_code = $h; } 
		else if ($h < 0xC2) 
		{ $char_code = false; } 
		else if ($h <= 0xDF) 
		{ 
			$c2 = $str[++$i];
			$char .= $c2;
			$char_code = ($h & 0x1F) << 6 | (ord($c2) & 0x3F);
		} 
		else if ($h <= 0xEF) 
		{ 
			$c2 = $str[++$i];
			$c3 = $str[++$i];
			$char .= $c2.$c3;
			$char_code = ($h & 0x0F) << 12 | (ord($c2) & 0x3F) << 6 | (ord($c3) & 0x3F);
		} 
		else if ($h <= 0xF4) 
		{ 
			$c2 = $str[++$i];
			$c3 = $str[++$i];
			$c4 = $str[++$i];
			$char .= $c2.$c3.$c4;
			//a 4-byte lead byte (11110xxx) carries only 3 payload bits
			$char_code = ($h & 0x07) << 18 | (ord($c2) & 0x3F) << 12 | (ord($c3) & 0x3F) << 6 | (ord($c4) & 0x3F);
		} 
		else 
		{ 
			//unrecognized char, skip it
			continue; 
		}
						
		//if it's a space, new word commencing
		if ($char_code == 32)
		{
			//if line is too long, linebreak time!
			if ($current_line_char_count + $current_word_char_count >= $width)
			{
				if ($current_line_char_count)
				{ $return .= $current_line.$break; }
				
				//reset the current line
				$current_line = $current_word;
				$current_line_char_count = $current_word_char_count;
			}
			else
			{
				//include a space at the front of the word if this isn't the first char,
				//since we assume there was a space prior to every word except the first
				$current_line .= ($first_char ? '' : ' ').$current_word;

				$current_line_char_count += $current_word_char_count + ($first_char ? 0 : 1);
			}
			
			$current_word = '';
			$current_word_char_count = 0;
			
			$first_char = false;
		}
		//if it's a char, add it to the word
		else
		{ 
			if ($cut)
			{
				//check if this word is too long. if it is, slice it.
				if ($current_word_char_count >= $width)
				{
					//clear the current line and word to the return value
					if ($current_line_char_count)
					{ $return .= $current_line.$break; }
					
					$current_line = $current_word;
					$current_line_char_count = $current_word_char_count;
					
					$current_word = '';
					$current_word_char_count = 0;
				}
			}
		
			$current_word .= $char; 
			$current_word_char_count++;
		}
	}
	
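	//check for leftovers and add them to the string
	if ($current_line_char_count && $current_word_char_count)
	{
		//same test as in the loop: break if the last word doesn't fit on the line
		$fits = $current_line_char_count + $current_word_char_count < $width;
		$return .= $current_line.($fits ? ' ' : $break).$current_word;
	}
	else
	{ $return .= $current_line.$current_word; }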
	
	return $return;
}
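
For reference, the per-character decoding step in the loop above can be sketched on its own. This is only an illustration, not part of the submitted note: the helper name utf8_char_at() and its return shape are made up here, and it assumes the input is meant to be UTF-8. Like the loop above, it does not check that the trailing bytes are valid continuation bytes.

```php
<?php
// Hypothetical helper (not from the note): decode the UTF-8 sequence that
// starts at byte offset $i. Returns [bytes, code point, byte length], or
// null if the lead byte is invalid or the sequence runs past the string end.
function utf8_char_at(string $str, int $i): ?array
{
    $h = ord($str[$i]);

    // 0x00-0x7F: plain ASCII, one byte
    if ($h <= 0x7F) {
        return [$str[$i], $h, 1];
    }

    // 0x80-0xC1 and 0xF5-0xFF can never start a UTF-8 sequence
    if ($h < 0xC2 || $h > 0xF4) {
        return null;
    }

    // Sequence length follows from the lead byte range
    $len = $h <= 0xDF ? 2 : ($h <= 0xEF ? 3 : 4);
    if ($i + $len > strlen($str)) {
        return null; // truncated sequence
    }

    // The lead byte carries 5, 4 or 3 payload bits (masks 0x1F, 0x0F, 0x07)
    $code = $h & (0xFF >> ($len + 1));

    // Each following byte contributes its low 6 bits
    for ($k = 1; $k < $len; $k++) {
        $code = ($code << 6) | (ord($str[$i + $k]) & 0x3F);
    }

    return [substr($str, $i, $len), $code, $len];
}

// Usage: walk a string character by character instead of byte by byte.
$s = "a\xE2\x82\xAC"; // "a" followed by the euro sign (3 bytes)
for ($i = 0; $i < strlen($s); $i += $len) {
    $decoded = utf8_char_at($s, $i);
    if ($decoded === null) {
        $len = 1; // skip a bad byte
        continue;
    }
    [$char, $code, $len] = $decoded;
    printf("U+%04X takes %d byte(s)\n", $code, $len);
}
```

Advancing by the returned length keeps the byte offset and the character count separate, which is what a width check in characters needs.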
----
Server IP: 216.235.15.211
Probable Submitter: 67.68.235.146
----
Manual Page -- http://www.php.net/manual/en/function.wordwrap.php

-- 
PHP Notes Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php


note 78044 rejected from function.wordwrap by didou
user name
2007-10-08 08:11:20
Note Submitter: djneoform at gmail dot com 


