Web студия "GrandView"
  Главная   Написать Контакты
   
   
О проекте
Руководство php
 

mb_detect_encoding

(PHP 4 >= 4.0.6, PHP 5)

mb_detect_encoding -- Detect character encoding

Description

string mb_detect_encoding ( string str [, mixed encoding_list [, bool strict]] )

mb_detect_encoding() detects character encoding in string str. It returns detected character encoding.

encoding_list is list of character encoding. Encoding order may be specified by array or comma separated list string.

If encoding_list is omitted, detect_order is used.

strict specifies whether to use the strict encoding detection or not. Default is FALSE.

Пример 1. mb_detect_encoding() example

<?php
/* Detect character encoding with current detect_order */
echo mb_detect_encoding($str);

/* "auto" is expanded to "ASCII,JIS,UTF-8,EUC-JP,SJIS" */
echo mb_detect_encoding($str, "auto");

/* Specify encoding_list character encoding by comma separated list */
echo mb_detect_encoding($str, "JIS, eucjp-win, sjis-win");

/* Use array to specify encoding_list  */
$ary[] = "ASCII";
$ary[] = "JIS";
$ary[] = "EUC-JP";
echo
mb_detect_encoding($str, $ary);
?>

See also mb_detect_order().



mb_detect_order> <mb_decode_numericentity
Last updated: Sat, 27 Jan 2007
 
add a note add a note User Contributed Notes
mb_detect_encoding
mark at kinoko dot fr
12-Oct-2007 07:56
For: rl at itfigures dot nl

Just note that your Euro symbol being \x80 is NOT standard for ISO-8859-1 or ISO-8859-15 as \x80 is a reserved character.

It is however "common practice" for windows developpers to mix windows-1252 and ISO-8859-1. Just convert to windows-1252 instead of ISO-8859-1 and you'll get your € symbol at the right place.
rl at itfigures dot nl
04-Sep-2007 02:00
I used Chris's function "detectUTF8" to detect the need from conversion from utf8 to 8859-1, which works fine. I did have a problem with the following iconv-conversion.

The problem is that the iconv-conversion to 8859-1 (with //TRANSLIT) replaces the euro-sign with EUR, although it is common practice  that \x80 is used as the euro-sign in the 8859-1 charset.

I could not use 8859-15 since that mangled some other characters, so I added 2 str_replace's:

if(detectUTF8($str)){
  $str=str_replace("\xE2\x82\xAC","&euro;",$str);
  $str=iconv("UTF-8","ISO-8859-1//TRANSLIT",$str);
  $str=str_replace("&euro;","\x80",$str);
}

If html-output is needed the last line is not necessary (and even unwanted).
sunggsun
15-Aug-2006 12:26
from PHPDIG

    function isUTF8($str) {
        if ($str === mb_convert_encoding(mb_convert_encoding($str, "UTF-32", "UTF-8"), "UTF-8", "UTF-32")) {
            return true;
        } else {
            return false;
        }
    }
chris AT w3style.co DOT uk
03-Aug-2006 02:22
Based upon that snippet below using preg_match() I needed something faster and less specific.  That function works and is brilliant but it scans the entire strings and checks that it conforms to UTF-8.  I wanted something purely to check if a string contains UTF-8 characters so that I could switch character encoding from iso-8859-1 to utf-8.

I modified the pattern to only look for non-ascii multibyte sequences in the UTF-8 range and also to stop once it finds at least one multibytes string.  This is quite a lot faster.

<?php

function detectUTF8($string)
{
        return
preg_match('%(?:
        [\xC2-\xDF][\x80-\xBF]        # non-overlong 2-byte
        |\xE0[\xA0-\xBF][\x80-\xBF]               # excluding overlongs
        |[\xE1-\xEC\xEE\xEF][\x80-\xBF]{2}      # straight 3-byte
        |\xED[\x80-\x9F][\x80-\xBF]               # excluding surrogates
        |\xF0[\x90-\xBF][\x80-\xBF]{2}    # planes 1-3
        |[\xF1-\xF3][\x80-\xBF]{3}                  # planes 4-15
        |\xF4[\x80-\x8F][\x80-\xBF]{2}    # plane 16
        )+%xs'
, $string);
}

?>
telemach
27-Jul-2005 06:48
beware : even if you need to distinguish between UTF-8 and ISO-8859-1, and you the following detection order (as chrigu suggests)

mb_detect_encoding('accentu
Новости
11 июля 2007
Сайт запущен
© 2007 info@grandviewstudio.com

Deprecated: Function set_magic_quotes_runtime() is deprecated in /home/sites/grandviewstudiocom/www/65f67d67a94ad980786580ae69e11c07/sape.php on line 324

Deprecated: Function set_magic_quotes_runtime() is deprecated in /home/sites/grandviewstudiocom/www/65f67d67a94ad980786580ae69e11c07/sape.php on line 330
Z058440144362 Z348613067571