PHP Romania Community
 

 
Text_LanguageDetect

Detects the language of a given piece of text.

The package attempts to detect the language of a text sample by ranking the sample's trigrams (3-grams) by frequency and comparing that ranking against stored trigram frequency tables for known languages.

It implements a version of a technique originally proposed by Cavnar & Trenkle (1994): "N-Gram-Based Text Categorization".
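
To make the ranked-trigram idea concrete, here is a minimal sketch of the "out-of-place" comparison described by Cavnar & Trenkle. It is not the package's actual implementation: the function names `trigramProfile()` and `outOfPlaceDistance()`, the profile size of 300, the fixed penalty, and the tiny reference texts are all assumptions made for illustration. It also assumes the mbstring extension is available.

```php
<?php
// Illustrative sketch only; these helpers are hypothetical and are not
// part of Text_LanguageDetect.

/**
 * Build a ranked trigram profile: trigram => rank (0 = most frequent).
 */
function trigramProfile(string $text, int $maxSize = 300): array
{
    // Lowercase, then collapse every run of non-letters into one space.
    $clean = preg_replace('/[^\p{L}]+/u', ' ', mb_strtolower($text, 'UTF-8'));
    $clean = ' ' . trim($clean) . ' ';
    $chars = preg_split('//u', $clean, -1, PREG_SPLIT_NO_EMPTY);

    // Count every run of three consecutive characters.
    $counts = [];
    for ($i = 0, $n = count($chars) - 2; $i < $n; $i++) {
        $gram = $chars[$i] . $chars[$i + 1] . $chars[$i + 2];
        $counts[$gram] = ($counts[$gram] ?? 0) + 1;
    }

    // Most frequent first; ties are broken alphabetically so the
    // ranking is deterministic.
    $grams = array_keys($counts);
    usort($grams, function ($a, $b) use ($counts) {
        return [$counts[$b], $a] <=> [$counts[$a], $b];
    });

    return array_flip(array_slice($grams, 0, $maxSize));
}

/**
 * Sum of rank differences between two profiles. A trigram that is
 * missing from the reference profile costs a fixed penalty.
 * Lower distance means a closer match.
 */
function outOfPlaceDistance(array $sample, array $reference, int $maxPenalty = 300): int
{
    $distance = 0;
    foreach ($sample as $gram => $rank) {
        $distance += isset($reference[$gram])
            ? abs($rank - $reference[$gram])
            : $maxPenalty;
    }
    return $distance;
}

// Toy reference profiles built from one sentence each (assumption: a
// real table would be trained on far more text per language).
$references = [
    'english' => trigramProfile('This is a text in the English language. Let us see whether it is recognised.'),
    'german'  => trigramProfile('Dies ist ein Text in deutscher Sprache. Wir sehen, ob er erkannt wird.'),
];

$sample = trigramProfile('Hallo! Das ist ein Text in deutscher Sprache.');

$scores = [];
foreach ($references as $lang => $profile) {
    $scores[$lang] = outOfPlaceDistance($sample, $profile);
}
asort($scores); // smallest distance first

echo array_key_first($scores), "\n";
```

Note that this sketch produces a distance (lower is better), whereas the package reports similarity scores where the best match has the highest value, as the example output on this page shows.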

Example

<?php
require_once 'Text/LanguageDetect.php';
$l = new Text_LanguageDetect();

// List the languages the detector knows about.
echo "Supported languages:\n";
$langs = $l->getLanguages();
if (PEAR::isError($langs)) {
    die($langs->getMessage());
}
sort($langs);
echo implode(', ', $langs) . "\n\n";

// A German sample text to classify.
$text = <<<EOD
Hallo! Das ist ein Text in deutscher Sprache.
Mal sehen, ob die Klasse erkennt, welche Sprache das hier ist.
EOD;

// Ask for the four best-matching languages and their scores.
$result = $l->detect($text, 4);
if (PEAR::isError($result)) {
    echo $result->getMessage(), "\n";
} else {
    print_r($result);
}
?>

The above example would give the following output:

Supported languages:
albanian, arabic, azeri, bengali, bulgarian, cebuano, croatian,
czech, danish, dutch, english, estonian, farsi, finnish, french,
german, hausa, hawaiian, hindi, hungarian, icelandic, indonesian,
italian, kazakh, kyrgyz, latin, latvian, lithuanian, macedonian,
mongolian, nepali, norwegian, pashto, pidgin, polish, portuguese,
romanian, russian, serbian, slovak, slovene, somali, spanish,
swahili, swedish, tagalog, turkish, ukrainian, urdu, uzbek,
vietnamese, welsh

Array
(
    [german] => 0.407037037037
    [dutch] => 0.288065843621
    [english] => 0.283333333333
    [danish] => 0.234526748971
)
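
In the output above, the best match (german) carries the highest score, and the remaining candidates follow in descending order. The sketch below shows one way such a result array could be reduced to a single answer plus the margin over the runner-up. The helper `summarizeDetection()` is hypothetical and not part of Text_LanguageDetect; the input is simply the array printed above.

```php
<?php
// Hypothetical helper, not part of Text_LanguageDetect: pick the
// top-scoring language and report how far ahead of the runner-up it is.
function summarizeDetection(array $scores): ?array
{
    if (count($scores) === 0) {
        return null; // nothing was detected
    }

    arsort($scores); // highest score first
    $langs    = array_keys($scores);
    $best     = $langs[0];
    $runnerUp = $langs[1] ?? null;

    return [
        'language' => $best,
        'score'    => $scores[$best],
        'margin'   => $runnerUp === null
            ? null
            : $scores[$best] - $scores[$runnerUp],
    ];
}

// The result array from the example output.
$result = [
    'german'  => 0.407037037037,
    'dutch'   => 0.288065843621,
    'english' => 0.283333333333,
    'danish'  => 0.234526748971,
];

$summary = summarizeDetection($result);
printf(
    "%s (score %.3f, margin %.3f)\n",
    $summary['language'],
    $summary['score'],
    $summary['margin']
);
// prints "german (score 0.407, margin 0.119)"
```

A small margin suggests the sample is short or ambiguous, so a caller might choose to treat such a result as undecided.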

Copyright © 2001-2008 PHP Romania