blueshoes php application framework and cms            plugins_onomastics

Class: Bs_Om_OnomasticsServer

Source Location: /plugins/onomastics/Bs_Om_OnomasticsServer.class.php

Class Overview

Bs_Object
   |
   --Bs_Om_OnomasticsServer

problems we face: o) typo: simth instead of smith.


Author(s):

Version:

  • 4.5.$Revision: 1.3 $ $Date: 2003/11/08 23:16:26 $

Copyright:

  • blueshoes.org


Inherited Methods

Class: Bs_Object

Bs_Object::Bs_Object()
Bs_Object::getErrors()
Basic error handling: Get *all* errors as string array from the global Bs_Error-error stack.
Bs_Object::getLastError()
Basic error handling: Get last error string from the global Bs_Error-error stack.
Bs_Object::getLastErrors()
Basic error handling: Get last errors string array from the global Bs_Error-error stack since the last call of getLastErrors().
Bs_Object::persist()
Persists this object by serializing it and saving it to a file with unique name.
Bs_Object::setError()
Basic error handling: Push an error string on the global Bs_Error-error stack.
Bs_Object::toHtml()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::toString()
Dumps the content of this object to a string using PHP's var_dump().
Bs_Object::unpersist()
Fetches an object that was persisted with persist()

Class Details

[line 91]
problems we face:

o) typos: simth instead of smith.
o) foreign characters: mueller or müller, juerg or jürg. note that some systems use juerg instead of jürg because of character sets, and some people want their name written one way or the other. if your name is written "juerg müller" in your passport, you can't go and open a bank account as "jürg müller".
o) nicknames: peggy for margaret, bill for william.
o) short forms: b. gates for william gates (ouch!).
o) localization: andreas from germany, andré from france, andrei/andrej from russia and andrea from italy are the same name, just translations. once they move to the us they all become andrew, because americans don't distinguish between them. mr. hundertwasser may also become hundertwater or so. the italian 'andrea' poses another problem, because andrea exists in germany too, but is feminine there.
o) name families: adelheide, adelheidi. especially nowadays it's cool to invent new names, or at least new spellings. the americans might be different, having michael as #1 baby name since 1964 *shakes head*. in this example, heidi may be a separate member of the name family or a nickname; both are possible.
o) titles and qualifiers: dr. prof. ph.d. nat. ... may be written as "dr. smith" or "smith, dr."
o) prefixes: Di Caprio, D'Agostino, Al Afif, Mc Donald, ...
o) suffixes: not sure exactly what counts as one.
o) married: "peggy smith" becomes "margaret johnson". bingo. she may as well become "peggy johnson-smith" or so, which leaves us a chance.
o) write form: john peter, johnpeter, john-peter
o) localization again: when french was the world language #1, it was used to translate names from russian etc. into 'european'. a real example is a person who was named Lejnine and has now become Lezhnin (with passport change and everything).
o) gender change: mister thomas smith becomes miss emelie smith.
o) with spanish names and fulltext indexing we should be stricter, because everyone is called jose maria fernandez garcia gonzalez rodriguez. you'll find a match for one of those in nearly every name.

?Marialucrezia?

rtfm:
  american name society, to promote onomastics - http://www.wtsn.binghamton.edu/ANS/
  american census (2000) - http://www.census.gov/ and http://www.census.gov/genealogy/www/namesearch.html
  most common last names in the us - http://www.census.gov/genealogy/names/dist.all.last

name lists:
  http://www.zoope.com/about/about_names.html
  http://www.ssa.gov/OACT/NOTES/note139/1997/note139.html

dependencies: Bs_String




Tags:

version:  4.5.$Revision: 1.3 $ $Date: 2003/11/08 23:16:26 $
copyright:  blueshoes.org
author:  andrej arn <at blueshoes dot org>




Class Variables

$Bs_String

[line 99]

reference to global pseudostatic Bs_String utility class.



Tags:

access:  public

Type:   object



$prefix = array("o'", 'de la', "de l'", "d'", 'd', 'de', 'di', 'dos', 'la', 'le', 'abdul', 'abd', 'al', 'von', 'van', 'vi', 'dal', 'da', 'del', 'lo', 'mc', 'mac', 'ait', 'aid', 'el', 'ed', 'ben', 'y')

[line 162]

includes 'infixes'.

todo: what's "vde."?
todo: "De Sa Costa Pereira" => what is "Sa"?

examples:
  fitz  => fitzgerald (actually that's for first names, and should not be removed anyway; ignored here. seen once in a lastname: "David Fitzjarrell" (usa).)
  o'    => o'brian, o'neill
  de la => de la croix, de la Rosa Fernandez
  de l' => De L'Endroit
  d'    => d'agostino
  d     => D Angelo (the same as d'Angelo, with a space instead of the apostrophe.)
  de    => De Marinis
  di    => Di Battista
  dos   => Dos Santos Ferreira
  la    => La Rossa
  le    =>
  abd   => short form of abdul, Arabic Abd Allah, "slave of Allah"
  abdul => note that abdul is also a real first name.
  al    => Al Afif, Al Atassi
  von   => von Gunten (german)
  van   => Van Eijk (dutch)
  vi    => vi trinh (some asian language)
  dal   => Dal Cero
  da    => Da Rugna
  del   => Del Favero
  lo    => Lo Bresti, Lo Monaco
  mc    => mc donald
  mac   => mac donalds :)
  ait   => Ait Chidid
  aid   => same as ait
  el    => El Asri (el == al)
  ed    => Ed Dalya
  ben   => Ben Belgacem (ben means 'son of'.)
  y     => spanish for 'and' (not sure if used in first- or lastnames.)



Type:   mixed



$pronounce = array()

[line 182]

N'Diaye => Ndiaye => N diaye
M'Hamdi => Mhamdi => M Hamdi
the problem is the pronunciation: a vowel (a, e, i, o, u) is missing after the first letter.


Type:   mixed
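
the variant chains above can be sketched as follows. this is a minimal illustration only: apostropheVariants() is a hypothetical helper, not a method of this class, and the rule it applies (one consonant, an apostrophe, then another consonant) is an assumption derived from the two examples. the spaced variant keeps the original capitalization.

```php
<?php
// hypothetical helper, not part of Bs_Om_OnomasticsServer.
// builds spelling variants for names like "N'Diaye", where an apostrophe
// stands between two consonants (the "missing vowel" case).
function apostropheVariants(string $name): array
{
    $consonant = '[b-df-hj-np-tv-z]';
    $pattern   = '/^(' . $consonant . ')\'(' . $consonant . '.*)$/i';
    if (!preg_match($pattern, $name, $m)) {
        return array($name); // not this kind of name, e.g. "D'Agostino"
    }
    return array(
        $name,                      // N'Diaye
        $m[1] . strtolower($m[2]),  // Ndiaye
        $m[1] . ' ' . $m[2],        // N Diaye
    );
}

print_r(apostropheVariants("M'Hamdi"));
```

names such as "D'Agostino" are left alone here because a vowel follows the apostrophe; those are covered by the "d'" entry in $prefix instead.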



$qualifier = array('jr', 'sr', 'mr', 'ms', 'miss', 'fils', 'neto', 'sobrinho', 'ph.d', 'nat')

[line 173]

jr   => junior
sr   => senior
mr   => mister
ms   => miss
miss => miss
fils, neto, sobrinho, ph.d, nat



Type:   mixed



$suffix = array('aldin' , 'oglu', 'skii', 'skaya')

[line 123]

vector of known lastname suffixes.


Type:   array



$title = array('dr', 'rev', 'haj', 'sri', 'col')

[line 116]


Type:   array





Class Methods


constructor Bs_Om_OnomasticsServer [line 218]

Bs_Om_OnomasticsServer Bs_Om_OnomasticsServer( )

constructor




method addRelation [line 1173]

bool addRelation( int $relationTypeID, int $first_FirstnameID, int $second_FirstnameID, [int $third_FirstnameID = 0])

adds a relation for firstnames.



Tags:

return:  TRUE
throws:  bs_exception
access:  public


Parameters:

int   $relationTypeID  
int   $first_FirstnameID  
int   $second_FirstnameID  
int   $third_FirstnameID   (not used, right?)


method calcSimilarityFirstname [line 634]

int calcSimilarityFirstname( string $nameOne, string $nameTwo, [int $pointsPercent = 100])

calculates the similarities of two firstnames.

examples:
  calcSimilarityFirstname('john', 'John')       => 100 (equal)
  calcSimilarityFirstname('rené', 'rene')       => 98 (equal after normalizing)
  calcSimilarityFirstname('jürg', 'juerg')      => 98 (equal after normalizing)
  calcSimilarityFirstname('hansjürg', 'hans')   => 80 (part of, starts with, ends with)
  calcSimilarityFirstname('hansjuerg', 'juerg') => 73 (part of, starts with, ends with (after normalizing))
  calcSimilarityFirstname('john', 'hans')       => 70 (some relation exists: short form, translation, ...)
  calcSimilarityFirstname('william', 'bill')    => 70 (some relation exists: short form, translation, ...)
  (string compare)




Tags:

return:  (0 - 100, 100 being an exact match.)
see:  Bs_Om_OnomasticsServer::calcSimilarityFirstnameMultiple(), Bs_Om_OnomasticsServer::calcSimilarityLastname()
access:  public


Parameters:

string   $nameOne  
string   $nameTwo  
int   $pointsPercent  
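
the first two tiers of the examples above (equal, and equal after normalizing) can be sketched like this. normalizeName() and similarityTier() are hypothetical names, the character map is an assumption and far from complete, and the lower tiers (80, 73, 70) are deliberately left out because the docs don't say how they are computed. the file is assumed to be saved as UTF-8.

```php
<?php
// hypothetical helpers; the real scoring lives in calcSimilarityFirstname().

// folds case and a few accented letters. the mapping table is an assumption.
function normalizeName(string $name): string
{
    $map = array(
        'ä' => 'ae', 'ö' => 'oe', 'ü' => 'ue',
        'Ä' => 'ae', 'Ö' => 'oe', 'Ü' => 'ue',
        'é' => 'e',  'è' => 'e',  'É' => 'e',  'È' => 'e',
    );
    // strtr() with an array replaces whole substrings, so multi-byte UTF-8
    // keys are safe; strtolower() then only has ASCII letters left to fold.
    return strtolower(strtr(trim($name), $map));
}

// returns 100 or 98 for the two tiers covered here, NULL for everything else.
function similarityTier(string $one, string $two): ?int
{
    if (strcasecmp($one, $two) === 0) {
        return 100; // equal, ignoring ASCII case
    }
    if (normalizeName($one) === normalizeName($two)) {
        return 98;  // equal after normalizing
    }
    return null;    // partial matches and name relations are out of scope
}

var_dump(similarityTier('jürg', 'juerg'));
```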


method calcSimilarityFirstnameMultiple [line 951]

void calcSimilarityFirstnameMultiple( mixed $nameOne, mixed $nameTwo)




method calcSimilarityLastname [line 823]

int calcSimilarityLastname( string $nameOne, string $nameTwo, [int $pointsPercent = 100])



Tags:

return:  (0 - 100, 100 being an exact match.)
see:  Bs_Om_OnomasticsServer::calcSimilarityLastnameMultiple(), Bs_Om_OnomasticsServer::calcSimilarityFirstname()
access:  public


Parameters:

string   $nameOne  
string   $nameTwo  
int   $pointsPercent  


method calcSimilarityLastnameMultiple [line 993]

void calcSimilarityLastnameMultiple( mixed $nameOne, mixed $nameTwo)




method cleanLastname [line 311]

string cleanLastname( string $lastname)

cleans the given lastname and returns it.

removes all sorts of titles, suffixes, prefixes, qualifiers, ...

examples: "von roll" => "roll"




Tags:

access:  public


Parameters:

string   $lastname  
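
a rough sketch of the prefix part of this cleanup, assuming prefixes are matched longest first and that the result is lowercased as in the "von roll" example. stripLastnamePrefixes() is a hypothetical function, not the code of cleanLastname(), and it ignores titles, suffixes and qualifiers.

```php
<?php
// hypothetical sketch, not the code of cleanLastname().
// strips leading prefixes, longest first, so that 'de la' wins over 'de'.
function stripLastnamePrefixes(string $lastname, array $prefixes): string
{
    usort($prefixes, function ($a, $b) {
        return strlen($b) - strlen($a); // longest prefix first
    });
    $name = strtolower(trim($lastname));
    do {
        $changed = false;
        foreach ($prefixes as $prefix) {
            // prefixes ending in an apostrophe attach directly (d'agostino);
            // all others must be followed by a space (von roll).
            $needle = (substr($prefix, -1) === "'") ? $prefix : $prefix . ' ';
            if (strncmp($name, $needle, strlen($needle)) === 0
                && strlen($name) > strlen($needle)) {
                $name    = ltrim(substr($name, strlen($needle)));
                $changed = true;
                break; // start over with the shortened name
            }
        }
    } while ($changed);
    return $name;
}

// a subset of the $prefix list documented above.
$prefixes = array("o'", 'de la', "de l'", "d'", 'de', 'la', 'von', 'van');
echo stripLastnamePrefixes('von roll', $prefixes), "\n";    // roll
echo stripLastnamePrefixes('De La Croix', $prefixes), "\n"; // croix
```

the length check keeps a name that consists of nothing but a prefix from being reduced to an empty string.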


method deleteRelation [line 1195]

bool deleteRelation( int $relationTypeID, int $first_FirstnameID, int $second_FirstnameID)

removes a relation for firstnames.



Tags:

return:  TRUE
throws:  bs_exception
access:  public


Parameters:

int   $relationTypeID  
int   $first_FirstnameID  
int   $second_FirstnameID  


method findFirstname [line 275]

array findFirstname( string $name, [int $gender = 0], [bool $strict = FALSE], [bool $returnData = FALSE])

finds firstname records for the given $name and $gender.

param $returnData:
  FALSE = return a vector with IDs only.
  TRUE  = return a hash where the key is the ID and the value is a hash with the key/value pairs of the fields.




Tags:

return:  (see param $returnData. 0-n records may be found.)
access:  public


Parameters:

string   $name  
int   $gender   (1=m, 2=f, numeric string is ok. 0 and everything else means unknown.)
bool   $strict   (if FALSE (default) then rené will match rene etc.)
bool   $returnData   (default is FALSE, see above.)
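
to illustrate the two return shapes and the $gender handling, here is a sketch against an in-memory array instead of the database. everything in it is hypothetical: the function findFirstnameSketch(), the field names 'caption' and 'gender', and the choice to treat an unknown gender as "no filter".

```php
<?php
// hypothetical in-memory stand-in for the firstname table; the field names
// are assumptions, not the real schema.
$firstnameTable = array(
    17 => array('caption' => 'rene',   'gender' => 1),
    42 => array('caption' => 'andrea', 'gender' => 2),
    43 => array('caption' => 'andrea', 'gender' => 1),
);

function findFirstnameSketch(array $table, string $name, $gender = 0, bool $returnData = false): array
{
    // 1 = m, 2 = f, numeric strings are ok; anything else means unknown,
    // which this sketch treats as "do not filter by gender".
    $gender = (int) $gender;
    if ($gender !== 1 && $gender !== 2) {
        $gender = 0;
    }
    $found = array();
    foreach ($table as $id => $row) {
        if ($row['caption'] !== $name) {
            continue;
        }
        if ($gender !== 0 && $row['gender'] !== $gender) {
            continue;
        }
        $found[$id] = $row;
    }
    // hash keyed by ID when $returnData is TRUE, vector of IDs otherwise.
    return $returnData ? $found : array_keys($found);
}

print_r(findFirstnameSketch($firstnameTable, 'andrea'));          // IDs only
print_r(findFirstnameSketch($firstnameTable, 'andrea', 1, true)); // ID => fields
```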


method findRelations [line 1069]

void findRelations( int $forID, array &$relationWordIDs, array &$relationIDs, array &$allWordIDs)

finds the relations for the given word.

more details about the params:
  $forID is the record ID of the word you want.
  $relationWordIDs is a vector holding hashes with the keys wordOneID, wordTwoID and relationTypeID.
  $relationIDs is a vector with the record IDs of the relation table.
  $allWordIDs is a vector with all word record IDs (except the given $forID, i think). yes, we have that data in $relationWordIDs as well; this is used internally, as an optimization.




Tags:

return:  (nothing; the results come back through the by-reference params.)
see:  Bs_Om_OnomasticsServer::findRelationsLimit()
access:  public


Parameters:

int   $forID   (word id)
array   &$relationWordIDs  
array   &$relationIDs  
array   &$allWordIDs  


method findRelationsLimit [line 1123]

void findRelationsLimit( mixed $forID, array &$relationWordIDs, array &$relationIDs, array &$allWordIDs, [int $limit = 2])

finds the relations for the given word - and follows the found words to $limit relations.

example:
  $relationWordIDs = array();
  $relationIDs     = array();
  $allWords        = array($yourOriginalWordID);
  $Bs_Om_OnomasticsServer->findRelationsLimit($yourOriginalWordID, $relationWordIDs, $relationIDs, $allWords, 2);
  // now you could, for example, fetch details for the names found.
  // the loop walks the keys of $allWords.
  foreach (array_keys($allWords) as $wordID) {
    $sql = "SELECT * from BsOnomastics.Firstname where ID = {$wordID}";
    $allWords[$wordID] = $this->_bsDb->getRow($sql);
  }




Tags:

return:  (nothing; the results come back through the by-reference params.)
see:  Bs_Om_OnomasticsServer::findRelations()
access:  public


Parameters:

mixed   $forID   (int word id or vector filled with int word ids.)
array   &$relationWordIDs  
array   &$relationIDs  
array   &$allWordIDs  
int   $limit   (how many times to follow, default is 2.)
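
the "follow the found words" idea can be sketched without a database as a breadth-first walk over a relation list. followRelations() and the sample rows are hypothetical, and reading $limit as "number of hops" is an assumption. unlike the real method, the sketch returns its results instead of filling by-reference params.

```php
<?php
// hypothetical in-memory relation table, keyed by relation record ID.
$relations = array(
    1 => array('wordOneID' => 10, 'wordTwoID' => 11, 'relationTypeID' => 1),
    2 => array('wordOneID' => 11, 'wordTwoID' => 12, 'relationTypeID' => 2),
    3 => array('wordOneID' => 12, 'wordTwoID' => 13, 'relationTypeID' => 1),
);

function followRelations(array $relations, int $forID, int $limit = 2): array
{
    $seen            = array($forID => true); // word IDs already visited
    $frontier        = array($forID);         // word IDs to expand this round
    $usedRelationIDs = array();

    for ($round = 0; $round < $limit && $frontier; $round++) {
        $next = array();
        foreach ($relations as $relationID => $row) {
            foreach ($frontier as $wordID) {
                if ($row['wordOneID'] === $wordID) {
                    $other = $row['wordTwoID'];
                } elseif ($row['wordTwoID'] === $wordID) {
                    $other = $row['wordOneID'];
                } else {
                    continue; // relation does not touch this word
                }
                $usedRelationIDs[$relationID] = true;
                if (!isset($seen[$other])) {
                    $seen[$other] = true;
                    $next[]       = $other;
                }
            }
        }
        $frontier = $next;
    }
    unset($seen[$forID]); // report only the related words, not the start word
    return array(
        'wordIDs'     => array_keys($seen),
        'relationIDs' => array_keys($usedRelationIDs),
    );
}

print_r(followRelations($relations, 10, 2));
```

with a limit of 2 and starting from word 10, the walk reaches words 11 and 12 through relations 1 and 2, and stops before word 13.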


method getGender [line 421]

int getGender( string $firstname, [bool $strict = FALSE], [bool $returnPlainText = FALSE])

returns the gender for the given firstname.

returned value:
  -2 = female
  -1 = more female
   0 = both
   1 = more male
   2 = male

examples:
  'richard'    => 2
  'andrea'     => 0 (male in italy, but mostly female in germany/switzerland/...)
  'sam'        => 1 (can be both, eg sam as short form for samantha, but it's rather male.)
  'sarah'      => -2
  'ztztzt'     => bool FALSE
  'john-peter' => 2
  'john peter' => 2
  'anna peter' => -2 (only the first name is used for the lookup)




Tags:

return:  (see above)
throws:  bool FALSE (can't tell)
access:  public


Parameters:

string   $firstname  
bool   $strict   (whether the names are compared strictly, i.e. whether 'rené' and 'rene' count as different names.)
bool   $returnPlainText   (if set to TRUE then the return value will be a string, like "more female".)
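
a small sketch of how the numeric codes could map to plain text, and of the "only the first name is used" rule. genderLabel() and firstPart() are hypothetical helpers. the label wording follows the value list above, but only "more female" is confirmed as an actual plain-text return value, and the "can't tell" label for FALSE is an assumption.

```php
<?php
// hypothetical mapping from getGender() codes to text.
function genderLabel($code): string
{
    $labels = array(
        -2 => 'female',
        -1 => 'more female',
         0 => 'both',
         1 => 'more male',
         2 => 'male',
    );
    // FALSE must be checked first: as an array key it would be cast to 0.
    if ($code === false || !isset($labels[$code])) {
        return "can't tell";
    }
    return $labels[$code];
}

// returns the first part of a compound firstname ('john-peter', 'anna peter').
function firstPart(string $firstname): string
{
    $parts = preg_split('/[\s-]+/', trim($firstname));
    return $parts[0];
}

echo genderLabel(-1), "\n";
echo firstPart('anna peter'), "\n";
```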


method getNicknames [line 363]

array getNicknames( string $firstname)

returns the nicknames found for the given firstname.



Tags:

return:  (vector, may be empty)
todo:  code
access:  public


Parameters:

string   $firstname  


method getTranslations [line 389]

array getTranslations( string $firstname, [mixed $toLang = null], [bool $strict = FALSE])

returns the name translations found for the given firstname.



Tags:

return:  (hash, may be empty)
todo:  code
access:  public


Parameters:

string   $firstname  
mixed   $toLang   (default is null, which means any language)
bool   $strict   (if the names have to match exactly, or if 'rené' => 'rene' etc is allowed as well.)


method getVariations [line 375]

array getVariations( string $firstname)

returns the name variants found for the given firstname.



Tags:

return:  (vector, may be empty)
todo:  code
access:  public


Parameters:

string   $firstname  


method isOrderOk [line 545]

int isOrderOk( string $firstname, string $lastname)

given firstname and lastname, this method tells if the order of those names is ok. maybe the firstname/lastname fields were mixed, and now we have "smith john" instead of "john smith".

returned value:

  1. = looks wrong
  2. = can't tell
  3. = looks ok
this method uses the firstname database table to come to a result: if the given firstname is found in the firstname table and the lastname is not, we return 'looks ok', and so on. the lastname table is now used as well, although it does not hold many records yet.




Tags:

return:  (see above)
access:  public


Parameters:

string   $firstname  
string   $lastname  
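
the table-lookup reasoning described above could look roughly like this. orderVerdict() is a hypothetical function that works on in-memory lists instead of the database tables, and it returns the verdict as text rather than as the numeric code of isOrderOk(). the simple evidence counting is an assumption.

```php
<?php
// hypothetical sketch using in-memory lists instead of the database tables.
function orderVerdict(string $firstname, string $lastname, array $knownFirstnames, array $knownLastnames): string
{
    $first = strtolower(trim($firstname));
    $last  = strtolower(trim($lastname));

    // evidence that the fields are in the right order ...
    $okScore = (int) in_array($first, $knownFirstnames, true)
             + (int) in_array($last, $knownLastnames, true);
    // ... and evidence that they were swapped.
    $swappedScore = (int) in_array($last, $knownFirstnames, true)
                  + (int) in_array($first, $knownLastnames, true);

    if ($okScore > $swappedScore) {
        return 'looks ok';
    }
    if ($swappedScore > $okScore) {
        return 'looks wrong';
    }
    return "can't tell";
}

$firstnames = array('john', 'peter');
$lastnames  = array('smith');
echo orderVerdict('smith', 'john', $firstnames, $lastnames), "\n";
```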


method setDbByDsn [line 248]

bool setDbByDsn( array $dsn)

sets the db object.



Tags:

return:  TRUE
since:  bs4.3
see:  $this->setDbByObj(), var $this->_bsDb
throws:  bs_exception
access:  public


Parameters:

array   $dsn  


method setDbByObj [line 233]

void setDbByObj( object &$bsDb)

sets the db object.



Tags:

since:  bs4.3
see:  $this->setDbByDsn(), var $this->_bsDb
access:  public


Parameters:

object   &$bsDb  


method translateFirstname [line 609]

mixed translateFirstname( string $firstname, string $fromLang, [string $toLang = null])

translates the given firstname from $fromLang to $toLang.

if $toLang is not given (or null) then it will be translated to all available languages.




Tags:

return:  (string or hash, depending on whether $toLang is given.)
todo:  code
throws:  bool FALSE (unknown firstname, no translation ...)
access:  public


Parameters:

string   $firstname  
string   $fromLang   (country or language iso code)
string   $toLang   (country or language iso code)



Documentation generated on Mon, 29 Dec 2003 21:12:22 +0100 by phpDocumentor 1.2.3