Fast and Effective Searches of Personal Names in an International Environment
Dr. Sandeep Gupta, Arun Pratap Srivastava, Dr. Shashank Awasthi
Abstract
Fast and effective search of personal names in an international environment draws on approximate string matching and applies it to the special case of finding names that are ‘close’ or ‘similar’ to an input name in a large database of names. Such proper-name approximate matching is useful wherever a user is unsure how a person’s name is spelled, for example in a telephone directory search system, or in a library search system where a user wants to find books by an author’s name. In this paper we examine the problem from two main aspects: how to organize the data efficiently so that relevant results are obtained quickly, and how to develop search techniques that rank those results appropriately. We suggest four new data organization techniques to replace the current standard technique, Soundex, and we suggest refinements to the currently available search techniques. We then assess the performance of the developed techniques and compare them against the currently available ones.
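To make the two aspects concrete, the following is a minimal sketch of the baseline the abstract refers to, not the techniques proposed in the paper. It assumes standard American Soundex as the data organization step (names are bucketed in a hash table keyed by their Soundex code) and plain Levenshtein edit distance as the ranking step. The function names `soundex`, `edit_distance`, `build_index` and `search`, and the sample names, are illustrative assumptions.

```python
from collections import defaultdict

# Standard American Soundex digit groups (vowels, h, w and y carry no digit).
_CODES = {}
for _digit, _letters in {"1": "bfpv", "2": "cgjkqsxz", "3": "dt",
                         "4": "l", "5": "mn", "6": "r"}.items():
    for _ch in _letters:
        _CODES[_ch] = _digit


def soundex(name: str) -> str:
    """Return the 4-character Soundex code of a name ('' if it has no ASCII letters)."""
    letters = [c for c in name.lower() if c.isascii() and c.isalpha()]
    if not letters:
        return ""
    code = letters[0].upper()
    prev = _CODES.get(letters[0], "")
    for ch in letters[1:]:
        digit = _CODES.get(ch, "")
        if digit and digit != prev:
            code += digit
        if ch not in "hw":  # h and w do not separate letters with the same digit
            prev = digit
    return (code + "000")[:4]


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                # deletion
                            curr[j - 1] + 1,            # insertion
                            prev[j - 1] + (ca != cb)))  # substitution or match
        prev = curr
    return prev[-1]


def build_index(names):
    """Data organization step: bucket names by Soundex code (hypothetical helper)."""
    index = defaultdict(list)
    for name in names:
        index[soundex(name)].append(name)
    return index


def search(index, query, limit=5):
    """Search step: fetch one bucket, then rank its names by edit distance."""
    candidates = index.get(soundex(query), [])
    q = query.lower()
    return sorted(candidates, key=lambda n: (edit_distance(q, n.lower()), n))[:limit]


index = build_index(["Gupta", "Gupte", "Gopta", "Srivastava", "Shrivastava", "Awasthi"])
print(search(index, "Gupto"))
print(search(index, "Shrivastav"))
```

Only one bucket is scanned per query, which is what keeps the lookup fast. The cost of this design is that a name whose Soundex code differs from the query's is never considered: because the code always keeps the first letter, `soundex("Kumar")` and `soundex("Cumar")` fall into different buckets.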
Keywords
Hash table, Editex, Q-grams, Soundex, Hindex
Cite this article as
[Dr. Sandeep Gupta, Arun Pratap Srivastava, Dr. Shashank Awasthi (2014), Fast and Effective Searches of Personal Names in an International Environment, International Journal of Innovative Research in Engineering & Management (IJIREM), Vol-1, Issue-1, Page No-1-5], (ISSN 2347 - 5552). www.ijirem.org
Corresponding Author
Dr. Sandeep Gupta
Department of CSE, Noida Institute of Engg. Technology,
Greater Noida, India