
A Vertical Search Engine for Medical Standards

Geoferry J1, Muttan S2, Chinniah P3


1 PG, Electronics and Communication Engineering Department, Anna University, Chennai, India.
2 Professor, Electronics and Communication Engineering Department, Anna University, Chennai, India.
3 Research scholar, Electronics and Communication Engineering Department, Anna University, Chennai, India.

1 geoferry.j@gmail.com, 2 muthan_s@annauniv.edu, 3 chinnaiah_p@yahoo.co.in

Abstract: The surge of vertical search engines on the web is irrepressible. Vertical search engines contain concise, structured and accurate information. In this paper an effort has been made to build a vertical search engine for medical device standards and medical software standards. Most hospitals and clinics do not use these standards and are unaware of them. Our search engine helps them find the medical standards required to build standardized EMRs and medical devices. It also highlights the need to follow the standards in order to provide a healthy future for the healthcare industry.
Keywords: vertical search engine, medical standard, ontology.

I. INTRODUCTION
Vertical search engines, also called domain-specific search engines or vortals, facilitate more accurate, relevant and faster search by indexing within specific domains. Examples include financial search engines, law search engines and medical search engines [1]. Medical search engines are now popular among web users; in the past they were used only by medical practitioners and researchers. The appeal of medical search engines is that they combine the advantages of two approaches: the high precision of specialized guides and subject directories together with high recall [2].

For hardware medical standards, ISO (the International Organization for Standardization) and IEC (the International Electrotechnical Commission) form the specialized system for worldwide standardization. National bodies that are members of ISO or IEC participate in the development of International Standards through technical committees established by the respective organization to deal with particular fields of technical activity. ISO and IEC technical committees collaborate in fields of mutual interest. Other international organizations, governmental and non-governmental, in liaison with ISO and IEC, also take part in the work.

Such standards are intended to enable medical devices to interconnect and interoperate with computerized healthcare information systems in a manner suitable for the clinical environment. This context imposes several key requirements in the areas of reliability, network configuration, support for frequently changing host computer topologies, patient safety, and interface to healthcare personnel. The requirements of hardware standards are: patient and user safety of medical devices; network reconfiguration; a simple user interface; support for host computer topologies, allowing implementations that span multiple beds and care units; minimal implementation complexity for high-volume medical devices; and wide applicability in the area of medical device communications.

Software medical standards provide the minimal requirements for medical data, with aims such as improving the quality of patient care, decreasing the number of errors, improving the cost-effectiveness of day-to-day healthcare operations, and providing the tools for world-class clinical research. Standardization across the various fields of medical information has improved their stability and helps that information to be applied in a wide range of applications. Where standardization is more extensive, most workflows are well defined and integration of resources with other domains becomes simple. Standardization efforts in the medical services domain particularly help to solve the application integration problem. The role of software medical standards is to document and define precisely what constitutes good software engineering practice; they reduce or eliminate premarket submission documentation for the software components of medical devices, and they enhance global harmonization.

II. RELATED WORKS
Aysu Betin Can [3] proposed MedicoPort, a search engine enhanced with domain knowledge obtained from UMLS to increase the effectiveness of searches; it transforms a keyword search into a conceptual search to generate maximum output with semantic value using minimum input from the user. F. Malamateniou [4] proposed a system in which virtual patient records provide a means for integrated access to patient information that may be scattered across different healthcare settings. G.
Eysenbach [5] stated that the Internet has changed modern medicine forever: with a computer and a fast Internet connection, what once took countless hours of digging through stacks of bound journals in a medical library now takes mere seconds. Thomas E. Vanhecke [6] compared two medical literature search engines, PubMed and HighWire Press, both available free to anyone on the Internet, and measured retrieval accuracy, number of results generated, retrieval speed, features and search tools. C. Boyer [7] proposed a robot called Multi-Agent Retrieval Vagabond on Information Networks that searches for sites and documents specifically related to a given specialized field. Carlos Angulo [8] described a message-oriented lightweight data integration engine that allows homogeneous and concurrent access to clinical information from dispersed and heterogeneous data sources. It extracts the information and passes it to the requesting client applications in a flexible XML format, and the response message can be formatted on demand by appropriate Extensible Stylesheet Language (XSL) transformations to meet the needs of client applications. Yilu Zhou [9] proposed an information-seeking system for Chinese medical information: a Web portal in the medical domain that lets users search for Web pages from local collections and meta-search engines.

III. METHODOLOGY
In this section we introduce our system framework, shown in Figure 1. URL servers send lists of URLs to be fetched to the crawlers. Crawlers create a copy of every visited page for later processing by the search engine, which indexes the downloaded pages to provide fast searches. The large number of crawlable URLs generated by server-side software makes it difficult for web crawlers to avoid retrieving duplicate content: endless combinations of HTTP GET (URL-based) parameters exist, of which only a small selection actually return unique content. We use an HTML scraping technique for parsing the web pages; here, HTML scraping means converting file formats such as PDF, DOC and PPT to readable HTML. The repository contains the full HTML of every web page. In the repository, the documents are stored one after the other, and each is prefixed by its docID, length, and URL.
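The repository layout described above can be sketched as a sequence of length-prefixed records. This is only an illustration under stated assumptions: the paper does not give field widths or encodings, so the header format (`<IIH`), the UTF-8 encoding, and the names `pack_record` and `iter_records` are all hypothetical.

```python
import struct

# Assumed header layout (not specified in the paper):
# docID (uint32), HTML length in bytes (uint32), URL length in bytes (uint16).
HEADER = struct.Struct("<IIH")


def pack_record(doc_id: int, url: str, html: str) -> bytes:
    """Serialize one document as header + URL bytes + HTML bytes."""
    url_bytes = url.encode("utf-8")
    html_bytes = html.encode("utf-8")
    return HEADER.pack(doc_id, len(html_bytes), len(url_bytes)) + url_bytes + html_bytes


def iter_records(blob: bytes):
    """Walk the repository front to back; no other data structure is needed."""
    offset = 0
    while offset < len(blob):
        doc_id, html_len, url_len = HEADER.unpack_from(blob, offset)
        offset += HEADER.size
        url = blob[offset:offset + url_len].decode("utf-8")
        offset += url_len
        html = blob[offset:offset + html_len].decode("utf-8")
        offset += html_len
        yield doc_id, url, html


# Documents are stored one after the other in a single byte string.
repository = (
    pack_record(1, "http://example.org/a", "<html>A</html>")
    + pack_record(2, "http://example.org/b", "<html>B</html>")
)
records = list(iter_records(repository))
```

Because every record carries its own lengths, the whole repository can be read back sequentially without a separate lookup table.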
The repository requires no other data structures to be used in order to access it. This helps with data consistency and makes development much easier; we can rebuild all the other data structures from only the repository and a file which lists crawler errors.

The indexer collects, parses, and stores data to facilitate fast and accurate information retrieval. The purpose of storing an index is to optimize speed and performance in finding relevant documents for a search query. Without an index, the search engine would scan every document in the corpus, which would require considerable time and computing power. For example, while an index of 10,000 documents can be queried within milliseconds, a sequential scan of every word in 10,000 large documents could take hours. The additional storage required for the index, and the considerable increase in the time required for an update, are traded off against the time saved during information retrieval.

The document index keeps information about each document. The information stored in each entry includes the current document status, a pointer into the repository, a document checksum, and various statistics. If the document has been crawled, the entry also contains a pointer into a variable-width file called docinfo, which contains its URL and title. Otherwise the pointer points into the URLlist, which contains just the URL. This design decision was driven by the desire to have a reasonably compact data structure and the ability to fetch a record in one disk seek during a search.

The barrels store partially sorted indexes of the URL pages. As the crawler periodically crawls for new web pages, the indexer updates the values in the docinfo and the barrels. The barrels are also updated by the learning agent during every search by the user.
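To make the role of the index and the barrels concrete, the following is a minimal sketch of an inverted index split across a fixed number of barrels. It rests on assumptions that are not in the paper: the number of barrels, the partitioning of terms by CRC32, the simple tokenizer, and the toy document strings are all illustrative.

```python
import re
import zlib
from collections import defaultdict

NUM_BARRELS = 4  # assumed; the paper does not state how many barrels are used


def tokenize(text):
    """Lowercase the text and split it into alphanumeric terms."""
    return re.findall(r"[a-z0-9]+", text.lower())


def barrel_of(term):
    """Pick a barrel deterministically (assumed scheme: CRC32 of the term)."""
    return zlib.crc32(term.encode("utf-8")) % NUM_BARRELS


def build_barrels(docs):
    """Map each term to a sorted list of docIDs, stored in its barrel."""
    barrels = [defaultdict(list) for _ in range(NUM_BARRELS)]
    for doc_id in sorted(docs):
        for term in set(tokenize(docs[doc_id])):
            barrels[barrel_of(term)][term].append(doc_id)
    return barrels


def lookup(barrels, term):
    """Return the posting list for one term without scanning any document."""
    term = term.lower()
    return barrels[barrel_of(term)].get(term, [])


def search(barrels, query):
    """Return docIDs that contain every term of the query."""
    postings = [set(lookup(barrels, term)) for term in tokenize(query)]
    return sorted(set.intersection(*postings)) if postings else []


# Toy corpus for illustration only.
docs = {
    1: "IEC 60601 safety standard for medical electrical equipment",
    2: "ISO 13485 quality management for medical devices",
    3: "IEC 62304 medical device software life cycle",
}
barrels = build_barrels(docs)
hits = search(barrels, "medical software")
```

A query touches only the barrels that hold its terms, which is the trade described above: extra storage and update cost in exchange for fast retrieval.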

Fig. 1 System framework

Learning Agent: This component learns the user's interest or disinterest from his or her querying of the medical documents. The learning is based on the user's input, including the initial documents, the retrieved documents and the user's ratings of relevance or non-relevance for the documents. From this input, the learning agent maintains values for positive and negative user profiles that represent the user's search focus.

Queries to search engines on the Internet are usually short and cannot provide enough information for effective retrieval. We therefore use ontology-based query expansion [10]. In the algorithm, we use the knowledge formalized by the ontology to generate a semantic digraph for the combinations of words in a query. Then, according to the semantic distance between the vertices of the digraph, we select the candidate terms to be added. Terms added to the initial query are therefore semantically related to it.

The query processor translates the user query into parsed words. The query is expanded into single words or combinations of words so as to represent the user's exact query. The resulting query is matched against the barrels and the matched documents are sent back to the user. Users are asked to rate the resulting pages for relevance or non-relevance to their search. From these ratings the learning agent obtains the user's focus and keeps track of the searches. As the learning agent learns more about the user's focus, the search results become better refined.
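The query expansion step can be illustrated with a small sketch. The details of the algorithm in [10] are not given here, so this example makes its own assumptions: the ontology is a hand-written toy digraph, semantic distance is taken to be the number of edges on the shortest directed path, and the `max_distance` threshold and function names are hypothetical.

```python
from collections import deque

# Hypothetical toy ontology: directed edges from a concept to related concepts.
ONTOLOGY = {
    "emr": ["hl7", "patient record"],
    "hl7": ["interoperability"],
    "patient record": [],
    "interoperability": [],
    "infusion pump": ["medical device"],
    "medical device": ["iec 60601"],
    "iec 60601": [],
}


def semantic_distances(graph, start):
    """Breadth-first search: edge count from start to each reachable vertex."""
    dist = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbour in graph.get(node, []):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                queue.append(neighbour)
    return dist


def expand_query(terms, graph, max_distance=1):
    """Append ontology terms within max_distance of any original query term."""
    expanded = list(terms)
    for term in terms:
        dist = semantic_distances(graph, term)
        # Nearest candidates first; ties broken alphabetically for stable output.
        for candidate, d in sorted(dist.items(), key=lambda kv: (kv[1], kv[0])):
            if 0 < d <= max_distance and candidate not in expanded:
                expanded.append(candidate)
    return expanded


expanded = expand_query(["emr"], ONTOLOGY, max_distance=1)
```

Raising `max_distance` admits more loosely related terms, which trades precision for recall; a real system would tune this threshold against the ontology in use.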

IV. CONCLUSIONS
This vertical search engine for medical device and software standards serves the requirements of professionals from different fields who seek information regarding medical device standards and medical software. In particular, the search engine helps to establish an information platform on standards for researchers, clinicians, manufacturers, hospital administrators and biomedical personnel for information retrieval in the field of medicine and health care.

REFERENCES
[1] R. Shettar and R. Bhuptani, "A Vertical Search Engine Based On Domain Classifier," International Journal of Computer Science and Security, Volume 2, Issue 4.
[2] A. Inthiran, Saadat M. Alhashmi and Pervaiz K. Ahmed, "Collaborative Personalization on Medical Search Engines Using User Exploratory Survey," Seventh International Conference on Fuzzy Systems and Knowledge Discovery, 2010.
[3] Aysu Betin Can, Nazife Baykal, "MedicoPort: A medical search engine for all," Computer Methods and Programs in Biomedicine, Elsevier, Volume 86, 2007, pp. 73-86.
[4] F. Malamateniou, G. Vassilacopoulos, J. Mantas, "A search engine for virtual patient records," International Journal of Medical Informatics, Volume 55, 1999, pp. 103-115.
[5] G. Eysenbach, C. Koehler, "How do consumers search for and appraise health information on the World Wide Web? Qualitative study using focus groups, usability tests, and in-depth interviews," BMJ, 2002.
[6] Thomas E. Vanhecke, Michael A. Barnes, Janet Zimmerman, Sandor Shoichet, "PubMed vs. HighWire Press: A head-to-head comparison of two medical literature search engines," Computers in Biology and Medicine, Elsevier, Volume 37, 2007, pp. 1252-1258.
[7] C. Boyer, O. Baujard, V. Baujard, S. Aurel, M. Selby, R.D. Appel, "Health on the Net automated database of health and medical information," International Journal of Medical Informatics, Elsevier, Volume 47, 1997, pp. 27-29.
[8] Carlos Angulo, Pere Crespo, Jose A. Maldonado, David Moner, Daniel Perez, Irene Abad, Jesus Mandingorra, Montserrat Robles, "Non-invasive lightweight integration engine for building EHR from autonomous distributed systems," International Journal of Medical Informatics, Elsevier, 2007.
[9] Yilu Zhou, Jialun Qin, Hsinchun Chen, "CMedPort: An integrated approach to facilitating Chinese medical information seeking," Decision Support Systems, Volume 42, 2006, pp. 1431-1448.
[10] Liangjun Ma, Lin Chen and Yiping Yang, "Ontology based Query Expansion in Vertical Search Engine," Sixth International Conference on Fuzzy Systems and Knowledge Discovery, 2009, pp. 285-289.
