
Language, Data Coding and Classification

(Introduction to Medical Informatics) (http://www.cpmc.columbia.edu/edu/textbook)

Common Health Care Language

A finite, enumerated set of terms intended to convey information unambiguously
  Vocabulary: dictionary containing the terminology of a subject or related subject
  Terminology: a set of terms representing the system of concepts of a particular field

Where would you use vocabularies?
  Lab tests, diagnosis codes, ...
  for storage and retrieval; common storage for all users
    clinical database
    epidemiologic database
    bibliographic searches
  dictionary of terms for human use
  natural language processing
  automated decision support
  maintenance of the vocabulary itself

Unambiguous retrieval = sensitive, specific, reliable
  sensitive = all the data I want
  specific = only the data I want
  reliable = same data for every query

Term: symbolic representation of a single concept (and, optionally, some of the concept's attributes)

Components:
- Code (required): a symbol (e.g., number, string) that uniquely identifies the concept in the vocabulary
- Name: word or phrase that makes sense to a person
    names vs codes:
      names are easier to write and recognize as distinct
      codes are unambiguous
      similar names can have different meanings (e.g., Cushing disease vs syndrome)
      different names can have the same meaning (e.g., heart attack vs MI)
- Definition: narrative text, name itself, relations & attributes
    often defined by what it is NOT (ICD-9)

    ex: ICD-9 codes 190-199 = MALIGNANT NEOPLASM OF OTHER AND UNSPECIFIED SITES
- Other attributes: can be literal (arbitrary string); categorical (controlled possibilities); semantic (another code = relation)
    intensional knowledge: describe a term itself
    extensional knowledge: describe relations between terms
- Relations to other terms (this is structure), including hierarchy
- Implementation information (how to store & query)

CPMC example
  name = urine potassium ion measurement
  code = 1[?][?][?]
  definition = name + relations
  relations: specimen = urine specimen; substance measured = potassium; part_of = urine electrolytes
  attributes: units = mEq/l
  (distinguish relations from attributes: relations have a MED code as a value)

Why organize (why add structure)?
  increase power of the vocabulary
  - map terms among sub-vocabularies
  - help a user choose the correct term
  - facilitate its own maintenance
  - be used as a knowledge base for other systems

How to organize the collection of terms
  "flat list" of terms
    okay for small vocabularies
    human being keeps track of relations, ...
    even in alphabetical order, user cannot find term
      "lung disease" or "disease of the lung"
  Therefore impose structure
  "classification hierarchy"
    organize large number of terms into classes
    go from most general to most specific
    end up with a tree structure, much like an outline
    e.g., heart attack is a heart disease, which is a disease
    define parent, child, ancestor, descendant, root, leaf
    facilitates finding terms and adding new ones (maintenance)
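The term components and the tree-shaped classification hierarchy described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how the MED or any real vocabulary is stored: the `Term` class, its field names, the `ancestors` helper, and every integer code are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Term:
    code: int                      # unique identifier (values here are made up)
    name: str                      # word or phrase that makes sense to a person
    parent: Optional[int] = None   # code of the parent class (strict hierarchy)
    attributes: Dict[str, str] = field(default_factory=dict)  # literal values
    relations: Dict[str, int] = field(default_factory=dict)   # values are codes


def ancestors(vocab: Dict[int, Term], code: int) -> List[str]:
    """Follow parent links from a term up to the root, nearest ancestor first."""
    path = []
    current = vocab[code].parent
    while current is not None:
        path.append(vocab[current].name)
        current = vocab[current].parent
    return path


vocab = {
    1: Term(1, "disease"),
    2: Term(2, "heart disease", parent=1),
    3: Term(3, "heart attack", parent=2),
    10: Term(10, "urine specimen"),
    11: Term(11, "potassium"),
    12: Term(
        12,
        "urine potassium ion measurement",
        attributes={"units": "mEq/l"},                      # attribute: literal string
        relations={"specimen": 10, "substance_measured": 11},  # relation: another code
    ),
}

print(ancestors(vocab, 3))
```

Note how the attribute `units` holds a plain string while each relation holds the code of another term, which is the distinction the notes draw between attributes and relations.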

EXISTING VOCABULARY EXAMPLES

ICD-9 - International Classification of Diseases, 9th Revision
  World Health Org
  for collecting health statistics
  originally for epidemiology
  Clinical Modifications added for clinical coding
  now also for billing
  (strict hierarchy with extensive synonym indexing)
  has name, code, definition, implementation, hierarchy

SNOMED - Systematized Nomenclature of Medicine
  ICD-9 inadequate for College of Amer. Pathologists
  wrote SNOP, and then SNOMED
  7 axes: topography, morphology, etiology, function, disease, procedure, occupation
  can assemble complex terms from simple ones

MeSH - Medical Subject Headings
  National Library of Medicine
  indexing the medical literature
  loose hierarchy

MED - Medical Entities Dictionary
  CPMC
  made of other vocabularies (ICD-9, MeSH, local, ...)
  (directed acyclic graph)

UMLS - Unified Medical Language System
  NLM 1987
  Goal: facilitate the use of disparate medical terminologies for accessing a variety of information sources
  Components:
    information sources map
    metathesaurus
    semantic network: stores both intensional and extensional knowledge
  Note the many different ways of saying the same thing

EXERCISE: CREATE A VOCABULARY

Aspirin; ASA; Tylenol; morphine; warfarin; Coumadin; anticoagulant; antipyretic analgesic; pain-killer; fever-medication; antibiotic; oxacillin; penicillin; respiratory diseases; pneumonia; emphysema; lung; left lung; right lung; LLL; RLL; RML; RUL; LUL; lingual; heart; atrium; ventricle; brain; infectious diseases; meningitis; meninges; ventricle; chem-7; Presbyterian chem-7; Allen chem-7; laboratory test; laboratory; chemistry laboratory; hematology laboratory; battery; test; sodium ion; sodium test; urine electrolytes

PROPERTIES OF A CONTROLLED MEDICAL LANGUAGE

Domain completeness
  Anticipate all terms in a domain
  not really possible to foresee all needed terms
  Result of incompleteness is that you cannot record all the information you have
  Practical solutions
    SNOMED permits assembling terms into complex ones
    MeSH uses modifiers to add meanings
    ICD-9 uses "other" (not elsewhere classified)
  but at least its structure should not limit
    ICD-9: [?] levels of depth with 10 nodes per level
    SNOMED: [?] levels with 1[?] nodes per level

Unambiguous
  same term must not have more than one meaning
  Often occurs because it is not worth differentiating in vocabulary's domain (but would be in others)
  Result of ambiguity is that you may retrieve unwanted data from a database based upon the vocabulary
    "ventricle" as part of heart and part of brain
    MeSH: "cardiac output" as parameter and test
    ICD-9: "other" = "not elsewhere classified"

Non-redundant
  each concept must have exactly one term (one way to say the same thing)
  difficult with multiple vocabulary authors
  Result of redundancy is that you may find only one of two terms for the same concept (and therefore only some of the data in a database)
    e.g., if find MI, myocardial infarction, heart attack
    SNOMED: many ways to assemble terms
      pulmonary tuberculosis D0188 vs lung+granuloma+M TB+fever

Synonymy
  allow more than one name for a single term
  easier for users to find a term
  not redundant since still only one code
  ex: MED uses a literal attribute to do this
    e.g., let MI, myocardial infarction, and heart attack be synonyms of same term
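Synonymy without redundancy, as described above, amounts to letting several names resolve to one code. A minimal sketch, assuming a simple in-memory table: the code `100`, the `SYNONYMS` table, and the `lookup` helper are hypothetical and are not taken from the MED.

```python
from typing import Dict, Optional

# Hypothetical code -> preferred name (the code value is made up).
PREFERRED_NAME: Dict[int, str] = {100: "myocardial infarction"}

# Many names, one code: each synonym is a literal string pointing at the same term.
SYNONYMS: Dict[str, int] = {
    "myocardial infarction": 100,
    "mi": 100,
    "heart attack": 100,
}


def lookup(name: str) -> Optional[int]:
    """Resolve a user-supplied name to its code, ignoring case and outer spaces."""
    return SYNONYMS.get(name.strip().lower())


for name in ("MI", "Heart attack", "myocardial infarction"):
    print(name, "->", lookup(name))
```

Because data are stored under the code rather than the name, a query for any of the three names retrieves the same records.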

  lexical variants: synonymous terms that vary only in word order, punctuation, pluralization and the like
    ex: Tetralogy of Fallot vs Fallot's Tetralogy

Hierarchical classification
  actual "requirement" is ease of use and maintenance
  classification may be seen as one approach
  associative network could be an alternative (like human beings)
  can also add inheritance
    children inherit relations and attributes from parent
    diseases have etiologies, so heart diseases have etiologies

Multiple classification
  strict hierarchy (tree) - each term belongs to one class
  but some terms really belong to two classes
    bacterial pneumonia is bacterial AND respiratory
  therefore use directed acyclic graph (DAG)
    directed = parent and child (not equal)
    acyclic = no term is an ancestor of itself
    but terms can have more than one parent (unlike strict)
  multiple hierarchy is DAG with one root
  so same term can have several parents
  SNOMED:
    Pneumococcal pneumonia under bacterial disease
    Clinical pneumonia under respiratory disease
    Staphylococcal pneumonia under morphology

Consistency of views
  One way to implement multiple classification is to put the same term in two places (in a strict hierarchy) with a pointer between them
  Is the term exactly the same (e.g., same children) in each location?
  MeSH: salicylates has aspirin only in some contexts
  could argue that "contexts" (inconsistency of views) are a benefit
  standard terminologies are *replete* with such inconsistencies
  inconsistency is likely when vocabularies are written by committees

Explicit relationships (semantic network)
  what does the parent-child relationship signify?
    is_a = "is a type of" (ulnar nerve -> nerve)
    part_of = "is a part of" (ulnar nerve -> arm)
    etiologic = causes/caused_by

  even part_of can be division_of (lobe of lung) vs component_of (nerve in lung)
  SNOMED: is_a, is_part_of, is_made_of, causes, is_in
  MeSH: is_a, part_of, associated_with, equivalent_to
  Query: types_of vs causes_of pulmonary disease
  Inheritance: lung gets emphysema; so lobe of lung gets emphysema; but nerve of lung does not get emphysema
  Therefore want to define each relation explicitly (can then choose which use inheritance)

ISSUES

Should the codes contain semantic information?
  the code is the unique symbol for a concept
  usually safest to let codes contain no information
    can change the term's name, relations, location without changing the code (as long as the concept has not changed)
    e.g., if pneumonia X is first thought to be infectious, then found to be autoimmune, need to change the hierarchy, but not the concept
  if code is term name, then obviously must contain info
  Many put the hierarchical path into code
    [?]:[?][?]:[?] = disease:heart disease:MI
    better efficiency for queries
    e.g., ICD-9 is hierarchy of digits
    but DAG will have several paths to one term
    also, ICD-9 and SNOMED put info in last digit
    problem = maintenance: changing hierarchy requires changing code

Structure and content of vocabulary depends on use
  ICD-9 to classify diseases, not findings
    complete coverage of TB, important in epidemiology
  MeSH to index literature, not treat patients

What does a non-leaf node mean?
  class of terms, but not a term itself
  "not otherwise specified"
  "not elsewhere classified" = other
  "all children"
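The objection to path-based codes raised under ISSUES (a DAG gives several paths to one term) can be made concrete with a short sketch. The `PARENTS` table and the `paths_to_root` helper are hypothetical; term names stand in for codes purely for readability.

```python
from typing import Dict, List

# Hypothetical DAG: each term lists its parents. "bacterial pneumonia" has two.
PARENTS: Dict[str, List[str]] = {
    "disease": [],
    "bacterial disease": ["disease"],
    "respiratory disease": ["disease"],
    "bacterial pneumonia": ["bacterial disease", "respiratory disease"],
}


def paths_to_root(term: str) -> List[List[str]]:
    """Return every root-to-term path; terminates because the graph is acyclic."""
    parents = PARENTS[term]
    if not parents:
        return [[term]]
    paths = []
    for parent in parents:
        for path in paths_to_root(parent):
            paths.append(path + [term])
    return paths


for path in paths_to_root("bacterial pneumonia"):
    print(":".join(path))
```

A code built from the hierarchical path would have two equally valid spellings for this one term, and either spelling would have to change whenever the term is reclassified. A code that carries no information avoids both problems.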

Maintenance functions
  add a new term (must fulfill above requirements)
  change a term
  merge two vocabularies
  delete a term (is it possible if data are stored?)

Use vocabulary to help maintain itself
  redundancy: look for redundant relations and attributes
  classification: pick logical classes from attributes
  ambiguity: force term to fill in a set of attributes
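One way to read "look for redundant relations and attributes" is to flag terms whose relation and attribute values are identical, since they may be two entries for the same concept. The sketch below assumes that reading; the `find_redundant` helper, the codes, and the property names are hypothetical.

```python
from collections import defaultdict
from typing import Dict, FrozenSet, List, Tuple

Signature = FrozenSet[Tuple[str, str]]


def find_redundant(terms: Dict[int, Dict[str, str]]) -> List[List[int]]:
    """Group codes whose relation/attribute sets are identical."""
    by_signature: Dict[Signature, List[int]] = defaultdict(list)
    for code, properties in terms.items():
        # frozenset makes the comparison independent of property order
        by_signature[frozenset(properties.items())].append(code)
    return [sorted(codes) for codes in by_signature.values() if len(codes) > 1]


terms = {
    201: {"specimen": "urine", "substance_measured": "potassium"},
    202: {"specimen": "urine", "substance_measured": "sodium"},
    203: {"substance_measured": "potassium", "specimen": "urine"},
}

print(find_redundant(terms))
```

Here codes 201 and 203 are reported together as candidates for merging; a human editor would still need to confirm that they really denote the same concept.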
