Information

'''Information''' as a [[Conveyed concept|concept]] has a diversity of meanings, from everyday usage to technical settings. Generally speaking, the concept of information is closely related to notions of [[constraint]], [[communication]], [[control system|control]], [[data]], [[form]], [[instruction]], [[knowledge]], [[Meaning (linguistics)|meaning]], [[stimulation|mental stimulus]], [[pattern]], [[perception]], and [[knowledge representation|representation]].

Many people speak about the [[Information Age]] as the advent of the Knowledge Age or [[knowledge society]], the [[information society]], the [[Information revolution]], and [[Information technology|information technologies]]; and even though [[informatics]], [[information science]] and [[computer science]] are often in the spotlight, the word "information" is often used without careful consideration of the various meanings it has acquired.
== Etymology ==

According to the [[Oxford English Dictionary]], the earliest historical meaning of the word ''information'' in [[English language|English]] was the act of ''informing'', or giving form or shape to the mind, as in education, instruction, or training. A quote from 1387: "Five books come down from heaven for information of mankind." It was also used for an ''item'' of training, ''e.g.'' a particular instruction: "Melibee had heard the great skills and reasons of Dame Prudence, and her wise information and techniques." (1386)

The English word was apparently derived by adding the common "noun of action" ending "''-ation''" (descended through French from Latin "''-tio''") to the earlier verb ''to inform'', in the sense of to give form to the mind, to discipline, instruct, teach: "Men so wise should go and inform their kings." (1330) ''Inform'' itself comes (via French) from the Latin verb ''informare'', to give form to, to form an idea of. Furthermore, Latin itself already contained the word ''informatio'' meaning concept or idea, but the extent to which this may have influenced the development of the word ''information'' in English is unclear.

As a final note, the ancient Greek word for ''form'' was [[eidos]], and this word was famously used in a technical philosophical sense by [[Plato]] (and later Aristotle) to denote the ideal identity or essence of something (see [[Theory of forms]]). "Eidos" can also be associated with [[thought]], [[proposition]] or even [[concept]].
== Information as a message ==

Information is the state of a system of interest; a message is that information materialized. Information is a quality of a [[message]] from a [[sender]] to one or more receivers. Information is always ''about'' something (the size of a parameter, the occurrence of an event, etc.). Viewed in this manner, information does not have to be accurate. It may be a truth or a lie, or just the sound of a falling tree. Even a disruptive noise used to inhibit the flow of communication and create misunderstanding would in this view be a form of information. However, generally speaking, if the ''amount'' of information in the received message increases, the message is more accurate.

This model assumes there is a definite [[sender]] and at least one receiver. Many refinements of the model assume the existence of a common language understood by the sender and at least one of the receivers. An important variation identifies information as that which would be communicated by a message if it were sent from a sender to a receiver capable of understanding the message. Notably, it is not required that the sender be capable of understanding the message, or even cognizant that there is a message. Thus, information is something that can be extracted from an environment, e.g., through observation, reading or measurement.

Information is a term with many meanings depending on context, but it is as a rule closely related to such concepts as meaning, knowledge, instruction, communication, representation, and mental stimulus. Simply stated, information is a message received and understood. In terms of data, it can be defined as a collection of facts from which conclusions may be drawn. Information has many other aspects; it is, for example, the knowledge acquired through study, experience, or instruction. But overall, information is the result of processing, manipulating and organizing data in a way that adds to the knowledge of the person receiving it.
[[Communication theory]] provides a numerical measure of the uncertainty of an outcome. For example, we can say that "the signal contained thousands of bits of information". Communication theory tends to use the concept of [[information entropy]], generally attributed to [[C.E. Shannon]] (see below).

Another form of information is [[Fisher information]], a concept of [[R.A. Fisher]]. This is used in the application of statistics to [[estimation theory]] and to science in general. Fisher information is thought of as the amount of information that a message carries about an unobservable parameter. It can be computed from knowledge of the [[likelihood function]] defining the system. For example, with a normal likelihood function, the Fisher information is the reciprocal of the variance of the distribution. In the absence of knowledge of the likelihood function, the Fisher information may be computed from normally distributed score data as the reciprocal of their second moment.
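As a worked illustration of the normal case just mentioned (a standard textbook derivation, with the variance <math>\sigma^2</math> assumed known), for a single observation <math>X \sim N(\mu, \sigma^2)</math>:

:<math>\log f(x;\mu) = -\frac{(x-\mu)^2}{2\sigma^2} + \mbox{const}, \qquad \frac{\partial^2}{\partial\mu^2}\log f(x;\mu) = -\frac{1}{\sigma^2},</math>

so the Fisher information is <math>\mathcal{I}(\mu) = -\mathrm{E}\!\left[\frac{\partial^2}{\partial\mu^2}\log f(X;\mu)\right] = \frac{1}{\sigma^2}</math>, i.e. the reciprocal of the variance, as stated above.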
Even though information and data are often used interchangeably, they are actually very different. Data is a set of unrelated information, and as such is of no use until it is properly evaluated. Upon evaluation, once some significant relation between data is established and they show some relevance, they are converted into information. Now this same data can be used for different purposes. Thus, until data convey some information, they are not useful.
=== Measuring information entropy ===

The view of information as a message came into prominence with the publication in 1948 of an influential paper by [[Claude Shannon]], "[[A Mathematical Theory of Communication]]." This paper provides the foundations of [[information theory]] and endows the word ''information'' not only with a technical meaning but also a measure. If the sending device is equally likely to send any one of a set of messages, then the preferred measure of "the information produced when one message is chosen from the set" is the base two [[logarithm]] of the number of messages in the set (this measure is called ''[[self-information]]'').
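As a concrete instance of this measure: a device choosing uniformly among <math>N = 8</math> possible messages produces <math>\log_2 8 = 3</math> bits of information per choice; more generally, a message of probability <math>p</math> carries a self-information of <math>-\log_2 p</math> bits.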
In this paper, Shannon continues: "If the number of messages in the set is finite then this number or any monotonic function of this number can be regarded as a measure of the information produced when one message is chosen from the set, all choices being equally likely."
A complementary way of measuring information is provided by [[algorithmic information theory]]. In brief, this measures the information content of a list of symbols based on how predictable they are, or more specifically how easy it is to compute the list through a [[computer program|program]]: the information content of a sequence is the number of bits of the shortest program that computes it. The sequence below would have a very low algorithmic information measurement since it is a very predictable pattern, and as the pattern continues the measurement would not change. Shannon information would give the same information measurement for each symbol, since they are [[statistical randomness|statistically random]], and each new symbol would increase the measurement.

:123456789101112131415161718192021
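One rough way to see the contrast between the two measures is to use a general-purpose compressor as a crude upper-bound stand-in for algorithmic information content (true Kolmogorov complexity is uncomputable, so this is only an illustrative sketch using the Python standard library; the pattern is extended well beyond the excerpt above so that compressor overhead does not dominate):

<source lang="python">
import random
import string
import zlib

# The predictable pattern above, extended: "123456789101112...999".
predictable = "".join(str(i) for i in range(1, 1000))

# Statistically random digits of the same length.
random.seed(0)
rand = "".join(random.choice(string.digits) for _ in range(len(predictable)))

for label, text in [("predictable", predictable), ("random", rand)]:
    compressed = len(zlib.compress(text.encode()))
    print(f"{label}: {len(text)} chars -> {compressed} bytes compressed")
</source>

On such input the predictable sequence should compress to fewer bytes than the random digits, mirroring its lower algorithmic information content; a Shannon-style per-symbol count, by contrast, grows with the length of both strings.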
It is important to recognize the limitations of traditional information theory and algorithmic information theory from the perspective of human meaning. For example, when referring to the meaning content of a message, Shannon noted: "Frequently the messages have ''meaning…'' these semantic aspects of communication are irrelevant to the engineering problem. The significant aspect is that the actual message is one selected ''from a set of possible messages''" (emphasis in original). In information theory signals are part of a process, not a substance; they do something, they do not contain any specific meaning. Combining algorithmic information theory and information theory, we can conclude that the most random signal contains the most information, as it can be interpreted in any way and cannot be compressed.

Michael Reddy noted that "'signals' of the [[mathematical theory]] are 'patterns that can be exchanged'. There is no message contained in the signal, the signals convey the ability to select from a set of possible messages." In information theory "the system must be designed to operate for each possible selection, not just the one which will actually be chosen since this is unknown at the time of design".
== Information as a pattern ==

Information is any represented [[pattern]]. This view assumes neither accuracy nor directly communicating parties, but instead assumes a separation between an object and its representation. Consider the following example: [[economic statistics]] represent an [[Economics|economy]], however inaccurately. What are commonly referred to as data in [[computing]], [[statistics]], and other fields, are forms of information in this sense. The [[electromagnetism|electro-magnetic]] patterns in a [[computer network]] and connected [[peripheral device|device]]s are related to something other than the pattern itself, such as [[Character (computing)|text characters]] to be displayed and [[Computer keyboard|keyboard]] input. [[Signal (information theory)|Signal]]s, [[Sign (linguistics)|sign]]s, and [[symbol]]s are also in this category. On the other hand, according to [[semiotics]], data are symbols with a certain syntax, and information is data with a certain semantics. [[Painting]] and [[drawing]] contain information to the extent that they represent something such as an assortment of objects on a table, a [[profile]], or a [[landscape]]. In other words, when a pattern of something is transposed to a pattern of something else, the latter is information. This would be the case whether or not there was anyone to perceive it.

But if information can be defined merely as a pattern, does that mean that neither [[utility]] nor meaning are necessary components of information? Arguably a distinction must be made between raw unprocessed data and information which possesses utility, [[value (economics)|value]] or some quantum of meaning. On this view, information may indeed be characterized as a pattern; but this is a [[necessary]] condition, not a [[sufficient]] one. An individual entry in a telephone book, which follows a specific pattern formed by name, address and telephone number, does not become "informative" in some sense unless and until it possesses some degree of utility, value or meaning. For example, someone might look up a girlfriend's number, might order a takeaway, etc. The vast majority of numbers will never be construed as "information" in any meaningful sense. The gap between data and information is only closed by a behavioral bridge whereby some value, utility or meaning is added to transform mere data or pattern into information.
When one constructs a representation of an object, one can selectively extract from the object ([[sampling (case studies)|sampling]]) or use a [[system]] of signs to replace ([[encode|encoding]]), or both. The sampling and encoding result in representation. An example of the former is a "sample" of a product; an example of the latter is a "verbal description" of a product. Both contain information about the product, however inaccurate.

When one interprets a representation, one can predict a broader pattern from a limited number of observations (inference) or understand the relation between patterns of two different things ([[decode|decoding]]). One example of the former is to sip a [[soup]] to know if it is spoiled; an example of the latter is examining footprints to determine the animal and its condition. In both cases, information sources are not constructed or presented by some "sender" of information.

Regardless, information is dependent upon, but usually unrelated to and separate from, the medium or media used to express it. In other words, the position of a theoretical series of bits, or even the output once interpreted by a [[computer]] or similar device, is unimportant, except when someone or something is present to interpret the information. Therefore, a quantity of information is totally distinct from its medium.
== Information as sensory input ==

Often information is viewed as a type of [[input]] to an [[organism]] or designed device. Inputs are of two kinds. Some inputs are important to the function of the organism (for example, food) or device ([[energy]]) by themselves. In his book ''Sensory Ecology'', Dusenbery called these causal inputs. Other inputs (information) are important only because they are associated with causal inputs and can be used to predict the occurrence of a causal input at a later time (and perhaps another place). Some information is important because of its association with other information, but eventually there must be a connection to a causal input. In practice, information is usually carried by weak stimuli that must be detected by specialized sensory systems and amplified by energy inputs before they can be functional to the organism or device. For example, light is often a causal input to plants but provides information to animals. The colored light reflected from a flower is too weak to do much photosynthetic work, but the visual system of the bee detects it, and the bee's nervous system uses the information to guide the bee to the flower, where the bee often finds nectar or pollen, which are causal inputs serving a nutritional function.
Information is any type of sensory input. When an organism with a [[nervous system]] receives an input, it transforms the input into an electrical signal. Some regard this as information. The idea of representation is still relevant, but in a slightly different manner. That is, while [[abstract painting]] does not represent anything concretely, when the viewer sees the painting, it is nevertheless transformed into electrical signals that create a representation of the painting. Defined this way, information does not have to be related to truth, communication, or representation of an object. [[Entertainment]] in general is not intended to be informative. [[Music]], the [[performing arts]], [[amusement park]]s, works of [[fiction]] and so on are thus forms of information in this sense, but they are not necessarily forms of information according to some definitions given above. Consider another example: food supplies both nutrition and taste for those who eat it. If information is equated to sensory input, then nutrition is not information but taste is.
== Information as an influence which leads to a transformation ==

Information is any type of pattern that influences the formation or transformation of other patterns. In this sense, there is no need for a conscious mind to perceive, much less appreciate, the pattern. Consider, for example, [[DNA]]. The sequence of [[nucleotide]]s is a pattern that influences the formation and development of an organism without any need for a conscious mind. [[Systems theory]] at times seems to refer to information in this sense, assuming information does not necessarily involve any conscious mind, and patterns circulating (due to [[feedback]]) in the system can be called information. In other words, it can be said that information in this sense is something potentially perceived as representation, though not created or presented for that purpose.

When [[Marshall McLuhan]] speaks of [[media (communication)|media]] and their effects on human cultures, he refers to the structure of [[cultural artifact|artifacts]] that in turn shape our behaviors and mindsets. Also, [[pheromone]]s are often said to be "information" in this sense. (See also [[Gregory Bateson]].)
== Information as a property in physics ==

In 2003, J. D. Bekenstein claimed there is a growing trend in [[physics]] to define the physical world as being made of information itself (and thus information is defined in this way). Information has a well-defined meaning in physics. Examples of this include the phenomenon of [[quantum entanglement]], where particles can interact without reference to their separation or the speed of light. Information itself cannot travel faster than light even if the information is transmitted indirectly. Consequently, all attempts at physically observing a particle with an "entangled" relationship to another are slowed down, even though the particles are not connected in any way other than by the information they carry.

Another link is demonstrated by the [[Maxwell's demon]] thought experiment. In this experiment, a direct relationship between information and another physical property, [[entropy]], is demonstrated. A consequence is that it is impossible to destroy information without increasing the entropy of a system; in practical terms this often means generating heat. Another, more philosophical, outcome is that information could be thought of as interchangeable with [[Energy#Transformations_of_energy|energy]]. Thus, in the study of [[logic gates]], the theoretical lower bound of thermal energy released by an ''AND gate'' is higher than for the ''NOT gate'' (because information is destroyed in an ''AND gate'' and simply converted in a ''NOT gate'').
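The theoretical lower bound referred to here is usually identified with Landauer's principle; as a worked figure (standard physical constants, room temperature assumed), erasing a single bit dissipates at least

:<math>E \ge k_B T \ln 2 \approx (1.38 \times 10^{-23}\,\mathrm{J/K}) \times (300\,\mathrm{K}) \times 0.693 \approx 2.9 \times 10^{-21}\,\mathrm{J}.</math>

An ''AND'' gate maps two input bits to one output bit and therefore erases information, while a ''NOT'' gate merely permutes one bit, which is why the bound differs between the two.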
Physical information is of particular importance in the theory of [[quantum computers]].
== Information as records ==

Records are a specialized form of information. Essentially, records are information produced consciously or as by-products of business activities or transactions and retained because of their value. Primarily their value is as evidence of the activities of the organization, but they may also be retained for their informational value. Sound [[records management]] ensures that the integrity of records is preserved for as long as they are required.

The international standard on records management, ISO 15489, defines records as "information created, received, and maintained as evidence and information by an organization or person, in pursuance of legal obligations or in the transaction of business". The International Committee on Archives (ICA) Committee on electronic records defined a record as "a specific piece of recorded information generated, collected or received in the initiation, conduct or completion of an activity and that comprises sufficient content, context and structure to provide proof or evidence of that activity".

Records may be retained because of their business value, as part of the [[corporate memory]] of the organization or to meet legal, fiscal or accountability requirements imposed on the organization. Willis (2005) expressed the view that sound management of business records and information delivered "…six key requirements for good [[corporate governance]]…transparency; accountability; due process; compliance; meeting statutory and common law requirements; and security of personal and corporate information."
== Information and semiotics ==

Beynon-Davies explains the multi-faceted concept of information in terms of signs and sign-systems. Signs themselves can be considered in terms of four inter-dependent levels, layers or branches of [[semiotics]]: pragmatics, semantics, syntactics and empirics. These four layers serve to connect the social world on the one hand with the physical or technical world on the other.

[[Pragmatics]] is concerned with the purpose of communication. Pragmatics links the issue of signs with that of intention. The focus of pragmatics is on the intentions of human agents underlying communicative behaviour. In other words, intentions link language to action.

[[Semantics]] is concerned with the meaning of a message conveyed in a communicative act. Semantics considers the content of communication. Semantics is the study of the meaning of signs: the association between signs and behaviour. Semantics can be considered as the study of the link between symbols and their referents or concepts, particularly the way in which signs relate to human behaviour.

Syntactics is concerned with the formalism used to represent a message. Syntactics as an area studies the form of communication in terms of the logic and grammar of sign systems. Syntactics is devoted to the study of the form rather than the content of signs and sign-systems.

Empirics is the study of the signals used to carry a message: the physical characteristics of the medium of communication. Empirics is devoted to the study of communication channels and their characteristics, e.g., sound, light, electronic transmission etc.

Communication normally exists within the context of some social situation. The social situation sets the context for the intentions conveyed (pragmatics) and the form in which communication takes place. In a communicative situation intentions are expressed through messages which comprise collections of inter-related signs taken from a language which is mutually understood by the agents involved in the communication. Mutual understanding implies that the agents involved understand the chosen language in terms of its agreed syntax (syntactics) and semantics. The sender codes the message in the language and sends the message as signals along some communication channel (empirics). The chosen communication channel will have inherent properties which determine outcomes such as the speed with which communication can take place and over what distance.
Information extraction

In [[natural language processing]], '''information extraction''' (IE) is a type of [[information retrieval]] whose goal is to automatically extract structured information, i.e. categorized and contextually and semantically well-defined data from a certain domain, from unstructured [[machine-readable]] documents.

An example of information extraction is the extraction of instances of corporate mergers, more formally instances of a merger relation between two companies, from an online news sentence such as: "Yesterday, New-York based Foo Inc. announced their acquisition of Bar Corp."
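A minimal sketch of such an extraction, assuming nothing more than a hand-written regular expression (a toy pattern, not a production IE system; the relation fields shown are illustrative, not a standard schema):

<source lang="python">
import re

sentence = ("Yesterday, New-York based Foo Inc. announced "
            "their acquisition of Bar Corp.")

# Toy pattern: "<Acquirer> announced their/its acquisition of <Acquired>",
# where company names are naively taken to be runs of capitalized tokens.
pattern = re.compile(
    r"(?P<acquirer>[A-Z][\w.-]*(?:\s[A-Z][\w.]*)*)\s+announced\s+"
    r"(?:their|its)\s+acquisition\s+of\s+"
    r"(?P<acquired>[A-Z][\w.-]*(?:\s[A-Z][\w.]*)*)"
)

match = pattern.search(sentence)
if match:
    print({"relation": "acquisition",
           "acquirer": match.group("acquirer"),   # 'Foo Inc.'
           "acquired": match.group("acquired")})  # 'Bar Corp.'
</source>

Real systems replace such brittle patterns with named-entity recognition and parsing, but the goal, a structured record distilled from free text, is the same.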
A broad goal of IE is to allow computation to be done on the previously unstructured data. A more specific goal is to allow logical reasoning to draw inferences based on the logical content of the input data. The significance of IE is determined by the growing amount of information available in unstructured (i.e. without [[metadata]]) form, for instance on the Internet. This knowledge can be made more accessible by means of transformation into [[relational database|relational form]], or by marking-up with [[XML]] tags. An intelligent agent monitoring a news data feed requires IE to transform unstructured data into something that can be reasoned with. A typical application of IE is to scan a set of documents written in a [[natural language]] and populate a database with the information extracted.
Current approaches to IE use [[natural language processing]] techniques that focus on very restricted domains. For example, the ''[[Message Understanding Conference]]'' (MUC) is a competition-based conference that focused on the following domains in the past:

*MUC-1 (1987), MUC-2 (1989): Naval operations messages.
*MUC-3 (1991), MUC-4 (1992): Terrorism in Latin American countries.
*MUC-5 (1993): Joint ventures and microelectronics domain.
*MUC-6 (1995): News articles on management changes.
*MUC-7 (1998): Satellite launch reports.
Natural-language texts may need some form of [[text simplification]] to create more easily machine-readable text from which sentences can be extracted.

Typical subtasks of IE are:
* [[Named Entity Recognition]]: recognition of entity names (for people and organizations), place names, temporal expressions, and certain types of numerical expressions.
* [[Coreference]]: identification of chains of [[noun phrase]]s that refer to the same object. For example, [[Anaphora (linguistics)|anaphora]] is a type of coreference.
* [[Terminology extraction]]: finding the relevant terms for a given [[text corpus|corpus]].
* Relation Extraction: identification of relations between entities (a toy sketch follows this list), such as:
**PERSON works for ORGANIZATION (extracted from the sentence "Bill works for IBM.")
**PERSON located in LOCATION (extracted from the sentence "Bill is in France.")
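The toy sketch referred to above, covering just the two example relations (illustrative regular expressions only; real relation extraction builds on the named-entity and coreference subtasks listed before it):

<source lang="python">
import re

# Each pattern maps a surface form to a relation label.
PATTERNS = [
    (re.compile(r"(\w+) works for (\w+)"), "works_for"),
    (re.compile(r"(\w+) is in (\w+)"), "located_in"),
]

def extract_relations(sentence):
    """Return (relation, arg1, arg2) triples found in the sentence."""
    return [(label, a, b)
            for pattern, label in PATTERNS
            for a, b in pattern.findall(sentence)]

print(extract_relations("Bill works for IBM."))  # [('works_for', 'Bill', 'IBM')]
print(extract_relations("Bill is in France."))   # [('located_in', 'Bill', 'France')]
</source>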
Information retrieval

'''Information retrieval''' ('''IR''') is the science of searching for documents, for [[information]] within documents and for [[Metadata (computing)|metadata]] about documents, as well as that of searching [[relational database]]s and the [[World Wide Web]]. There is overlap in the usage of the terms data retrieval, [[document retrieval]], information retrieval, and [[text retrieval]], but each also has its own body of literature, theory, [[Praxis (process)|praxis]] and technologies. IR is [[interdisciplinary]], based on [[computer science]], [[mathematics]], [[library science]], [[information science]], [[information architecture]], [[cognitive psychology]], [[linguistics]], [[statistics]] and [[physics]].

Automated information retrieval systems are used to reduce what has been called "[[information overload]]". Many universities and [[public library|public libraries]] use IR systems to provide access to books, journals and other documents. Web [[Web search engine|search engine]]s are the most visible [[Information retrieval applications|IR applications]].
== History ==

The idea of using computers to search for relevant pieces of information was popularized in the article ''[[As We May Think]]'' by [[Vannevar Bush]] in 1945. The first implementations of information retrieval systems were introduced in the 1950s and 1960s. By 1990 several different techniques had been shown to perform well on small text corpora (several thousand documents). In 1992 the US Department of Defense, along with the [[National Institute of Standards and Technology]] (NIST), cosponsored the [[Text Retrieval Conference]] (TREC) as part of the TIPSTER text program. The aim of this was to support the information retrieval community by supplying the infrastructure that was needed for evaluation of text retrieval methodologies on a very large text collection. This catalyzed research on methods that [[scalability|scale]] to huge corpora. The introduction of web [[Web search engine|search engine]]s has boosted the need for very large scale retrieval systems even further.

The use of digital methods for storing and retrieving information has led to the phenomenon of [[digital obsolescence]], where a digital resource ceases to be readable because the physical media, the reader required to read the media, the hardware, or the software that runs on it is no longer available. The information is initially easier to retrieve than if it were on paper, but is then effectively lost.
=== Timeline ===

* 1890: Hollerith tabulating machines were used to analyze the US census ([[Herman Hollerith]]).
* 1945: [[Vannevar Bush]]'s ''[[As We May Think]]'' appeared in ''[[Atlantic Monthly]]''.
* Late 1940s: The US military confronted problems of indexing and retrieval of wartime scientific research documents captured from the Germans.
* 1947: [[Hans Peter Luhn]] (research engineer at IBM since 1941) began work on a mechanized, punch card based system for searching chemical compounds.
* 1950: The term "information retrieval" may have been coined by [[Calvin Mooers]].
* 1950s: Growing concern in the US about a "science gap" with the USSR motivated and encouraged funding, and provided a backdrop for mechanized literature searching systems ([[Allen Kent]] et al.) and the invention of citation indexing ([[Eugene Garfield]]).
* 1955: Allen Kent joined [[Case Western Reserve University]], eventually becoming associate director of the Center for Documentation and Communications Research. That same year, Kent and colleagues published a paper in American Documentation describing the precision and recall measures, as well as detailing a proposed "framework" for evaluating an IR system, which included statistical sampling methods for determining the number of relevant documents not retrieved.
* 1958: The International Conference on Scientific Information in Washington, DC included consideration of IR systems as a solution to the problems identified. See: Proceedings of the International Conference on Scientific Information, 1958 (National Academy of Sciences, Washington, DC, 1959).
* 1959: Hans Peter Luhn published "Auto-encoding of documents for information retrieval."
* 1960: Melvin Earl (Bill) Maron and J. L. Kuhns published "On relevance, probabilistic indexing, and information retrieval" in Journal of the ACM 7(3):216-244, July 1960.
* Early 1960s: [[Gerard Salton]] began work on IR at Harvard, and later moved to Cornell.
* 1962: [[Cyril W. Cleverdon]] published early findings of the Cranfield studies, developing a model for IR system evaluation. See: Cyril W. Cleverdon, "Report on the Testing and Analysis of an Investigation into the Comparative Efficiency of Indexing Systems". Cranfield Coll. of Aeronautics, Cranfield, England, 1962.
* 1962: Kent published ''Information Analysis and Retrieval''.
* 1963: The Weinberg report "Science, Government and Information" gave a full articulation of the idea of a "crisis of scientific information." The report was named after Dr. [[Alvin Weinberg]].
* 1963: [[Joseph Becker]] and [[Robert M. Hayes]] published a text on information retrieval: Becker, Joseph; Hayes, Robert Mayo. ''Information storage and retrieval: tools, elements, theories''. New York, Wiley (1963).
* 1964: [[Karen Spärck Jones]] finished her thesis at Cambridge, ''Synonymy and Semantic Classification'', and continued work on [[computational linguistics]] as it applies to IR.
* 1964: The [[National Bureau of Standards]] sponsored a symposium titled "Statistical Association Methods for Mechanized Documentation." It included several highly significant papers, among them G. Salton's first published reference (we believe) to the SMART system.
* Mid-1960s: The National Library of Medicine developed [[MEDLARS]] (Medical Literature Analysis and Retrieval System), the first major machine-readable database and batch retrieval system.
* Mid-1960s: Project Intrex at MIT.
* 1965: [[J. C. R. Licklider]] published ''Libraries of the Future''.
* 1966: [[Don Swanson]] was involved in studies at the University of Chicago on Requirements for Future Catalogs.
* 1968: Gerard Salton published ''Automatic Information Organization and Retrieval''.
* 1968: [[J. W. Sammon]]'s RADC tech report "Some Mathematics of Information Storage and Retrieval..." outlined the vector model.
* 1969: Sammon's "A nonlinear mapping for data structure analysis" (IEEE Transactions on Computers) was the first proposal for a visualization interface to an IR system.
* Late 1960s: [[F. W. Lancaster]] completed evaluation studies of the MEDLARS system and published the first edition of his text on information retrieval.
* Early 1970s: First online systems: NLM's AIM-TWX, MEDLINE; Lockheed's Dialog; SDC's ORBIT.
* Early 1970s: [[Theodor Nelson]], promoting the concept of [[hypertext]], published Computer Lib/Dream Machines.
* 1971: [[N. Jardine]] and [[C. J. Van Rijsbergen]] published "The use of hierarchic clustering in information retrieval", which articulated the "cluster hypothesis." (Information Storage and Retrieval, 7(5), pp. 217-240, Dec 1971)
* 1975: Three highly influential publications by Salton fully articulated his vector processing framework and term discrimination model:
** ''A Theory of Indexing'' (Society for Industrial and Applied Mathematics)
** "A theory of term importance in automatic text analysis" (JASIS v. 26)
** "A vector space model for automatic indexing" (CACM 18:11)
* 1978: The first [[Association for Computing Machinery|ACM]] [[SIGIR]] conference.
* 1979: C. J. Van Rijsbergen published ''Information Retrieval'' (Butterworths), with heavy emphasis on probabilistic models.
* 1980: The first international ACM SIGIR conference, joint with the British Computer Society IR group in Cambridge.
* 1982: [[Nicholas J. Belkin|Belkin]], Oddy, and Brooks proposed the ASK (Anomalous State of Knowledge) viewpoint for information retrieval. This was an important concept, though their automated analysis tool proved ultimately disappointing.
* 1983: Salton (and M. McGill) published ''Introduction to Modern Information Retrieval'' (McGraw-Hill), with heavy emphasis on vector models.
* Mid-1980s: Efforts to develop end-user versions of commercial IR systems.
* 1985-1993: Key papers on, and experimental systems for, visualization interfaces, including work by [[D. B. Crouch]], [[Robert R. Korfhage]], [[M. Chalmers]], [[A. Spoerri]] and others.
* 1989: First [[World Wide Web]] proposals by [[Tim Berners-Lee]] at [[CERN]].
* 1992: First TREC conference.
* 1997: Publication of [[Robert R. Korfhage|Korfhage]]'s ''Information Storage and Retrieval'', with emphasis on visualization and multi-reference point systems.
* Late 1990s: Web [[Web search engine|search engine]] implementation of many features formerly found only in experimental IR systems.
== Overview ==

An information retrieval process begins when a user enters a query into the system. Queries are formal statements of [[information need]]s, for example search strings in web search engines. In information retrieval a query does not uniquely identify a single object in the collection. Instead, several objects may match the query, perhaps with different degrees of [[relevance|relevancy]].

An object is an entity which keeps or stores information in a database. User queries are matched to objects stored in the database. Depending on the [[Information retrieval applications|application]], the data objects may be, for example, text documents, images or videos. Often the documents themselves are not kept or stored directly in the IR system, but are instead represented in the system by document surrogates.

Most IR systems compute a numeric score for how well each object in the database matches the query, and rank the objects according to this value. The top ranking objects are then shown to the user. The process may then be iterated if the user wishes to refine the query.
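A minimal sketch of this scoring-and-ranking loop over a toy in-memory collection (term-overlap scoring is assumed purely for illustration; real systems use inverted indexes and weighted ranking functions):

<source lang="python">
def score(query_terms, document):
    """Toy relevance score: number of distinct query terms in the document."""
    doc_terms = set(document.lower().split())
    return sum(term in doc_terms for term in query_terms)

def retrieve(query, collection, k=3):
    """Score every document against the query and return the top k."""
    query_terms = set(query.lower().split())
    ranked = sorted(collection.items(),
                    key=lambda item: score(query_terms, item[1]),
                    reverse=True)
    return [(doc_id, score(query_terms, text)) for doc_id, text in ranked[:k]]

collection = {
    "d1": "information retrieval ranks documents against a query",
    "d2": "pattern recognition in images",
    "d3": "query languages for relational databases",
}
print(retrieve("query information", collection))
# [('d1', 2), ('d3', 1), ('d2', 0)]
</source>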
== Performance measures ==

Many different measures for evaluating the performance of information retrieval systems have been proposed. The measures require a collection of documents and a query. All common measures described here assume a ground truth notion of relevancy: every document is known to be either relevant or non-relevant to a particular query. In practice queries may be [[ill-posed]] and there may be different shades of relevancy.
=== Precision ===

Precision is the fraction of the documents retrieved that are [[Relevance (information retrieval)|relevant]] to the user's information need.

:<math>\mbox{precision}=\frac{|\{\mbox{relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{retrieved documents}\}|}</math>

In [[binary classification]], precision is analogous to [[positive predictive value]]. Precision takes all retrieved documents into account. It can also be evaluated at a given cut-off rank, considering only the topmost results returned by the system. This measure is called ''precision at n'' or ''P@n''. Note that the meaning and usage of "precision" in the field of Information Retrieval differs from the definition of [[accuracy and precision]] within other branches of science and technology.
=== Recall ===

Recall is the fraction of the documents that are relevant to the query that are successfully retrieved.

:<math>\mbox{recall}=\frac{|\{\mbox{relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{relevant documents}\}|}</math>

In binary classification, recall is called [[sensitivity (tests)|sensitivity]]. So it can be looked at as ''the probability that a relevant document is retrieved by the query''. It is trivial to achieve recall of 100% by returning all documents in response to any query. Therefore recall alone is not enough; one also needs to measure the number of non-relevant documents, for example by computing the precision.
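Both measures are direct set computations; a small sketch over toy document identifiers (assumed data, standard definitions):

<source lang="python">
def precision(retrieved, relevant):
    """Fraction of retrieved documents that are relevant."""
    return len(retrieved & relevant) / len(retrieved)

def recall(retrieved, relevant):
    """Fraction of relevant documents that were retrieved."""
    return len(retrieved & relevant) / len(relevant)

retrieved = {"d1", "d2", "d3", "d4", "d5"}
relevant = {"d1", "d3", "d6", "d7"}
print(precision(retrieved, relevant))  # 2/5 = 0.4
print(recall(retrieved, relevant))     # 2/4 = 0.5
</source>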
=== Fall-Out ===

Fall-out is the proportion of non-relevant documents that are retrieved, out of all non-relevant documents available:

:<math>\mbox{fall-out}=\frac{|\{\mbox{non-relevant documents}\}\cap\{\mbox{retrieved documents}\}|}{|\{\mbox{non-relevant documents}\}|}</math>

In binary classification, fall-out is closely related to [[specificity (tests)|specificity]]; more precisely, <math>\mbox{fall-out}=1-\mbox{specificity}</math>. It can be looked at as ''the probability that a non-relevant document is retrieved by the query''. It is trivial to achieve fall-out of 0% by returning zero documents in response to any query.
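As a worked example with assumed toy numbers: in a collection of 100 documents of which 10 are relevant to a query, a system retrieving 20 documents, 6 of them relevant, has precision <math>6/20 = 0.3</math>, recall <math>6/10 = 0.6</math> and fall-out <math>(20-6)/(100-10) = 14/90 \approx 0.16</math>.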
=== F-measure ===

The weighted [[harmonic mean]] of precision and recall, the traditional F-measure or balanced F-score, is:

:<math>F = \frac{2 \cdot (\mbox{precision} \cdot \mbox{recall})}{\mbox{precision} + \mbox{recall}}</math>

This is also known as the <math>F_1</math> measure, because recall and precision are evenly weighted. The general formula for non-negative real <math>\beta</math> is:

:<math>F_\beta = \frac{(1 + \beta^2) \cdot (\mbox{precision} \cdot \mbox{recall})}{\beta^2 \cdot \mbox{precision} + \mbox{recall}}</math>

Two other commonly used F measures are the <math>F_2</math> measure, which weights recall twice as much as precision, and the <math>F_{0.5}</math> measure, which weights precision twice as much as recall. The F-measure was derived by van Rijsbergen (1979) so that <math>F_\beta</math> "measures the effectiveness of retrieval with respect to a user who attaches <math>\beta</math> times as much importance to recall as precision". It is based on van Rijsbergen's effectiveness measure <math>E = 1 - \frac{1}{\frac{\alpha}{P} + \frac{1-\alpha}{R}}</math>. Their relationship is <math>F_\beta = 1 - E</math> where <math>\alpha = \frac{1}{1 + \beta^2}</math>.
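Continuing the assumed toy numbers above (precision 0.3, recall 0.6): <math>F_1 = \frac{2 \cdot 0.3 \cdot 0.6}{0.3 + 0.6} = 0.4</math>, while the recall-weighted <math>F_2 = \frac{(1+2^2) \cdot 0.3 \cdot 0.6}{2^2 \cdot 0.3 + 0.6} = \frac{0.9}{1.8} = 0.5</math>.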
=== Average precision ===

Precision and recall are based on the whole list of documents returned by the system. Average precision emphasizes returning more relevant documents earlier. It is the average of the precisions computed after truncating the list after each of the relevant documents in turn:

:<math>\operatorname{AveP} = \frac{\sum_{r=1}^{N} (P(r) \times \mathrm{rel}(r))}{\mbox{number of relevant documents}}</math>

where ''r'' is the rank, ''N'' the number retrieved, ''rel()'' a binary function on the relevance of a given rank, and ''P()'' precision at a given cut-off rank.
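A direct transcription of the formula (toy relevance flags assumed; the total number of relevant documents in the collection must be known):

<source lang="python">
def average_precision(rels, num_relevant):
    """rels: 0/1 relevance flags in rank order, rank 1 first."""
    hits, total = 0, 0.0
    for rank, rel in enumerate(rels, start=1):
        if rel:
            hits += 1
            total += hits / rank  # precision at this relevant rank
    return total / num_relevant

# Relevant documents at ranks 1, 3 and 5 of 5 retrieved; 4 relevant exist.
print(average_precision([1, 0, 1, 0, 1], num_relevant=4))  # ~0.567
</source>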
== Model types ==

[[Image:Information-Retrieval-Models.png|thumb|500px|Categorization of IR models (translated from the [http://de.wikipedia.org/wiki/Informationsrückgewinnung#Klassifikation_von_Modellen_zur_Repr.C3.A4sentation_nat.C3.BCrlichsprachlicher_Dokumente German entry]; original source: [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id= Dominik Kuropka])]]

For information retrieval to be efficient, the documents are typically transformed into a suitable representation. There are several representations. The picture on the right illustrates the relationship of some common models. In the picture, the models are categorized according to two dimensions: the mathematical basis and the properties of the model.
10401360@unknown@formal@none@1@S@=== First dimension: mathematical basis ===@@@@1@6@@danf@17-8-2009
10401370@unknown@formal@none@1@S@* ''Set-theoretic models'' represent documents as sets of words or phrases.@@@@1@11@@danf@17-8-2009
10401380@unknown@formal@none@1@S@Similarities are usually derived from set-theoretic operations on those sets.@@@@1@10@@danf@17-8-2009
10401390@unknown@formal@none@1@S@Common models are:@@@@1@3@@danf@17-8-2009
10401400@unknown@formal@none@1@S@** [[Standard Boolean model]]@@@@1@4@@danf@17-8-2009
10401410@unknown@formal@none@1@S@** [[Extended Boolean model]]@@@@1@4@@danf@17-8-2009
10401420@unknown@formal@none@1@S@** [[Fuzzy retrieval]]@@@@1@3@@danf@17-8-2009
10401430@unknown@formal@none@1@S@* ''Algebraic models'' represent documents and queries usually as vectors, matrices or tuples.@@@@1@13@@danf@17-8-2009
10401440@unknown@formal@none@1@S@The similarity of the query vector and document vector is represented as a scalar value; a minimal sketch of this idea appears after this list.@@@@1@15@@danf@17-8-2009
10401450@unknown@formal@none@1@S@** [[Vector space model]]@@@@1@4@@danf@17-8-2009
10401460@unknown@formal@none@1@S@** [[Generalized vector space model]]@@@@1@5@@danf@17-8-2009
10401470@unknown@formal@none@1@S@** Topic-based vector space model (literature: [http://www.kuropka.net/files/TVSM.pdf], [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id=])@@@@1@8@@danf@17-8-2009
10401490@unknown@formal@none@1@S@** Enhanced topic-based vector space model (literature: [http://kuropka.net/files/HPI_Evaluation_of_eTVSM.pdf], [http://www.logos-verlag.de/cgi-bin/engbuchmid?isbn=0514&lng=eng&id=])@@@@1@9@@danf@17-8-2009
10401500@unknown@formal@none@1@S@** Latent semantic indexing, also known as [[latent semantic analysis]]@@@@1@8@@danf@17-8-2009
10401510@unknown@formal@none@1@S@* ''Probabilistic models'' treat the process of document retrieval as a probabilistic inference.@@@@1@13@@danf@17-8-2009
10401520@unknown@formal@none@1@S@Similarities are computed as probabilities that a document is relevant for a given query.@@@@1@14@@danf@17-8-2009
10401530@unknown@formal@none@1@S@Probabilistic theorems like [[Bayes' theorem]] are often used in these models.@@@@1@12@@danf@17-8-2009
10401540@unknown@formal@none@1@S@** [[Binary independence retrieval]]@@@@1@4@@danf@17-8-2009
10401550@unknown@formal@none@1@S@** [[Probabilistic relevance model (BM25)]]@@@@1@5@@danf@17-8-2009
10401560@unknown@formal@none@1@S@** Uncertain inference@@@@1@3@@danf@17-8-2009
10401570@unknown@formal@none@1@S@** [[Language model]]s@@@@1@3@@danf@17-8-2009
10401580@unknown@formal@none@1@S@** [[Divergence-from-randomness model]]@@@@1@3@@danf@17-8-2009
10401590@unknown@formal@none@1@S@** [[Latent Dirichlet allocation]]@@@@1@4@@danf@17-8-2009
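As noted above for the algebraic models, here is a minimal Python sketch of the vector space idea: documents and the query become term-count vectors, and documents are ranked by the cosine of the angle between vectors. The tokenizer (whitespace splitting) and the toy documents are illustrative assumptions.

<source lang="python">
import math
from collections import Counter

def cosine_similarity(a, b):
    """Cosine of the angle between two sparse term-count vectors."""
    dot = sum(count * b[term] for term, count in a.items() if term in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

documents = [
    "information retrieval ranks documents",
    "information theory quantifies information",
]
query = "information retrieval"

doc_vectors = [Counter(doc.split()) for doc in documents]
query_vector = Counter(query.split())
print([cosine_similarity(query_vector, v) for v in doc_vectors])
# The first document scores higher for this query (≈ 0.71 vs ≈ 0.58).
</source>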
10401600@unknown@formal@none@1@S@=== Second dimension: properties of the model ===@@@@1@8@@danf@17-8-2009
10401610@unknown@formal@none@1@S@* ''Models without term-interdependencies'' treat different terms/words as independent.@@@@1@9@@danf@17-8-2009
10401620@unknown@formal@none@1@S@This fact is usually represented in vector space models by the [[orthogonality]] assumption of term vectors or in probabilistic models by an [[independency|independence]] assumption for term variables.@@@@1@27@@danf@17-8-2009
10401630@unknown@formal@none@1@S@* ''Models with immanent term interdependencies'' allow a representation of interdependencies between terms.@@@@1@13@@danf@17-8-2009
10401640@unknown@formal@none@1@S@However, the degree of interdependency between two terms is defined by the model itself.@@@@1@15@@danf@17-8-2009
10401650@unknown@formal@none@1@S@It is usually directly or indirectly derived (e.g. by [[dimension reduction|dimensional reduction]]) from the [[co-occurrence]] of those terms in the whole set of documents.@@@@1@24@@danf@17-8-2009
10401660@unknown@formal@none@1@S@* ''Models with transcendent term interdependencies'' allow a representation of interdependencies between terms, but they do not allege how the interdependency between two terms is defined.@@@@1@26@@danf@17-8-2009
10401670@unknown@formal@none@1@S@They rely on an external source for the degree of interdependency between two terms (for example, a human or a sophisticated algorithm).@@@@1@20@@danf@17-8-2009
10401690@unknown@formal@none@1@S@== Major figures ==@@@@1@4@@danf@17-8-2009
10401700@unknown@formal@none@1@S@* [[Gerard Salton]]@@@@1@3@@danf@17-8-2009
10401710@unknown@formal@none@1@S@* [[Hans Peter Luhn]]@@@@1@4@@danf@17-8-2009
10401720@unknown@formal@none@1@S@* [http://ciir.cs.umass.edu/personnel/croft.html W. Bruce Croft]@@@@1@5@@danf@17-8-2009
10401730@unknown@formal@none@1@S@* [[Karen Spärck Jones]]@@@@1@4@@danf@17-8-2009
10401740@unknown@formal@none@1@S@* [[C. J. van Rijsbergen]]@@@@1@5@@danf@17-8-2009
10401750@unknown@formal@none@1@S@* [http://www.soi.city.ac.uk/~ser/homepage.html Stephen E. Robertson]@@@@1@5@@danf@17-8-2009
10401760@unknown@formal@none@1@S@== Awards in the field ==@@@@1@6@@danf@17-8-2009
10401770@unknown@formal@none@1@S@* [[Tony Kent Strix award]]@@@@1@5@@danf@17-8-2009
10401780@unknown@formal@none@1@S@* [[Gerard Salton Award]]@@@@1@4@@danf@17-8-2009
10410010@unknown@formal@none@1@S@Information theory@@@@1@2@@danf@17-8-2009
10410020@unknown@formal@none@1@S@'''Information theory''' is a branch of [[applied mathematics]] and [[electrical engineering]] involving the quantification of [[information]].@@@@1@16@@danf@17-8-2009
10410030@unknown@formal@none@1@S@Historically, information theory was developed to find fundamental limits on compressing and reliably [[communication|communicating]] data.@@@@1@15@@danf@17-8-2009
10410040@unknown@formal@none@1@S@Since its inception it has broadened to find applications in many other areas, including [[statistical inference]], [[natural language processing]], [[cryptography]] generally, [[networks]] other than communication networks (as in [[neurobiology]]), the evolution and function of molecular codes, model selection in ecology, thermal physics, [[quantum computing]], plagiarism detection, and other forms of [[data analysis]].@@@@1@53@@danf@17-8-2009
10410050@unknown@formal@none@1@S@A key measure of information in the theory is known as [[information entropy]], which is usually expressed by the average number of bits needed for storage or communication.@@@@1@28@@danf@17-8-2009
10410060@unknown@formal@none@1@S@Intuitively, entropy quantifies the uncertainty involved when encountering a [[random variable]].@@@@1@11@@danf@17-8-2009
10410070@unknown@formal@none@1@S@For example, a fair coin flip (2 equally likely outcomes) will have less entropy than a roll of a die (6 equally likely outcomes).@@@@1@24@@danf@17-8-2009
10410080@unknown@formal@none@1@S@Applications of fundamental topics of information theory include [[lossless data compression]] (e.g. [[ZIP (file format)|ZIP files]]), [[lossy data compression]] (e.g. [[MP3]]s), and [[channel capacity|channel coding]] (e.g. for [[DSL]] lines).@@@@1@29@@danf@17-8-2009
10410110@unknown@formal@none@1@S@The field is at the intersection of [[mathematics]], [[statistics]], [[computer science]], [[physics]], [[neurobiology]], and [[electrical engineering]].@@@@1@16@@danf@17-8-2009
10410120@unknown@formal@none@1@S@Its impact has been crucial to the success of the [[Voyager program|Voyager]] missions to deep space, the invention of the CD, the feasibility of mobile phones, the development of the [[Internet]], the study of [[linguistics]] and of human perception, the understanding of [[black hole]]s, and numerous other fields.@@@@1@48@@danf@17-8-2009
10410130@unknown@formal@none@1@S@Important sub-fields of information theory are source coding, channel coding, algorithmic complexity theory, algorithmic information theory, and measures of information.@@@@1@20@@danf@17-8-2009
10410140@unknown@formal@none@1@S@==Overview==@@@@1@1@@danf@17-8-2009
10410150@unknown@formal@none@1@S@The main concepts of information theory can be grasped by considering the most widespread means of human communication: language.@@@@1@19@@danf@17-8-2009
10410160@unknown@formal@none@1@S@Two important aspects of a good language are as follows: First, the most common words (e.g., "a", "the", "I") should be shorter than less common words (e.g., "benefit", "generation", "mediocre"), so that sentences will not be too long.@@@@1@38@@danf@17-8-2009
10410170@unknown@formal@none@1@S@Such a tradeoff in word length is analogous to [[data compression]] and is the essential aspect of [[source coding]].@@@@1@19@@danf@17-8-2009
10410180@unknown@formal@none@1@S@Second, if part of a sentence is unheard or misheard due to noise — e.g., a passing car — the listener should still be able to glean the meaning of the underlying message.@@@@1@33@@danf@17-8-2009
10410190@unknown@formal@none@1@S@Such robustness is as essential for an electronic communication system as it is for a language; properly building such robustness into communications is done by [[Channel capacity|channel coding]].@@@@1@28@@danf@17-8-2009
10410200@unknown@formal@none@1@S@Source coding and channel coding are the fundamental concerns of information theory.@@@@1@12@@danf@17-8-2009
10410210@unknown@formal@none@1@S@Note that these concerns have nothing to do with the ''importance'' of messages.@@@@1@13@@danf@17-8-2009
10410220@unknown@formal@none@1@S@For example, a platitude such as "Thank you; come again" takes about as long to say or write as the urgent plea, "Call an ambulance!" while clearly the latter is more important and more meaningful.@@@@1@35@@danf@17-8-2009
10410230@unknown@formal@none@1@S@Information theory, however, does not consider message importance or meaning, as these are matters of the quality of data rather than the quantity and readability of data, the latter of which is determined solely by probabilities.@@@@1@36@@danf@17-8-2009
10410240@unknown@formal@none@1@S@Information theory is generally considered to have been founded in 1948 by [[Claude Elwood Shannon|Claude Shannon]] in his seminal work, "[[A Mathematical Theory of Communication]]."@@@@1@25@@danf@17-8-2009
10410250@unknown@formal@none@1@S@The central paradigm of classical information theory is the engineering problem of the transmission of information over a noisy channel.@@@@1@20@@danf@17-8-2009
10410260@unknown@formal@none@1@S@The most fundamental results of this theory are Shannon's [[source coding theorem]], which establishes that, on average, the number of ''bits'' needed to represent the result of an uncertain event is given by its [[information entropy|entropy]]; and Shannon's [[noisy-channel coding theorem]], which states that ''reliable'' communication is possible over ''noisy'' channels provided that the rate of communication is below a certain threshold called the channel capacity.@@@@1@66@@danf@17-8-2009
10410270@unknown@formal@none@1@S@The channel capacity can be approached in practice by using appropriate encoding and decoding systems.@@@@1@15@@danf@17-8-2009
10410280@unknown@formal@none@1@S@Information theory is closely associated with a collection of pure and applied disciplines that have been investigated and reduced to engineering practice under a variety of rubrics throughout the world over the past half century or more: [[adaptive system]]s, [[anticipatory system]]s, [[artificial intelligence]], [[complex system]]s, [[complexity science]], [[cybernetics]], [[informatics]], [[machine learning]], along with [[systems science]]s of many descriptions.@@@@1@58@@danf@17-8-2009
10410290@unknown@formal@none@1@S@Information theory is a broad and deep mathematical theory, with equally broad and deep applications, amongst which is the vital field of [[coding theory]].@@@@1@24@@danf@17-8-2009
10410300@unknown@formal@none@1@S@Coding theory is concerned with finding explicit methods, called ''codes'', of increasing the efficiency and reducing the net error rate of data communication over a noisy channel to near the limit that Shannon proved is the maximum possible for that channel.@@@@1@41@@danf@17-8-2009
10410310@unknown@formal@none@1@S@These codes can be roughly subdivided into [[data compression]] (source coding) and [[error-correction]] (channel coding) techniques.@@@@1@16@@danf@17-8-2009
10410320@unknown@formal@none@1@S@In the latter case, it took many years to find the methods Shannon's work proved were possible.@@@@1@17@@danf@17-8-2009
10410330@unknown@formal@none@1@S@A third class of information theory codes are cryptographic algorithms (both [[code (cryptography)|code]]s and [[cipher]]s).@@@@1@15@@danf@17-8-2009
10410340@unknown@formal@none@1@S@Concepts, methods and results from coding theory and information theory are widely used in [[cryptography]] and [[cryptanalysis]].@@@@1@17@@danf@17-8-2009
10410350@unknown@formal@none@1@S@''See the article [[ban (information)]] for a historical application.''@@@@1@9@@danf@17-8-2009
10410360@unknown@formal@none@1@S@Information theory is also used in [[information retrieval]], [[intelligence (information gathering)|intelligence gathering]], [[gambling]], [[statistics]], and even in [[musical composition]].@@@@1@19@@danf@17-8-2009
10410370@unknown@formal@none@1@S@==Historical background==@@@@1@2@@danf@17-8-2009
10410380@unknown@formal@none@1@S@The landmark event that established the discipline of information theory, and brought it to immediate worldwide attention, was the publication of [[Claude E. Shannon]]'s classic paper "[[A Mathematical Theory of Communication]]" in the ''[[Bell System Technical Journal]]'' in July and October of 1948.@@@@1@43@@danf@17-8-2009
10410390@unknown@formal@none@1@S@Prior to this paper, limited information theoretic ideas had been developed at Bell Labs, all implicitly assuming events of equal probability.@@@@1@21@@danf@17-8-2009
10410400@unknown@formal@none@1@S@[[Harry Nyquist]]'s 1924 paper, ''Certain Factors Affecting Telegraph Speed,'' contains a theoretical section quantifying "intelligence" and the "line speed" at which it can be transmitted by a communication system, giving the relation <math>W = K \log m</math>, where ''W'' is the speed of transmission of intelligence, ''m'' is the number of different voltage levels to choose from at each time step, and ''K'' is a constant.@@@@1@66@@danf@17-8-2009
10410410@unknown@formal@none@1@S@[[Ralph Hartley]]'s 1928 paper, ''Transmission of Information,'' uses the word ''information'' as a measurable quantity, reflecting the receiver's ability to distinguish one sequence of symbols from any other, thus quantifying information as <math>H = \log S^n = n \log S</math>, where ''S'' was the number of possible symbols, and ''n'' the number of symbols in a transmission.@@@@1@58@@danf@17-8-2009
10410420@unknown@formal@none@1@S@The natural unit of information was therefore the decimal digit, much later renamed the [[ban (information)|hartley]] in his honour as a unit, scale, or measure of information.@@@@1@28@@danf@17-8-2009
10410430@unknown@formal@none@1@S@[[Alan Turing]] in 1940 used similar ideas as part of the statistical analysis of the breaking of the German second world war [[Cryptanalysis of the Enigma|Enigma]] ciphers.@@@@1@27@@danf@17-8-2009
10410440@unknown@formal@none@1@S@Much of the mathematics behind information theory with events of different probabilities was developed for the field of [[thermodynamics]] by [[Ludwig Boltzmann]] and [[J. Willard Gibbs]].@@@@1@26@@danf@17-8-2009
10410450@unknown@formal@none@1@S@Connections between information-theoretic entropy and thermodynamic entropy, including the important contributions by [[Rolf Landauer]] in the 1960s, are explored in ''[[Entropy in thermodynamics and information theory]]''.@@@@1@26@@danf@17-8-2009
10410460@unknown@formal@none@1@S@In Shannon's revolutionary and groundbreaking paper, the work for which had been substantially completed at Bell Labs by the end of 1944, Shannon for the first time introduced the qualitative and quantitative model of communication as a statistical process underlying information theory, opening with the assertion that@@@@1@47@@danf@17-8-2009
10410470@unknown@formal@none@1@S@:"The fundamental problem of communication is that of reproducing at one point, either exactly or approximately, a message selected at another point."@@@@1@22@@danf@17-8-2009
10410480@unknown@formal@none@1@S@With it came the ideas of@@@@1@6@@danf@17-8-2009
10410490@unknown@formal@none@1@S@* the [[information entropy]] and [[redundancy (information theory)|redundancy]] of a source, and its relevance through the [[source coding theorem]];@@@@1@19@@danf@17-8-2009
10410500@unknown@formal@none@1@S@* the [[mutual information]], and the [[channel capacity]] of a noisy channel, including the promise of perfect loss-free communication given by the [[noisy-channel coding theorem]];@@@@1@25@@danf@17-8-2009
10410510@unknown@formal@none@1@S@* the practical result of the [[Shannon–Hartley law]] for the channel capacity of a Gaussian channel; and of course@@@@1@19@@danf@17-8-2009
10410520@unknown@formal@none@1@S@* the [[bit]]—a new way of seeing the most fundamental unit of information@@@@1@13@@danf@17-8-2009
10410530@unknown@formal@none@1@S@==Ways of measuring information==@@@@1@4@@danf@17-8-2009
10410540@unknown@formal@none@1@S@Information theory is based on [[probability theory]] and [[statistics]].@@@@1@9@@danf@17-8-2009
10410550@unknown@formal@none@1@S@The most important quantities of information are [[Information entropy|entropy]], the information in a [[random variable]], and [[mutual information]], the amount of information in common between two random variables.@@@@1@28@@danf@17-8-2009
10410560@unknown@formal@none@1@S@The former quantity indicates how easily message data can be [[data compression|compressed]] while the latter can be used to find the communication rate across a [[Channel (communications)|channel]].@@@@1@27@@danf@17-8-2009
10410570@unknown@formal@none@1@S@The choice of logarithmic base in the following formulae determines the [[units of measurement|unit]] of [[information entropy]] that is used.@@@@1@20@@danf@17-8-2009
10410580@unknown@formal@none@1@S@The most common unit of information is the [[bit]], based on the [[binary logarithm]].@@@@1@14@@danf@17-8-2009
10410590@unknown@formal@none@1@S@Other units include the [[nat (information)|nat]], which is based on the [[natural logarithm]], and the [[deciban|hartley]], which is based on the [[common logarithm]].@@@@1@23@@danf@17-8-2009
10410600@unknown@formal@none@1@S@In what follows, an expression of the form <math>p \log p</math> is considered by convention to be equal to zero whenever <math>p = 0</math>.@@@@1@23@@danf@17-8-2009
10410605@unknown@formal@none@1@S@This is justified because <math>\lim_{p \rightarrow 0^+} p \log p = 0</math> for any logarithmic base.@@@@1@16@@danf@17-8-2009
10410610@unknown@formal@none@1@S@===Entropy===@@@@1@1@@danf@17-8-2009
10410620@unknown@formal@none@1@S@The '''[[information entropy|entropy]]''', <math>H(X)</math>, of a discrete random variable <math>X</math> is a measure of the amount of ''uncertainty'' associated with the value of <math>X</math>.@@@@1@24@@danf@17-8-2009
10410630@unknown@formal@none@1@S@Suppose one transmits 1000 bits (0s and 1s).@@@@1@8@@danf@17-8-2009
10410640@unknown@formal@none@1@S@If these bits are known ahead of transmission (to be a certain value with absolute probability), logic dictates that no information has been transmitted.@@@@1@24@@danf@17-8-2009
10410650@unknown@formal@none@1@S@If, however, each is equally and independently likely to be 0 or 1, 1000 bits (in the information theoretic sense) have been transmitted.@@@@1@23@@danf@17-8-2009
10410660@unknown@formal@none@1@S@Between these two extremes, information can be quantified as follows.@@@@1@10@@danf@17-8-2009
10410670@unknown@formal@none@1@S@If <math>\mathbb{X}</math> is the set of all messages <math>x</math> that <math>X</math> could be, and <math>p(x) = \Pr(X = x)</math>, then the entropy of <math>X</math> is defined:@@@@1@29@@danf@17-8-2009
10410680@unknown@formal@none@1@S@:<math>H(X) = \mathbb{E}_X[I(x)] = -\sum_{x \in \mathbb{X}} p(x) \log p(x).</math>@@@@1@12@@danf@17-8-2009
10410690@unknown@formal@none@1@S@(Here, <math>I(x)</math> is the [[self-information]], which is the entropy contribution of an individual message.)@@@@1@14@@danf@17-8-2009
10410700@unknown@formal@none@1@S@An important property of entropy is that it is maximized when all the messages in the message space are equiprobable—i.e., most unpredictable—in which case <math>H(X) = \log |\mathbb{X}|</math>.@@@@1@28@@danf@17-8-2009
10410710@unknown@formal@none@1@S@The special case of information entropy for a random variable with two outcomes is the '''[[binary entropy function]]''':@@@@1@18@@danf@17-8-2009
10410720@unknown@formal@none@1@S@:<math>H_\mathrm{b}(p) = -p \log p - (1 - p) \log (1 - p).</math>@@@@1@9@@danf@17-8-2009
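A small Python check of these definitions, tying back to the earlier coin-and-die example; the helper name is an illustrative choice.

<source lang="python">
import math

def entropy(probabilities, base=2):
    """Shannon entropy of a discrete distribution, in bits by default.

    Zero-probability terms contribute nothing, matching the 0 log 0 = 0 convention.
    """
    return -sum(p * math.log(p, base) for p in probabilities if p > 0)

print(entropy([0.5, 0.5]))   # fair coin: 1.0 bit
print(entropy([1/6] * 6))    # fair die: log2(6) ≈ 2.585 bits
print(entropy([0.9, 0.1]))   # biased coin: H_b(0.1) ≈ 0.469 bits
</source>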
10410730@unknown@formal@none@1@S@===Joint entropy===@@@@1@2@@danf@17-8-2009
10410740@unknown@formal@none@1@S@The '''[[joint entropy]]''' of two discrete random variables <math>X</math> and <math>Y</math> is merely the entropy of their pairing: <math>(X, Y)</math>.@@@@1@20@@danf@17-8-2009
10410750@unknown@formal@none@1@S@This implies that if <math>X</math> and <math>Y</math> are [[statistical independence|independent]], then their joint entropy is the sum of their individual entropies.@@@@1@21@@danf@17-8-2009
10410760@unknown@formal@none@1@S@For example, if <math>(X, Y)</math> represents the position of a [[chess]] piece — <math>X</math> the row and <math>Y</math> the column — then the joint entropy of the row of the piece and the column of the piece will be the entropy of the position of the piece.@@@@1@45@@danf@17-8-2009
10410770@unknown@formal@none@1@S@:<math>H(X, Y) = \mathbb{E}_{X,Y}[-\log p(x, y)] = -\sum_{x, y} p(x, y) \log p(x, y).</math>@@@@1@16@@danf@17-8-2009
10410780@unknown@formal@none@1@S@Despite similar notation, joint entropy should not be confused with '''[[cross entropy]]'''.@@@@1@12@@danf@17-8-2009
10410790@unknown@formal@none@1@S@===Conditional entropy (equivocation)===@@@@1@3@@danf@17-8-2009
10410800@unknown@formal@none@1@S@The '''[[conditional entropy]]''' or '''conditional uncertainty''' of <math>X</math> given random variable <math>Y</math> (also called the '''equivocation''' of <math>X</math> about <math>Y</math>) is the average conditional entropy over <math>Y</math>:@@@@1@27@@danf@17-8-2009
10410810@unknown@formal@none@1@S@:<math>H(X|Y) = \mathbb{E}_Y[H(X|y)] = -\sum_{y \in \mathbb{Y}} p(y) \sum_{x \in \mathbb{X}} p(x|y) \log p(x|y).</math>@@@@1@22@@danf@17-8-2009
10410820@unknown@formal@none@1@S@Because entropy can be conditioned on a random variable or on that random variable being a certain value, care should be taken not to confuse these two definitions of conditional entropy, the former of which is in more common use.@@@@1@40@@danf@17-8-2009
10410830@unknown@formal@none@1@S@A basic property of this form of conditional entropy is that:@@@@1@11@@danf@17-8-2009
10410840@unknown@formal@none@1@S@:<math>H(X|Y) = H(X, Y) - H(Y).</math>@@@@1@8@@danf@17-8-2009
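A minimal Python sketch verifying this identity on a small, invented joint distribution; it reuses the entropy helper and the math import from the sketch above.

<source lang="python">
# Invented joint distribution p(x, y) over x in {0, 1} and y in {0, 1}.
joint = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.2, (1, 1): 0.3}

p_y = {}
for (x, y), p in joint.items():
    p_y[y] = p_y.get(y, 0.0) + p  # marginal distribution of Y

h_xy = entropy(joint.values())  # H(X, Y)
h_y = entropy(p_y.values())     # H(Y)
# Conditional entropy directly from its definition, using p(x|y) = p(x, y) / p(y):
h_x_given_y = -sum(p * math.log2(p / p_y[y]) for (x, y), p in joint.items())

print(h_x_given_y, h_xy - h_y)  # both print the same value: H(X|Y) = H(X,Y) - H(Y)
</source>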
10410850@unknown@formal@none@1@S@===Mutual information (transinformation)===@@@@1@3@@danf@17-8-2009
10410860@unknown@formal@none@1@S@'''[[Mutual information]]''' measures the amount of information that can be obtained about one random variable by observing another.@@@@1@18@@danf@17-8-2009
10410870@unknown@formal@none@1@S@It is important in communication where it can be used to maximize the amount of information shared between sent and received signals.@@@@1@22@@danf@17-8-2009
10410880@unknown@formal@none@1@S@The mutual information of <math>X</math> relative to <math>Y</math> is given by:@@@@1@11@@danf@17-8-2009
10410890@unknown@formal@none@1@S@:<math>I(X; Y) = \mathbb{E}_{X,Y}[SI(x, y)] = \sum_{x, y} p(x, y) \log \frac{p(x, y)}{p(x)\, p(y)},</math>@@@@1@10@@danf@17-8-2009
10410900@unknown@formal@none@1@S@where <math>SI(x, y)</math> (''S''pecific mutual ''I''nformation) is the [[pointwise mutual information]].@@@@1@10@@danf@17-8-2009
10410910@unknown@formal@none@1@S@A basic property of the mutual information is that@@@@1@9@@danf@17-8-2009
10410920@unknown@formal@none@1@S@:<math>I(X; Y) = H(X) - H(X|Y).</math>@@@@1@6@@danf@17-8-2009
10410930@unknown@formal@none@1@S@That is, knowing ''Y'', we can save an average of <math>I(X; Y)</math> bits in encoding ''X'' compared to not knowing ''Y''.@@@@1@21@@danf@17-8-2009
10410940@unknown@formal@none@1@S@Mutual information is [[symmetric function|symmetric]]:@@@@1@5@@danf@17-8-2009
10410950@unknown@formal@none@1@S@:<math>I(X; Y) = I(Y; X) = H(X) + H(Y) - H(X, Y).</math>@@@@1@10@@danf@17-8-2009
10410960@unknown@formal@none@1@S@Mutual information can be expressed as the average [[Kullback–Leibler divergence]] (information gain) of the [[posterior probability|posterior probability distribution]] of ''X'' given the value of ''Y'' to the [[prior probability|prior distribution]] on ''X'':@@@@1@32@@danf@17-8-2009
10410970@unknown@formal@none@1@S@:<math>I(X; Y) = \mathbb{E}_{p(y)}[D_{\mathrm{KL}}(p(X|Y = y) \| p(X))].</math>@@@@1@10@@danf@17-8-2009
10410980@unknown@formal@none@1@S@In other words, this is a measure of how much, on the average, the probability distribution on ''X'' will change if we are given the value of ''Y''.@@@@1@28@@danf@17-8-2009
10410990@unknown@formal@none@1@S@This is often recalculated as the divergence from the product of the marginal distributions to the actual joint distribution:@@@@1@19@@danf@17-8-2009
10411000@unknown@formal@none@1@S@:<math>I(X; Y) = D_{\mathrm{KL}}(p(X, Y) \| p(X)\, p(Y)).</math>@@@@1@7@@danf@17-8-2009
10411010@unknown@formal@none@1@S@Mutual information is closely related to the [[likelihood-ratio test|log-likelihood ratio test]] in the context of contingency tables and the [[multinomial distribution]] and to [[Pearson's chi-square test|Pearson's χ2 test]]: mutual information can be considered a statistic for assessing independence between a pair of variables, and has a well-specified asymptotic distribution.@@@@1@49@@danf@17-8-2009
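Continuing the same sketch, the mutual information of the invented joint distribution can be computed as the expected pointwise mutual information and checked against the identity <math>I(X; Y) = H(X) + H(Y) - H(X, Y)</math>:

<source lang="python">
p_x = {}
for (x, y), p in joint.items():
    p_x[x] = p_x.get(x, 0.0) + p  # marginal distribution of X

# I(X; Y) as the expectation of the pointwise mutual information:
mutual_information = sum(
    p * math.log2(p / (p_x[x] * p_y[y])) for (x, y), p in joint.items()
)

h_x = entropy(p_x.values())
print(mutual_information, h_x + h_y - h_xy)  # both print the same value
</source>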
10411020@unknown@formal@none@1@S@===Kullback–Leibler divergence (information gain)===@@@@1@4@@danf@17-8-2009
10411030@unknown@formal@none@1@S@The '''[[Kullback–Leibler divergence]]''' (or '''information divergence''', '''information gain''', or '''relative entropy''') is a way of comparing two distributions: a "true" [[probability distribution]] ''p(X)'', and an arbitrary probability distribution ''q(X)''.@@@@1@29@@danf@17-8-2009
10411040@unknown@formal@none@1@S@If we compress data in a manner that assumes ''q(X)'' is the distribution underlying some data, when, in reality, ''p(X)'' is the correct distribution, the Kullback–Leibler divergence is the number of average additional bits per datum necessary for compression.@@@@1@39@@danf@17-8-2009
10411050@unknown@formal@none@1@S@It is thus defined:@@@@1@4@@danf@17-8-2009
10411060@unknown@formal@none@1@S@:<math>D_{\mathrm{KL}}(p(X) \| q(X)) = \sum_{x \in \mathbb{X}} p(x) \log \frac{p(x)}{q(x)}.</math>@@@@1@24@@danf@17-8-2009
10411070@unknown@formal@none@1@S@Although it is sometimes used as a 'distance metric', it is not a true [[Metric (mathematics)|metric]] since it is not symmetric and does not satisfy the [[triangle inequality]] (making it a semi-quasimetric).@@@@1@32@@danf@17-8-2009
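A short, self-contained Python sketch of the definition, using two invented distributions over the same support; it also demonstrates the asymmetry just mentioned.

<source lang="python">
import math

def kl_divergence(p, q, base=2):
    """Kullback-Leibler divergence D(p || q) for aligned probability lists."""
    return sum(pi * math.log(pi / qi, base) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.4, 0.1]
q = [0.4, 0.4, 0.2]
print(kl_divergence(p, q))  # ≈ 0.061 bits
print(kl_divergence(q, p))  # ≈ 0.071 bits: not equal, hence not a true metric
</source>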
10411080@unknown@formal@none@1@S@===Other quantities===@@@@1@2@@danf@17-8-2009
10411090@unknown@formal@none@1@S@Other important information theoretic quantities include [[Rényi entropy]] (a generalization of entropy) and [[differential entropy]] (a generalization of quantities of information to continuous distributions.)@@@@1@24@@danf@17-8-2009
10411100@unknown@formal@none@1@S@==Coding theory==@@@@1@2@@danf@17-8-2009
10411110@unknown@formal@none@1@S@[[Coding theory]] is one of the most important and direct applications of information theory.@@@@1@14@@danf@17-8-2009
10411120@unknown@formal@none@1@S@It can be subdivided into [[data compression|source coding]] theory and [[error correction|channel coding]] theory.@@@@1@14@@danf@17-8-2009
10411130@unknown@formal@none@1@S@Using a statistical description for data, information theory quantifies the number of bits needed to describe the data, which is the information entropy of the source.@@@@1@26@@danf@17-8-2009
10411140@unknown@formal@none@1@S@* Data compression (source coding): There are two formulations for the compression problem:@@@@1@13@@danf@17-8-2009
10411150@unknown@formal@none@1@S@#[[lossless data compression]]: the data must be reconstructed exactly;@@@@1@9@@danf@17-8-2009
10411160@unknown@formal@none@1@S@#[[lossy data compression]]: allocates bits needed to reconstruct the data, within a specified fidelity level measured by a distortion function.@@@@1@20@@danf@17-8-2009
10411170@unknown@formal@none@1@S@This subset of Information theory is called [[rate–distortion theory]].@@@@1@9@@danf@17-8-2009
10411180@unknown@formal@none@1@S@* Error-correcting codes (channel coding): While data compression removes as much [[redundancy (information theory)|redundancy]] as possible, an error correcting code adds just the right kind of redundancy (i.e. [[error correction]]) needed to transmit the data efficiently and faithfully across a noisy channel.@@@@1@42@@danf@17-8-2009
10411190@unknown@formal@none@1@S@This division of coding theory into compression and transmission is justified by the information transmission theorems, or source–channel separation theorems that justify the use of bits as the universal currency for information in many contexts.@@@@1@35@@danf@17-8-2009
10411200@unknown@formal@none@1@S@However, these theorems only hold in the situation where one transmitting user wishes to communicate to one receiving user.@@@@1@19@@danf@17-8-2009
10411210@unknown@formal@none@1@S@In scenarios with more than one transmitter (the multiple-access channel), more than one receiver (the [[broadcast channel]]) or intermediary "helpers" (the [[relay channel]]), or more general [[computer network|networks]], compression followed by transmission may no longer be optimal.@@@@1@37@@danf@17-8-2009
10411220@unknown@formal@none@1@S@[[Network information theory]] refers to these multi-agent communication models.@@@@1@9@@danf@17-8-2009
10411230@unknown@formal@none@1@S@===Source theory===@@@@1@2@@danf@17-8-2009
10411240@unknown@formal@none@1@S@Any process that generates successive messages can be considered a '''[[Communication source|source]]''' of information.@@@@1@14@@danf@17-8-2009
10411250@unknown@formal@none@1@S@A memoryless source is one in which each message is an [[Independent identically-distributed random variables|independent identically-distributed random variable]], whereas the properties of [[ergodic theory|ergodicity]] and [[stationary process|stationarity]] impose more general constraints.@@@@1@31@@danf@17-8-2009
10411260@unknown@formal@none@1@S@All such sources are [[stochastic process|stochastic]].@@@@1@6@@danf@17-8-2009
10411270@unknown@formal@none@1@S@These terms are well studied in their own right outside information theory.@@@@1@12@@danf@17-8-2009
10411280@unknown@formal@none@1@S@====Rate====@@@@1@1@@danf@17-8-2009
10411290@unknown@formal@none@1@S@Information [[Entropy rate|'''rate''']] is the average entropy per symbol.@@@@1@9@@danf@17-8-2009
10411300@unknown@formal@none@1@S@For memoryless sources, this is merely the entropy of each symbol, while, in the case of a stationary stochastic process, it is@@@@1@22@@danf@17-8-2009
10411310@unknown@formal@none@1@S@:<math>r = \lim_{n \to \infty} H(X_n | X_{n-1}, X_{n-2}, X_{n-3}, \ldots);</math>@@@@1@7@@danf@17-8-2009
10411320@unknown@formal@none@1@S@that is, the conditional entropy of a symbol given all the previous symbols generated.@@@@1@14@@danf@17-8-2009
10411330@unknown@formal@none@1@S@For the more general case of a process that is not necessarily stationary, the ''average rate'' is@@@@1@17@@danf@17-8-2009
10411340@unknown@formal@none@1@S@:<math>r = \lim_{n \to \infty} \frac{1}{n} H(X_1, X_2, \ldots, X_n);</math>@@@@1@10@@danf@17-8-2009
10411350@unknown@formal@none@1@S@that is, the limit of the joint entropy per symbol.@@@@1@10@@danf@17-8-2009
10411360@unknown@formal@none@1@S@For stationary sources, these two expressions give the same result.@@@@1@10@@danf@17-8-2009
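As an illustration, a minimal Python sketch computes the entropy rate of a two-state Markov source, a standard example of a stationary source with memory; the transition matrix is invented. The rate is the entropy of the next symbol given the current state, averaged over the stationary distribution.

<source lang="python">
import math

def binary_entropy(p):
    """H_b(p) in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

# Two-state Markov source: row i gives Pr(next state | current state i).
transitions = [[0.9, 0.1],
               [0.4, 0.6]]

# The stationary distribution of a two-state chain has a closed form:
p01, p10 = transitions[0][1], transitions[1][0]
pi = [p10 / (p01 + p10), p01 / (p01 + p10)]

# Entropy rate: average over states of the entropy of the outgoing transition.
rate = sum(pi[i] * binary_entropy(transitions[i][0]) for i in range(2))
print(rate)  # ≈ 0.569 bits per symbol
</source>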
10411370@unknown@formal@none@1@S@It is common in information theory to speak of the "rate" or "entropy" of a language.@@@@1@16@@danf@17-8-2009
10411380@unknown@formal@none@1@S@This is appropriate, for example, when the source of information is English prose.@@@@1@13@@danf@17-8-2009
10411390@unknown@formal@none@1@S@The rate of a source of information is related to its [[redundancy (information theory)|redundancy]] and how well it can be [[data compression|compressed]], the subject of '''source coding'''.@@@@1@27@@danf@17-8-2009
10411400@unknown@formal@none@1@S@===Channel capacity===@@@@1@2@@danf@17-8-2009
10411410@unknown@formal@none@1@S@Communication over a channel—such as an [[ethernet]] wire—is the primary motivation of information theory.@@@@1@14@@danf@17-8-2009
10411420@unknown@formal@none@1@S@As anyone who has ever used a telephone (mobile or landline) knows, however, such channels often fail to produce exact reconstruction of a signal; noise, periods of silence, and other forms of signal corruption often degrade quality.@@@@1@36@@danf@17-8-2009
10411430@unknown@formal@none@1@S@How much information can one hope to communicate over a noisy (or otherwise imperfect) channel?@@@@1@15@@danf@17-8-2009
10411440@unknown@formal@none@1@S@Consider the communications process over a discrete channel.@@@@1@8@@danf@17-8-2009
10411450@unknown@formal@none@1@S@A simple model of the process is a transmitter sending messages through a noisy channel to a receiver.@@@@1@9@@danf@17-8-2009
10411460@unknown@formal@none@1@S@Here ''X'' represents the space of messages transmitted, and ''Y'' the space of messages received during a unit time over our channel.@@@@1@22@@danf@17-8-2009
10411470@unknown@formal@none@1@S@Let <math>p(y|x)</math> be the [[conditional probability]] distribution function of ''Y'' given ''X''.@@@@1@12@@danf@17-8-2009
10411480@unknown@formal@none@1@S@We will consider <math>p(y|x)</math> to be an inherent fixed property of our communications channel (representing the nature of the '''[[Signal noise|noise]]''' of our channel).@@@@1@24@@danf@17-8-2009
10411490@unknown@formal@none@1@S@Then the joint distribution of ''X'' and ''Y'' is completely determined by our channel and by our choice of <math>f(x)</math>, the marginal distribution of messages we choose to send over the channel.@@@@1@32@@danf@17-8-2009
10411500@unknown@formal@none@1@S@Under these constraints, we would like to maximize the rate of information, or the '''[[Signal (electrical engineering)|signal]]''', we can communicate over the channel.@@@@1@23@@danf@17-8-2009
10411510@unknown@formal@none@1@S@The appropriate measure for this is the [[mutual information]], and this maximum mutual information is called the '''[[channel capacity]]''' and is given by:@@@@1@23@@danf@17-8-2009
10411520@unknown@formal@none@1@S@:<math>C = \max_{f} I(X; Y).</math>@@@@1@6@@danf@17-8-2009
10411530@unknown@formal@none@1@S@This capacity has the following property related to communicating at information rate ''R'' (where ''R'' is usually bits per symbol).@@@@1@20@@danf@17-8-2009
10411540@unknown@formal@none@1@S@For any information rate ''R < C'' and coding error ε > 0, for large enough ''N'', there exists a code of length ''N'' and rate ≥ R and a decoding algorithm, such that the maximal probability of block error is ≤ ε; that is, it is always possible to transmit with arbitrarily small block error.@@@@1@56@@danf@17-8-2009
10411550@unknown@formal@none@1@S@In addition, for any rate ''R > C'', it is impossible to transmit with arbitrarily small block error.@@@@1@18@@danf@17-8-2009
10411560@unknown@formal@none@1@S@'''[[Channel code|Channel coding]]''' is concerned with finding such nearly optimal [[error detection and correction|codes]] that can be used to transmit data over a noisy channel with a small coding error at a rate near the channel capacity.@@@@1@37@@danf@17-8-2009
10411570@unknown@formal@none@1@S@====Channel capacity of particular model channels====@@@@1@6@@danf@17-8-2009
10411580@unknown@formal@none@1@S@* A continuous-time analog communications channel subject to Gaussian noise — see [[Shannon–Hartley theorem]].@@@@1@14@@danf@17-8-2009
10411590@unknown@formal@none@1@S@* A [[binary symmetric channel]] (BSC) with crossover probability ''p'' is a binary input, binary output channel that flips the input bit with probability ''p''.@@@@1@26@@danf@17-8-2009
10411600@unknown@formal@none@1@S@The BSC has a capacity of <math>1 - H_\mathrm{b}(p)</math> bits per channel use, where <math>H_\mathrm{b}</math> is the [[binary entropy function]]:@@@@1@20@@danf@17-8-2009
10411610@unknown@formal@none@1@S@::<math>H_\mathrm{b}(p) = -p \log_2 p - (1 - p) \log_2 (1 - p).</math>@@@@1@1@@danf@17-8-2009
10411620@unknown@formal@none@1@S@* A binary erasure channel (BEC) with erasure probability ''p'' is a binary input, ternary output channel.@@@@1@19@@danf@17-8-2009
10411630@unknown@formal@none@1@S@The possible channel outputs are ''0'', ''1'', and a third symbol 'e' called an erasure.@@@@1@15@@danf@17-8-2009
10411640@unknown@formal@none@1@S@The erasure represents complete loss of information about an input bit.@@@@1@11@@danf@17-8-2009
10411650@unknown@formal@none@1@S@The capacity of the BEC is ''1 - p'' bits per channel use.@@@@1@13@@danf@17-8-2009
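A quick numeric check of these two capacity formulas in Python; the crossover and erasure probabilities are arbitrary example values.

<source lang="python">
import math

def binary_entropy(p):
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity of a binary symmetric channel with crossover probability p."""
    return 1.0 - binary_entropy(p)

def bec_capacity(p):
    """Capacity of a binary erasure channel with erasure probability p."""
    return 1.0 - p

print(bsc_capacity(0.11))  # ≈ 0.50 bits per channel use
print(bec_capacity(0.25))  # 0.75 bits per channel use
</source>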
10411670@unknown@formal@none@1@S@==Applications to other fields==@@@@1@4@@danf@17-8-2009
10411680@unknown@formal@none@1@S@===Intelligence uses and secrecy applications===@@@@1@5@@danf@17-8-2009
10411690@unknown@formal@none@1@S@Information theoretic concepts apply to [[cryptography]] and [[cryptanalysis]].@@@@1@8@@danf@17-8-2009
10411700@unknown@formal@none@1@S@[[Turing]]'s information unit, the [[Ban (information)|ban]], was used in the [[Ultra]] project, breaking the German [[Enigma machine]] code and hastening the [[Victory in Europe Day|end of WWII in Europe]].@@@@1@29@@danf@17-8-2009
10411710@unknown@formal@none@1@S@Shannon himself defined an important concept now called the [[unicity distance]].@@@@1@11@@danf@17-8-2009
10411720@unknown@formal@none@1@S@Based on the [[redundancy (information theory)|redundancy]] of the [[plaintext]], it attempts to give a minimum amount of [[ciphertext]] necessary to ensure unique decipherability.@@@@1@23@@danf@17-8-2009
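As a rough numeric illustration (the formula itself is not spelled out in this text): Shannon's estimate of the unicity distance is the entropy of the key divided by the per-character redundancy of the plaintext. The English redundancy figure below is a commonly cited approximation.

<source lang="python">
import math

def unicity_distance(key_bits, redundancy_per_char):
    """Shannon's estimate of the ciphertext length needed for unique decipherability."""
    return key_bits / redundancy_per_char

# English: log2(26) ≈ 4.70 bits/char of raw capacity, ≈ 1.5 bits/char of actual
# entropy, leaving ≈ 3.2 bits/char of redundancy (commonly cited estimates).
redundancy = math.log2(26) - 1.5
print(unicity_distance(128, redundancy))  # ≈ 40 characters for a 128-bit key
</source>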
10411730@unknown@formal@none@1@S@Information theory leads us to believe it is much more difficult to keep secrets than it might first appear.@@@@1@19@@danf@17-8-2009
10411740@unknown@formal@none@1@S@A [[brute force attack]] can break systems based on [[public-key cryptography|asymmetric key algorithms]] or on most commonly used methods of [[symmetric-key algorithm|symmetric key algorithms]] (sometimes called secret key algorithms), such as [[block cipher]]s.@@@@1@33@@danf@17-8-2009
10411750@unknown@formal@none@1@S@The security of all such methods currently comes from the assumption that no known attack can break them in a practical amount of time.@@@@1@24@@danf@17-8-2009
10411760@unknown@formal@none@1@S@[[Information theoretic security]] refers to methods such as the [[one-time pad]] that are not vulnerable to such brute force attacks.@@@@1@20@@danf@17-8-2009
10411770@unknown@formal@none@1@S@In such cases, the positive conditional [[mutual information]] between the [[plaintext]] and [[ciphertext]] (conditioned on the [[key (cryptography)| key]]) can ensure proper transmission, while the unconditional mutual information between the plaintext and ciphertext remains zero, resulting in absolutely secure communications.@@@@1@40@@danf@17-8-2009
10411780@unknown@formal@none@1@S@In other words, an eavesdropper would not be able to improve his or her guess of the plaintext by gaining knowledge of the ciphertext but not of the key.@@@@1@29@@danf@17-8-2009
10411790@unknown@formal@none@1@S@However, as in any other cryptographic system, care must be used to correctly apply even information-theoretically secure methods; the [[Venona project]] was able to crack the one-time pads of the [[Soviet Union]] due to their improper reuse of key material.@@@@1@40@@danf@17-8-2009
10411800@unknown@formal@none@1@S@===Pseudorandom number generation===@@@@1@3@@danf@17-8-2009
10411810@unknown@formal@none@1@S@[[Pseudorandom number generator]]s are widely available in computer language libraries and application programs.@@@@1@13@@danf@17-8-2009
10411820@unknown@formal@none@1@S@They are, almost universally, unsuited to cryptographic use as they do not evade the deterministic nature of modern computer equipment and software.@@@@1@22@@danf@17-8-2009
10411830@unknown@formal@none@1@S@A class of improved random number generators is termed [[Cryptographically secure pseudorandom number generator]]s, but even they require [[random seed]]s external to the software to work as intended.@@@@1@28@@danf@17-8-2009
10411840@unknown@formal@none@1@S@These can be obtained via [[extractor]]s, if done carefully.@@@@1@9@@danf@17-8-2009
10411850@unknown@formal@none@1@S@The measure of sufficient randomness in extractors is [[min-entropy]], a value related to Shannon entropy through [[Rényi entropy]]; Rényi entropy is also used in evaluating randomness in cryptographic systems.@@@@1@29@@danf@17-8-2009
10411860@unknown@formal@none@1@S@Although related, the distinctions among these measures mean that a [[random variable]] with high Shannon entropy is not necessarily satisfactory for use in an extractor and so for cryptography uses.@@@@1@30@@danf@17-8-2009
10411870@unknown@formal@none@1@S@===Miscellaneous applications===@@@@1@2@@danf@17-8-2009
10411880@unknown@formal@none@1@S@Information theory also has applications in [[Gambling and information theory|gambling and investing]], [[black hole information paradox|black holes]], [[bioinformatics]], and [[music]].@@@@1@20@@danf@17-8-2009
10420010@unknown@formal@none@1@S@Italian language@@@@1@2@@danf@17-8-2009
10420020@unknown@formal@none@1@S@'''Italian''' (''italiano'', or ''lingua italiana'') is a [[Romance languages|Romance language]] spoken as a [[first language]] by about 63 million people, primarily in [[Italy]].@@@@1@23@@danf@17-8-2009
10420030@unknown@formal@none@1@S@In [[Switzerland]], Italian is one of four [[Linguistic geography of Switzerland|official language]]s.@@@@1@12@@danf@17-8-2009
10420040@unknown@formal@none@1@S@It is also the official language of [[San Marino]].@@@@1@9@@danf@17-8-2009
10420050@unknown@formal@none@1@S@It is the primary language of the [[Vatican City]].@@@@1@9@@danf@17-8-2009
10420060@unknown@formal@none@1@S@Standard Italian, adopted by the state after the [[unification of Italy]], is based on [[Tuscan dialect|Tuscan]] and is somewhat intermediate between [[Italo-Western|Italo-Dalmatian languages]] of the [[Mezzogiorno|South]] and [[Northern Italian dialects]] of the [[Northern Italy|North]].@@@@1@34@@danf@17-8-2009
10420070@unknown@formal@none@1@S@Unlike most other Romance languages, Italian has retained the contrast between short and [[consonant length|long consonants]] which existed in Latin.@@@@1@20@@danf@17-8-2009
10420080@unknown@formal@none@1@S@As in most [[Romance languages]], [[stress (linguistics)|stress]] is distinctive.@@@@1@9@@danf@17-8-2009
10420090@unknown@formal@none@1@S@Of the Romance languages, Italian is considered to be one of the closest resembling [[Latin]] in terms of [[vocabulary]].@@@@1@19@@danf@17-8-2009
10420100@unknown@formal@none@1@S@According to Ethnologue, lexical similarity is 89% with [[French language|French]], 87% with [[Catalan language|Catalan]], 85% with [[Sardinian language|Sardinian]], 82% with [[Spanish language|Spanish]], 78% with Rhaeto-Romance, and 77% with Romanian.@@@@1@29@@danf@17-8-2009
10420110@unknown@formal@none@1@S@It is affectionately called ''il parlar gentile'' (the gentle language) by its speakers.@@@@1@13@@danf@17-8-2009
10420120@unknown@formal@none@1@S@==Writing system==@@@@1@2@@danf@17-8-2009
10420130@unknown@formal@none@1@S@Italian is written using the [[Latin alphabet]].@@@@1@7@@danf@17-8-2009
10420140@unknown@formal@none@1@S@The letters ''J'', ''K'', ''W'', ''X'' and ''Y'' are not considered part of the standard [[Italian alphabet]], but appear in loanwords (such as ''jeans'', ''whisky'', ''taxi'').@@@@1@26@@danf@17-8-2009
10420150@unknown@formal@none@1@S@''X'' has become a commonly used letter in genuine Italian words with the prefix ''extra-''.@@@@1@15@@danf@17-8-2009
10420160@unknown@formal@none@1@S@''J'' in Italian is an old-fashioned orthographic variant of ''I'', appearing in the first name "Jacopo" as well as in some Italian place names, e.g., the towns of [[Bajardo]], [[Bojano]], [[Joppolo]], [[Jesolo]], [[Jesi]], among numerous others, and in the alternate spelling ''Mar Jonio'' (also spelled ''Mar Ionio'') for the [[Ionian Sea]].@@@@1@51@@danf@17-8-2009
10420170@unknown@formal@none@1@S@''J'' may also appear in many words from different dialects, but its use is discouraged in contemporary Italian, and it is not part of the standard 21-letter contemporary Italian alphabet.@@@@1@30@@danf@17-8-2009
10420180@unknown@formal@none@1@S@Each of these foreign letters had an Italian equivalent spelling: ''gi'' for ''j'', ''c'' or ''ch'' for ''k'', ''u'' or ''v'' for ''w'' (depending on what sound it makes), ''s'', ''ss'', or ''cs'' for ''x'', and ''i'' for ''y''.@@@@1@39@@danf@17-8-2009
10420190@unknown@formal@none@1@S@* Italian uses the [[acute accent]] over the letter ''E'' (as in ''perché'', why/because) to indicate a front mid-close vowel, and the [[grave accent]] (as in ''tè'', tea) to indicate a front mid-open vowel.@@@@1@34@@danf@17-8-2009
10420200@unknown@formal@none@1@S@The [[grave accent]] is also used on letters ''A'', ''I'', ''O'', and ''U'' to mark [[stress (linguistics)|stress]] when it falls on the final vowel of a word (for instance ''gioventù'', youth).@@@@1@31@@danf@17-8-2009
10420210@unknown@formal@none@1@S@Typically, the penultimate syllable is stressed.@@@@1@6@@danf@17-8-2009
10420220@unknown@formal@none@1@S@If syllables other than the last one are stressed, the accent is not mandatory, unlike in [[Spanish language|Spanish]], and, in virtually all cases, it is omitted.@@@@1@26@@danf@17-8-2009
10420230@unknown@formal@none@1@S@In some cases, when the word is ambiguous (as ''principi''), the accent mark is sometimes used in order to disambiguate its meaning (in this case, ''prìncipi'', princes, or ''princìpi'', principles).@@@@1@30@@danf@17-8-2009
10420240@unknown@formal@none@1@S@This is, however, not compulsory.@@@@1@5@@danf@17-8-2009
10420250@unknown@formal@none@1@S@Rare words with three or more syllables can confuse Italians themselves, and the pronunciation of [[Istanbul]] is a common example of a word in which placement of stress is not clearly established.@@@@1@32@@danf@17-8-2009
10420260@unknown@formal@none@1@S@Turkish, like French, tends to put the accent on the ultimate syllable, but Italian does not.@@@@1@14@@danf@17-8-2009
10420270@unknown@formal@none@1@S@One can therefore hear both "Istànbul" and "Ìstanbul".@@@@1@7@@danf@17-8-2009
10420280@unknown@formal@none@1@S@Another instance is the American state of [[Florida]]: the correct way to pronounce it in Italian is as in Spanish, "Florìda", but because there is an Italian word with the same spelling, "flòrida" (meaning "flourishing"), and because of the influence of English, most Italians pronounce it that way.@@@@1@46@@danf@17-8-2009
10420290@unknown@formal@none@1@S@Dictionaries give the latter as an alternative pronunciation.@@@@1@8@@danf@17-8-2009
10420300@unknown@formal@none@1@S@* The letter ''H'' at the beginning of a word is used to distinguish ''ho'', ''hai'', ''ha'', ''hanno'' (present indicative of ''avere'', 'to have') from ''o'' ('or'), ''ai'' ('to the'), ''a'' ('to'), ''anno'' ('year').@@@@1@34@@danf@17-8-2009
10420310@unknown@formal@none@1@S@In the spoken language this letter is always silent for the cases given above.@@@@1@14@@danf@17-8-2009
10420320@unknown@formal@none@1@S@''H'' is also used in combinations with other letters (see below), but no [[phoneme]] {{IPA|[h]}} exists in Italian.@@@@1@18@@danf@17-8-2009
10420330@unknown@formal@none@1@S@In foreign words that have entered common use, like "hotel" or "hovercraft", the ''H'' is commonly silent, so they are pronounced as {{IPA|/oˈtɛl/}} and {{IPA|/ˈɔverkraft/}}.@@@@1@24@@danf@17-8-2009
10420340@unknown@formal@none@1@S@* The letter ''Z'' represents {{IPA|/ʣ/}}, for example: ''Zanzara'' {{IPA|/dzan'dzaɾa/}} (mosquito), or {{IPA|/ʦ/}}, for example: ''Nazione'' {{IPA|/naˈttsjone/}} (nation), depending on context, though there are few [[minimal pair]]s.@@@@1@27@@danf@17-8-2009
10420350@unknown@formal@none@1@S@The same goes for ''S'', which can represent {{IPA|/s/}} or {{IPA|/z/}}.@@@@1@11@@danf@17-8-2009
10420360@unknown@formal@none@1@S@However, these two phonemes are in [[complementary distribution]] everywhere except between two vowels in the same word, and even in such environment there are extremely few minimal pairs, so that this distinction is being lost in many varieties.@@@@1@38@@danf@17-8-2009
10420370@unknown@formal@none@1@S@* The letters ''C'' and ''G'' represent [[affricate]]s: [[Voiceless postalveolar affricate|{{IPA|/ʧ/}}]] as in "chair" and [[Voiced postalveolar affricate|{{IPA|/ʤ/}}]] as in "gem", respectively, before the [[front vowel]]s ''I'' and ''E''.@@@@1@29@@danf@17-8-2009
10420380@unknown@formal@none@1@S@They are pronounced as [[plosive]]s {{IPA|/k/}}, {{IPA|/g/}} (as in "call" and "gall") otherwise.@@@@1@13@@danf@17-8-2009
10420390@unknown@formal@none@1@S@Front/back vowel rules for ''C'' and ''G'' are similar in [[French language|French]], [[Romanian language|Romanian]], [[Spanish language|Spanish]], and to some extent [[English language|English]] (including [[Old English]]).@@@@1@25@@danf@17-8-2009
10420400@unknown@formal@none@1@S@[[swedish language|Swedish]] and [[Norwegian language|Norwegian]] have similar rules for ''K'' and ''G''.@@@@1@12@@danf@17-8-2009
10420410@unknown@formal@none@1@S@(See also [[palatalization]].)@@@@1@3@@danf@17-8-2009
10420420@unknown@formal@none@1@S@* However, an ''H'' can be added between ''C'' or ''G'' and ''E'' or ''I'' to represent a plosive, and an ''I'' can be added between ''C'' or ''G'' and ''A'', ''O'' or ''U'' to signal that the consonant is an affricate.@@@@1@42@@danf@17-8-2009
10420430@unknown@formal@none@1@S@For example, ''che'' and ''chi'' are pronounced {{IPA|/ke/}} and {{IPA|/ki/}} (as in ''perché'' and ''chilo''), whereas ''ce'' and ''ci'' are pronounced {{IPA|/ʧe/}} and {{IPA|/ʧi/}} (as in ''cena'' and ''Cina''); likewise ''ghe'' and ''ghi'' are pronounced {{IPA|/ge/}} and {{IPA|/gi/}} (as in ''spaghetti'' and ''laghi''), while ''gia'' and ''gio'' are pronounced {{IPA|/ʤa/}} and {{IPA|/ʤo/}} (as in ''giallo'' and ''giorno'').@@@@1@58@@danf@17-8-2009
10420440@unknown@formal@none@1@S@:Note that the ''H'' is [[silent letter|silent]] in the digraphs ''[[ch (digraph)|CH]]'' and ''[[gh (digraph)|GH]]'', as also the ''I'' in ''cia'', ''cio'', ''ciu'' and even ''cie'' is not pronounced as a separate vowel, unless it carries the primary stress.@@@@1@39@@danf@17-8-2009
10420450@unknown@formal@none@1@S@For example, it is silent in ''[[ciao]]'' {{IPA|/ˈʧa.o/}} and cielo {{IPA|/ˈʧɛ.lo/}}, but it is pronounced in ''farmacia'' {{IPA|/ˌfaɾ.ma.ˈʧi.a/}} and ''farmacie'' {{IPA|/ˌfaɾ.ma.ˈʧi.e/}}.@@@@1@21@@danf@17-8-2009
10420460@unknown@formal@none@1@S@* There are three other special [[digraph (orthography)|digraphs]] in Italian: ''[[gn (digraph)|GN]]'', ''GL'' and ''SC''.@@@@1@15@@danf@17-8-2009
10420470@unknown@formal@none@1@S@''GN'' represents [[Palatal nasal|{{IPA|/ɲ/}}]].@@@@1@4@@danf@17-8-2009
10420480@unknown@formal@none@1@S@''GL'' represents [[Palatal lateral approximant|{{IPA|/ʎ/}}]] only before ''i'', and never at the beginning of a word, except in the [[personal pronoun]] and [[definite article]] ''gli''.@@@@1@25@@danf@17-8-2009
10420490@unknown@formal@none@1@S@(Compare with [[Spanish language|Spanish]] ''ñ'' and ''ll'', [[Portuguese language|Portuguese]] ''nh'' and ''lh''.)@@@@1@12@@danf@17-8-2009
10420500@unknown@formal@none@1@S@''SC'' represents fricative [[Voiceless postalveolar fricative|{{IPA|/ʃ/}}]] before ''i'' or ''e''.@@@@1@10@@danf@17-8-2009
10420510@unknown@formal@none@1@S@Except in the speech of some Northern Italians, all of these are normally [[geminate]] between vowels.@@@@1@16@@danf@17-8-2009
10420520@unknown@formal@none@1@S@* In general, all letters or digraphs represent phonemes rather clearly, and, in standard varieties of Italian, there is little allophonic variation.@@@@1@22@@danf@17-8-2009
10420530@unknown@formal@none@1@S@The most notable exceptions are assimilation of /n/ in point of articulation before consonants, assimilatory voicing of /s/ to following voiced consonants, and vowel length (vowels are long in stressed open syllables, and short elsewhere) — compare with the enormous number of [[allophone]]s of the English phoneme /t/.@@@@1@48@@danf@17-8-2009
10420540@unknown@formal@none@1@S@Spelling is clearly phonemic and difficult to mistake given a clear pronunciation.@@@@1@12@@danf@17-8-2009
10420550@unknown@formal@none@1@S@Exceptions are generally only found in foreign borrowings.@@@@1@8@@danf@17-8-2009
10420560@unknown@formal@none@1@S@There are fewer cases of [[dyslexia]] among Italian speakers than among speakers of languages such as English, and the concept of a spelling bee is unfamiliar to Italians.@@@@1@26@@danf@17-8-2009
10420570@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009
10420580@unknown@formal@none@1@S@The history of the Italian language is long, but the modern standard of the language was largely shaped by relatively recent events.@@@@1@22@@danf@17-8-2009
10420590@unknown@formal@none@1@S@The earliest surviving texts which can definitely be called Italian (or more accurately, vernacular, as opposed to its predecessor [[Vulgar Latin]]) are legal formulae from the region of [[province of Benevento|Benevento]] dating from 960-963.@@@@1@34@@danf@17-8-2009
10420600@unknown@formal@none@1@S@What would come to be thought of as Italian was first formalized in the first years of the 14th century through the works of [[Dante Alighieri]], who mixed southern Italian languages, especially [[Sicilian language|Sicilian]], with his native Tuscan in his epic poems known collectively as the ''[[Divine Comedy|Commedia]],'' to which [[Giovanni Boccaccio]] later affixed the title ''Divina''.@@@@1@57@@danf@17-8-2009
10420610@unknown@formal@none@1@S@Dante's much-loved works were read throughout Italy and his written dialect became the "canonical standard" that all educated Italians could understand.@@@@1@21@@danf@17-8-2009
10420620@unknown@formal@none@1@S@Dante is still credited with standardizing the Italian language and, thus, the dialect of [[Tuscany]] became the basis for what would become the official language of Italy.@@@@1@27@@danf@17-8-2009
10420630@unknown@formal@none@1@S@Italy has always had a distinctive dialect for each city since the cities were until recently thought of as [[city-state]]s.@@@@1@20@@danf@17-8-2009
10420640@unknown@formal@none@1@S@These city dialects now have considerable [[variety (linguistics)|variety]], however.@@@@1@8@@danf@17-8-2009
10420650@unknown@formal@none@1@S@As Tuscan-derived Italian came to be used throughout the nation, features of local speech were naturally adopted, producing various versions of Regional Italian.@@@@1@23@@danf@17-8-2009
10420660@unknown@formal@none@1@S@The most characteristic differences, for instance, between [[Romanesco|Roman Italian]] and [[Milanese|Milanese Italian]] are the [[consonant length|gemination]] of initial consonants and the pronunciation of stressed "e", and of "s" in some cases (e.g. ''va bene'' "all right": is pronounced {{IPA|[va ˈbːɛne]}} by a Roman, {{IPA|[va ˈbene]}} by a Milanese; ''a casa'' "at home": Roman {{IPA|[a ˈkːasa]}}, Milanese {{IPA|[a ˈkaza]}}).@@@@1@58@@danf@17-8-2009
10420670@unknown@formal@none@1@S@In contrast to the [[Northern Italian language|dialects of northern Italy]], [[southern Italian]] dialects were largely untouched by the Franco-[[Occitan language|Occitan]] influences introduced to Italy, mainly by [[bard]]s from [[France]], during the [[Middle Ages]].@@@@1@33@@danf@17-8-2009
10420680@unknown@formal@none@1@S@Even in the case of Northern Italian dialects, however, scholars are careful not to overstate the effects of outsiders on the natural indigenous developments of the languages.@@@@1@27@@danf@17-8-2009
10420690@unknown@formal@none@1@S@(See [[La Spezia-Rimini Line]].)@@@@1@4@@danf@17-8-2009
10420700@unknown@formal@none@1@S@The economic might and relative advanced development of [[Tuscany]] at the time ([[Late Middle Ages]]), gave its dialect weight, though Venetian remained widespread in medieval Italian commercial life.@@@@1@28@@danf@17-8-2009
10420710@unknown@formal@none@1@S@Also, the increasing cultural relevance of [[Florence, Italy|Florence]] during the periods of '[[Humanism|Umanesimo (Humanism)]]' and the [[Renaissance|Rinascimento (Renaissance)]] made its ''volgare'' (dialect), or rather a refined version of it, a standard in the arts.@@@@1@34@@danf@17-8-2009
10420720@unknown@formal@none@1@S@The re-discovery of Dante's ''[[De vulgari eloquentia]]'' and a renewed interest in linguistics in the 16th century sparked a debate which raged throughout Italy concerning which criteria should be chosen to establish a modern Italian standard to be used as both a literary and a spoken language.@@@@1@48@@danf@17-8-2009
10420730@unknown@formal@none@1@S@Scholars were divided into three factions: the [[purism|purists]], headed by [[Pietro Bembo]], who in his ''[[Gli Asolani]]'' claimed that the language might only be based on the great literary classics (notably [[Petrarch]] and Boccaccio, but not Dante, as Bembo believed the Divine Comedy was not dignified enough, since it used elements from other dialects); [[Niccolò Machiavelli]] and other [[Florence|Florentine]]s, who preferred the version spoken by ordinary people in their own times; and the [[courtier]]s, like [[Baldassarre Castiglione]] and [[Gian Giorgio Trissino]], who insisted that each local vernacular must contribute to the new standard.@@@@1@94@@danf@17-8-2009
10420740@unknown@formal@none@1@S@Eventually Bembo's ideas prevailed, the result being the publication of the first Italian dictionary in 1612 and the foundation of the [[Accademia della Crusca]] in Florence (1582-3), the official legislative body of the Italian language.@@@@1@35@@danf@17-8-2009
10420750@unknown@formal@none@1@S@Italian literature's first modern novel, [[The Betrothed|''I Promessi Sposi'']] (The Betrothed), by [[Alessandro Manzoni]], further defined the standard by "rinsing" his Milanese "in the waters of the [[Arno River|Arno]]" ([[Florence]]'s river), as he states in the preface to his 1840 edition.@@@@1@41@@danf@17-8-2009
10420760@unknown@formal@none@1@S@After unification a huge number of civil servants and soldiers recruited from all over the country introduced many more words and idioms from their home dialects ("[[ciao]]" is [[Venetian language|Venetian]], "[[panettone]]" is [[Milanese]] etc.).@@@@1@34@@danf@17-8-2009
10420770@unknown@formal@none@1@S@==Classification==@@@@1@1@@danf@17-8-2009
10420780@unknown@formal@none@1@S@Italian is most closely related to the other two Italo-Dalmatian languages, [[Sicilian language|Sicilian]] and the extinct [[Dalmatian language|Dalmatian]].@@@@1@18@@danf@17-8-2009
10420790@unknown@formal@none@1@S@The three are part of the [[Italo-Western languages|Italo-Western]] grouping of the [[Romance languages]], which are a subgroup of the [[Italic languages|Italic]] branch of [[Indo-European language family|Indo-European]].@@@@1@26@@danf@17-8-2009
10420800@unknown@formal@none@1@S@==Geographic distribution==@@@@1@2@@danf@17-8-2009
10420810@unknown@formal@none@1@S@Italian has between 60 and 70 million native speakers.@@@@1@14@@danf@17-8-2009
10420820@unknown@formal@none@1@S@An estimated 110–120 million people use Italian as a second or cultural language.@@@@1@16@@danf@17-8-2009
10420830@unknown@formal@none@1@S@Italian is the official language of [[Italy]] and [[San Marino]], and one of the official languages of [[Switzerland]], spoken mainly in [[Canton Ticino|Ticino]] and [[Graubünden|Grigioni]] cantons, a region referred to as [[Italian Switzerland]].@@@@1@33@@danf@17-8-2009
10420840@unknown@formal@none@1@S@It is also the second official language in some areas of [[Istria]], in [[Slovenia]] and [[Croatia]], where an Italian minority exists.@@@@1@21@@danf@17-8-2009
10420850@unknown@formal@none@1@S@It is the primary language of the [[Vatican City]] and is widely used and taught in [[Monaco]] and [[Malta]].@@@@1@19@@danf@17-8-2009
10420860@unknown@formal@none@1@S@It is also widely understood in France, with over one million speakers (especially in [[Corsica]] and the [[County of Nice]], areas that historically spoke [[Italian dialects]] before annexation to [[France]]), and in [[Albania]].@@@@1@33@@danf@17-8-2009
10420870@unknown@formal@none@1@S@Italian is also spoken by some in former Italian colonies in [[Africa]] ([[Libya]], [[Somalia]] and [[Eritrea]]).@@@@1@16@@danf@17-8-2009
10420880@unknown@formal@none@1@S@However, its use has sharply dropped off since the colonial period.@@@@1@11@@danf@17-8-2009
10420890@unknown@formal@none@1@S@In [[Eritrea]], [[Italian Language|Italian]] is widely understood.@@@@1@8@@danf@17-8-2009
10420900@unknown@formal@none@1@S@In fact, for fifty years during the colonial period Italian was the language of instruction, but [[as of 1997]] only one Italian-language school remained, with 470 pupils.@@@@1@30@@danf@17-8-2009
10420910@unknown@formal@none@1@S@In [[Somalia]], Italian used to be a major language, but because of the civil war and the lack of education only the older generation still uses it.@@@@1@26@@danf@17-8-2009
10420920@unknown@formal@none@1@S@Italian and [[Italian dialects]] are widely used by Italian immigrants and many of their descendants (see ''[[Italians]]'') living throughout [[Western Europe]] (especially [[France]], [[Germany]], [[Belgium]], [[Switzerland]], the [[Britalian|United Kingdom]] and [[Luxembourg]]), the [[Italian Americans|United States]], [[Italian Canadians|Canada]], [[Italian Australians|Australia]], and [[Latin America]] (especially [[Uruguay]], [[Italian Brazilians|Brazil]], [[Argentina]], and [[Venezuela]]).@@@@1@49@@danf@17-8-2009
10420930@unknown@formal@none@1@S@In the United States, Italian speakers are most commonly found in four cities: [[Boston]] (7,000), [[Chicago]] (12,000), [[New York City]] (140,000), and [[Philadelphia]] (15,000).@@@@1@24@@danf@17-8-2009
10420940@unknown@formal@none@1@S@In Canada there are large Italian-speaking communities in [[Montreal]] (120,000) and [[Toronto]] (195,000).@@@@1@13@@danf@17-8-2009
10420950@unknown@formal@none@1@S@Italian is the second most commonly spoken language in Australia, where 353,605 [[Italian Australian]]s, or 1.9% of the population, reported speaking Italian at home in the 2001 [[Census in Australia|Census]].@@@@1@29@@danf@17-8-2009
10420960@unknown@formal@none@1@S@In 2001 there were 130,000 Italian speakers in [[Melbourne]], and 90,000 in [[Sydney]].@@@@1@13@@danf@17-8-2009
10420970@unknown@formal@none@1@S@===Italian language education===@@@@1@3@@danf@17-8-2009
10420980@unknown@formal@none@1@S@Italian is widely taught in many schools around the world, but rarely as pupils' first foreign language; in fact, Italian is generally the fourth or fifth most widely taught second language in the world.@@@@1@34@@danf@17-8-2009
10420990@unknown@formal@none@1@S@In [[anglophone]] parts of [[Canada]], Italian is, after [[French language|French]], the third most taught language.@@@@1@15@@danf@17-8-2009
10421000@unknown@formal@none@1@S@In [[francophone]] Canada it is third after [[English language|English]].@@@@1@9@@danf@17-8-2009
10421010@unknown@formal@none@1@S@In the [[United States]] and the [[United Kingdom]], Italian ranks fourth (after [[Spanish language|Spanish]], French, and [[German language|German]] in the former, and after French, German, and Spanish in the latter).@@@@1@18@@danf@17-8-2009
10421020@unknown@formal@none@1@S@Throughout the world, Italian is the fifth most taught non-native language, after [[English language|English]], French, Spanish, and German.@@@@1@18@@danf@17-8-2009
10421030@unknown@formal@none@1@S@In the [[European Union]], Italian is spoken as a mother tongue by 13% of the population (64 million, mainly in Italy itself) and as a second language by 3% (14 million); among EU member states, it is most likely to be desired (and therefore learned) as a second language in [[Malta]] (61%), [[Croatia]] (14%), [[Slovenia]] (12%), [[Austria]] (11%), [[Romania]] (8%), [[France]] (6%), and [[Greece]] (6%).@@@@1@65@@danf@17-8-2009
10421040@unknown@formal@none@1@S@It is also an important second language in [[Albania]] and [[Switzerland]], which are not EU members or candidates.@@@@1@18@@danf@17-8-2009
10421050@unknown@formal@none@1@S@===Influence and derived languages===@@@@1@4@@danf@17-8-2009
10421060@unknown@formal@none@1@S@From the late 19th to the mid 20th century, thousands of Italians settled in Argentina, Uruguay and southern Brazil, where they formed a very strong physical and cultural presence (see the [[Italian diaspora]]).@@@@1@33@@danf@17-8-2009
10421070@unknown@formal@none@1@S@In some cases, colonies were established where variants of [[Italian dialects]] were used, and some continue to use a derived dialect.@@@@1@21@@danf@17-8-2009
10421080@unknown@formal@none@1@S@Examples are [[Rio Grande do Sul]], [[Brazil]], where [[Talian]] is used, and the town of [[Chipilo]] near Puebla, [[Mexico]]; each continues to use a derived form of [[Venetian language|Venetian]] dating back to the 19th century.@@@@1@37@@danf@17-8-2009
10421090@unknown@formal@none@1@S@Other examples are [[Cocoliche]], an Italian-Spanish [[pidgin]] once spoken in [[Argentina]], especially in [[Buenos Aires]], and [[Lunfardo]].@@@@1@18@@danf@17-8-2009
10421100@unknown@formal@none@1@S@[[Rioplatense Spanish]], and particularly the speech of the city of Buenos Aires, has intonation patterns that resemble those of Italian dialects, because Argentina received a constant, large influx of Italian settlers from the second half of the nineteenth century onward, initially primarily from Northern Italy and then, from the beginning of the twentieth century, mostly from Southern Italy.@@@@1@60@@danf@17-8-2009
10421110@unknown@formal@none@1@S@===Lingua franca===@@@@1@2@@danf@17-8-2009
10421120@unknown@formal@none@1@S@Starting in late [[medieval]] times, Italian language variants (especially the Tuscan and Venetian variants) replaced Latin to become the primary commercial language of much of Europe and the Mediterranean.@@@@1@29@@danf@17-8-2009
10421130@unknown@formal@none@1@S@This became solidified during the [[Renaissance]] with the strength of Italian banking and the rise of [[Renaissance humanism|humanism]] in the arts.@@@@1@21@@danf@17-8-2009
10421140@unknown@formal@none@1@S@During the period of the Renaissance, Italy held artistic sway over the rest of Europe.@@@@1@15@@danf@17-8-2009
10421150@unknown@formal@none@1@S@All educated European gentlemen were expected to make the [[Grand Tour]], visiting Italy to see its great historical monuments and works of art.@@@@1@23@@danf@17-8-2009
10421160@unknown@formal@none@1@S@It thus became expected that educated Europeans would learn at least some Italian; the English poet [[John Milton]], for instance, wrote some of his early poetry in Italian.@@@@1@28@@danf@17-8-2009
10421170@unknown@formal@none@1@S@In England, Italian became the second most common modern language to be learned, after [[French language|French]] (though the classical languages, [[Latin]] and [[Greek language|Greek]], came first).@@@@1@26@@danf@17-8-2009
10421180@unknown@formal@none@1@S@However, by the late eighteenth century, Italian tended to be replaced by [[German language|German]] as the second modern language on the curriculum.@@@@1@22@@danf@17-8-2009
10421190@unknown@formal@none@1@S@Yet Italian [[loanword]]s continue to be used in most other [[European languages]] in matters of art and music.@@@@1@18@@danf@17-8-2009
10421200@unknown@formal@none@1@S@Today, the Italian language continues to be used as a [[lingua franca]] in some environments.@@@@1@15@@danf@17-8-2009
10421210@unknown@formal@none@1@S@Within the [[Catholic church]], Italian is known by a large part of the ecclesiastical hierarchy and is used in place of [[Latin]] in some official documents.@@@@1@26@@danf@17-8-2009
10421220@unknown@formal@none@1@S@The status of Italian as the primary language of the [[Vatican City]] means that it is used not only within the [[Holy See]] but also throughout the world wherever an episcopal seat is present.@@@@1@31@@danf@17-8-2009
10421230@unknown@formal@none@1@S@It continues to be used in [[music]] and [[opera]].@@@@1@9@@danf@17-8-2009
10421240@unknown@formal@none@1@S@Other settings where Italian is sometimes used as a means of communication include some sports (occasionally [[Football (association)|football]] and [[motorsports]]) and the [[design]] and [[fashion]] industries.@@@@1@28@@danf@17-8-2009
10421250@unknown@formal@none@1@S@==Dialects==@@@@1@1@@danf@17-8-2009
10421260@unknown@formal@none@1@S@In Italy, all [[Romance languages]] spoken as the vernacular, other than standard Italian and unrelated non-Italian languages, are termed "Italian dialects".@@@@1@22@@danf@17-8-2009
10421270@unknown@formal@none@1@S@Many Italian dialects are, in fact, historical languages in their own right.@@@@1@12@@danf@17-8-2009
10421280@unknown@formal@none@1@S@These include recognized language groups such as [[Friulian language|Friulian]], [[Neapolitan language|Neapolitan]], [[Sardinian language|Sardinian]], [[Sicilian language|Sicilian]], [[Venetian language|Venetian]], and others, and regional variants of these languages such as [[Calabrian languages|Calabrian]].@@@@1@29@@danf@17-8-2009
10421290@unknown@formal@none@1@S@The division between dialect and language has been used by scholars (such as by [[Francesco Bruni]]) to distinguish between the languages that made up the Italian [[koine]], and those which had very little or no part in it, such as [[Albanian language|Albanian]], [[Greek language|Greek]], [[German language|German]], [[Ladin language|Ladin]], and [[Occitan language|Occitan]], which are still spoken by minorities.@@@@1@57@@danf@17-8-2009
10421300@unknown@formal@none@1@S@Dialects are generally not used for general mass communication and are usually limited to native speakers in informal contexts.@@@@1@19@@danf@17-8-2009
10421310@unknown@formal@none@1@S@In the past, speaking in dialect was often deprecated as a sign of poor education.@@@@1@15@@danf@17-8-2009
10421320@unknown@formal@none@1@S@Younger generations, especially those under 35 (though it may vary in different areas), speak almost exclusively standard Italian in all situations, usually with local accents and idioms.@@@@1@27@@danf@17-8-2009
10421330@unknown@formal@none@1@S@Regional differences can be recognized by various factors: the openness of vowels, the length of the consonants, and influence of the local dialect (for example, ''annà'' replaces ''andare'' in the area of Rome for the infinitive "to go").@@@@1@38@@danf@17-8-2009
10421340@unknown@formal@none@1@S@==Sounds==@@@@1@1@@danf@17-8-2009
10421350@unknown@formal@none@1@S@{{IPA notice|lang=it}}@@@@1@2@@danf@17-8-2009
10421360@unknown@formal@none@1@S@===Vowels===@@@@1@1@@danf@17-8-2009
10421370@unknown@formal@none@1@S@Italian has seven [[vowel]] phonemes: {{IPA|/a/}}, {{IPA|/e/}}, {{IPA|/ɛ/}}, {{IPA|/i/}}, {{IPA|/o/}}, {{IPA|/ɔ/}}, {{IPA|/u/}}.@@@@1@12@@danf@17-8-2009
10421380@unknown@formal@none@1@S@The pairs {{IPA|/e/}}-{{IPA|/ɛ/}} and {{IPA|/o/}}-{{IPA|/ɔ/}} are seldom distinguished in writing and often confused, even though most varieties of Italian employ both phonemes consistently.@@@@1@23@@danf@17-8-2009
10421390@unknown@formal@none@1@S@Compare, for example: "perché" {{IPA|[perˈkɛ]}} (why, because) and "senti" {{IPA|[ˈsenti]}} (you listen, you are listening, listen!), employed by some northern speakers, with {{IPA|[perˈke]}} and {{IPA|[ˈsɛnti]}}, as pronounced by most central and southern speakers.@@@@1@33@@danf@17-8-2009
10421400@unknown@formal@none@1@S@As a result, the usage is strongly indicative of a person's origin.@@@@1@12@@danf@17-8-2009
10421410@unknown@formal@none@1@S@The standard (Tuscan) usage of these vowels is listed in dictionaries and is employed outside Tuscany mainly by specialists, especially actors, and by very few (television) journalists.@@@@1@25@@danf@17-8-2009
10421420@unknown@formal@none@1@S@These are truly different [[phonemes]], however: compare {{IPA|/ˈpeska/}} (fishing) and {{IPA|/ˈpɛska/}} (peach), both spelled ''pesca''.@@@@1@16@@danf@17-8-2009
10421430@unknown@formal@none@1@S@Similarly, {{IPA|/ˈbotte/}} ('barrel') and {{IPA|/ˈbɔtte/}} ('beatings'), both spelled ''botte'', distinguish {{IPA|/o/}} and {{IPA|/ɔ/}}.@@@@1@14@@danf@17-8-2009
10421440@unknown@formal@none@1@S@In general, when vowels appear in combination, each vowel is pronounced separately.@@@@1@9@@danf@17-8-2009
10421450@unknown@formal@none@1@S@[[Diphthong]]s exist (e.g. ''uo'', ''iu'', ''ie'', ''ai''), but are limited to an unstressed ''u'' or ''i'' before or after a stressed vowel.@@@@1@22@@danf@17-8-2009
10421460@unknown@formal@none@1@S@The unstressed ''u'' in a diphthong approximates the English semivowel ''w'', the unstressed ''i'' approximates the semivowel ''y''.@@@@1@18@@danf@17-8-2009
10421470@unknown@formal@none@1@S@E.g.: ''buono'' {{IPA|[ˈbwɔno]}}, ''ieri'' {{IPA|[ˈjɛri]}}.@@@@1@5@@danf@17-8-2009
10421480@unknown@formal@none@1@S@[[Triphthong]]s exist in Italian as well, like "contin''uia''mo" ("we continue").@@@@1@10@@danf@17-8-2009
10421490@unknown@formal@none@1@S@Three-vowel combinations exist only in the form of a semiconsonant ({{IPA|/j/}} or {{IPA|/w/}}) followed by a vowel and then a desinence vowel (usually {{IPA|/i/}}), as in ''miei'' and ''suoi'', or of two semiconsonants followed by a vowel, as in the group ''-uia-'' exemplified above or in ''-iuo-'' in the word ''aiuola''.@@@@1@46@@danf@17-8-2009
10421500@unknown@formal@none@1@S@===Mobile diphthongs===@@@@1@2@@danf@17-8-2009
10421510@unknown@formal@none@1@S@Many Latin words with a short ''e'' or ''o'' have Italian counterparts with a mobile diphthong (''ie'' and ''uo'' respectively).@@@@1@20@@danf@17-8-2009
10421520@unknown@formal@none@1@S@When the vowel sound is stressed, it is pronounced and written as a diphthong; when not stressed, it is pronounced and written as a single vowel.@@@@1@26@@danf@17-8-2009
10421530@unknown@formal@none@1@S@So Latin ''focus'' gave rise to Italian ''fuoco'' (meaning both "fire" and "optical focus"); when unstressed, as in ''focale'' ("focal"), the "o" remains alone.@@@@1@24@@danf@17-8-2009
10421540@unknown@formal@none@1@S@Latin ''pes'' (more precisely its accusative form ''pedem'') is the source of Italian ''piede'' (foot); but the unstressed "e" was left unchanged in ''pedone'' (pedestrian) and ''pedale'' (pedal).@@@@1@27@@danf@17-8-2009
10421550@unknown@formal@none@1@S@From Latin ''iocus'' comes Italian ''giuoco'' ("play", "game"), though in this case ''gioco'' is more common: ''giocare'' means "to play (a game)".@@@@1@22@@danf@17-8-2009
10421560@unknown@formal@none@1@S@From Latin ''homo'' comes Italian ''uomo'' (man), but also ''umano'' (human) and ''ominide'' (hominid).@@@@1@14@@danf@17-8-2009
10421570@unknown@formal@none@1@S@From Latin ''ovum'' comes Italian ''uovo'' (egg) and ''ovaie'' (ovaries).@@@@1@10@@danf@17-8-2009
10421580@unknown@formal@none@1@S@(The same phenomenon occurs in [[Spanish language|Spanish]]: ''juego'' (play, game) and ''jugar'' (to play), ''nieve'' (snow) and ''nevar'' (to snow)).@@@@1@20@@danf@17-8-2009
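This alternation is regular enough to capture in a few lines of code. The following Python sketch is a toy illustration only: the word pairs are the ones cited above, while the function name and the simple string substitution are invented for demonstration and are not taken from any real morphological analyzer.

```python
# Toy illustration of the Italian "mobile diphthong" alternation
# described above: Latin short e/o surfaces as "ie"/"uo" under
# stress but remains "e"/"o" when unstressed.
MOBILE_PAIRS = [
    ("fuoco", "focale"),     # fire -> focal
    ("piede", "pedale"),     # foot -> pedal
    ("giuoco", "giocare"),   # game -> to play (a game)
    ("uomo", "ominide"),     # man  -> hominid
    ("uovo", "ovaie"),       # egg  -> ovaries
]

def reduce_diphthong(word: str) -> str:
    """Replace the stressed diphthongs uo/ie with single o/e (toy rule)."""
    return word.replace("uo", "o").replace("ie", "e")

for stressed, unstressed in MOBILE_PAIRS:
    # The reduced form shares its stem with the unstressed derivative.
    print(f"{stressed} -> {reduce_diphthong(stressed)} (cf. {unstressed})")
```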
10421590@unknown@formal@none@1@S@===Consonants===@@@@1@1@@danf@17-8-2009
10421610@unknown@formal@none@1@S@Nasals undergo assimilation when followed by a consonant: for example, before a velar ({{IPA|/k/}} or {{IPA|/g/}}) only {{IPA|[ŋ]}} appears.@@@@1@20@@danf@17-8-2009
10421620@unknown@formal@none@1@S@Italian has geminate, or double, consonants, which are distinguished by [[Consonant length|length]].@@@@1@12@@danf@17-8-2009
10421630@unknown@formal@none@1@S@Length is distinctive for all consonants except for {{IPA|/ʃ/}}, {{IPA|/ʦ/}}, {{IPA|/ʣ/}}, {{IPA|/ʎ/}}, and {{IPA|/ɲ/}}, which are always geminate, and {{IPA|/z/}}, which is always single.@@@@1@23@@danf@17-8-2009
10421640@unknown@formal@none@1@S@Geminate plosives and affricates are realised as lengthened closures.@@@@1@9@@danf@17-8-2009
10421650@unknown@formal@none@1@S@Geminate fricatives, nasals, and {{IPA|/l/}} are realized as lengthened [[continuant]]s.@@@@1@10@@danf@17-8-2009
10421660@unknown@formal@none@1@S@A non-trilled pronunciation of {{IPA|/r/}}, called ''erre moscia'' ("soft r"), is typically dialectal or idiolectal.@@@@1@13@@danf@17-8-2009
10421670@unknown@formal@none@1@S@The standard pronunciation is the trill {{IPA|[r]}}.@@@@1@6@@danf@17-8-2009
10421680@unknown@formal@none@1@S@Of special interest to the linguistic study of Italian is the ''[[Tuscan gorgia|Gorgia Toscana]]'', or "Tuscan Throat", the weakening or [[lenition]] of certain [[:wiktionary:intervocalic|intervocalic]] consonants in [[Tuscan dialect]]s.@@@@1@28@@danf@17-8-2009
10421690@unknown@formal@none@1@S@See also [[Syntactic doubling]].@@@@1@4@@danf@17-8-2009
10421700@unknown@formal@none@1@S@===Assimilation===@@@@1@1@@danf@17-8-2009
10421710@unknown@formal@none@1@S@Italian has few diphthongs, so most unfamiliar diphthongs heard in foreign words (in particular, those beginning with the vowel "a", "e", or "o") are assimilated with a [[diaeresis]] (i.e., the vowel sounds are pronounced separately).@@@@1@39@@danf@17-8-2009
10421720@unknown@formal@none@1@S@Italian [[phonotactics]] do not usually permit polysyllabic nouns and verbs to end with consonants, except in poetry and song, so foreign words may receive extra terminal vowel sounds.@@@@1@27@@danf@17-8-2009
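A rough sketch of this second tendency, purely for illustration: the rule below, including the choice of epenthetic vowel, is a hypothetical simplification of what speakers actually do, not a standardized adaptation rule.

```python
# Hypothetical sketch of the loanword adaptation described above:
# since Italian phonotactics disfavor consonant-final polysyllabic
# words, casual speech often adds a terminal vowel sound. The
# epenthetic vowel chosen here is a guess for illustration only.
VOWELS = set("aeiou")

def naive_italianize(word: str, epenthetic_vowel: str = "e") -> str:
    """Append a vowel to a consonant-final word (toy rule)."""
    if word and word[-1].lower() not in VOWELS:
        return word + epenthetic_vowel
    return word

print(naive_italianize("sport"))  # adds a final vowel (illustrative only)
print(naive_italianize("pasta"))  # unchanged: already vowel-final
```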
10421730@unknown@formal@none@1@S@==Grammar==@@@@1@1@@danf@17-8-2009
10421740@unknown@formal@none@1@S@===Common variations in the writing system===@@@@1@6@@danf@17-8-2009
10421750@unknown@formal@none@1@S@Some variations in the usage of the writing system may be present in practical use.@@@@1@15@@danf@17-8-2009
10421760@unknown@formal@none@1@S@These are scorned by educated people, but they are so common in certain contexts that knowledge of them may be useful.@@@@1@21@@danf@17-8-2009
10421770@unknown@formal@none@1@S@* Usage of ''x'' instead of ''per'': this is very common among teenagers and in [[Text messaging|SMS]] abbreviations.@@@@1@18@@danf@17-8-2009
10421780@unknown@formal@none@1@S@The multiplication operator is pronounced "per" in Italian, and so it is sometimes used to replace the word "per", which means "for"; thus, for example, "per te" ("for you") is shortened to "x te" (compare with English "4 U").@@@@1@39@@danf@17-8-2009
10421790@unknown@formal@none@1@S@Words containing ''per'' can also have it replaced with ''x'': for example, ''perché'' (both "why" and "because") is often shortened as ''xché'' or ''xké'' or ''x' ''(see below).@@@@1@28@@danf@17-8-2009
10421800@unknown@formal@none@1@S@This usage might be useful for jotting down quick notes or for fitting more text into the low character limit of an SMS, but it is considered unacceptable in formal writing (a toy expander for these and the other abbreviations below is sketched after this list).@@@@1@31@@danf@17-8-2009
10421810@unknown@formal@none@1@S@* Usage of foreign letters such as ''k'', ''j'' and ''y'', especially in nicknames and SMS language: ''ke'' instead of ''che'', ''Giusy'' instead of ''Giuseppina'' (or sometimes ''Giuseppe'').@@@@1@28@@danf@17-8-2009
10421820@unknown@formal@none@1@S@This is curiously mirrored in the usage of ''i'' in English names such as ''Staci'' instead of ''Stacey'', or in the usage of ''c'' in [[Northern Europe]] (''Jacob'' instead of ''Jakob'').@@@@1@31@@danf@17-8-2009
10421830@unknown@formal@none@1@S@The use of "k" instead of "ch" or "c" to represent a plosive sound is documented in some historical texts from before the standardization of the Italian language; however, that usage is no longer standard in Italian.@@@@1@37@@danf@17-8-2009
10421840@unknown@formal@none@1@S@Possibly because it is associated with the [[German language]], the letter "k" has sometimes also been used in satire to suggest that a political figure is an authoritarian or even a "pseudo-nazi": [[Francesco Cossiga]] was famously nicknamed ''Kossiga'' by rioting students during his tenure as minister of internal affairs.@@@@1@49@@danf@17-8-2009
10421850@unknown@formal@none@1@S@(Cf. the [[alternative political spelling#"K" replacing "C"|politicized spelling ''Amerika'']] in the USA.)@@@@1@12@@danf@17-8-2009
10421860@unknown@formal@none@1@S@* Usage of the following abbreviations is limited to the electronic communications media and is deprecated in all other cases: '''nn''' instead of ''non'' (not), '''cmq''' instead of ''comunque'' (anyway, however), '''cm''' instead of ''come'' (how, like, as), '''d''' instead of ''di'' (of), '''(io/loro) sn''' instead of ''(io/loro) sono'' (I am/they are), '''(io) dv''' instead of ''(io) devo'' (I must/I have to) or instead of ''dove'' (where), '''(tu) 6''' instead of ''(tu) sei'' (you are).@@@@1@75@@danf@17-8-2009
10421870@unknown@formal@none@1@S@* Inexperienced typists often replace accents with apostrophes, such as in ''perche''' instead of ''perché''.@@@@1@15@@danf@17-8-2009
10421880@unknown@formal@none@1@S@Uppercase ''[[È]]'' is particularly rare, as it is absent from the [[Keyboard layout#Italian|Italian keyboard layout]], and is very often written as ''E''' (even though there are [[:it:Aiuto:Manuale di stile#Scrivere .C3.88|several ways]] of producing the uppercase È on a computer).@@@@1@39@@danf@17-8-2009
10421890@unknown@formal@none@1@S@This never happens in books or other professionally typeset material.@@@@1@10@@danf@17-8-2009
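The shorthand described in this list is regular enough to express as a lookup table. The following Python sketch is an invented toy, not an established SMS-normalization tool: it expands the forms named above back into standard spelling, and also shows one programmatic way of producing the uppercase È, which is simply Unicode code point U+00C8.

```python
# Toy expander for the SMS shorthand listed above. The table covers
# only the forms named in the text; it is an illustration, not a
# complete normalizer (e.g. it ignores capitalization and context).
SHORTHAND = {
    "x": "per",        # "x te" -> "per te"
    "xché": "perché",
    "xké": "perché",
    "ke": "che",
    "nn": "non",
    "cmq": "comunque",
    "cm": "come",
    "d": "di",
    "sn": "sono",
    "dv": "devo",      # or "dove"; only context can decide
    "6": "sei",
}

def expand_sms(message: str) -> str:
    """Replace whole-word shorthand tokens with standard spellings."""
    return " ".join(SHORTHAND.get(tok, tok) for tok in message.split())

print(expand_sms("x te 6 cmq speciale"))  # -> "per te sei comunque speciale"

# Uppercase È is absent from the Italian keyboard layout, but software
# can always produce it from its Unicode code point:
print("\u00C8")  # -> È
```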
10421910@unknown@formal@none@1@S@==Examples==@@@@1@1@@danf@17-8-2009
10421920@unknown@formal@none@1@S@*Cheers: ''salute!''@@@@1@2@@danf@17-8-2009
10421930@unknown@formal@none@1@S@*English: ''inglese'' {{IPA|/iŋˈglese/}}@@@@1@3@@danf@17-8-2009
10421940@unknown@formal@none@1@S@*Good-bye: ''arrivederci'' {{IPA|/arriveˈdertʃi/}}@@@@1@3@@danf@17-8-2009
10421950@unknown@formal@none@1@S@*Hello: ''[[ciao]]'' {{IPA|/ˈtʃao/}}@@@@1@3@@danf@17-8-2009
10421960@unknown@formal@none@1@S@*Good day: ''buon giorno'' {{IPA|/bwɔnˈdʒorno/}}@@@@1@5@@danf@17-8-2009
10421970@unknown@formal@none@1@S@*Good evening: ''buona sera'' {{IPA|/bwɔnaˈsera/}}@@@@1@5@@danf@17-8-2009
10421980@unknown@formal@none@1@S@*Yes: ''sì'' {{IPA|/si/}}@@@@1@3@@danf@17-8-2009
10421990@unknown@formal@none@1@S@*No: ''no'' {{IPA|/nɔ/}}@@@@1@3@@danf@17-8-2009
10422000@unknown@formal@none@1@S@*How are you?: ''come stai'' {{IPA|/ˈkome ˈstai/}} (informal); ''come sta'' {{IPA|/ˈkome ˈsta/}} (formal)@@@@1@14@@danf@17-8-2009
10422010@unknown@formal@none@1@S@*Sorry: ''mi dispiace'' {{IPA|/mi disˈpjatʃe/}}@@@@1@5@@danf@17-8-2009
10422020@unknown@formal@none@1@S@*Excuse me: ''scusa'' {{IPA|/ˈskuza/}} (informal); ''scusi'' {{IPA|/ˈskuzi/}} (formal)@@@@1@8@@danf@17-8-2009
10422030@unknown@formal@none@1@S@*Again: ''di nuovo'' {{IPA|/di ˈnwɔvo/}}; ''ancora'' {{IPA|/aŋˈkora/}}@@@@1@7@@danf@17-8-2009
10422040@unknown@formal@none@1@S@*Always: ''sempre'' {{IPA|/ˈsɛmpre/}}@@@@1@3@@danf@17-8-2009
10422050@unknown@formal@none@1@S@*When: ''quando'' {{IPA|/ˈkwando/}}@@@@1@3@@danf@17-8-2009
10422060@unknown@formal@none@1@S@*Where: ''dove'' {{IPA|/ˈdove/}}@@@@1@3@@danf@17-8-2009
10422070@unknown@formal@none@1@S@*Why/Because: ''perché'' {{IPA|/perˈke/}}@@@@1@3@@danf@17-8-2009
10422080@unknown@formal@none@1@S@*How: ''come'' {{IPA|/ˈkome/}}@@@@1@3@@danf@17-8-2009
10422090@unknown@formal@none@1@S@*How much is it?: ''quanto costa?'' {{IPA|/ˈkwanto ˈkɔsta/}}@@@@1@7@@danf@17-8-2009
10422110@unknown@formal@none@1@S@*Thank you!: ''grazie!'' {{IPA|/ˈgrattsie/}}@@@@1@4@@danf@17-8-2009
10422130@unknown@formal@none@1@S@*Bon appetit: ''buon appetito'' {{IPA|/ˌbwɔn appeˈtito/}}@@@@1@6@@danf@17-8-2009
10422140@unknown@formal@none@1@S@*You're welcome!: ''prego!'' {{IPA|/ˈprɛgo/}}@@@@1@4@@danf@17-8-2009
10422160@unknown@formal@none@1@S@*I love you: ''Ti amo'' {{IPA|/ti ˈamo/}}, ''Ti voglio bene'' {{IPA|/ti ˈvɔʎʎo ˈbɛne/}}.@@@@1@13@@danf@17-8-2009
10422170@unknown@formal@none@1@S@The difference is that ''ti amo'' is used in a romantic relationship, while ''ti voglio bene'' is used on any other occasion (with parents, relatives, friends...).@@@@1@28@@danf@17-8-2009
10422180@unknown@formal@none@1@S@Counting to twenty:@@@@1@3@@danf@17-8-2009
10422190@unknown@formal@none@1@S@*One: ''uno'' {{IPA|/ˈuno/}}@@@@1@3@@danf@17-8-2009
10422200@unknown@formal@none@1@S@*Two: ''due'' {{IPA|/ˈdue/}}@@@@1@3@@danf@17-8-2009
10422210@unknown@formal@none@1@S@*Three: ''tre'' {{IPA|/tre/}}@@@@1@3@@danf@17-8-2009
10422220@unknown@formal@none@1@S@*Four: ''quattro'' {{IPA|/ˈkwattro/}}@@@@1@3@@danf@17-8-2009
10422230@unknown@formal@none@1@S@*Five: ''cinque'' {{IPA|/ˈʧiŋkwe/}}@@@@1@3@@danf@17-8-2009
10422240@unknown@formal@none@1@S@*Six: ''sei'' {{IPA|/ˈsɛi/}}@@@@1@3@@danf@17-8-2009
10422250@unknown@formal@none@1@S@*Seven: ''sette'' {{IPA|/ˈsɛtte/}}@@@@1@3@@danf@17-8-2009
10422260@unknown@formal@none@1@S@*Eight: ''otto'' {{IPA|/ˈɔtto/}}@@@@1@3@@danf@17-8-2009
10422270@unknown@formal@none@1@S@*Nine: ''nove'' {{IPA|/ˈnɔve/}}@@@@1@3@@danf@17-8-2009
10422280@unknown@formal@none@1@S@*Ten: ''dieci'' {{IPA|/ˈdjɛʧi/}}@@@@1@3@@danf@17-8-2009
10422290@unknown@formal@none@1@S@*Eleven: ''undici'' {{IPA|/ˈundiʧi/}}@@@@1@3@@danf@17-8-2009
10422300@unknown@formal@none@1@S@*Twelve: ''dodici'' {{IPA|/ˈdodiʧi/}}@@@@1@3@@danf@17-8-2009
10422310@unknown@formal@none@1@S@*Thirteen: ''tredici'' {{IPA|/ˈtrediʧi/}}@@@@1@3@@danf@17-8-2009
10422320@unknown@formal@none@1@S@*Fourteen: ''quattordici'' {{IPA|/kwatˈtordiʧi/}}@@@@1@3@@danf@17-8-2009
10422330@unknown@formal@none@1@S@*Fifteen: ''quindici'' {{IPA|/ˈkwindiʧi/}}@@@@1@3@@danf@17-8-2009
10422340@unknown@formal@none@1@S@*Sixteen: ''sedici'' {{IPA|/ˈsediʧi/}}@@@@1@3@@danf@17-8-2009
10422350@unknown@formal@none@1@S@*Seventeen: ''diciassette'' {{IPA|/diʧasˈsɛtte/}}@@@@1@3@@danf@17-8-2009
10422360@unknown@formal@none@1@S@*Eighteen: ''diciotto'' {{IPA|/diˈʧɔtto/}}@@@@1@3@@danf@17-8-2009
10422370@unknown@formal@none@1@S@*Nineteen: ''diciannove'' {{IPA|/diʧanˈnɔve/}}@@@@1@3@@danf@17-8-2009
10422380@unknown@formal@none@1@S@*Twenty: ''venti'' {{IPA|/ˈventi/}}@@@@1@3@@danf@17-8-2009
10422390@unknown@formal@none@1@S@The days of the week:@@@@1@5@@danf@17-8-2009
10422400@unknown@formal@none@1@S@*Monday: ''lunedì'' {{IPA|/luneˈdi/}}@@@@1@3@@danf@17-8-2009
10422410@unknown@formal@none@1@S@*Tuesday: ''martedì'' {{IPA|/marteˈdi/}}@@@@1@3@@danf@17-8-2009
10422420@unknown@formal@none@1@S@*Wednesday: ''mercoledì'' {{IPA|/merkoleˈdi/}}@@@@1@3@@danf@17-8-2009
10422430@unknown@formal@none@1@S@*Thursday: ''giovedì'' {{IPA|/dʒoveˈdi/}}@@@@1@3@@danf@17-8-2009
10422440@unknown@formal@none@1@S@*Friday: ''venerdì'' {{IPA|/venerˈdi/}}@@@@1@3@@danf@17-8-2009
10422450@unknown@formal@none@1@S@*Saturday: ''sabato'' {{IPA|/ˈsabato/}}@@@@1@3@@danf@17-8-2009
10422460@unknown@formal@none@1@S@*Sunday: ''domenica'' {{IPA|/doˈmenika/}}@@@@1@3@@danf@17-8-2009
10422470@unknown@formal@none@1@S@==Sample texts==@@@@1@2@@danf@17-8-2009
10422480@unknown@formal@none@1@S@There is a recording of [[Dante]]'s [[Divine Comedy]] read by [[Lino Pertile]] available at http://etcweb.princeton.edu/dante/pdp/@@@@1@15@@danf@17-8-2009