10050010@unknown@formal@none@1@S@
Artificial neural network
@@@@1@3@@danf@17-8-2009 10050020@unknown@formal@none@1@S@An '''artificial neural network (ANN)''', often just called a "neural network" (NN), is a [[mathematical model]] or [[computational model]] based on [[biological neural networks]].@@@@1@24@@danf@17-8-2009 10050030@unknown@formal@none@1@S@It consists of an interconnected group of [[artificial neuron]]s and processes information using a [[connectionism|connectionist]] approach to [[computation]].@@@@1@18@@danf@17-8-2009 10050040@unknown@formal@none@1@S@In most cases an ANN is an [[adaptive system]] that changes its structure based on external or internal information that flows through the network during the learning phase.@@@@1@28@@danf@17-8-2009 10050050@unknown@formal@none@1@S@In more practical terms neural networks are [[non-linear]] [[statistical]] [[data modeling]] tools.@@@@1@12@@danf@17-8-2009 10050060@unknown@formal@none@1@S@They can be used to model complex relationships between inputs and outputs or to [[Pattern recognition|find patterns]] in data.@@@@1@19@@danf@17-8-2009 10050070@unknown@formal@none@1@S@==Background==@@@@1@1@@danf@17-8-2009 10050080@unknown@formal@none@1@S@There is no precise agreed-upon definition among researchers as to what a [[neural network]] is, but most would agree that it involves a network of simple processing elements ([[artificial neuron|neurons]]), which can exhibit complex global behavior, determined by the connections between the processing elements and element parameters.@@@@1@47@@danf@17-8-2009 10050090@unknown@formal@none@1@S@The original inspiration for the technique was from examination of the [[central nervous system]] and the neurons (and their [[axons]], [[dendrites]] and [[synapses]]) which constitute one of its most significant information processing elements (see [[Neuroscience]]).@@@@1@35@@danf@17-8-2009 10050100@unknown@formal@none@1@S@In a neural network model, simple [[Node (neural networks)|nodes]] (called variously "neurons", "neurodes", "PEs" ("processing elements") or "units") are connected together to form a network of nodes — hence the term "neural network."@@@@1@33@@danf@17-8-2009 10050110@unknown@formal@none@1@S@While a neural network does not have to be adaptive per se, its practical use comes with algorithms designed to alter the strength (weights) of the connections in the network to produce a desired signal flow.@@@@1@36@@danf@17-8-2009 10050120@unknown@formal@none@1@S@These networks are also similar to the [[biological neural networks]] in the sense that functions are performed collectively and in parallel by the units, rather than there being a clear delineation of subtasks to which various units are assigned (see also [[connectionism]]).@@@@1@42@@danf@17-8-2009 10050130@unknown@formal@none@1@S@Currently, the term Artificial Neural Network (ANN) tends to refer mostly to neural network models employed in [[statistics]], [[cognitive psychology]] and [[artificial intelligence]].@@@@1@23@@danf@17-8-2009 10050140@unknown@formal@none@1@S@[[Neural network]] models designed with emulation of the [[central nervous system]] (CNS) in mind are a subject of [[theoretical neuroscience]] ([[computational neuroscience]]).@@@@1@22@@danf@17-8-2009 10050150@unknown@formal@none@1@S@In modern [[Neural network software|software implementations]] of artificial neural networks the approach inspired by biology has more or less been abandoned for a more practical approach based on statistics and signal processing.@@@@1@32@@danf@17-8-2009 10050160@unknown@formal@none@1@S@In some of these 
systems neural networks, or parts of neural networks (such as [[artificial neuron]]s) are used as components in larger systems that combine both adaptive and non-adaptive elements.@@@@1@30@@danf@17-8-2009 10050170@unknown@formal@none@1@S@While the more general approach of such [[adaptive systems]] is more suitable for real-world problem solving, it has far less to do with the traditional artificial intelligence connectionist models.@@@@1@29@@danf@17-8-2009 10050180@unknown@formal@none@1@S@What they do, however, have in common is the principle of non-linear, distributed, parallel and local processing and adaptation.@@@@1@19@@danf@17-8-2009 10050190@unknown@formal@none@1@S@===Models===@@@@1@1@@danf@17-8-2009 10050200@unknown@formal@none@1@S@Neural network models in artificial intelligence are usually referred to as artificial neural networks (ANNs); these are essentially simple mathematical models defining a function f : X \\rightarrow Y .@@@@1@31@@danf@17-8-2009 10050210@unknown@formal@none@1@S@Each type of ANN model corresponds to a ''class'' of such functions.@@@@1@12@@danf@17-8-2009 10050220@unknown@formal@none@1@S@====The ''network'' in ''artificial neural network''====@@@@1@6@@danf@17-8-2009 10050230@unknown@formal@none@1@S@The word ''network'' in the term 'artificial neural network' arises because the function f(x) is defined as a composition of other functions g_i(x), which can further be defined as a composition of other functions.@@@@1@34@@danf@17-8-2009 10050240@unknown@formal@none@1@S@This can be conveniently represented as a network structure, with arrows depicting the dependencies between variables.@@@@1@16@@danf@17-8-2009 10050250@unknown@formal@none@1@S@A widely used type of composition is the ''nonlinear weighted sum'', where f (x) = K \\left(\\sum_i w_i g_i(x)\\right) , where K is some predefined function, such as the [[hyperbolic tangent]].@@@@1@31@@danf@17-8-2009 10050260@unknown@formal@none@1@S@It will be convenient for the following to refer to a collection of functions g_i as simply a vector g = (g_1, g_2, \\ldots, g_n).@@@@1@25@@danf@17-8-2009 10050270@unknown@formal@none@1@S@This figure depicts such a decomposition of f, with dependencies between variables indicated by arrows.@@@@1@15@@danf@17-8-2009 10050280@unknown@formal@none@1@S@These can be interpreted in two ways.@@@@1@7@@danf@17-8-2009 10050290@unknown@formal@none@1@S@The first view is the functional view: the input x is transformed into a 3-dimensional vector h, which is then transformed into a 2-dimensional vector g, which is finally transformed into f.@@@@1@32@@danf@17-8-2009 10050300@unknown@formal@none@1@S@This view is most commonly encountered in the context of [[Optimization (mathematics)|optimization]].@@@@1@12@@danf@17-8-2009 10050310@unknown@formal@none@1@S@The second view is the probabilistic view: the [[random variable]] F = f(G) depends upon the random variable G = g(H), which depends upon H=h(X), which depends upon the random variable X.@@@@1@33@@danf@17-8-2009 10050320@unknown@formal@none@1@S@This view is most commonly encountered in the context of [[graphical models]].@@@@1@12@@danf@17-8-2009 10050330@unknown@formal@none@1@S@The two views are largely equivalent.@@@@1@6@@danf@17-8-2009 10050340@unknown@formal@none@1@S@In either case, for this particular network architecture, the components of individual layers are independent of each other (e.g., the components of g are independent of each other given their input h).@@@@1@32@@danf@17-8-2009 10050350@unknown@formal@none@1@S@This naturally enables 
a degree of parallelism in the implementation.@@@@1@10@@danf@17-8-2009 10050360@unknown@formal@none@1@S@Networks such as the previous one are commonly called [[feedforward]], because their graph is a [[directed acyclic graph]].@@@@1@18@@danf@17-8-2009 10050370@unknown@formal@none@1@S@Networks with [[path (graph theory)|cycles]] are commonly called [[Recurrent_neural_network|recurrent]].@@@@1@9@@danf@17-8-2009 10050380@unknown@formal@none@1@S@Such networks are commonly depicted in the manner shown at the top of the figure, where f is shown as being dependent upon itself.@@@@1@24@@danf@17-8-2009 10050390@unknown@formal@none@1@S@However, there is an implied temporal dependence which is not shown.@@@@1@11@@danf@17-8-2009 10050400@unknown@formal@none@1@S@What this actually means in practice is that the value of f at some point in time t depends upon the values of f at zero or at one or more other points in time.@@@@1@35@@danf@17-8-2009 10050410@unknown@formal@none@1@S@The graphical model at the bottom of the figure illustrates the case: the value of f at time t only depends upon its last value.@@@@1@25@@danf@17-8-2009 10050420@unknown@formal@none@1@S@===Learning===@@@@1@1@@danf@17-8-2009 10050430@unknown@formal@none@1@S@However interesting such functions may be in themselves, what has attracted the most interest in neural networks is the possibility of ''learning'', which in practice means the following:@@@@1@28@@danf@17-8-2009 10050440@unknown@formal@none@1@S@Given a specific ''task'' to solve, and a ''class'' of functions F, learning means using a set of ''observations'', in order to find f^* \\in F which solves the task in an ''optimal sense''.@@@@1@34@@danf@17-8-2009 10050450@unknown@formal@none@1@S@This entails defining a [[cost function]] C : F \\rightarrow \\mathbb{R} such that, for the optimal solution f^*, C(f^*) \\leq C(f) \\forall f \\in F (no solution has a cost less than the cost of the optimal solution).@@@@1@38@@danf@17-8-2009 10050460@unknown@formal@none@1@S@The [[cost function]] C is an important concept in learning, as it is a measure of how far away we are from an optimal solution to the problem that we want to solve.@@@@1@33@@danf@17-8-2009 10050470@unknown@formal@none@1@S@Learning algorithms search through the solution space in order to find a function that has the smallest possible cost.@@@@1@19@@danf@17-8-2009 10050480@unknown@formal@none@1@S@For applications where the solution is dependent on some data, the cost must necessarily be a ''function of the observations'', otherwise we would not be modelling anything related to the data.@@@@1@31@@danf@17-8-2009 10050490@unknown@formal@none@1@S@It is frequently defined as a [[statistic]] to which only approximations can be made.@@@@1@14@@danf@17-8-2009 10050500@unknown@formal@none@1@S@As a simple example consider the problem of finding the model f which minimizes C=E\\left[(f(x) - y)^2\\right], for data pairs (x,y) drawn from some distribution \\mathcal{D}.@@@@1@26@@danf@17-8-2009 10050510@unknown@formal@none@1@S@In practical situations we would only have N samples from \\mathcal{D} and thus, for the above example, we would only minimize \\hat{C}=\\frac{1}{N}\\sum_{i=1}^N (f(x_i)-y_i)^2.@@@@1@23@@danf@17-8-2009 10050520@unknown@formal@none@1@S@Thus, the cost is minimized over a sample of the data rather than the true data distribution.@@@@1@17@@danf@17-8-2009 10050530@unknown@formal@none@1@S@When N \\rightarrow \\infty some form of online learning must be used, where the cost is partially minimized as each new example is 
seen.@@@@1@24@@danf@17-8-2009 10050540@unknown@formal@none@1@S@While online learning is often used when \\mathcal{D} is fixed, it is most useful in the case where the distribution changes slowly over time.@@@@1@24@@danf@17-8-2009 10050550@unknown@formal@none@1@S@In neural network methods, some form of online learning is frequently also used for finite datasets.@@@@1@16@@danf@17-8-2009 10050560@unknown@formal@none@1@S@====Choosing a cost function====@@@@1@4@@danf@17-8-2009 10050570@unknown@formal@none@1@S@While it is possible to arbitrarily define some [[ad hoc]] cost function, frequently a particular cost will be used either because it has desirable properties (such as convexity) or because it arises naturally from a particular formulation of the problem (i.e., In a probabilistic formulation the posterior probability of the model can be used as an inverse cost).@@@@1@58@@danf@17-8-2009 10050580@unknown@formal@none@1@S@'''Ultimately, the cost function will depend on the task we wish to perform'''.@@@@1@13@@danf@17-8-2009 10050590@unknown@formal@none@1@S@The three main categories of learning tasks are overviewed below.@@@@1@10@@danf@17-8-2009 10050600@unknown@formal@none@1@S@===Learning paradigms===@@@@1@2@@danf@17-8-2009 10050610@unknown@formal@none@1@S@There are three major learning paradigms, each corresponding to a particular abstract learning task.@@@@1@14@@danf@17-8-2009 10050620@unknown@formal@none@1@S@These are [[supervised learning]], [[unsupervised learning]] and [[reinforcement learning]].@@@@1@9@@danf@17-8-2009 10050630@unknown@formal@none@1@S@Usually any given type of network architecture can be employed in any of those tasks.@@@@1@15@@danf@17-8-2009 10050640@unknown@formal@none@1@S@====Supervised learning====@@@@1@2@@danf@17-8-2009 10050650@unknown@formal@none@1@S@In [[supervised learning]], we are given a set of example pairs (x, y), x \\in X, y \\in Y and the aim is to find a function f : X \\rightarrow Y in the allowed class of functions that matches the examples.@@@@1@45@@danf@17-8-2009 10050660@unknown@formal@none@1@S@In other words, we wish to ''infer'' the mapping implied by the data; the cost function is related to the mismatch between our mapping and the data and it implicitly contains prior knowledge about the problem domain.@@@@1@37@@danf@17-8-2009 10050670@unknown@formal@none@1@S@A commonly used cost is the [[mean-squared error]] which tries to minimize the average error between the network's output, f(x), and the target value y over all the example pairs.@@@@1@30@@danf@17-8-2009 10050680@unknown@formal@none@1@S@When one tries to minimise this cost using [[gradient descent]] for the class of neural networks called [[Multilayer perceptron|Multi-Layer Perceptrons]], one obtains the common and well-known [[Backpropagation|backpropagation algorithm]] for training neural networks.@@@@1@32@@danf@17-8-2009 10050690@unknown@formal@none@1@S@Tasks that fall within the paradigm of supervised learning are [[pattern recognition]] (also known as classification) and [[Regression analysis|regression]] (also known as function approximation).@@@@1@24@@danf@17-8-2009 10050700@unknown@formal@none@1@S@The supervised learning paradigm is also applicable to sequential data (e.g., for speech and gesture recognition).@@@@1@16@@danf@17-8-2009 10050710@unknown@formal@none@1@S@This can be thought of as learning with a "teacher," in the form of a function that provides continuous feedback on the quality of solutions obtained thus far.@@@@1@28@@danf@17-8-2009 
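The following sketch is not part of the standard formulation; the toy dataset, layer sizes and learning rate are arbitrary, illustrative choices. It fits a small multi-layer perceptron of the form f(x) = K \\left(\\sum_i w_i g_i(x)\\right) with K = tanh to example pairs (x, y) by gradient descent on the empirical mean-squared error \\hat{C}=\\frac{1}{N}\\sum_{i=1}^N (f(x_i)-y_i)^2 defined above, with backpropagation supplying the derivatives of the cost with respect to the weights.

<source lang="python">
# Minimal sketch of supervised learning by gradient descent on mean-squared error.
# Illustrative only: the toy dataset, layer sizes and learning rate are arbitrary choices.
import numpy as np

rng = np.random.default_rng(0)

# Toy data pairs (x, y) drawn from a noisy sine curve.
X = rng.uniform(-3.0, 3.0, size=(200, 1))
Y = np.sin(X) + 0.1 * rng.normal(size=X.shape)

# A two-layer perceptron: f(x) = W2 * tanh(W1 * x + b1) + b2,
# i.e. a nonlinear weighted sum with K = tanh in the hidden layer.
n_hidden = 16
W1 = rng.normal(scale=0.5, size=(1, n_hidden))
b1 = np.zeros(n_hidden)
W2 = rng.normal(scale=0.5, size=(n_hidden, 1))
b2 = np.zeros(1)

learning_rate = 0.05
for epoch in range(2000):
    # Forward pass.
    H = np.tanh(X @ W1 + b1)          # hidden layer activations g(x)
    F = H @ W2 + b2                   # network output f(x)

    # Empirical cost over the N samples: (1/N) * sum_i (f(x_i) - y_i)^2
    error = F - Y
    cost = np.mean(error ** 2)

    # Backpropagation: derivatives of the cost with respect to each parameter.
    N = X.shape[0]
    dF = 2.0 * error / N              # dC/dF
    dW2 = H.T @ dF
    db2 = dF.sum(axis=0)
    dH = dF @ W2.T
    dZ = dH * (1.0 - H ** 2)          # tanh'(z) = 1 - tanh(z)^2
    dW1 = X.T @ dZ
    db1 = dZ.sum(axis=0)

    # Gradient descent step: move every parameter against its gradient.
    W1 -= learning_rate * dW1
    b1 -= learning_rate * db1
    W2 -= learning_rate * dW2
    b2 -= learning_rate * db2

    if epoch % 500 == 0:
        print(f"epoch {epoch:4d}  cost {cost:.4f}")
</source>

Updating the weights after each individual example rather than after a full pass over the sample gives the online form of the same procedure discussed above.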
10050720@unknown@formal@none@1@S@====Unsupervised learning====@@@@1@2@@danf@17-8-2009 10050730@unknown@formal@none@1@S@In [[unsupervised learning]] we are given some data x, and the cost function to be minimized can be any function of the data x and the network's output, f.@@@@1@29@@danf@17-8-2009 10050740@unknown@formal@none@1@S@The cost function is dependent on the task (what we are trying to model) and our ''a priori'' assumptions (the implicit properties of our model, its parameters and the observed variables).@@@@1@31@@danf@17-8-2009 10050750@unknown@formal@none@1@S@As a trivial example, consider the model f(x) = a, where a is a constant and the cost C=E[(x - f(x))^2].@@@@1@21@@danf@17-8-2009 10050760@unknown@formal@none@1@S@Minimizing this cost will give us a value of a that is equal to the mean of the data.@@@@1@19@@danf@17-8-2009 10050770@unknown@formal@none@1@S@The cost function can be much more complicated.@@@@1@8@@danf@17-8-2009 10050780@unknown@formal@none@1@S@Its form depends on the application: For example in compression it could be related to the [[mutual information]] between x and y.@@@@1@22@@danf@17-8-2009 10050790@unknown@formal@none@1@S@In statistical modelling, it could be related to the [[posterior probability]] of the model given the data.@@@@1@17@@danf@17-8-2009 10050800@unknown@formal@none@1@S@(Note that in both of those examples those quantities would be maximized rather than minimised).@@@@1@15@@danf@17-8-2009 10050810@unknown@formal@none@1@S@Tasks that fall within the paradigm of unsupervised learning are in general [[estimation]] problems; the applications include [[Data clustering|clustering]], the estimation of [[statistical distributions]], [[Data compression|compression]] and [[Bayesian spam filtering|filtering]].@@@@1@30@@danf@17-8-2009 10050820@unknown@formal@none@1@S@====Reinforcement learning====@@@@1@2@@danf@17-8-2009 10050830@unknown@formal@none@1@S@In [[reinforcement learning]], data x is usually not given, but generated by an agent's interactions with the environment.@@@@1@18@@danf@17-8-2009 10050840@unknown@formal@none@1@S@At each point in time t, the agent performs an action y_t and the environment generates an observation x_t and an instantaneous cost c_t, according to some (usually unknown) dynamics.@@@@1@30@@danf@17-8-2009 10050850@unknown@formal@none@1@S@The aim is to discover a ''policy'' for selecting actions that minimizes some measure of a long-term cost, i.e. the expected cumulative cost.@@@@1@23@@danf@17-8-2009 10050860@unknown@formal@none@1@S@The environment's dynamics and the long-term cost for each policy are usually unknown, but can be estimated.@@@@1@17@@danf@17-8-2009 10050870@unknown@formal@none@1@S@More formally, the environment is modeled as a [[Markov decision process]] (MDP) with states {s_1,...,s_n}\\in S and actions {a_1,...,a_m} \\in A with the following probability distributions: the instantaneous cost distribution P(c_t|s_t), the observation distribution P(x_t|s_t) and the transition P(s_{t+1}|s_t, a_t), while a policy is defined as conditional distribution over actions given the observations.@@@@1@53@@danf@17-8-2009 10050880@unknown@formal@none@1@S@Taken together, the two define a [[Markov chain]] (MC).@@@@1@9@@danf@17-8-2009 10050890@unknown@formal@none@1@S@The aim is to discover the policy that minimizes the cost, i.e. 
the MC for which the cost is minimal.@@@@1@20@@danf@17-8-2009 10050900@unknown@formal@none@1@S@ANNs are frequently used in reinforcement learning as part of the overall algorithm.@@@@1@13@@danf@17-8-2009 10050910@unknown@formal@none@1@S@Tasks that fall within the paradigm of reinforcement learning are control problems, [[game|games]] and other [[sequential decision making]] tasks.@@@@1@19@@danf@17-8-2009 10050920@unknown@formal@none@1@S@See also: [[dynamic programming]], [[stochastic control]]@@@@1@6@@danf@17-8-2009 10050930@unknown@formal@none@1@S@===Learning algorithms===@@@@1@2@@danf@17-8-2009 10050940@unknown@formal@none@1@S@Training a neural network model essentially means selecting one model from the set of allowed models (or, in a [[Bayesian]] framework, determining a distribution over the set of allowed models) that minimises the cost criterion.@@@@1@35@@danf@17-8-2009 10050950@unknown@formal@none@1@S@There are numerous algorithms available for training neural network models; most of them can be viewed as a straightforward application of [[Optimization (mathematics)|optimization]] theory and [[statistical estimation]].@@@@1@27@@danf@17-8-2009 10050960@unknown@formal@none@1@S@Most of the algorithms used in training artificial neural networks employ some form of [[gradient descent]].@@@@1@17@@danf@17-8-2009 10050970@unknown@formal@none@1@S@This is done by simply taking the derivative of the cost function with respect to the network parameters and then changing those parameters in a [[gradient-related]] direction.@@@@1@27@@danf@17-8-2009 10050980@unknown@formal@none@1@S@[[Evolutionary methods]], [[simulated annealing]], [[expectation-maximization]] and [[non-parametric methods]] are among other commonly used methods for training neural networks.@@@@1@19@@danf@17-8-2009 10050990@unknown@formal@none@1@S@See also [[machine learning]].@@@@1@4@@danf@17-8-2009 10051000@unknown@formal@none@1@S@Temporal perceptual learning relies on finding temporal relationships in sensory signal streams.@@@@1@12@@danf@17-8-2009 10051010@unknown@formal@none@1@S@In an environment, statistically salient temporal correlations can be found by monitoring the arrival times of sensory signals.@@@@1@18@@danf@17-8-2009 10051020@unknown@formal@none@1@S@This is done by the [[perceptual network]].@@@@1@7@@danf@17-8-2009 10051030@unknown@formal@none@1@S@==Employing artificial neural networks==@@@@1@4@@danf@17-8-2009 10051040@unknown@formal@none@1@S@Perhaps the greatest advantage of ANNs is their ability to be used as an arbitrary function approximation mechanism which 'learns' from observed data.@@@@1@23@@danf@17-8-2009 10051050@unknown@formal@none@1@S@However, using them is not so straightforward and a relatively good understanding of the underlying theory is essential.@@@@1@18@@danf@17-8-2009 10051060@unknown@formal@none@1@S@*Choice of model: This will depend on the data representation and the application.@@@@1@13@@danf@17-8-2009 10051070@unknown@formal@none@1@S@Overly complex models tend to lead to problems with learning.@@@@1@10@@danf@17-8-2009 10051080@unknown@formal@none@1@S@*Learning algorithm: There are numerous tradeoffs between learning algorithms.@@@@1@9@@danf@17-8-2009 10051090@unknown@formal@none@1@S@Almost any algorithm will work well with the ''correct [[hyperparameter]]s'' for training on a particular fixed dataset.@@@@1@17@@danf@17-8-2009 10051100@unknown@formal@none@1@S@However, selecting and tuning an algorithm for training on unseen data requires a significant amount of 
experimentation.@@@@1@17@@danf@17-8-2009 10051110@unknown@formal@none@1@S@*Robustness: If the model, cost function and learning algorithm are selected appropriately the resulting ANN can be extremely robust.@@@@1@19@@danf@17-8-2009 10051120@unknown@formal@none@1@S@With the correct implementation ANNs can be used naturally in [[online algorithm|online learning]] and large dataset applications.@@@@1@17@@danf@17-8-2009 10051130@unknown@formal@none@1@S@Their simple implementation and the existence of mostly local dependencies exhibited in the structure allows for fast, parallel implementations in hardware.@@@@1@21@@danf@17-8-2009 10051140@unknown@formal@none@1@S@==Applications==@@@@1@1@@danf@17-8-2009 10051150@unknown@formal@none@1@S@The utility of artificial neural network models lies in the fact that they can be used to infer a function from observations.@@@@1@22@@danf@17-8-2009 10051160@unknown@formal@none@1@S@This is particularly useful in applications where the complexity of the data or task makes the design of such a function by hand impractical.@@@@1@24@@danf@17-8-2009 10051170@unknown@formal@none@1@S@===Real life applications===@@@@1@3@@danf@17-8-2009 10051180@unknown@formal@none@1@S@The tasks to which artificial neural networks are applied tend to fall within the following broad categories:@@@@1@17@@danf@17-8-2009 10051190@unknown@formal@none@1@S@*[[Function approximation]], or [[regression analysis]], including [[time series prediction]] and modeling.@@@@1@11@@danf@17-8-2009 10051200@unknown@formal@none@1@S@*[[Statistical classification|Classification]], including [[Pattern recognition|pattern]] and sequence recognition, [[novelty detection]] and sequential decision making.@@@@1@14@@danf@17-8-2009 10051210@unknown@formal@none@1@S@*[[Data processing]], including filtering, clustering, blind source separation and compression.@@@@1@10@@danf@17-8-2009 10051220@unknown@formal@none@1@S@Application areas include system identification and control (vehicle control, process control), game-playing and decision making (backgammon, chess, racing), pattern recognition (radar systems, face identification, object recognition and more), sequence recognition (gesture, speech, handwritten text recognition), medical diagnosis, financial applications (automated trading systems), [[data mining]] (or knowledge discovery in databases, "KDD"), visualization and [[e-mail spam]] filtering.@@@@1@55@@danf@17-8-2009 10051230@unknown@formal@none@1@S@==Neural network software==@@@@1@3@@danf@17-8-2009 10051240@unknown@formal@none@1@S@'''Neural network software''' is used to [[Simulation|simulate]], [[research]], [[technology development|develop]] and apply artificial neural networks, [[biological neural network]]s and in some cases a wider array of [[adaptive system]]s.@@@@1@28@@danf@17-8-2009 10051250@unknown@formal@none@1@S@See also [[logistic regression]].@@@@1@4@@danf@17-8-2009 10051260@unknown@formal@none@1@S@==Types of neural networks==@@@@1@4@@danf@17-8-2009 10051270@unknown@formal@none@1@S@===Feedforward neural network===@@@@1@3@@danf@17-8-2009 10051280@unknown@formal@none@1@S@The feedforward neural network was the first and arguably simplest type of artificial neural network devised.@@@@1@16@@danf@17-8-2009 10051290@unknown@formal@none@1@S@In this network, the information moves in only one direction, forward, from the input nodes, through the hidden nodes (if any) and to the output nodes.@@@@1@26@@danf@17-8-2009 10051300@unknown@formal@none@1@S@There are no cycles or loops in the 
network.@@@@1@9@@danf@17-8-2009 10051310@unknown@formal@none@1@S@===Radial basis function (RBF) network===@@@@1@5@@danf@17-8-2009 10051320@unknown@formal@none@1@S@Radial Basis Functions are powerful techniques for interpolation in multidimensional space.@@@@1@11@@danf@17-8-2009 10051330@unknown@formal@none@1@S@A RBF is a function which has built into a distance criterion with respect to a centre.@@@@1@17@@danf@17-8-2009 10051340@unknown@formal@none@1@S@Radial basis functions have been applied in the area of neural networks where they may be used as a replacement for the sigmoidal hidden layer transfer characteristic in Multi-Layer Perceptrons.@@@@1@30@@danf@17-8-2009 10051350@unknown@formal@none@1@S@RBF networks have two layers of processing: In the first, input is mapped onto each RBF in the 'hidden' layer.@@@@1@20@@danf@17-8-2009 10051360@unknown@formal@none@1@S@The RBF chosen is usually a Gaussian.@@@@1@7@@danf@17-8-2009 10051370@unknown@formal@none@1@S@In regression problems the output layer is then a linear combination of hidden layer values representing mean predicted output.@@@@1@19@@danf@17-8-2009 10051380@unknown@formal@none@1@S@The interpretation of this output layer value is the same as a regression model in statistics.@@@@1@16@@danf@17-8-2009 10051390@unknown@formal@none@1@S@In classification problems the output layer is typically a [[sigmoid function]] of a linear combination of hidden layer values, representing a posterior probability.@@@@1@23@@danf@17-8-2009 10051400@unknown@formal@none@1@S@Performance in both cases is often improved by shrinkage techniques, known as [[ridge regression]] in classical statistics and known to correspond to a prior belief in small parameter values (and therefore smooth output functions) in a Bayesian framework.@@@@1@38@@danf@17-8-2009 10051410@unknown@formal@none@1@S@RBF networks have the advantage of not suffering from local minima in the same way as Multi-Layer Perceptrons.@@@@1@18@@danf@17-8-2009 10051420@unknown@formal@none@1@S@This is because the only parameters that are adjusted in the learning process are the linear mapping from hidden layer to output layer.@@@@1@23@@danf@17-8-2009 10051430@unknown@formal@none@1@S@Linearity ensures that the error surface is quadratic and therefore has a single easily found minimum.@@@@1@16@@danf@17-8-2009 10051440@unknown@formal@none@1@S@In regression problems this can be found in one matrix operation.@@@@1@11@@danf@17-8-2009 10051450@unknown@formal@none@1@S@In classification problems the fixed non-linearity introduced by the sigmoid output function is most efficiently dealt with using [[iteratively re-weighted least squares]].@@@@1@22@@danf@17-8-2009 10051460@unknown@formal@none@1@S@RBF networks have the disadvantage of requiring good coverage of the input space by radial basis functions.@@@@1@17@@danf@17-8-2009 10051470@unknown@formal@none@1@S@RBF centres are determined with reference to the distribution of the input data, but without reference to the prediction task.@@@@1@20@@danf@17-8-2009 10051480@unknown@formal@none@1@S@As a result, representational resources may be wasted on areas of the input space that are irrelevant to the learning task.@@@@1@21@@danf@17-8-2009 10051490@unknown@formal@none@1@S@A common solution is to associate each data point with its own centre, although this can make the linear system to be solved in the final layer rather large, and requires shrinkage techniques to avoid overfitting.@@@@1@36@@danf@17-8-2009 10051500@unknown@formal@none@1@S@Associating each input datum with 
an RBF leads naturally to kernel methods such as [[Support Vector Machine]]s and Gaussian Processes (the RBF is the kernel function).@@@@1@26@@danf@17-8-2009 10051510@unknown@formal@none@1@S@All three approaches use a non-linear kernel function to project the input data into a space where the learning problem can be solved using a linear model.@@@@1@27@@danf@17-8-2009 10051520@unknown@formal@none@1@S@Like Gaussian Processes, and unlike SVMs, RBF networks are typically trained in a Maximum Likelihood framework by maximizing the probability (minimizing the error) of the data under the model.@@@@1@29@@danf@17-8-2009 10051530@unknown@formal@none@1@S@SVMs take a different approach to avoiding overfitting by maximizing instead a margin.@@@@1@13@@danf@17-8-2009 10051540@unknown@formal@none@1@S@RBF networks are outperformed in most classification applications by SVMs.@@@@1@10@@danf@17-8-2009 10051550@unknown@formal@none@1@S@In regression applications they can be competitive when the dimensionality of the input space is relatively small.@@@@1@17@@danf@17-8-2009 10051560@unknown@formal@none@1@S@===Kohonen self-organizings network===@@@@1@3@@danf@17-8-2009 10051570@unknown@formal@none@1@S@The self-organizing map (SOM) invented by [[Teuvo Kohonen]] uses a form of [[unsupervised learning]].@@@@1@14@@danf@17-8-2009 10051580@unknown@formal@none@1@S@A set of artificial neurons learn to map points in an input space to coordinates in an output space.@@@@1@19@@danf@17-8-2009 10051590@unknown@formal@none@1@S@The input space can have different dimensions and topology from the output space, and the SOM will attempt to preserve these.@@@@1@21@@danf@17-8-2009 10051600@unknown@formal@none@1@S@===Recurrent network===@@@@1@2@@danf@17-8-2009 10051610@unknown@formal@none@1@S@Contrary to feedforward networks, [[recurrent neural network]]s (RNs) are models with bi-directional data flow.@@@@1@14@@danf@17-8-2009 10051620@unknown@formal@none@1@S@While a feedforward network propagates data linearly from input to output, RNs also propagate data from later processing stages to earlier stages.@@@@1@22@@danf@17-8-2009 10051630@unknown@formal@none@1@S@====Simple recurrent network====@@@@1@3@@danf@17-8-2009 10051640@unknown@formal@none@1@S@A ''simple recurrent network'' (SRN) is a variation on the Multi-Layer Perceptron, sometimes called an "Elman network" due to its invention by [[Jeff Elman]].@@@@1@24@@danf@17-8-2009 10051650@unknown@formal@none@1@S@A three-layer network is used, with the addition of a set of "context units" in the input layer.@@@@1@18@@danf@17-8-2009 10051660@unknown@formal@none@1@S@There are connections from the middle (hidden) layer to these context units fixed with a weight of one.@@@@1@18@@danf@17-8-2009 10051670@unknown@formal@none@1@S@At each time step, the input is propagated in a standard feed-forward fashion, and then a learning rule (usually back-propagation) is applied.@@@@1@22@@danf@17-8-2009 10051680@unknown@formal@none@1@S@The fixed back connections result in the context units always maintaining a copy of the previous values of the hidden units (since they propagate over the connections before the learning rule is applied).@@@@1@33@@danf@17-8-2009 10051690@unknown@formal@none@1@S@Thus the network can maintain a sort of state, allowing it to perform such tasks as sequence-prediction that are beyond the power of a standard Multi-Layer Perceptron.@@@@1@27@@danf@17-8-2009 10051700@unknown@formal@none@1@S@In a ''fully recurrent network'', every neuron receives inputs from every other neuron in the 
network.@@@@1@16@@danf@17-8-2009 10051710@unknown@formal@none@1@S@These networks are not arranged in layers.@@@@1@7@@danf@17-8-2009 10051720@unknown@formal@none@1@S@Usually only a subset of the neurons receive external inputs in addition to the inputs from all the other neurons, and another disjoint subset of neurons report their output externally as well as sending it to all the neurons.@@@@1@39@@danf@17-8-2009 10051730@unknown@formal@none@1@S@These distinctive inputs and outputs perform the function of the input and output layers of a feed-forward or simple recurrent network, and also join all the other neurons in the recurrent processing.@@@@1@32@@danf@17-8-2009 10051740@unknown@formal@none@1@S@====Hopfield network====@@@@1@2@@danf@17-8-2009 10051750@unknown@formal@none@1@S@The [[Hopfield network]] is a recurrent neural network in which all connections are symmetric.@@@@1@14@@danf@17-8-2009 10051760@unknown@formal@none@1@S@Invented by [[John Hopfield]] in 1982, this network guarantees that its dynamics will converge.@@@@1@14@@danf@17-8-2009 10051770@unknown@formal@none@1@S@If the connections are trained using [[Hebbian learning]] then the Hopfield network can perform as robust content-addressable (or [[associative memory|associative]]) memory, resistant to connection alteration.@@@@1@25@@danf@17-8-2009 10051780@unknown@formal@none@1@S@====Echo state network====@@@@1@3@@danf@17-8-2009 10051790@unknown@formal@none@1@S@The [[echo state network]] (ESN) is a [[recurrent neural network]] with a sparsely connected random hidden layer.@@@@1@17@@danf@17-8-2009 10051800@unknown@formal@none@1@S@The weights of output neurons are the only part of the network that can change and be learned.@@@@1@18@@danf@17-8-2009 10051810@unknown@formal@none@1@S@ESNs are good at (re)producing temporal patterns.@@@@1@7@@danf@17-8-2009 10051820@unknown@formal@none@1@S@====Long short term memory network====@@@@1@5@@danf@17-8-2009 10051830@unknown@formal@none@1@S@The [[Long short term memory]] is an artificial neural net structure that, unlike traditional RNNs, does not have the problem of vanishing gradients.@@@@1@22@@danf@17-8-2009 10051840@unknown@formal@none@1@S@It can therefore use long delays and can handle signals that have a mix of low and high frequency components.@@@@1@20@@danf@17-8-2009 10051850@unknown@formal@none@1@S@===Stochastic neural networks===@@@@1@3@@danf@17-8-2009 10051860@unknown@formal@none@1@S@A [[stochastic neural network]] differs from a typical neural network because it introduces random variations into the network.@@@@1@18@@danf@17-8-2009 10051870@unknown@formal@none@1@S@In a probabilistic view of neural networks, such random variations can be viewed as a form of [[statistical sampling]], such as [[Monte Carlo sampling]].@@@@1@24@@danf@17-8-2009 10051880@unknown@formal@none@1@S@====Boltzmann machine====@@@@1@2@@danf@17-8-2009 10051890@unknown@formal@none@1@S@The [[Boltzmann machine]] can be thought of as a noisy Hopfield network.@@@@1@12@@danf@17-8-2009 10051900@unknown@formal@none@1@S@Invented by [[Geoff Hinton]] and [[Terry Sejnowski]] in 1985, the Boltzmann machine is important because it is one of the first neural networks to demonstrate learning of latent variables (hidden units).@@@@1@31@@danf@17-8-2009 10051910@unknown@formal@none@1@S@Boltzmann machine learning was at first slow to simulate, but the [[contrastive divergence algorithm]] of Geoff Hinton (circa 2000) allows models such as Boltzmann machines and ''products of experts'' to be trained much faster.@@@@1@34@@danf@17-8-2009 
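The Hopfield and Hebbian-learning description above can be made concrete with the short sketch below; it is illustrative only, and the stored patterns, the corrupted cue and the asynchronous update schedule are arbitrary choices rather than part of Hopfield's original presentation. The Hebbian rule yields a symmetric weight matrix, and repeatedly updating units until nothing changes recovers a stored pattern from a noisy cue, i.e. content-addressable memory.

<source lang="python">
# Minimal sketch of a Hopfield network used as content-addressable memory.
# Illustrative only: the stored patterns, cue corruption and update order are arbitrary.
import numpy as np

rng = np.random.default_rng(1)

# Patterns to memorise, with units taking values in {-1, +1}.
patterns = np.array([
    [+1, -1, +1, -1, +1, -1, +1, -1],
    [+1, +1, +1, +1, -1, -1, -1, -1],
])
n = patterns.shape[1]

# Hebbian learning: symmetric weights, no self-connections.
W = np.zeros((n, n))
for p in patterns:
    W += np.outer(p, p)
np.fill_diagonal(W, 0.0)
W /= n

def recall(state, max_sweeps=50):
    """Asynchronous updates until the state stops changing (the dynamics converge)."""
    state = state.copy()
    for _ in range(max_sweeps):
        previous = state.copy()
        for i in rng.permutation(n):          # update one unit at a time
            state[i] = 1 if W[i] @ state >= 0 else -1
        if np.array_equal(state, previous):   # fixed point reached
            break
    return state

# Corrupt a stored pattern in two positions and let the network clean it up.
cue = patterns[0].copy()
cue[[1, 2]] *= -1
print("cue     :", cue)
print("recalled:", recall(cue))
print("stored  :", patterns[0])
</source>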
10051920@unknown@formal@none@1@S@===Modular neural networks===@@@@1@3@@danf@17-8-2009 10051930@unknown@formal@none@1@S@Biological studies showed that the human brain functions not as a single massive network, but as a collection of small networks.@@@@1@21@@danf@17-8-2009 10051940@unknown@formal@none@1@S@This realisation gave birth to the concept of [[modular neural networks]], in which several small networks cooperate or compete to solve problems.@@@@1@22@@danf@17-8-2009 10051950@unknown@formal@none@1@S@====Committee of machines====@@@@1@3@@danf@17-8-2009 10051960@unknown@formal@none@1@S@A [[Committee machine|committee of machines]] (CoM) is a collection of different neural networks that together "vote" on a given example.@@@@1@20@@danf@17-8-2009 10051970@unknown@formal@none@1@S@This generally gives a much better result compared to other neural network models.@@@@1@13@@danf@17-8-2009 10051980@unknown@formal@none@1@S@In fact in many cases, starting with the same architecture and training but using different initial random weights gives vastly different networks.@@@@1@22@@danf@17-8-2009 10051990@unknown@formal@none@1@S@A CoM tends to stabilize the result.@@@@1@7@@danf@17-8-2009 10052000@unknown@formal@none@1@S@The CoM is similar to the general [[machine learning]] ''[[bootstrap Aggregating|bagging]]'' method, except that the necessary variety of machines in the committee is obtained by training from different random starting weights rather than training on different randomly selected subsets of the training data.@@@@1@43@@danf@17-8-2009 10052010@unknown@formal@none@1@S@====Associative neural network (ASNN)====@@@@1@4@@danf@17-8-2009 10052020@unknown@formal@none@1@S@The ASNN is an extension of the ''committee of machines'' that goes beyond a simple/weighted average of different models.@@@@1@19@@danf@17-8-2009 10052025@unknown@formal@none@1@S@[http://cogprints.soton.ac.uk/documents/disk0/00/00/14/41/index.html ASNN] represents a combination of an ensemble of feed-forward neural networks and the k-nearest neighbor technique ([[Nearest_neighbor_(pattern_recognition)|kNN]]).@@@@1@18@@danf@17-8-2009 10052030@unknown@formal@none@1@S@It uses the correlation between ensemble responses as a measure of '''distance''' amid the analyzed cases for the kNN.@@@@1@19@@danf@17-8-2009 10052040@unknown@formal@none@1@S@This corrects the bias of the neural network ensemble.@@@@1@9@@danf@17-8-2009 10052050@unknown@formal@none@1@S@An associative neural network has a memory that can coincide with the training set.@@@@1@14@@danf@17-8-2009 10052060@unknown@formal@none@1@S@If new data becomes available, the network instantly improves its predictive ability and provides data approximation (self-learn the data) without a need to retrain the ensemble.@@@@1@26@@danf@17-8-2009 10052070@unknown@formal@none@1@S@Another important feature of ASNN is the possibility to interpret neural network results by analysis of correlations between data cases in the space of models.@@@@1@25@@danf@17-8-2009 10052080@unknown@formal@none@1@S@The method is demonstrated at [http://www.vcclab.org/lab/asnn www.vcclab.org], where you can either use it online or download it.@@@@1@17@@danf@17-8-2009 10052090@unknown@formal@none@1@S@===Other types of networks===@@@@1@4@@danf@17-8-2009 10052100@unknown@formal@none@1@S@These special networks do not fit in any of the previous categories.@@@@1@12@@danf@17-8-2009 10052110@unknown@formal@none@1@S@====Holographic associative memory====@@@@1@3@@danf@17-8-2009 10052120@unknown@formal@none@1@S@[[Holographic 
associative memory|''Holographic associative memory'']] represents a family of analog, correlation-based, associative, stimulus-response memories, where information is mapped onto the phase orientation of complex numbers operating.@@@@1@26@@danf@17-8-2009 10052130@unknown@formal@none@1@S@====Instantaneously trained networks====@@@@1@3@@danf@17-8-2009 10052140@unknown@formal@none@1@S@''[[Instantaneously trained neural networks]]'' (ITNNs) were inspired by the phenomenon of short-term learning that seems to occur instantaneously.@@@@1@18@@danf@17-8-2009 10052150@unknown@formal@none@1@S@In these networks the weights of the hidden and the output layers are mapped directly from the training vector data.@@@@1@20@@danf@17-8-2009 10052160@unknown@formal@none@1@S@Ordinarily, they work on binary data, but versions for continuous data that require small additional processing are also available.@@@@1@19@@danf@17-8-2009 10052170@unknown@formal@none@1@S@====Spiking neural networks====@@@@1@3@@danf@17-8-2009 10052180@unknown@formal@none@1@S@[[Spiking neural network]]s (SNNs) are models which explicitly take into account the timing of inputs.@@@@1@15@@danf@17-8-2009 10052190@unknown@formal@none@1@S@The network input and output are usually represented as series of spikes (delta function or more complex shapes).@@@@1@18@@danf@17-8-2009 10052200@unknown@formal@none@1@S@SNNs have an advantage of being able to process information in the [[time domain]] (signals that vary over time).@@@@1@19@@danf@17-8-2009 10052210@unknown@formal@none@1@S@They are often implemented as recurrent networks.@@@@1@7@@danf@17-8-2009 10052220@unknown@formal@none@1@S@SNNs are also a form of [[pulse computer]].@@@@1@8@@danf@17-8-2009 10052230@unknown@formal@none@1@S@Networks of spiking neurons — and the temporal correlations of neural assemblies in such networks — have been used to model figure/ground separation and region linking in the visual system (see e.g. 
Reitboeck et al. in Haken and Stadler: Synergetics of the Brain.@@@@1@41@@danf@17-8-2009 10052240@unknown@formal@none@1@S@Berlin, 1989).@@@@1@2@@danf@17-8-2009 10052250@unknown@formal@none@1@S@Gerstner and Kistler have a freely available online textbook on [http://diwww.epfl.ch/~gerstner/BUCH.html Spiking Neuron Models].@@@@1@14@@danf@17-8-2009 10052260@unknown@formal@none@1@S@Spiking neural networks with axonal conduction delays exhibit polychronization, and hence could have a potentially unlimited memory capacity.@@@@1@18@@danf@17-8-2009 10052270@unknown@formal@none@1@S@In June 2005 [[IBM]] announced construction of a [[Blue Gene]] [[supercomputer]] dedicated to the simulation of a large recurrent spiking neural network [http://domino.research.ibm.com/comm/pr.nsf/pages/news.20050606_CognitiveIntelligence.html].@@@@1@23@@danf@17-8-2009 10052280@unknown@formal@none@1@S@====Dynamic neural networks====@@@@1@3@@danf@17-8-2009 10052290@unknown@formal@none@1@S@[[Dynamic neural network]]s not only deal with nonlinear multivariate behaviour, but also include (learning of) time-dependent behaviour such as various transient phenomena and delay effects.@@@@1@25@@danf@17-8-2009 10052300@unknown@formal@none@1@S@====Cascading neural networks====@@@@1@3@@danf@17-8-2009 10052310@unknown@formal@none@1@S@''Cascade-Correlation'' is an architecture and [[supervised learning]] [[algorithm]] developed by [[Scott Fahlman]] and [[Christian Lebiere]].@@@@1@15@@danf@17-8-2009 10052320@unknown@formal@none@1@S@Instead of just adjusting the weights in a network of fixed topology, Cascade-Correlation begins with a minimal network, then automatically trains and adds new hidden units one by one, creating a multi-layer structure.@@@@1@33@@danf@17-8-2009 10052330@unknown@formal@none@1@S@Once a new hidden unit has been added to the network, its input-side weights are frozen.@@@@1@16@@danf@17-8-2009 10052340@unknown@formal@none@1@S@This unit then becomes a permanent feature-detector in the network, available for producing outputs or for creating other, more complex feature detectors.@@@@1@22@@danf@17-8-2009 10052350@unknown@formal@none@1@S@The Cascade-Correlation architecture has several advantages over existing algorithms: it learns very quickly, the network determines its own size and topology, it retains the structures it has built even if the training set changes, and it requires no [[back-propagation]] of error signals through the connections of the network.@@@@1@48@@danf@17-8-2009 10052360@unknown@formal@none@1@S@See: [[Cascade correlation algorithm]].@@@@1@4@@danf@17-8-2009 10052370@unknown@formal@none@1@S@====Neuro-fuzzy networks====@@@@1@2@@danf@17-8-2009 10052380@unknown@formal@none@1@S@A neuro-fuzzy network is a [[fuzzy inference system]] in the body of an artificial neural network.@@@@1@16@@danf@17-8-2009 10052390@unknown@formal@none@1@S@Depending on the ''FIS'' type, there are several layers that simulate the processes involved in a ''fuzzy inference'' like fuzzification, inference, aggregation and defuzzification.@@@@1@24@@danf@17-8-2009 10052400@unknown@formal@none@1@S@Embedding an ''FIS'' in a general structure of an ''ANN'' has the benefit of using available ''ANN'' training methods to find the parameters of a fuzzy system.@@@@1@27@@danf@17-8-2009 10052410@unknown@formal@none@1@S@====Holosemantic neural networks====@@@@1@3@@danf@17-8-2009 10052420@unknown@formal@none@1@S@The holosemantic neural network invented by Manfred Hoffleisch uses a kind of genetic algorithm to build a multidimensional 
structure.@@@@1@19@@danf@17-8-2009 10052430@unknown@formal@none@1@S@It takes into account the timing of inputs.@@@@1@8@@danf@17-8-2009 10052440@unknown@formal@none@1@S@====Compositional pattern-producing networks====@@@@1@3@@danf@17-8-2009 10052450@unknown@formal@none@1@S@[[Compositional pattern-producing network]]s (CPPNs) are a variation of ANNs which differ in their set of activation functions and how they are applied.@@@@1@22@@danf@17-8-2009 10052460@unknown@formal@none@1@S@While typical ANNs often contain only [[sigmoid function]]s (and sometimes [[Gaussian function]]s), CPPNs can include both types of functions and many others.@@@@1@22@@danf@17-8-2009 10052470@unknown@formal@none@1@S@Furthermore, unlike typical ANNs, CPPNs are applied across the entire space of possible inputs so that they can represent a complete image.@@@@1@22@@danf@17-8-2009 10052480@unknown@formal@none@1@S@Since they are compositions of functions, CPPNs in effect encode images at infinite resolution and can be sampled for a particular display at whatever resolution is optimal.@@@@1@27@@danf@17-8-2009 10052490@unknown@formal@none@1@S@==Theoretical properties==@@@@1@2@@danf@17-8-2009 10052500@unknown@formal@none@1@S@===Computational power===@@@@1@2@@danf@17-8-2009 10052510@unknown@formal@none@1@S@The multi-layer perceptron (MLP) is a universal function approximator, as proven by the [[Cybenko theorem]].@@@@1@15@@danf@17-8-2009 10052520@unknown@formal@none@1@S@However, the proof is not constructive regarding the number of neurons required or the settings of the weights.@@@@1@18@@danf@17-8-2009 10052530@unknown@formal@none@1@S@Work by [[Hava T. Siegelmann]] and [[Eduardo D. Sontag]] has provided a proof that a specific recurrent architecture with rational valued weights (as opposed to the commonly used floating point approximations) has the full power of a [[Universal Turing Machine]].@@@@1@40@@danf@17-8-2009 10052540@unknown@formal@none@1@S@They have further shown that the use of irrational values for weights results in a machine with trans-Turing power.@@@@1@19@@danf@17-8-2009 10052550@unknown@formal@none@1@S@===Capacity===@@@@1@1@@danf@17-8-2009 10052560@unknown@formal@none@1@S@Artificial neural network models have a property called 'capacity', which roughly corresponds to their ability to model any given function.@@@@1@20@@danf@17-8-2009 10052570@unknown@formal@none@1@S@It is related to the amount of information that can be stored in the network and to the notion of complexity.@@@@1@21@@danf@17-8-2009 10052580@unknown@formal@none@1@S@===Convergence===@@@@1@1@@danf@17-8-2009 10052590@unknown@formal@none@1@S@Nothing can be said in general about convergence since it depends on a number of factors.@@@@1@16@@danf@17-8-2009 10052600@unknown@formal@none@1@S@Firstly, there may exist many local minima.@@@@1@7@@danf@17-8-2009 10052610@unknown@formal@none@1@S@This depends on the cost function and the model.@@@@1@9@@danf@17-8-2009 10052620@unknown@formal@none@1@S@Secondly, the optimization method used might not be guaranteed to converge when far away from a local minimum.@@@@1@18@@danf@17-8-2009 10052630@unknown@formal@none@1@S@Thirdly, for a very large amount of data or parameters, some methods become impractical.@@@@1@14@@danf@17-8-2009 10052640@unknown@formal@none@1@S@In general, it has been found that theoretical guarantees regarding convergence are not always a very reliable guide to practical application.@@@@1@21@@danf@17-8-2009 10052650@unknown@formal@none@1@S@===Generalisation and statistics===@@@@1@3@@danf@17-8-2009 
10052660@unknown@formal@none@1@S@In applications where the goal is to create a system that generalises well in unseen examples, the problem of overtraining has emerged.@@@@1@22@@danf@17-8-2009 10052670@unknown@formal@none@1@S@This arises in overcomplex or overspecified systems when the capacity of the network significantly exceeds the needed free parameters.@@@@1@19@@danf@17-8-2009 10052680@unknown@formal@none@1@S@There are two schools of thought for avoiding this problem: The first is to use cross-validation and similar techniques to check for the presence of overtraining and optimally select hyperparameters such as to minimise the generalisation error.@@@@1@37@@danf@17-8-2009 10052690@unknown@formal@none@1@S@The second is to use some form of ''regularisation''.@@@@1@9@@danf@17-8-2009 10052700@unknown@formal@none@1@S@This is a concept that emerges naturally in a probabilistic (Bayesian) framework, where the regularisation can be performed by putting a larger prior probability over simpler models; but also in statistical learning theory, where the goal is to minimise over two quantities: the 'empirical risk' and the 'structural risk', which roughly correspond to the error over the training set and the predicted error in unseen data due to overfitting.@@@@1@69@@danf@17-8-2009 10052710@unknown@formal@none@1@S@Supervised neural networks that use an [[Mean squared error|MSE]] cost function can use formal statistical methods to determine the confidence of the trained model.@@@@1@24@@danf@17-8-2009 10052720@unknown@formal@none@1@S@The MSE on a validation set can be used as an estimate for variance.@@@@1@14@@danf@17-8-2009 10052730@unknown@formal@none@1@S@This value can then be used to calculate the [[confidence interval]] of the output of the network, assuming a [[normal distribution]].@@@@1@21@@danf@17-8-2009 10052740@unknown@formal@none@1@S@A confidence analysis made this way is statistically valid as long as the output [[probability distribution]] stays the same and the network is not modified.@@@@1@25@@danf@17-8-2009 10052750@unknown@formal@none@1@S@By assigning a softmax activation function on the output layer of the neural network (or a softmax component in a component-based neural network) for categorical target variables, the outputs can be interpreted as posterior probabilities.@@@@1@35@@danf@17-8-2009 10052760@unknown@formal@none@1@S@This is very useful in classification as it gives a certainty measure on classifications.@@@@1@14@@danf@17-8-2009 10052770@unknown@formal@none@1@S@The softmax activation function: y_i=\\frac{e^{x_i}}{\\sum_{j=1}^c e^{x_j}}@@@@1@6@@danf@17-8-2009 10052780@unknown@formal@none@1@S@===Dynamic properties===@@@@1@2@@danf@17-8-2009 10052790@unknown@formal@none@1@S@Various techniques originally developed for studying disordered magnetic systems (i.e. the [[spin glass]]) have been successfully applied to simple neural network architectures, such as the Hopfield network.@@@@1@27@@danf@17-8-2009 10052800@unknown@formal@none@1@S@Influential work by E. Gardner and B. Derrida has revealed many interesting properties about perceptrons with real-valued synaptic weights, while later work by W. Krauth and M. Mezard has extended these principles to binary-valued synapses.@@@@1@35@@danf@17-8-2009 10060010@unknown@formal@none@1@S@
Association for Computational Linguistics
@@@@1@4@@danf@17-8-2009 10060020@unknown@formal@none@1@S@The '''Association for Computational Linguistics''' ('''ACL''') is the international scientific and professional society for people working on problems involving [[natural language and computation]].@@@@1@23@@danf@17-8-2009 10060030@unknown@formal@none@1@S@An annual meeting is held each summer in locations where significant [[computational linguistics]] research is carried out.@@@@1@17@@danf@17-8-2009 10060040@unknown@formal@none@1@S@It was founded in [[1962]], originally named the '''Association for Machine Translation and Computational Linguistics''' ('''AMTCL''').@@@@1@16@@danf@17-8-2009 10060050@unknown@formal@none@1@S@It became the ACL in [[1968]].@@@@1@6@@danf@17-8-2009 10060060@unknown@formal@none@1@S@The ACL has [[Europe]]an and [[North America]]n chapters, the [[European Chapter of the Association for Computational Linguistics]] (EACL) and the [[North American Chapter of the Association for Computational Linguistics]] (NAACL).@@@@1@30@@danf@17-8-2009 10060070@unknown@formal@none@1@S@The ACL journal, [[Computational Linguistics (journal)|''Computational Linguistics'']], continues to be the primary forum for research on computational linguistics and [[natural language processing]].@@@@1@22@@danf@17-8-2009 10060080@unknown@formal@none@1@S@Since [[1988]], the journal has been published for the ACL by [[MIT Press]].@@@@1@13@@danf@17-8-2009 10060090@unknown@formal@none@1@S@The ACL book series, [[Studies_in_NLP|''Studies in Natural Language Processing'']], is published by [[Cambridge University Press]].@@@@1@15@@danf@17-8-2009 10060100@unknown@formal@none@1@S@==Special Interest Groups==@@@@1@3@@danf@17-8-2009 10060110@unknown@formal@none@1@S@ACL has a large number of Special Interest Groups (SIGs), focusing on specific areas of natural language processing.@@@@1@18@@danf@17-8-2009 10060120@unknown@formal@none@1@S@Some current SIGs within ACL are:@@@@1@6@@danf@17-8-2009 10060130@unknown@formal@none@1@S@*Linguistic data and corpus-based approaches: [http://www.aclweb.org/sigdat SIGDAT]@@@@1@7@@danf@17-8-2009 10060140@unknown@formal@none@1@S@*Dialogue Processing: [http://www.aclweb.org/sigdial SIGDIAL]@@@@1@4@@danf@17-8-2009 10060150@unknown@formal@none@1@S@*Natural Language Generation: [http://www.siggen.org SIGGEN]@@@@1@5@@danf@17-8-2009 10060160@unknown@formal@none@1@S@*Lexicon: [http://www.aclweb.org/siglex SIGLEX]@@@@1@3@@danf@17-8-2009 10060170@unknown@formal@none@1@S@*Mathematics of Language: [http://molweb.org SIGMOL]@@@@1@5@@danf@17-8-2009 10060180@unknown@formal@none@1@S@*Computational Morphology and Phonology: [http://salad.cs.swarthmore.edu/sigphon SIGMORPHON]@@@@1@6@@danf@17-8-2009 10060190@unknown@formal@none@1@S@*Computational Semantics: [http://www.aclweb.org/sigsem/ SIGSEM]@@@@1@4@@danf@17-8-2009 10070010@unknown@formal@none@1@S@
Babel Fish (website)
@@@@1@3@@danf@17-8-2009 10070020@unknown@formal@none@1@S@'''Babel Fish''' is a [[World Wide Web|web]]-based application on [[Yahoo!]] that [[machine translation|machine translates]] text or web pages from one of several languages into another.@@@@1@25@@danf@17-8-2009 10070030@unknown@formal@none@1@S@Developed by [[AltaVista]], the application is named after the fictional animal used for instantaneous [[language]] [[translation]] in [[Douglas Adams]]'s series ''[[The Hitchhiker's Guide to the Galaxy]].''@@@@1@26@@danf@17-8-2009 10070040@unknown@formal@none@1@S@In turn the fish is a reference to the [[biblical]] account of the city of [[Babel]] and the [[Confusion of tongues|various languages]] said to have arisen there.@@@@1@27@@danf@17-8-2009 10070050@unknown@formal@none@1@S@The translation technology for Babel Fish is provided by [[SYSTRAN]], whose technology also powers a number of other sites and portals.@@@@1@21@@danf@17-8-2009 10070060@unknown@formal@none@1@S@It translates among [[English language|English]], [[Simplified Chinese]], [[Traditional Chinese]], [[Dutch language|Dutch]], [[French language|French]], [[German language|German]], [[Greek language|Greek]], [[Italian language|Italian]], [[Japanese language|Japanese]], [[Korean language|Korean]], [[Portuguese language|Portuguese]], [[Russian language|Russian]], and [[Spanish language|Spanish]].@@@@1@30@@danf@17-8-2009 10070070@unknown@formal@none@1@S@The service makes no claim to produce a perfect translation.@@@@1@10@@danf@17-8-2009 10070080@unknown@formal@none@1@S@A number of humour sites have sprung up that use the Babel Fish service to translate back and forth between one or more languages (a so-called [[round-trip translation]]).@@@@1@28@@danf@17-8-2009 10070090@unknown@formal@none@1@S@After a long existence at babelfish.altavista.com, the site was moved on May 9 2008 to babelfish.yahoo.com.@@@@1@16@@danf@17-8-2009 10080010@unknown@formal@none@1@S@
Bioinformatics
@@@@1@1@@danf@17-8-2009 10080020@unknown@formal@none@1@S@'''Bioinformatics''' and '''computational biology''' involve the use of techniques including [[applied mathematics]], [[informatics]], [[statistics]], [[computer science]], [[artificial intelligence]], [[chemistry]], and [[biochemistry]] to solve [[biology|biological]] problems usually on the [[molecular]] level.@@@@1@30@@danf@17-8-2009 10080030@unknown@formal@none@1@S@The core principle of these techniques is using computing resources in order to solve problems on scales of magnitude far too great for human discernment.@@@@1@25@@danf@17-8-2009 10080040@unknown@formal@none@1@S@Research in computational biology often overlaps with [[systems biology]].@@@@1@9@@danf@17-8-2009 10080050@unknown@formal@none@1@S@Major research efforts in the field include [[sequence alignment]], [[gene finding]], [[genome assembly]], [[protein structural alignment|protein structure alignment]], [[protein structure prediction]], prediction of [[gene expression]] and [[protein-protein interactions]], and the modeling of [[evolution]].@@@@1@33@@danf@17-8-2009 10080060@unknown@formal@none@1@S@==Introduction==@@@@1@1@@danf@17-8-2009 10080070@unknown@formal@none@1@S@The terms '''''bioinformatics''''' and ''[[computational biology]]'' are often used interchangeably.@@@@1@10@@danf@17-8-2009 10080080@unknown@formal@none@1@S@However ''bioinformatics'' more properly refers to the creation and advancement of algorithms, computational and statistical techniques, and theory to solve formal and practical problems arising from the management and analysis of biological data.@@@@1@33@@danf@17-8-2009 10080090@unknown@formal@none@1@S@Computational biology, on the other hand, refers to hypothesis-driven investigation of a specific biological problem using computers, carried out with experimental or simulated data, with the primary goal of discovery and the advancement of biological knowledge.@@@@1@36@@danf@17-8-2009 10080100@unknown@formal@none@1@S@Put more simply, bioinformatics is concerned with the information while computational biology is concerned with the hypotheses.@@@@1@17@@danf@17-8-2009 10080110@unknown@formal@none@1@S@A similar distinction is made by [[NIH|National Institutes of Health]] in their [http://www.bisti.nih.gov/CompuBioDef.pdf working definitions of Bioinformatics and Computational Biology], where it is further emphasized that there is a tight coupling of developments and knowledge between the more hypothesis-driven research in computational biology and technique-driven research in bioinformatics.@@@@1@48@@danf@17-8-2009 10080120@unknown@formal@none@1@S@Bioinformatics is also often specified as an applied subfield of the more general discipline of [[Biomedical informatics]].@@@@1@17@@danf@17-8-2009 10080130@unknown@formal@none@1@S@A common thread in projects in bioinformatics and computational biology is the use of mathematical tools to extract useful information from data produced by high-throughput biological techniques such as [[genome sequencing]].@@@@1@31@@danf@17-8-2009 10080140@unknown@formal@none@1@S@A representative problem in bioinformatics is the assembly of high-quality genome sequences from fragmentary "shotgun" DNA [[sequencing]].@@@@1@17@@danf@17-8-2009 10080150@unknown@formal@none@1@S@Other common problems include the study of [[gene regulation]] to perform [[expression profiling]] using data from [[DNA microarray|microarrays]] or [[mass spectrometry]].@@@@1@21@@danf@17-8-2009 10080160@unknown@formal@none@1@S@==Major research 
areas==@@@@1@3@@danf@17-8-2009 10080170@unknown@formal@none@1@S@===Sequence analysis===@@@@1@2@@danf@17-8-2009 10080180@unknown@formal@none@1@S@Since the [[Phi-X174 phage|Phage Φ-X174]] was [[sequencing|sequenced]] in 1977, the [[DNA sequence]]s of hundreds of organisms have been decoded and stored in databases.@@@@1@23@@danf@17-8-2009 10080190@unknown@formal@none@1@S@The information is analyzed to determine genes that encode [[polypeptides]], as well as regulatory sequences.@@@@1@15@@danf@17-8-2009 10080200@unknown@formal@none@1@S@A comparison of genes within a [[species]] or between different species can show similarities between protein functions, or relations between species (the use of [[molecular systematics]] to construct [[phylogenetic tree]]s).@@@@1@30@@danf@17-8-2009 10080210@unknown@formal@none@1@S@With the growing amount of data, it long ago became impractical to analyze DNA sequences manually.@@@@1@16@@danf@17-8-2009 10080220@unknown@formal@none@1@S@Today, [[computer program]]s are used to search the [[genome]] of thousands of organisms, containing billions of [[nucleotide]]s.@@@@1@17@@danf@17-8-2009 10080230@unknown@formal@none@1@S@These programs would compensate for mutations (exchanged, deleted or inserted bases) in the DNA sequence, in order to identify sequences that are related, but not identical.@@@@1@26@@danf@17-8-2009 10080240@unknown@formal@none@1@S@A variant of this [[sequence alignment]] is used in the sequencing process itself.@@@@1@13@@danf@17-8-2009 10080250@unknown@formal@none@1@S@The so-called [[shotgun sequencing]] technique (which was used, for example, by [[The Institute for Genomic Research]] to sequence the first bacterial genome, ''Haemophilus influenzae'') does not give a sequential list of nucleotides, but instead the sequences of thousands of small DNA fragments (each about 600-800 nucleotides long).@@@@1@47@@danf@17-8-2009 10080260@unknown@formal@none@1@S@The ends of these fragments overlap and, when aligned in the right way, make up the complete genome.@@@@1@18@@danf@17-8-2009 10080270@unknown@formal@none@1@S@Shotgun sequencing yields sequence data quickly, but the task of assembling the fragments can be quite complicated for larger genomes.@@@@1@20@@danf@17-8-2009 10080280@unknown@formal@none@1@S@In the case of the [[Human Genome Project]], it took several months of CPU time (on a circa-2000 vintage [[DEC Alpha]] computer) to assemble the fragments.@@@@1@26@@danf@17-8-2009 10080290@unknown@formal@none@1@S@Shotgun sequencing is the method of choice for virtually all genomes sequenced today, and genome assembly algorithms are a critical area of bioinformatics research.@@@@1@24@@danf@17-8-2009 10080300@unknown@formal@none@1@S@Another aspect of bioinformatics in sequence analysis is the automatic [[gene finding|search for genes]] and regulatory sequences within a genome.@@@@1@20@@danf@17-8-2009 10080310@unknown@formal@none@1@S@Not all of the nucleotides within a genome are genes.@@@@1@10@@danf@17-8-2009 10080320@unknown@formal@none@1@S@Within the genome of higher organisms, large parts of the DNA do not serve any obvious purpose.@@@@1@17@@danf@17-8-2009 10080330@unknown@formal@none@1@S@This so-called [[junk DNA]] may, however, contain unrecognized functional elements.@@@@1@10@@danf@17-8-2009 10080340@unknown@formal@none@1@S@Bioinformatics helps to bridge the gap between genome and [[proteome]] projects--for example, in the use of DNA sequences for protein identification.@@@@1@21@@danf@17-8-2009 10080350@unknown@formal@none@1@S@''See also:'' [[sequence 
analysis]], [[sequence profiling tool]], [[sequence motif]].@@@@1@9@@danf@17-8-2009 10080360@unknown@formal@none@1@S@===Genome annotation===@@@@1@2@@danf@17-8-2009 10080370@unknown@formal@none@1@S@In the context of [[genomics]], '''annotation''' is the process of marking the genes and other biological features in a DNA sequence.@@@@1@21@@danf@17-8-2009 10080380@unknown@formal@none@1@S@The first genome annotation software system was designed in 1995 by Dr. Owen White, who was part of the team that sequenced and analyzed the first genome of a free-living organism to be decoded, the bacterium ''[[Haemophilus influenzae]]''.@@@@1@38@@danf@17-8-2009 10080390@unknown@formal@none@1@S@Dr. White built a software system to find the genes (places in the DNA sequence that encode a protein), the transfer RNA, and other features, and to make initial assignments of function to those genes.@@@@1@35@@danf@17-8-2009 10080400@unknown@formal@none@1@S@Most current genome annotation systems work similarly, but the programs available for analysis of genomic DNA are constantly changing and improving.@@@@1@21@@danf@17-8-2009 10080410@unknown@formal@none@1@S@===Computational evolutionary biology===@@@@1@3@@danf@17-8-2009 10080420@unknown@formal@none@1@S@[[Evolutionary biology]] is the study of the origin and descent of [[species]], as well as their change over time.@@@@1@19@@danf@17-8-2009 10080430@unknown@formal@none@1@S@Informatics has assisted evolutionary biologists in several key ways; it has enabled researchers to:@@@@1@14@@danf@17-8-2009 10080440@unknown@formal@none@1@S@*trace the evolution of a large number of organisms by measuring changes in their [[DNA]], rather than through [[physical taxonomy]] or physiological observations alone,@@@@1@24@@danf@17-8-2009 10080450@unknown@formal@none@1@S@*more recently, compare entire [[genomes]], which permits the study of more complex evolutionary events, such as [[gene duplication]], [[lateral gene transfer]], and the prediction of factors important in bacterial [[speciation]],@@@@1@30@@danf@17-8-2009 10080460@unknown@formal@none@1@S@*build complex computational models of populations to predict the outcome of the system over time@@@@1@15@@danf@17-8-2009 10080470@unknown@formal@none@1@S@*track and share information on an increasingly large number of species and organisms@@@@1@13@@danf@17-8-2009 10080480@unknown@formal@none@1@S@Future work endeavours to reconstruct the now more complex [[Evolutionary tree|tree of life]].@@@@1@13@@danf@17-8-2009 10080490@unknown@formal@none@1@S@The area of research within [[computer science]] that uses [[genetic algorithm]]s is sometimes confused with [[computational evolutionary biology]], but the two areas are unrelated.@@@@1@24@@danf@17-8-2009 10080500@unknown@formal@none@1@S@===Measuring biodiversity===@@@@1@2@@danf@17-8-2009 10080510@unknown@formal@none@1@S@[[Biodiversity]] of an ecosystem might be defined as the total genomic complement of a particular environment, from all of the species present, whether it is a biofilm in an abandoned mine, a drop of sea water, a scoop of soil, or the entire [[biosphere]] of the planet [[Earth]].@@@@1@48@@danf@17-8-2009 10080520@unknown@formal@none@1@S@Databases are used to collect the [[species]] names, descriptions, distributions, genetic information, status and size of [[population]]s, [[Habitat (ecology)|habitat]] needs, and how each organism interacts with other species.@@@@1@28@@danf@17-8-2009 10080530@unknown@formal@none@1@S@Specialized [[computer software|software]] programs are used to 
find, visualize, and analyze the information, and most importantly, communicate it to other people.@@@@1@21@@danf@17-8-2009 10080540@unknown@formal@none@1@S@Computer simulations model such things as population dynamics, or calculate the cumulative genetic health of a breeding pool (in [[agriculture]]) or endangered population (in [[conservation ecology|conservation]]).@@@@1@26@@danf@17-8-2009 10080550@unknown@formal@none@1@S@One very exciting potential of this field is that entire [[DNA]] sequences, or [[genome]]s of [[endangered species]] can be preserved, allowing the results of Nature's genetic experiment to be remembered ''[[in silico]]'', and possibly reused in the future, even if that species is eventually lost.@@@@1@45@@danf@17-8-2009 10080560@unknown@formal@none@1@S@''Important projects:'' [http://www.sp2000.org/ Species 2000 project]; [http://www.ubio.org/ uBio Project].@@@@1@9@@danf@17-8-2009 10080570@unknown@formal@none@1@S@===Analysis of gene expression===@@@@1@4@@danf@17-8-2009 10080580@unknown@formal@none@1@S@The [[gene expression|expression]] of many genes can be determined by measuring [[mRNA]] levels with multiple techniques including [[DNA microarray|microarrays]], [[expressed sequence tag|expressed cDNA sequence tag]] (EST) sequencing, [[serial analysis of gene expression]] (SAGE) tag sequencing, [[massively parallel signature sequencing]] (MPSS), or various applications of multiplexed in-situ hybridization.@@@@1@47@@danf@17-8-2009 10080590@unknown@formal@none@1@S@All of these techniques are extremely noise-prone and/or subject to bias in the biological measurement, and a major research area in computational biology involves developing statistical tools to separate [[signal (information theory)|signal]] from [[noise]] in high-throughput gene expression studies.@@@@1@39@@danf@17-8-2009 10080600@unknown@formal@none@1@S@Such studies are often used to determine the genes implicated in a disorder: one might compare microarray data from cancerous [[epithelial]] cells to data from non-cancerous cells to determine the transcripts that are up-regulated and down-regulated in a particular population of cancer cells.@@@@1@43@@danf@17-8-2009 10080610@unknown@formal@none@1@S@===Analysis of regulation===@@@@1@3@@danf@17-8-2009 10080620@unknown@formal@none@1@S@Regulation is the complex orchestration of events starting with an extracellular signal such as a [[hormone]] and leading to an increase or decrease in the activity of one or more [[protein]]s.@@@@1@31@@danf@17-8-2009 10080630@unknown@formal@none@1@S@Bioinformatics techniques have been applied to explore various steps in this process.@@@@1@12@@danf@17-8-2009 10080640@unknown@formal@none@1@S@For example, [[promoter analysis]] involves the identification and study of [[sequence motif]]s in the DNA surrounding the coding region of a gene.@@@@1@22@@danf@17-8-2009 10080650@unknown@formal@none@1@S@These motifs influence the extent to which that region is transcribed into mRNA.@@@@1@13@@danf@17-8-2009 10080660@unknown@formal@none@1@S@Expression data can be used to infer gene regulation: one might compare [[microarray]] data from a wide variety of states of an organism to form hypotheses about the genes involved in each state.@@@@1@33@@danf@17-8-2009 10080670@unknown@formal@none@1@S@In a single-cell organism, one might compare stages of the [[cell cycle]], along with various stress conditions (heat shock, starvation, etc.).@@@@1@21@@danf@17-8-2009 10080680@unknown@formal@none@1@S@One can then apply [[cluster analysis|clustering 
algorithms]] to that expression data to determine which genes are co-expressed.@@@@1@17@@danf@17-8-2009 10080690@unknown@formal@none@1@S@For example, the upstream regions (promoters) of co-expressed genes can be searched for over-represented [[regulatory elements]].@@@@1@16@@danf@17-8-2009 10080700@unknown@formal@none@1@S@===Analysis of protein expression===@@@@1@4@@danf@17-8-2009 10080710@unknown@formal@none@1@S@[[Protein microarray]]s and high throughput (HT) [[mass spectrometry]] (MS) can provide a snapshot of the proteins present in a biological sample.@@@@1@21@@danf@17-8-2009 10080720@unknown@formal@none@1@S@Bioinformatics is very much involved in making sense of protein microarray and HT MS data; the former approach faces problems similar to those with microarrays targeted at mRNA, while the latter involves the problem of matching large amounts of mass data against predicted masses from protein sequence databases, and the complicated statistical analysis of samples where multiple, but incomplete, peptides from each protein are detected.@@@@1@63@@danf@17-8-2009 10080730@unknown@formal@none@1@S@===Analysis of mutations in cancer===@@@@1@5@@danf@17-8-2009 10080740@unknown@formal@none@1@S@In cancer, the genomes of affected cells are rearranged in complex or even unpredictable ways.@@@@1@15@@danf@17-8-2009 10080750@unknown@formal@none@1@S@Massive sequencing efforts are used to identify previously unknown [[point mutation]]s in a variety of [[gene]]s in [[cancer]].@@@@1@18@@danf@17-8-2009 10080760@unknown@formal@none@1@S@Bioinformaticians continue to produce specialized automated systems to manage the sheer volume of sequence data produced, and they create new algorithms and software to compare the sequencing results to the growing collection of [[human genome]] sequences and [[germline]] polymorphisms.@@@@1@39@@danf@17-8-2009 10080770@unknown@formal@none@1@S@New physical detection technologies are employed, such as [[oligonucleotide]] microarrays to identify chromosomal gains and losses (called [[comparative genomic hybridization]]), and [[single nucleotide polymorphism]] arrays to detect known ''point mutations''.@@@@1@30@@danf@17-8-2009 10080780@unknown@formal@none@1@S@These detection methods simultaneously measure several hundred thousand sites throughout the genome, and when used in high-throughput to measure thousands of samples, generate [[terabyte]]s of data per experiment.@@@@1@28@@danf@17-8-2009 10080790@unknown@formal@none@1@S@Again, the massive amounts and new types of data generate new opportunities for bioinformaticians.@@@@1@14@@danf@17-8-2009 10080800@unknown@formal@none@1@S@The data is often found to contain considerable variability, or [[noise]], and thus [[Hidden Markov model]] and [[change-point analysis]] methods are being developed to infer real [[copy number variation|copy number]] changes.@@@@1@31@@danf@17-8-2009 10080810@unknown@formal@none@1@S@Another type of data that requires novel informatics development is the analysis of lesions found to be recurrent among many tumors.@@@@1@22@@danf@17-8-2009 10080820@unknown@formal@none@1@S@===Prediction of protein structure===@@@@1@4@@danf@17-8-2009 10080830@unknown@formal@none@1@S@Protein structure prediction is another important application of bioinformatics.@@@@1@9@@danf@17-8-2009 10080840@unknown@formal@none@1@S@The [[amino acid]] sequence of a protein, the so-called [[primary structure]], can be easily determined from the sequence on the gene that codes for it.@@@@1@25@@danf@17-8-2009 10080850@unknown@formal@none@1@S@In the vast 
majority of cases, this primary structure uniquely determines a structure in its native environment.@@@@1@17@@danf@17-8-2009 10080860@unknown@formal@none@1@S@(Of course, there are exceptions, such as the [[bovine spongiform encephalopathy]] - aka [[Mad Cow Disease]] - [[prion]].)@@@@1@18@@danf@17-8-2009 10080870@unknown@formal@none@1@S@Knowledge of this structure is vital in understanding the function of the protein.@@@@1@13@@danf@17-8-2009 10080880@unknown@formal@none@1@S@For lack of better terms, structural information is usually classified as one of ''[[secondary structure|secondary]]'', ''[[tertiary structure|tertiary]]'' and ''[[quaternary structure|quaternary]]'' structure.@@@@1@21@@danf@17-8-2009 10080890@unknown@formal@none@1@S@A viable general solution to such predictions remains an open problem.@@@@1@11@@danf@17-8-2009 10080900@unknown@formal@none@1@S@As of now, most efforts have been directed towards heuristics that work most of the time.@@@@1@16@@danf@17-8-2009 10080910@unknown@formal@none@1@S@One of the key ideas in bioinformatics is the notion of [[homology (biology)|homology]].@@@@1@13@@danf@17-8-2009 10080920@unknown@formal@none@1@S@In the genomic branch of bioinformatics, homology is used to predict the function of a gene: if the sequence of gene ''A'', whose function is known, is homologous to the sequence of gene ''B,'' whose function is unknown, one could infer that B may share A's function.@@@@1@47@@danf@17-8-2009 10080930@unknown@formal@none@1@S@In the structural branch of bioinformatics, homology is used to determine which parts of a protein are important in structure formation and interaction with other proteins.@@@@1@26@@danf@17-8-2009 10080940@unknown@formal@none@1@S@In a technique called homology modeling, this information is used to predict the structure of a protein once the structure of a homologous protein is known.@@@@1@26@@danf@17-8-2009 10080950@unknown@formal@none@1@S@This currently remains the only way to predict protein structures reliably.@@@@1@11@@danf@17-8-2009 10080960@unknown@formal@none@1@S@One example of this is the similar protein homology between hemoglobin in humans and the hemoglobin in legumes ([[leghemoglobin]]).@@@@1@19@@danf@17-8-2009 10080970@unknown@formal@none@1@S@Both serve the same purpose of transporting oxygen in the organism.@@@@1@11@@danf@17-8-2009 10080980@unknown@formal@none@1@S@Though both of these proteins have completely different amino acid sequences, their protein structures are virtually identical, which reflects their near identical purposes.@@@@1@23@@danf@17-8-2009 10080990@unknown@formal@none@1@S@Other techniques for predicting protein structure include protein threading and ''de novo'' (from scratch) physics-based modeling.@@@@1@16@@danf@17-8-2009 10081000@unknown@formal@none@1@S@''See also:'' [[structural motif]] and [[structural domain]].@@@@1@7@@danf@17-8-2009 10081010@unknown@formal@none@1@S@=== Comparative genomics ===@@@@1@4@@danf@17-8-2009 10081020@unknown@formal@none@1@S@The core of comparative genome analysis is the establishment of the correspondence between [[genes]] (orthology analysis) or other genomic features in different organisms.@@@@1@23@@danf@17-8-2009 10081030@unknown@formal@none@1@S@It is these intergenomic maps that make it possible to trace the evolutionary processes responsible for the divergence of two genomes.@@@@1@21@@danf@17-8-2009 10081040@unknown@formal@none@1@S@A multitude of evolutionary events acting at various organizational levels shape genome evolution.@@@@1@13@@danf@17-8-2009 
10081050@unknown@formal@none@1@S@At the lowest level, point mutations affect individual nucleotides.@@@@1@9@@danf@17-8-2009 10081060@unknown@formal@none@1@S@At a higher level, large chromosomal segments undergo duplication, lateral transfer, inversion, transposition, deletion and insertion.@@@@1@16@@danf@17-8-2009 10081070@unknown@formal@none@1@S@Ultimately, whole genomes are involved in processes of hybridization, polyploidization and [[endosymbiosis]], often leading to rapid speciation.@@@@1@17@@danf@17-8-2009 10081080@unknown@formal@none@1@S@The complexity of genome evolution poses many exciting challenges to developers of mathematical models and algorithms, who have recourse to a spectrum of algorithmic, statistical and mathematical techniques, ranging from exact algorithms, [[heuristics]], fixed-parameter and [[approximation algorithms]] for problems based on parsimony models to [[Markov Chain Monte Carlo]] algorithms for [[Bayesian analysis]] of problems based on probabilistic models.@@@@1@58@@danf@17-8-2009 10081090@unknown@formal@none@1@S@Many of these studies are based on homology detection and the computation of protein families.@@@@1@14@@danf@17-8-2009 10081100@unknown@formal@none@1@S@===Modeling biological systems===@@@@1@3@@danf@17-8-2009 10081110@unknown@formal@none@1@S@Systems biology involves the use of [[computer simulation]]s of [[cell (biology)|cellular]] subsystems (such as the [[metabolic network|networks of metabolites]] and [[enzyme]]s which comprise [[metabolism]], [[signal transduction]] pathways and [[gene regulatory network]]s) to both analyze and visualize the complex connections of these cellular processes.@@@@1@43@@danf@17-8-2009 10081120@unknown@formal@none@1@S@[[Artificial life]] or virtual evolution attempts to understand evolutionary processes via the computer simulation of simple (artificial) life forms.@@@@1@19@@danf@17-8-2009 10081130@unknown@formal@none@1@S@===High-throughput image analysis===@@@@1@3@@danf@17-8-2009 10081140@unknown@formal@none@1@S@Computational technologies are used to accelerate or fully automate the processing, quantification and analysis of large amounts of high-information-content [[biomedical imagery]].@@@@1@21@@danf@17-8-2009 10081150@unknown@formal@none@1@S@Modern image analysis systems augment an observer's ability to make measurements from a large or complex set of images, by improving [[accuracy]], [[Objectivity (science)|objectivity]], or speed.@@@@1@26@@danf@17-8-2009 10081160@unknown@formal@none@1@S@A fully developed analysis system may completely replace the observer.@@@@1@10@@danf@17-8-2009 10081170@unknown@formal@none@1@S@Although these systems are not unique to biomedical imagery, biomedical imaging is becoming more important for both [[diagnostics]] and research.@@@@1@20@@danf@17-8-2009 10081180@unknown@formal@none@1@S@Some examples are:@@@@1@3@@danf@17-8-2009 10081190@unknown@formal@none@1@S@* high-throughput and high-fidelity quantification and sub-cellular localization ([[high-content screening]], [[cytohistopathology]])@@@@1@11@@danf@17-8-2009 10081200@unknown@formal@none@1@S@* [[morphometrics]]@@@@1@2@@danf@17-8-2009 10081210@unknown@formal@none@1@S@* clinical image analysis and visualization@@@@1@6@@danf@17-8-2009 10081220@unknown@formal@none@1@S@* determining the real-time air-flow patterns in breathing lungs of living animals@@@@1@12@@danf@17-8-2009 10081230@unknown@formal@none@1@S@* quantifying occlusion size in real-time imagery during the development of, and recovery from, arterial injury@@@@1@16@@danf@17-8-2009 
10081240@unknown@formal@none@1@S@* making behavioral observations from extended video recordings of laboratory animals@@@@1@11@@danf@17-8-2009 10081250@unknown@formal@none@1@S@* infrared measurements for metabolic activity determination@@@@1@7@@danf@17-8-2009 10081260@unknown@formal@none@1@S@===Protein-protein docking===@@@@1@2@@danf@17-8-2009 10081270@unknown@formal@none@1@S@In the last two decades, tens of thousands of protein three-dimensional structures have been determined by [[X-ray crystallography]] and [[Protein nuclear magnetic resonance spectroscopy]] (protein NMR).@@@@1@26@@danf@17-8-2009 10081280@unknown@formal@none@1@S@One central question for the biological scientist is whether it is practical to predict possible protein-protein interactions only based on these 3D shapes, without doing [[protein-protein interaction]] experiments.@@@@1@28@@danf@17-8-2009 10081290@unknown@formal@none@1@S@A variety of methods have been developed to tackle the [[Protein-protein docking]] problem, though there is still much room for further work in this field.@@@@1@28@@danf@17-8-2009 10081300@unknown@formal@none@1@S@===Software and Tools===@@@@1@3@@danf@17-8-2009 10081310@unknown@formal@none@1@S@Software tools for bioinformatics range from simple command-line tools to more complex graphical programs and standalone web services.@@@@1@17@@danf@17-8-2009 10081320@unknown@formal@none@1@S@The computational biology tool best-known among biologists is probably [[BLAST]], an algorithm for determining the similarity of arbitrary sequences to other sequences, possibly from curated databases of protein or DNA sequences.@@@@1@31@@danf@17-8-2009 10081330@unknown@formal@none@1@S@The [[National Center for Biotechnology Information|NCBI]] provides a popular web-based implementation that searches their databases.@@@@1@15@@danf@17-8-2009 10081340@unknown@formal@none@1@S@BLAST is one of a number of [[List of sequence alignment software|generally available programs]] for doing sequence alignment.@@@@1@18@@danf@17-8-2009 10081350@unknown@formal@none@1@S@===Web Services in Bioinformatics===@@@@1@4@@danf@17-8-2009 10081360@unknown@formal@none@1@S@[[SOAP]] and [[REST]]-based interfaces have been developed for a wide variety of bioinformatics applications, allowing an application running on one computer in one part of the world to use algorithms, data and computing resources on servers in other parts of the world.@@@@1@42@@danf@17-8-2009 10081370@unknown@formal@none@1@S@The main advantages lie in the end user not having to deal with software and database maintenance overheads.@@@@1@18@@danf@17-8-2009 10081375@unknown@formal@none@1@S@Basic bioinformatics services are classified by the [[EBI]] into three categories: [[Sequence alignment software|SSS]] (Sequence Search Services), [[Multiple sequence alignment|MSA]] (Multiple Sequence Alignment) and [[Bioinformatics#Sequence_analysis|BSA]] (Biological Sequence Analysis).@@@@1@28@@danf@17-8-2009 10081380@unknown@formal@none@1@S@The availability of these [[service-oriented]] bioinformatics resources demonstrates the applicability of web-based bioinformatics solutions; they range from a collection of standalone tools with a common data format under a single, standalone or web-based interface, to integrative, distributed and extensible [[bioinformatics workflow management systems]].@@@@1@44@@danf@17-8-2009 10090010@unknown@formal@none@1@S@
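The pairwise sequence alignment discussed above (relating similar but non-identical sequences while allowing for exchanged, deleted or inserted bases) can be illustrated with a minimal dynamic-programming sketch. The Python code below computes a [[Needleman-Wunsch algorithm|Needleman-Wunsch]]-style global alignment score; the unit match, mismatch and gap scores are illustrative assumptions of this sketch, not the parameters of BLAST or any other published tool.

 # Minimal global alignment (Needleman-Wunsch) score between two DNA sequences.
 # The match/mismatch/gap values are illustrative assumptions.
 def global_alignment_score(a, b, match=1, mismatch=-1, gap=-1):
     rows, cols = len(a) + 1, len(b) + 1
     # score[i][j] = best score for aligning a[:i] with b[:j]
     score = [[0] * cols for _ in range(rows)]
     for i in range(1, rows):
         score[i][0] = i * gap  # prefix of a aligned against gaps
     for j in range(1, cols):
         score[0][j] = j * gap  # prefix of b aligned against gaps
     for i in range(1, rows):
         for j in range(1, cols):
             diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
             score[i][j] = max(diag,                   # match or substitution
                               score[i - 1][j] + gap,  # gap in b
                               score[i][j - 1] + gap)  # gap in a
     return score[rows - 1][cols - 1]
 
 print(global_alignment_score("GATTACA", "GCATGCA"))

Variations of this recurrence, usually with affine gap penalties and heuristics that avoid filling the whole matrix, are the kind of machinery that practical alignment and assembly tools build on.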
BLEU
@@@@1@1@@danf@17-8-2009 10090020@unknown@formal@none@1@S@:''This page is about the evaluation metric for machine translation.@@@@1@10@@danf@17-8-2009 10090030@unknown@formal@none@1@S@For other meanings, please see [[Bleu]].''@@@@1@6@@danf@17-8-2009 10090040@unknown@formal@none@1@S@'''BLEU''' ('''Bilingual Evaluation Understudy''') is a method for evaluating the quality of text which has been translated from one [[natural language]] to another using [[machine translation]].@@@@1@26@@danf@17-8-2009 10090050@unknown@formal@none@1@S@BLEU was one of the first [[software metric]]s to report high [[correlation]] with human judgements of quality.@@@@1@17@@danf@17-8-2009 10090060@unknown@formal@none@1@S@The metric is currently one of the most popular in the field.@@@@1@12@@danf@17-8-2009 10090070@unknown@formal@none@1@S@The central idea behind the metric is that, "the closer a machine translation is to a professional human translation, the better it is".@@@@1@23@@danf@17-8-2009 10090080@unknown@formal@none@1@S@The metric calculates scores for individual segments, generally [[Sentence (linguistics)|sentence]]s, and then averages these scores over the whole [[corpus]] in order to reach a final score.@@@@1@26@@danf@17-8-2009 10090090@unknown@formal@none@1@S@It has been shown to correlate highly with human judgements of quality at the corpus level.@@@@1@16@@danf@17-8-2009 10090100@unknown@formal@none@1@S@The quality of translation is indicated as a number between 0 and 1 and is measured as statistical closeness to a given set of good quality human reference translations.@@@@1@29@@danf@17-8-2009 10090110@unknown@formal@none@1@S@Therefore, it does not directly take into account translation intelligibility or grammatical correctness.@@@@1@13@@danf@17-8-2009 10090120@unknown@formal@none@1@S@The metric works by measuring the [[n-gram]] co-occurrence between a given translation and the set of reference translations and then taking the weighted [[geometric mean]].@@@@1@25@@danf@17-8-2009 10090130@unknown@formal@none@1@S@BLEU is specifically designed to approximate human judgement on a [[corpus]] level and performs badly if used to evaluate the quality of isolated sentences.@@@@1@24@@danf@17-8-2009 10090140@unknown@formal@none@1@S@==Algorithm==@@@@1@1@@danf@17-8-2009 10090150@unknown@formal@none@1@S@BLEU uses a modified form of [[precision]] to compare a candidate translation against multiple reference translations.@@@@1@16@@danf@17-8-2009 10090160@unknown@formal@none@1@S@The metric modifies simple precision since machine translation systems have been known to generate more words than appear in a reference text.@@@@1@22@@danf@17-8-2009 10090170@unknown@formal@none@1@S@This is illustrated in the following example from Papineni et al. 
(2002),@@@@1@12@@danf@17-8-2009
:Candidate: the the the the the the the
:Reference 1: the cat is on the mat
:Reference 2: there is a cat on the mat
10090180@unknown@formal@none@1@S@In this example, the candidate text is given a unigram precision of,@@@@1@12@@danf@17-8-2009 10090190@unknown@formal@none@1@S@:P = \frac{m}{w_{t}} = \frac{7}{7} = 1@@@@1@7@@danf@17-8-2009 10090200@unknown@formal@none@1@S@Of the seven words in the candidate translation, all of them appear in the reference translations.@@@@1@16@@danf@17-8-2009 10090210@unknown@formal@none@1@S@This presents a problem for a metric, as the candidate translation above is complete nonsense, retaining none of the content of either of the references.@@@@1@25@@danf@17-8-2009 10090220@unknown@formal@none@1@S@The modification that BLEU makes is fairly straightforward.@@@@1@8@@danf@17-8-2009 10090230@unknown@formal@none@1@S@For each word in the candidate translation, the algorithm takes the maximum total count in the reference translations.@@@@1@18@@danf@17-8-2009 10090240@unknown@formal@none@1@S@Taking the example above, the word 'the' appears twice in reference 1, and once in reference 2.@@@@1@17@@danf@17-8-2009 10090250@unknown@formal@none@1@S@The largest value is taken, in this case '2', as the "maximum reference count".@@@@1@14@@danf@17-8-2009 10090260@unknown@formal@none@1@S@For each of the words in the candidate translation, the count of the word is compared against the maximum reference count, and the lowest value is taken.@@@@1@27@@danf@17-8-2009 10090270@unknown@formal@none@1@S@In this case, the count of the word 'the' in the candidate translation is '7', while the maximum reference count for the word is '2'.@@@@1@25@@danf@17-8-2009 10090280@unknown@formal@none@1@S@This "modified count" is then divided by the total number of words in the candidate translation.@@@@1@16@@danf@17-8-2009 10090290@unknown@formal@none@1@S@In the above example, the modified unigram precision score would be,@@@@1@11@@danf@17-8-2009 10090300@unknown@formal@none@1@S@:P = \frac{2}{7}@@@@1@3@@danf@17-8-2009 10090310@unknown@formal@none@1@S@The above method is used to calculate scores for each n.@@@@1@11@@danf@17-8-2009 10090320@unknown@formal@none@1@S@The value of n which has the "highest correlation with monolingual human judgements" was found to be 4.@@@@1@18@@danf@17-8-2009 10090330@unknown@formal@none@1@S@The unigram scores are found to account for the adequacy of the translation, in other words, how much information is retained in the translation.@@@@1@24@@danf@17-8-2009 10090340@unknown@formal@none@1@S@The longer n-gram scores account for the fluency of the translation, or to what extent it reads like "good English".@@@@1@20@@danf@17-8-2009 10090350@unknown@formal@none@1@S@The modification made to precision does not solve the problem of short translations.@@@@1@13@@danf@17-8-2009 10090360@unknown@formal@none@1@S@Short translations can produce very high precision scores, even using modified precision.@@@@1@12@@danf@17-8-2009 10090370@unknown@formal@none@1@S@An example of a candidate translation for the same references as above might be:@@@@1@14@@danf@17-8-2009 10090380@unknown@formal@none@1@S@:the cat@@@@1@2@@danf@17-8-2009 10090390@unknown@formal@none@1@S@In this example, the modified unigram precision would be,@@@@1@9@@danf@17-8-2009 10090400@unknown@formal@none@1@S@:P = \frac{1}{2} + \frac{1}{2} = \frac{2}{2}@@@@1@7@@danf@17-8-2009 10090410@unknown@formal@none@1@S@as the word 'the' and the word 'cat' appear once each in the candidate, and the total number of words is two.@@@@1@22@@danf@17-8-2009 10090420@unknown@formal@none@1@S@The modified bigram precision 
would be 1 / 1, as the bigram "the cat" appears once in the candidate.@@@@1@19@@danf@17-8-2009 10090430@unknown@formal@none@1@S@It has been pointed out that precision is usually twinned with [[recall]] to overcome this problem, as the unigram recall of this example would be 2 / 6 or 2 / 7.@@@@1@33@@danf@17-8-2009 10090440@unknown@formal@none@1@S@The problem is that, as there are multiple reference translations, a bad translation could easily have an inflated recall, such as a translation which consisted of all the words in each of the references.@@@@1@34@@danf@17-8-2009 10090450@unknown@formal@none@1@S@In order to produce a score for the whole corpus, the modified precision scores for the segments are combined using the [[geometric mean]], multiplied by a brevity penalty, whose purpose is to prevent very short candidates from receiving too high a score.@@@@1@42@@danf@17-8-2009 10090460@unknown@formal@none@1@S@Let r be the total length of the reference corpus, and c the total length of the translation corpus.@@@@1@19@@danf@17-8-2009 10090470@unknown@formal@none@1@S@If c \leq r, the brevity penalty applies and is defined to be e^{(1-r/c)}.@@@@1@14@@danf@17-8-2009 10090480@unknown@formal@none@1@S@(In the case of multiple reference sentences, r is taken to be the sum of the lengths of the sentences whose lengths are closest to the lengths of the candidate sentences.@@@@1@31@@danf@17-8-2009 10090490@unknown@formal@none@1@S@However, in the version of the metric used by [[NIST]], the shortest reference sentence is used.)@@@@1@16@@danf@17-8-2009 10090500@unknown@formal@none@1@S@==Performance==@@@@1@1@@danf@17-8-2009 10090510@unknown@formal@none@1@S@BLEU has frequently been reported as correlating well with human judgement, and certainly remains a benchmark for any new evaluation metric to beat.@@@@1@23@@danf@17-8-2009 10090520@unknown@formal@none@1@S@There are, however, a number of criticisms that have been voiced.@@@@1@11@@danf@17-8-2009 10090530@unknown@formal@none@1@S@It has been noted that while in theory capable of evaluating any language, BLEU does not in the present form work on languages without word boundaries.@@@@1@26@@danf@17-8-2009 10090540@unknown@formal@none@1@S@It has been argued that although BLEU certainly has significant advantages, there is no guarantee that an increase in BLEU score is an indicator of improved translation quality.@@@@1@28@@danf@17-8-2009 10090550@unknown@formal@none@1@S@As BLEU scores are taken at the corpus level, it is difficult to give a textual example.@@@@1@17@@danf@17-8-2009 10090560@unknown@formal@none@1@S@Nevertheless, they highlight two instances where BLEU seriously underperformed.@@@@1@9@@danf@17-8-2009 10090570@unknown@formal@none@1@S@These were the 2005 [[NIST]] evaluations where a number of different machine translation systems were tested, and their study of the [[SYSTRAN]] engine versus two engines using [[statistical machine translation]] (SMT) techniques.@@@@1@32@@danf@17-8-2009 10090580@unknown@formal@none@1@S@In the 2005 NIST evaluation, they report that the scores generated by BLEU failed to correspond to the scores produced in the human evaluations.@@@@1@24@@danf@17-8-2009 10090590@unknown@formal@none@1@S@The system which was ranked highest by the human judges was only ranked 6th by BLEU.@@@@1@16@@danf@17-8-2009 10090600@unknown@formal@none@1@S@In their study, they compared SMT systems with SYSTRAN, a knowledge-based system.@@@@1@13@@danf@17-8-2009 10090610@unknown@formal@none@1@S@The scores from BLEU for SYSTRAN were substantially worse 
than the scores given to SYSTRAN by the human judges.@@@@1@19@@danf@17-8-2009 10090620@unknown@formal@none@1@S@They note that the SMT systems were trained using BLEU minimum error rate training, and point out that this could be one of the reasons behind the difference.@@@@1@28@@danf@17-8-2009 10090630@unknown@formal@none@1@S@They conclude by recommending that BLEU be used in a more restricted manner, for comparing the results from two similar systems, and for tracking "broad, incremental changes to a single system".@@@@1@31@@danf@17-8-2009 10100010@unknown@formal@none@1@S@
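As a rough single-segment sketch of the modified n-gram precision and brevity penalty described above (the function names and the choice of the closest reference length are assumptions of this sketch; a full corpus-level implementation aggregates clipped counts over all segments before combining them):

 import math
 from collections import Counter
 
 def ngrams(tokens, n):
     return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
 
 def modified_precision(candidate, references, n):
     # Clip each candidate n-gram count by its maximum count in any single reference.
     cand_counts = Counter(ngrams(candidate, n))
     max_ref = Counter()
     for ref in references:
         for gram, count in Counter(ngrams(ref, n)).items():
             max_ref[gram] = max(max_ref[gram], count)
     clipped = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
     return clipped / max(sum(cand_counts.values()), 1)
 
 def bleu(candidate, references, max_n=4):
     precisions = [modified_precision(candidate, references, n) for n in range(1, max_n + 1)]
     if min(precisions) == 0:
         return 0.0
     # Geometric mean of the modified precisions for n = 1..max_n.
     log_avg = sum(math.log(p) for p in precisions) / max_n
     # Brevity penalty: r is the closest reference length, c the candidate length.
     c = len(candidate)
     r = min((abs(len(ref) - c), len(ref)) for ref in references)[1]
     brevity_penalty = 1.0 if c > r else math.exp(1 - r / c)
     return brevity_penalty * math.exp(log_avg)
 
 candidate = "the the the the the the the".split()
 references = ["the cat is on the mat".split(), "there is a cat on the mat".split()]
 print(modified_precision(candidate, references, 1))  # 2/7, as in the worked example above
 print(bleu(candidate, references))                   # 0.0: the candidate matches no higher-order n-grams

The last two lines reproduce the arithmetic of the worked example only; as the article notes, BLEU is designed to be meaningful at the corpus level rather than on isolated sentences.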
Business intelligence
@@@@1@2@@danf@17-8-2009 10100020@unknown@formal@none@1@S@'''Business intelligence''' ('''BI''') refers to technologies, applications and practices for the collection, integration, analysis, and presentation of business [[information]] and sometimes to the information itself.@@@@1@25@@danf@17-8-2009 10100030@unknown@formal@none@1@S@The purpose of business intelligence--a term that dates at least to 1958--is to support better business decision making.@@@@1@18@@danf@17-8-2009 10100040@unknown@formal@none@1@S@Thus, BI is also described as a [[decision support system]] (DSS):@@@@1@11@@danf@17-8-2009 10100050@unknown@formal@none@1@S@
BI is sometimes used interchangeably with briefing books, report and query tools and executive information systems.@@@@1@17@@danf@17-8-2009 10100060@unknown@formal@none@1@S@In general, business intelligence systems are data-driven DSS.
@@@@1@8@@danf@17-8-2009 10100070@unknown@formal@none@1@S@BI systems provide historical, current, and predictive views of business operations, most often using data that has been gathered into a [[data warehouse]] or a [[data mart]] and occasionally working from operational data.@@@@1@33@@danf@17-8-2009 10100080@unknown@formal@none@1@S@Software elements support the use of this information by assisting in the extraction, analysis, and reporting of information.@@@@1@18@@danf@17-8-2009 10100090@unknown@formal@none@1@S@Applications tackle sales, production, financial, and many other sources of business data for purposes that include, notably, [[business performance management]].@@@@1@20@@danf@17-8-2009 10100100@unknown@formal@none@1@S@Information may be gathered on comparable companies to produce [[benchmarking|benchmarks]].@@@@1@10@@danf@17-8-2009 10100110@unknown@formal@none@1@S@==History==@@@@1@1@@danf@17-8-2009 10100120@unknown@formal@none@1@S@Prior to the start of the [[Information Age]] in the late 20th century, businesses had to collect data from non-automated sources.@@@@1@21@@danf@17-8-2009 10100130@unknown@formal@none@1@S@Businesses then lacked the computing resources necessary to properly analyze the data, and as a result, companies often made business decisions primarily on the basis of [[intuition (knowledge)|intuition]].@@@@1@28@@danf@17-8-2009 10100140@unknown@formal@none@1@S@As businesses automated their systems, the amount of data increased, but its collection remained difficult due to the inability of information to be moved between or within systems.@@@@1@27@@danf@17-8-2009 10100150@unknown@formal@none@1@S@Analysis of this information informed long-term decision making, but was slow and often required the use of instinct or expertise to make short-term decisions.@@@@1@24@@danf@17-8-2009 10100160@unknown@formal@none@1@S@Business intelligence was defined in 1958 by [[Hans Peter Luhn]], who wrote,@@@@1@12@@danf@17-8-2009 10100170@unknown@formal@none@1@S@
In this paper, business is a collection of activities carried on for whatever purpose, be it science, technology, commerce, industry, law, government, defense, et cetera.@@@@1@26@@danf@17-8-2009 10100180@unknown@formal@none@1@S@The communication facility serving the conduct of a business (in the broad sense) may be referred to as an intelligence system.@@@@1@21@@danf@17-8-2009 10100190@unknown@formal@none@1@S@The notion of intelligence is also defined here, in a more general sense, as "the ability to apprehend the interrelationships of presented facts in such a way as to guide action towards a desired goal."
@@@@1@35@@danf@17-8-2009 10100200@unknown@formal@none@1@S@In 1989 Howard Dresner, later a [[Gartner Group]] analyst, popularized BI as an umbrella term to describe "concepts and methods to improve business decision making by using fact-based support systems."@@@@1@30@@danf@17-8-2009 10100210@unknown@formal@none@1@S@In modern businesses the use of standards, automation and specialized software, including [[Online analytical processing|analytical tools]], allows large volumes of data to be [[Extract, transform, load|extracted, transformed, loaded]] and [[Data warehouse|warehoused]] to greatly increase the speed at which information becomes available for decision-making.@@@@1@43@@danf@17-8-2009 10100220@unknown@formal@none@1@S@===Key intelligence topics===@@@@1@3@@danf@17-8-2009 10100230@unknown@formal@none@1@S@Business intelligence often uses [[key performance indicators]] (KPIs) to assess the present state of business and to prescribe a course of action.@@@@1@22@@danf@17-8-2009 10100240@unknown@formal@none@1@S@Examples of KPIs are things such as lead conversion rate (in sales) and inventory turnover (in inventory management).@@@@1@18@@danf@17-8-2009 10100250@unknown@formal@none@1@S@Prior to the widespread adoption of computer and web applications, when information had to be manually input and calculated, performance data was often not available for weeks or months.@@@@1@29@@danf@17-8-2009 10100260@unknown@formal@none@1@S@Recently, banks have tried to make data available at shorter intervals and have reduced delays.@@@@1@15@@danf@17-8-2009 10100270@unknown@formal@none@1@S@The KPI methodology was further expanded with the Chief Performance Officer methodology which incorporated KPIs and root cause analysis into a single methodology.@@@@1@23@@danf@17-8-2009 10100280@unknown@formal@none@1@S@Businesses that face higher operational/[[credit risk]] loading, such as [[credit card]] companies and "wealth management" services, often make KPI-related data available weekly.@@@@1@22@@danf@17-8-2009 10100290@unknown@formal@none@1@S@In some cases, companies may even offer a daily analysis of data.@@@@1@12@@danf@17-8-2009 10100300@unknown@formal@none@1@S@This fast pace requires analysts to use [[information technology|IT]] [[system]]s to process this large volume of data.@@@@1@17@@danf@17-8-2009 10110010@unknown@formal@none@1@S@
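As a small illustration of the KPI arithmetic mentioned above, the Python sketch below computes a lead conversion rate and an inventory turnover figure; the input numbers are invented, and the formulas used (conversions divided by leads, and cost of goods sold divided by average inventory) are the commonly quoted definitions rather than those of any particular BI product.

 # Hypothetical period figures (invented for illustration only).
 leads, converted_leads = 1200, 90
 cost_of_goods_sold, average_inventory = 500000.0, 125000.0
 
 lead_conversion_rate = converted_leads / leads               # share of leads that became customers
 inventory_turnover = cost_of_goods_sold / average_inventory  # times the inventory was sold and replaced
 
 print("Lead conversion rate: {:.1%}".format(lead_conversion_rate))
 print("Inventory turnover: {:.1f}x per period".format(inventory_turnover))

In a BI system these figures would be computed continuously from warehoused data rather than typed in by hand, which is what makes the short reporting intervals described above possible.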
Chatterbot
@@@@1@1@@danf@17-8-2009 10110020@unknown@formal@none@1@S@A '''chatterbot''' (or chatbot) is a type of conversational agent, a [[computer program]] designed to simulate an intelligent [[conversation]] with one or more human users via auditory or textual methods.@@@@1@30@@danf@17-8-2009 10110030@unknown@formal@none@1@S@In other words, a chatterbot is a computer program with artificial intelligence to talk to people through voices or typed words.@@@@1@21@@danf@17-8-2009 10110040@unknown@formal@none@1@S@Though many appear to be intelligently interpreting the human input prior to providing a response, most chatterbots simply scan for keywords within the input and pull a reply with the most matching keywords or the most similar wording pattern from a local [[database]].@@@@1@43@@danf@17-8-2009 10110050@unknown@formal@none@1@S@Chatterbots may also be referred to as ''talk bots'', ''chat bots'', or ''chatterboxes''.@@@@1@13@@danf@17-8-2009 10110060@unknown@formal@none@1@S@== Method of operation ==@@@@1@5@@danf@17-8-2009 10110070@unknown@formal@none@1@S@A good understanding of a conversation is required to carry on a meaningful dialog but most chatterbots do not attempt this.@@@@1@21@@danf@17-8-2009 10110080@unknown@formal@none@1@S@Instead they "converse" by recognizing cue words or phrases from the human user, which allows them to use pre-prepared or pre-calculated responses which can move the conversation on in an apparently meaningful way without requiring them to know what they are talking about.@@@@1@43@@danf@17-8-2009 10110090@unknown@formal@none@1@S@For example, if a human types, "I am feeling very worried lately," the chatterbot may be programmed to recognize the phrase "I am" and respond by replacing it with "Why are you" plus a question mark at the end, giving the answer, "Why are you feeling very worried lately?"@@@@1@49@@danf@17-8-2009 10110100@unknown@formal@none@1@S@A similar approach using keywords would be for the program to answer any comment including ''(Name of celebrity)'' with "I think they're great, don't you?"@@@@1@25@@danf@17-8-2009 10110110@unknown@formal@none@1@S@Humans, especially those unfamiliar with chatterbots, sometimes find the resulting conversations engaging.@@@@1@12@@danf@17-8-2009 10110120@unknown@formal@none@1@S@Critics of chatterbots call this engagement the [[ELIZA effect]].@@@@1@9@@danf@17-8-2009 10110130@unknown@formal@none@1@S@Some programs classified as chatterbots use other principles.@@@@1@8@@danf@17-8-2009 10110140@unknown@formal@none@1@S@One example is [[Jabberwacky]], which attempts to model the way humans learn new facts and language.@@@@1@16@@danf@17-8-2009 10110150@unknown@formal@none@1@S@[[Ellaz Systems|ELLA]] attempts to use [[natural language processing]] to make more useful responses from a human's input.@@@@1@17@@danf@17-8-2009 10110160@unknown@formal@none@1@S@Some programs that use natural language conversation, such as [[SHRDLU]], are not generally classified as chatterbots because they link their speech ability to knowledge of a simulated world.@@@@1@28@@danf@17-8-2009 10110170@unknown@formal@none@1@S@This type of link requires a more complex [[artificial intelligence]] (eg., a "vision" system) than standard chatterbots have.@@@@1@18@@danf@17-8-2009 10110180@unknown@formal@none@1@S@== Early chatterbots ==@@@@1@4@@danf@17-8-2009 10110190@unknown@formal@none@1@S@The classic early chatterbots are [[ELIZA]] and [[PARRY]].@@@@1@8@@danf@17-8-2009 10110200@unknown@formal@none@1@S@More recent programs are [[Racter]], [[Verbot]]s, [[Artificial 
Linguistic Internet Computer Entity|A.L.I.C.E.]], and [[Ellaz Systems|ELLA]].@@@@1@14@@danf@17-8-2009 10110210@unknown@formal@none@1@S@The growth of chatterbots as a research field has created an expansion in their purposes.@@@@1@15@@danf@17-8-2009 10110220@unknown@formal@none@1@S@While ELIZA and PARRY were used exclusively to simulate typed conversation, [[Racter]] was used to "write" a story called ''The Policeman's Beard is Half Constructed''.@@@@1@25@@danf@17-8-2009 10110230@unknown@formal@none@1@S@ELLA includes a collection of games and functional features to further extend the potential of chatterbots.@@@@1@16@@danf@17-8-2009 10110240@unknown@formal@none@1@S@The term "ChatterBot" was coined by [[Michael Loren Mauldin|Michael Mauldin]] (Creator of the first [[Verbot]], Julia) in 1994 to describe these conversational programs.@@@@1@23@@danf@17-8-2009 10110250@unknown@formal@none@1@S@== Malicious chatterbots ==@@@@1@4@@danf@17-8-2009 10110260@unknown@formal@none@1@S@Malicious chatterbots are frequently used to fill chat rooms with spam and advertising, or to entice people into revealing personal information, such as bank account numbers.@@@@1@26@@danf@17-8-2009 10110270@unknown@formal@none@1@S@They are commonly found on [[Yahoo! Messenger]], [[Windows Live Messenger]], [[AOL Instant Messenger]] and other [[instant messaging]] protocols.@@@@1@18@@danf@17-8-2009 10110280@unknown@formal@none@1@S@There has been a published report of a chatterbot used in a fake personal ad on a dating service's website.@@@@1@20@@danf@17-8-2009 10110290@unknown@formal@none@1@S@==Chatterbots in modern AI==@@@@1@4@@danf@17-8-2009 10110300@unknown@formal@none@1@S@Most modern AI research focuses on practical engineering tasks.@@@@1@9@@danf@17-8-2009 10110310@unknown@formal@none@1@S@This is known as weak AI and is distinguished from [[strong AI]], which would require [[sapience]] and reasoning abilities.@@@@1@19@@danf@17-8-2009 10110320@unknown@formal@none@1@S@One pertinent field of AI research is natural language.@@@@1@9@@danf@17-8-2009 10110330@unknown@formal@none@1@S@Usually weak AI fields employ specialised software or programming languages created for them.@@@@1@13@@danf@17-8-2009 10110340@unknown@formal@none@1@S@For example, one of the 'most-human' natural language chatterbots, [[Artificial Linguistic Internet Computer Entity|A.L.I.C.E.]], uses a programming language called AIML that is specific to its program, and its various clones, named Alicebots.@@@@1@32@@danf@17-8-2009 10110350@unknown@formal@none@1@S@Nevertheless, A.L.I.C.E. 
is still based on pattern matching without any reasoning.@@@@1@11@@danf@17-8-2009 10110360@unknown@formal@none@1@S@This is the same technique [[ELIZA]], the first chatterbot, was using back in 1966.@@@@1@14@@danf@17-8-2009 10110370@unknown@formal@none@1@S@Australian company MyCyberTwin also deals in strong AI, allowing users to create and sustain their own virtual personalities online.@@@@1@19@@danf@17-8-2009 10110380@unknown@formal@none@1@S@MyCyberTwin.com also works in a corporate setting, allowing companies to set up Virtual AI Assistants.@@@@1@15@@danf@17-8-2009 10110390@unknown@formal@none@1@S@Another notable program, known as [[Jabberwacky]], also deals in strong AI, as it is claimed to learn new responses based on user interactions, rather than being driven from a static database like many other existing chatterbots.@@@@1@36@@danf@17-8-2009 10110400@unknown@formal@none@1@S@Although such programs show initial promise, many of the existing results in trying to tackle the problem of natural language still appear fairly poor, and it seems reasonable to state that there is currently no general purpose conversational artificial intelligence.@@@@1@40@@danf@17-8-2009 10110410@unknown@formal@none@1@S@This has led some software developers to focus more on the practical aspect of chatterbot technology - information retrieval.@@@@1@19@@danf@17-8-2009 10110420@unknown@formal@none@1@S@A common rebuttal often used within the AI community against criticism of such approaches asks, "How do we know that humans don't also just follow some cleverly devised rules?" (in the way that Chatterbots do).@@@@1@35@@danf@17-8-2009 10110430@unknown@formal@none@1@S@Two famous examples of this line of argument against the rationale for the basis of the Turing test are John Searle's [[Chinese room]] argument and Ned Block's [[Intentional stance|Blockhead argument]].@@@@1@30@@danf@17-8-2009 10110440@unknown@formal@none@1@S@==Chatterbots/Virtual Assistants in Commercial Environments==@@@@1@5@@danf@17-8-2009 10110450@unknown@formal@none@1@S@Automated Conversational Systems have progressed and evolved far from the original designs of the first widely used chatbots.@@@@1@18@@danf@17-8-2009 10110460@unknown@formal@none@1@S@In the UK, large commercial entities such as Lloyds TSB, Royal Bank of Scotland, Renault, Citroën and One Railway are already utilizing Virtual Assistants to reduce expenditures on Call Centres and provide a first point of contact that can inform the user exactly of points of interest, provide support, capture data from the user and promote products for sale.@@@@1@59@@danf@17-8-2009 10110470@unknown@formal@none@1@S@In the UK, new projects and research are being conducted to introduce a Virtual Assistant into the classroom to assist the teacher.@@@@1@22@@danf@17-8-2009 10110480@unknown@formal@none@1@S@This project is the first of its kind and the chatbot VA in question is based on the Yhaken [http://www.elzware.com] chatbot design.@@@@1@22@@danf@17-8-2009 10110490@unknown@formal@none@1@S@The Yhaken template provides a further move forward in Automated Conversational Systems with features such as complex conversational routing and responses, well defined personality, a complex hierarchical construct with additional external reference points, emotional responses and in depth small talk, all to make the experience more interactive and involving for the user.@@@@1@52@@danf@17-8-2009 10110500@unknown@formal@none@1@S@==Annual contests for chatterbots==@@@@1@4@@danf@17-8-2009 
10110510@unknown@formal@none@1@S@Many organizations encourage and support developers all over the world to create chatterbots that are able to perform a variety of tasks and to compete with each other through [[Turing test]]s.@@@@1@32@@danf@17-8-2009 10110520@unknown@formal@none@1@S@Annual contests are organized at the following links:@@@@1@8@@danf@17-8-2009 10110530@unknown@formal@none@1@S@*[http://www.chatterboxchallenge.com The Chatterbox Challenge]@@@@1@4@@danf@17-8-2009 10110540@unknown@formal@none@1@S@*[http://www.loebner.net/Prizef/loebner-prize.html The Loebner Prize]@@@@1@4@@danf@17-8-2009 10120010@unknown@formal@none@1@S@
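As a rough illustration of the keyword-and-substitution technique described in the method-of-operation section above, the following Python sketch reproduces the "I am ..." example; the rule patterns and the fallback reply are invented for this sketch and are not taken from any particular chatterbot.

 import re
 
 # Each rule pairs a keyword pattern with a reply template (illustrative examples only).
 RULES = [
     (re.compile(r"\bi am (.+)", re.IGNORECASE), "Why are you {0}?"),
     (re.compile(r"\bi feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
     (re.compile(r"\bbecause (.+)", re.IGNORECASE), "Is that the real reason?"),
 ]
 FALLBACK = "Tell me more."
 
 def reply(user_input):
     text = user_input.strip().rstrip(".!?")
     for pattern, template in RULES:
         match = pattern.search(text)
         if match:
             # Reuse the matched words in the reply, as described above.
             return template.format(*match.groups())
     return FALLBACK
 
 print(reply("I am feeling very worried lately."))  # Why are you feeling very worried lately?

No understanding of the sentence is involved: the reply is produced purely by pattern matching and substitution, which is why such programs break down as soon as the input strays from their rules.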
Computational linguistics
@@@@1@2@@danf@17-8-2009 10120020@unknown@formal@none@1@S@'''Computational linguistics''' is an [[interdisciplinary]] field dealing with the [[Statistics|statistical]] and/or rule-based modeling of [[natural language]] from a computational perspective.@@@@1@20@@danf@17-8-2009 10120030@unknown@formal@none@1@S@This modeling is not limited to any particular field of [[linguistics]].@@@@1@11@@danf@17-8-2009 10120040@unknown@formal@none@1@S@Traditionally, computational linguistics was usually performed by [[computer scientist]]s who had specialized in the application of computers to the processing of a [[natural language]].@@@@1@24@@danf@17-8-2009 10120050@unknown@formal@none@1@S@Recent research has shown that human language is much more complex than previously thought, so computational linguists often work as members of interdisciplinary teams, including linguists (specifically trained in linguistics), language experts (persons with some level of ability in the languages relevant to a given project), and computer scientists.@@@@1@49@@danf@17-8-2009 10120060@unknown@formal@none@1@S@In general computational linguistics draws upon the involvement of linguists, [[computer science|computer scientists]], experts in [[artificial intelligence]], [[cognitive psychology|cognitive psychologists]], [[math]]ematicians, and [[logic]]ians, amongst others.@@@@1@25@@danf@17-8-2009 10120070@unknown@formal@none@1@S@==Origins==@@@@1@1@@danf@17-8-2009 10120080@unknown@formal@none@1@S@Computational linguistics as a field predates [[artificial intelligence]], a field under which it is often grouped.@@@@1@16@@danf@17-8-2009 10120090@unknown@formal@none@1@S@Computational linguistics originated with efforts in the [[United States]] in the 1950s to use computers to automatically translate texts from foreign languages, particularly [[Russian language|Russian]] scientific journals, into English.@@@@1@29@@danf@17-8-2009 10120100@unknown@formal@none@1@S@Since computers had proven their ability to do [[arithmetic]] much faster and more accurately than humans, it was thought to be only a short matter of time before the technical details could be taken care of that would allow them the same remarkable capacity to process language.@@@@1@47@@danf@17-8-2009 10120110@unknown@formal@none@1@S@When [[machine translation]] (also known as mechanical translation) failed to yield accurate translations right away, automated processing of human languages was recognized as far more complex than had originally been assumed.@@@@1@31@@danf@17-8-2009 10120120@unknown@formal@none@1@S@Computational linguistics was born as the name of the new field of study devoted to developing [[algorithm]]s and [[software]] for intelligently processing language data.@@@@1@24@@danf@17-8-2009 10120130@unknown@formal@none@1@S@When artificial intelligence came into existence in the 1960s, the field of computational linguistics became that sub-division of artificial intelligence dealing with human-level comprehension and production of natural languages.@@@@1@29@@danf@17-8-2009 10120140@unknown@formal@none@1@S@In order to translate one language into another, it was observed that one had to understand the [[grammar]] of both languages, including both [[morphology (linguistics)|morphology]] (the grammar of word forms) and [[syntax]] (the grammar of sentence structure).@@@@1@37@@danf@17-8-2009 10120150@unknown@formal@none@1@S@In order to understand syntax, one had to also understand the [[semantics]] and the [[lexicon]] (or 'vocabulary'), and even to understand something 
of the [[pragmatics]] of language use.@@@@1@28@@danf@17-8-2009 10120160@unknown@formal@none@1@S@Thus, what started as an effort to translate between languages evolved into an entire discipline devoted to understanding how to represent and process natural languages using computers.@@@@1@27@@danf@17-8-2009 10120170@unknown@formal@none@1@S@==Subfields==@@@@1@1@@danf@17-8-2009 10120180@unknown@formal@none@1@S@Computational linguistics can be divided into major areas depending upon the medium of the language being processed, whether spoken or textual; and upon the task being performed, whether analyzing language (recognition) or synthesizing language (generation).@@@@1@35@@danf@17-8-2009 10120190@unknown@formal@none@1@S@[[Speech recognition]] and [[speech synthesis]] deal with how spoken language can be understood or created using computers.@@@@1@17@@danf@17-8-2009 10120200@unknown@formal@none@1@S@Parsing and generation are sub-divisions of computational linguistics dealing respectively with taking language apart and putting it together.@@@@1@18@@danf@17-8-2009 10120210@unknown@formal@none@1@S@Machine translation remains the sub-division of computational linguistics dealing with having computers translate between languages.@@@@1@15@@danf@17-8-2009 10120220@unknown@formal@none@1@S@Some of the areas of research that are studied by computational linguistics include:@@@@1@13@@danf@17-8-2009 10120230@unknown@formal@none@1@S@*Computer aided [[corpus linguistics]]@@@@1@4@@danf@17-8-2009 10120240@unknown@formal@none@1@S@*Design of [[parser]]s or [[phrase chunking|chunkers]] for [[natural language]]s@@@@1@9@@danf@17-8-2009 10120250@unknown@formal@none@1@S@*Design of taggers like [[Part-of-speech tagging|POS-taggers (part-of-speech taggers)]]@@@@1@8@@danf@17-8-2009 10120260@unknown@formal@none@1@S@*Definition of specialized logics like resource logics for [[Natural language processing|NLP]]@@@@1@11@@danf@17-8-2009 10120270@unknown@formal@none@1@S@*Research in the relation between formal and natural languages in general@@@@1@11@@danf@17-8-2009 10120280@unknown@formal@none@1@S@*[[Machine translation]], e.g. by a translating computer@@@@1@7@@danf@17-8-2009 10120290@unknown@formal@none@1@S@*[[Computational complexity]] of natural language, largely modeled on [[automata theory]], with the application of [[context-sensitive grammar]] and [[Linear bounded automaton|linearly-bounded]] [[Turing machine]]s.@@@@1@22@@danf@17-8-2009 10120300@unknown@formal@none@1@S@The [[Association for Computational Linguistics]] defines computational linguistics as:@@@@1@9@@danf@17-8-2009 10120310@unknown@formal@none@1@S@:...the scientific study of [[language]] from a computational perspective.@@@@1@9@@danf@17-8-2009 10120320@unknown@formal@none@1@S@Computational linguists are interested in providing [[computational model]]s of various kinds of linguistic phenomena.@@@@1@14@@danf@17-8-2009 10130010@unknown@formal@none@1@S@
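As a rough illustration of the tagger design listed among the research areas above, the following Python sketch builds a most-frequent-tag baseline from a toy hand-tagged corpus; the corpus, the tag names and the fallback tag are invented for the example and are far simpler than anything used in practice.

 from collections import Counter, defaultdict
 
 # Toy tagged corpus (invented): each sentence is a list of (word, part-of-speech) pairs.
 tagged_corpus = [
     [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
     [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
     [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
 ]
 
 # Count how often each word carries each tag.
 tag_counts = defaultdict(Counter)
 for sentence in tagged_corpus:
     for word, tag in sentence:
         tag_counts[word.lower()][tag] += 1
 
 def tag_tokens(tokens, default="NOUN"):
     # Assign each token its most frequent training tag; unseen words get the default tag.
     return [(t, tag_counts[t.lower()].most_common(1)[0][0] if t.lower() in tag_counts else default)
             for t in tokens]
 
 print(tag_tokens("the dog sleeps".split()))  # [('the', 'DET'), ('dog', 'NOUN'), ('sleeps', 'VERB')]

Real taggers replace the frequency table with statistical or rule-based models that also use surrounding context, but the baseline above is the usual starting point against which they are compared.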
Computer program
@@@@1@2@@danf@17-8-2009 10130020@unknown@formal@none@1@S@'''Computer programs''' (also '''[[Computer software|software programs]]''', or just '''programs''') are [[Instruction (computer science)|instructions]] for a [[computer]].@@@@1@16@@danf@17-8-2009 10130030@unknown@formal@none@1@S@A computer requires programs to function, and a computer program does nothing unless its instructions are executed by a [[Central processing unit|central processor]].@@@@1@23@@danf@17-8-2009 10130040@unknown@formal@none@1@S@Computer programs are usually [[executable]] programs or the [[source code]] from which executable programs are derived (e.g., [[compiler|compiled]]).@@@@1@18@@danf@17-8-2009 10130050@unknown@formal@none@1@S@Computer source code is often written by professional [[computer programmer]]s.@@@@1@10@@danf@17-8-2009 10130060@unknown@formal@none@1@S@Source code is written in a [[programming language]] that usually follows one of two main [[Programming paradigm|paradigms]]: [[imperative programming|imperative]] or [[declarative language|declarative]] programming.@@@@1@23@@danf@17-8-2009 10130070@unknown@formal@none@1@S@Source code may be converted into an [[executable file]] (sometimes called an executable program) by a [[compiler]].@@@@1@17@@danf@17-8-2009 10130080@unknown@formal@none@1@S@Alternatively, computer programs may be executed by a [[central processing unit]] with the aid of an [[Interpreter (computing)|interpreter]], or may be [[firmware|embedded]] directly into [[Computer hardware|hardware]].@@@@1@26@@danf@17-8-2009 10130090@unknown@formal@none@1@S@Computer programs may be categorized along functional lines: [[system software]] and [[application software]].@@@@1@13@@danf@17-8-2009 10130100@unknown@formal@none@1@S@And many computer programs may run simultaneously on a single computer, a process known as [[computer multitasking|multitasking]].@@@@1@17@@danf@17-8-2009 10130110@unknown@formal@none@1@S@==Programming==@@@@1@1@@danf@17-8-2009 10130120@unknown@formal@none@1@S@ #include <stdio.h> int main(void) { printf("Hello world!"); return 0; } @@@@1@5@@danf@17-8-2009 10130160@unknown@formal@none@1@S@Source code of a program written in the [[C programming language]]@@@@1@11@@danf@17-8-2009 10130170@unknown@formal@none@1@S@[[Computer programming]] is the iterative process of writing or editing [[source code]].@@@@1@12@@danf@17-8-2009 10130180@unknown@formal@none@1@S@Editing source code involves testing, analyzing, and refining.@@@@1@8@@danf@17-8-2009 10130190@unknown@formal@none@1@S@A person who practices this skill is referred to as a computer [[programmer]] or software developer.@@@@1@16@@danf@17-8-2009 10130200@unknown@formal@none@1@S@The sometimes lengthy process of computer programming is usually referred to as [[software development]].@@@@1@14@@danf@17-8-2009 10130210@unknown@formal@none@1@S@The term [[software engineering]] is becoming popular as the process is seen as an [[engineering]] discipline.@@@@1@16@@danf@17-8-2009 10130220@unknown@formal@none@1@S@=== Paradigms ===@@@@1@3@@danf@17-8-2009 10130230@unknown@formal@none@1@S@Computer programs can be categorized by the [[programming language]] [[Programming paradigm|paradigm]] used to produce them.@@@@1@15@@danf@17-8-2009 10130240@unknown@formal@none@1@S@Two of the main paradigms are [[imperative programming|imperative]] and [[declarative language|declarative]].@@@@1@11@@danf@17-8-2009 10130250@unknown@formal@none@1@S@Programs written using an imperative language specify an [[algorithm]] using declarations, expressions, and statements.@@@@1@14@@danf@17-8-2009 
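A minimal sketch, not taken from the original article: the three elements named here, and illustrated with pseudocode fragments in the sentences that follow, might look like this in a small C program (the language of the example above); the function do_something is a hypothetical placeholder mirroring the pseudocode.

 #include <stdio.h>

 /* Hypothetical placeholder for whatever action the branch below should trigger. */
 static void do_something(void)
 {
     printf("x is 4\n");
 }

 int main(void)
 {
     int x;          /* declaration: associates the variable name x with the datatype int   */
     x = 2 + 2;      /* statement: assigns the value of the expression 2 + 2 to x            */
     if (x == 4) {   /* statement: uses the value of x to alter the program's control flow   */
         do_something();
     }
     return 0;
 }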
10130260@unknown@formal@none@1@S@A declaration associates a [[variable]] name with a [[datatype]].@@@@1@9@@danf@17-8-2009 10130270@unknown@formal@none@1@S@For example: var x: integer; .@@@@1@7@@danf@17-8-2009 10130280@unknown@formal@none@1@S@An expression yields a value.@@@@1@5@@danf@17-8-2009 10130290@unknown@formal@none@1@S@For example: 2 + 2 yields 4.@@@@1@9@@danf@17-8-2009 10130300@unknown@formal@none@1@S@Finally, a statement might assign an expression to a variable or use the value of a variable to alter the program's control flow.@@@@1@23@@danf@17-8-2009 10130310@unknown@formal@none@1@S@For example: x := 2 + 2; if x = 4 then do_something();@@@@1@13@@danf@17-8-2009 10130315@unknown@formal@none@1@S@One criticism of imperative languages is the side-effect of an assignment statement on a class of variables called non-local variables.@@@@1@20@@danf@17-8-2009 10130320@unknown@formal@none@1@S@Programs written using a declarative language specify the properties that have to be met by the output and do not specify any implementation details.@@@@1@24@@danf@17-8-2009 10130330@unknown@formal@none@1@S@Two broad categories of declarative languages are [[functional language]]s and [[logical language]]s.@@@@1@12@@danf@17-8-2009 10130340@unknown@formal@none@1@S@The principle behind functional languages (like [[Haskell (programming language)|Haskell]]) is to not allow side-effects, which makes it easier to reason about programs like mathematical functions.@@@@1@25@@danf@17-8-2009 10130350@unknown@formal@none@1@S@The principle behind logical languages (like [[Prolog]]) is to define the problem to be solved — the goal — and leave the detailed solution to the Prolog system itself.@@@@1@29@@danf@17-8-2009 10130360@unknown@formal@none@1@S@The goal is defined by providing a list of subgoals.@@@@1@10@@danf@17-8-2009 10130370@unknown@formal@none@1@S@Then each subgoal is defined by further providing a list of its subgoals, etc.@@@@1@14@@danf@17-8-2009 10130380@unknown@formal@none@1@S@If a path of subgoals fails to find a solution, then that subgoal is [[Backtracking|backtracked]] and another path is systematically attempted.@@@@1@21@@danf@17-8-2009 10130390@unknown@formal@none@1@S@The form in which a program is created may be textual or visual.@@@@1@13@@danf@17-8-2009 10130400@unknown@formal@none@1@S@In a [[visual language]] program, elements are graphically manipulated rather than textually specified.@@@@1@13@@danf@17-8-2009 10130410@unknown@formal@none@1@S@===Compilation or interpretation===@@@@1@3@@danf@17-8-2009 10130420@unknown@formal@none@1@S@A ''computer program'' in the form of a [[human-readable]], computer programming language is called [[source code]].@@@@1@16@@danf@17-8-2009 10130430@unknown@formal@none@1@S@Source code may be converted into an [[executable file|executable image]] by a [[compiler]] or executed immediately with the aid of an [[Interpreter (computing)|interpreter]].@@@@1@23@@danf@17-8-2009 10130440@unknown@formal@none@1@S@Compiled computer programs are commonly referred to as executables, binary images, or simply as [[binary file|binaries]] — a reference to the [[binary numeral system|binary]] [[file format]] used to store the executable code.@@@@1@32@@danf@17-8-2009 10130450@unknown@formal@none@1@S@Compilers are used to translate source code from a programming language into either [[object file|object code]] or [[machine code]].@@@@1@19@@danf@17-8-2009 10130460@unknown@formal@none@1@S@Object code needs further processing to become machine code, and machine code is the [[Central 
processing unit|Central Processing Unit]]'s native [[microcode|code]], ready for execution.@@@@1@24@@danf@17-8-2009 10130470@unknown@formal@none@1@S@Interpreted computer programs are either decoded and then immediately executed or are decoded into some efficient intermediate representation for future execution.@@@@1@21@@danf@17-8-2009 10130480@unknown@formal@none@1@S@[[BASIC]], [[Perl]], and [[Python (programming language)|Python]] are examples of immediately executed computer programs.@@@@1@13@@danf@17-8-2009 10130490@unknown@formal@none@1@S@Alternatively, [[Java (programming language)|Java]] computer programs are compiled ahead of time and stored as a machine-independent code called [[bytecode]].@@@@1@20@@danf@17-8-2009 10130500@unknown@formal@none@1@S@Bytecode is then executed upon request by an interpreter called a [[virtual machine]].@@@@1@13@@danf@17-8-2009 10130510@unknown@formal@none@1@S@The main disadvantage of interpreters is that computer programs run more slowly than if they had been compiled.@@@@1@13@@danf@17-8-2009 10130520@unknown@formal@none@1@S@Interpreting code is slower than running the compiled version because the interpreter must [[decode]] each [[Statement (programming)|statement]] each time it is loaded and then perform the desired action.@@@@1@28@@danf@17-8-2009 10130530@unknown@formal@none@1@S@On the other hand, software development may be quicker using an interpreter because testing is immediate when the compilation step is omitted.@@@@1@22@@danf@17-8-2009 10130540@unknown@formal@none@1@S@Another disadvantage of interpreters is that the interpreter must be present on the computer at the time the computer program is executed.@@@@1@21@@danf@17-8-2009 10130550@unknown@formal@none@1@S@Alternatively, compiled computer programs need not have the compiler present at the time of execution.@@@@1@15@@danf@17-8-2009 10130560@unknown@formal@none@1@S@No properties of a programming language require it to be exclusively compiled or exclusively interpreted.@@@@1@15@@danf@17-8-2009 10130570@unknown@formal@none@1@S@The categorization usually reflects the most popular method of language execution.@@@@1@11@@danf@17-8-2009 10130580@unknown@formal@none@1@S@For example, BASIC is thought of as an interpreted language and C a compiled language, despite the existence of BASIC compilers and C interpreters.@@@@1@24@@danf@17-8-2009 10130590@unknown@formal@none@1@S@===Self-modifying programs===@@@@1@2@@danf@17-8-2009 10130600@unknown@formal@none@1@S@A computer program in [[execution (computers)|execution]] is normally treated as being different from the [[data (computing)|data]] the program operates on.@@@@1@20@@danf@17-8-2009 10130610@unknown@formal@none@1@S@However, in some cases this distinction is blurred when a computer program modifies itself.@@@@1@14@@danf@17-8-2009 10130620@unknown@formal@none@1@S@The modified computer program is subsequently executed as part of the same program.@@@@1@13@@danf@17-8-2009 10130630@unknown@formal@none@1@S@[[Self-modifying code]] is possible for programs written in [[Lisp programming language|Lisp]], [[cobol|COBOL]], and [[Prolog]].@@@@1@14@@danf@17-8-2009 10130640@unknown@formal@none@1@S@==Execution and storage==@@@@1@3@@danf@17-8-2009 10130650@unknown@formal@none@1@S@Typically, computer programs are stored in [[non-volatile memory]] until requested either directly or indirectly to be [[execution (computers)|executed]] by the computer user.@@@@1@22@@danf@17-8-2009 10130660@unknown@formal@none@1@S@Upon such a request, the program is loaded into [[random access memory]] by a computer 
program called an [[operating system]], where it can be accessed directly by the central processor.@@@@1@30@@danf@17-8-2009 10130670@unknown@formal@none@1@S@The central processor then executes ("runs") the program, instruction by instruction, until termination.@@@@1@13@@danf@17-8-2009 10130680@unknown@formal@none@1@S@A program in execution is called a [[Process (computing)|process]].@@@@1@9@@danf@17-8-2009 10130690@unknown@formal@none@1@S@Termination is either by normal self-termination or by error — software or hardware error.@@@@1@14@@danf@17-8-2009 10130700@unknown@formal@none@1@S@===Embedded programs===@@@@1@2@@danf@17-8-2009 10130710@unknown@formal@none@1@S@Some computer programs are embedded into hardware.@@@@1@7@@danf@17-8-2009 10130720@unknown@formal@none@1@S@A [[stored-program computer]] requires an initial computer program stored in its [[read-only memory]] to [[booting|boot]].@@@@1@15@@danf@17-8-2009 10130730@unknown@formal@none@1@S@The boot process is to identify and initialize all aspects of the system, from [[Processor register|CPU registers]] to [[Device driver|device controllers]] to [[Volatile memory|memory]] contents.@@@@1@25@@danf@17-8-2009 10130740@unknown@formal@none@1@S@Following the initialization process, this initial computer program loads the [[operating system]] and sets the [[program counter]] to begin normal operations.@@@@1@21@@danf@17-8-2009 10130750@unknown@formal@none@1@S@Independent of the host computer, a [[Peripheral|hardware device]] might have embedded [[firmware]] to control its operation.@@@@1@16@@danf@17-8-2009 10130760@unknown@formal@none@1@S@Firmware is used when the computer program is rarely or never expected to change, or when the program must not be lost when the power is off.@@@@1@27@@danf@17-8-2009 10130770@unknown@formal@none@1@S@===Manual programming===@@@@1@2@@danf@17-8-2009 10130780@unknown@formal@none@1@S@Computer programs historically were manually input to the central processor via switches.@@@@1@12@@danf@17-8-2009 10130790@unknown@formal@none@1@S@An instruction was represented by a configuration of on/off settings.@@@@1@10@@danf@17-8-2009 10130800@unknown@formal@none@1@S@After setting the configuration, an execute button was pressed.@@@@1@9@@danf@17-8-2009 10130810@unknown@formal@none@1@S@This process was then repeated.@@@@1@5@@danf@17-8-2009 10130820@unknown@formal@none@1@S@Computer programs also historically were manually input via [[paper tape]] or [[punched cards]].@@@@1@13@@danf@17-8-2009 10130830@unknown@formal@none@1@S@After the medium was loaded, the starting address was set via switches and the execute button pressed.@@@@1@17@@danf@17-8-2009 10130840@unknown@formal@none@1@S@===Automatic program generation===@@@@1@3@@danf@17-8-2009 10130850@unknown@formal@none@1@S@[[Generative programming]] is a style of [[computer programming]] that creates [[source code]] through [[generic programming|generic]] [[class (computer science)|classes]], [[Prototype-based programming|prototypes]], [[template (programming)|template]]s, [[aspect (computer science)|aspect]]s, and [[Code generation (compiler)|code generator]]s to improve [[programmer]] productivity.@@@@1@34@@danf@17-8-2009 10130860@unknown@formal@none@1@S@Source code is generated with [[programming tool]]s such as a [[template processor]] or an [[Integrated development environment|Integrated Development Environment]].@@@@1@19@@danf@17-8-2009 10130870@unknown@formal@none@1@S@The simplest form of source code generator is a [[Macro (computer science)|macro]] processor, such as the [[C 
preprocessor]], which replaces patterns in source code according to relatively simple rules.@@@@1@29@@danf@17-8-2009 10130880@unknown@formal@none@1@S@[[Software engine]]s output source code or [[Markup language|markup code]] that simultaneously become the input to another [[Process (computing)|computer process]].@@@@1@19@@danf@17-8-2009 10130890@unknown@formal@none@1@S@The analogy is that of one process driving another process, with the computer code being burned as fuel.@@@@1@18@@danf@17-8-2009 10130900@unknown@formal@none@1@S@[[Application server]]s are software engines that deliver applications to [[client computer]]s.@@@@1@11@@danf@17-8-2009 10130910@unknown@formal@none@1@S@For example, a [[Wiki software|Wiki]] is an application server that allows users to build [[dynamic web page|dynamic content]] assembled from [[article (publishing)|articles]].@@@@1@22@@danf@17-8-2009 10130920@unknown@formal@none@1@S@Wikis generate [[HTML]], [[CSS]], [[Java (programming language)|Java]], and [[Javascript]] which are then [[Interpreter (computing)|interpreted]] by a [[web browser]].@@@@1@18@@danf@17-8-2009 10130930@unknown@formal@none@1@S@=== Simultaneous execution===@@@@1@3@@danf@17-8-2009 10130940@unknown@formal@none@1@S@Many operating systems support [[computer multitasking|multitasking]] which enables many computer programs to appear to be running simultaneously on a single computer.@@@@1@21@@danf@17-8-2009 10130950@unknown@formal@none@1@S@Operating systems may run multiple programs through [[process scheduling]] — a software mechanism to [[Context switch|switch]] the CPU among processes frequently so that users can [[Time-sharing|interact]] with each program while it is running.@@@@1@33@@danf@17-8-2009 10130960@unknown@formal@none@1@S@Within hardware, modern day multiprocessor computers or computers with multicore processors may run multiple programs.@@@@1@15@@danf@17-8-2009 10130970@unknown@formal@none@1@S@== Functional categories ==@@@@1@4@@danf@17-8-2009 10130980@unknown@formal@none@1@S@Computer programs may be categorized along functional lines.@@@@1@8@@danf@17-8-2009 10130990@unknown@formal@none@1@S@These functional categories are [[system software]] and [[application software]].@@@@1@9@@danf@17-8-2009 10131000@unknown@formal@none@1@S@System software includes the [[operating system]] which couples the [[computer hardware|computer's hardware]] with the application software.@@@@1@16@@danf@17-8-2009 10131010@unknown@formal@none@1@S@The purpose of the operating system is to provide an environment in which application software executes in a convenient and efficient manner.@@@@1@22@@danf@17-8-2009 10131020@unknown@formal@none@1@S@In addition to the operating system, system software includes [[Utility software|utility programs]] that help manage and tune the computer.@@@@1@19@@danf@17-8-2009 10131030@unknown@formal@none@1@S@If a computer program is not system software then it is application software.@@@@1@13@@danf@17-8-2009 10131040@unknown@formal@none@1@S@Application software includes [[middleware]], which couples the system software with the [[user interface]].@@@@1@13@@danf@17-8-2009 10131050@unknown@formal@none@1@S@Application software also includes utility programs that help users solve application problems, like the need for sorting.@@@@1@17@@danf@17-8-2009 10140010@unknown@formal@none@1@S@
Computer science
@@@@1@2@@danf@17-8-2009 10140020@unknown@formal@none@1@S@'''Computer science''' (or '''computing science''') is the study and the [[science]] of the theoretical foundations of [[information]] and [[computation]] and their implementation and application in [[computer|computer system]]s.@@@@1@27@@danf@17-8-2009 10140030@unknown@formal@none@1@S@Computer science has many sub-fields; some emphasize the computation of specific results (such as [[computer graphics]]), while others relate to properties of [[computational problem]]s (such as [[computational complexity theory]]).@@@@1@29@@danf@17-8-2009 10140040@unknown@formal@none@1@S@Still others focus on the challenges in implementing computations.@@@@1@9@@danf@17-8-2009 10140050@unknown@formal@none@1@S@For example, [[programming language theory]] studies approaches to describing computations, while [[computer programming]] applies specific [[programming language]]s to solve specific computational problems.@@@@1@22@@danf@17-8-2009 10140060@unknown@formal@none@1@S@A further subfield, [[human-computer interaction]], focuses on the challenges in making computers and computations useful, usable and universally accessible to [[humans|people]].@@@@1@21@@danf@17-8-2009 10140070@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009 10140080@unknown@formal@none@1@S@The early foundations of what would become computer science predate the invention of the modern [[digital computer]].@@@@1@17@@danf@17-8-2009 10140090@unknown@formal@none@1@S@Machines for calculating fixed numerical tasks, such as the [[abacus]], have existed since antiquity.@@@@1@14@@danf@17-8-2009 10140100@unknown@formal@none@1@S@[[Wilhelm Schickard]] built the first mechanical calculator in 1623.@@@@1@9@@danf@17-8-2009 10140110@unknown@formal@none@1@S@[[Charles Babbage]] designed a [[difference engine]] in [[Victorian era|Victorian]] times (between 1837 and 1901) helped by [[Ada Lovelace]].@@@@1@18@@danf@17-8-2009 10140120@unknown@formal@none@1@S@Around 1900, the [[IBM]] corporation sold [[Key_punch|punch-card machines]].@@@@1@8@@danf@17-8-2009 10140130@unknown@formal@none@1@S@However, all of these machines were constrained to perform a single task, or at best some subset of all possible tasks.@@@@1@21@@danf@17-8-2009 10140140@unknown@formal@none@1@S@During the 1940s, as newer and more powerful computing machines were developed, the term ''computer'' came to refer to the machines rather than their human predecessors.@@@@1@26@@danf@17-8-2009 10140150@unknown@formal@none@1@S@As it became clear that computers could be used for more than just mathematical calculations, the field of computer science broadened to study [[computation]] in general.@@@@1@26@@danf@17-8-2009 10140160@unknown@formal@none@1@S@Computer science began to be established as a distinct academic discipline in the 1960s, with the creation of the first computer science departments and degree programs.@@@@1@26@@danf@17-8-2009 10140170@unknown@formal@none@1@S@Since practical computers became available, many applications of computing have become distinct areas of study in their own right.@@@@1@19@@danf@17-8-2009 10140180@unknown@formal@none@1@S@Many initially believed it impossible that "computers themselves could actually be a scientific field of study" (Levy 1984, p. 
11), though it was in the "late fifties" (Levy 1984, p.11) that it gradually became accepted among the greater academic population.@@@@1@40@@danf@17-8-2009 10140190@unknown@formal@none@1@S@It is the now well-known IBM brand that formed part of the computer science revolution during this time.@@@@1@18@@danf@17-8-2009 10140200@unknown@formal@none@1@S@'IBM' (short for International Business Machines) released the IBM 704 and later the IBM 709 computers, which were widely used during the exploration period of such devices.@@@@1@27@@danf@17-8-2009 10140210@unknown@formal@none@1@S@"Still, working with the IBM [computer] was frustrating...if you had misplaced as much as one letter in one instruction, the program would crash, and you would have to start the whole process over again" (Levy 1984, p.13).@@@@1@37@@danf@17-8-2009 10140220@unknown@formal@none@1@S@During the late 1950s, the computer science discipline was very much in its developmental stages, and such issues were commonplace.@@@@1@20@@danf@17-8-2009 10140230@unknown@formal@none@1@S@Time has seen significant improvements in the usability and effectiveness of computer science technology.@@@@1@14@@danf@17-8-2009 10140240@unknown@formal@none@1@S@Modern society has seen a significant shift from computers being used solely by experts or professionals to a more widespread user base.@@@@1@22@@danf@17-8-2009 10140250@unknown@formal@none@1@S@By the 1990s, computers became accepted as being the norm within everyday life.@@@@1@13@@danf@17-8-2009 10140260@unknown@formal@none@1@S@During this time data entry was a primary component of the use of computers, many preferring to streamline their business practices through the use of a computer.@@@@1@27@@danf@17-8-2009 10140270@unknown@formal@none@1@S@This also gave the additional benefit of removing the need for large amounts of documentation and file records which consumed much-needed physical space within offices.@@@@1@25@@danf@17-8-2009 10140280@unknown@formal@none@1@S@== Major achievements ==@@@@1@4@@danf@17-8-2009 10140290@unknown@formal@none@1@S@Despite its relatively short history as a formal academic discipline, computer science has made a number of fundamental contributions to [[science]] and [[society]].@@@@1@23@@danf@17-8-2009 10140300@unknown@formal@none@1@S@These include:@@@@1@2@@danf@17-8-2009 10140310@unknown@formal@none@1@S@;Applications within computer science@@@@1@4@@danf@17-8-2009 10140320@unknown@formal@none@1@S@* A formal definition of [[computation]] and [[computability]], and proof that there are computationally [[Undecidable problem|unsolvable]] and [[Intractable#Intractability|intractable]] problems.@@@@1@19@@danf@17-8-2009 10140330@unknown@formal@none@1@S@* The concept of a [[programming language]], a tool for the precise expression of methodological information at various levels of abstraction.@@@@1@21@@danf@17-8-2009 10140340@unknown@formal@none@1@S@;Applications outside of computing@@@@1@4@@danf@17-8-2009 10140350@unknown@formal@none@1@S@* Sparked the [[Digital Revolution]] which led to the current [[Information Age]] and the [[Internet]].@@@@1@15@@danf@17-8-2009 10140360@unknown@formal@none@1@S@* In [[cryptography]], [[Cryptanalysis of the Enigma|breaking the Enigma machine]] was an important factor contributing to the Allied victory in World War II.@@@@1@23@@danf@17-8-2009 10140370@unknown@formal@none@1@S@* [[Scientific computing]] enabled advanced study of the mind, and mapping the human genome was possible with the [[Human Genome Project]].@@@@1@20@@danf@17-8-2009 
10140380@unknown@formal@none@1@S@[[Distributed computing]] projects like [[Folding@home]] explore [[protein folding]].@@@@1@8@@danf@17-8-2009 10140390@unknown@formal@none@1@S@* [[Algorithmic trading]] has increased the [[Economic efficiency|efficiency]] and [[Market liquidity|liquidity]] of financial markets by using [[artificial intelligence]], [[machine learning]] and other [[statistics|statistical]] and [[Numerical analysis|numerical]] techniques on a large scale.@@@@1@31@@danf@17-8-2009 10140400@unknown@formal@none@1@S@== Relationship with other fields ==@@@@1@6@@danf@17-8-2009 10140410@unknown@formal@none@1@S@Despite its name, a significant amount of computer science does not involve the study of computers themselves.@@@@1@17@@danf@17-8-2009 10140420@unknown@formal@none@1@S@Because of this, several alternative names have been proposed.@@@@1@9@@danf@17-8-2009 10140430@unknown@formal@none@1@S@Danish scientist [[Peter Naur]] suggested the term ''datalogy'', to reflect the fact that the scientific discipline revolves around data and data treatment, while not necessarily involving computers.@@@@1@27@@danf@17-8-2009 10140440@unknown@formal@none@1@S@The first scientific institution to use the term was the Department of Datalogy at the University of Copenhagen, founded in 1969, with Peter Naur being the first professor in datalogy.@@@@1@30@@danf@17-8-2009 10140450@unknown@formal@none@1@S@The term is used mainly in the Scandinavian countries.@@@@1@9@@danf@17-8-2009 10140460@unknown@formal@none@1@S@Also, in the early days of computing, a number of terms for the practitioners of the field of computing were suggested in the ''Communications of the ACM''—''turingineer'', ''turologist'', ''flow-charts-man'', ''applied meta-mathematician'', and ''applied epistemologist''.@@@@1@36@@danf@17-8-2009 10140470@unknown@formal@none@1@S@Three months later in the same journal, ''comptologist'' was suggested, followed next year by ''hypologist''.@@@@1@15@@danf@17-8-2009 10140480@unknown@formal@none@1@S@Recently the term ''computics'' has been suggested.@@@@1@7@@danf@17-8-2009 10140490@unknown@formal@none@1@S@''Informatik'' was a term used in Europe with more frequency.@@@@1@10@@danf@17-8-2009 10140500@unknown@formal@none@1@S@The renowned computer scientist [[Edsger W. 
Dijkstra|Edsger Dijkstra]] stated, "Computer science is no more about computers than astronomy is about telescopes."@@@@1@21@@danf@17-8-2009 10140510@unknown@formal@none@1@S@The design and deployment of computers and computer systems is generally considered the province of disciplines other than computer science.@@@@1@20@@danf@17-8-2009 10140520@unknown@formal@none@1@S@For example, the study of [[computer hardware]] is usually considered part of [[computer engineering]], while the study of commercial [[computer system]]s and their deployment is often called [[information technology]] or [[information systems]].@@@@1@32@@danf@17-8-2009 10140530@unknown@formal@none@1@S@Computer science is sometimes criticized as being insufficiently scientific, a view espoused in the statement "Science is to computer science as hydrodynamics is to plumbing", credited to [[Stan Kelly-Bootle]] and others.@@@@1@31@@danf@17-8-2009 10140540@unknown@formal@none@1@S@However, there has been much cross-fertilization of ideas between the various computer-related disciplines.@@@@1@13@@danf@17-8-2009 10140550@unknown@formal@none@1@S@Computer science research has also often crossed into other disciplines, such as [[cognitive science]], [[economics]], [[mathematics]], [[physics]] (see [[quantum computing]]), and [[linguistics]].@@@@1@22@@danf@17-8-2009 10140560@unknown@formal@none@1@S@Computer science is considered by some to have a much closer relationship with [[mathematics]] than many scientific disciplines.@@@@1@18@@danf@17-8-2009 10140570@unknown@formal@none@1@S@Early computer science was strongly influenced by the work of mathematicians such as [[Kurt Gödel]] and [[Alan Turing]], and there continues to be a useful interchange of ideas between the two fields in areas such as [[mathematical logic]], [[category theory]], [[domain theory]], and [[algebra]].@@@@1@44@@danf@17-8-2009 10140580@unknown@formal@none@1@S@The relationship between computer science and [[software engineering]] is a contentious issue, which is further muddied by [[Debates within software engineering|disputes]] over what the term "software engineering" means, and how computer science is defined.@@@@1@34@@danf@17-8-2009 10140590@unknown@formal@none@1@S@[[David Parnas]], taking a cue from the relationship between other engineering and science disciplines, has claimed that the principal focus of computer science is studying the properties of computation in general, while the principal focus of software engineering is the design of specific computations to achieve practical goals, making the two separate but complementary disciplines.@@@@1@55@@danf@17-8-2009 10140600@unknown@formal@none@1@S@The academic, political, and funding aspects of computer science tend to have roots as to whether a department in the U.S. 
formed with either a mathematical emphasis or an engineering emphasis.@@@@1@31@@danf@17-8-2009 10140610@unknown@formal@none@1@S@In general, electrical engineering-based computer science departments have tended to succeed as computer science and/or engineering departments.@@@@1@17@@danf@17-8-2009 10140620@unknown@formal@none@1@S@Computer science departments with a mathematics emphasis and with a numerical orientation consider alignment with [[computational science]].@@@@1@16@@danf@17-8-2009 10140630@unknown@formal@none@1@S@Both types of departments tend to make efforts to bridge the field educationally if not across all research.@@@@1@18@@danf@17-8-2009 10140640@unknown@formal@none@1@S@== Fields of computer science ==@@@@1@6@@danf@17-8-2009 10140650@unknown@formal@none@1@S@Computer science searches for concepts and [[formal proof]]s to explain and describe computational systems of interest.@@@@1@16@@danf@17-8-2009 10140660@unknown@formal@none@1@S@As with all sciences, these theories can then be utilised to synthesize practical engineering applications, which in turn may suggest new systems to be studied and analysed.@@@@1@27@@danf@17-8-2009 10140670@unknown@formal@none@1@S@While the [[ACM Computing Classification System]] can be used to split computer science up into different topics or fields, a more descriptive breakdown follows:@@@@1@24@@danf@17-8-2009 10140680@unknown@formal@none@1@S@=== Mathematical foundations ===@@@@1@4@@danf@17-8-2009 10140690@unknown@formal@none@1@S@; [[Mathematical logic]]@@@@1@3@@danf@17-8-2009 10140700@unknown@formal@none@1@S@: Boolean logic and other ways of modeling logical queries; the uses and limitations of formal proof methods.@@@@1@18@@danf@17-8-2009 10140710@unknown@formal@none@1@S@; [[Number theory]]@@@@1@3@@danf@17-8-2009 10140720@unknown@formal@none@1@S@: Theory of proofs and heuristics for finding proofs in the simple domain of integers.@@@@1@15@@danf@17-8-2009 10140730@unknown@formal@none@1@S@Used in [[cryptography]] as well as a test domain in [[artificial intelligence]].@@@@1@12@@danf@17-8-2009 10140740@unknown@formal@none@1@S@; [[Graph theory]]@@@@1@3@@danf@17-8-2009 10140750@unknown@formal@none@1@S@: Foundations for data structures and searching algorithms.@@@@1@8@@danf@17-8-2009 10140760@unknown@formal@none@1@S@; [[Type theory]]@@@@1@3@@danf@17-8-2009 10140770@unknown@formal@none@1@S@: Formal analysis of the types of data, and the use of these types to understand properties of programs, especially program safety.@@@@1@22@@danf@17-8-2009 10140780@unknown@formal@none@1@S@; [[Category theory]]@@@@1@3@@danf@17-8-2009 10140790@unknown@formal@none@1@S@: Category theory provides a means of capturing all of math and computation in a single synthesis.@@@@1@17@@danf@17-8-2009 10140800@unknown@formal@none@1@S@; [[Computational geometry]]@@@@1@3@@danf@17-8-2009 10140810@unknown@formal@none@1@S@: The study of [[algorithm]]s to solve problems stated in terms of [[geometry]].@@@@1@13@@danf@17-8-2009 10140820@unknown@formal@none@1@S@; [[Numerical analysis]]@@@@1@3@@danf@17-8-2009 10140830@unknown@formal@none@1@S@: Foundations for algorithms in discrete mathematics, as well as the study of the limitations of floating point computation, including [[round-off]] errors.@@@@1@22@@danf@17-8-2009 10140840@unknown@formal@none@1@S@=== Theory of computation ===@@@@1@5@@danf@17-8-2009 10140850@unknown@formal@none@1@S@; [[Automata theory]]@@@@1@3@@danf@17-8-2009 10140860@unknown@formal@none@1@S@: Different logical structures for solving problems.@@@@1@7@@danf@17-8-2009 
10140870@unknown@formal@none@1@S@; [[Computability theory (computer science)|Computability theory]]@@@@1@6@@danf@17-8-2009 10140880@unknown@formal@none@1@S@: What is calculable with the current models of computers.@@@@1@10@@danf@17-8-2009 10140890@unknown@formal@none@1@S@Proofs developed by [[Alan Turing]] and others provide insight into the possibilities of what can be computed and what cannot.@@@@1@20@@danf@17-8-2009 10140900@unknown@formal@none@1@S@; [[Computational complexity theory]]@@@@1@4@@danf@17-8-2009 10140910@unknown@formal@none@1@S@: Fundamental bounds (especially time and storage space) on classes of computations; in practice, study of which problems a computer can solve with reasonable resources (while computability theory studies which problems can be solved at all).@@@@1@36@@danf@17-8-2009 10140920@unknown@formal@none@1@S@; [[Quantum computing|Quantum computing theory]]@@@@1@5@@danf@17-8-2009 10140930@unknown@formal@none@1@S@: Representation and manipulation of data using the quantum properties of particles and quantum mechanics.@@@@1@15@@danf@17-8-2009 10140940@unknown@formal@none@1@S@=== Algorithms and data structures ===@@@@1@6@@danf@17-8-2009 10140950@unknown@formal@none@1@S@; [[Analysis of algorithms]]@@@@1@4@@danf@17-8-2009 10140960@unknown@formal@none@1@S@: Time and space complexity of algorithms.@@@@1@7@@danf@17-8-2009 10140970@unknown@formal@none@1@S@; [[Algorithms]]@@@@1@2@@danf@17-8-2009 10140980@unknown@formal@none@1@S@: Formal logical processes used for computation, and the efficiency of these processes.@@@@1@13@@danf@17-8-2009 10140990@unknown@formal@none@1@S@=== Programming languages and compilers ===@@@@1@6@@danf@17-8-2009 10141000@unknown@formal@none@1@S@; [[Compiler]]s@@@@1@2@@danf@17-8-2009 10141010@unknown@formal@none@1@S@: Ways of translating computer programs, usually from [[high-level programming language|higher level]] languages to [[low-level programming language|lower level]] ones.@@@@1@19@@danf@17-8-2009 10141020@unknown@formal@none@1@S@; [[Interpreter (computing)|Interpreter]]s@@@@1@3@@danf@17-8-2009 10141030@unknown@formal@none@1@S@: A program that takes in as input a computer program and executes it.@@@@1@14@@danf@17-8-2009 10141040@unknown@formal@none@1@S@; [[Programming language]]s@@@@1@3@@danf@17-8-2009 10141050@unknown@formal@none@1@S@: Formal language paradigms for expressing algorithms, and the properties of these languages (e.g., what problems they are suited to solve).@@@@1@21@@danf@17-8-2009 10141060@unknown@formal@none@1@S@=== Concurrent, parallel, and distributed systems ===@@@@1@7@@danf@17-8-2009 10141070@unknown@formal@none@1@S@; [[Concurrency (computer science)|Concurrency]]@@@@1@4@@danf@17-8-2009 10141080@unknown@formal@none@1@S@: The theory and practice of simultaneous computation; data safety in any multitasking or multithreaded environment.@@@@1@16@@danf@17-8-2009 10141090@unknown@formal@none@1@S@; [[Distributed computing]]@@@@1@3@@danf@17-8-2009 10141100@unknown@formal@none@1@S@: Computing using multiple computing devices over a network to accomplish a common objective or task and thereby reducing the latency involved in single processor contributions for any task.@@@@1@29@@danf@17-8-2009 10141110@unknown@formal@none@1@S@; [[Parallel computing]]@@@@1@3@@danf@17-8-2009 10141120@unknown@formal@none@1@S@: Computing using multiple concurrent threads of execution.@@@@1@8@@danf@17-8-2009 10141130@unknown@formal@none@1@S@=== Software engineering ===@@@@1@4@@danf@17-8-2009 10141140@unknown@formal@none@1@S@; [[Algorithm 
design]]@@@@1@3@@danf@17-8-2009 10141150@unknown@formal@none@1@S@: Using ideas from algorithm theory to creatively design solutions to real tasks@@@@1@13@@danf@17-8-2009 10141160@unknown@formal@none@1@S@; [[Computer programming]]@@@@1@3@@danf@17-8-2009 10141170@unknown@formal@none@1@S@: The practice of using a programming language to implement algorithms@@@@1@11@@danf@17-8-2009 10141180@unknown@formal@none@1@S@; [[Formal methods]]@@@@1@3@@danf@17-8-2009 10141190@unknown@formal@none@1@S@: Mathematical approaches for describing and reasoning about software designs.@@@@1@10@@danf@17-8-2009 10141200@unknown@formal@none@1@S@; [[Reverse engineering]]@@@@1@3@@danf@17-8-2009 10141210@unknown@formal@none@1@S@: The application of the scientific method to the understanding of arbitrary existing software@@@@1@14@@danf@17-8-2009 10141220@unknown@formal@none@1@S@; [[Software development]]@@@@1@3@@danf@17-8-2009 10141230@unknown@formal@none@1@S@: The principles and practice of designing, developing, and testing programs, as well as proper engineering practices.@@@@1@17@@danf@17-8-2009 10141240@unknown@formal@none@1@S@=== System architecture ===@@@@1@4@@danf@17-8-2009 10141250@unknown@formal@none@1@S@; [[Computer architecture]]@@@@1@3@@danf@17-8-2009 10141260@unknown@formal@none@1@S@: The design, organization, optimization and verification of a computer system, mostly about [[CPU]]s and [[memory (computers)|memory]] subsystems (and the bus connecting them).@@@@1@23@@danf@17-8-2009 10141270@unknown@formal@none@1@S@; [[Computer organization]]@@@@1@3@@danf@17-8-2009 10141280@unknown@formal@none@1@S@: The implementation of computer architectures, in terms of descriptions of their specific [[electrical circuit]]ry@@@@1@15@@danf@17-8-2009 10141290@unknown@formal@none@1@S@; [[Operating system]]s@@@@1@3@@danf@17-8-2009 10141300@unknown@formal@none@1@S@: Systems for managing computer programs and providing the basis of a useable system.@@@@1@14@@danf@17-8-2009 10141310@unknown@formal@none@1@S@=== Communications ===@@@@1@3@@danf@17-8-2009 10141320@unknown@formal@none@1@S@; [[Computer audio]]@@@@1@3@@danf@17-8-2009 10141330@unknown@formal@none@1@S@: Algorithms and data structures for the creation, manipulation, storage, and transmission of [[digital audio]] recordings.@@@@1@16@@danf@17-8-2009 10141340@unknown@formal@none@1@S@Also important in [[voice recognition]] applications.@@@@1@6@@danf@17-8-2009 10141350@unknown@formal@none@1@S@; [[Computer networking|Networking]]@@@@1@3@@danf@17-8-2009 10141360@unknown@formal@none@1@S@: Algorithms and protocols for communicating data across different shared or dedicated media, often including [[error correction]].@@@@1@17@@danf@17-8-2009 10141370@unknown@formal@none@1@S@; [[Cryptography]]@@@@1@2@@danf@17-8-2009 10141380@unknown@formal@none@1@S@: Applies results from complexity, probability and number theory to invent and break codes.@@@@1@14@@danf@17-8-2009 10141390@unknown@formal@none@1@S@=== Databases ===@@@@1@3@@danf@17-8-2009 10141400@unknown@formal@none@1@S@; [[Data mining]]@@@@1@3@@danf@17-8-2009 10141410@unknown@formal@none@1@S@: Data mining is the extraction of relevant data from all sources of data.@@@@1@14@@danf@17-8-2009 10141420@unknown@formal@none@1@S@; [[Relational databases]]@@@@1@3@@danf@17-8-2009 10141430@unknown@formal@none@1@S@: Study of algorithms for searching and processing information in documents and databases; closely related to [[information retrieval]].@@@@1@18@@danf@17-8-2009 10141440@unknown@formal@none@1@S@; [[OLAP]]@@@@1@2@@danf@17-8-2009 
10141450@unknown@formal@none@1@S@: Online Analytical Processing, or OLAP, is an approach to quickly provide answers to analytical queries that are multi-dimensional in nature.@@@@1@21@@danf@17-8-2009 10141460@unknown@formal@none@1@S@OLAP is part of the broader category [[business intelligence]], which also encompasses relational reporting and data mining.@@@@1@17@@danf@17-8-2009 10141470@unknown@formal@none@1@S@=== Artificial intelligence ===@@@@1@4@@danf@17-8-2009 10141480@unknown@formal@none@1@S@; [[Artificial intelligence]]@@@@1@3@@danf@17-8-2009 10141490@unknown@formal@none@1@S@: The implementation and study of systems that exhibit an autonomous intelligence or behaviour of their own.@@@@1@17@@danf@17-8-2009 10141500@unknown@formal@none@1@S@; [[Artificial life]]@@@@1@3@@danf@17-8-2009 10141510@unknown@formal@none@1@S@: The study of digital organisms to learn about biological systems and evolution.@@@@1@13@@danf@17-8-2009 10141520@unknown@formal@none@1@S@; [[Automated reasoning]]@@@@1@3@@danf@17-8-2009 10141530@unknown@formal@none@1@S@: Solving engines, such as used in [[Prolog]], which produce steps to a result given a query on a fact and rule database.@@@@1@23@@danf@17-8-2009 10141540@unknown@formal@none@1@S@; [[Computer vision]]@@@@1@3@@danf@17-8-2009 10141550@unknown@formal@none@1@S@: Algorithms for identifying three dimensional objects from one or more two dimensional pictures.@@@@1@14@@danf@17-8-2009 10141560@unknown@formal@none@1@S@; [[Machine learning]]@@@@1@3@@danf@17-8-2009 10141570@unknown@formal@none@1@S@: Automated creation of a set of rules and axioms based on input.@@@@1@13@@danf@17-8-2009 10141580@unknown@formal@none@1@S@; [[Natural language processing]]/[[Computational linguistics]]@@@@1@5@@danf@17-8-2009 10141590@unknown@formal@none@1@S@: Automated understanding and generation of human language@@@@1@8@@danf@17-8-2009 10141600@unknown@formal@none@1@S@; [[Robotics]]@@@@1@2@@danf@17-8-2009 10141610@unknown@formal@none@1@S@: Algorithms for controlling the behavior of robots.@@@@1@8@@danf@17-8-2009 10141620@unknown@formal@none@1@S@=== Visual rendering (or Computer graphics) ===@@@@1@7@@danf@17-8-2009 10141630@unknown@formal@none@1@S@; [[Computer graphics]]@@@@1@3@@danf@17-8-2009 10141640@unknown@formal@none@1@S@: Algorithms both for generating visual images synthetically, and for integrating or altering visual and spatial information sampled from the real world.@@@@1@22@@danf@17-8-2009 10141650@unknown@formal@none@1@S@; [[Image processing]]@@@@1@3@@danf@17-8-2009 10141660@unknown@formal@none@1@S@: Determining information from an image through computation.@@@@1@8@@danf@17-8-2009 10141670@unknown@formal@none@1@S@=== Human-Computer Interaction ===@@@@1@4@@danf@17-8-2009 10141680@unknown@formal@none@1@S@; [[Human computer interaction]]@@@@1@4@@danf@17-8-2009 10141690@unknown@formal@none@1@S@: The study of making computers and computations useful, usable and universally accessible to [[user (computing)|people]], including the study and design of computer interfaces through which people use computers.@@@@1@29@@danf@17-8-2009 10141700@unknown@formal@none@1@S@=== Scientific computing ===@@@@1@4@@danf@17-8-2009 10141710@unknown@formal@none@1@S@; [[Bioinformatics]]@@@@1@2@@danf@17-8-2009 10141720@unknown@formal@none@1@S@: The use of computer science to maintain, analyse, and store [[biological data]], and to assist in solving biological problems such as [[protein folding]], function prediction and [[phylogeny]].@@@@1@28@@danf@17-8-2009 10141730@unknown@formal@none@1@S@; [[Cognitive 
Science]]@@@@1@3@@danf@17-8-2009 10141740@unknown@formal@none@1@S@: Computational modelling of real minds@@@@1@6@@danf@17-8-2009 10141750@unknown@formal@none@1@S@; [[Computational chemistry]]@@@@1@3@@danf@17-8-2009 10141760@unknown@formal@none@1@S@: Computational modelling of theoretical chemistry in order to determine chemical structures and properties@@@@1@14@@danf@17-8-2009 10141770@unknown@formal@none@1@S@; [[Computational neuroscience]]@@@@1@3@@danf@17-8-2009 10141780@unknown@formal@none@1@S@: Computational modelling of real brains@@@@1@6@@danf@17-8-2009 10141790@unknown@formal@none@1@S@; [[Computational physics]]@@@@1@3@@danf@17-8-2009 10141800@unknown@formal@none@1@S@: Numerical simulations of large non-analytic systems@@@@1@7@@danf@17-8-2009 10141810@unknown@formal@none@1@S@; [[Numerical analysis|Numerical algorithms]]@@@@1@4@@danf@17-8-2009 10141820@unknown@formal@none@1@S@: Algorithms for the numerical solution of mathematical problems such as [[Root-finding algorithm|root-finding]], [[Numerical integration|integration]], the [[Numerical ordinary differential equations|solution of ordinary differential equations]] and the approximation/evaluation of [[special functions]].@@@@1@30@@danf@17-8-2009 10141830@unknown@formal@none@1@S@; [[Symbolic mathematics]]@@@@1@3@@danf@17-8-2009 10141840@unknown@formal@none@1@S@: Manipulation and solution of expressions in symbolic form, also known as [[Computer algebra]].@@@@1@14@@danf@17-8-2009 10141850@unknown@formal@none@1@S@=== Didactics of computer science/informatics ===@@@@1@6@@danf@17-8-2009 10141860@unknown@formal@none@1@S@The subfield didactics of computer science focuses on cognitive approaches of developing competencies of computer science and specific strategies for analysis, design, implementation and evaluation of excellent lessons in computer science.@@@@1@31@@danf@17-8-2009 10141870@unknown@formal@none@1@S@== Computer science education ==@@@@1@5@@danf@17-8-2009 10141880@unknown@formal@none@1@S@Some universities teach computer science as a theoretical study of computation and algorithmic reasoning.@@@@1@14@@danf@17-8-2009 10141890@unknown@formal@none@1@S@These programs often feature the [[theory of computation]], [[analysis of algorithms]], [[formal methods]], [[Concurrency (computer science)|concurrency theory]], [[databases]], [[computer graphics]] and [[systems analysis]], among others.@@@@1@25@@danf@17-8-2009 10141900@unknown@formal@none@1@S@They typically also teach [[computer programming]], but treat it as a vessel for the support of other fields of computer science rather than a central focus of high-level study.@@@@1@29@@danf@17-8-2009 10141910@unknown@formal@none@1@S@Other colleges and universities, as well as [[secondary school]]s and vocational programs that teach computer science, emphasize the practice of advanced [[computer programming]] rather than the theory of algorithms and computation in their computer science curricula.@@@@1@36@@danf@17-8-2009 10141920@unknown@formal@none@1@S@Such curricula tend to focus on those skills that are important to workers entering the software industry.@@@@1@17@@danf@17-8-2009 10141930@unknown@formal@none@1@S@The practical aspects of computer programming are often referred to as [[software engineering]].@@@@1@13@@danf@17-8-2009 10141940@unknown@formal@none@1@S@However, there is a lot of [[Debates within software engineering|disagreement]] over what the term "software engineering" actually means, and whether it is the same thing as programming.@@@@1@27@@danf@17-8-2009 
10150010@unknown@formal@none@1@S@
Corpus linguistics
@@@@1@2@@danf@17-8-2009 10150020@unknown@formal@none@1@S@'''Corpus linguistics''' is the [[study of language]] as expressed in [[sample]]s ''([[Text corpus|corpora]])'' or "real world" text.@@@@1@17@@danf@17-8-2009 10150030@unknown@formal@none@1@S@This method represents a [[digest]]ive approach to deriving a set of abstract rules by which a [[natural language]] is governed or else relates to another language.@@@@1@26@@danf@17-8-2009 10150040@unknown@formal@none@1@S@Originally compiled by hand, corpora are now largely derived by an automated process whose output is then corrected.@@@@1@15@@danf@17-8-2009 10150050@unknown@formal@none@1@S@Computational methods had once been viewed as a [[holy grail]] of [[linguistics|linguistic]] research, which would ultimately manifest a [[ruleset]] for [[natural language processing]] and [[machine translation]] at a high level.@@@@1@30@@danf@17-8-2009 10150060@unknown@formal@none@1@S@Such has not been the case, and since the [[cognitive revolution]], cognitive linguistics has been largely critical of many claimed practical uses for corpora.@@@@1@24@@danf@17-8-2009 10150070@unknown@formal@none@1@S@However, as [[computation]] capacity and speed have increased, the use of corpora to study language and term relationships en masse has gained some respectability.@@@@1@24@@danf@17-8-2009 10150080@unknown@formal@none@1@S@The corpus approach runs counter to [[Noam Chomsky]]'s view that real language is riddled with performance-related errors, thus requiring careful analysis of small speech samples obtained in a highly controlled laboratory setting.@@@@1@32@@danf@17-8-2009 10150090@unknown@formal@none@1@S@Corpus linguistics does away with Chomsky's ''competence/performance'' split; adherents believe that reliable language analysis best occurs on field-collected samples, in natural contexts and with minimal experimental interference.@@@@1@27@@danf@17-8-2009 10150100@unknown@formal@none@1@S@== History ==@@@@1@3@@danf@17-8-2009 10150110@unknown@formal@none@1@S@A landmark in modern corpus linguistics was the publication by [[Henry Kucera]] and [[Nelson Francis]] of ''Computational Analysis of Present-Day American English'' in 1967, a work based on the analysis of the [[Brown Corpus]], a carefully compiled selection of current American English, totalling about a million words drawn from a wide variety of sources.@@@@1@54@@danf@17-8-2009 10150120@unknown@formal@none@1@S@Kucera and Francis subjected it to a variety of computational analyses, from which they compiled a rich and variegated opus, combining elements of linguistics, language teaching, [[psychology]], [[statistics]], and [[sociology]].@@@@1@30@@danf@17-8-2009 10150130@unknown@formal@none@1@S@A further key publication was [[Randolph Quirk]]'s 'Towards a description of English Usage' (1960, Transactions of the Philological Society, 40-61) in which he introduced ''The Survey of English Usage''.@@@@1@29@@danf@17-8-2009 10150140@unknown@formal@none@1@S@Shortly thereafter, Boston publisher [[Houghton-Mifflin]] approached Kucera to supply a million word, three-line citation base for its new ''[[The American Heritage Dictionary of the English Language|American Heritage Dictionary]]'', the first [[dictionary]] to be compiled using corpus linguistics.@@@@1@37@@danf@17-8-2009 10150150@unknown@formal@none@1@S@The AHD made the innovative step of combining prescriptive elements (how language ''should'' be used) with descriptive information (how it actually ''is'' used).@@@@1@23@@danf@17-8-2009 10150160@unknown@formal@none@1@S@Other publishers 
followed suit.@@@@1@4@@danf@17-8-2009 10150170@unknown@formal@none@1@S@The British publisher Collins' [[COBUILD]] [[monolingual learner's dictionary]], designed for users learning [[English language learning and teaching|English as a foreign language]], was compiled using the [[Bank of English]].@@@@1@28@@danf@17-8-2009 10150180@unknown@formal@none@1@S@The [[Brown Corpus]] has also spawned a number of similarly structured corpora: the [[LOB Corpus]] (1960s [[British English]]), Kolhapur ([[Indian English]]), Wellington ([[New Zealand English]]), Australian Corpus of English ([[Australian English]]), the Frown Corpus ([[early 1990s]] [[American English]]), and the FLOB Corpus (1990s British English).@@@@1@45@@danf@17-8-2009 10150190@unknown@formal@none@1@S@Other corpora represent many languages, varieties and modes, and include the [[International Corpus of English]], and the [[British National Corpus]], a 100 million word collection of a range of spoken and written texts, created in the 1990s by a consortium of publishers, universities ([[Oxford University|Oxford]] and [[Lancaster University|Lancaster]]) and the [[British Library]].@@@@1@52@@danf@17-8-2009 10150200@unknown@formal@none@1@S@For contemporary American English, work has stalled on the [[American National Corpus]], but the 360 million word [[Corpus of Contemporary American English (COCA)]] (1990-present) is now available.@@@@1@27@@danf@17-8-2009 10150210@unknown@formal@none@1@S@== Methods ==@@@@1@3@@danf@17-8-2009 10150220@unknown@formal@none@1@S@Corpus linguistics means dealing with real input data, where descriptions based on a linguist's intuition are not usually helpful.@@@@1@18@@danf@17-8-2009
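An illustrative sketch only, not drawn from the article, and assuming a plain-text sample supplied on standard input: the kind of corpus-driven computation described above can start from something as simple as a token-frequency count over real text, which a corpus study would then sort, normalise, and compare across samples.

 #include <stdio.h>
 #include <ctype.h>
 #include <string.h>

 #define MAX_TYPES 10000   /* assumed upper bound on distinct word types for this sketch */
 #define MAX_WORD  64      /* assumed upper bound on word length; longer tokens are split */

 struct entry { char word[MAX_WORD]; long count; };

 static struct entry table[MAX_TYPES];
 static int types = 0;

 /* Record one token in the frequency table; a linear search keeps the sketch short. */
 static void add_token(const char *w)
 {
     for (int i = 0; i < types; i++) {
         if (strcmp(table[i].word, w) == 0) { table[i].count++; return; }
     }
     if (types < MAX_TYPES) {
         strncpy(table[types].word, w, MAX_WORD - 1);
         table[types].count = 1;
         types++;
     }
 }

 int main(void)
 {
     char w[MAX_WORD];
     int len = 0, c;
     long tokens = 0;

     /* Tokenize standard input on non-alphabetic characters, lowercasing as we go. */
     while ((c = getchar()) != EOF) {
         if (isalpha(c) && len < MAX_WORD - 1) {
             w[len++] = (char)tolower(c);
         } else if (len > 0) {
             w[len] = '\0';
             add_token(w);
             tokens++;
             len = 0;
         }
     }
     if (len > 0) { w[len] = '\0'; add_token(w); tokens++; }

     printf("%ld tokens, %d types\n", tokens, types);
     for (int i = 0; i < types; i++)
         printf("%s\t%ld\n", table[i].word, table[i].count);
     return 0;
 }

Compiled with any standard C compiler and run as, for example, wordfreq < sample.txt (the file name is only an example), it prints the token and type totals followed by a raw frequency list of the kind that frequency-based corpus studies build on.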