[222001000010] |
If words have definitions, they have odd definitions
[222001000020] |Last night, when the KU-Memphis NCAA basketball championship went into overtime, one of the announcers remarked, "Kansas knows what it's like to be in an overtime championship game."
[222001000030] |This struck me as an odd statement, since they had mentioned only a few minutes earlier that there hadn't been overtime in an NCAA basketball championship since 1997.
[222001000040] |A few moments later, we learned that the announcer was referring to a triple-overtime game in the 1950s.
[222001000050] |The 1950s!
[222001000060] |There may have been some in the audience who remember that game, but I doubt anybody directly associated with the KU basketball team does.
[222001000070] |You may be willing to view this as announcers spewing nonsense as they usually do, but it's actually an example of an incredibly odd feature of language (or, perhaps, of the way we think).
[222001000080] |Chomsky's favorite example is London, which has existed for a thousand years, during which nearly every pebble has been replaced.
[222001000090] |You could tear down London and rebuild it across the Thames a thousand years from now, and it would still be London.
[222001000100] |More colloquially, there is a joke about a farmer who says, "This is an excellent hammer.
[222001000110] |I've had it for years.
[222001000120] |I've only had to replace the head three times and the handle twice."
[222001000130] |This problem applies to people as well.
[222001000140] |It turns out (I don't remember where I read this, unfortunately) that every molecule in your body is replaced every few years, such that nothing that is in your body now was in it a decade ago.
[222001000150] |Yet you still believe you are the same person.
[222001000160] |Clearly, humans are comfortable with the notion that the object remains constant even if all the parts change.
[222001000170] |Interestingly, this squares well with work in developmental psychology suggesting that infants recognize objects based on spatial cohesiveness (objects don't break apart and reform) and spatial continuity (objects don't disappear and reappear elsewhere).
[222001000180] |However, they are perfectly comfortable with objects that radically change shape -- for instance, from a duck into a truck.
[222001000190] |It isn't until about the time that children begin to speak that they expect ducks to stay ducks and trucks to stay trucks.
[222001160010] |Stupid babies learn language
[222001160020] |It is well-known that infants learn their native languages with incredible ease.
[222001160030] |I just came across a passage that puts this into particularly striking context:
[222001160040] |A first point to note here is the obvious intellectual limitations that children have while language acquisition proceeds apparently without any effort.
[222001160050] |We are all extremely impressed if a two-year-old figures out to put the square blocks in the square holes and the round blocks in the round holes.
[222001160060] |Yet somehow by this age children are managing to cope with the extraordinarily difficult task of learning language.
[222001160070] |This is particularly impressive, the authors point out, given that according to a number of theories
[222001160080] |we are to believe that children do both of these things using the very same domain neutral intellectual resources.
[222001160090] |This is all the more remarkable given that a complete grammar for a single language remains an uncompleted goal for professional linguists.
[222001160100] |Laurence, S., & Margolis, E. (2001).
[222001160110] |The poverty of the stimulus argument.
[222001160120] |British Journal for the Philosophy of Science, 52, 217-276.
[222001280010] |Autism and Vaccines
[222001280020] |Do vaccines cause autism?
[222001280030] |It is a truism that nothing can ever be disproven (in fact, one of the most solid philosophical proofs is that neither science -- nor any other extant method of human discovery -- can prove any empirical claims either).
[222001280040] |That said, the evidence for vaccines causing autism is about as good as the evidence that potty-training causes autism.
[222001280050] |Symptoms of autism begin to appear some time after the 2-year vaccinations, which is also about when potty-training typically happens.
[222001280060] |Beyond that coincidence in timing, a number of studies have failed to find any link.
[222001280070] |Nonetheless, the believers in the vaccines-cause-autism theory have convinced some reasonably mainstream writers and even all three major presidential candidates that the evidence is, at the worst, only "inconclusive."
[222001280080] |My purpose here is not to debunk the vaccine myth. Others have done it better than I can.
[222001280090] |My purpose is to point out that, even if the myth were true, not vaccinating your children would be a poor solution.
[222001280100] |It has been so long since we've had to deal with polio and smallpox that people have forgotten just how scary they were.
[222001280110] |In 1952, at the height of the polio epidemics, around 14 out of every 100,000 Americans had paralytic polio.
[222001280120] |300-500 million people died of smallpox in the 20th century.
[222001280130] |Add in hepatitis A, hepatitis B, mumps, measles, rubella, diphtheria, pertussis, tetanus, Hib, chicken pox, rotavirus, meningococcal disease, pneumonia and the flu, and no wonder experts estimate that "fully vaccinating all U. S. children born in a given year from birth to adolescence saves an estimated 33,000 lives and prevents an estimated 14 million infections."
[222001280140] |Thus, while current estimates are that 0.6% of American children develop autism, 0.8% would have died without vaccines -- and that's not counting blindness, paralysis, etc.
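(Presumably that 0.8% comes from dividing the 33,000 lives saved by a birth cohort of roughly 4 million: 33,000 / 4,000,000 is about 0.8%.)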
[222001280150] |It seems like a good trade, even if you assume that every single case of autism is due to vaccines.
[222001310010] |Harvard Laboratory of Developmental Studies summer internship program
[222001310020] |The summer internship program at the Harvard Laboratory of Developmental Studies began this Monday.
[222001310030] |Lots of labs take interns.
[222001310040] |In fact, if you seem motivated, smart and competent, I suspect pretty much any lab would take you on as a volunteer for the summer.
[222001310050] |What makes the LDS internship program different is that it's an actual program.
[222001310060] |The labs (primarily the Snedeker Lab and the Carey Lab -- Spelke Lab does not participate directly, though there is so much sharing between the three labs they often do so indirectly) take about a dozen undergraduates each summer.
[222001310070] |Each participant is assigned to a specific research project run by a specific graduate student.
[222001310080] |The projects are chosen such that there is a good chance they will be completed successfully before the internship program ends, making sure the interns have something to talk about at future job or school interviews.
[222001310090] |The faculty advisers are very concerned that the interns don't just do busy work but actually learn something, so interns participate in a weekly reading group as well as a weekly lab meeting.
[222001310100] |In addition, there are activities such as the twice-monthly barbecue (organized this year, in part, by yours truly).
[222001310110] |Oh, and many of the summer students get financial support, which is a definite plus.
[222001310120] |Anyway, it appears to be a good program.
[222001310130] |This is my first summer participating (my intern will be studying pronoun use), so we'll see how it goes.
[222001640010] |Who does Web-based experiments?
[222001640020] |Obviously, I would prefer that people do my Web-based experiments.
[222001640030] |Once you've done those, though, the best place to find other Web-based experiments is the list maintained at the University of Hanover.
[222001640040] |Who is posting experiments?
[222001640050] |One interesting question that can be answered by this list is who exactly does experiments online.
[222001640060] |I checked the list of recent experiments posted under the category of Cognition.
[222001640070] |From June 1st through September 12, experiments were posted by
[222001640080] |Brown University (2), UCLA (1), University College London (2), University of Cologne (1), Duke University (2), University of London (1), Harvard University (1), University of Saskatchewan (3), University of Leeds (2), and University of Minnesota (1).
[222001730010] |Word Sense: A new experiment from the Cognition & Language Lab
[222001730020] |The last several months have been spent analyzing data and writing papers.
[222001730030] |Now that one paper has been published, two are well into the review process, and another is mostly written, it is time at the Cognition and Language lab to start actively collecting data again.
[222001730040] |I just posted our first major new experiment since last winter.
[222001730050] |It is called "Word Sense," and it takes about 5 minutes to complete.
[222001730060] |It can be taken by anybody of any age and of any language background.
[222001730070] |As always, you can view a description of the study at the end.
[222001730080] |You also will get a summary of your own results.
[222001730090] |I'll be writing more about this research project in future posts.
[222001810010] |Singapore's Science Complex
[222001810020] |Among developing countries that are investing heavily in science, Singapore (is Singapore still a developing country?) stands out.
[222001810030] |A recent article in Nature profiles a massive new public/private science complex called "Fusionopolis."
[222001810040] |This is a physical-sciences counterpart to the existing "Biopolis."
[222001810050] |Although Singapore's overall spending on science is still small by the standards of larger countries ($880 million/year), it is impressive on a per-capita basis.
[222001810060] |Currently, it is spending 2.6% of its gross domestic product on science, and plans to increase to 3% by the end of the decade, which would put it ahead of Britain and the United States.
[222001810070] |What struck me in the article was that Singapore is very explicit about its goal, and that goal isn't knowledge for knowledge's sake.
[222001810080] |According to Chuan Poh Lim, chairman of A*STAR, Singapore's central agency for encouraging science and technology, Singapore recognizes it can't compete with China or India as a low-cost manufacturer.
[222001810090] |"In terms of 'cheaper and faster', we will lose out.
[222001810100] |We need a framework for innovation."
[222001810110] |The ultimate goal is to build an economy with a stronger base in intellectual property, generating both new products and also patent royalties.
[222001960010] |Who are you calling a neuroscientist: Has neuroscience killed psychology?
[222001960020] |The Chronicle of Higher Education just produced a list of five young scholars to watch who combine neuroscience and psychology.
[222001960030] |The first one listed is George Alvarez, who was just hired by Harvard.
[222001960040] |Alvarez should be on anybody's top five list.
[222001960050] |The department buzzed for a week after his job talk, despite the fact that many of us already knew his work.
[222001960060] |What is impressive is not only the quantity of his research output -- 19 papers at last count, with 6 under review or revision -- but the number of truly ground-breaking pieces of work.
[222001960070] |Several of his papers have been very influential in my own work on visual working memory.
[222001960080] |He is also one of the best exemplars of classical cognitive psychology I know.
[222001960090] |His use of neuroscience techniques is minimal, and currently appears to be limited to a single paper (Battelli, Alvarez, Carlson & Pascual-Leone, in press).
[222001960100] |Again, this is not a criticism.
[222001960110] |Neurons vs. Behavior
This is particularly odd in the context of the attached article, which tries to explore the relationship between neuroscience techniques and psychology.
[222001960120] |Although there is some balance, with a look at the effect of neuroscience in draining money away from traditional cognitive science, I read the article as promoting the notion that the intersection of neuroscience and psychology is not just the place to be, but the only place to be.
[222001960130] |Alvarez is one of the best examples of the opposite claim: that there is still a lot of interesting cognitive science to be done that doesn't require neuroimaging.
[222001960140] |I should point out that I say this all as a fan of neuroscience, and as somebody currently designing both ERP and fMRI experiments.
[222001960150] |EEG vs. fMRI
One more thing before I stop beating up on the Chronicle (which is actually one of my favorite publications).
[222001960160] |The article claims that EEG (the backbone of ERP) offers less detailed information about the brain in comparison with fMRI.
[222001960170] |The truth is that EEG offers less detailed information about spatial location, but its temporal resolution is far greater.
[222001960180] |If the processes you are studying are lightning-fast and the theories you are testing make strong claims about the timing of specific computations, fMRI is not ideal.
[222001960190] |I think this is why fMRI has had less impact on the study of language than it has in some other areas.
[222001960200] |For instance, the ERP study I am working on looks at complex interactions between semantic and pragmatic processes that occur over a few hundred milliseconds.
[222001960210] |I have seen some very inventive fMRI work on the primary visual cortex that managed that kind of precision, but it is rare (and probably only succeeded because the layout of the visual areas of the brain, in contrast with the linguistic areas, is fairly well-established).
[222002000010] |CogLangLab по-русски (CogLangLab in Russian)
[222002000020] |One of the advantages of posting experiments on the Web rather than running them in a lab is that it makes it easier to recruit participants who don't happen to live near the lab.
[222002000030] |Two years ago, I was testing an idea in the literature about the differences between reading an alphabetic script like English vs. reading a character-based script like Chinese.
[222002000040] |Although there are a fair number of Chinese-speakers living in Cambridge, it was still a lot of work to recruit enough participants.
[222002000050] |When I finally do the follow-up study, I'll post it on the Web.
[222002000060] |In the meantime, I have a new experiment in Russian.
[222002000070] |Because America produces the bulk of the world's scientific research (for now), much of the work on language has focused on English.
[222002000080] |Periodically, it's a good idea to check other languages and make sure what we know about English generalizes (or doesn't).
[222002000090] |And so was born
[222002000100] |Угадай кто сликтопоз
[222002000110] |It takes about 5 minutes to complete.
[222002000120] |Participate in it by clicking here.
[222002080010] |Skeptical of the Skeptics
[222002080020] |I have complained -- more than once -- that the media and the public believe a psychological fact (some people are addicted to computer games) if a neuroimaging study is somehow involved, even if the study itself is irrelevant (after all, the definition of addiction does not require a certain pattern of brain activity -- it requires a certain pattern of physical activity).
[222002080030] |Not surprisingly, I and like-minded researchers were pleased when a study came out last year quantifying this apparent fact.
[222002080040] |That is, the researchers actually found that people rated an explanation of a psychological phenomenon as better if it contained an irrelevant neuroscience fact.
[222002080050] |Neuroskeptic has written a very provocative piece urging us to be skeptical of this paper:
[222002080060] |This kind of research - which claims to provide hard, scientific evidence for the existence of a commonly believed in psychological phenomenon, usually some annoyingly irrational human quirk - is dangerous; it should always be read with extra care.
[222002080070] |The danger is that the results can seem so obviously true ("Well of course!") and so important ("How many times have I complained about this?") that the methodological strengths and weaknesses of the study go unnoticed.
[222002080080] |Read the rest of the post here.
[222002330010] |Will There be a Neuroscience Culture War?
[222002330020] |Martha Farah (Neuroscientist, UPenn) and Nancey Murphy (Theologian, Fuller Theological Seminary), writing in a recent issue of Science, argue that "neuroscience will pose a far more fundamental challenge than evolutionary biology to many religions."
[222002330030] |The reason is straightforward:
[222002330040] |Most religions endorse the idea of a soul (or spirit) that is distinct from the physical body ...
[222002330050] |However, as neuroscience begins to reveal the mechanisms underlying personality, love, morality, and spirituality, the idea of a ghost in the machine becomes strained.
[222002330060] |Brain imaging indicates that all of these traits have physical correlates in brain function.
[222002330070] |Furthermore, pharmacologic influences on these traits, as well as the effects of localized stimulation or damage, demonstrate that the brain processes in question are not mere correlates but are the physical bases of these central aspects of our personhood.
[222002330080] |If these aspects of the person are all features of the machine, why have a ghost at all?
[222002330090] |[Emphasis mine.]
[222002330100] |While not at all detracting from their point, it's interesting that neuroscience does not yet seem to be a major target of religious conservatives.
[222002330110] |The authors argue that such a backlash is a brewin' ("'Nonmaterialist neuroscience' has joined 'intelligent design' as an alternative interpretation of scientific data"), but the evidence is a recently published book.
[222002330120] |The term gets a paltry number of Google hits, the first few of which, at least, are people attacking the concept.
[222002330130] |They make one further interesting point: dualism is a relatively new concept, which came into existence about a century after the time of Jesus.
[222002330140] |By implication, those who insist on a strict interpretation of the Bible actually should support materialism.
[222002330150] |If the culture war comes, this is unlikely to make a compelling argument, but it does say something very interesting about human nature.
[222002870010] |The science of puns
[222002870020] |Puns are jokes involving ambiguous words.
[222002870030] |Some are hilarious.
[222002870040] |Some are only groan-worthy.
[222002870050] |(If anyone wants to argue the first one isn't a pun, comment away.)
[222002870060] |Wait -- aren't all words ambiguous?
[222002870070] |The thing about puns that pops out at me is that pretty much every word is ambiguous.
[222002870080] |My parade-case for an unambiguous word -- really a phrase -- is George Washington.
[222002870090] |But, of course, more than one person has been named "George Washington," so even that one is ambiguous.
[222002870100] |In any case, the vast majority of words are ambiguous -- including the words word and pun.
[222002870110] |Most of the time we don't even notice that a word is ambiguous, because our brains quickly select the contextually-appropriate meaning.
[222002870120] |For a pun, this selection has to fail in some way, since we remain aware of the multiple appropriate meanings.
[222002870130] |Studying puns
As someone who studies how context is used to understand language, I find puns and other homophones interesting.
[222002870140] |I'm also currently starting a new EEG experiment looking at the processing of ambiguous words in folks on the autism spectrum.
[222002870150] |But one question that I'm interested in is why some puns are funnier than others.
[222002870160] |As a first step in that work, I need a bunch of puns ranked for how funny they are.
[222002870170] |There are websites out there that have these types of rankings, but they don't use the sorts of controls that any peer-reviewed journal will demand.
[222002870180] |So I am running a rank-that-pun "experiment" at GamesWithWords.org called Puntastic!
[222002870190] |It goes without saying that once I've got enough puns rated, I'll post what I learn here (including which were the funniest and the worst puns).
[222002870200] |This may take a few months, though, as I'm getting ratings for over 500 puns.
[222002870210] |(picture above isn't just a picture -- it's a t-shirt)
[222003420010] |Texting during sex
[222003420020] |"Teens surprisingly OK with texting during sex," notes Slate's news aggregator.
[222003420030] |This seemed like a good lead for a piece I've wanted to write for a while: just how much we should trust claims of the form "10% of people say X." In many cases, we probably should put little faith in those numbers.
[222003420040] |As usual, Stephen Colbert explains why.
[222003420050] |In his infamous roast of George W. Bush, he notes
[222003420060] |Now I know there's some polls out there that say this man has a 32 percent approval rating ...
[222003420070] |Pay no attention to people who say the glass is half empty ... because 32 percent means it's 2/3 empty.
[222003420080] |There's still some liquid in that glass, is my point.
[222003420090] |But I wouldn't drink it.
[222003420100] |The last third is usually backwash.
[222003420110] |This was meant as a dig at those who still supported Bush, but there's a deeper point to be made: there's a certain percentage of people who, in a survey, will say "yes" to anything.
[222003420120] |Numbers
[222003420130] |For instance, many of my studies involve asking people's opinions about various sentences.
[222003420140] |In a recent one I ran on Amazon Mechanical Turk, I presented people with sentence fragments and asked them which pronoun they thought would likely be the next word in the sentence:
[222003420150] |John went to the store with Sally.
[222003420160] |She/he...
[222003420170] |In that case, it could be either pronoun, so I'm trying to get a sense of what people's biases are.
[222003420180] |However, I put in some filler trials just to make sure people were paying attention:
[222003420190] |Billy went to the store with Alfred.
[222003420200] |She/he...
[222003420210] |In this case, it's really, really unlikely the pronoun will be "she," since there aren't any female characters in the story.
[222003420220] |Even so, over 4% of the time participants still clicked on "she."
[222003420230] |This wasn't an issue of some of the participants simply being bad.
[222003420240] |I included 10 such sentences, and only one person got more than 1 of them wrong.
[222003420250] |However, a lot of people did manage to miss 1 ... probably because they simply were sloppy, made a mistake, were momentarily not thinking ... or because they really thought the next word would be "she."
[222003420260] |Those numbers are actually pretty good.
[222003420270] |In another, slightly harder experiment that I ran on my website, people didn't do so well.
[222003420280] |This one was shorter, so I included only 4 "catch trials" -- questions for which there was only one reasonable answer.
[222003420290] |Below is a pie chart of the participants, according to how many of these they got right:
[222003420300] |You can see that over half got them all right, but around a quarter missed 1, and a significant sliver got no more than 50% correct.
[222003420310] |This could suggest many things: my questions weren't as well-framed as I thought, I had a lot of participants who weren't paying attention, some people were deliberately goofing off, etc.
[222003420320] |Poll numbers
[222003420330] |This isn't a problem specific to experiments.
[222003420340] |As we all learned in 2000, a certain number of people accidentally vote for the wrong candidate through some combination of not paying attention and poor ballot design.
[222003420350] |So there is a difference between a survey finding that 10% of teens say that they think texting during sex is fine and 10% of teens actually thinking that texting during sex is fine.
[222003420360] |A good survey will incorporate methods of sussing out who is pulling the surveyor's leg (or not paying attention, or having a slip of the tongue, etc.).
[222003420370] |Real Surveys
[222003420380] |I didn't want to unnecessarily pick on this particular study, so I tried to hunt down the original source to see if they had done anything to protect against the "backwash" factor.
[222003420390] |Slate linked to a story on mashable.com.
[222003420400] |Mashable claimed that the research was done by the consumer electronics shopping and review site Retrevo, but only linked to Retrevo's main page, not any particular article.
[222003420410] |I did find a blog on Retrevo that frequently presents data from surveys, but nothing this year matched the survey in question (though this comes close).
[222003420420] |I found several other references to this study using Google, but all referenced Retrevo.
[222003420430] |If anyone knows how to find the original study, I'd love to see it -- but if it doesn't exist, it wouldn't be the first apocryphal study.
[222003420440] |So it turns out that the backwash effect isn't the only thing to be careful of when reading survey results.
[222003420450] |UPDATE
[222003420460] |I have since heard from Retrevo.
[222003420470] |See here.
[222003430010] |Science blogging and the law
[222003430020] |The Kennedy School at Harvard had a conference on science journalism.
[222003430030] |Among the issues discussed were legal pitfalls science bloggers can run into.
[222003430040] |Check out this blog post at the Citizen Media Law Project.
[222003840010] |Is psychology a science, redux
[222003840020] |Is psychology a science?
[222003840030] |I see this question asked a lot on message boards, and it's time to discuss it again here.
[222003840040] |I think the typical response by a researcher like myself is an annoyed "of course, you ignoramus."
[222003840050] |But a more subtle response is deserved, as the answer depends entirely on what you mean by "psychology" and what you mean by "science."
[222003840060] |Two Psychologies
[222003840070] |First, if by "psychology" you mean seeing clients (like in Good Will Hunting or Silence of the Lambs), then, no, it's probably not a science.
[222003840080] |But that's a bit like asking whether engineers or doctors are scientists.
[222003840090] |Scientists create knowledge.
[222003840100] |Client-visiting psychologists, doctors and engineers use knowledge.
[222003840110] |Of course, you could legitimately ask whether client-visiting psychologists base their interventions on good science.
[222003840120] |Many don't, but that's also true about some doctors and, I'd be willing to bet, engineers.
[222003840130] |Helpfully, "engineering" and "physics" are given different names, while the research and application ends of psychology confusingly share the same name.
[222003840140] |(Yes, I'm aware that engineering is not the application of physics writ broadly -- what's the application of string theory? -- and one can be a chemical engineer, etc.
[222003840150] |I actually think that makes the analogy to the two psychologies even more apt).
[222003840160] |It doesn't help that the only psychologists who show up in movies are the Good Will Hunting kind (though if paleoglaciologists get to save the world, I don't see why experimental psychologists shouldn't!), but experimental psychology does exist.
[222003840170] |A friend of mine (a physicist) once claimed psychologists don't do experiments (he said this un-ironically over IM while I was killing time in a psychology research lab).
[222003840180] |My response now would be to invite him to participate in one of these experiments.
[222003840190] |Based on this Facebook group, I know I'm not the only one who has heard this.
[222003840200] |Methods
[222003840210] |There are also those, however, who are aware that psychologists do experiments, but deny that it's a true science.
[222003840220] |Some of this has to do with the belief that psychologists still use introspection (there are probably some somewhere, but I suspect there are also physicists who use voodoo dolls somewhere as well, along with mathematicians who play the lottery).
[222003840230] |The more serious objection has to do with the statistics used in psychology.
[222003840240] |In the physical sciences, typically a reaction takes place or it does not, or a neutrino is detected or it is not.
[222003840250] |There is some uncertainty given the precision of the tools being used, but on the whole the results are fairly straight-forward and the precision is pretty good (unless you study turbulence or something similar).
[222003840260] |In psychology, however, the phenomena we study are noisy and the tools lack much precision.
[222003840270] |When studying a neutrino, you don't have to worry about whether it's hungry or sleepy or distracted.
[222003840280] |You don't have to worry about whether the neutrino you are studying is smarter than average, or maybe too tall for your testing booth, or maybe it's only participating in your experiment to get extra credit in class and isn't the least bit motivated.
[222003840290] |It does what it does according to fairly simple rules.
[222003840300] |Humans, on the other hand, are terrible test subjects.
[222003840310] |Psychology experiments require averaging over many, many observations in order to detect patterns within all that noise.
[222003840320] |Science is about predictions.
[222003840330] |In theory, we'd like to predict what an individual person will do in a particular instance.
[222003840340] |In practice, we're largely in the business of predicting what the average person will do in an average instance.
[222003840350] |Obviously we'd like to make more specific predictions (and there are those who can and do), but they're still testable (and tested) predictions.
[222003840360] |The alternative is to declare much of human and animal behavior outside the realm of science.
[222003840370] |Significant differences
[222003840380] |There are some who are on board so far but get off the bus when it comes to how statistics are done in psychology.
[222003840390] |Usually an experiment consists of determining statistically whether a particular result was likely to have occurred by chance alone.
[222003840400] |Richard Feynman famously thought this was nuts (the thought experiment is that it's unlikely to see a license plate printed CPT 349, but you wouldn't want to conclude much from it).
[222003840410] |That's missing the point.
[222003840420] |The notion of significant difference is really a measure of replicability.
[222003840430] |We're usually comparing a measurement across two populations.
[222003840440] |We may find population A is better than population B on some test.
[222003840450] |That could be because population A is underlyingly better at such tests.
[222003840460] |Alternatively, population A was lucky that day.
[222003840470] |A significant difference is essentially a prediction that if we test population A and population B again, we'll get the same results (better performance for population A).
[222003840480] |Ultimately, though, the statistical test is just a prediction (one that typically works pretty well) that the results will replicate.
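To make that replicability reading concrete, here is a toy simulation of my own (an illustration, not a real analysis): half the effects studied are real and half aren't, and "significant" just means the first study's difference cleared a rough .05-style cutoff.

    import java.util.Random;

    // Toy simulation: half the "effects" studied are real (population A's
    // true mean is 0.5 higher than B's), half are not.  Each study is run
    // twice, and we ask how often the A > B direction from study 1 shows
    // up again in study 2, split by whether study 1's difference cleared
    // a rough significance cutoff.
    public class ReplicationDemo {

        // Mean of n draws from a normal distribution with the given mean and sd 1.
        static double sampleMean(Random rng, double popMean, int n) {
            double sum = 0.0;
            for (int i = 0; i < n; ++i)
                sum += popMean + rng.nextGaussian();
            return sum / n;
        }

        public static void main(String[] args) {
            Random rng = new Random(42);
            int n = 20;               // subjects per group
            double critical = 0.62;   // roughly 1.96 * sqrt(2/20) for unit-variance groups of 20
            int sigSame = 0, sigTotal = 0, nsSame = 0, nsTotal = 0;
            for (int i = 0; i < 100000; ++i) {
                double trueDiff = rng.nextBoolean() ? 0.5 : 0.0;
                double diff1 = sampleMean(rng, trueDiff, n) - sampleMean(rng, 0.0, n);
                double diff2 = sampleMean(rng, trueDiff, n) - sampleMean(rng, 0.0, n);
                if (diff1 > critical) { ++sigTotal; if (diff2 > 0) ++sigSame; }
                else                  { ++nsTotal;  if (diff2 > 0) ++nsSame; }
            }
            System.out.println("A beats B again after a significant study 1:     "
                + (double) sigSame / sigTotal);   // roughly 0.9 with these settings
            System.out.println("A beats B again after a non-significant study 1: "
                + (double) nsSame / nsTotal);     // roughly 0.7
        }
    }

With these made-up settings, a significant first result is followed by a same-direction second result far more often than a non-significant one, which is all the "prediction of replicability" reading claims.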
[222003840490] |Ideally, all experiments would be replicated multiple times, but that's expensive and time-consuming, and -- to the extent that the statistical analysis was done correctly (a big if) -- largely unnecessary.
[222003840500] |So what do you think?
[222003840510] |Are the social sciences sciences?
[222003840520] |Comments are welcome.
[223000060010] |Columbia Talk Postponed
[223000060020] |The talk I was supposed to give at Columbia has been postponed due to scheduling conflicts.
[223000060030] |It seems that I have competition from a GALE meeting.
[223000460010] |“select” Isn’t Broken; or Horses, not Zebras
[223000460020] |Hunt and Thomas’s book The Pragmatic Programmer is my first recommendation to someone starting to think about the process of programming.
[223000460030] |It’s organized into a series of tips, of which the relevant one for today’s discussion is:
[223000460040] |Number 26. “select” Isn’t Broken
[223000460050] |It is rare to find a bug in the OS or the compiler, or even a third-party product or library.
[223000460060] |The bug is most likely in the application.
[223000460070] |They tell a story about a stubborn colleague who insisted the select system (multithreaded network I/O concept introduced into Java with the nio package) was broken under Solaris.
[223000460080] |He finally read the doc and found the fix in minutes after weeks of burying his head in the sand.
[223000460090] |Hunt and Thomas’s advice in the text is “if you see hoof prints, think horses, not zebras”.
[223000460100] |This is actually a good operating principle.
[223000460110] |Just don’t rule out zebras if you happen to be in Africa or at a zoo.
[223000460120] |Or a patient on the TV show House, M.D.
[223000460130] |While I appreciate Hunt and Thomas’s faith in OSes, compilers and third-party libraries like LingPipe, the fact is, today’s complex compilers and libraries are more often the root of problems than the ANSI C compiler was back in the day.
[223000460140] |Let me enumerate problems we’ve had with Java that have affected Alias-i or LingPipe:
[223000460150] |2GB file transfer limit under Windows with java.nio.
[223000460160] |This is a known bug.
[223000460170] |This caused me to have to rewrite my backup scripts when recently backing up all of Alias-i’s data.
[223000460180] |They worked fine for years at home backing up photos and music.
[223000460190] |JDK 1.4 Signature error for XML.
[223000460200] |This was another known bug.
[223000460210] |They missed an exception in one of the more rarely used methods, but it messed up all of LingPipe’s XML filters.
[223000460220] |I had to catch arbitrary exceptions, check them for type using instanceof, and then rethrow in order to get compatibility across versions.
[223000460230] |1.6 Generics Compilation.
[223000460240] |Another known bug.
[223000460250] |Here the 1.6 compiler can’t handle complex conjunctive, dependent type declarations.
[223000460260] |I may be missing something, but that’s a non-trivial number of times it was Java’s fault.
[223000460270] |If you don’t trust me, check out the Java Bug Parade.
[223000460280] |In fact, in looking up the link, I found that:
[223000460290] |File.deleteOnExit() doesn’t work on open files.
[223000460300] |Some of our temp files never do get cleaned up under Windows.
[223000460310] |I only just realized this was Java/Windows’ fault and not mine in looking at the Top 25 Java Bugs (most of which have to do with Swing).
[223000460320] |The most well known bug to be reported in Java recently was Josh Bloch’s bombshell about binary search:
[223000460330] |Binary Search is Broken for large arrays.
[223000460340] |Josh tells the story better than me.
[223000460350] |(The source of the bug, Jon Bentley’s book Programming Pearls, is also one of my top recommendations for people who want to think like developers, not theory professors [like me].)
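For anyone who hasn’t seen it, the heart of that bug is a one-line integer overflow in the midpoint computation; here’s a from-memory sketch (not the actual JDK source):

    public class BinarySearchOverflow {
        // Midpoint overflow: when low and high are both huge (arrays of more
        // than about a billion elements), low + high wraps around to a negative
        // int, mid goes negative, and a[mid] throws ArrayIndexOutOfBoundsException.
        static int brokenBinarySearch(int[] a, int key) {
            int low = 0;
            int high = a.length - 1;
            while (low <= high) {
                int mid = (low + high) / 2;   // the broken line
                // standard fixes:  int mid = low + (high - low) / 2;
                //              or: int mid = (low + high) >>> 1;
                if (a[mid] < key) low = mid + 1;
                else if (a[mid] > key) high = mid - 1;
                else return mid;
            }
            return -(low + 1);   // key not present
        }
    }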
[223000460360] |When I was working at SpeechWorks, we had lots of multi-threaded dialog/speech recognition code which worked fine on every platform but Linux, where threads would just die.
[223000460370] |I don’t even recall what the workaround was there.
[223000460380] |And the number of times bugs were traced to C++’s standard template library led the company to institute a no-C++ in the interface policy.
[223000460390] |It was just too non-portable.
[223000460400] |The Latest Debugging Story
[223000460410] |Now that I’ve probably lost most of the readers, I’ll confess that the recent perplexing problem with TF/IDF classification was indeed my fault and not Java’s.
[223000460420] |The unit tests were right — I’d spent hours building them by hand with pencil and paper.
[223000460430] |The implementation was wrong.
[223000460440] |I’d swapped two internal arrays, one holding term IDFs and one holding document TF/IDFs.
[223000460450] |It just coincidentally turned out the two arrays had the same values for “a” and “b” under the 1.5 iterator order, but not under the 1.6 iterator order.
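As an aside, this is the general shape of an iteration-order bug: any test whose expected output implicitly depends on the order a HashMap hands back its entries is leaning on an implementation detail the Map contract doesn’t guarantee. A contrived sketch of my own (not the actual LingPipe test):

    import java.util.HashMap;
    import java.util.Map;

    public class IterationOrderDemo {
        public static void main(String[] args) {
            Map<String,Double> idf = new HashMap<String,Double>();
            idf.put("a", 1.0);
            idf.put("b", 2.0);

            // Build a string by walking the map in whatever order it chooses.
            StringBuilder observed = new StringBuilder();
            for (Map.Entry<String,Double> e : idf.entrySet())
                observed.append(e.getKey()).append('=').append(e.getValue()).append(' ');

            // Nothing in the HashMap contract fixes this order, so an assertion
            // like the following is coupled to the particular implementation and
            // can behave differently across JVM versions.
            System.out.println("a=1.0 b=2.0 ".equals(observed.toString()));
        }
    }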
[223000460460] |I hate dealing with problems like this where the compiler can’t catch the bug.
[223000460470] |A larger unit test almost certainly would’ve caught this problem.
[223000460480] |The 3.1.1 release patches the bug and is now available from the LingPipe Home Page.
[223000840010] |Hyphenation as a Noisy Channel
[223000840020] |Update November 2008: Our tutorial on hyphenation is now available as part of LingPipe and on the web at:
[223000840030] |LingPipe Hyphenation and Syllabification Tutorial
[223000840040] |Noisy channel models with very simple deterministic channels can be surprisingly effective at simple linguistic tasks like word splitting.
[223000840050] |We’ve used them for Chinese word segmentation (aka word boundary detection, aka tokenization), not to mention spelling correction.
[223000840060] |In this blog entry, I’ll provide a first look at our forthcoming tutorial on hyphenation.
[223000840070] |The hyphenation problem is that of determining if a hyphen can be inserted between two positions in a word.
[223000840080] |Hyphenation is an orthographic process, which means it operates over spellings.
[223000840090] |In contrast, syllabification is a phonological (or phonetic) process, which means it operates over sounds.
[223000840100] |Hyphenation roughly follows syllabification, but is also prone to follow morphological split points.
[223000840110] |The hyphenation problem isn’t even well-defined on a per-word level.
[223000840120] |There are pairs such as num-ber (something you count with) and numb-er (when you get more numb) that have the same spelling, but different pronunciations and corresponding hyphenations depending on how they are used.
[223000840130] |I’ll just ignore this problem here; in our evaluation, ambiguities always produce at least one error.
[223000840140] |The Noisy Channel Model
[223000840150] |The noisy channel model consists of a source that generates messages and a noisy channel along which they are passed.
[223000840160] |The receiver’s job is to recover (i.e. decode) the underlying message.
[223000840170] |For hyphenation, the source is a model of what English hyphenation looks like.
[223000840180] |The channel model deterministically removes the hyphens.
[223000840190] |The receiver thus needs to figure out where the hyphens should be reinserted to recover the original message at the source.
[223000840200] |The training corpus for the source model consists of words with hyphenations represented by hyphens.
[223000840210] |The model is just a character language model (I used 8-grams, but it’s not very length sensitive).
[223000840220] |This gives us estimates of probabilities like p("che-ru-bic") and p("cher-u-bic").
[223000840230] |Our channel model just deterministically removes the hyphens, so that p("cherubic"|"che-ru-bic") = 1.0.
[223000840240] |To use the noisy channel model to find a hyphenation given an unhyphenated word, we just find the hyphenation h that is most likely given the word w, using Bayes's rule in a maximization setting: ARGMAX_h p(h|w) = ARGMAX_h p(w|h) * p(h).
[223000840250] |Because the channel probabilities p(w|h) are always 1.0 if the characters in w match those in h, this reduces to finding the hyphenation h yielding character sequence w which maximizes p(h).
[223000840260] |For example, the model will estimate "che-ru-bic" to be a more likely hyphenation than "cher-u-bic" if p("che-ru-bic") > p("cher-u-bic").
[223000840270] |Decoding is fairly efficient for this task, despite using a high-order n-gram language model, because the channel bounds the combinatorics by only allowing a single hyphen to be inserted at each point; dynamic programming into language model states can reduce them even further.
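To make the objective concrete, here is a deliberately naive sketch of the decoder (my own illustration, not LingPipe's implementation; CharLanguageModel stands in for whatever character language model scores hyphenated strings, and a real decoder would use dynamic programming rather than enumerating all 2^(n-1) candidates):

    interface CharLanguageModel {
        double logProb(CharSequence hyphenatedWord);  // log p(h) under the source model
    }

    class NaiveHyphenator {
        // Enumerate every way of inserting hyphens between adjacent characters
        // and return the candidate the source model likes best.  The channel
        // contributes nothing to the score because p(word | hyphenation) = 1
        // whenever stripping the hyphens gives back the word.  Fine for short
        // words; exponential in word length, so illustration only.
        static String bestHyphenation(String word, CharLanguageModel source) {
            int gaps = word.length() - 1;
            String best = word;
            double bestLogProb = Double.NEGATIVE_INFINITY;
            for (int mask = 0; mask < (1 << gaps); ++mask) {
                StringBuilder h = new StringBuilder();
                for (int i = 0; i < word.length(); ++i) {
                    h.append(word.charAt(i));
                    if (i < gaps && ((mask >> i) & 1) == 1)
                        h.append('-');
                }
                double logProb = source.logProb(h);
                if (logProb > bestLogProb) {
                    bestLogProb = logProb;
                    best = h.toString();
                }
            }
            return best;
        }
    }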
[223000840280] |English Evaluation
[223000840290] |So how do we test how well the model works?
[223000840300] |We just became members of the Linguistic Data Consortium, who distribute Baayen, Piepenbrock and Gulikers' CELEX 2 corpus.
[223000840310] |CELEX is a great corpus.
[223000840320] |It contains pronunciations, syllabifications, and hyphenations for a modest sized dictionary of Dutch, English and German.
[223000840330] |For instance, there are 66,332 distinct English words in the corpus (there are many more entries, but these constitute duplicates and compounds whose constituents have their normal hyphenations).
[223000840340] |These 66,332 words have 66,418 unique hyphenations, meaning 86 of the words lead to ambiguities.
[223000840350] |This is about 1/10th of a percent, so I won't worry about it here.
[223000840360] |I evaluated with randomly partitioned 10-fold cross-validation.
[223000840370] |With the noisy channel model above, LingPipe had a 95.4% whole word accuracy, with a standard deviation of 0.2%.
[223000840380] |That means we got 95.4% of the words completely correctly hyphenated.
[223000840390] |We can also look at the 111,521 hyphens in the corpus, over which we had a 97.3% precision and a 97.4% recall.
[223000840400] |That is, we missed 2.6% of the true hyphenation points (false negatives), and 2.7% of the hyphenations we returned were spurious (false positives).
[223000840410] |Finally, we can look at per-decision accuracy, for which there were 482,045 positions between characters, over which we were 98.8% accurate.
[223000840420] |Forward and Backward Models
[223000840430] |But that's not all.
[223000840440] |Like HMMs, there's a degree of label bias in a left-to-right language model.
[223000840450] |So I reversed all the data and built a right-to-left (or back-to-front) model.
[223000840460] |Using n-best extraction, I ran this two ways.
[223000840470] |First, I just added the score to the forward model to get an interpolated score.
[223000840480] |Somewhat surprisingly, it behaved almost identically to the forward-only model, with slightly lower per-hyphen and per-decision scores.
[223000840490] |More interestingly, I then ran them in intersect mode, which means only returning a hyphen if it was in the first-best analysis of both the left-to-right and right-to-left models.
[223000840500] |This lowered per-word accuracy to 94.6% (from 95.4%), but raised precision to 98.0% (from 97.3%) while only lowering recall to 96.7% (from 97.4%).
[223000840510] |Overall, hyphenation is considered to be a precision business in application, as it's usually used to split words across lines in documents, and many split points might work.
[223000840520] |Is it State of the Art?
[223000840530] |The best results I've seen for this task were reported in Bartlett, Kondrak and Cherry's 2008 ACL paper Automatic Syllabification with Structured SVMs for Letter-To-Phoneme Conversion, which also received an outstanding paper award.
[223000840540] |They treated this problem as a tagging problem and applied structured support vector machines (SVMs).
[223000840550] |On a fixed 5K testing set, they report a 95.65% word accuracy, which is slightly higher than our 95.4%.
[223000840560] |The 95% binomial confidence intervals for 5000 test cases are +/- 0.58% and our measured standard deviation was .2%, with results ranging from 95.1 to 95.7% on different folds.
[223000840570] |Their paper also tackled pronunciation, for which their hyphenation was only one feature.
[223000840580] |German and Dutch are easier to hyphenate than English.
[223000840590] |Syllabification in all of these languages is also easier than hyphenation.
[223000840600] |But is it better than 1990?
[223000840610] |Before someone in the audience gets mad at me, I want to point out that we could follow Coker, Church and Liberman (1990)'s lead in reporting results:
[223000840620] |The Bell Laboratories Text-to-Speech system, TTS, takes a radical dictionary-based approach; dictionary methods (with morphological and analogical extensions) are used for the vast majority of words.
[223000840630] |Only a fraction of a percent (0.5% of words overall; 0.1% of lowercase words) are left for letter-to-sound rules.
[223000840640] |Although this insight won't get us into NIPS, it's how we'd field an application.
[223002490010] |Breck in Time Out NY‘s Public Eye
[223002490020] |I (Bob) love Time Out New York‘s feature Public Eye.
[223002490030] |Each week, it’s a different New Yorker, usually in some kind of wild outfit.
[223002490040] |This week they’re featuring our Breck, Alias-i’s self-described “president, founder, whatever — chief janitor”.
[223002490050] |Check out the full feature:
[223002490060] |Breck in the Public Eye
[223002490070] |The photo’s a block or two from our office, which is next to McCarren Park, where you’ll often find Breck flying his planes.
[223002490080] |It’s too bad they didn’t catch him in his orange boiler suit, or one of his more stylin’ vintage or designer jackets.
[223002490090] |In Breck’s defense, he’s been wearing hats long before the trend among Williamsburg hipsters.
[223002490100] |Me, I’m a jeans and t-shirt and baseball cap kind of guy.
[223002490110] |The feature includes a plug for the Brooklyn Aerodrome; the site contains movie clips taken from cameras mounted on Breck’s RC planes, such as the transparent jobby featured in the photo above.
[223002750010] |Data Structure for Parse Trees?
[223002750020] |Writing parsers used to be my bread and butter, but I’m a bit stumped on how to create a reasonable object-oriented data structure for parse trees.
[223002750030] |Here’s a simple parse tree in the Penn Treebank‘s notation (borrowed from Wikipedia: Treebank):
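Reconstructed from the description that follows, the tree is:

    (S
      (NP (NNP John))
      (VP (VPZ loves)
          (NP (NNP Mary)))
      (. .))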
[223002750040] |Conceptually trees come in two varieties.
[223002750050] |Lexical trees are made up of a root category and a word.
[223002750060] |For example, the lexical tree (NNP John) has a root category NNP and a word John.
[223002750070] |Phrasal trees are made up of a root category and a sequence of subtrees.
[223002750080] |For example, the phrasal tree (VP (VPZ loves) (NP (NNP Mary))) has root category VP and two subtrees, the lexical subtree (VPZ loves) and the phrasal subtree (NP (NNP Mary)).
[223002750090] |The latter tree, rooted at NP, only has a single subtree, (NNP Mary).
[223002750100] |On the other hand, the top-level tree, rooted at S, has three subtrees, rooted at NP, VP, and . (period).
[223002750110] |Java Data Structures?
[223002750120] |So my question is, how do I represent parse trees in Java?
[223002750130] |For simplicity assume categories are strings (rather than, say generic objects or integer symbol table representations) and that words are also strings.
[223002750140] |A First Attempt
[223002750150] |An obvious first step would be to define an interface for trees, then extend it for lexical trees and for phrasal trees, as in the sketch below.
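Here is a minimal sketch of that first attempt; the class names and rootCategory() are placeholders of mine, not LingPipe’s actual API:

    import java.util.List;

    // Every tree, lexical or phrasal, has a root category.
    interface Tree {
        String rootCategory();
    }

    // A lexical tree pairs a root category with a word, e.g. (NNP John).
    class LexicalTree implements Tree {
        private final String mRootCategory;
        private final String mWord;
        LexicalTree(String rootCategory, String word) {
            mRootCategory = rootCategory;
            mWord = word;
        }
        public String rootCategory() { return mRootCategory; }
        public String word() { return mWord; }
    }

    // A phrasal tree pairs a root category with a sequence of subtrees,
    // e.g. (VP (VPZ loves) (NP (NNP Mary))).
    class PhrasalTree implements Tree {
        private final String mRootCategory;
        private final List<Tree> mSubtrees;
        PhrasalTree(String rootCategory, List<Tree> subtrees) {
            mRootCategory = rootCategory;
            mSubtrees = subtrees;
        }
        public String rootCategory() { return mRootCategory; }
        public List<Tree> subtrees() { return mSubtrees; }
    }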
[223002750180] |This pretty much matches our informal description of trees directly.
[223002750190] |The Problem
[223002750200] |So what’s the problem?
[223002750210] |Waiter, Taste My Soup
[223002750220] |Eddie Murphy‘s memorable schtick from Coming to America provides a hint:
[223002750230] |Eddie Murphy, as the old white man in the barber shop, says, “Wait a minute.
[223002750240] |Wait a minute.
[223002750250] |Wait.
[223002750260] |Stop right there!
[223002750270] |Listen.
[223002750280] |Stop right there, man.
[223002750290] |A man goes into a restaurant.
[223002750300] |You listenin’?
[223002750310] |A man goes into a restaurant, and he sits down, he’s having a bowl of soup and he says to the waiter, “waiter come taste the soup.”
[223002750320] |Waiter says, “Is something wrong with the soup?”
[223002750330] |He says, “Taste the soup.”
[223002750340] |He [waiter] says, “Is there something wrong with the soup?
[223002750350] |Is the soup too hot?”
[223002750360] |He says, “Will you taste the soup?”
[223002750370] |[Waiter says ,] “What’s wrong, is the soup too cold?”
[223002750380] |[He says, ] “Will you just taste the soup?”
[223002750390] |[Waiter says, ] “Alright, I’ll taste the soup –where’s the spoon?
[223002750400] |Aha.
[223002750410] |Aha!”
[223002750420] |Not to mention, a widely applicable moral that may be applied to software.
[223002750430] |Maybe “eating your customer’s soup” should be as well known an idiom as “eating your own dogfood.”
[223002750440] |The Problem is Unions
[223002750450] |So now that you’ve sussed out the moral of Mr. Murphy’s story, and diligently applied it by trying to use the tree interface, hopefully, like Mr. Murphy’s waiter, you see the problem.
[223002750460] |To do anything other than get the root of a tree, you need to use instanceof and then apply an unchecked cast.
[223002750470] |Double Yucky.
[223002750480] |The root of the problem (pun intended) is that the tree data structure is essentially a union data structure (aka a “variant record”).
[223002750490] |You see these in straight-up C all the time.
[223002750500] |An Alternative
[223002750510] |What I actually have in place is a C-like approach that doesn’t require any unchecked casts.
[223002750520] |But it’s giving off a really bad smell.
[223002750530] |Darn near boufin.
[223002750540] |The idea is to use the isLexical() test and, if it’s true, use word() to get the word and, if it’s false, use subtrees() to get the list of subtrees.
[223002750550] |In C, there’s often a key indicating which type a union data structure contains.
[223002750560] |I can’t decide which is better, returning null from word() if it’s not a lexical tree, or throwing an unsupported operation exception (it’s not quite an illegal state, as my tree implementations would almost certainly be immutable).
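Concretely, that single-interface alternative looks roughly like this (again, a sketch with placeholder details):

    import java.util.List;

    // One interface for both kinds of tree; callers test isLexical() and
    // then call either word() or subtrees().  The open question is what
    // word() should do for a phrasal tree (and subtrees() for a lexical
    // one): return null, or throw UnsupportedOperationException.
    interface Tree {
        String rootCategory();
        boolean isLexical();
        String word();           // only meaningful when isLexical() is true
        List<Tree> subtrees();   // only meaningful when isLexical() is false
    }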
[223002750570] |Any Better Ideas?
[223002750580] |I’m very open to suggestions for a better way to represent trees.
[223002780010] |Chapelle, Metzler, Zhang, Grinspan (2009) Expected Reciprocal Rank for Graded Relevance
[223002780020] |Expected Reciprocal Rank Evaluation Metric
[223002780030] |In this post, I want to discuss the evaluation metric, expected reciprocal rank (ERR), which is the basis of Yahoo!’s Learning to Rank Challenge.
[223002780040] |The ERR metric was introduced in:
[223002780050] |Chapelle, Olivier, Donald Metzler, Ya Zhang, and Pierre Grinspan. 2009.
[223002780060] |Expected reciprocal rank for graded relevance.
[223002780070] |In CIKM.
[223002780080] |The metric was designed to model a user’s search behavior better than previous metrics, many of which are discussed in the paper and related to expected reciprocal rank.
[223002780090] |The Cascade Model
[223002780100] |Expected reciprocal rank is based on the cascade model of search (there are citations in the paper).
[223002780110] |The cascade model assumes a user scans through ranked search results in order, and for each document, evaluates whether the document satisfies the query, and if it does, stops the search.
[223002780120] |This model presupposes that a single document can satisfy a search.
[223002780130] |This is a reasonable model for question answering (e.g. [belgium capital], [raveonettes concert dates]) or for navigation queries (e.g. [united airlines], [new york times]).
[223002780140] |In a cascade model, a highly ranked document that is likely to satisfy the search will dominate the metric.
[223002780150] |But the cascade model is a poor model for the kinds of search I often find myself doing, which is more research survey oriented (e.g. ["stochastic gradient" "logistic regression" (regularization OR prior)], [splice variant expression RNA-seq]).
[223002780160] |In these situations, I want enough documents to feel like I’ve covered a field up to whatever need I have.
[223002780170] |Editorial Grades
[223002780180] |To make a cascade model complete, we need to model how likely it is that a given document will satisfy a given user query.
[223002780190] |For the learning to rank challenge, this is done by assigning each query-document pair an “editorial grade” from 0 to 4, with 0 meaning irrelevant and 4 meaning highly relevant.
[223002780200] |These are translated into probabilities of the document satisfying the search by mapping a grade g to (2^g - 1)/2^4, so that grades 0, 1, 2, 3, and 4 map to probabilities 0, 1/16, 3/16, 7/16, and 15/16, respectively.
[223002780210] |That is, when reviewing a search result for a query with an editorial grade of 3, there is a 7/16 chance the user will be satisfied with that document and hence terminate the search, and a 9/16 chance they will continue with the next item on the ranked list.
[223002780220] |Expected Reciprocal Rank
[223002780230] |Expected reciprocal rank is just the expectation of the reciprocal of the position of a result at which a user stops.
[223002780240] |Suppose that for a query q, a system returns a ranked list of documents d_1, ..., d_n, where the probability R_i that document d_i satisfies the user query is given by the grade-to-probability transform applied to the editorial grade assigned to the pair (q, d_i).
[223002780250] |If we let S be a random variable denoting the rank at which we stop, the metric is the expectation of 1/S: ERR = E[1/S] = SUM_{r=1..n} (1/r) * P(S = r).
[223002780260] |Stopping at rank r involves being satisfied with document d_r, the probability of which is R_r.
[223002780270] |It also involves not having been satisfied with any of the previous documents, ranked 1 through r-1, the probability of which is PROD_{i<r} (1 - R_i).
[223002780280] |This is all multiplied by 1/r, because it's the inverse stopping rank whose expectation is being computed, giving ERR = SUM_{r=1..n} (1/r) * R_r * PROD_{i<r} (1 - R_i).
[223002780290] |For instance, suppose my system returns documents D1, D2, and D3 for a query Q, where the editorial grades for the document-query pairs are 3, 2, and 4 respectively.
[223002780300] |The expected reciprocal rank is computed by first converting the grades 3, 2, and 4 into satisfaction probabilities 7/16, 3/16, and 15/16, and then working out the probability of stopping at each rank: P(stop at 1) = 7/16, P(stop at 2) = (9/16)*(3/16), and P(stop at 3) = (9/16)*(13/16)*(15/16).
[223002780310] |For instance, to stop at rank 2, I have to not be satisfied by the document at rank 1 and be satisfied by the document at rank 2.
[223002780320] |We then just multiply the reciprocal ranks by the stop probabilities to get ERR = 1*(7/16) + (1/2)*(9/16)*(3/16) + (1/3)*(9/16)*(13/16)*(15/16), which comes to roughly 0.63.
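For what it’s worth, the whole metric is only a few lines of code. The following is a sketch assembled from the definitions above (not the official challenge evaluation script), using the (2^g - 1)/16 grade mapping:

    public class ErrDemo {
        // Convert an editorial grade g in {0,...,4} to a satisfaction
        // probability using the (2^g - 1) / 2^4 mapping described above.
        static double satisfactionProb(int grade) {
            return (Math.pow(2.0, grade) - 1.0) / 16.0;
        }

        // Expected reciprocal rank for a ranked list of editorial grades.
        static double expectedReciprocalRank(int[] grades) {
            double err = 0.0;
            double probNotSatisfiedYet = 1.0;  // PROD_{i<r} (1 - R_i)
            for (int r = 1; r <= grades.length; ++r) {
                double stopProb = satisfactionProb(grades[r - 1]);
                err += (1.0 / r) * probNotSatisfiedYet * stopProb;
                probNotSatisfiedYet *= (1.0 - stopProb);
            }
            return err;
        }

        public static void main(String[] args) {
            // The three-document example above: grades 3, 2, 4.
            System.out.println(expectedReciprocalRank(new int[] { 3, 2, 4 }));
            // prints roughly 0.633
        }
    }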
[223002780330] |Document Relevance Independence
[223002780340] |One problem remaining for this metric (and others) is correlated documents.
[223002780350] |I often find myself doing a web search and getting lots and lots of hits that are essentially wrappers around the same PDF paper (e.g. CiteSeer, Google Scholar, ACM, several authors’ publication pages, departmental publications pages, etc).
[223002780360] |Assigning each of these an independent editorial grade does not make sense, because if one satisfies my need, they all will, and if one doesn’t satisfy my need, none of them will.
[223002780370] |They’re not exact duplicate pages, though, so it’s not quite just a deduplication problem.
[223002780380] |And of course, this is just the extreme end of correlation among results.
[223002780390] |Lots of queries are ambiguous.
[223002780400] |For instance, consider the query [john langford], which is highly ambiguous.
[223002780410] |There’s an Austin photographer, a realtor, the machine learning blogger, etc.
[223002780420] |I probably want one of these, not all of them (unless I’m doing a research type search), so my chance of stopping at a given document is highly dependent on which John Langford is being mentioned.
[223002780430] |This is another example of non-independence of stopping criteria.
[223002780440] |Reciprocal Rank vs. Rank
[223002780450] |One nice feature of the expected reciprocal rank metric is that it always falls between 0 and 1, with 1 being best.
[223002780460] |This means that the loss due to messing up any single example is bounded.
[223002780470] |This is like 0/1 loss for classifiers, where the maximum loss for misclassifying a given example is 1.
[223002780480] |Log loss for classifiers is the negative log probability of the correct response.
[223002780490] |This number isn’t bounded.
[223002780500] |Similarly, were we to measure expected rank rather than expected reciprocal rank, the loss for a single example would not be bounded other than by the number of possible matching docs.
[223002780510] |It actually seems to make more sense to me to use ranks because of their natural scale and ease of combination; expected rank measures the actual number of pages a user can be expected to consider.
[223002780520] |With reciprocal rank, as the results lists get longer, the tail is inverse weighted.
[223002780530] |That is, finding a document at rank 20 adds at most 1/20 to the result.
[223002780540] |This makes a list of 20 documents that are all grade zero (completely irrelevant) have an ERR of 0, and a list of 20 docs with the first 19 grade zero and the last grade 4 (perfect) have an ERR of 1/20 * 15/16.
[223002780550] |If the perfect result is at position k and all previous results are completely irrelevant, the ERR is essentially 1/k (up to that 15/16 factor).
[223002780560] |This means there's a huge difference between rank 1 and rank 2 (ERR of about 1 vs. about 0.5), with a much smaller absolute difference between rank 10 and rank 20 (ERR of about 0.10 vs. 0.05).
[223002780570] |This matters when we’re averaging the ERR values for a whole bunch of examples.
[223002780580] |With ranks, there’s the huge problem of no results being found, which provides a degenerate expected rank calculation.
[223002780590] |Somehow we need the expected rank calculation to return a large value if there are no relevant results.
[223002780600] |Maybe if everyone’s evaluating the same documents it won’t matter.
[223003070010] |Bing Translate Has a Great UI, and Some Nice NLP, too
[223003070020] |I’m really digging the user interfaces Bing has put up.
[223003070030] |Google’s now copying their image display and some of their results refinement.
[223003070040] |I’ve been working on tokenization in Arabic for the LingPipe book and was using the text from an Arabic Wikipedia page as an example.
[223003070050] |Here are links to Bing’s and Google’s offerings:
[223003070060] |Bing Translate
[223003070070] |Google Translate
[223003070080] |Yahoo!’s still using Babel Fish in last year’s UI; it doesn’t do Arabic.
[223003070090] |Language Detection
[223003070100] |First, it uses language detection to figure out what language you’re translating from.
[223003070110] |Obvious, but oh so much nicer than fiddling with a drop-down menu.
[223003070120] |Side-by-Side Results
[223003070130] |Second, it pops up the results side-by-side.
[223003070140] |Sentence-Level Alignments
[223003070150] |Even cooler, if you mouse over a region, it does sentence detection and shows you the corresponding region in the translation.
[223003070160] |Awesome.
[223003070170] |Back at Google
[223003070180] |I went back and looked at Google translate and see that they’ve added an auto-detect language feature since my last visit.
[223003070190] |Google only displays the translated page, but as you mouse over it, there’s a pop-up showing the original text and asking for a correction.
[223003070200] |I don’t know who did what first, but these are both way better interfaces than I remember from the last time I tried Google translation.
[223003070210] |And the results are pretty good NLP-wise, too.
[224000160010] |Joint Inference
[224000160020] |Joint inference is becoming really popular these days.
[224000160030] |There's even a workshop coming up on the topic that you should submit a paper to!
[224000160040] |(That is, unless your paper is more related to the computationally hard NLP workshop I'm running with Ryan McDonald and Fernando Pereira).
[224000160050] |The problem with joint inference is that while it seems like a great idea at the outset -- getting rid of pipelining, etc. -- in theory it's not always the best choice.
[224000160060] |I have a squib on the topic, titled Joint Inference is not Always Optimal and I welcome any and all comments/criticisms/etc.
[224000290010] |Unlabeled Structured Data
[224000290020] |I'll skip discussion of multitask learning for now and go directly for the unlabeled data question.
[224000290030] |It's instructive to compare what NLPers do with unlabeled data to what MLers do with unlabeled data.
[224000290040] |In machine learning, there are a few "standard" approaches to boosting supervised learning with unlabeled data:
[224000290050] |Use the unlabeled data to construct a low dimensional manifold; use this manifold to "preprocess" the training data.
[224000290060] |Use the unlabeled data to construct a kernel.
[224000290070] |Use the unlabeled data during training time to implement the "nearby points get similar labels" intuition.
[224000290080] |There are basically two common uses of unlabeled data in NLP (that I can think of):
[224000290090] |Use the unlabeled data to cluster words; use these word clusters as features for training.
[224000290100] |Use the unlabeled data to bootstrap new training instances based on highly precise patterns (regular expressions, typically).
[224000290110] |Why the discrepancy?
[224000290120] |I think it's partially that the ML techniques are relatively new and were developed without language problems in mind.
[224000290130] |For instance, most manifold learning techniques work in mid-dimensional, continuous space, not uber-high-dimensional sparse discrete space.
[224000290140] |Moreover, most of the ML-style techniques scale as O(N^3), where N=# of unlabeled points.
[224000290150] |This is clearly far far too expensive for any reasonable data set.
[224000290160] |Conversely, the NLP semi-supervised techniques are very tied to their domain and don't generalize well.
[224000290170] |The paper that's been getting a lot of attention recently -- on both sides -- is the work by Ando and Zhang.
[224000290180] |As I see it, this is one way of formalizing the common practice in NLP to a form digestible by ML people.
[224000290190] |This is great, because maybe it means the two camps will be brought closer together.
[224000290200] |The basic idea is to take the first style of NLP learning above (using unlabeled data to build word clusters), but instead of clustering as we normally think of it (words based on contexts), they try to learn classifiers that predict known "free" aspects of the unlabeled data (e.g., is this word capitalized?). Probably the biggest (acknowledged) shortcoming of this technique is that a human has to come up with these secondary classification problems.
[224000290210] |Can we try to do that automatically?
[224000290220] |But beyond standard notions of supervised -> semi-supervised -> unsupervised, I think that working in the structured domain offers us a much more interesting continuum.
[224000290230] |Maybe instead of having some unlabeled data, we have some partially labeled data.
[224000290240] |Or data that isn't labeled quite how we want (William Cohen recently told me of a data set where they have raw bio texts and lists of proteins that are important in these texts, but they want to actually find the names in unlabeled [i.e., no lists] texts.)
[224000290250] |Or maybe we have some labeled data but then want to deploy a system and get user feedback (good/bad translation/summary).
[224000290260] |This is a form of weak feedback that we'd ideally like to use to improve our system.
[224000290270] |I think that investigating these avenues is also a very promising direction.
[224000660010] |Statistical NLP is not NLP but just Statistics?
[224000660020] |bact' brings up an interesting point, perhaps more provocative than my original (intended-to-be provocative) pseudo-question.
[224000660030] |To quote, he says:
[224000660040] |and some also said, statistical natural language processing is not language processing at all, only statistics :P
[224000660050] |My impression is that the only sense in which this sentence is true is if you insist that what goes on inside the black box of statistical NLP is somehow explaining what goes on inside our heads.
[224000660060] |I see it as essentially parallel to the argument against "neural-style" machine learning.
[224000660070] |Some neural networks people used to claim (some still do, I hear) that what happens in an artificial neural net is essentially the same as what goes on in our minds.
[224000660080] |My impression (though this is now outside what I really know for sure) is that most cognitive scientists would strongly disagree with this claim.
[224000660090] |I get the sense that the majority of people who use NNets in practice use them because they work well, not out of some desire to mimic what goes on in our heads.
[224000660100] |I feel the same is probably true for most statistical NLP.
[224000660110] |I don't know of anyone who would claim that when people parse sentences they do chart parsing (I know some people claim something more along the lines of incremental parsing actually does happen and this seems somewhat plausible to me).
[224000660120] |Or that when people translate sentences they apply IBM Model 4 :).
[224000660130] |On the other hand, the alternative to statistical NLP is essentially rule-based NLP.
[224000660140] |I have an equally hard time believing that we behave simply as rule processing machines when parsing or translating, and that we efficiently store and search through millions of rules in order to do processing.
[224000660150] |In fact, I think I have a harder time believing this than believing the model 4 story :P.
[224000660160] |Taking a step back, it seems that there are several goals one can have with dealing with language on a computer.
[224000660170] |One can be trying to carry out tasks that have to do with language, which I typically refer to as NLP.
[224000660180] |Alternatively, one can be trying to model how humans work with language.
[224000660190] |I would probably call this CogNLP or something like that.
[224000660200] |One could instead try to use computers and language data to uncover "truths" about language.
[224000660210] |This is typically considered computational linguistics.
[224000660220] |I don't think any of these goals is a priori better than the others, but they are very different.
[224000660230] |My general feeling is that NLPers cannot solve all problems, CogNLPers don't really know what goes on in our minds and CLers are a long way from understanding how language functions.
[224000660240] |Given this, I think it's usually best to confine a particular piece of work to one of the fields, since trying to solve two or three at a time is likely going to basically be impossible.
[224001130010] |Math as a Natural Language
[224001130020] |Mathematics (with a capital "M") is typically considered a formal language.
[224001130030] |While I would agree that it tends to be more formal than, say, English, I often wonder whether it's truly formal (finding a good definition of formal would be useful, but is apparently somewhat difficult).
[224001130040] |In other words, are there properties that the usual natural languages have that math does not?
[224001130050] |I would say there are only a few, and they're mostly unimportant.
[224001130060] |Secondly, note that I bolded the "a" above.
[224001130070] |This is because I also feel that Mathematics is more like a collection of separate languages (or dialects, if you prefer, since none of them has an army -- except perhaps category theory) than a single language.
[224001130080] |First regarding the formality.
[224001130090] |Typically when I think of formal languages, I think of something lacking ambiguity.
[224001130100] |This is certainly the case for the sort of math that one would type into matlab or a theorem prover or mathematica.
[224001130110] |But what one types in here is what I would call "formal math" precisely because it is not the language that mathematicians actually speak.
[224001130120] |This is perhaps one reason why these tools are not more widely used: translating between the math that we speak and the math that these tools speak is somewhat non-trivial.
[224001130130] |The ambiguity issue is perhaps the most forceful argument that math is not completely formal: operators get overloaded, subscripts and conditionings get dropped, whole subexpressions get elided (typically with a "..."), etc.
[224001130140] |And this is not just an issue of venue: it happens in formal math journals as well.
[224001130150] |It is often also the case that even what gets published is not really the mathematics that people speak.
[224001130160] |Back when I actually interacted with real mathematicians, there was inevitably this delay for "formally" writing up results, essentially breaking apart developed shorthand and presenting things cleanly.
[224001130170] |But the mathematics that appears in math papers is not really in its natural form -- at least, it's often not the language that mathematicians actually work with.
[224001130180] |To enumerate a few points: Math...
[224001130190] |has a recursive structure (obviously)
[224001130200] |is ambiguous
[224001130210] |is self-referential ("see Eq 5" or "the LHS" or simply by defining something and giving it a name)
[224001130220] |has an alphabet and rules for combining symbols
[224001130230] |To some degree, it even has a phonology.
[224001130240] |One of the most painful classes I ever took (the only class I ever dropped) was an "intro to logic for grad CS students who don't know math" class.
[224001130250] |This class pained me because the pedagogical tool used was for the professor to hand out typed notes and have the students go around and each read a definition/lemma/theorem.
[224001130260] |After twenty minutes of "alpha superscript tee subscript open parenthesis eye plus one close parenthesis" I was ready to kill myself.
[224001130270] |It has no morphology that I can think of, but neither really does Chinese.
[224001130280] |Moving on to the "a" part, I propose a challenge.
[224001130290] |Go find a statistician (maybe you are one!).
[224001130300] |Have him/her interpret the following expression: .
[224001130310] |Next, go find a logician and have him/her interpret .
[224001130320] |For people in the respective fields, these expressions have a very obvious meaning (I'm guessing that most of the readers here know what the second is).
[224001130330] |I'm sure that if I had enough background to drag up examples from other fields, we could continue to perplex people.
[224001130340] |In fact, even though about 8 years ago I was intimately familiar with the first sort of expression, I actually had to look it up to ensure that I got the form right (and to ensure that I didn't make a false statement).
[224001130350] |This reminded me somewhat of having to look up even very simple words and expressions in Japanese long after having used it (somewhat) regularly.
[224001130360] |I think that the diversity of meaning of symbols and subexpressions is a large part of the reason why most computer algebra systems handle only a subset of possible fields (some do algebra, some calculus, some logic, etc.).
[224001130370] |I believe in my heart that it would be virtually impossible to pin down a single grammar (much less a semantics!) for all of math.
[224001130380] |So what does this have to do with an NLP blog?
[224001130390] |Well, IMO Math is a natural language, at least in all the ways that matter.
[224001130400] |So why don't we study it more?
[224001130410] |In particular, when I download a paper, what I typically do is first read the abstract, then flip to the back to skim to results, and then hunt for the "main" equation that will explain to me what they're doing.
[224001130420] |For me, at least, this is much faster than trying to read all of the text, provided that I can somewhat parse the expression (which is only a problem when people define too much notation).
[224001130430] |So much information, even in ACL-style papers, but more-so in ICML/NIPS-style papers, is contained in the math.
[224001130440] |I think we should try to exploit it.
[224001140010] |Quality vs. quantity in data annotation
[224001140020] |I'm in the process of annotating some data (along with some students---thanks guys!).
[224001140030] |While this data isn't really in the context of an NLP application, the annotation process made me think of the following issue.
[224001140040] |I can annotate a lot more data if I'm less careful.
[224001140050] |Okay, so this is obvious.
[224001140060] |But it's something I hadn't really specifically thought of before.
[224001140070] |So here's the issue.
[224001140080] |I have some fixed amount of time in which to annotate data (or some fixed amount of dollars).
[224001140090] |In this time, I can annotate N data points with a noise-rate of eta_N. Presumably eta_N approaches one half (for a binary task) as N increases.
[224001140100] |In other words, as N increases, (1-2 eta_N) approaches zero.
[224001140110] |A standard result in PAC learning states that a lower bound on the number of examples required to achieve 1-epsilon accuracy with probability 1-delta, given a noise rate of eta_N and VC-dimension h, is (h + log(1/delta)) / (epsilon * (1 - 2 eta_N)^2).
[224001140120] |This gives us some insight into the problem.
[224001140130] |This says that it is worth labeling more data (with higher noise) only if 1/(1-2 eta_N)^2 increases more slowly than N.
[224001140140] |So if we can label twice as much data and the noise rate of the annotation increases by less than about 0.15 (starting from noise-free labels), then we're doing well, since doubling N only helps as long as (1 - 2 eta_N) shrinks by less than a factor of 1/sqrt(2).
[224001140150] |(Well in the sense that we can keep the bound the same and shrink either epsilon or delta.)
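As a rough illustration of that trade-off, here's a tiny sketch; the helper name and the idea of reading N * (1 - 2 eta_N)^2 as an "effective sample size" are mine, just a convenient way to compare annotation budgets under the bound.

```python
def effective_sample_size(n, eta):
    """n discounted by (1 - 2*eta)^2, the noise term in the PAC lower bound."""
    return n * (1.0 - 2.0 * eta) ** 2

print(effective_sample_size(1000, 0.00))  # 1000.0
print(effective_sample_size(2000, 0.15))  # 980.0  -- roughly break-even with the clean set
print(effective_sample_size(2000, 0.10))  # 1280.0 -- the noisier, bigger set wins
```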
[224001140160] |So how does this hold up in practice?
[224001140170] |Well, it's hard to tell exactly for real problems because running such experiments would be quite time-consuming.
[224001140180] |So here's a simulation.
[224001140190] |We have a binary classification problem with 100 features.
[224001140200] |The weight vector is random; the first 50 dimensions are Nor(0, 0.2); the next 35 are Nor(0, 5); the final 15 are Nor(m, 1), where m is the weight of the feature 35 positions earlier (introducing feature correlation).
[224001140210] |We vary the number of training examples and the error rate.
[224001140220] |We always generate equal number of positive and negative points.
[224001140230] |We train a logistic regression model with hyperparameters tuned on 1024 (noisy) dev points and evaluate on 1024 non-noisy test points.
[224001140240] |We do this ten times for each setting and average the results.
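Here is a minimal sketch of that kind of simulation. Several details are my guesses rather than the original setup: I treat the Nor(.,.) parameters as standard deviations, use flip-with-probability-eta label noise, rely on rough class balance from symmetric features instead of exact balancing, and skip the dev-set hyperparameter tuning.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

def make_weights():
    # First 50 dims ~ Nor(0, 0.2), next 35 ~ Nor(0, 5),
    # last 15 copy the weight 35 positions back plus Nor(0, 1) noise.
    w = np.concatenate([rng.normal(0, 0.2, 50), rng.normal(0, 5.0, 35)])
    return np.concatenate([w, w[50:65] + rng.normal(0, 1.0, 15)])

def sample(w, n, noise):
    X = rng.normal(size=(n, len(w)))
    y = (X @ w > 0).astype(int)   # roughly balanced labels
    flip = rng.random(n) < noise  # label noise: flip with probability `noise`
    return X, np.where(flip, 1 - y, y)

w = make_weights()
Xte, yte = sample(w, 1024, 0.0)   # clean test set
for n in [64, 256, 1024]:
    for noise in [0.0, 0.05, 0.1, 0.2]:
        Xtr, ytr = sample(w, n, noise)
        acc = LogisticRegression(max_iter=1000).fit(Xtr, ytr).score(Xte, yte)
        print(n, noise, round(acc, 3))
```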
[224001140250] |Here's a picture of accuracy as a function of data set size and noise rate:
[224001140260] |And here's the table of results:
[224001140270] |The general trend here seems to be that if you don't have much data (N <= 256), then it's almost always better to get more data at a much higher error rate (0.1 or 0.2 versus 0.0). Once you have a reasonable amount of data, it starts paying to be more noise-free. E.g., 256 examples with 0.05 noise is just about as good as 1024 examples with 0.2 noise. This roughly concurs with the theorem (at least in terms of the trends).
[224001140280] |I think the take-home message that's perhaps worth keeping in mind is the following.
[224001140290] |If we only have a little time/money for annotation, we should probably annotate more data at a higher noise rate.
[224001140300] |Once we start getting more money, we should simultaneously be more careful and add more data, but not let one dominate the other.
[224001460010] |The behemoth, PubMed
[224001460020] |The friend I crashed with while attending SODA is someone I've known since we were five years old.
[224001460030] |(Incidentally, there's actually someone in the NLP world who I've actually known from earlier...small world.)
[224001460040] |Anyway, the friend I stayed with is just finishing med school at UCSF and will soon be staying there for residency.
[224001460050] |His specialty is neurosurgery, and his interests are in neural pathologies.
[224001460060] |He spent some time doing research on Alzheimer's disease, effectively by studying mice (there's something I feel sort of bad about finding slightly amusing about mice with Alzheimer's disease).
[224001460070] |Needless to say, in the process of doing research, he made nearly daily use out of PubMed.
[224001460080] |(For those of you who don't know, PubMed is like the ACL anthology, but with hundreds of thousands of papers, with new ones being added by the truckload daily, and with a bunch of additional things, like ontologies and data sets.)
[224001460090] |There are two things I want to talk about regarding PubMed.
[224001460100] |I think both of these admit very interesting problems that we, as NLPers, are qualified to tackle.
[224001460110] |I think the most important thing, however, is opening and maintaining a wide channel of communication.
[224001460120] |There seems to be less interaction between people who do (for instance) bio-medical informatics (we have a fairly large group here) and what I'll term as mainstream NLPers.
[224001460130] |Sure, there have been BioNLP workshops at ACLs, but I really think that both communities would be well-served to interact more.
[224001460140] |And for those of you who don't want to work on BioNLP because it's "just a small domain problem", let me assure you: it is not easy... don't think of it in the same vein as a true "sublanguage" -- it is quite broad.
[224001460150] |I suppose I should give a caveat that my comments below are based on a sample size of one (my friend), so it may not be totally representative.
[224001460160] |But I think it generalizes.
[224001460170] |Search in PubMed, from what I've heard, is good in the same ways that web search is good and bad in the same ways that web search is bad.
[224001460180] |It is good when you know what you're looking for (i.e., you know the name for it) and bad otherwise.
[224001460190] |One of the most common sorts of queries that my friend wants to do is something like "show me all the research on proteins that interact in some way with XXX in the context of YYY" where XXX is (eg) a gene and YYY is (eg) a disease.
[224001460200] |The key is that we don't know which proteins these are and so it's hard to query for them directly.
[224001460210] |I know that this is something that the folks at Penn (and probably elsewhere) are working on, and I get the impression that a good solution to this problem would make lots and lots of biologists much happier (and more productive).
[224001460220] |One thing that was particularly interesting, however, is that he was pretty averse to using structured queries like the one I gave above.
[224001460230] |He effectively wants to search for "XXX YYY" and have it realize that XXX is a gene, YYY is a disease, and that it's "obvious" that what he wants is proteins that interact with (or even, for instance, pathways that contain) XXX in the context of disease YYY.
[224001460240] |On the other hand, if YYY were another gene, then probably he'd be looking for diseases or pathways that are regulated by both XXX and YYY.
[224001460250] |It's a bit complex, but I don't think this is something particularly beyond our means.
[224001460260] |The other thing I want to talk about is summarization.
[224001460270] |PubMed actually archives a fairly substantial collection of human-written summaries.
[224001460280] |These fall into one of two categories.
[224001460290] |The first, called "systematic reviews" are more or less what we would think of as summaries.
[224001460300] |However, they are themselves quite long and complex.
[224001460310] |They're really not anything like sentence extracts.
[224001460320] |The second, called "meta analyses" are really not like summaries at all.
[224001460330] |In a meta analysis, an author will consider a handful of previously published papers on, say, the effects of smoking on lifespan.
[224001460340] |He will take the data and results published in these individual papers, and actually do novel statistical analyses on them to see how well the conclusions hold.
[224001460350] |From a computational perspective, the automatic creation of meta analyses would essentially be impossible, until we have machines that can actually run experiments in the lab.
[224001460360] |"Systematic reviews", on the other hand, while totally outside the scope of our current technology, are things we could hope to do.
[224001460370] |And they give us lots of training data.
[224001460380] |There are somewhere around ten to forty thousand systematic reviews on PubMed, each about 20 pages long, and each with references back to papers, almost all of which are themselves in PubMed.
[224001460390] |Finding systematic reviews older than a few years ago (when they began being tagged explicitly) has actually sprouted a tiny cottage industry.
[224001460400] |And PubMed nicely makes all of their data available for download, without having to crawl, something that makes life much easier for us.
[224001460410] |My friend warns that it might not be a good idea to use all systematic reviews, but only those from top journals.
[224001460420] |(They tend to be less biased, and better written.)
[224001460430] |However, in so far as I don't think we'd even have hope of getting something as good as a systematic review from the worst journal in the world, I'm not sure this matters much.
[224001460440] |Maybe all it says is that for evaluation, we should be careful to evaluate against the top.
[224001460450] |Now, I should point out that people in biomedical informatics have actually been working on the summarization problem too.
[224001460460] |From what I can tell, the majority of effort there is on rule-based systems that build on top of more rule-based systems that extract genes, pathways and relations.
[224001460470] |People at the National Library of Medicine, Rindflesch and Fiszman, use SemRep to do summarization, and they have tried applying it to some medical texts.
[224001460480] |Two other people that I know are doing this kind of work are Kathy McKeown and Greg Whalen, both at Columbia.
[224001460490] |The Columbia group has access to a medically informed NLP concept extractor called MedLEE, which gives them a leg up on the low-level processing details.
[224001460500] |If you search for 'summarization medical OR biomedical' in GoogleScholar, you'll get a raft of hits (~9000).
[224001460510] |Now, don't get me wrong -- I'm not saying that this is easy -- but for summarization folks who are constantly looking for "natural artifacts" of their craft, this is an enormous repository.
[224001660010] |Help! Contribute the LaTeX for your ACL papers!
[224001660020] |This is a request to the community.
[224001660030] |If you have published a paper in an ACL-related venue in the past ten years or so, please consider contributing the LaTeX source.
[224001660040] |Please also consider contributing talk slides!
[224001660050] |It's a relatively painless process: just point your browser here and upload!
[224001660060] |(Note that we're specifically requesting that associated style files be included, though figures are not necessary.)
[224001840010] |Mixture models: clustering or density estimation
[224001840020] |My colleague Suresh Venkatasubramanian is running a seminar on clustering this semester.
[224001840030] |Last week we discussed EM and mixture of Gaussians.
[224001840040] |I almost skipped because it's a relatively old hat topic for me (how many times have I given this lecture?!), and had some grant stuff going out that day.
[224001840050] |But I decided to show up anyway.
[224001840060] |I'm glad I did.
[224001840070] |We discussed a lot of interesting things, but something that had been bugging me for a while finally materialized in a way about which I can be precise.
[224001840080] |I basically have two (purely qualitative) issues with mixture of Gaussians as a clustering method.
[224001840090] |(No, I'm not actually suggesting there's anything wrong with using it in practice.)
[224001840100] |My first complaint is that many times, MoG is used to get the cluster assignments, or to get soft-cluster assignments... but this has always struck me as a bit weird because then we should be maximizing over the cluster assignments and doing expectations over everything else.
[224001840110] |Max Welling has done some work related to this in the Bayesian setting.
[224001840120] |(I vaguely remember that someone else did basically the same thing at basically the same time, but can't remember any more who it was.)
[224001840130] |But my more fundamental question is this.
[224001840140] |When we start dealing with MoG, we usually say something like... suppose we have a density F which can be represented as F = pi_0 F_0 + pi_1 F_1 + ... + pi_K F_K, where the pi's give a convex combination of "simpler" densities F_k.
[224001840150] |This question arose in the context of density estimation (if my history is correct) and the maximum likelihood solution via expectation maximization was developed to solve the density estimation problem.
[224001840160] |That is, the ORIGINAL goal in this case was to do density estimation; the fact that "cluster assignments" were produced as a byproduct was perhaps not the original intent.
[224001840170] |I can actually give a fairly simple example to try to make this point visually.
[224001840180] |Here is some data generated by a mixture of uniform distributions.
[224001840190] |And I'll even tell you that K=2 in this case.
[224001840200] |There are 20,000 points if I recall correctly:
[224001840210] |Can you tell me what the distribution is?
[224001840220] |Can you give me the components?
[224001840230] |Can you give me cluster assignments?
[224001840240] |The problem is that I've constructed this to be non-identifiable.
[224001840250] |Here are two ways of writing down the components.
[224001840260] |(I've drawn this in 2D, but only pay attention to the x dimension.)
[224001840270] |They give rise to exactly the same distribution.
[224001840280] |One is equally weighted components, one uniform on the range (-3,1) and one uniform on the range (-1,3).
[224001840290] |The other is to have two components, one with 3/4 weight on the range (-3,3) and one with 1/4 weight on the range (-1,1).
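A quick numeric check that the two parameterizations above really do define the same density (a minimal sketch; the helper function is mine):

```python
import numpy as np

def unif(x, a, b):
    """Density of Uniform(a, b) evaluated pointwise at x."""
    return ((x >= a) & (x <= b)) / (b - a)

x = np.linspace(-3, 3, 601)
mix_a = 0.5 * unif(x, -3, 1) + 0.5 * unif(x, -1, 3)
mix_b = 0.75 * unif(x, -3, 3) + 0.25 * unif(x, -1, 1)
print(np.allclose(mix_a, mix_b))  # True: same density, different components
```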
[224001840300] |I could imagine some sort of maximum likelihood parameter estimation giving rise to either of these (EM is hard to get to work here because once a point is outside the bounds of a uniform, it has probability zero).
[224001840310] |They both correctly recover the distribution, but would give rise to totally different (and sort of weird) cluster assignments.
[224001840320] |I want to quickly point out that this is a very different issue from the standard "non-identifiability in mixture models issue" that has to do with the fact that any permutation of cluster indices gives rise to the same model.
[224001840330] |So I guess that all this falls under the category of "if you want X, go for X." If you want a clustering, go for a clustering -- don't go for density estimation and try to read off clusters as a by-product.
[224001840340] |(Of course, I don't entirely believe this, but I still think it's worth thinking about.)
[224002250010] |Graduating? Want a post-doc? Let NSF pay!
[224002250020] |I get many emails of the form "I'm looking for a postdoc...."
[224002250030] |I'm sure that other, more senior, more famous people get lots of these.
[224002250040] |My internal reaction is always "Great: I wish I could afford that!"
[224002250050] |NSF has a solution: let them pay for it!
[224002250060] |This is the second year of the CI fellows program, and I know of two people who did it last year (one in NLP, one in theory).
[224002250070] |I think it's a great program, both for faculty and for graduates (especially since the job market is so sucky this year).
[224002250080] |If you're graduating, you should apply (unless you have other, better, job options already).
[224002250090] |But beware, the deadline is May 23!!!
[224002250100] |Here's more info directly from NSF's mouth:
[224002250110] |The CIFellows Project is an opportunity for recent Ph.D. graduates in computer science and closely related fields to obtain one- to two-year postdoctoral positions at universities, industrial research laboratories, and other organizations that advance the field of computing and its positive impact on society.
[224002250120] |The goals of the CIFellows project are to retain new Ph.D.s in research and teaching and to support intellectual renewal and diversity in the computing fields at U.S. organizations......
[224002250130] |Every CIFellow application must identify 1-3 host mentors.
[224002250140] |Click here to visit a website where prospective mentors have posted their information.
[224002250150] |In addition, openings that have been posted over the past year (and may be a source of viable mentors/host organizations for the CIFellowships) are available here.
[224002250160] |Good luck!
[224002350010] |Manifold Assumption versus Margin Assumption
[224002350020] |[This post is based on some discussions that came up while talking about manifold learning with Ross Whitaker and Sam Gerber, who had a great manifold learning paper at ICCV last year.]
[224002350030] |There are two assumptions that are often used in statistical learning (both theory and practice, though probably more of the latter), especially in the semi-supervised setting.
[224002350040] |Unfortunately, they're incompatible.
[224002350050] |The margin assumption states that your data are well separated.
[224002350060] |Usually it's in reference to linear, possibly kernelized, classifiers, but that need not be the case.
[224002350070] |As most of us know, there are lots of other assumptions that boil down to the same thing, such as the low-weight-norm assumption, or the Gaussian prior assumption.
[224002350080] |At the end of the day, it means your data looks like what you have on the left, below, not what you have on the right.
[224002350090] |The manifold assumption that is particularly popular in semi-supervised learning, but also shows up in supervised learning, says that your data lie on a low dimensional manifold embedded in a higher dimensional space.
[224002350100] |One way of thinking about this is saying that your features cannot covary arbitrarily, but the manifold assumption is quite a bit stronger.
[224002350110] |It usually assumes a Riemannian (i.e., locally Euclidean) structure, with data points "sufficiently" densely sampled.
[224002350120] |In other words, life looks like the left, not the right, below. Okay, yes, I know that the "Bad" one is a 2D manifold embedded in 2D, but that's only because I can't draw 3D images :).
[224002350130] |And anyway, this is a "weird" manifold in the sense that at one point (where the +s and -s meet), it drops down to 1D.
[224002350140] |This is fine in math-manifold land, but usually not at all accounted for in ML-manifold land.
[224002350150] |The problem, of course, is that once you say "margin" and "manifold" in the same sentence, things just can't possibly work out.
[224002350160] |You'd end up with a picture like:
[224002350170] |This is fine from a margin perspective, but it's definitely not a (densely sampled) manifold any more.
[224002350180] |In fact, almost by definition, once you stick a margin into a manifold (which is okay, since you'll define margin Euclideanly, and manifolds know how to deal with Euclidean geometry locally), you're hosed.
[224002350190] |So I guess the question is: who do you believe?
[225000030010] |On chicken pecks and why "8" is the only number that gets used to replace a syllabic rime.
[225000030020] |I've noticed that, in the context of email and online slang/abbreviations, the character "8" is the only number or character that gets used to replace a phonological rime (a nucleus plus a coda).
[225000030030] |Most other replacements either replace whole syllables, or just consonant clusters.
[225000030040] |For example (from Wikipedia's "List of Internet slang phrases"): 2L8 = too late; GR8 = great; H8 = hate; L8R = later (sometimes abbreviated to L8ER); M8 = mate; sk8/sk8r = skate/skater; W8 = wait.
[225000030050] |The numbers "2" and "4" can replace whole words: 2U2 = to you too; G2G = "got to go" or "good to go"; L2P = learn to play; N2M = not(hing) too much; N2B = not too bad; P2P = peer to peer; T4P = tell for people.
[225000030060] |Here is an example of each character replacing a whole syllable: NE1 —"Anyone" = an.y.one
[225000030070] |"X" replaces a consonant cluster in a few cases, but not the nuclei of the rime: KTHX —OK, thanks TH(N)X, TNX or TX —Thanks
[225000030080] |Why is "8" the only number that gets used to replace a whole rime (a nucleus plus a coda)?
[225000030090] |My guess is that it's because, of the 13 basic number names in English, only two begin with a vowel ("8" and "11").
[225000030100] |The name for "11" is itself 3 syllables long, so it's out as a candidate.
[225000030110] |The name for "8" is the only single syllable number name that starts with a vowel.
[225000030120] |So it's the only one that is eligible for replacing a rime.
[225000030130] |English number names: zero, one, two, three, four, five, six, seven, eight, nine, ten, eleven, twelve.
[225000030140] |So, the constraints on using characters to replace a rime are 1) must be pronounced as a single syllable and 2) start with a vowel.
[225000030150] |How many keyboard characters meet these two criteria?
[225000030160] |Letter names = 2.
[225000030170] |If we tried to use them to replace rimes, would the usage catch on?
[225000030180] |"F" and "X" are the only letter names that follows the VC(C) pattern of "8", so "x" it could be used to replace "-ecks/-eks" for example, but how many words end in that?
[225000030190] |Here's a valiant try: chicken pecks -- chicken pX??
[225000030200] |Presumably "@" could replace any "-at" rime and maybe (stretching here) just maybe you could get "&" to replace "-and". Do people do either?
[225000030210] |&
[225000060010] |Spot On
[225000060020] |The Language Guy echoes my sentiments about language death pretty closely here.
[225000060030] |I wonder if he reads me?
[225000320010] |Language and Memory ...
[225000320020] |Yesterday, Andrew Sullivan linked to this Chris Chatham blog Memory Before Language: Preverbal Experiences Recoded Into Newly-Learned Words.
[225000320030] |In it, Chatham says
[225000320040] |adults tend to use language in encoding and retrieving memories...
[225000320050] |I’ve read this sentence multiple times and I still don’t know what it means.
[225000320060] |Chatham is a grad student in Cognitive Neuroscience at the University of Colorado, Boulder.
[225000320070] |From his web page, it’s clear that he’s a smart guy.
[225000320080] |But he’s going to have to explain his point about adult language and memory more clearly.
[225000320090] |Right now, it sounds like bullshit.
[225000880010] |"The demon barber of fleeT...streeT!"
[225000880020] |A rare non-linguistics post: I'm quite a movie buff and predictably underwhelmed by the last decade of pedestrian films.
[225000880030] |If it weren't for Quentin Tarantino and Julie Taymor, this might have been the most banal ten years in movie history.
[225000880040] |But, to my great surprise, there are no less than five movies currently out that I'm excited to see, and three others that I wouldn't mind seeing.
[225000880050] |I'm not sure that has happened before.
[225000880060] |Ever!
[225000880070] |By far, the movie I am most anxious to see is Sweeney Todd.
[225000880080] |I was heavily involved in theatre in high school (and college) and I have a strong memory of watching the great Broadway play starring Angela Lansbury and George Hearn in drama class.
[225000880090] |I have spent the last 20 years with the chorus sounding in my ear, "Sweenyyyyyy ...
[225000880100] |Sweeny Todd ...
[225000880110] |The demon barber of fleeT...streeT!"
[225000880120] |The brilliant over-articulation of the final voiceless stops still slices through me (see, I got a little linguistics in there).
[225000880130] |(UPDATE: I found a great YouTube clip here of the opening song from the Broadway play video I mentioned above.
[225000880140] |And here is a sample of Depp talking about singing, then some of his vocals)
[225000880150] |(UPDATE 2: My Sweeney Todd review is here)
[225000880160] |I heard some snippets of Johnny Depp's vocals this morning on NPR.
[225000880170] |He's a competent singer and smart enough to stay within his range, but he really does not have the strong and confident voice of a Broadway star.
[225000880180] |Nonetheless, he's truly an actor's actor (hmm, an interesting construction, I may follow up on that one) so I'll be seeing the film within hours of this post.
[225000880190] |The Five (in order of preference):
[225000880200] |First: Sweeney Todd: The Demon Barber of Fleet Street; Second: I'm Not There; Third: Across the Universe; Fourth: The Kite Runner; Fifth: Charlie Wilson's War
[225000880210] |Three that I wouldn't mind seeing:
[225000880220] |a) Juno b) Walk Hard: The Dewey Cox Story c) No Country for Old Men
[225001130010] |On Crowdsourcing and Linguistics
[225001130020] |Rumbling around in my head for some time has been this question: can linguistics take advantage of powerful prediction markets to further our research goals?
[225001130030] |It's not clear to me what predictions linguists could compete over, so this remains an open question.
[225001130040] |However, having just stumbled on to an Amazon.com service designed to harness the power of crowdsourcing called Mechanical Turk (HT Complex Systems Blog) I'm tempted to believe this somewhat related idea could be useful very quickly to complete large scale annotation projects (something I've posted about before), despite the potential for lousy annotations.
[225001130050] |The point of crowdsourcing is to complete tasks that are difficult for computers, but easy for humans.
[225001130060] |For example, here are five tasks currently being listed:
[225001130070] |1. Create an image that looks like another image.
[225001130080] |2. Extract Meeting Date Information from Websites
[225001130090] |3. Your task is to identify your 3 best items for the lists you're presented with.
[225001130100] |4. Describe the sport and athlete's race and gender on Sports Illustrated covers
[225001130110] |5. 2 pictures to look at and quickly rate subjectively
[225001130120] |It should be easy enough to crowdsource annotation tasks (e.g., create a web site people can log in to from anywhere which contains the data with an easy-to-use interface for tagging).
[225001130130] |"Alas!", says you, "surely the poor quality of annotations would make this approach hopeless!"
[225001130140] |Would it?
[225001130150] |Recently, Breck Baldwin over at the LingPipe blog discussed the problems of inter-annotator agreement (gasp! there's inter-annotator DIS-agreement even between hip geniuses like Baldwin and Carpenter?
[225001130160] |Yes ... sigh ... yes there is).
[225001130170] |However (here's where the genius part comes in) he concluded that, if you're primarily in the business of recall (i.e, making sure the net you cast catches all the fish in the sea, even if you also pick up some hub caps along the way), then the reliability of annotators is not a critical concern.
[225001130180] |Let's let Breck explain:
[225001130190] |The problem is in estimating what truth is given somewhat unreliable annotators.
[225001130200] |Assuming that Bob and I make independent errors and after adjudication (we both looked at where we differed and decided what the real errors were) we figured that each of us would miss 5% (1/20) of the abstract to gene mappings.
[225001130210] |If we took the union of our annotations, we end up with 0.25% missed mentions (1/400) by multiplying our recall errors (1/20*1/20)--this assumes independence of errors, a big assumption.
[225001130220] |Now we have a much better upper limit that is in the 99% range, and more importantly, a perspective on how to accumulate a recall gold standard.
[225001130230] |Basically we should take annotations from all remotely qualified annotators and not worry about it.
[225001130240] |We know that is going to push down our precision (accuracy) but we are not in that business anyway.
[225001130250] |Unless I've mis-understood Baldwin's post (I'm just a lousy linguist mind you, not a genius, hehe) then the major issue is adjudicating the error rate of a set of crowdsourced raters.
[225001130260] |Couldn't a bit of sampling do this nicely?
[225001130270] |If you restricted the annotators to, say, grad students in linguistics and related fields, the threshold of "remotely qualified" should be met, and there's plenty of grad students floating around the world.
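To see why recall is so forgiving here, a tiny sketch of the union-recall arithmetic behind Baldwin's argument (independent annotator errors is the big assumption, as he notes):

```python
def union_recall(miss_rates):
    """Recall of the union of annotators who each miss mentions independently."""
    missed = 1.0
    for m in miss_rates:
        missed *= m
    return 1.0 - missed

print(union_recall([0.05, 0.05]))  # 0.9975  -- two careful annotators
print(union_recall([0.20] * 5))    # 0.99968 -- five much sloppier annotators
```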
[225001130280] |This approach strikes me as related to the recent revelations that Wikipedia and Digg and other groups that try to take advantage of web democracy/crowd wisdom are actually functioning best when they have a small group of "moderators" or "chaperones" (read Chris Wilson's article on this topic here).
[225001130290] |So, take a large group of raters scattered around the whole wide world, give them the task and technology to complete potentially huge amounts of annotations quickly, chaperone their results just a bit, and voilà, large scale annotation projects made easy.
[225001130300] |You're welcome, hehe.
[225001230010] |Smitten with Kunis
[225001230020] |This is another, still rare, non-linguistics post about movies (I suppose I could try to draw some connection to the Netflix Prize or Recommender Systems, but, yawn, this is what it is, a movie post).
[225001230030] |I watched Forgetting Sarah Marshall yesterday.
[225001230040] |I feel the need to defend that choice, but I’ll do that later.
[225001230050] |After seeing it, I find I’m smitten with Mila Kunis, and not just because her name is Mila Kunis.
[225001230060] |I knew of Kunis through That 70s Show (although, like most people, I stopped watching midway through the third season, and that was a long time ago), but more through her voicing of Meg on Family Guy.
[225001230070] |In Forgetting Sarah Marshall she is given the right blend of sweetness and tenacity to play to her talents (her screaming match with an ex-boyfriend was literally laugh-out-loud funny) plus she has an awesome tan.
[225001230080] |Her tan is so awesome, it’s like a separate character.
[225001230090] |They could have just put Kunis and her tan next to the ocean and I probably would have watched for the same 112 minute run time.
[225001230100] |It’s an impressive feat to get a Ukrainian THAT tan and not kill her.
[225001230110] |I don’t know what combination of chemicals and baby oil they used, but it worked.
[225001230120] |Zonker Harris would be proud.
[225001230130] |And this is the essential hook, isn’t it?
[225001230140] |In order for a romantic comedy to work, the viewer has to become smitten with one of the leads (or both, if that’s your thang baby, make Paglia proud …on a random related note, is Torchwood the most bi-curious TV show in history?).
[225001230150] |In any case, I walked away from this movie smitten with Mila Kunis.
[225001230160] |While watching this movie, I couldn’t help but reflect on the lack of women in Hollywood who have the two most important characteristics of a romantic comedy lead: adorability and comedic talent.
[225001230170] |Meg Ryan had lots of one and little of the other; frikkin Sandra Bullock had neither yet still managed a decade long career.
[225001230180] |Kunis has both.
[225001230190] |She’s cute as all hell and she can bring the funny (and did I mention the awesome tan?).
[225001230200] |The only other actress today with both of these crucial qualities (sans tan) is Ellen Page (my first impression of her is here) but I fear Page may be limited to the wise-cracking smart-ass.
[225001230210] |I haven’t seen her step out of that role yet (even her small roles in the X-Men movies had this tinge to them).
[225001230220] |Unfortunately, since the corporate takeover of Hollywood in the 1980s, the romantic comedy has been staffed by pretty dolls with little talent (both male and female).
[225001230230] |But this is why most romantic comedies fail.
[225001230240] |They have dull leads.
[225001230250] |The corporate suits create a table of demographics, then plot a script accordingly, then plug in the two actors du jour and voilà!
[225001230260] |Now the romantic comedy may finally be coming out of its stupor.
[225001230270] |Forgetting Sarah Marshall is the latest installment of Apatow Inc’s refashioning of the genre, and god bless ‘em because most romantic comedies suck.
[225001230280] |Box Office Mojo has a list of the 300 top grossing romantic comedies since 1978, and it’s depressing.
[225001230290] |The highest grossing romantic comedy of all time is, by itself, reason to contemplate suicide.
[225001230300] |Even as you scan the large list of movies, it's a wasteland of forgettability.
[225001230310] |But that’s the downside.
[225001230320] |The upside is that the romantic comedy genre has produced a handful of unforgettable films like His Girl Friday, Harold and Maude, and Annie Hall.
[225001230330] |There is nothing wrong with the genre itself.
[225001230340] |Hell, most epic poems suck ass, but that’s no reason to throw out The Odyssey.
[225001230350] |More to the point, there are good romantic comedies (and John Cusack has been in most of them; if you haven't seen Grosse Pointe Blank or High Fidelity, you're missing out).
[225001230360] |I've highly recommended Juno as a great version of the genre (regardless of what my colleague may think, thppt!), but I can't equate Forgetting Sarah Marshall with Juno, smitten or not.
[225001230370] |But it is a good romantic comedy, just worth the matinée price I paid.
[225001230380] |And that brings me to my reasons for choosing this particular film.
[225001230390] |I have no shame in going to see a romantic comedy, because I want to see another Annie Hall.
[225001230400] |I want the genre to succeed.
[225001230410] |I think Apatow Inc. stresses writing and comedy talent more than most producer-driven entourages, so they’re producing films that, in the very least, are funny and entertaining.
[225001230420] |Plus, I was bored and M. Faust gave it a good review, even though he doesn’t mention Mila Kunis (Bastard! Did you not see her awesome tan?).
[225001250010] |Iron Man Linguistics
[225001250020] |I just saw Iron Man (no no, this is not another movie review ... but you can still read my Forgetting Sarah Marshall and Juno discussions).
[225001250030] |There is an interesting linguistic side-point to be made about language diaspora in Afghanistan.
[225001250040] |As the movie opens, our hero, Tony Stark, is kidnapped near Bagram Air Base in northwest Afghanistan's Parwan Province.
[225001250050] |He is held captive with one other prisoner, a local Afghani doctor named Yinsin (a name carried over from the original comic book I believe, so not particularly Afghani) who says he's from a small town named "Gulmira" (I couldn't find any real town by that name, though it seems to be a fairly common given name).
[225001250060] |Luckily for Stark, Yinsin speaks "many languages", so he's able to understand some of their captors' shouts and orders, but not all (an interesting aside, the actor who plays Yinsin, Shaun Toub, has a backstory worthy of its own screenplay).
[225001250070] |You see, the group which has kidnapped the unfortunate pair goes undefined throughout the movie.
[225001250080] |We are largely left to draw our own conclusions about their origin, ideology, and motivation (though we get some minor clarification late in the movie).
[225001250090] |The one thing we learn about their diversity is that they speak a wide variety of languages, as Yinsin lists some of them for Stark.
[225001250100] |I don't remember the full list, but I believe they included "Arabic, Ashkun, Farsi, Pashto" amongst others.
[225001250110] |So, kudos to the screenwriters for, in the very least, scanning Ethnologue for an appropriate set of languages to list.
[225001250120] |But there's one other language that Yinsin mentions, and it got my attention: Hungarian.
[225001250130] |A few scenes after Yinsin lists the various languages the group speaks (a list that does not include Hungarian), he and Stark are being yelled at by an unnamed thug.
[225001250140] |Stark asks Yinsin what he's saying and Yinsin says something like "I don't know.
[225001250150] |He's speaking Hungarian."
[225001250160] |This was meant as a bit of comic relief, I believe.
[225001250170] |So the screenwriters may have chosen Hungarian at random.
[225001250180] |Perhaps any language that American audiences would perceive as unusual or unexpected would have done the trick.
[225001250190] |Perhaps it would have been even funnier if he said "I dunno, he's speaking Comanche (ba dum boom!)."
[225001250200] |I don't know, but my linguistics radar picked it up and I went searching for any connections Hungary might have with Afghanistan.
[225001250210] |Alas, I have found few.
[225001250220] |I would have to make some serious leaps of logic to connect the dots, and I don't think the movie was going for that.
[225001250230] |The clarifying scenes late in the movie suggest that this groups' motivations are largely financial, not ideological or political, so we might assume this was some random Hungarian mercenary.
[225001250240] |As far as I can tell, this is the most logically consistent interpretation (unless I've misunderstood the movie's plot or dialogue, in which case ... never mind).
[225001370010] |Soda Pop Coke
[225001370020] |"The Great Pop vs. Soda Controversy".
[225001370030] |Red = “Coke”Blue = “pop”Yellow = “soda”
[225001370040] |Ahhhhhh, a classic from first year linguistics....
[225001370050] |(HT: Daily Dish)
[225001370060] |UPDATE: The picture above wasn't visible because the original site wasn't responding so I saved the pic as a JPEG and uploaded it directly.
[225001370070] |There is still a link to the original page.
[225001370080] |Why wasn't the original page responding?
[225001370090] |Maybe because Andrew Sullivan linked to it and the resulting traffic crashed the site.
[225001370100] |Sullivan gets like 5 million views a month.
[225001370110] |The guy's a blogging monster.
[225003520010] |Powerful Minority: Catalonians
[225003520020] |According to The Hollywood Reporter.com, "Parliamentarians in Spain's northeastern region of Catalonia have passed a controversial law requiring half of all commercial films to be dubbed* into the local language."
[225003520030] |I'm generally not a fan of laws relating to language (I'm a linguistic libertarian of sorts), but I recognize that Catalan has benefited tremendously from a strong region where its speakers have money, power, and prestige (the real forces of linguistics, ultimately).
[225003520040] |According to the above report, Catalan accounts for 20% of Spain's film market (ticket sales?) but only 3% of films are dubbed* into Catalan.
[225003520050] |As a proud capitalist pig, I just don't see why Catalonia's 20% market share isn't itself enough to drive film makers to produce the dubbing.
[225003520060] |The Hollywood Reporter claims it costs "€50,000 euros ($61,000) to dub."
[225003520070] |Okay, try adding €1 to the ticket price of dubbed films and see if Catalonians are willing to pay for this service.
[225003520080] |If they aren't willing to pay for it, why legislate it?
[225003520090] |As I recall, several European countries already have differential pricing** for films so this is not a crazy suggestion.
[225003520100] |*actually "dubbed or subtitled." ** where some are cheaper than others, unlike here in the States where all films are the same price...a ludicrous system, btw.
[225003520110] |See HERE for a nice discussion.
[225004300010] |will there be a speech therapy oscar?
[225004300020] |The new movie The King's Speech has won the Toronto film festival's most popular film award.
[225004300030] |Winners of that award often go on to win big at the Oscars.
[225004300040] |Interesting for us linguists because the movie "Tells the story of the man who became King George VI, the father of Queen Elizabeth II.
[225004300050] |After his brother abdicates, George ('Bertie') reluctantly assumes the throne.
[225004300060] |Plagued by a dreaded stutter and considered unfit to be king, Bertie engages the help of an unorthodox speech therapist named Lionel Logue.
[225004300070] |Through a set of unexpected techniques, and as a result of an unlikely friendship, Bertie is able to find his voice and boldly lead the country into war" (from IMDB, emphasis added).
[225004300080] |I don't know anything about Logue, but Caroline Bowen, a Speech-Language Pathologist, posted some good info here, including this bit about the actual "unexpected techniques":
[225004300090] |The therapist diagnosed poor coordination between larynx and diaphragm, and asked him to spend an hour each day practising rigorous exercises.
[225004300100] |The duke came to his rooms, stood by an open window and loudly intoned each vowel for fifteen seconds.
[225004300110] |Logue restored his confidence by relaxing the tension which caused muscle spasms.
[225004300120] |The duke's stammer diminished to occasional hesitations.
[225004300130] |Resonantly and without stuttering, he opened the Australian parliament in Canberra in 1927.
[225004300140] |Using tongue twisters, Logue helped the duke rehearse for major speeches and coached him for the formal language of his coronation in 1937 (emphasis added).
[225004300150] |Bowen says that the King managed to speak at a slow, measured pace.
[225004300160] |You can download a one-minute sound file of King George VI's broadcast on the day Britain declared war on Nazi Germany here.
[225004300170] |You'll note that he does indeed speak very slowly.
[225004300180] |Bowen, C. (2002).
[225004300190] |Lionel Logue: Pioneer speech therapist.
[225004300200] |Retrieved 9/21/2010 from www.speech-language-therapy.com/ll.htm.
[225004430010] |are north and south 'embodied' concepts?
[225004430020] |Lameen Souag, of Jabal al-Lughat, made a thoughtful comment on part 2 of my review of Guy Deutscher's book Through The Language Glass, and I wanted to post my response because I think it brings up an interesting question about which concepts are embodied, and how.
[225004430030] |Lameen's comment: "North" and "south" have nothing to do with a human body's orientation; the only aspect of human-ness relevant to the cardinal directions is that of being located on a small enough part of a rotating sphere, which applies equally to, yes, amoebas, and every other organism on Earth.
[225004430040] |(Obviously, from the observer's perspective it's the sky that's rotating.)
[225004430050] |The difference in question is between a coordinate system based on the observer's body's orientation and one based on the orientation of his/her environment (his planet for cardinal directions, the slope of the ground for "uphill/downhill", etc.)
[225004430060] |The term "ego-centric" may or may not be the most apt way to describe this difference, but the difference is clear.
[225004430070] |My response: Lameen, I respectfully disagree that "north" and "south" are not fundamentally human concepts.
[225004430080] |They are concepts, hence they are filtered by our cognitive system, vulnerable to all the strange and wonderful biases and alterations that system brings to bear on all concepts.
[225004430090] |So what is so human about north?
[225004430100] |Well, what is north?
[225004430110] |It's a direction away from me, right?
[225004430120] |One can never be at north.
[225004430130] |There is always a north of north (except in the rare case of standing atop the exact north pole, but that doesn't seem relevant).
[225004430140] |But that alone doesn't make it human.
[225004430150] |Imagine that a GY (Guugu Yimithirr) speaker were as big as the sun (this is a thought experiment, so reality means nothing, haha).
[225004430160] |Would saying that a tree is north of a river mean anything?
[225004430170] |The scale would be too small.
[225004430180] |Imagine a GY speaker said that an electron was north of a nucleus or that a tree was to the Pacific Ocean of a river.
[225004430190] |Would any of those uses of cardinal directions make sense?
[225004430200] |No, because the scale would make them incoherent.
[225004430210] |The direction concepts north and south are determined, at least in part, by our human scale, hence embodied.
[225004430220] |We conceptualize them as a point, somewhere far off in the distance, and we can point to them.
[225004430230] |But this is an embodied conceptualization which only makes sense for things in the human scale.
[225004430240] |I believe there's more than just human scale at work too, but I don't have enough time to get into it right now; still, I think this is a worthwhile topic.
[225004570010] |how to spot an academic con artist
[225004570020] |If you've been to college, you were taught how to scrutinize research sources at some point.
[225004570030] |Let's test your skills, shall we?
[225004570040] |Imagine you run across a popularized article and the author promotes his own expertise using the following:
[225004570050] |"Ph.D" after his name.
[225004570060] |Referencing his multiple books.
[225004570070] |Noting his academic appointments.
[225004570080] |You look at his personal web page list of publications and you see dozens of articles and books going back several decades.
[225004570090] |Must be an expert, right?
[225004570100] |Must be legit, right?
[225004570110] |This is what I saw for John Medina, Ph.D., author of the HuffPo article 'Parentese': Can Speaking To Your Baby This Way Make Her Smarter?
[225004570120] |But I quickly became suspicious about this man's credentials.
[225004570130] |Why?
[225004570140] |Let's look more closely at those bullet points:
[225004570150] |"Ph.D" after his name.
[225004570160] |I recall a professor once saying something like "Once you've been to grad school, everyone you know has a Ph.D.
[225004570170] |It's just not that special."
[225004570180] |This may sound elitist, but the truth is, most people with Ph.Ds don't use the alphabet to promote themselves.
[225004570190] |They use their body of work.
[225004570200] |I'm almost always suspicious of people who promote themselves using their degrees.
[225004570210] |Plus, nowhere on his own site does he list a CV or even where he got his Ph.D.
[225004570220] |I had to find this at the UW web page, which lists "PhD, Molecular Biology, Washington State University, 1988." An impressive degree, no doubt, but why hide it?
[225004570230] |It has become common practice for serious academics to provide their full CV on their web page.
[225004570240] |Medina fails to follow this practice.
[225004570250] |Referencing his multiple books.
[225004570260] |All of his books are aimed at non-academics.
[225004570270] |There's nothing wrong with trying to explain your expertise to a lay audience, but at some point you should also be trying to explain your expertise to other experts.
[225004570280] |Noting his academic appointments.
[225004570290] |Here, Medina does seem to have some impressive qualifications.
[225004570300] |He is an "Affiliate Professor, Bioengineering" at The University of Washington.
[225004570310] |He is also director of the Brain Center for Applied Learning Research at Seattle Pacific University (which, as far as I can tell, is a house and has exactly two members, the director and his assistant).
[225004570320] |You look at his personal web page list of publications and you see dozens of articles and books going back several decades.
[225004570330] |This is the most suspicious by far.
[225004570340] |Yes, he lists dozens of publications, but almost all of them are short, 2-4 page articles IN THE SAME MAGAZINE, Psychiatric Times, a dubious-looking magazine at best.
[225004570350] |The only others are in the equally dubious-looking Geriatric Times.
[225004570360] |His publications page does list a REFEREED PAPERS section with some more legitimate academic articles, but he's second or third author on almost all of them. Add to this the fact that his recommendations seem to be little more than common sense (e.g., talk to your kids more... no duh!).
[225004570370] |I have no problem with someone making money off their education, but this seems to be an example of trying to con people into believing he has more to say than he really does simply because of the letters P-h-D after his name.
[225004570380] |Despite writing this, I don't feel terribly comfortable casting aspersions on someone who may indeed be a serious, legitimate academic.
[225004570390] |If I have made mistakes in this critique, I will apologize.
[225004570400] |But then again it is incumbent upon Medina to do a better job of representing his credentials.
[225004570410] |And it is incumbent upon us as lay readers (hey, I ain't no molecular biologist either) to scrutinize supposed experts who are asking us to pay for their expertise (in the form of book prices and speaking fees).