[225000290010] |The Storm Gathers Force ...
[225000290020] |cko and The Language Guy have a nice discussion of language death going over at knittertating.
[225000290030] |The Language Guy has been around this biz longer than most and he always has thoughtful comments.
[225000290040] |I take cko's comments quite seriously because she is both a smart linguist and an experienced field linguist.
[225000290050] |I'm neither.
[225000290060] |But I'm tall, so, ya know, that's something.
[225000300010] |Chicken pecks revisited...
[225000300020] |Cool, Eric Bakovic over at Language Log posted about "Langage SMS".
[225000300030] |This was the topic of my very first substantive post.
[225000330010] |Redundant Blogging. Redundant Blogging.
[225000330020] |Eugene Volokh over at The Volokh Conspiracy complained yesterday that some interpretations of Bushisms are not particularly fair because the interpretations are not taking into account the genuine ambiguity of the comments (okay, that's my version).
[225000330030] |In my zeal to put in my 2 cents, I posted a comment to that effect, only to realize now that I was basically repeating what Volokh himself had said in his original post.
[225000330040] |Since Eugene Volokh was some kind of genius wunderkind or something, I'm not all that ashamed that we basically think alike.
[225000330050] |Maybe HE'S ashamed of thinking like The Lousy Linguist.
[225000330060] |But I'm not ashamed.
[225000330070] |Oh no, not me.
[225000340010] |Crystal Clear ...
[225000340020] |Thanks to cko (who lent/loaned/leaned me her copy of Crystal's Language Death) I was finally able to read Chapter 2 Why Should We Care and Chapter 3 Why Do Languages Die.
[225000340030] |I have to say, I found the general writing style disappointing.
[225000340040] |It’s a lightweight volume that reads like it was pasted together from notes and speeches (which it may very well have been).
[225000340050] |He tends to make the same points over and over, in no systematic order.
[225000340060] |I read nothing in the first three chapters of this book which caused me to re-evaluate my gut feeling that there may be some favorable outcomes to language death (and we linguists ought to study that possibility more closely).
[225000340070] |Only 3 main points relate to why language death is bad:
[225000340080] |I. Languages are like an ecosystem = ecosystems have mutually reinforcing relationships between members/elements (i.e., hurt one, hurt the system)
[225000340090] |II.
[225000340100] |Languages are repositories of data (i.e., we can learn stuff from them: history, culture, linguistic feature space)
[225000340110] |III.
[225000340120] |Language = identity
[225000340130] |There’s no proof of (I) and Crystal is quick to caution against taking the analogy too far (he claims that humans are in complete control of language death factors; I suspect he is wrong about that, but ...); nonetheless, I suspect that it’s somewhat analogous.
[225000340140] |However, by the same ecosystem analogy, it may be the case that some language death may have favorable outcomes (which has been my guess all along).
[225000340150] |As I noted on cko’s blog recently, “I suspect that recent work in language learning and evolution by Partha Niyogi and folks like him will bear greatly on this topic in the coming decade.”
[225000340160] |Argument (III) is garbled at best.
[225000340170] |Crystal claims that “Identity makes members of a community recognizably the same” (p39).
[225000340180] |Hmmmmmm.
[225000340190] |I thought it was the opposite -- identity makes members of a community recognizably different.
[225000340200] |In any case, this argument is vague at best, and does not relate directly to language death.
[225000340210] |There are various cultural factors that go into “identity”, whatever that is.
[225000340220] |It is argument (II) which I find most compelling, and the one I agree with most readily and without debate.
[225000340230] |Yes, I agree that all languages have unique linguistic properties that are well worth studying in themselves.
[225000340240] |But just because we find valuable data in every language does NOT mean we should stop language death per se. We need a broader understanding of the system of language interaction and language evolution; otherwise, zealously stopping language death may be as irresponsible as zealously causing language death.
[225000340250] |Like a protected species over-grazing or over-hunting a locale, language over-population may do some ecosystem harm.
[225000340260] |We just don't know.
[225000350010] |Blomis #4 -- Innateness Again
[225000350020] |I posted a challenge recently to The Innateness Hypothesis (aka Universal Grammar) as discussed in Juan Uriagereka's article on language evolution.
[225000350030] |Mark Liberman over at Language Log makes a similar challenge with far greater detail and authority than I could.
[225000390010] |Buffalo Learning
[225000390020] |So, regarding the post below, I wonder if there are any hypotheses within the language learning/machine learning communities regarding the maximum amount of polysemy a learning algorithm can handle and still succeed?
[225000400010] |Allies vs. Enemies
[225000400020] |More on frequency and meaning.
[225000400030] |Here are the results of a “kitchen experiment” meant to test whether the relationship type “ally” could be inferred reliably from mere co-occurrences and conjunction words.
[225000400040] |Assumption: If two names are conjoined by “and”, they are probably allies, not enemies.
[225000400050] |Method: I took four names that have clear ally/enemy relationships and Googled each individually; then I Googled each combination in quotes (switching the names as well).
[225000400060] |The actual search queries were of the form "WINSTON CHURCHILL and FRANKLIN ROOSEVELT" but I edited them a bit in the table below to make them fit.
[225000400070] |Results: Allies: 15,343 (14,700 + 643) -- Adolf Hitler + Benito Mussolini; 11,317 (10,500 + 817) -- FRANKLIN ROOSEVELT + WINSTON CHURCHILL
[225000400080] |Enemies: 4,280 (2,600 + 1,680) -- WINSTON CHURCHILL + Adolf Hitler; 1,348 (596 + 752) -- FRANKLIN ROOSEVELT + Adolf Hitler; 511 (504 + 7) -- WINSTON CHURCHILL + Benito Mussolini; 5 (4 + 1) -- FRANKLIN ROOSEVELT + Benito Mussolini
[225000400090] |Discussion: The assumption is weakly supported.
[225000400100] |Roosevelt is conjoined with his ally Churchill more than 4 times as often as his enemy Hitler and more than 2000 times as often as Mussolini.
[225000400110] |Churchill is conjoined with his ally Roosevelt more than twice as often as he is conjoined with his enemy Hitler and more than 10 times as often as Mussolini.
[225000400120] |The Flip-Flop Effect: The most linguistically interesting result is the more than ten-fold increase in hits that the “FRANKLIN ROOSEVELT and WINSTON CHURCHILL” query got over its “WINSTON CHURCHILL and FRANKLIN ROOSEVELT” brethren.
[225000400130] |An even greater effect is seen with Hitler/Mussolini flip-flop.
[225000400140] |Why is the Roosevelt-first collocation so much more frequent?
[225000400150] |My hunch is that there is some salience issue at work.
[225000400160] |The more salient member of the collocation will tend to be listed first.
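The ratios in the discussion can be rechecked with a few lines of Python. This is only a sketch: the hit counts are copied from the Results, while the dictionary layout and the helper names (`HITS`, `total`, `flip_ratio`) are made up for illustration and are not part of the original experiment.

```python
# Hit counts copied from the Results above. Each pair of numbers is the
# count for one name order plus the count for the flipped order.
# The data structure and function names are illustrative assumptions.
HITS = {
    frozenset({"Hitler", "Mussolini"}): (14_700, 643),
    frozenset({"Roosevelt", "Churchill"}): (10_500, 817),
    frozenset({"Churchill", "Hitler"}): (2_600, 1_680),
    frozenset({"Roosevelt", "Hitler"}): (596, 752),
    frozenset({"Churchill", "Mussolini"}): (504, 7),
    frozenset({"Roosevelt", "Mussolini"}): (4, 1),
}


def total(a, b):
    """Total hits for the two names conjoined by 'and', in either order."""
    return sum(HITS[frozenset({a, b})])


def flip_ratio(a, b):
    """How many times more frequent the commoner order is than the rarer one."""
    counts = HITS[frozenset({a, b})]
    return max(counts) / min(counts)


# Ally vs. enemy: how much more often each leader is conjoined with his ally.
for leader, ally in (("Roosevelt", "Churchill"), ("Churchill", "Roosevelt")):
    for enemy in ("Hitler", "Mussolini"):
        ratio = total(leader, ally) / total(leader, enemy)
        print(f"{leader}: {ally} vs. {enemy} = {ratio:.1f}x")

# Flip-flop effect: how lopsided the two orders are for each allied pair.
for a, b in (("Roosevelt", "Churchill"), ("Hitler", "Mussolini")):
    print(f"{a}/{b} flip ratio = {flip_ratio(a, b):.1f}x")
```

Because the pairs are stored without regard to order, the sketch does not need to know which of the two counts belongs to which name order; it only compares the larger count with the smaller one.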
[225000400170] |Flaws: Surely there are more flaws to this kitchen experiment than can be enumerated easily.
[225000400180] |But the one obvious flaw that deserves mention is the normalization problem.
[225000400190] |Deciding which form of each name to use as a search was not trivial.
[225000400200] |Roosevelt is often referred to by his initials “FDR”, and both Hitler and Mussolini are commonly referred to by last name only.
[225000400210] |So this was an experiment in term collocation frequency at best, not person reference.
[225000400220] |Note: I'm certain that either Mark Liberman or Arnold Zwicky over at Language Log has used the term “kitchen experiment” in a post before, but a search of that site produced nothing.
[225000400230] |Hmmm, am I just imagining this term has been used before?
[225000410010] |Ripe Tomatos
[225000410020] |Well, my doppelgänger Eugene Volokh over at The Volokh Conspiracy has finally gotten 'round to mentioning the whole Burma/Myanmar controversy.
[225000410030] |Yet again!
[225000410040] |We are of like mind.
[225000410050] |Oooooooh, scary.
[225000410060] |Psssst ...
[225000410070] |I would have posted this comment on Volokh's blog, but I couldn't remember my login name, and they have a scary message over there for people who try to muck with the login process.
[225000410080] |So be it.
[225000420010] |Do You Think Computationally?
[225000420020] |This morning I received an email regarding the National Science Foundation’s new program called "Cyber-Enabled Discovery and Innovation" (CDI).
[225000420030] |I skimmed the email with little interest until I read this:
[225000420040] |CDI aims to create revolutionary science and engineering research outcomes made possible by innovations and advances in "computational thinking", defined as computational concepts, methods, models, algorithms, and tools.
[225000420050] |This phrase, I’m guessing, is meant to refer to thinking about computational methods.
[225000420060] |A colleague of mine has ranted several times about the mis-use of the term “computational” and its morphological variants and it’s because of phrases like this that he rants.
[225000420070] |Even if we ignore the juicy ambiguity of the phrase above and take it as it’s intended, what exactly does “computational” mean?
[225000420080] |Hal Daume wrote this:
[225000420090] |The crux of the argument is that if something is not a task that anyone performs naturally, then it's not a task worth computationalizing.
[225000420100] |I think he simply means “make a computer do it automatically” or something like that.
[225000420110] |And I take that to be the most sensible use of the word.
[225000420120] |But the word seems to get used to mean something else in a lot of cases.
[225000420130] |To make something computational is often like making something new & improved or extreme.
[225000420140] |It seems to be a marketing tool.
[225000420150] |People use it to make their work sound cutting edge and advanced.
[225000420160] |In other cases, it means using a computer to do what people used to do by hand.
[225000420170] |I Googled the word “computational” and these were all on the first page of hits (CL was number 1, hehe):
[225000420180] |Computational linguistics; Computational biology; Computational economics; Computational chemistry; Computational geometry
[225000420190] |I don’t know if these disciplines have the same relationship to computational that linguistics does, but I can say this: I believe there is really no such thing as computational linguistics.
[225000420200] |As I have said in the Q&A section of my Companies That Hire Computational Linguists page:
[225000420210] |my use of the term “computational linguistics” is a cover term for a loosely related set of skills including but not limited to NLP, NLU, MT, AI, info extraction, speech processing, (takes a breath…) VUI, text mining, document understanding, machine learning, ad nauseam…
[225000430010] |Sunk Skunk Stunk
[225000430020] |My my, there are sooooooooo many things wrong with this headline:
[225000430030] |Officer uses BB gun to save skunk stuck in jar
[225000430040] |Psssst, ignore the facts of the story.
[225000430050] |As your linguist, I advise you to ignore facts whenever they’re inconvenient.
[225000440010] |Linguistics Wins Something!
[225000440020] |Or not.
[225000440030] |The Ig Nobel prizes were handed out October 4 and the internets is abuzz.
[225000440040] |The prize winners are being blogged about fast and furiously.
[225000440050] |In particular, both Andrew Sullivan and Language Log have highlighted the Linguistics winner, a group who proved, and I quote:
[225000440060] |rats sometimes cannot tell the difference between a person speaking Japanese backwards and a person speaking Dutch backwards
[225000440070] |So the linguistics winner of the Ig Nobel prize gets mention on major blogs.
[225000440080] |I would be slightly happier if it weren’t for the fact that there is no real Nobel Prize for linguistics.
[225000440090] |In fact, as far as I know, there isn’t a single major prize for linguistics at all.
[225000440100] |Mathematics has the Fields Medal; Economics has a whole slew of prizes.
[225000440110] |But Wikipedia’s page List of prizes, medals, and awards does not even have a category for linguistics.
[225000440120] |Joseph Stiglitz, a (real) Nobel Prize winning economist, has made a convincing argument here that prizes are good for stimulating academic research.
[225000440130] |His point is that prizes are better than patents.
[225000440140] |I got the link from Greg Mankiw’s blog which presents some counter arguments.
[225000440150] |However, linguistics traditionally has neither prizes nor patent opportunities.
[225000440160] |Any wonder my field has spent 40 years mired in failed theories and vague assumptions?
[225000440170] |Computational linguistics, whatever that is, has begun to bring some financial opportunities to linguistics, but that has only been in the last 10 years and those opportunities are pretty much restricted to engineers, not linguists.
[225000440180] |What is the most effective way to finance and incentivize linguistics research?
[225000450010] |Don't Forget Recency Effects Too...
[225000450020] |As usual, Mark Liberman, of Language Log fame, has some instructive comments about linguistics, frequency effects, recency effects, and the state of the art in psycholinguistics:
[225000450030] |psychological research tells us that there is also a strong recency effect: in all sorts of tasks, words that we've heard or seen recently are processed more quickly.
[225000450040] |Again, we don't know how the recency effect arises in the brain, nor do we know whether the brain mechanisms underlying the frequency and recency effects are partly or entirely the same.
[225000450050] |There is no lack of speculation on these questions, but we honestly just don't know at this point.
[225000460010] |Frequency effects in linguistics
[225000460020] |For the record, there are known to be a variety of “frequency effects” in language.
[225000460030] |A brief survey:
[225000460040] |Zipf's law: roughly speaking, the most frequent word in a corpus will be about twice as frequent as the second most frequent (i.e., twice as many tokens).
[225000460050] |Word recognition: Dahan et al. (pdf): “frequency affects the earliest moments of lexical access”
[225000460060] |Sentence processing: Lau et al.: frequency effects “give rise to reaction time differences in sentence processing tasks”
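To make the Zipf's law item concrete, here is a minimal Python sketch. It assumes the simplest textbook form of the law, in which a word's count is proportional to 1/rank (exponent exactly 1); real corpora only approximate this. The function names and the toy sentence are made up for illustration.

```python
from collections import Counter


def rank_frequencies(tokens):
    """Return (word, count) pairs sorted from most to least frequent."""
    return Counter(tokens).most_common()


def zipf_expected(top_count, rank):
    """Count predicted for a rank if frequency is proportional to 1/rank.

    This is the idealized form of Zipf's law (an assumption of this sketch).
    """
    return top_count / rank


# A toy "corpus"; far too small to show the law, but enough to show the steps.
toy = "the cat saw the dog and the dog saw the cat".split()

ranked = rank_frequencies(toy)
top_count = ranked[0][1]

for rank, (word, count) in enumerate(ranked, start=1):
    predicted = zipf_expected(top_count, rank)
    print(f"rank {rank}: {word!r} observed={count} zipf-predicted={predicted:.1f}")
```

Under this idealization the second-ranked word gets half the count of the first, which is the "twice as frequent" relationship described above. On a real corpus you would swap the toy sentence for a tokenized text and compare the observed and predicted columns.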
[225000470010] |More on Frequency
[225000470020] |Yesterday, Sally Thomason at Language Log posted a critique of recently published research regarding frequency and language change (I’ve noted one perhaps trivial relationship between frequency and linguistic structure here).
[225000470030] |In challenging the claim that ‘frequently used words are resistant to change’, she points out that frequency is NOT an all powerful mechanism.
[225000470040] |Crucially, she points out the following:
[225000470050] |regular sound change is indeed blind to frequency and all other nonphonetic contextual factors.
[225000470060] |So it is nonsense to say that frequent words resist change unless one qualifies the statement to exclude regular sound change.
[225000470070] |The role of frequency in various linguistic processes has become a hot topic in linguistics.
[225000470080] |As usual, the jury is far from in.
[225000470090] |A good primer is the collection in Bybee and Hopper’s Frequency and the Emergence of Linguistic Structure.
[225000470100] |Finally, Thomason ends her post with a fair point that is best kept in mind when non-linguists try to “fix” the problems we silly linguists failed to solve:
[225000470110] |Failing to learn something about a field one wishes to contribute to is all too likely to lead to reinvention of the wheel at best, and to a garbage in/garbage out problem at worst.
[225000480010] |Still a Story ...
[225000480020] |Mark Liberman at Language Log picks up the Myanmar vs. Burma debate, and notes "Mama is the literary pronunciation of the more colloquial Bama."
[225000480030] |My thoughts are here.
[225000490010] |Data and Models
[225000490020] |Mankiw on Greenspan and macro-economics:
[225000490030] |Better monetary policy, he suggests, is more likely to follow from better data than from better models.
[225000490040] |Relatively little modern macro has been directed at improving data sources.
[225000490050] |Perhaps that is a mistake.
[225000490060] |Methinks this same sentiment could be said of linguistics.
[225000490070] |However, I am ambivalent.
[225000490080] |On the one hand, I am trained in a department long dedicated to descriptive linguistics, so I’m frightened by the lack of good description for most of the world’s languages.
[225000490090] |I believe in supporting field linguists and old fashioned grammar writing tasks.
[225000490100] |But I’m equally frightened by the lack of good models of language, particularly of language change and evolution.
[225000490110] |I’m sympathetic to the recent flood of computationally minded engineers into the field of linguistics who have brought fresh approaches (e.g., statistical).
[225000490120] |Here’s a representative sample of very smart people bringing mathematical/computational modeling into linguistics:
[225000490130] |Sandiway Fong -- U. Arizona; Partha Niyogi -- U. Chicago; Josh Tenenbaum -- MIT; Charles Yang -- U. Penn
[225000500010] |Huh?
[225000500020] |The good folks over at Cognitive Daily are usually pretty sharp about the research they review.
[225000500030] |But I'm afraid they've managed to make me chortle with a little bit of condescension with today's post "The economic value of gossip."
[225000500040] |There may or may not be economic value to gossip (the article they reference is about what appears to be a variation on a common game-theoretic experiment economists call the ultimatum game, hat tip to Greg Mankiw who posted on a related topic a couple days ago), but they print the following quote from New York Times journalist John Tierney without the slightest hint of jest:
[225000500050] |Language, according to the anthropologist Robin Dunbar, evolved because gossip is a more efficient version of the “social grooming” essential for animals to live in groups.
[225000500060] |Folks, I freely admit that theories about what caused language to develop in humans are rarely if ever based on more than thoughtful speculation.
[225000500070] |This is the case simply because there is precious little hard evidence regarding the origins of language.
[225000500080] |Fine.
[225000500090] |My acknowledgment of that is now on record.
[225000500100] |That said, this claim that language evolved BECAUSE OF its gossip function strikes me as a clear case of bullshit.
[225000500110] |But hey, I could be wrong.
[225000500120] |I would point the curious reader to Jackendoff's Foundations of Language as a fair primer on these issues.
[225000510010] |Homer's Saxomaphone
[225000510020] |"Reduplication in English Homeric Infixation" (pdf) by Alan C. L. Yu, U. Chicago.
[225000510030] |saxophone > saxo-ma-phone; Mississippi > Missi-ma-ssippi; telephone > tele-ma-phone; Alabama > Ala-ma-bama; wonderful > wonder-ma-ful; dialectic > dia-ma-lectic
[225000520010] |Yet Another Theory of Language Evolution...
[225000520020] |Chris Chatham, a grad student in Cognitive Neuroscience at U. Colorado, has an interesting review of yet another language evolution theory:
[225000520030] |Why There Aren't Right-Handed Apes, Or: Handedness and The Evolution of Language.
[225000530010] |You Smuther Froggin Archmucking &*$#@%!
[225000530020] |“Swearing at work 'boosts team spirit, morale'”
[225000540010] |Onna Kotoba -- "Women's Japanese"
[225000540020] |Hal Daume over at his natural language processing blog articulates a lament well known to linguists:
[225000540030] |At the end of my four years, I was speaking to a friend (who was neither a conversation partner nor a prof) in Japanese and after about three turns of conversation, he says to me (roughly): "you talk like a girl."
[225000540040] |As I posted in his comments section, this is a familiar situation.
[225000540050] |English speaking men (and probably others) often learn a form of Japanese that could be referred to as “women’s Japanese”.
[225000540060] |The most probable reason for this is the large percentage of female teachers of Japanese.
[225000540070] |I have no clue what the actual percentage is.
[225000540080] |If anyone knows, please post me a comment.
[225000540090] |I've never studied Japanese, so I don't know the facts, but Wikipedia has a page called Gender differences in spoken Japanese which makes the following claim: “Feminine speech includes the use of specific personal pronouns... omission of the copula da, use of feminine sentence finals such as wa, and the more frequent use of the honorific prefixes o and go.”
[225000540100] |I did a quick bit of Googling and offer this as a brief bibliography of articles and research on the subject (with the caveat that I haven’t reviewed any of this and make no claims regarding the veracity of these works):
[225000540110] |I sound like what in Japanese? by Matthew Rusling
[225000540120] |Manifestations of Gender Distinction in the Japanese Language. by Alexander Schonfeld
[225000540130] |Stanford Japanese page.
[225000540140] |Unknown author.
[225000540150] |Gender performance and intonation in a Japanese sentence-final particle yo.ne [PPT]. by Yumiko Enyo, University of Hawaii, Manoa
[225000540160] |Takarazuka: Sexual Politics and Popular Culture in Modern Japan. by Jennifer Robertson
[225000540170] |Ore wa ore dakara ['Because I'm me']: A study of gender and language in the documentary Shinjuku Boys. by Claire Maree
[225000540180] |Here are several papers from the 9th International Pragmatics Conference (July 10 - 15, 2005; Riva del Garda, Italy)
[225000540190] |The construction of Standard Japanese women's language from 1920's to 1945. by Rumi Washi, Nagoya Gakuin University
[225000540200] |Constructing Linguistic Femininity in Contemporary Japan: Scholarly and Popular Representations. by Janet S. Shibamoto Smith and Shigeko Okamoto.
[225000550010] |"X Experiments"
[225000550020] |In an earlier post, I used the term “kitchen experiment” to refer to a brief, rather unscientific attempt at empirical data gathering, the sort of thing one might do in the morning, in the kitchen, while drinking a cup of coffee.
[225000550030] |At the time, I thought I had picked up the term from the Language Log folks, but I was unable to find the term using their search engine.
[225000550040] |Alas!
[225000550050] |The mystery was solved this morning when I discovered I had mis-remembered the term.
[225000550060] |Mark Liberman uses the term Breakfast Experiment™ in his latest post.
[225000550070] |It's not clear to me if he has seriously trademarked the term or not, but to be safe, I'll keep using my "kitchen experiments" variant.
[225000560010] |An analysis of 'exempt'
[225000560020] |I've just started a new blog for a dissertation support group at SUNY Buffalo.
[225000560030] |This is a copy of a post I put over there.
[225000560040] |I'm analyzing constructions involving a class of verbs Len Talmy named 'barrier verbs' like ban, prevent, and protect.
[225000560050] |Here’s one interesting tidbit about a word that is somewhat barrier-like: by a large margin, the word exempt most often occurs as a predicate adjective in copula constructions (hence, it is POS tagged JJ) as in the BNC example below.
[225000560060] |“In certain circumstances, the vehicle will also be exempt from Vehicle Excise Duty Road Tax.”
[225000560070] |Code: A0J; Genre: W_misc; Subject: W_nat_science; Medium: m_pub
[225000560080] |(TOP (S (PP (IN In) (NP (JJ certain) (NNS circumstances) (, ,))) (NP (DT the) (NN vehicle)) (VP (MD will) (ADVP (RB also)) (VP (VB be) (ADJP (JJ exempt) (PP (IN from) (NP (NN Vehicle) (NN Excise) (NN Duty) (NN Road) (NN Tax) (. .))))))))
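For readers who want to check the JJ tag on exempt without reading the brackets by eye, here is a minimal Python sketch that pulls the (tag, word) pairs out of the bracketed parse above. The regular expression and the function name `pos_pairs` are illustrative assumptions, not the tool actually used for the BNC analysis; the sketch only looks at the innermost `(TAG word)` nodes.

```python
import re

# The bracketed parse from the BNC example above, copied verbatim.
TREE = (
    "(TOP (S (PP (IN In) (NP (JJ certain) (NNS circumstances) (, ,))) "
    "(NP (DT the) (NN vehicle)) (VP (MD will) (ADVP (RB also)) "
    "(VP (VB be) (ADJP (JJ exempt) (PP (IN from) (NP (NN Vehicle) "
    "(NN Excise) (NN Duty) (NN Road) (NN Tax) (. .))))))))"
)

# A pre-terminal node looks like "(TAG word)" with no nested parentheses.
PRETERMINAL = re.compile(r"\(([^\s()]+)\s+([^\s()]+)\)")


def pos_pairs(tree):
    """Return the (tag, word) pairs of a bracketed parse, in sentence order."""
    return PRETERMINAL.findall(tree)


pairs = pos_pairs(TREE)
for tag, word in pairs:
    print(f"{word}\t{tag}")

# Collect every tag assigned to 'exempt' in this sentence.
exempt_tags = [tag for tag, word in pairs if word == "exempt"]
print("tags for 'exempt':", exempt_tags)
```

Running the same extraction over every BNC sentence containing exempt, and counting the tags, would be one way to back up the "by a large margin" claim, though that corpus-wide count is not reproduced here.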
[225000560090] |First Pass Analysis: the word exempt is like open: it can be either a state or an accomplishment, but it is most highly salient as a state.
[225000560100] |the door was open
[225000560110] |the door was opened (by X)
[225000560120] |the organization was exempt
[225000560130] |the organization was exempted (by X)
[225000560140] |The passive sentences have an accomplishment reading.
[225000560150] |But these are rare.
[225000560160] |I think the reason the overwhelming majority of occurrences of the word exempt in the BNC are predicate adjectives is that the outcome state is the salient aspect of the event of exempting.
[225000560170] |The actor of the exempting event is almost irrelevant (it's typically a law: not animate, not volitional, indirect causer).
[225000560180] |Contrast this with a speech act barrier verb like to bar:
[225000560190] |(1) A judge barred Britney Spears from seeing her children.
[225000560200] |In (1), the actor of the barring event is an animate, volitional, direct causer.
[225000570010] |Witty Linguistic Chickens
[225000570020] |I just ran across this cute article (pdf) by Bonatti et al which unapologetically takes a stand in the great rules vs. statistics debate currently raging within linguistics.
[225000570030] |It’s a useful follow-up to my previous posts regarding frequency and language.
[225000570040] |I like the article because it engages in the kind of point-by-point debate that is common in lab meetings (which is often missing in published material); but I also love the wit and sense of humor the authors have.
[225000570050] |The article starts with a jab at Italian drivers, and ends with a metaphorical playfulness rarely seen (outside of Jackendoff’s work, of course).
[225000570060] |Here are the first and final paragraphs (but the 2 page article is well worth the read):
[225000570070] |With the possible exception of Italian traffic regulations, any rule will generate a statistically detectable advantage for items instantiating the rule.
[225000570080] |Thus, although attempts to reduce structural phenomena …to statistical computations …have been unsuccessful so far …, it would be no surprise if one or another statistical measure would correlate with the structural phenomena under investigation.
[225000570090] |But would this mean the statistics caused the apparently rule-abiding behaviors, or are the statistics epiphenomena of underlying structures?
[225000570100] |Questions about chickens and eggs are always difficult to settle…Thus, although we admire demonstrations of powerful statistical abilities in humans, we remain convinced that it is the linguistic chicken that lays statistical eggs, and not the statistical eggs that hatch into linguistic chickens.
[225000580010] |FUBARed in Buffalo?
[225000580020] |Ouch!
[225000580030] |Big time Hahvahd Economist Edward L. Glaeser claims the city of Buffalo is screwed with a capital SCREWED!
[225000580040] |(Hat tip to Mankiw for the link).
[225000580050] |Glaeser's argument seems to be that tax dollars are better spent helping the people of Buffalo, not the place.
[225000580060] |The article walks through, with painful detail, the history of Buffalo's decline as well as the history of Buffalo's many failed renewal projects (the latest being the insane Bass Pro debacle and the Seneca Casino, neither of which are likely to do for the city what renewal advocates want: bring prosperity to the average citizen).
[225000580070] |I'm posting this non-linguistics related comment because I came to Buffalo almost 10 years ago to study linguistics (I'll finish that diss someday, hehe) and I had the exact same thought that everyone else who comes to Buffalo has: this city has "a lot of potential".
[225000580080] |Well, it's a friggin decade later and everyone here is STILL saying that.
[225000580090] |It's depressing.
[225000580100] |At some point, potential must be reached, or else it's sad.
[225000580110] |And that's where Buffalo is right now: a city which has failed to reach its potential for decades, and Glaeser has a good read on why.
[225000580120] |Now, if Glaeser could figure out why this linguist with a lot of potential is still "working on" that diss.
[225000580130] |Ich bin Buffalo!
[225000590010] |What I Love About Buffalo
[225000590020] |Given my sad post below on Buffalo’s failed renewal, I thought it only fair to make it clear that there are some things about Buffalo that I love, quite dearly.
[225000590030] |ART: Albright-Knox Art Gallery; Artvoice; Babik; Buffalo Film Seminars
[225000590040] |The Buffalo Film Seminars take place Tuesday nights at 7 p.m. promptly at the Market Arcade Film and Arts Center in downtown Buffalo, the only eight-screen publicly-owned film theater in the United States.
[225000590050] |Hallwalls Contemporary Arts Center; Irish Classical Theatre
[225000590060] |Located in the heart of Buffalo's thriving Theatre District, Irish Classical Theatre Company (ICTC) is Western New York's premier stage for the greatest works of dramatic literature.
[225000590070] |FOOD: Allen Street Hardware Cafe; Betty’s; Bistro Europa; Bill Rapaport's Buffalo Restaurant Guide; La Tee Da; Mother’s
[225000590080] |NEIGHBORHOODS: Allentown
[225000590090] |"In the end it is easier to experience Allentown than to describe it"
[225000590100] |Elmwood Neighborhood
[225000590110] |Elmwood Village Named One of 10 Great Neighborhoods in America
[225000590120] |Hertel Avenue
[225000590130] |ACADEMICS: SUNY Buffalo Linguistics Department
[225000600010] |Computational Linguistics vs. NLP
[225000600020] |What is the difference between Computational Linguistics and Natural Language Processing?
[225000600030] |(Hint: There is no official answer to this question). I had my 476th version of this conversation just now (because we’re in the hiring process for a new “CL lead” and having challenges defining the job) and I made the off-the-cuff claim that it’s the same as the difference between science and engineering.
[225000600040] |An engineer tries to build things while a scientist is in essence a reverse-engineer, dedicated to trying to figure out how the world works.
[225000600050] |Human language is a system that already exists, and it works in some way that no one really understands.
[225000600060] |Linguists and cognitive scientists have been studying it for decades (well, you could make the claim for millennia).
[225000600070] |They are now joined by a group of specialists whose skill set involves computer programming and statistics.
[225000600080] |Computational linguistics, then, involves trying to figure out how human language works using computational tools (e.g., automated methods of corpus analysis like Tgrep2 [UPDATE 12/02/2010: dead link, for Tgrep2 tutorials, see HERE] and Perl scripting, learning models, etc) while NLP involves building tools that involve language input or output like voice user interfaces, machine translators, entity recognizers, etc.
[225000600090] |It can be the case that a single person is both a computational linguist and an NLP developer. That’s my answer, for now… (my previous thoughts are here).
[225000610010] |Language Philosophy & Legal Interpretation
[225000610020] |Randy Barnett over at The Volokh Conspiracy references Lawrence Solum, a law professor at University of Illinois College of Law, who wrote a lengthy post on constitutional interpretation called "Semantic and Normative Originalism: Comments on Brian Leiter’s ‘Justifying Originalism’".
[225000610030] |I first became interested in linguistics via language philosophy and speech act theory, so I always have a soft spot for debates that involve theories of meaning, as this legal one does (Solum actually references Grice, hehe).
[225000610040] |It’s a long and complicated post involving legal issues I have no special knowledge of, but I’m interested in teasing apart the Grice reference to see if it has legs, or if it’s yet another example of naïve linguistics gone wrong.
[225000620010] |The Semantics of Sex
[225000620020] |What is the meaning of the construction ‘to have sex with X’?
[225000620030] |I ask because Scott Adams of Dilbert fame linked to this story on his friggin hilarious Dilbert blog: “Man who had sex with bike in court”.
[225000620040] |The article explains the following:
[225000620050] |"A man has been placed on the sex offenders’ register after being caught trying to have sex with a bicycle…The accused was holding the bike and moving his hips back and forth as if to simulate sex." (emphasis mine)
[225000620060] |There are so many delicious linguistic oddities here that it’s hard to know where to start.
[225000620070] |First, he was caught “trying” to have sex with the bike.
[225000620080] |Apparently, sex.with.a.bike’(x) is an accomplishment predicate.
[225000620090] |What are the criteria for the successful completion of the task of having sex with a bike?
[225000620100] |Whatever they are, the accused failed to complete the task, at least according to his accusers (the bike has yet to issue a formal statement).
[225000620110] |Second, it seems to me that ‘to have sex with X’ is ambiguous between
[225000620120] |a) ‘to have sex together with X’
b) ‘to achieve sexual gratification from X’
[225000620130] |These are two different events, but the ‘have sex with’ construction gets used to mean both.
[225000620140] |The obvious features of animacy and reciprocation are important criteria to distinguish the two semantic meanings.
[225000620150] |I’m willing to stipulate that one can ‘have sex with’ a sex toy or sex doll (and so are the producers of Lars and the Real Girl).
[225000620160] |But, the narrowly construed semantics of (a) require animacy and reciprocation of both participants.
[225000620170] |Finally, ‘as if’ serves a curious discourse function in the above quote that I can’t quite express just yet because the second sentence is nearly synonymous without it.
[225000620180] |It seems to express a certain hesitation to admit the proposition.
[225000620190] |The proposition ‘X was simulating sex with a bike’ is so preposterous, that one does not really want to commit to its veracity.
[225000620200] |I think ‘as if’ is acting like an evidential of some sort.
[225000620210] |It’s kinda like ‘I’m just saying…’:
[225000620220] |For example (imaginary quote): “Look dude, I’m not saying I know for sure what this guy was thinking.
[225000620230] |I’m just saying, when I walked in, the guy’s pants were off, his hips were gyrating, and the bike wasn’t complaining, as far as I could tell.
[225000620240] |I mean, I’m just saying…”
[225000630010] |Buffalo Buzz
[225000630020] |This is a follow-up to my earlier, unusually non-linguistics posts on Buffalo’s economy, which I discussed here (this is also featured on Mankiw’s post here) and here.
[225000630030] |I’d like to note that this week, Buffalo’s famed weekly magazine Artvoice included an extended response to Glaeser’s critique of Buffalo’s renewal woes, What It Will Take by Bruce Fisher.
[225000630040] |Bruce Fisher is Deputy Erie County Executive; he presumably knows the details of Buffalo’s economic situation better than Glaeser.
[225000630050] |I skimmed the article (rather quickly) at Spot Coffee this morning while doing laundry, and I was impressed that Fisher does what Glaeser does not: provide pragmatic suggestions to fix Buffalo’s problems. But he also seems to tread awfully close to the deep end of silly Canada-envy and to claim that Buffalo should follow Ontario’s lead.
[225000630060] |It makes sense to look to models of urban renewal like Toronto and Ottawa for ideas, but there is a peculiarly USA-American tendency (amongst liberals particularly) to fawn over Canada as if it’s some sort of Utopia.
[225000630070] |I’ll happily stipulate that I like Canada, love Toronto, and am impressed by many aspects of Canadian society.
[225000630080] |But I’m not predisposed to gushing.
[225000630090] |Anyhoo, Fisher basically agrees with Glaeser that “if federal funds come the way they’ve always come, nothing here will change.”
[225000630100] |He then goes on to disagree with the assertion that Buffalo is a lost cause (that’s my phraseology).
[225000630110] |Fisher’s basic claim is this: “Quality attracts and retains density.”
[225000630120] |So, he reasons (contra Glaeser), we should invest in Buffalo the place.
[225000630130] |He wants to invest (public money, of course) in changing what he refers to as “land-use policy”, especially the policy of suburbs, and so he’s in favor of regionalism.
[225000630140] |I’ll leave it to you to read the entire article to appreciate Fisher’s complete argument.
[225000630150] |I’m no macro-economist (though it has become increasingly my hobby over the last few years), so I’m not in a position to decide if Glaeser’s or Fisher’s prescriptions for Buffalo’s future are wisest.
[225000630160] |As an unapologetic urbanite who has lived within the city borders of Buffalo for 8 of the last 10 years, I have no problem with disparaging the evils of suburbia, but I also see that preservation does not seem to be doing much good.
[225000630170] |If the taxpayers of New York state and Kansas and Arizona and Washington (etc.) are going to invest hundreds of millions of dollars into Buffalo over the next ten years, I’m trending towards Glaeser’s position that it should be spent on the people (to me, that means primary education and law enforcement: a well educated, safe populace is more powerful than any other force on Earth).
[225000640010] |Dream Job ... or ... WTF!
[225000640020] |This job announcement was posted to The Linguist List just yesterday:
[225000640030] |Performance Space 122 in association with Movement Research and Instituto Cervantes seeks an English speaking cognitive linguist for 4 weeks of exploratory research with Spanish choreographer Juan Dominguez.
[225000640040] |The individual chosen will provide Juan with the linguistic knowledge and will guide him during one on one research sessions (Nov 26-Dec 14, Mon-Fri, for 3 hours/day) and during a larger workshop with ten participants (Dec 17-21).
[225000640050] |During the individual research with Juan, the main focus will be studying how language is built, how we use it, and how we understand reality through language.
[225000640060] |In the four previous workshops, Juan has tried experiments that influence the way of perceiving time and space through the verbs of movement.
[225000640070] |This will be a continuation of that research and experimentation.
[225000640080] |During the larger workshop the linguist will spend the first two days giving the participants an introduction about verbs with special focus on the verbs of movement.
[225000640090] |The linguist will be present during the workshop (5 hours/day) as a reference for further questions and of course to give his or her point of view about what the participants will work on. (my emphasis)
[225000640100] |PS122 is a legitimate place which promotes itself as a "multi-disciplinary arts center dedicated to finding, developing and presenting new artistic creations from a diversity of cultures and points of view."
[225000640110] |They're giving themselves 4 weeks with one linguist to figure out
[225000640120] |1) how language is built
2) how we use it
3) how we understand reality through language
[225000640130] |On top of that, they hope to "influence the way of perceiving time and space through the verbs of movement."
[225000640140] |Good Luck.
[225000650010] |Nounhood
[225000650020] |Jessica Hagy is by far one of the most witty and creative bloggers.
[225000650030] |Her Indexed site never ceases to impress me with its range of clever reasoning.
[225000650040] |And now, she tackles grade school linguistics.
[225000660010] |The Perils of Prescriptivism and PGSLTS!
[225000660020] |If ever there was evidence that prescriptivist maxims are unnatural and ultimately subservient to psycholinguistic priming, this sentence is it.
[225000660030] |It comes from an email sent in to Andrew Sullivan which he posted online here:
[225000660040] |The people on whose doors I knocked on universally described the candidate as thrilling...
[225000660050] |The linguistically delicious part is the unnecessary repetition of on.
[225000660060] |The author appears to be trying to form an NP with a relative clause that would perhaps be better rendered as “The people whose doors I knocked on”.
[225000660070] |However, still suffering from post grade school linguistics traumatic stress syndrome (PGSLTS), the author is consciously trying to avoid ending sentences with prepositions (but seems to re-analyze the rule to apply to phrases as well); desperate for grammaticality, the author at first valiantly tries to avoid ending the phrase with on by dislocating it to the front of the RC, but then arrives at the verb “knocked” which is probably naturally primed to be followed by a preposition (at least in this context, perhaps treating the verb as a particle-verb, knocked-on), and so tacks on another on, ya know, just to be sure.
[225000660080] |Alas!
[225000660090] |The power of priming wins the day.
[225000680010] |This goes without saying...
[225000680030] |I got this link from Polyglot Conspiracy.
[225000680040] |Too frikkin funny.
[225000690010] |yeah right
[225000690020] |There are 3 interpretations of “yeah, right” in American English, but I only have two of them in my dialect (I’m originally from California).
[225000690030] |I’m in my late 30s and I hear the version I lack (the third one below) from younger folks a lot (I can imagine my teenage niece saying it this way), but I’ve also heard it from a 30-ish father of 3, so I’m not sure what generation it’s most closely associated with (perhaps I just missed it).
[225000690040] |The three interpretations I know of are as follows:
[225000690050] |1) Normal (factual agreement): yeah right = ‘yes, that is correct’
2) Sarcastic (opposite meaning): yeah right = ‘no way in hell’
3) Back-channel (sentiment agreement): yeah right = ‘mm-hmm’
[225000690060] |Thanks to the influence of Seinfeld and Friends throughout the 90s, (2) sarcastic is probably the default use these days, but it is the 3rd use that I don’t have in my dialect.
[225000690070] |I would say that (3) is in the same class of back-channel expressions as “you go girl!”
[225000690080] |These three interpretations all involve different prosodic realizations; roughly, they have different tones.
[225000690090] |I’ll dig deep into my past when I studied the tone languages Mandarin Chinese and Cantonese (12 years ago) and when I actually took a phonetics course (10 years ago) to see if I can offer a plausible hypothesis about the F0 differences.
[225000690100] |1) Normal: yeah = falling mid-low; right = falling mid-low
2) Sarcastic: yeah = rising low-high; right = rising low-high
3) Back-channel: yeah = steady mid-mid; right = rising low-high
[225000690110] |I have little confidence in my intuitions about the prosodic properties of (1), but I feel (2) and (3) are a pretty good guess.
[225000690120] |BTW, I happened to run across this paper by Joseph Tepperman et al. from USC: “YEAH RIGHT”: SARCASM RECOGNITION FOR SPOKEN DIALOGUE SYSTEMS.
[225000690130] |I haven’t read it, but it seems somewhat relevant to my point: “This paper presents some experiments toward sarcasm recognition using prosodic, spectral, and contextual cues.”
[225000700010] |Oh, you fools!
[225000700020] |Geoffrey Pullum has a cute post over at Language Log today about the uses of language, the least of which, he declares, is to inform:
[225000700030] |I'm sorry, I don't want to sound cynical and jaded, but language is not for informing.
[225000700040] |His whole post is worth the read, but this sparked my memory about a paper I wrote many years ago.
[225000700050] |In my life previous to linguistics, I was a damned filthy English major but I took a course once that had something to do with discourse and conversation analysis (but, ya know, utterly vacuous in the way only English department courses can be) and I recall being frustrated by the assumption in the literature that communication was fundamentally "cooperative".
[225000700060] |Being the damned filthy English major that I was, I wrote an entire seminar paper without doing any empirical research at all, not even a Liberman-esque Breakfast Experiment; rather, I argued from my gut (as Colbert might say) that human communication was fundamentally competitive with each participant trying to "win" something, or at least in some sense trying to outperform the other.
[225000700070] |Unfortunately, that's about all I can remember of the whole event.
[225000720010] |"X Collar Job"
[225000720020] |The Polyglot Conspiracy blogs about a construction which may count as a new snowclone.
[225000720030] |white collar = professional class job
[225000720040] |blue collar = working class job
[225000720050] |green collar = eco-friendly job
[225000720060] |pink collar = a job that is traditionally performed by women
[225000730010] |Corporate Semantics
[225000730020] |I took precious time out of my busy day (I'm giggling right now) to complete my company's "Employee Preferences Survey".
[225000730030] |Part of the survey provided a series of work environment descriptions of two different jobs and asked me to decide my preferences between them (assuming everything else about the jobs was the same).
[225000730040] |However, the differences between the two were often pinned to my semantic judgments of lexical items.
[225000730050] |I cut-and-pasted a few of the actual questions below.
[225000730060] |Seldom vs. Sometimes
[225000730070] |Job 1: Company seldom recognizes employees' individual performance and work contributions
Job 2: Company sometimes recognizes employees' individual performance and work contributions
[225000730080] |Sometimes vs. Frequently
Job 1: Company sometimes recognizes employees' individual performance and work contributions
Job 2: Company frequently recognizes employees' individual performance and work contributions
[225000730090] |Another part of the survey asked me to rate on a scale of 1-100 how likely I would be to leave my current job for a new one of the given description.
[225000730100] |In the description below, taking (a) and (g) together leads to the conclusion that my current pay must be WAYYYYYYYYYYYY below "market rate".
[225000730110] |Is this what my current employer believes?
[225000730120] |Is it time for me to ask for a raise, or am I to draw the inference that this hypothetical job will offer me (and only me) substantial compensation?
[225000730130] |More importantly, how do I answer the question?
[225000730140] |a) Company typically pays well below "market rate"
b) Direct manager is one of the worst in quality
c) Always working on challenging and "leading-edge" work in your field
d) Company frequently recognizes employees' individual performance
e) Depending on your performance, bonus can add up to 20% to your pay
f) About 10% out-of-town business travel
g) Base pay 30% more than current
h) Coworkers are above average in quality
[225000730150] |OMG!!!
[225000730160] |It actually asked me to rate my agreement level with THIS:
[225000730170] |I am “emotionally attached” to (company name).
[225000730180] |Dude!
[225000730190] |WTF?
[225000730200] |Is someone out there actually “emotionally attached” to their company?
[225000730210] |If yes, would they be willing to admit this on a survey?
[225000730220] |(I am willing to stipulate that some professors may in fact be “emotionally attached” to their university employers, especially if they teach at the same school they graduated from.
[225000730230] |I could see professors at Harvard and Stanford being a tad “emotionally attached” to their employer, but we all know that professors at Harvard and Stanford don’t count; they’re not normal people like the rest of us, hehe).
[225000730240] |Finally, double how's are tough:
[225000730250] |How satisfied are you with how the performance management system reviews your accomplishments?
[225000740010] |“I don’t believe in X”
[225000740020] |Is this a snowclone?
[225000740030] |I can’t find it in the database or on the queue.
[225000740040] |It’s certainly a different use of ‘believe’ than “I don’t believe in unicorns”.
[225000740050] |In this special use of believe, the speaker does in fact believe X exists, but they disagree with it in some way.
[225000740060] |Google results for “I don’t believe in”:
[225000740070] |While X is almost always a noun, it can be a VP as in (d), and it’s often an eventive nominal as in (e) or a deverbal nominal as in (a).
[225000760010] |Killer App!
[225000760020] |Finally, someone has found a way to make automatic speech transcription USEFUL!
[225000760030] |Robot to Replace Professional Bloggers: NEC's PaPeRo is Programmed to Listen and Blog!
[225000760040] |By Mark Rollins (March 2007)
[225000760050] |…the PaPeRo looks like a cute anime critter that apparently you are supposed to be comfortable having a chat with.
[225000760060] |That's right.
[225000760070] |You interface with the PaPeRo via conversation, and you tell it about your day, just like you would write in your blog.
[225000760080] |The PaPeRo then processes your conversation by searching for keywords.
[225000760090] |It surfs the net for you, and finds you the appropriate video and audio images that you would use for a blog entry.
[225000770010] |"krispy"
[225000770020] |Sigh.
[225000770030] |Yet more proof that I am woefully ignorant of pop culture.
[225000770040] |I only just today discovered The Urban Dictionary.
[225000770050] |I love the fact that users can vote thumbs up or down on a definition, and the one with the best up-to-down ratio gets ranked first.
[225000770060] |I discovered this because a colleague of mine swears he caused the tipping point of the urban usage of “krispy” in white, nerd sub-culture (in Buffalo, at least).
[225000770070] |I’d be shocked if anyone wanted to challenge him for that title, so I’ll give it to him uncontested.
[225000770080] |Now, who the hell is Rihanna?
[225000780010] |The Ling-O-Sphere
[225000780020] |I spent a good deal of Sunday afternoon trolling around linguistics blogs.
[225000780030] |While there are dozens of linguists with blogs, it’s hard to keep track of them all.
[225000780040] |The Linguist List has a modest static list here.
[225000780050] |When I scan the blog roll at Language Log, it’s not even clear which ones are dedicated primarily to linguistics since many of the blog names are intentionally obscure.
[225000780060] |Also, many are defunct or stale, as wishydig recently noted.
[225000780070] |I found a couple which had no postings in 2 years, and many with none for months.
[225000780080] |(UPDATE: while doing something else mildly productive, I literally clicked on EVERY single blog listed in Language Log's blog roll.
[225000780090] |If you deleted each one that was either dormant for at least 6 months or had little linguistics content, you’d delete at least 70%).
[225000780100] |It would be nice to create a single site that aggregates all of our posts with regular updates.
[225000780110] |I mean something beyond Technorati or Digg or del.icio.us.
[225000780120] |I put the term “linguistics” into each of the three major social bookmarking sites above and frankly, the results were far from encouraging.
[225000780130] |Even though Technorati has a “blogs” tab, the first page of hits were not really linguistics blogs, as far as I could tell (the second page was more relevant).
[225000780140] |The Digg results were disappointing, to say the least.
[225000780150] |One reference to a Chomsky interview and one to a study on swearing, but again, none of the top hits appeared to be from blogs I would consider “linguistic blogs” (e.g., none are on the Language Log Other language blogs list).
[225000780160] |The del.icio.us returns at least put Language Log on top, but most of the first page returns were resource pages for computational linguistics, not blogs per se.
[225000780170] |Imagine a site which automatically checks a given set of linguistics websites, then updates a topic cloud which clusters posts according to relevance for a particular topic, with links to each post within the cloud, plus a blog roll of all participating blogs on the right margin.
[225000780180] |I could imagine this happening in one of two ways (I prefer the first, but it's computationally complicated):
[225000780190] |1) Search the participating blogs and perform some sort of cluster analysis of the words in each post, taking all the posts together as a corpus (perhaps an LSA style analysis), then create the cloud.
[225000780200] |2) Create a fixed set of topic key words, and search for semantically similar words in each post.
[225000780210] |I could imagine WordNet being used for this.
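Here's a rough sketch of what option (2) might look like, just to make the idea concrete. Everything in it is an assumption for illustration: the topic names, the hand-written sets of related words (standing in for whatever WordNet would supply), the `min_hits` threshold, and the two made-up posts. It only uses the Python standard library.

```python
import re
from collections import defaultdict

# Hypothetical topic keywords, each with a hand-written set of related words.
# A real system might pull these related words from WordNet; here they are
# hard-coded stand-ins.
TOPICS = {
    "syntax": {"syntax", "clause", "preposition", "phrase", "grammar"},
    "phonetics": {"phonetics", "tone", "prosody", "pitch", "vowel"},
    "semantics": {"semantics", "meaning", "ambiguity", "predicate"},
}


def tokenize(text):
    """Lowercase a post and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def tag_post(text, min_hits=2):
    """Return the topics whose related words occur at least min_hits times."""
    tokens = tokenize(text)
    tags = []
    for topic, related in TOPICS.items():
        hits = sum(1 for tok in tokens if tok in related)
        if hits >= min_hits:
            tags.append(topic)
    return tags


def build_cloud(posts):
    """Map each topic to the titles of the posts tagged with it."""
    cloud = defaultdict(list)
    for title, text in posts.items():
        for topic in tag_post(text):
            cloud[topic].append(title)
    return dict(cloud)


# Made-up example posts, not real blog content.
posts = {
    "Stranded": "Ending a clause with a preposition is fine grammar.",
    "Yeah right": "Sarcasm changes the pitch and tone of a phrase.",
}
cloud = build_cloud(posts)
print(cloud)
```

The fixed keyword table is the weak point, of course: it only finds topics someone thought of in advance, which is why option (1) still appeals to me more.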
[225000780220] |Whadda y'all thank?
[225000800010] |Geeking Out
[225000800020] |Though I’m not really a geek or nerd myself, I have spent a great deal of the last 10 years living and working amongst the amusing creatures and I find a few of their habits have crept into my general behavior.
[225000800030] |And so it was that I found myself today quite distracted by the various terms software developers use to refer to the things that they put into data structures (like vectors and arrays).
[225000800040] |Please note that this is a linguistics inquiry, not a programming one.
[225000800050] |There may be prescriptive uses of these terms, but as a linguist, I’m interested in the descriptive facts of how people actually use them.
[225000800060] |Programming tutorials will often refer to these things as members, elements, or items, but they are not consistent with their terms.
[225000800070] |For example, one Java author uses both “objects” and “elements” here:
[225000800080] |The main key difference is that this one doesn't actually remove objects at the end; we just leave them inside. [clip] Printing is accomplished using an Enumerator; which we use to march through every element printing as we move along. (emphasis added)
[225000800090] |Here’s the creator of Python, Guido van Rossum, using both “item” and “element”:
[225000800100] |insert(i, x) Insert an item at a given position.
[225000800110] |The first argument is the index of the element before which to insert, so a.insert(0, x) inserts at the front of the list, and a.insert(len(a), x) is equivalent to a.append(x). (emphasis added)
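For anyone who wants to see the behavior that quote describes, here is a minimal sketch in Python. The list contents are arbitrary placeholders I made up; only `insert`, `append`, and `len` come from the quoted documentation.

```python
# Illustrative list; the values are arbitrary placeholders.
a = ["b", "c"]

# Inserting before index 0 puts the new item at the front of the list.
a.insert(0, "a")

# Inserting before index len(a) has the same effect as appending.
a.insert(len(a), "d")

# For comparison, append adds to the end directly.
a.append("e")

print(a)
```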
[225000800120] |The folks at cppreference.com use “element” for lists, vectors, and double-ended queues, and “item” for sets, multisets, multimaps and maps here:
[225000800130] |insert (Vectors) inserts elements into the container
insert (Double-ended Queues) inserts elements into the container
insert (Lists) inserts elements into the container
insert (Sets) insert items into a container
insert (Multisets) inserts items into a container
insert (Multimaps) inserts items into a container
insert (Maps) insert items into a container
(emphasis added; modified from a table)
[225000800140] |But in another place, they switch from elements to members:
[225000800150] |Individual elements of a vector can be examined with the [] operator. [clip] Two vectors are equal if:
[225000800160] |1. Their size is the same, and
[225000800170] |2. Each member in location i in one vector is equal to the member in location i in the other vector.
[225000800180] |There are two things at play here: 1) lexical preferences and 2) discourse preferences.
[225000800190] |Though we may have a default preference for a particular term, in certain contexts we may choose another term (e.g., to avoid repetition).
[225000800200] |Exactly what the relevant context is, and what function the choice serves, is not clear to me.
[225000800210] |I suspect that one factor is whether or not the author wants to foreground the content of the container or the structure of the container.
[225000800220] |In classic empirical fashion, I performed a lightweight Kitchen Experiment to collect some facts about usage.
[225000800230] |I Googled the constructions “X in a vector” and “X in an array” where “X” was replaced systematically by a series of possible “item” words.
[225000800240] |The lists below present the results ordered by number of hits (in its infinite wisdom, Blogger kindly removed my formatted tables and replaced them with tabbed lines).
[225000800250] |"X in a vector"
[225000800260] |"X in an array"
[225000800270] |Of course, and as always, I continue my use of the term Kitchen Experiment to avoid being sued by Mark Liberman for trademark infringement.
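If you'd rather run this kind of count over your own pile of text than trust Google hit counts, the counting step could be sketched roughly like this. It's a sketch under assumptions: the candidate words are the ones from the tutorials quoted above, the function name `count_frames` is my own invention, and the sample sentences are made up (they are not search results).

```python
import re
from collections import Counter

# Candidate "item" words taken from the tutorials quoted above.
CANDIDATES = ["element", "item", "member", "object"]
# The two search frames, with the article written out per container.
FRAMES = {"vector": "in a vector", "array": "in an array"}


def count_frames(texts):
    """Count '<candidate>(s) in a vector/array' occurrences across texts."""
    counts = {container: Counter() for container in FRAMES}
    for container, frame in FRAMES.items():
        for word in CANDIDATES:
            # Match singular or plural candidate followed by the frame.
            pattern = re.compile(
                rf"\b{word}s?\s+{re.escape(frame)}\b", re.IGNORECASE
            )
            counts[container][word] = sum(
                len(pattern.findall(text)) for text in texts
            )
    return counts


# Made-up sentences standing in for a corpus; not real search results.
sample = [
    "Count the elements in a vector before resizing.",
    "Each item in an array has an index.",
    "Sort the elements in an array, then every element in a vector.",
]
counts = count_frames(sample)
for container, counter in counts.items():
    # most_common() orders the candidates by number of hits.
    print(container, counter.most_common())
```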
[225000810010] |Linguists vs. Economists
[225000810020] |CAVEAT: I have knitted this post together from random scraps of thoughts I’ve been collecting for the better part of two months.
[225000810030] |I make no apologies for its speculative nature and erratic structure.
[225000810040] |Here we go…
[225000810050] |A rather rambling post on observing versus prescribing.
[225000810060] |My Basic Question: What’s the cognitive difference (if any) between linguistic decisions and economic decisions?
[225000810070] |My Other Question: What’s the difference between the way linguists study data and the way economists study data?
[225000810080] |As I’ve said on this blog before, over the last few years I’ve become increasingly interested in economics, particularly macro-economics.
[225000810090] |I follow several econ blogs like Greg Mankiw, Brad DeLong and Freakonomics (have not yet read the actual Freakonomics book, though) and have read a variety of popular econ books by Jeffrey Sachs and Joseph Stiglitz.
[225000810100] |For what it’s worth (not much) Mankiw can count me as a member of the Pigou Club.
[225000810110] |But the more I read, the more uncomfortable with basic economics analysis I am becoming, and I think I’ve finally hit on why.
[225000810120] |It’s because, as a linguist, I’ve been trained in descriptive analysis, but it’s my impression that economists are in the business of prescriptions.
[225000810130] |Specifically, they prescribe optimal decision making.
[225000810140] |All economists I’ve encountered (liberal, conservative, libertarian) assume a model of human decision making where some decisions are good and others are bad (presumably based on the utility of the outcomes of those decisions).
[225000810150] |If I’ve gotten this generalization wrong, please let me know.
[225000810160] |Though I’ve never studied it formally, this seems to be the basic point of game theory.
[225000810170] |I think it’s fair to say that few linguists think of linguistic decisions as either good or bad.
[225000810180] |Linguists avoid labeling decisions as “bad” because that would be a fundamentally prescriptivist approach.
[225000810190] |Linguists simply record decisions as facts (e.g., Americans say “ain’t” in context Y but not in context Z) then try to model the decision making as it is.
[225000810200] |Part of the difference is probably due to the fact that economists work under the assumption that economic decisions are conscious and rational while linguists work under the assumption that linguistic decisions are automatic or unconscious; it’s not as though we think people are weighing the option between ain’t and am not then making a rational choice regarding which to choose to articulate.
[225000810210] |It is the natural human language system that makes the decision, and it’s our job as linguists to figure out what factors influence that decision.
[225000810220] |Weighing the option between a 25 cent/lb orange from California vs. a 35 cent /lb orange from Florida seems like a very different kind of decision.
[225000810230] |This descriptive/prescriptive distinction is fundamental to studying linguistics, and it’s also the one point above all others that causes the most friction between linguists and non-linguists when discussing language issues.
[225000810240] |Popular perception holds that linguists want to change the way people talk.
[225000810250] |But this is pure fantasy.
[225000810260] |Maybe high school English teachers do, but linguists don’t. Linguists are first and foremost observers.
[225000810270] |Can this also be said of economists?
[225000810280] |Are they trained to avoid prescriptions?
[225000810290] |My impression is, no. Quite the opposite.
[225000810300] |Economists are trained to provide prescriptions.
[225000810310] |Economists seem to be primarily concerned with policy.
[225000810320] |Policy analysis is about prescribing decisions.
[225000810330] |For a recent example of this, see Mankiw’s post Must or Should.
[225000810340] |It seems to be the case that what Mankiw calls positive vs. normative statements is basically the same thing as what linguists call descriptive vs. prescriptive statements.
[225000810350] |This all started with Mankiw referencing some neuroeconomists’ support for the claim that the comparative size of a reward affects people’s happiness (not just the absolute size).
[225000810360] |According to Mankiw, he and Brad DeLong have different views about people’s basic motivations (a descriptive fact).
[225000810370] |DeLong made this descriptive comment September 3, 2006:
[225000810380] |My point was that the rich are spiteful--that they enjoy the envy of the poor.
[225000810390] |Here’s Mankiw’s summary from Sept 3, 2006:
[225000810400] |Brad seems to see the rich as especially mean and spiteful.
[225000810410] |I see them as some combination of more talented, hard-working, and lucky than average but otherwise like everyone else.
[225000810420] |(Or maybe Brad views everyone as mean and spiteful and the rich as having more opportunities to exercise these vile attributes.)
[225000810430] |I wonder if our varying perspectives on human nature can partly explain our different positions on public policy.
[225000810440] |It seems to me the basic problem facing both Mankiw and DeLong (and any policy minded economist) is the challenge of teasing their respective descriptions of how people really do make decisions apart from their respective prescriptions for how governments should affect people’s decision making.
[225000810450] |Economics is fundamentally about human decision making, making it, in essence, a branch of cognitive science.
[225000810460] |I recognize my own need to tread carefully on the ground of economics since I have no training in the area.
[225000810470] |I will gladly stipulate that economists like Mankiw and DeLong are very smart, well trained scholars who understand the complexities and nuances of economic decision making far better than the vast majority of people on this Earth (certainly far better than little ol’ me).
[225000810480] |I fear I might be reducing their analyses down to an absurd straw man out of my own ignorance of their actual, detailed work which may be far more savvy about descriptions of decision making than I am currently giving them credit for.
[225000810490] |Nonetheless, in the spirit of off-the-top-of-your-head blogging, I wonder if they are trained in the descriptive/prescriptive distinction the way linguists are.
[225000810500] |Linguists are explicitly trained to describe language as it really is (if North American English speakers say "ain't" and end their sentences with prepositions, then so be it, we study that).
[225000810510] |In fact, most linguists are trained to despise prescriptivism as inherently unnatural and counterproductive.
[225000810520] |Language is what it is.
[225000810530] |But when I read the popular work of economists (Mankiw's blog, Sachs's books, Stiglitz's book, etc.) I often wonder if they are keeping the distinction straight.
[225000810540] |It seems like they confuse their prescriptions for the way people ought to make decisions with the facts of how people really make decisions (and that was the point of the neuro-economics study).
[225000810550] |Any policy suggestions linguists may recommend (e.g., anti-English only) are based on our understanding of the facts about how people really use language, not on opinion about how we feel people should use language.
[225000810560] |In other words, it seems like economic policy suggestions from Mankiw et al. are based on their (educated, intelligent) opinion about how people ought to make decisions, with the, at least implicit, hope of changing the decision making system of the average person.
[225000810570] |I doubt any serious linguist would make policy suggestions in the hopes of changing the way people naturally make linguistic decisions.
[225000810580] |Do economists try to sculpt economic policy suggestions to try to change the way people naturally make decisions?
[225000810590] |My hunch is, yes.
[225000810600] |And I fear this is a fool’s errand.
[225000810610] |I have a close friend who did a B.A. in economics at Berkeley in the early nineties and graduated 3rd in his class.
[225000810620] |He went on to UCLA law school, and finally got a Harvard MBA.
[225000810630] |The guy knows his stuff.
[225000810640] |I emailed him the following questions about his economics education regarding this distinction: Were you explicitly trained to tease these two things apart?
[225000810650] |Do you recall this distinction being a principle of economic analysis?
[225000810660] |Do you think economists try to sculpt economic policy suggestions to try to change the way people naturally make decisions?
[225000810670] |His answer:
[225000810680] |The answer is that I don't remember much emphasis on doing things one way or the other.
[225000810690] |We clearly did want to create theories that would help people make better choices, but there were also theories about why people made bad choices where there were market break downs (generally problems with information or instances where group benefits conflict with individual benefits).
[225000810700] |Clearly though I don't remember us ever being told "do not try to change people only observe."
[225000810710] |Changing for the better would have been seen as good I think (emphasis added)
[225000810720] |Interesting.
[225000810730] |We seem to have hit on the crux of the issue.
[225000820010] |Chomsky on Prescriptivism
[225000820020] |I just skimmed a 1991 interview with Chomsky where he discusses the role of prescriptivism in writing and formal language (HT: London Language).
[225000820030] |Q.
[225000820040] |In College English in 1967, you wrote that “a concern for the literary standard language—prescriptivism in its more sensible manifestations—is as legitimate as an interest in colloquial speech.”
[225000820050] |Do you still believe that a sensible prescriptivism is preferable to linguistic permissiveness?
[225000820060] |If so, how would you define a sensible prescriptivism?
[225000820070] |A.
[225000820080] |I think sensible prescriptivism ought to be part of any education.
[225000820090] |I would certainly think that students ought to know the standard literary language with all its conventions, its absurdities, its artificial conventions, and so on because that’s a real cultural system, and an important cultural system.
[225000820100] |They should certainly know it and be inside it and be able to use it freely.
[225000820110] |I don’t think people should give them any illusions about what it is. It’s not better, or more sensible.
[225000820120] |Much of it is a violation of natural law.
[225000820130] |In fact, a good deal of what’s taught is taught because it’s wrong.
[225000820140] |You don’t have to teach people their native language because it grows in their minds, but if you want people to say, “He and I were here” and not “Him and me were here,” then you have to teach them because it’s probably wrong.
[225000820150] |The nature of English probably is the other way, “Him and me were here,” because the so-called nominative form is typically used only as the subject of the tense sentence; grammarians who misunderstood this fact then assumed that it ought to be, “He and I were here,” but they’re wrong.
[225000820160] |It should be “Him and me were here,” by that rule.
[225000820170] |So they teach it because it’s not natural.
[225000820180] |Or if you want to teach the so-called proper use of shall and will—and I think it’s totally wild—you have to teach it because it doesn’t make any sense.
[225000820190] |On the other hand, if you want to teach people how to make passives you just confuse them because they already know, because they already follow these rules.
[225000820200] |So a good deal of what’s taught in the standard language is just a history of artificialities, and they have to be taught because they’re artificial.
[225000820210] |But that doesn’t mean that people shouldn’t know them.
[225000820220] |They should know them because they’re part of the cultural community in which they play a role and in which they are part of a repository of a very rich cultural heritage.
[225000820230] |So, of course, you’ve got to know them.
[225000830010] |frik!
[225000830020] |Okay, like most straight men over 30, I’m in love with Sarah Chalke from Scrubs.
[225000830030] |A big part of the infatuation comes from the way she says “frik!”.
[225000830040] |In this one-minute YouTube of Elliot Reid moments, there are a couple of nice examples of “frik” (including a precious “double frik”) near the end.
[225000830050] |In my non-blog, non-professional life, I swear like a drunken sailor (always have, always will).
[225000830060] |I love cursing and make no apologies for it.
[225000830070] |Given my comfort with, even preference for, all breeds of vile, contemptuous speech, I am surprised to find myself taken with a special fondness for the euphemism “frik” and its variants.
[225000830080] |But I love it.
[225000830090] |frikkin -- 638,000 Google hits.
[225000830100] |The Urban Dictionary's def:
[225000830110] |In between "fuckin" and "effin".
[225000830120] |A term used in the classroom or where your not allowed to cuss.
[225000830130] |friggin -- 669,000 Google hits.
[225000830140] |The Urban Dictionary's def:
[225000830150] |A word used by cowards who are too afraid to say "fucking"
[225000830160] |299,000 for freekin; 18,300,000 for freakin; 3,870,000 for frickin
[225000840010] |On Errors
[225000840020] |Apropos of my previous rant, I just discovered Invented Usage which has several recent posts on errors.
[225000840030] |In one, they ask a damned good question:
[225000840040] |isn't it a contradiction for something to be a common linguistic error?
[225000860010] |The Preposition 'from'
[225000860020] |Having just now discovered I missed National Preposition Day, I offer a post relating to the preposition from and my dissertation.
[225000860030] |It has long been noted that the English preposition from most typically occurs with Source (Huddleston and Pullum 2002; Van Valin and LaPolla 1997; Jolly 1991; Clark and Carpenter 1989a; Clark and Carpenter 1989b; Quirk 1985; Vestergaard 1977; Wood 1967).
[225000860040] |(1) a. Chris returned from California. b. Hide took the book from Atsuko. c. Mike drove from Buffalo to Toronto.
[225000860050] |There are some uses, however, where it appears to occur with Goals and Themes as well.
[225000860060] |(2) a.
[225000860070] |The fence blocked the car from the driveway. b.
[225000860080] |The tent shielded the kids from the rain.
[225000860090] |When from occurs with verbs denoting barrier events, like bar, ban, block, and shield, it can mark NPs representing either the unattained goal of the entity being blocked or the restrained theme which failed to attain its goal.
[225000860100] |Interestingly, it can also mark VPs as in (3):
[225000860110] |(3) The judge barred the journalists from entering the courtroom.
[225000860120] |In my first qualifying paper (SUNY Buffalo’s linguistics department uses qualifying papers in lieu of a master’s thesis) I argued that from acts not as a preposition, but rather as a complementizer with barrier verbs.
[225000860130] |The class of English “barrier verbs” (as originally sketched by Len Talmy) are negative causative object control verbs which encode the relationships between a goal directed participant (or “agonist”), its goal and a barrier participant (or “antagonist”).
[225000860140] |They are negative verbs in the sense of Laka 1994: they encode the negation of an event.
[225000860150] |Think of the verb neglect.
[225000860160] |If you neglect to do X, then X did not happen.
[225000860170] |With barrier verbs, if you ban X from doing Y, then Y did not happen.
[225000860180] |These verbs fall into the following general constructional template:
[225000860190] |NP1 verb NP2 from NP3/VP.
[225000860200] |In this construction, the subject of a barrier verb (NP1) acts as the barrier (either directly or indirectly) to the achievement of a goal event (NP3/VP) by a goal-directed participant (NP2).
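The template above can be sketched as a toy surface-string matcher. This is only an illustration, not the analysis from the qualifying paper: the verb list is limited to the four verbs named in this post, and the names `BARRIER_VERB`, `TEMPLATE`, and `match_barrier` are invented for the example. It does no parsing, so it cannot tell an NP from a VP in the final slot.

```python
import re

# Assumption: a tiny, hand-picked verb list (bar, ban, block, shield) in a few
# inflected forms. A real lexicon of barrier verbs would be much larger.
BARRIER_VERB = r"(?:bar(?:s|red)?|ban(?:s|ned)?|block(?:s|ed)?|shield(?:s|ed)?)"

# NP1 verb NP2 from NP3/VP, matched on surface strings only (no parsing).
TEMPLATE = re.compile(
    rf"^(?P<np1>.+?)\s+(?P<verb>{BARRIER_VERB})\s+(?P<np2>.+?)\s+from\s+(?P<goal>.+?)\.?$"
)


def match_barrier(sentence):
    """Return the template slots as a dict, or None if the sentence does not fit."""
    m = TEMPLATE.match(sentence.strip())
    return m.groupdict() if m else None


for s in [
    "The judge barred the journalists from entering the courtroom.",
    "The fence blocked the car from the driveway.",
    "Chris returned from California.",  # Source use of from, no barrier verb
]:
    print(match_barrier(s))
```

The first two sentences fill all four slots, while the Source example returns None because returned is not in the verb list.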
[225000860210] |I have recently discovered that Idan Landau has a detailed analysis of negative verbs in Hebrew which, extended to English prevent, suggests the complementizer interpretation as well.
[225000860220] |Though we have very different theoretical frameworks, I think we share some conclusions.
[225000860230] |I’m interested in a variety of the phenomena associated with barrier verbs (including the potential for a coercion analysis of the NP replacement of complement VPs).
[225000870010] |Speech-to-Text Searching
[225000870020] |A colleague just pointed me to the new search engine EVERYZING which searches digital media audio and video (YouTube, podcasts, etc.) for your search terms using a commercially available speech-to-text engine.
[225000870030] |I’m not in love with the results, but it’s a great application for a classic computational linguistics technology.
[225000870040] |How does EVERYZING work?
[225000870050] |EveryZing creates a text index of the audio data from audio and video files, using the industry's leading speech-to-text technology from BBN Technologies, to enable search within the spoken words of media, not just within the metadata.
[225000870060] |In the interests of full disclosure, though I do not work for BBN, my company does have some customers in common with them and we have utilized BBN products in service of a couple contracts (but I have not personally had any contact with BBN personnel or products).
[225000900010] |“A Star-Making Turn”
[225000900020] |I guess this is my week for movies and linguistics.
[225000900030] |I just saw Juno and while being completely smitten by the movie I couldn’t help but think of the cliché that this is a “star-making turn” for its young star, Ellen Page.
[225000900040] |Before turning myself towards that construction, I want to say that Juno is one of the best comedies to come out in years.
[225000900050] |Its author, Diablo Cody, has a back story worthy of its own screenplay, but you can Google around yourself to follow up on that.
[225000900060] |The characters have dialogue that snaps in the fashion of classic noir and screwball comedies and the cast is exceptional.
[225000900070] |Though it’s somewhat in the same genre as the disappointingly vacuous film Waitress, it’s a lot snappier and smarter; if you’re a fan of the recent great neo-noir film Brick you’ll probably love Juno.
[225000900080] |Now, on to linguistics: As I thought about the construction “a star-making turn” I thought it was an unusual NP and I wasn’t sure why, but it has something to do with the metaphorical mapping of turns coupled with the ambiguity of the adjective “star-making”.
[225000900090] |To begin, I Googled the phrase a bit to confirm my intuition that this construction is most common to entertainment news, and that seems to be true as these few examples should attest:
[225000900100] |-- Lily Allen Plots U.S. Takeover, Ben Gibbard Plans Star-Making Turn In Geeky Flick and More -- …including his star-making turn in the sleeper hit comedy, -- Rudy Giuliani's star-making turn in "Monsters Attack Manhattan
[225000900110] |Then I wanted to see what, if anything, could replace star, so I got the following Google results:
[225000900120] |42,400 for "a * making turn"; 205,000 for "an * making turn"
[225000900130] |There are few variants for “star”, although “epoch” came up more than I could have predicted and “career” appeared too, but I was surprised to find one use of match-making turn (this will turn out to be quite an instructive example).
[225000900140] |epoch-making turn: -- This is an epoch-making turn for Iran… -- Then came an epoch-making turn in the history of student politics from 1966. -- Classical economists' emphasis on labor was certainly an epoch-making turn if one thinks about it. -- The appearance of our first book triggered an epoch-making turn in the Japanese media's treatment of homosexuals. -- These have been composed at various times and languages, each at an epoch-making turn in the long history of the religion.
[225000900150] |career-making turn: -- an actress' career-making turn
[225000900160] |match-making turn: -- Humming to himself an air from "Faust" no one would have thought that he was deliberately contemplating doing a match-making turn, but certain it is that his brain was busy devising means of suggesting to Arthur what a splendid girl Martha was.
[225000900170] |I believe that match-making turn above is a different use than star-making, but I’ll get to that in a moment.
[225000900180] |The head noun turn can be used to mean either of the following:
[225000900190] |1) A change in direction (making a left turn); 2) The opportunity to do something (to take a turn Xing)
[225000900200] |In the case of (1), the turn would presumably refer to an anonymous actor turning from the path of anonymity to the path of fame.
[225000900210] |The case of (2), however, is more complicated: there would seem to be a metaphorical mapping to the concept of a person X taking a turn doing Y, in the case where Y causes the performer of the turn to become a kind of Z (I suspect that Z must be some kind of category name).
[225000900220] |But is it inherent in the act of Ying that one becomes a Z, or is it a special case that this time around in the otherwise ordinary and banal performance of Ying, X happened to become a Z?
[225000900230] |Let me put it this way: imagine I formed two lines of people.
[225000900240] |·In line A, each person steps up and gets a turn starring in a movie; the movies are mostly dull and ordinary and few people ever see them, but one in a thousand makes its actor famous.
[225000900250] |Line A is easy to get into and is quite long.
[225000900260] |·In line B, each person steps up and gets a turn starring in a movie that is guaranteed to make the actor a star (I’m very choosey about whom I allow to stand in line B).
[225000900270] |So, in using the phrase “a star-making turn”, are critics saying that an actor is in line A or line B? (a similar ambiguity exists in (1) as well, as far as I can tell).
[225000900280] |This could be stated as a structural attachment ambiguity of the phrase “star-making”.
[225000900290] |Is it (a) or (b) below:
[225000900300] |a) [[star-making]ADJ[turn]N]NP
[225000900310] |b) [star-making-turn]N
[225000900320] |There is now the contrast between “a star-making turn” and a “match-making turn”.
[225000900330] |They require quite different mappings, don’t they?
[225000900340] |Whereas a “star-making turn” causes the performer of the turn to become a kind of star, a “match-making turn” does NOT cause the performer of the turn to become a kind of match.
[225000900350] |I remember a friend of mine years ago saying something to the effect that “you linguists make language more complicated than it really is.”
[225000900360] |Well, that may be the case.
[225000900370] |Language may indeed be primal and simple.
[225000900380] |But it remains the case that in reading the phrase “a star-making turn”, we all somehow navigate multiple metaphorical mappings and structural ambiguity.
[225000900390] |How exactly that is done remains a mystery.
[225000900400] |Now go see Juno.
[225000900410] |It’s an adorable frikkin movie.
[225000920010] |"double-bagging"
[225000920020] |Scott Adams (of Dilbert fame) posted about a term he learned from his in-laws: double-bagging.
[225000920030] |His story about what it means in this context is cute, but you can read about it at his blog.
[225000920040] |What I find interesting is the use of an ordinary term typically referring to using two bags instead of one for groceries (as reinforcement) for the unusual situation involving the dog Millie.
[225000920050] |Like most linguists, I was required to study some historical linguistics and socio-linguistics involving language change.
[225000920060] |My memory is fuzzy, but I recall vaguely that there are models of neologism formation that account for the various ways an existing term gets transferred to a new domain.
[225000920070] |What's interesting to me about "double-bagging" is that the salient part of the term is the instrument, not the action, because the only thing the two uses share is the need for two bags.
[225000920080] |The way in which the two bags get used in each situation is, in fact, quite different.
[225000920090] |So, rather than foregrounding the similarity of the situations (the way metaphor might), this is a case where two unrelated situations happen to share an instrument, and it is the instrument which forms the neologism.
[225000920100] |I wonder if instruments in general lend themselves to this kind of linguistic process?
[225000920110] |Are there other cases where two dissimilar situations share an instrument (used in different ways) but have the instrument form a neologism?
[225000930010] |“donkeys” and “fish”
[225000930020] |I’m a degenerate poker player and make no apologies.
[225000930030] |I should be ashamed of the fact that I’ve spent more time this week playing poker than writing my dissertation, but I’m not, hehe.
[225000930040] |I play mostly here.
[225000930050] |Poker players constitute their own speech community of sorts and there has developed a set of lexical items unique to poker (there are a variety of poker term lists online and they’re all about the same).
[225000930060] |Recently, I was curious about the origins of the term donkey in poker, so I Googled it and found this claim at About.com here and there are a couple things I find off about it:
[225000930070] |Definition: If a poker player is called a 'donkey,' he's a bad player who makes blatantly bad poker plays.
[225000930080] |A weak player.
[225000930090] |Donkey is also shortened to "donk" by many players to announce that they're playing badly or planning to, as in "I'm going to donk it up tonight."
[225000930100] |Also Known As: fish, pigeons (my italics)
[225000930110] |First, I think they are wrong to claim donkey and fish are synonyms.
[225000930120] |A donkey is a bad player who wins (or sucks-out); a fish is a bad player who loses.
[225000930130] |A donkey is bad.
[225000930140] |A fish is good.
[225000930150] |Donkeys are bad because they take my money.
[225000930160] |Fish are good because they give me money.
[225000930170] |Second, I’ve never heard the term pigeon in the poker games I play in or in the excessive TV coverage, but it does occur in some of the poker glossaries as a synonym for fish.
[225000930180] |It may be a British English usage, I dunno.
[225000930190] |There are a variety of animal terms in poker, but their metaphorical associations are not always transparent.
[225000930200] |Shark: The most obvious is shark, a very strong player.
[225000930210] |Clearly, sharks are notoriously vicious, top-of-the-food-chain predators.
[225000930220] |Fish: The term fish may have been derived from shark, since fish are largely helpless, bottom-of-the-food-chain prey.
[225000930230] |It’s a nice bit of structuralist lexicon building, if that is the case.
[225000930240] |Donkey: I can’t find a discussion of the origins of this term in poker, so I have to take a guess at its metaphorical associations (classic back-formation hypothesizing …I’ll almost certainly be wrong, hehe).
[225000930250] |In poker, donkeys are stupid and lucky.
[225000930260] |It’s easy to see associating stupid with donkeys (no offense to donkey lovers), but the lucky part takes some work.
[225000930270] |I’ll hypothesize that the salient feature is the fact that, while stupid, real donkeys pack a potent kick that can hurt you.
[225000930280] |In poker, donkeys can play stupidly, but make a lucky hand and hurt you by taking lots of chips.
[225000930290] |Both kinds of donkeys are stupid but dangerous.
[225000930300] |The Nuts: This term refers to the best possible hand.
[225000930310] |I found the following story about its origin here (I have no way of verifying its veracity):
[225000930320] |This cool poker term dates way back to the Wild West where cowboys would gather round a table, preferably in a saloon but alternatively around a campfire, and play cards.
[225000930330] |Back then poker players would not always bet with cash or chips.
[225000930340] |It was a more rustic time, and men would often bet their horse and wagon on a poker hand.
[225000930350] |Legend has it that when a cowboy bet his wagon he would unscrew the nuts from his wagon wheels and place them in the pot.
[225000930360] |The reason behind this gesture was that in the event that he lost the pot he could not leap up, hop into his wagon and ride away with his wager.
[225000930370] |The fact that he was willing to put those nuts in the pot as surety for the strength of his hand resonated through the prairie, and came to be synonymous with the best hand.
[225000930380] |A cowboy would only bet "the nuts" when he was convinced that his hand was the best out there. (emphasis added)
[225000930390] |In an interesting extension of the term, ESPN has taken to using the term The Nuts to refer to a series of videos highlighting the little oddball or quirky aspects of poker.
[225000930400] |Though I couldn’t find an official site for the videos, I found this unofficial site here.
[225000930410] |What ESPN has done is take a term that is familiar to a speech community with one meaning and extend its usage by playing on an unrelated usage (craziness).
[225000950010] |How Does Language Work?
[225000950020] |I only just now stumbled on to Liberman's post here about Stanley Fish's dark view of the future of The Humanities written about in "Will the Humanities Save Us?" and "The Uses of the Humanities, Part Two".
[225000950030] |In 1996 I made the decision to quit graduate school in English Literature, near the beginning of a career in a field I was well suited to, to start fresh in a field I was woefully undertrained for: linguistics.
[225000950040] |I did this partly because I had lost the faith, so to speak.
[225000950050] |I shared Fish's "moments of aesthetic wonderment" but I just couldn't see what I would spend the next 30 years of my life doing.
[225000950060] |What do English professors do?
[225000950070] |I never found a satisfying answer to that question.
[225000950080] |Linguistics, on the other hand, drew me in precisely because there were (and still are) so many unanswered questions.
[225000950090] |But the king daddy of them all, the fundamental question of linguistics, is this: How does language work?
[225000950100] |In the same way that you may look at a river and ask how does this work (Where does the water come from? Where does it go?), linguists look at human languages and ask how they work.
[225000950110] |Linguists are essentially reverse engineers.
[225000950120] |It is as if we have found a mystery box that does something: produces language.
[225000950130] |It appears to behave systematically and at least somewhat predictably.
[225000950140] |We'd like to know how it does that.
[225000950150] |And the most tantalizing thing about linguistics is this: we have no answer to the fundamental question.
[225000950160] |We still don't know how language works.
[225000950170] |I'm looking forward to reading Fish's articles more closely, but I fear we agree.
[225000960010] |Language Evolution Blog
[225000960020] |Thanks to Psycholinguistics Arena's blog posts aggregator, I just discovered the blog Language Evolution.
[225000960030] |The blog "aims to act as an informal resource, meeting place and forum for the academic field of language evolution."
[225000960040] |Looks like an interesting blog.
[225000970010] |Prepositions Don't Count
[225000970020] |From the often hilarious blog Totally Not Crazy we find another example of PGSLTSS (which I first blogged about here):
[225000970030] |In any case I really hope Paris sells her next house to someone I'm tangentially connected__, so I can see what bold new choices she's forging in her design palette. (emphasis added)
[225000970040] |The totally-not-crazy author might be sane (doubt it) but he’s clearly suffering from post grade school linguistics traumatic stress syndrome (PGSLTSS).
[225000970050] |Having properly completed the ditransitive construction sell X to Y, he was faced with both the terror of a possibly redundant “to” AND ending a sentence with a preposition (gasp!) because he decided to fill the dative argument of sell with the dislocated argument of connected, leaving the preposition stranded.
[225000970060] |Fearing retribution from The Nuns (perhaps) he chose to take the conservative route, and delete the last preposition, just in case.
[225000970070] |No biggie, right?
[225000970080] |Prepositions don’t really count anyway, hehe.
[225000980010] |"The Stuff I Just Thought Up"
[225000980020] |In a post titled “The Perils of Popularising Science”, Jason Zevin, a cognitive neuroscientist, gets in a little Pinker bashing.
[225000980030] |The whole post is worth reading, but here's a little nugget:
[225000980040] |This is why it is always so disorienting to talk to people who have just read or are reading anything by Steven Pinker (such as his recent piece "The Moral Instinct" in the New York Times Magazine).
[225000980050] |Often, these people know all kinds of amazing things--including things I'm pretty sure aren't true.
[225000980060] |This is not to say that Pinker is a charlatan (although some researchers might actually go this far; a colleague just vandalised my copy of "The Stuff of Thought", changing it to "The Stuff I Just Thought Up").
[225000980070] |The problem is that our field is one with many open questions, many confusing and apparently mutually exclusive data points, not to mention a dizzying array of theoretical perspectives to consider.
[225000980080] |I have mixed feelings about Pinker.
[225000980090] |I admire his contribution to psycholinguistics, even while disagreeing with some of his major conclusions.
[225000980100] |He was a brilliant empirical researcher who moved the science of linguistics forward.
[225000980110] |His early work on the acquisition of argument structure continues to be influential and relevant.
[225000980120] |His popular works are well written and entertaining and have inspired new linguists.
[225000980130] |But of late, he seems to have jumped off the deep end of rationality and come to the conclusion that his opinions and intuition are more than that; they are now fact.
[225000980140] |I think we would all be better off if Pinker got off the lecture circuit and back into the lab and started studying verbs again.
[225000980150] |(HT: Andrew Sullivan)
[225000990010] |You Say 'poverty of stimulus', I say 'innateness hypothesis'...
[225000990020] |I’ve posted on the Innateness Hypothesis several times before and I just ran across an article by Geoffrey Pullum and Barbara Scholz reviewing that very topic: Empirical assessment of stimulus poverty arguments (pdf).
[225000990030] |Thought y'all might enjoy the read.
[225000990040] |Abstract: This article examines a type of argument for linguistic nativism that takes the following form: (i) a fact about some natural language is exhibited that allegedly could not be learned from experience without access to a certain kind of (positive) data; (ii) it is claimed that data of the type in question are not found in normal linguistic experience; hence (iii) it is concluded that people cannot be learning the language from mere exposure to language use.
[225000990050] |We analyze the components of this sort of argument carefully, and examine four exemplars, none of which hold up.
[225000990060] |We conclude that linguists have some additional work to do if they wish to sustain their claims about having provided support for linguistic nativism, and we offer some reasons for thinking that the relevant kind of future work on this issue is likely to further undermine the linguistic nativist position.
[225000990070] |Enjoy!
[225000990080] |(HT: Language Evolution blog)
[225001010010] |Why Should I Learn a Foreign Language?
[225001010020] |We’re not that far from the Universal Translator, right?
[225001010030] |Welcome to SpeakLike, the first instant messaging service for accurate, real-time translation chat across different languages.
[225001010040] |You type text in your language, and others see it in theirs.
[225001010050] |Skype has their version too: Universal Chat Language Translator and Speaker for Skype.
[225001010060] |It goes without saying that the boys and girls at Carnegie Mellon have already developed their version and gotten it to market: Franklin 12-Language Speaking Global Translator.
[225001010070] |(HT Blogos)
[225001020010] |Words and Meaning
[225001020020] |In discussing the recent Japanese phenomenon of cell phone novels, a reader of Andrew Sullivan’s blog tries to explain why the Japanese language is well suited to this style:
[225001020030] |The use of Chinese characters also serves to compact sentences.
[225001020040] |Since you don't have to actually spell out entire words, as in English, but can represent them with an ideogram, you can say a lot more in a much smaller space.
[225001020050] |I will provisionally accept that kanji and kana make typing out written Japanese on a cell phone more efficient than typing out English (in the sense of requiring fewer key strokes; I'd have to test to see if this is really true), but I reject the logical fallacy that this mechanical efficiency leads to greater meaning.
[225001020060] |This strikes me as a variation of a phenomenon Ben Zimmer over at Language Log has written about regarding the all too often misrepresented meaning of the Chinese word for ‘crisis’, wēijī.
[225001020070] |Underlying both of these is the naïve belief that logograms are inherently more meaningful than alphabetic words.
[225001020080] |This belief, I reject.
[225001020090] |I could be wrong about this, but my hunch is that the human language system takes all written representations of language and converts them into an internal mental representation it’s happy with.
[225001020100] |There may be differences between the way the brain accesses the meaning of kanji and the way the brain accesses the meaning of alphabetic words (in terms of recognition), but I don’t see any reason to believe that the internal semantic representation of kanji is somehow different than the representation of words.
[225001020110] |If I’m wrong and there is a difference, this would be an interesting piece of data for the Sapir-Whorf folks.
[225001020120] |FYI: The Sapir-Whorf hypothesis (aka linguistic relativity) has re-emerged in recent years.
[225001020130] |Some of the most interesting empirical work is being done by Buffalo’s own Jürgen Bohnemeyer and his Spatial language and cognition in Mesoamerica project.
[225001030010] |The Perils of Semantic Annotation
[225001030020] |One of the most challenging tasks a linguist can engage in is that of annotating natural language text for semantics.
[225001030030] |It is simultaneously interesting, tedious and tricky, which makes it altogether maddening.
[225001030040] |We perform this task for a variety of reasons.
[225001030050] |Sometimes it is to create training data for learning algorithms (which was a big topic of discussion at last year's NAACL HLT), and sometimes to explicate the semantics of events, as in the FrameNet project.
[225001030060] |Part of my dissertation is very FrameNet-like, so I do a lot of annotating (I will save my bile-filled hateful remarks about the general crappiness of annotator apps for another post).
[225001030070] |Generally speaking, the annotator's task is to read naturally occurring sentences, then identify and tag the semantic roles of the participants involved in the particular event represented by the sentence.
[225001030080] |It would be easy if all of English was composed of sentences like "Bobby kicked the ball"; that would be sweet.
[225001030090] |"Bobby" is an AGENT, "the ball" is a PATIENT.
[225001030100] |Done.
[225001030110] |Let's move on.
[225001030120] |But that's not how real language works, is it?
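For the easy case, the annotator's output can be pictured as labeled character spans over the sentence. The following is only a minimal sketch under my own assumptions: the `RoleSpan` class, the `annotate` helper, and the two-label role set are hypothetical and are not taken from FrameNet or any annotator app.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RoleSpan:
    role: str   # semantic role label, e.g. "AGENT" (label set is an assumption)
    start: int  # character offset, inclusive
    end: int    # character offset, exclusive


def annotate(sentence: str, roles: dict[str, str]) -> list[RoleSpan]:
    """Locate each role filler in the sentence and record its character span."""
    spans = []
    for role, filler in roles.items():
        start = sentence.find(filler)
        if start == -1:
            raise ValueError(f"{filler!r} not found in sentence")
        spans.append(RoleSpan(role, start, start + len(filler)))
    return spans


sentence = "Bobby kicked the ball"
spans = annotate(sentence, {"AGENT": "Bobby", "PATIENT": "the ball"})
for span in spans:
    print(span.role, repr(sentence[span.start:span.end]))
```

A scheme this simple has no way to record a participant that is understood but never expressed in the sentence, which is where the hard cases start.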
[225001030130] |In any case, I have been annotating sentences involving the verb "exclude" recently and I find it's a particularly challenging set.
[225001030140] |The BNC “exclude” sentence below was difficult to annotate because the exclude event is not clear about its participants:
[225001030150] |The new Minister for Health, Dr Noel Browne, a dedicated reformer of the health services and much concerned in-particular with the eradication of tuberculosis in Ireland, modified the earlier bill to exclude the compulsion elements.
[225001030160] |At first, I thought “Dr Noel Browne” was the agent doing the excluding, but then I realized it was the bill which excluded.
[225001030170] |But which bill?
[225001030180] |I concluded that “the earlier bill” is NOT participating in the exclude event because, logically, it must be the version of the bill that came AFTER the earlier one which did the excluding.
[225001030190] |So, this requires a presupposed later bill.
[225001030200] |So, should I annotate the good Dr. as the agent, or leave this participant alone? (FrameNet's annotator app has the ability to mark an unexpressed element, and I believe this is exactly why, but I don't use their app.)
[225001030210] |Also, it’s not clear if the “to” means “in order to” as a purpose statement.
[225001030220] |Is the bill explicitly, directly excluding, or was that simply the intent of the changes?
[225001030230] |If it’s indirect, that makes Dr. Browne a better candidate for the agent of exclusion.
[225001030240] |Ugh!
[225001050010] |Fancy Corpus Search Tool
[225001050020] |I've only just now discovered the entirely online corpus search utility Sketch Engine by Adam Kilgarriff, Pavel Rychlý, and Jan Pomikálek.
[225001050030] |It can replicate a lot of what I do with tgrep2 and Python scripts, but a lot faster (I mean, A LOT faster).
[225001050040] |It has the advantages of being fast, easy to use, covering corpora from multiple languages (plus allowing you to add new corpora) and providing user-friendly output.
[225001050050] |One disadvantage is the brevity of the sketches it provides.
[225001050060] |For example, I performed a sketch of the verb "prevent" in the BNC and it returned a list of subjects and objects that occur with the verb.
[225001050070] |Sweet!
[225001050080] |This is really important stuff if you're interested in FrameNet type semantic description (see my related post here).
[225001050090] |Unfortunately, it maxed out at 100 (that's a small sample of the 10,000+ examples).
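To make the idea of a sketch concrete, here is a rough illustration of tallying the subjects and objects of a verb and capping each list. It is a toy under stated assumptions: the `triples` data is invented, the `sketch` function and its `limit` parameter are hypothetical, and none of this reflects how Sketch Engine is actually implemented.

```python
from collections import Counter

# Hypothetical (verb, relation, noun) triples, of the kind a parser might emit.
triples = [
    ("prevent", "subject", "law"),
    ("prevent", "object", "accident"),
    ("prevent", "object", "disease"),
    ("prevent", "object", "accident"),
    ("prevent", "subject", "police"),
    ("exclude", "object", "element"),
]


def sketch(verb, triples, limit=100):
    """Return the most frequent subjects and objects of `verb`, capped at `limit` each."""
    counts = {"subject": Counter(), "object": Counter()}
    for v, relation, noun in triples:
        if v == verb and relation in counts:
            counts[relation][noun] += 1
    return {relation: c.most_common(limit) for relation, c in counts.items()}


result = sketch("prevent", triples, limit=2)
print(result["object"])
```

With a cap in place, anything below the cutoff simply never appears, which is the limitation I ran into.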
[225001050100] |Nonetheless, this utility goes a long way to providing the sort of user-friendly (yet still sophisticated) online corpus query tools that I think the average non-computationally minded linguist would benefit from greatly.
[225001050110] |I've used Mark Davies' BNC interface a lot too and that's also an excellent, entirely online search tool.
[225001050120] |Davies provides a nice interface to a variety of corpora here.
[225001060010] |Die Buch, Die Tisch, Die Stuhl
[225001060020] |I never took grammatical gender seriously when I studied German.
[225001060030] |I just made everything feminine ‘cause, ya know, that was the easy one.
[225001060040] |The rest of my German was so bad, I figured it didn’t really matter anyway, right?
[225001060050] |(I frikkin LOVED studying Mandarin Chinese because, ya know, who needs morphology?)
[225001060060] |Now Heidi Harley has convinced me I was right all along.
[225001060070] |She blogs about Dalila Ayoun’s research on French gender:
[225001060080] |…native French speakers don't agree on the genders of French nouns.
[225001060090] |They really don't agree.
[225001060100] |Fifty-six native French speakers, asked to assign the gender of 93 masculine words, uniformly agreed on only 17 of them.
[225001060110] |Asked to assign the gender of 50 feminine words, they uniformly agreed only 1 of them.
[225001060120] |Some of the words had been anecdotally identified as tricky cases, but others were plain old common nouns. [clip] … second language speakers of French, take heart!
[225001060130] |Make your grammatical gender agreement mistakes with confidence.
[225001060140] |There's a chance that your native-speaker interlocutor will agree with your version!
[225001060150] |Danke, Heidi!
[225001060160] |Viel Danke!
[225001060170] |Pssst, I should note that David Zubin has done a variety of cognitive linguistic studies on German gender.
[225001060180] |Most recently, this one:
[225001060190] |Köpcke, Klaus-Michael and David A. Zubin 2003.
[225001060200] |“Metonymic pathways to neuter-gender human nominals in German”.
[225001060210] |In Metonymy and Pragmatic Inferencing, Panther, Klaus-Uwe and Linda L. Thornburg (eds.), 149–166.
[225001070010] |"garfield minus garfield"
[225001070020] |I never liked the comic Garfield.
[225001070030] |But under this guy's interpretation here, I find it brilliant!
[225001070040] |I haven't laughed out loud at a comic in years.
[225001070050] |These versions swing from hilarious, to sad and poignant, then back to hilarious.
[225001070060] |Who would have guessed that when you remove Garfield from the Garfield comic strips, the result is an even better comic about schizophrenia, bipolar disorder, and the empty desperation of modern life?
[225001070070] |Friends, meet Jon Arbuckle.
[225001070080] |Let’s laugh and learn with him on a journey deep into the tortured mind of an isolated young everyman as he fights a losing battle against loneliness and methamphetamine addiction in a quiet American suburb.
[225001070090] |(HT Andrew Sullivan)
[225001080010] |an ear for accents
[225001080020] |This woman has a gifted "ear" for accents.
[225001080030] |She starts in England, moves through Europe, on to Australia, then makes her way from west to east through North America.
[225001080040] |I'll note that her Texas and South Carolina are pure stereotype, but damn she nails California and Toronto.
[225001080050] |(HT Andrew Sullivan)
[225001090010] |"yeah right" again
[225001090020] |Eureka!
[225001090030] |I posted about the prosody of the phrase "yeah right" some time ago here.
[225001090040] |In particular, I claimed there are three interpretations of the phrase, but I don't have one of them in my dialect (Northern California), namely what I called "back-channel (sentiment agreement)", which is roughly equivalent to ‘mm-hmm’.
[225001090050] |However, I had no sound files.
[225001090060] |Now I've found a near perfect example of this mystery prosody in the trailer for Juno, about 36 seconds in (here).
[225001090070] |You can also read my most excellent review of Juno here.
[225001100010] |Blog Love, Italian Style
[225001100020] |Sitemeter consistently shows referrals to my blog from the Italian language blog Taccuino di traduzione 2.0 which Google translates as Translation Notebook 2.0.
[225001100030] |Unfortunately, I lack Italian language skills, so I am unable to enjoy the blog's postings.
[225001100040] |But I thought I'd pass it along to any of you who may wish to indulge.
[225001100050] |The latest post has a great painting of the famed Algonquin Roundtable titled "A Vicious Circle" by Natalie Ascencios.
[225001100060] |Meaning no offense to the superior original, but my lack of Italian drove me to Google translate the whole post.
[225001100070] |Reading this poor translation makes me want to run out, study Italian real quick, then read the rest of the blog:
[225001100080] |We have waited months and months in sweet Titlepage pending, the site should offer conversations (and why not talk) passionate and fiery editorial on the latest news, a new model Algonquin Round Table, with videointerviste choirs, forums on different literary genres For readers who do not give up ever, a blog, reviews, reports, awards, cotillons and who knows what else.
[225001100090] |All false promises.
[225001100100] |Although well prepared on the subject, the presenter (which surely read as a young Hamlet in jeans and black sweater, in some alternative theatre company) is uncomfortable in front of the camera (average training, anyone?), The writers guests look around terrified, set design probably is the work of a student to first weapons, the conversation is woody, boring and, above all, language, not to mention the editing of footage (used scissors?).
[225001100110] |A great sin.
[225001100120] |But this can only improve.
[225001100130] |ciao
[225001110010] |Jason Wins, hehe
[225001110020] |As if it wasn’t obvious, I decided to reiterate Jason’s point from the previous post, regarding the ante-previous post, by taking my post and running it through Google’s English to Italian translation.
[225001110030] |A thing of beauty, haha.
[225001110040] |Enjoy:
[225001110050] |Invece di commentare i miei commenters per quanto riguarda il mio post Blog di Amore, stile italiano, ho deciso di fare questo è un post --
[225001110060] |In risposta a Jason's acerbic commento "Credo che la più grande macchina di traduzione è stato solo uno scherzo, la pubblicazione della traduzione automatica. :) ",
[225001110070] |Con la presente risposta nel seguente modo:
[225001110080] |Non essere talkin 'trash' bout mio prezioso Google traduzioni; senza di loro, non potrei mai leggere la mia e-mail amico spagnolo Ana invia.
[225001110090] |Il suo inglese è peggiore di quella di Google traduzioni, in modo I'll take Google (rimshot!).
[225001110100] |E lei non crede che ci sia qualcosa di poetico nella prima riga.
[225001110110] |Ho potuto vedere alcuni 20th Century poeta americano Wallace Stevens iscritto come questo:
[225001110120] |Abbiamo aspettato mesi e mesi In attesa di Titlepage dolce, Il sito dovrebbe offrire conversazioni (E perché non parlare) Ardente e appassionato editoriale Le ultime notizie, un nuovo modello Algonquin Round Table
[225001120010] |On Google Translations
[225001120020] |Instead of commenting to my commenters regarding my post Blog Love, Italian Style, I decided to make this its own post –
[225001120030] |In response to Jason’s acerbic comment “I think the biggest machine translation joke was just posting the machine translation itself. :)”,
[225001120040] |I hereby reply thusly:
[225001120050] |Don't be talkin' trash 'bout my precious Google translations; without them, I could never read the emails my Spanish friend Ana sends.
[225001120060] |Her English is worse than the Google translations, so I'll take Google (rimshot!).
[225001120070] |And don't you think there is something poetic in the first line?
[225001120080] |I could see some 20th Century American poet like Wallace Stevens writing this:
[225001120090] |We have waited months and months In sweet Titlepage Pending, The site should offer conversations (and why not talk) Passionate and fiery editorial On the latest news, a new model Algonquin Round Table
[225001140010] |Wireless Phone Calls and Speech Production
[225001140020] |There is a new viral video going around involving a “voiceless phone call”.
[225001140030] |Tom Simonite writes on NewScientist.com:
[225001140040] |A neckband that translates thought into speech by picking up nerve signals has been used to demonstrate a "voiceless" phone call for the first time.
[225001140050] |With careful training a person can send nerve signals to their vocal cords without making a sound.
[225001140060] |These signals are picked up by the neckband and relayed wirelessly to a computer that converts them into words spoken by a computerised voice. [clip] The system demonstrated at the TI conference can recognise only a limited set of about 150 words and phrases, says Callahan, who likens this to the early days of speech recognition software.
[225001140070] |At the end of the year Ambient plans to release an improved version, without a vocabulary limit.
[225001140080] |Instead of recognising whole words or phrases, it should identify the individual phonemes that make up complete words.
[225001140090] |I have no clue how this actually works (there’s an HMM in there somewhere, right?), but its implications for models of speech production ought to be significant.
[225001140100] |The folks over at Haskins Lab ought to be interested, I should think.
[225001140110] |(HT Andrew Sullivan)
[225001140120] |Here's the video.
[225001140130] |Cool stuff.
[225001150010] |On The Cognitive Properties of Skin
[225001150020] |After posting on the voiceless phone call story below, I began searching around for more information on how the device actually works.
[225001150030] |Failing to find any relevant patents pending (suspicious, I thought), I began searching for information on Michael Callahan, the wunderkind who appears to be the principal inventor, though many are probably involved.
[225001150040] |After some searching, the most specific information I have yet found on the technology behind the voiceless phone is in this article from the University of Illinois at Urbana-Champaign Engineering department website.
[225001150050] |Note the passage I have emphasized:
[225001150060] |“Once we hit upon the idea of direct input, we were off and running,” explained Thomas Coleman, a project team member.
[225001150070] |The young researchers discovered that information sent from the brain can be accurately measured through the conductive properties of the skin.
[225001150080] |Typically, according to Coleman, these measurements are obtained through rigid metallic electrodes which neither respond to natural movements of the body nor to increasing skin moisture.
[225001150090] |They often become very uncomfortable under prolonged use.
[225001150100] |"Our system uses proprietary technology to gather neurological information through encapsulated conductive gel pads, shielding the embedded electrode from the skin,” Coleman said.
[225001150110] |“‘The Audeo’ device we developed applies gentle pressure over the vocal cords, while the form-fitting band automatically adjusts in diameter, accommodating head and neck movements to maintain efficient contact.”
[225001150120] |From there, team members created a computer program, which reads the intercepted neurological signals, and communicates a ‘response,’ both on the screen and as an audio signal.
[225001150130] |Initial work centered on determining the differences between a ‘yes’ and a ‘no’ response, which could be recognized by the computer.
[225001150140] |The software has since been enhanced to effectively ‘learn’ and adapt to the user’s neurological signals without the need of extensive training.
[225001150150] |The equipment analyzes the user during a one-time calibration process and generates a personalized user identity. (my emphasis; quote marks had to be manually inserted to replace funny characters, but I tried to represent the original faithfully)
[225001150160] |This is how far removed from serious neuroscience I am.
[225001150170] |I had no clue.
[225001150180] |I realized some information could be gathered from the skin, like Galvanic skin response, but I must say I’m shocked to learn that phonemic information regarding unarticulated utterances can be retrieved from the skin around a person’s neck.
[225001150190] |Clearly, there is more to this story.
[225001150200] |I’ll keep digging.
[225001160010] |The Ling-O-Sphere Revisited
[225001160020] |In December I posted about an idea regarding my desire to see a linguistics blog aggregator that "automatically checks a given set of linguistics websites, then updates a topic cloud which clusters posts according to relevance for a particular topic" (see my full post and relevant comments here).
[225001160030] |I see now that William Cohen at his Cranial Darwinism blog has recently posted two new academic papers on the automatic discovery of blog topics (aka, latent topic modeling) as well as automatic methods of modeling blog influence.
[225001160040] |Daume has posted on related topics in the past as well (see here for one relevant post).
[225001160050] |Having skimmed the first paper a bit, I see lots of scary words and phrases like "Latent Dirichlet Allocation" and "probabilistic framework"; I'm neck deep in finishing my dissertation (or failing to finish it; I'll be able to distinguish the two in about 3 weeks), so my interest in struggling through challenging papers is low, but they look well worth the read ... someday ... sigh.
[225001170010] |"According to Google,..."
[225001170020] |Being both a poker player and former writing teacher, I am better acquainted than most with just how stupid the average person is.
[225001170030] |The fear that this day would come has lurked in my mind for some time, but today, I re-discovered the ugly truth that people just don't understand even the most basic tenets of reason, research, skepticism, and critical thinking.
[225001170040] |Through a series of blog links, I happened on to the comment thread for a popular TV/radio talk show host's web page (I refuse to link to it).
[225001170050] |The topic regarded one of the current presidential candidates' alleged ethnicity (clearly false/ridiculous hypotheses peppered the thread).
[225001170060] |I have long since been accustomed to idiocy regarding high profile public figures, so none of this interested me, until I skimmed past one commenter whose attempt at validating the allegation started with "According to google,..." and proceeded to quote some unspecified website.
[225001170070] |This would be a classic case of argument from authority were it not for the fact that the Google search engine itself was being treated as the authority in question.
[225001170080] |If Google returns it, it must be true.
[225001170090] |There is a scary group of idiots out there who, deep in their hearts, believe that Google magically filters their search returns for QUALITY.
[225001170100] |Hence, Google is being treated as a primary source.
[225001170110] |"Burn down the mission, if we're gonna stay alive..."
[225001180010] |Google Linguistics
[225001180020] |Erin made the following well-taken point in a comment to this earlier post:
[225001180030] |This appeal to the authority of Google is troublesome in linguistics, since we often refer to Google results for evidence for hypotheses about usage.
[225001180040] |That is documents indexed by Google as a data source, rather than its search results as authoritative figure, of course, but this may not be obvious to the average Joe. :\
[225001180050] |I have used Google repeatedly to find instances of constructions that I could not find using standard corpus linguistics methods with hand compiled corpora like the BNC.
[225001180060] |Typically I’m looking for any instance, just to prove people really do say the thing I’m claiming is possible.
[225001180070] |For example, I needed to find some examples of passivized complements embedded under 60 different barrier verbs following this pattern:
[225001180080] |a. I banned John from being examined by the doctor.
[225001180100] |b. I banned John from getting examined by the doctor.
[225001180110] |Many of the verbs I wanted to search for are low frequency in the BNC (e.g., barricade, derail, hamper, etc.), so the likelihood of finding examples of passivized complements using, say, a Tgrep2 search is low.
[225001180120] |So, I ventured into the scary land of Google Linguistics.
[225001180130] |I used the search queries “verbed * from being” and “verbed * from getting”. Within a short time, I had multiple examples for most of the verbs I was looking for.
[225001180140] |I can’t imagine performing this task more efficiently with any other tool.
[225001180150] |Google really worked well under those circumstances.
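For the curious, the query pattern can be generated and checked mechanically. What follows is a hedged sketch only, not what I actually ran: the `PAST_FORMS` table is hand-written for four sample verbs, and `build_queries` and `find_candidates` are hypothetical helpers. The regular expression is a loose stand-in for the search engine's wildcard, applied to text you have already retrieved.

```python
import re

# Hand-written past-tense forms for a few sample verbs (an illustrative subset).
PAST_FORMS = {
    "ban": "banned",
    "barricade": "barricaded",
    "derail": "derailed",
    "hamper": "hampered",
}


def build_queries(past_forms):
    """Build quoted wildcard queries of the form "verbed * from being/getting"."""
    return [
        f'"{past} * from {aux}"'
        for past in past_forms.values()
        for aux in ("being", "getting")
    ]


def find_candidates(text, past):
    """Find 'verbed X from being/getting Y' strings, allowing 1 to 4 words for X."""
    pattern = re.compile(
        rf"\b{re.escape(past)}\s+(?:\w+\s+){{1,4}}from\s+(?:being|getting)\s+\w+",
        re.IGNORECASE,
    )
    return [m.group(0) for m in pattern.finditer(text)]


queries = build_queries(PAST_FORMS)
print(queries[0])
print(find_candidates("I banned John from being examined by the doctor.", "banned"))
```

Every candidate still has to be read by a human; the pattern only narrows down where to look.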
[225001180160] |Let me note that I have not used Google hit counts or page counts to derive any statistics regarding frequency of occurrence, though.
[225001180170] |When I do this sort of thing, I’m careful to use my common sense to decide if a return is from a native speaker or not, and often what I do is skim a page to see if there are any obvious ESL errors.
[225001180180] |Also, I use my own intuition regarding the acceptability of a usage (by pure coincidence, Peter Ludlow from U. Toronto will be here in Buffalo this week giving a talk on the role of linguistic intuitions).
[225001180190] |One of the more thorough discussions of the use of search engines in linguistics research is Adam Kilgarriff’s “Googleology is bad science”, a squib from Computational Linguistics (2007, v33, 1).
[225001180200] |He writes that the web is attractive to linguists because it is “enormous, free, immediately available, and largely linguistic”.
[225001180210] |But, he points out four major flaws:
[225001180220] |1. search engines do not lemmatise or part-of-speech tag; 2. search syntax is limited; 3. there are constraints on numbers of queries and numbers of hits per query; 4. search hits are for pages, not for instances.
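The fourth flaw is easy to see with a toy example. In this sketch the three "pages" are invented strings standing in for indexed documents, and the pattern is my own illustration; the point is only that a count of matching pages and a count of matching instances can diverge.

```python
import re

# Hypothetical "pages": toy strings standing in for indexed web documents.
pages = [
    "They banned him from being seen. Later they banned her from being heard.",
    "Nobody banned us from getting in.",
    "An unrelated page about corpora.",
]

pattern = re.compile(r"\bbanned\s+\w+\s+from\s+(?:being|getting)\b")

# What a search engine reports: how many pages contain at least one match.
page_hits = sum(1 for page in pages if pattern.search(page))

# What a frequency claim needs: how many matches occur in total.
instance_hits = sum(len(pattern.findall(page)) for page in pages)

print(page_hits, instance_hits)
```

Here two pages match but the construction occurs three times, so a page count would understate its frequency.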
[225001180230] |Kilgarriff offers this alternative: “work like the search engines, downloading and indexing substantial proportions of the web, but to do so transparently, giving reliable figures, and supporting language researchers’ queries”
[225001180240] |The squib goes on to detail how we might go about doing that in a principled way.
[225001180250] |It’s well worth the read.
[225001190010] |Speaking English
[225001190020] |Steven Levitt, Freako-economist, posted this tempting morsel recently:
[225001190030] |I got an email the other day from a blog reader who tells me that there are now more non-native English speakers than native English speakers.
[225001190040] |Having silly expectations of writers, I foolishly assumed Levitt would tell us all WHERE this fact held true.
[225001190050] |If he is referring to the U.S., then it's quite a remarkable claim.
[225001190060] |China, not so much.
[225001190070] |He seems to be claiming that some change has occurred where a once predominately English speaking country is no longer so.
[225001190080] |Unfortunately, his post never answers this; rather, he is just looking for a cute way to transition from a story about Malaysian baby names to a modestly humorous email about Jello.
[225001190090] |It's a blogger's prerogative to tease readers into reading on, so no harm done.
[225001190100] |But I can't help wondering just what he was referring to in his introductory sentence.
[225001190110] |Has Malaysia ever been predominantly English speaking?
[225001190120] |As far as I know, no.
[225001190130] |The current Ethnologue report says this: "National or official language: Malay.
[225001190140] |Also includes Burmese, Chinese Sign Language, Eastern Panjabi (43,000), Malayalam (37,000), Sylheti, Telugu (30,000)."
[225001190150] |No English.
[225001190160] |So, can any of you, dear readers, come up with a once predominately English speaking country that is no longer so?
[225001190170] |A nice little challenge.