
Separating tweet from chaff

By The Economist
From The Economist
Published: April 03, 2014

Apr 1st 2014, 9:33 by S.P. | LONDON

WHAT Twitter tells you about the world largely depends on whom you choose to follow. Personal experience of hours wasted on the microblogging service suggests that few of the 15 billion "tweets" posted every month are of any interest at all. But many believe that, taken as a whole, the aggregated musings of 241m people tapping away on their phones might form an interesting data set that can provide real-time information on the state of the economy.

The latest attempt to extrapolate a signal from the noise focuses on the American labour market. Researchers at the University of Michigan have created indexes of job losses, job searches and postings. Counting phrases such as "lost my job" or "help wanted", the researchers think they can gauge what's going on in the labour market weeks before official data is compiled. Anyone who has seen "Trading Places" knows how valuable that can be.
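The phrase counting described above can be illustrated with a minimal sketch. It is not the Michigan researchers' method: the `SIGNALS` table, the `weekly_index` function and the sample tweets are all hypothetical, and only the two phrases "lost my job" and "help wanted" come from the article. It simply tallies, per ISO week, how many tweets contain a phrase from each list.

```python
from collections import Counter
from datetime import date

# Hypothetical phrase lists. Only "lost my job" and "help wanted"
# are taken from the article; a real index would need many more.
SIGNALS = {
    "job_loss": ["lost my job"],
    "job_posting": ["help wanted"],
}

def weekly_index(tweets):
    """Count tweets per ((ISO year, ISO week), signal) that contain a signal phrase."""
    counts = Counter()
    for day, text in tweets:
        lowered = text.lower()  # match phrases case-insensitively
        iso = day.isocalendar()
        week = (iso[0], iso[1])
        for signal, phrases in SIGNALS.items():
            if any(phrase in lowered for phrase in phrases):
                counts[(week, signal)] += 1
    return counts

# Made-up sample tweets, all falling in the same ISO week.
sample = [
    (date(2014, 3, 24), "Just lost my job, not a great Monday"),
    (date(2014, 3, 26), "HELP WANTED: part-time barista"),
    (date(2014, 3, 27), "I lost my job today"),
]
index = weekly_index(sample)
```

Raw counts like these would still have to be scaled by total tweet volume before weeks could be compared, a step this sketch leaves out.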

Does it work? Sort of. The researchers don't claim their new-fangled index can predict unemployment, merely that it foresees the direction in which forecasters are likely to err. And even that happens only haphazardly and after a pretty intense massaging of the data. Plenty of tweets have to be ignored, for example if they comment on unemployment statistics ("looks like plenty of people lost their jobs, the official data suggest") rather than personal circumstances. Nor do phrases that might have been expected to correlate with job losses, such as "sacked" or "let go", make the cut. That suggests only the terms that are known to correlate with joblessness in the period concerned were included. Annoyingly, such correlations are rarely constant. Google Flu Trends, which aggregates web searches for phrases like "flu remedy" to paint a picture of influenza outbreaks, was long held up as a breakthrough by "Big Data" enthusiasts. But then its famed correlations stopped working, as Tim Harford wrote in the Financial Times this weekend.
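The kind of filtering mentioned above, dropping tweets that comment on the statistics rather than report personal circumstances, might look something like the sketch below. The marker list and the function name `is_personal_report` are assumptions made for illustration; the article does not say how the researchers actually separated the two kinds of tweet.

```python
# Assumed markers of commentary about statistics (not from the study).
COMMENTARY_MARKERS = ("official data", "statistics", "jobs report")

def is_personal_report(text: str) -> bool:
    """Return False if the tweet looks like commentary on published figures."""
    lowered = text.lower()
    return not any(marker in lowered for marker in COMMENTARY_MARKERS)

tweets = [
    "I lost my job this morning",
    "Looks like plenty of people lost their jobs, the official data suggest",
]
# Keep only tweets that pass the crude commentary check.
kept = [t for t in tweets if is_personal_report(t)]
```

A keyword filter this crude would misclassify plenty of tweets, which is one reason the "massaging" of the data is so intense.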

More to the point, is the Twitter-mining useful? As the researchers point out, official labour-market data in America is published often and without sampling error. So there is little that a social-media based sample can offer, beyond perhaps greater granularity as to where, specifically, people are losing their jobs, for example. Their aim is to prove that areas which aren't so well covered by official statistics can be usefully tracked through Twitter.

Plenty of hedge funds claim, sometimes cryptically, to use Twitter as a data source. Some merely use it as a newswire, tracking developments from people they trust, for example when a Wall Street Journal reporter tweets that a planned merger between two companies is off. But a few claim that they can gauge the sentiment around a company's shares using proprietary algorithms. This latest research suggests that, for now, mining social media is a useful add-on to the old-fashioned way of doing things.

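Forecasting economic trends from social-media posts

2014-04-03 Web only By: The Economist

What Twitter tells you about the world depends mainly on whom you follow, and in this writer's experience of using Twitter, most "tweets" are of little importance. Yet many believe that, taken in aggregate, the tweets sent from the fingertips of 241m users might form an interesting data set that can provide real-time information on the state of the economy.

The latest attempt focuses on the American labour market. Researchers at the University of Michigan have created indexes of job losses, job searches and job postings; they believe that by counting phrases such as "lost my job" and "help wanted", they can gauge conditions in the labour market weeks before official data are published.

Did they succeed? To a degree. The researchers do not claim that their new index can predict unemployment, merely that it anticipates the direction in which forecasters are likely to err. Even so, they had to put a great deal of effort into processing the data. Many tweets have to be ignored, such as those that comment on unemployment statistics rather than describe the writer's own circumstances; tweets that are not necessarily related to job loss cannot be counted either. In other words, only terms known to be associated with joblessness in the period concerned are counted. Annoyingly, such things rarely stay constant. Google Flu Trends, which uses search-term statistics to paint a picture of influenza outbreaks, had long been hailed by "Big Data" enthusiasts as a major breakthrough, but as Tim Harford pointed out in the Financial Times this week, its correlations have stopped working.

Is Twitter mining useful? As the researchers say, official labour-market statistics in America are published at short intervals and are free of sampling error, so a social-media sample has little to offer. Their aim is to prove that Twitter can usefully track areas that official data do not cover so well.

Many hedge funds claim to use Twitter as a data source. Some merely treat it as a news source, following developments through people they trust; a few, however, claim that they can use proprietary algorithms to infer how the market views a company's shares. This latest research from the University of Michigan suggests that, for now, mining social-media data is best suited to supplementing traditional methods. (Compiled and translated by 黃維德)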
