The Enron Corpus is a database of over 600,000 emails generated by 158 employees[1] of the Enron Corporation in the years leading up to the company's collapse in December 2001. The Federal Energy Regulatory Commission (FERC) obtained the corpus from Enron's email servers during its investigation of the collapse.[2] Andrew McCallum, a computer scientist at the University of Massachusetts Amherst, later purchased a copy of the email database for $10,000.[3] He released this copy to researchers, providing a trove of data that has been used for studies on social networking and computer-mediated communication.
[1] Klimt, Bryan; Yang, Yiming (2004). "The Enron Corpus: A New Dataset for Email Classification Research". pp. 217–226. CiteSeerX 10.1.1.61.1645.
[2] "The Enron Email Corpus". Archived 2011-03-08 at the Wayback Machine. Retrieved March 5, 2011.
[3] Markoff, John. "Armies of Expensive Lawyers, Replaced by Cheaper Software". New York Times, March 5, 2011, p. A1.