Regional Variation of Slang
- Author: Ian Atha (@IanAtha)
Regional Variation of Slang via Computational Methods
I discovered the Slang Metric, a numerical formula that predicts whether or not a lexeme (word) is slang. This research was part of my senior thesis for Linguistics at Grinnell College.
This is notable because slang has long resisted a rigorous, scientific definition.
Assuming you have a comprehensive dataset of lexeme usage across various regions, the Slang Metric is simply the coefficient of variation of a lexeme's normalized frequencies across those regions.
In other words:
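We define $c_{l,r}$ as the absolute count of instances of lexeme $l$ in region $r$.
We define $f_{l,r}$ as the normalized count of instances of lexeme $l$ in region $r$.
$\sigma_l$ is the standard deviation of the word's usage frequency $f_{l,r}$ across different regions.
$\mu_l$ is the mean (average) usage frequency of the word across these regions.
The Slang Metric of $l$ is then the ratio $\sigma_l / \mu_l$.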
If , your lexeme is slang!
Beautiful, right?!
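To make the definition concrete, here is a minimal Python sketch of the computation: normalize a lexeme's per-region counts, then divide the standard deviation of those frequencies by their mean. Several details are my assumptions rather than part of the thesis: the function names `normalized_frequencies` and `slang_metric` are hypothetical, normalization is taken to mean dividing by the total token count of each region, the population standard deviation is used, and the toy counts are made up purely for illustration. The sketch stops at the metric itself and applies no cutoff.

```python
from statistics import mean, pstdev


def normalized_frequencies(counts, totals):
    """Return region -> relative frequency of one lexeme.

    counts: region -> absolute count of the lexeme in that region
    totals: region -> total token count of that region
            (assumption: this is the normalization used)
    """
    return {region: counts.get(region, 0) / totals[region] for region in totals}


def slang_metric(counts, totals):
    """Coefficient of variation (std / mean) of the normalized frequencies."""
    freqs = list(normalized_frequencies(counts, totals).values())
    mu = mean(freqs)
    if mu == 0:
        raise ValueError("lexeme never occurs, so the metric is undefined")
    # Assumption: population standard deviation; the sample version
    # (statistics.stdev) would give somewhat larger values.
    sigma = pstdev(freqs)
    return sigma / mu


# Made-up toy data: three regions with equally sized corpora.
totals = {"north": 1000, "south": 1000, "west": 1000}
even_word = {"north": 10, "south": 10, "west": 10}   # used equally everywhere
regional_word = {"north": 30, "south": 0, "west": 0}  # used in one region only

print(slang_metric(even_word, totals))      # 0.0, no regional variation
print(slang_metric(regional_word, totals))  # about 1.414, strongly regional
```

A word used evenly everywhere scores zero, while a word concentrated in a single region scores high, which is the regional variation the metric is meant to capture.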