Sax: Symbolic Aggregate approXimation — SAX is the first symbolic representation for time series that allows for dimensionality reduction and indexing with a lower-bounding distance measure. In classic data mining tasks such as clustering, classification, index, etc., SAX is as good as well-known representations such as Discrete Wavelet Transform (DWT) and Discrete Fourier Transform (DFT), while requiring less storage space. In addition, the representation allows researchers to avail of the wealth of data structures and algorithms in bioinformatics or text mining, and also provides solutions to many challenges associated with current data mining tasks. One example is motif discovery, a problem which we recently defined for time series data. There is great potential for extending and applying the discrete representation on a wide class of data mining tasks. Source code has “non-commercial” license. (via rdamodharan on Delicious)
Open Source OSCON (RedMonk) — The business of selling open source software, remember, is dwarfed by the business of using open source software to produce and sell other services. And yet historically, most of the focus on open source software has accrued to those who sold it. Today, attention and traction is shifting to those who are not in the business of selling software, but rather share their assets via a variety of open source mechanisms. (via Simon Phipps)