<!--{{{-->
<link rel='alternate' type='application/rss+xml' title='RSS' href='index.xml'/>
<!--}}}-->
Background: #fff
Foreground: #000
PrimaryPale: #8cf
PrimaryLight: #18f
PrimaryMid: #04b
PrimaryDark: #014
SecondaryPale: #ffc
SecondaryLight: #fe8
SecondaryMid: #db4
SecondaryDark: #841
TertiaryPale: #eee
TertiaryLight: #ccc
TertiaryMid: #999
TertiaryDark: #666
Error: #f88
/*{{{*/
body {background:[[ColorPalette::Background]]; color:[[ColorPalette::Foreground]];}

a {color:[[ColorPalette::PrimaryMid]];}
a:hover {background-color:[[ColorPalette::PrimaryMid]]; color:[[ColorPalette::Background]];}
a img {border:0;}

h1,h2,h3,h4,h5,h6 {color:[[ColorPalette::SecondaryDark]]; background:transparent;}
h1 {border-bottom:2px solid [[ColorPalette::TertiaryLight]];}
h2,h3 {border-bottom:1px solid [[ColorPalette::TertiaryLight]];}

.button {color:[[ColorPalette::PrimaryDark]]; border:1px solid [[ColorPalette::Background]];}
.button:hover {color:[[ColorPalette::PrimaryDark]]; background:[[ColorPalette::SecondaryLight]]; border-color:[[ColorPalette::SecondaryMid]];}
.button:active {color:[[ColorPalette::Background]]; background:[[ColorPalette::SecondaryMid]]; border:1px solid [[ColorPalette::SecondaryDark]];}

.header {background:[[ColorPalette::PrimaryMid]];}
.headerShadow {color:[[ColorPalette::Foreground]];}
.headerShadow a {font-weight:normal; color:[[ColorPalette::Foreground]];}
.headerForeground {color:[[ColorPalette::Background]];}
.headerForeground a {font-weight:normal; color:[[ColorPalette::PrimaryPale]];}

.tabSelected{color:[[ColorPalette::PrimaryDark]];
	background:[[ColorPalette::TertiaryPale]];
	border-left:1px solid [[ColorPalette::TertiaryLight]];
	border-top:1px solid [[ColorPalette::TertiaryLight]];
	border-right:1px solid [[ColorPalette::TertiaryLight]];
}
.tabUnselected {color:[[ColorPalette::Background]]; background:[[ColorPalette::TertiaryMid]];}
.tabContents {color:[[ColorPalette::PrimaryDark]]; background:[[ColorPalette::TertiaryPale]]; border:1px solid [[ColorPalette::TertiaryLight]];}
.tabContents .button {border:0;}

#sidebar {}
#sidebarOptions input {border:1px solid [[ColorPalette::PrimaryMid]];}
#sidebarOptions .sliderPanel {background:[[ColorPalette::PrimaryPale]];}
#sidebarOptions .sliderPanel a {border:none;color:[[ColorPalette::PrimaryMid]];}
#sidebarOptions .sliderPanel a:hover {color:[[ColorPalette::Background]]; background:[[ColorPalette::PrimaryMid]];}
#sidebarOptions .sliderPanel a:active {color:[[ColorPalette::PrimaryMid]]; background:[[ColorPalette::Background]];}

.wizard {background:[[ColorPalette::PrimaryPale]]; border:1px solid [[ColorPalette::PrimaryMid]];}
.wizard h1 {color:[[ColorPalette::PrimaryDark]]; border:none;}
.wizard h2 {color:[[ColorPalette::Foreground]]; border:none;}
.wizardStep {background:[[ColorPalette::Background]]; color:[[ColorPalette::Foreground]];
	border:1px solid [[ColorPalette::PrimaryMid]];}
.wizardStep.wizardStepDone {background:[[ColorPalette::TertiaryLight]];}
.wizardFooter {background:[[ColorPalette::PrimaryPale]];}
.wizardFooter .status {background:[[ColorPalette::PrimaryDark]]; color:[[ColorPalette::Background]];}
.wizard .button {color:[[ColorPalette::Foreground]]; background:[[ColorPalette::SecondaryLight]]; border: 1px solid;
	border-color:[[ColorPalette::SecondaryPale]] [[ColorPalette::SecondaryDark]] [[ColorPalette::SecondaryDark]] [[ColorPalette::SecondaryPale]];}
.wizard .button:hover {color:[[ColorPalette::Foreground]]; background:[[ColorPalette::Background]];}
.wizard .button:active {color:[[ColorPalette::Background]]; background:[[ColorPalette::Foreground]]; border: 1px solid;
	border-color:[[ColorPalette::PrimaryDark]] [[ColorPalette::PrimaryPale]] [[ColorPalette::PrimaryPale]] [[ColorPalette::PrimaryDark]];}

#messageArea {border:1px solid [[ColorPalette::SecondaryMid]]; background:[[ColorPalette::SecondaryLight]]; color:[[ColorPalette::Foreground]];}
#messageArea .button {color:[[ColorPalette::PrimaryMid]]; background:[[ColorPalette::SecondaryPale]]; border:none;}

.popupTiddler {background:[[ColorPalette::TertiaryPale]]; border:2px solid [[ColorPalette::TertiaryMid]];}

.popup {background:[[ColorPalette::TertiaryPale]]; color:[[ColorPalette::TertiaryDark]]; border-left:1px solid [[ColorPalette::TertiaryMid]]; border-top:1px solid [[ColorPalette::TertiaryMid]]; border-right:2px solid [[ColorPalette::TertiaryDark]]; border-bottom:2px solid [[ColorPalette::TertiaryDark]];}
.popup hr {color:[[ColorPalette::PrimaryDark]]; background:[[ColorPalette::PrimaryDark]]; border-bottom:1px;}
.popup li.disabled {color:[[ColorPalette::TertiaryMid]];}
.popup li a, .popup li a:visited {color:[[ColorPalette::Foreground]]; border: none;}
.popup li a:hover {background:[[ColorPalette::SecondaryLight]]; color:[[ColorPalette::Foreground]]; border: none;}
.popup li a:active {background:[[ColorPalette::SecondaryPale]]; color:[[ColorPalette::Foreground]]; border: none;}
.popupHighlight {background:[[ColorPalette::Background]]; color:[[ColorPalette::Foreground]];}
.listBreak div {border-bottom:1px solid [[ColorPalette::TertiaryDark]];}

.tiddler .defaultCommand {font-weight:bold;}

.shadow .title {color:[[ColorPalette::TertiaryDark]];}

.title {color:[[ColorPalette::SecondaryDark]];}
.subtitle {color:[[ColorPalette::TertiaryDark]];}

.toolbar {color:[[ColorPalette::PrimaryMid]];}
.toolbar a {color:[[ColorPalette::TertiaryLight]];}
.selected .toolbar a {color:[[ColorPalette::TertiaryMid]];}
.selected .toolbar a:hover {color:[[ColorPalette::Foreground]];}

.tagging, .tagged {border:1px solid [[ColorPalette::TertiaryPale]]; background-color:[[ColorPalette::TertiaryPale]];}
.selected .tagging, .selected .tagged {background-color:[[ColorPalette::TertiaryLight]]; border:1px solid [[ColorPalette::TertiaryMid]];}
.tagging .listTitle, .tagged .listTitle {color:[[ColorPalette::PrimaryDark]];}
.tagging .button, .tagged .button {border:none;}

.footer {color:[[ColorPalette::TertiaryLight]];}
.selected .footer {color:[[ColorPalette::TertiaryMid]];}

.sparkline {background:[[ColorPalette::PrimaryPale]]; border:0;}
.sparktick {background:[[ColorPalette::PrimaryDark]];}

.error, .errorButton {color:[[ColorPalette::Foreground]]; background:[[ColorPalette::Error]];}
.warning {color:[[ColorPalette::Foreground]]; background:[[ColorPalette::SecondaryPale]];}
.lowlight {background:[[ColorPalette::TertiaryLight]];}

.zoomer {background:none; color:[[ColorPalette::TertiaryMid]]; border:3px solid [[ColorPalette::TertiaryMid]];}

.imageLink, #displayArea .imageLink {background:transparent;}

.annotation {background:[[ColorPalette::SecondaryLight]]; color:[[ColorPalette::Foreground]]; border:2px solid [[ColorPalette::SecondaryMid]];}

.viewer .listTitle {list-style-type:none; margin-left:-2em;}
.viewer .button {border:1px solid [[ColorPalette::SecondaryMid]];}
.viewer blockquote {border-left:3px solid [[ColorPalette::TertiaryDark]];}

.viewer table, table.twtable {border:2px solid [[ColorPalette::TertiaryDark]];}
.viewer th, .viewer thead td, .twtable th, .twtable thead td {background:[[ColorPalette::SecondaryMid]]; border:1px solid [[ColorPalette::TertiaryDark]]; color:[[ColorPalette::Background]];}
.viewer td, .viewer tr, .twtable td, .twtable tr {border:1px solid [[ColorPalette::TertiaryDark]];}

.viewer pre {border:1px solid [[ColorPalette::SecondaryLight]]; background:[[ColorPalette::SecondaryPale]];}
.viewer code {color:[[ColorPalette::SecondaryDark]];}
.viewer hr {border:0; border-top:dashed 1px [[ColorPalette::TertiaryDark]]; color:[[ColorPalette::TertiaryDark]];}

.highlight, .marked {background:[[ColorPalette::SecondaryLight]];}

.editor input {border:1px solid [[ColorPalette::PrimaryMid]];}
.editor textarea {border:1px solid [[ColorPalette::PrimaryMid]]; width:100%;}
.editorFooter {color:[[ColorPalette::TertiaryMid]];}

#backstageArea {background:[[ColorPalette::Foreground]]; color:[[ColorPalette::TertiaryMid]];}
#backstageArea a {background:[[ColorPalette::Foreground]]; color:[[ColorPalette::Background]]; border:none;}
#backstageArea a:hover {background:[[ColorPalette::SecondaryLight]]; color:[[ColorPalette::Foreground]]; }
#backstageArea a.backstageSelTab {background:[[ColorPalette::Background]]; color:[[ColorPalette::Foreground]];}
#backstageButton a {background:none; color:[[ColorPalette::Background]]; border:none;}
#backstageButton a:hover {background:[[ColorPalette::Foreground]]; color:[[ColorPalette::Background]]; border:none;}
#backstagePanel {background:[[ColorPalette::Background]]; border-color: [[ColorPalette::Background]] [[ColorPalette::TertiaryDark]] [[ColorPalette::TertiaryDark]] [[ColorPalette::TertiaryDark]];}
.backstagePanelFooter .button {border:none; color:[[ColorPalette::Background]];}
.backstagePanelFooter .button:hover {color:[[ColorPalette::Foreground]];}
#backstageCloak {background:[[ColorPalette::Foreground]]; opacity:0.6; filter:'alpha(opacity:60)';}
/*}}}*/
/*{{{*/
* html .tiddler {height:1%;}

body {font-size:.75em; font-family:arial,helvetica; margin:0; padding:0;}

h1,h2,h3,h4,h5,h6 {font-weight:bold; text-decoration:none;}
h1,h2,h3 {padding-bottom:1px; margin-top:1.2em;margin-bottom:0.3em;}
h4,h5,h6 {margin-top:1em;}
h1 {font-size:1.35em;}
h2 {font-size:1.25em;}
h3 {font-size:1.1em;}
h4 {font-size:1em;}
h5 {font-size:.9em;}

hr {height:1px;}

a {text-decoration:none;}

dt {font-weight:bold;}

ol {list-style-type:decimal;}
ol ol {list-style-type:lower-alpha;}
ol ol ol {list-style-type:lower-roman;}
ol ol ol ol {list-style-type:decimal;}
ol ol ol ol ol {list-style-type:lower-alpha;}
ol ol ol ol ol ol {list-style-type:lower-roman;}
ol ol ol ol ol ol ol {list-style-type:decimal;}

.txtOptionInput {width:11em;}

#contentWrapper .chkOptionInput {border:0;}

.externalLink {text-decoration:underline;}

.indent {margin-left:3em;}
.outdent {margin-left:3em; text-indent:-3em;}
code.escaped {white-space:nowrap;}

.tiddlyLinkExisting {font-weight:bold;}
.tiddlyLinkNonExisting {font-style:italic;}

/* the 'a' is required for IE, otherwise it renders the whole tiddler in bold */
a.tiddlyLinkNonExisting.shadow {font-weight:bold;}

#mainMenu .tiddlyLinkExisting,
	#mainMenu .tiddlyLinkNonExisting,
	#sidebarTabs .tiddlyLinkNonExisting {font-weight:normal; font-style:normal;}
#sidebarTabs .tiddlyLinkExisting {font-weight:bold; font-style:normal;}

.header {position:relative;}
.header a:hover {background:transparent;}
.headerShadow {position:relative; padding:4.5em 0em 1em 1em; left:-1px; top:-1px;}
.headerForeground {position:absolute; padding:4.5em 0em 1em 1em; left:0px; top:0px;}

.siteTitle {font-size:3em;}
.siteSubtitle {font-size:1.2em;}

#mainMenu {position:absolute; left:0; width:10em; text-align:right; line-height:1.6em; padding:1.5em 0.5em 0.5em 0.5em; font-size:1.1em;}

#sidebar {position:absolute; right:3px; width:16em; font-size:.9em;}
#sidebarOptions {padding-top:0.3em;}
#sidebarOptions a {margin:0em 0.2em; padding:0.2em 0.3em; display:block;}
#sidebarOptions input {margin:0.4em 0.5em;}
#sidebarOptions .sliderPanel {margin-left:1em; padding:0.5em; font-size:.85em;}
#sidebarOptions .sliderPanel a {font-weight:bold; display:inline; padding:0;}
#sidebarOptions .sliderPanel input {margin:0 0 .3em 0;}
#sidebarTabs .tabContents {width:15em; overflow:hidden;}

.wizard {padding:0.1em 1em 0em 2em;}
.wizard h1 {font-size:2em; font-weight:bold; background:none; padding:0em 0em 0em 0em; margin:0.4em 0em 0.2em 0em;}
.wizard h2 {font-size:1.2em; font-weight:bold; background:none; padding:0em 0em 0em 0em; margin:0.4em 0em 0.2em 0em;}
.wizardStep {padding:1em 1em 1em 1em;}
.wizard .button {margin:0.5em 0em 0em 0em; font-size:1.2em;}
.wizardFooter {padding:0.8em 0.4em 0.8em 0em;}
.wizardFooter .status {padding:0em 0.4em 0em 0.4em; margin-left:1em;}
.wizard .button {padding:0.1em 0.2em 0.1em 0.2em;}

#messageArea {position:fixed; top:2em; right:0em; margin:0.5em; padding:0.5em; z-index:2000; _position:absolute;}
.messageToolbar {display:block; text-align:right; padding:0.2em 0.2em 0.2em 0.2em;}
#messageArea a {text-decoration:underline;}

.tiddlerPopupButton {padding:0.2em 0.2em 0.2em 0.2em;}
.popupTiddler {position: absolute; z-index:300; padding:1em 1em 1em 1em; margin:0;}

.popup {position:absolute; z-index:300; font-size:.9em; padding:0; list-style:none; margin:0;}
.popup .popupMessage {padding:0.4em;}
.popup hr {display:block; height:1px; width:auto; padding:0; margin:0.2em 0em;}
.popup li.disabled {padding:0.4em;}
.popup li a {display:block; padding:0.4em; font-weight:normal; cursor:pointer;}
.listBreak {font-size:1px; line-height:1px;}
.listBreak div {margin:2px 0;}

.tabset {padding:1em 0em 0em 0.5em;}
.tab {margin:0em 0em 0em 0.25em; padding:2px;}
.tabContents {padding:0.5em;}
.tabContents ul, .tabContents ol {margin:0; padding:0;}
.txtMainTab .tabContents li {list-style:none;}
.tabContents li.listLink { margin-left:.75em;}

#contentWrapper {display:block;}
#splashScreen {display:none;}

#displayArea {margin:1em 17em 0em 14em;}

.toolbar {text-align:right; font-size:.9em;}

.tiddler {padding:1em 1em 0em 1em;}

.missing .viewer,.missing .title {font-style:italic;}

.title {font-size:1.6em; font-weight:bold;}

.missing .subtitle {display:none;}
.subtitle {font-size:1.1em;}

.tiddler .button {padding:0.2em 0.4em;}

.tagging {margin:0.5em 0.5em 0.5em 0; float:left; display:none;}
.isTag .tagging {display:block;}
.tagged {margin:0.5em; float:right;}
.tagging, .tagged {font-size:0.9em; padding:0.25em;}
.tagging ul, .tagged ul {list-style:none; margin:0.25em; padding:0;}
.tagClear {clear:both;}

.footer {font-size:.9em;}
.footer li {display:inline;}

.annotation {padding:0.5em; margin:0.5em;}

* html .viewer pre {width:99%; padding:0 0 1em 0;}
.viewer {line-height:1.4em; padding-top:0.5em;}
.viewer .button {margin:0em 0.25em; padding:0em 0.25em;}
.viewer blockquote {line-height:1.5em; padding-left:0.8em;margin-left:2.5em;}
.viewer ul, .viewer ol {margin-left:0.5em; padding-left:1.5em;}

.viewer table, table.twtable {border-collapse:collapse; margin:0.8em 1.0em;}
.viewer th, .viewer td, .viewer tr,.viewer caption,.twtable th, .twtable td, .twtable tr,.twtable caption {padding:3px;}
table.listView {font-size:0.85em; margin:0.8em 1.0em;}
table.listView th, table.listView td, table.listView tr {padding:0px 3px 0px 3px;}

.viewer pre {padding:0.5em; margin-left:0.5em; font-size:1.2em; line-height:1.4em; overflow:auto;}
.viewer code {font-size:1.2em; line-height:1.4em;}

.editor {font-size:1.1em;}
.editor input, .editor textarea {display:block; width:100%; font:inherit;}
.editorFooter {padding:0.25em 0em; font-size:.9em;}
.editorFooter .button {padding-top:0px; padding-bottom:0px;}

.fieldsetFix {border:0; padding:0; margin:1px 0px 1px 0px;}

.sparkline {line-height:1em;}
.sparktick {outline:0;}

.zoomer {font-size:1.1em; position:absolute; overflow:hidden;}
.zoomer div {padding:1em;}

* html #backstage {width:99%;}
* html #backstageArea {width:99%;}
#backstageArea {display:none; position:relative; overflow: hidden; z-index:150; padding:0.3em 0.5em 0.3em 0.5em;}
#backstageToolbar {position:relative;}
#backstageArea a {font-weight:bold; margin-left:0.5em; padding:0.3em 0.5em 0.3em 0.5em;}
#backstageButton {display:none; position:absolute; z-index:175; top:0em; right:0em;}
#backstageButton a {padding:0.1em 0.4em 0.1em 0.4em; margin:0.1em 0.1em 0.1em 0.1em;}
#backstage {position:relative; width:100%; z-index:50;}
#backstagePanel {display:none; z-index:100; position:absolute; margin:0em 3em 0em 3em; padding:1em 1em 1em 1em;}
.backstagePanelFooter {padding-top:0.2em; float:right;}
.backstagePanelFooter a {padding:0.2em 0.4em 0.2em 0.4em;}
#backstageCloak {display:none; z-index:20; position:absolute; width:100%; height:100px;}

.whenBackstage {display:none;}
.backstageVisible .whenBackstage {display:block;}
/*}}}*/
/***
StyleSheet for use when a translation requires any css style changes.
This StyleSheet can be used directly by languages such as Chinese, Japanese and Korean which need larger font sizes.
***/
/*{{{*/
body {font-size:0.8em;}
#sidebarOptions {font-size:1.05em;}
#sidebarOptions a {font-style:normal;}
#sidebarOptions .sliderPanel {font-size:0.95em;}
.subtitle {font-size:0.8em;}
.viewer table.listView {font-size:0.95em;}
/*}}}*/
/*{{{*/
@media print {
#mainMenu, #sidebar, #messageArea, .toolbar, #backstageButton, #backstageArea {display: none ! important;}
#displayArea {margin: 1em 1em 0em 1em;}
/* Fixes a feature in Firefox 1.5.0.2 where print preview displays the noscript content */
noscript {display:none;}
}
/*}}}*/
<!--{{{-->
<div class='header' macro='gradient vert [[ColorPalette::PrimaryLight]] [[ColorPalette::PrimaryMid]]'>
<div class='headerShadow'>
<span class='siteTitle' refresh='content' tiddler='SiteTitle'></span>&nbsp;
<span class='siteSubtitle' refresh='content' tiddler='SiteSubtitle'></span>
</div>
<div class='headerForeground'>
<span class='siteTitle' refresh='content' tiddler='SiteTitle'></span>&nbsp;
<span class='siteSubtitle' refresh='content' tiddler='SiteSubtitle'></span>
</div>
</div>
<div id='mainMenu' refresh='content' tiddler='MainMenu'></div>
<div id='sidebar'>
<div id='sidebarOptions' refresh='content' tiddler='SideBarOptions'></div>
<div id='sidebarTabs' refresh='content' force='true' tiddler='SideBarTabs'></div>
</div>
<div id='displayArea'>
<div id='messageArea'></div>
<div id='tiddlerDisplay'></div>
</div>
<!--}}}-->
<!--{{{-->
<div class='toolbar' macro='toolbar closeTiddler closeOthers +editTiddler > fields syncing permalink references jump'></div>
<div class='title' macro='view title'></div>
<div class='subtitle'><span macro='view modifier link'></span>, <span macro='view modified date'></span> (<span macro='message views.wikified.createdPrompt'></span> <span macro='view created date'></span>)</div>
<div class='tagging' macro='tagging'></div>
<div class='tagged' macro='tags'></div>
<div class='viewer' macro='view text wikified'></div>
<div class='tagClear'></div>
<!--}}}-->
<!--{{{-->
<div class='toolbar' macro='toolbar +saveTiddler -cancelTiddler deleteTiddler'></div>
<div class='title' macro='view title'></div>
<div class='editor' macro='edit title'></div>
<div macro='annotations'></div>
<div class='editor' macro='edit text'></div>
<div class='editor' macro='edit tags'></div><div class='editorFooter'><span macro='message views.editor.tagPrompt'></span><span macro='tagChooser'></span></div>
<!--}}}-->
To get started with this blank TiddlyWiki, you'll need to modify the following tiddlers:
* SiteTitle & SiteSubtitle: The title and subtitle of the site, as shown above (after saving, they will also appear in the browser title bar)
* MainMenu: The menu (usually on the left)
* DefaultTiddlers: Contains the names of the tiddlers that you want to appear when the TiddlyWiki is opened
You'll also need to enter your username for signing your edits: <<option txtUserName>>
These InterfaceOptions for customising TiddlyWiki are saved in your browser

Your username for signing your edits. Write it as a WikiWord (eg JoeBloggs)

<<option txtUserName>>
<<option chkSaveBackups>> SaveBackups
<<option chkAutoSave>> AutoSave
<<option chkRegExpSearch>> RegExpSearch
<<option chkCaseSensitiveSearch>> CaseSensitiveSearch
<<option chkAnimate>> EnableAnimations

----
Also see AdvancedOptions
<<importTiddlers>>
http://www.ee.ic.ac.uk/hp/staff/dmb/voicebox/voicebox.html
* formant: Formants are the distinguishing or meaningful frequency components of human speech and of singing. By definition, the information that humans require to distinguish between vowels can be represented purely quantitatively by the frequency content of the vowel sounds.
http://en.wikipedia.org/wiki/Formant

* Finish reading lecture 1 on basic DSP http://www.ee.columbia.edu/~dpwe/e6820/lectures/L01-intro+dsp.pdf

* To read: lecture 4, on pitch and speech perception
** Kinds of speech sound: vowel, glide, fricative, nasal, stop burst

* Diphthongs (gliding vowels) are represented by two symbols, for example English "same" as /seɪm/, where the two vowel symbols are intended to represent approximately the beginning and ending tongue positions.
* Fricatives are consonants produced by forcing air through a narrow channel made by placing two articulators close together. English [s], [z], [ʃ], and [ʒ] are examples of this.
* A nasal consonant (also called nasal stop or nasal continuant) is produced with a lowered velum in the mouth, allowing air to escape freely through the nose.
* A stop, plosive, or occlusive is a consonant sound produced by stopping the airflow in the vocal tract. All languages in the world have stops and most have at least [p], [t], [k], [n], and [m].
* Formant

-> ''extract formant features from spectrogram, need phoneme segmentation first''

* Use SoX to convert a raw PCM file to a WAV file (http://www.ling.upenn.edu/phonetics/sox.html)
** rename the file to have the extension .raw
** sox.exe -r 16000 -s -w file.raw file.wav
-r 16000: sample rate of 16 kHz
-w: word-sized samples (2 bytes, i.e. 16 bits per sample); equivalently -2
-s: signed samples
** convert back to raw
sox foo.wav foo.raw
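
What the raw-to-WAV conversion adds is essentially a 44-byte header in front of the samples. The sketch below is not SoX's implementation; it only illustrates that header for the settings used above (16 kHz, signed 16-bit). It assumes the raw samples are little-endian, and the function name {{{pcmToWav}}} is made up for this note.

```javascript
// Wrap raw signed 16-bit PCM bytes in a minimal 44-byte WAV (RIFF) header.
// Assumption: samples are little-endian, as WAV expects; no byte swapping is done.
function pcmToWav(pcmBytes, sampleRate, numChannels) {
  const bitsPerSample = 16;
  const blockAlign = numChannels * bitsPerSample / 8; // bytes per sample frame
  const byteRate = sampleRate * blockAlign;           // bytes per second
  const out = new Uint8Array(44 + pcmBytes.length);
  const view = new DataView(out.buffer);
  const writeTag = (offset, tag) => {
    for (let i = 0; i < tag.length; i++) {
      view.setUint8(offset + i, tag.charCodeAt(i));
    }
  };
  writeTag(0, "RIFF");
  view.setUint32(4, 36 + pcmBytes.length, true); // size of everything after this field
  writeTag(8, "WAVE");
  writeTag(12, "fmt ");
  view.setUint32(16, 16, true);                  // size of the fmt chunk
  view.setUint16(20, 1, true);                   // format tag 1 = uncompressed PCM
  view.setUint16(22, numChannels, true);
  view.setUint32(24, sampleRate, true);
  view.setUint32(28, byteRate, true);
  view.setUint16(32, blockAlign, true);
  view.setUint16(34, bitsPerSample, true);
  writeTag(36, "data");
  view.setUint32(40, pcmBytes.length, true);     // size of the sample data
  out.set(pcmBytes, 44);                         // samples follow the header
  return out;
}
```

In Node.js the input could come from {{{fs.readFileSync("file.raw")}}} and the result be written with {{{fs.writeFileSync("file.wav", ...)}}}; for mono 16 kHz data the call would be {{{pcmToWav(bytes, 16000, 1)}}}.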
http://www.ee.columbia.edu/~dpwe/e6820/matlab/2008-04-03-ftrclass.diary : for phoneme recognition
http://home.cc.umanitoba.ca/~robh/howto.html 
Ellis lec3
*classification system
** preprocessing / segmentation(STFT/locate vowel): signal -> segment
** feature extraction (formant): segment -> feature vector
** classification
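
A minimal sketch of the three stages above, to make the data flow concrete. It is not taken from the lecture: fixed-length frames stand in for STFT-based vowel location, mean energy and zero-crossing rate stand in for formant features, and a nearest-centroid rule stands in for the classifier. All function names and the centroid values are made up for illustration.

```javascript
// Stage 1 (segmentation): signal -> fixed-length, non-overlapping frames.
function segment(signal, frameLen) {
  const frames = [];
  for (let i = 0; i + frameLen <= signal.length; i += frameLen) {
    frames.push(signal.slice(i, i + frameLen));
  }
  return frames;
}

// Stage 2 (feature extraction): frame -> [mean energy, zero-crossing rate].
// These are stand-in features, not formants.
function features(frame) {
  let energy = 0;
  let crossings = 0;
  for (let i = 0; i < frame.length; i++) {
    energy += frame[i] * frame[i];
    if (i > 0 && (frame[i] >= 0) !== (frame[i - 1] >= 0)) crossings++;
  }
  return [energy / frame.length, crossings / Math.max(1, frame.length - 1)];
}

// Stage 3 (classification): pick the label whose centroid is closest
// in squared Euclidean distance.
function classify(vec, centroids) {
  let best = null;
  let bestDist = Infinity;
  for (const label of Object.keys(centroids)) {
    const c = centroids[label];
    const d = (vec[0] - c[0]) ** 2 + (vec[1] - c[1]) ** 2;
    if (d < bestDist) {
      bestDist = d;
      best = label;
    }
  }
  return best;
}
```

A full run would be {{{segment(signal, 256).map(features).map(v => classify(v, centroids))}}}, where {{{centroids}}} maps each class label to a feature vector learned from labelled frames.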
''stop at slide 24 - Gaussian Mixture model''
* http://www.stanford.edu/class/cs276b/handouts/lecture2.ppt
! Recommendation system
Web resources (contain lots of links):
http://www.paulperry.net/notes/cf.asp
http://jamesthornton.com/cf/
Data:
EachMovie dataset: 73,000 users, 1600 movies, 2.5 million ratings
other data?
Software:
Cofi: http://www.nongnu.org/cofi/
CoFE: http://eecs.oregonstate.edu/iis/CoFE/

Efficient implementations
Clustering
Representation of preferences: non-Euclidean space?
Min-hash, locality-sensitive hashing (LSH)
Social networks?
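
As a reminder of how min-hash works: each user's set of rated items is reduced to a short signature, and the fraction of matching signature positions estimates the Jaccard similarity of the two sets. The sketch below is only illustrative. It assumes item ids are 32-bit integers, and {{{hash32}}} is a made-up integer mixing function, not one from any particular library.

```javascript
// Hypothetical seeded integer hash (xor-shift and multiply mixing on 32 bits).
function hash32(x, seed) {
  let h = (x ^ seed) >>> 0;
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  h = Math.imul(h ^ (h >>> 16), 0x45d9f3b);
  return (h ^ (h >>> 16)) >>> 0;
}

// Signature: for each of numHashes hash functions, keep the minimum hash
// value over all item ids in the set.
function minhashSignature(itemIds, numHashes) {
  const sig = [];
  for (let k = 0; k < numHashes; k++) {
    let min = 0xFFFFFFFF;
    for (const id of itemIds) {
      const h = hash32(id, k * 0x9E3779B1);
      if (h < min) min = h;
    }
    sig.push(min);
  }
  return sig;
}

// The fraction of positions where two signatures agree estimates |A ∩ B| / |A ∪ B|.
function estimateJaccard(sigA, sigB) {
  let same = 0;
  for (let i = 0; i < sigA.length; i++) {
    if (sigA[i] === sigB[i]) same++;
  }
  return same / sigA.length;
}
```

More hash functions give a tighter estimate at the cost of longer signatures; LSH would then group users by bands of the signature so that only likely-similar pairs are compared.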

!WordNet
http://www.cogsci.princeton.edu/~wn/
Java API available (already installed)
Useful tool for semantic analysis
Represents the English lexicon as a graph
Each node is a “synset” – a set of words with similar meanings
Nodes are connected by various relations such as hypernym/hyponym (X is a kind of Y), troponym, pertainym, etc.
Could use for query reformulation, document classification, …
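
The graph view described above can be sketched with a toy structure. This does not use the WordNet Java API; the synset ids, words, and links below are illustrative rather than real WordNet data, and each synset is given a single hypernym for simplicity (real synsets may have several).

```javascript
// Toy synset graph: each node is a set of words plus a link to its hypernym.
const synsets = {
  dog: { words: ["dog", "domestic dog"], hypernym: "canine" },
  canine: { words: ["canine", "canid"], hypernym: "mammal" },
  mammal: { words: ["mammal"], hypernym: null }
};

// Follow hypernym links upward and return the ids visited, nearest first.
// Assumes the graph has no cycles.
function hypernymChain(graph, id) {
  const chain = [];
  let cur = graph[id] ? graph[id].hypernym : null;
  while (cur !== null && graph[cur]) {
    chain.push(cur);
    cur = graph[cur].hypernym;
  }
  return chain;
}

// "X is a kind of Y" holds when Y appears somewhere above X.
function isKindOf(graph, x, y) {
  return hypernymChain(graph, x).includes(y);
}
```

For query reformulation, the words of the synsets returned by {{{hypernymChain}}} could be offered as broader search terms.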

!Google APIs
http://www.google.com/apis/
Web service for querying Google from your software
You can use SOAP/WSDL or the custom Java library that they provide (already installed)
Limited to 1,000 queries per day per user, so get started early if you’re going to use this!
Three types of request:
Search: submit query and params, get results
Cache: get Google’s latest copy of a page
Query spell correction
Note: within search requests you can use special commands like link, related, intitle, etc.

!Stanford WebBase
http://www-diglib.stanford.edu/~testbed/doc2/WebBase/ 
They offer various relatively small web crawls (the largest is about 100 million pages), with cached pages and link structure data
Includes specialized crawls such as Stanford and UC-Berkeley
They provide code for accessing their data
More on this next week


* Convert StarChallenge dataset from pcm to wav

* Todo: ''check Ellis site for Matlab command on Spectrogram''

* To read: hu96speech Speech Recognition Using Syllable-Like Units, continue Ellis lec3 at slide on GMM
* find dataset
* read mpeg1 into video audio tracks
* detect silent note in audio tracks
* find features for video news
* speech recognition
* binary download: http://ffdshow.faireal.net/mirror/ffmpeg/
* svn and git: http://ffmpeg.mplayerhq.hu/download.html
* documentation:
http://ffmpeg.mplayerhq.hu/ffmpeg-doc.html

* Usage
** convert to raw avi (becomes very large file): ffmpeg.exe -i compressed.any -vcodec rawvideo uncompressed.avi
** convert to new format (output the same quality): ffmpeg.exe -i compressed.any -sameq newFile
!3 April
* http://www.cnpbagwell.com/audio.html
* http://sox.sourceforge.net/
SoX is a command line utility that can convert various formats of computer audio files into other formats. It can also apply various effects to these sound files during the conversion. As an added bonus, SoX can play and record audio files on several unix-style platforms.
* Sound Processing Kit
http://www.music.helsinki.fi/research/spkit/documentation/SPKit.html
Sound Processing Kit is an object-oriented class library for audio signal processing. Sound Processing Kit (abbreviated as SPKit) includes classes for various signal processing tasks, but most importantly, it introduces a way of implementing sound processing algorithms in a simple object-oriented manner. Sound Processing Kit is implemented in C++.

SPKit is designed to be portable. The current version requires a bare-bones C++ 2.0 compatible compiler (templates and exceptions are not needed). ANSI C standard libraries are required. The source code should compile with little or no modification on most UNIX compatible platforms.

There are three versions of the SPKit class library: a "generic" version, a libaudiofile version, and an SGI version. The generic version is the most portable one and does not require extra libraries for audio file I/O. On the other hand, the number of supported sound file formats is very limited. The libaudiofile version requires the Audio File Library system for file I/O. Audio File Library is included with many Linux distributions and has been ported to other operating systems as well. The home page of Audio File Library is at http://www.68k.org/~michael/audiofile/. The SGI version is written specifically for the SGI/IRIX system. For more information on this and other updates, see the release notes.

* Libsndfile
http://www.mega-nerd.com/libsndfile/
Libsndfile is a C library for reading and writing files containing sampled sound (such as MS Windows WAV and the Apple/SGI AIFF format) through one standard library interface. It is released in source code format under the GNU Lesser General Public License.

The library was written to compile and run on a Linux system but should compile and run on just about any Unix (including MacOSX). It can also be compiled and run on Win32 systems using the Microsoft compiler and MacOS (OS9 and earlier) using the Metrowerks compiler. There are directions for compiling libsndfile on these platforms in the Win32 and MacOS directories of the source code distribution. 

* Open Source Audio Library Project
http://osalp.sourceforge.net/

!
* CLAM 	http://clam.iua.upf.edu/download.html
    * Read and write multichannel audio files in virtually any format (wav, aiff, mp3, ogg...)
    * Implement filters in the frequency domain
    * Implement applications based on the FFT
    * Compute statistical descriptors on the sound and its spectral features
    * Passivate/activate objects into XML
    * Use graphical plots to debug signal data objects
    * Implement applications based on the LPC model
    * Play audio in any platform
    * Use frame-based signal constructs and dump them into XML or SDIF
    * Connect simple Processing objects dynamically to form Networks
    * Use the event-based Control mechanism to control run-time behaviour of Processing objects
    * Create static compositions of Processing objects
    * Convert MIDI files into XML
    * Implement a basic MIDI-controlled synthesizer
    * Input and output MIDI messages into a system
    * Implement spectral analysis based applications
    * Implement a complex synthesizer based on the SMS model
    * Implement analysis/synthesis applications including transformations and graphical user interface
    * Create real-time spectral-domain analysis/synthesis applications, controlled from a GUI
    * Define Processing Networks graphically and dynamically so as to build rapid prototypes
    * ...

There are a few frameworks that overlap with CLAM's goals. If you are only interested in audio analysis and feature extraction you should check:

    * Marsyas
    * Maaate

If you are interested only in audio synthesis you should check:

    * STK
    * Open Sound World
    * Aura

And if you are looking for a framework with both analysis and synthesis capabilities check:

    * CSL
CSL is a simple yet powerful library of sound synthesis and signal processing functions. It is packaged as an object-oriented class hierarchy for standard DSP and computer music techniques, and is suitable for integration into existing applications, or use as a stand-alone synthesis/processing server. CSL is similar to the JSyn (Burke), CommonLispMusic (Schottstaedt), STK (Cook), and Cmix (Lansky) frameworks in that it is integrated as a library into a general-purpose programming language, rather than being a separate “sound compiler” as in the Music-N family of languages (Pope).
http://fastlabinc.com/CSL/index.html

    * SndObj http://music.nuim.ie//musictec/SndObj/main.html
The Sound Object Library is an object-oriented audio processing library. It provides objects for synthesis and processing of sound that can be used to build applications for computer-generated music. The core code, including soundfile and text input/output, is fully portable across several platforms. Platform-specific code includes realtime audio IO and MIDI input support for Linux (OSS, ALSA and Jack), Windows (MME and ASIO), MacOS X (CoreAudio, but no MIDI at the moment), Silicon Graphics (Irix) machines and any Open Sound System-supported UNIX. Binaries for Windows (compiled with cygwin g++ and MS-Visual C++ 6.0), MacOS X, RedHat Linux (Intel x86) and Irix (6.5) are available. The source code for the core library classes can be compiled under any C++ compiler.

In any case, CLAM presents both conceptual and practical differences with all of them. If you are interested in a thorough presentation of CLAM alternatives and how they compare to our framework please refer to X. Amatriain's PhD thesis.
http://www.sil.org/computing/speechtools/index.htm
*Speech Analyzer
Use this program for recording, transcribing and analyzing sound files. Speech Analyzer does not have the 3 second limitation that WinCECIL has.
*Phonology Assistant
Use this program for managing transcribed Speech Analyzer sound files and/or transcribed data without sound files. Provides extensive phonetic charting and phonological querying capability.
*IPA Help
Use this program to learn to hear, transcribe and produce the sounds of the International Phonetic Alphabet. 

* Speech synthesis software
http://www.speech.cs.cmu.edu/flite/
http://www.cstr.ed.ac.uk/projects/festival/

*http://www.cs.chalmers.se/~aarne/course-langtech/lectures/lang11.html
* Query by humming: musical information retrieval in an audio database

http://www.utdallas.edu/~loizou/speech/colea.htm

* formant tracking, see files
ftrack.m -> frmnts.m
formants.m
/***
|''Name:''|CryptoFunctionsPlugin|
|''Description:''|Support for cryptographic functions|
***/
//{{{
if(!version.extensions.CryptoFunctionsPlugin) {
version.extensions.CryptoFunctionsPlugin = {installed:true};

//--
//-- Crypto functions and associated conversion routines
//--

// Crypto "namespace"
function Crypto() {}

// Convert a string to an array of big-endian 32-bit words
Crypto.strToBe32s = function(str)
{
	var be = Array();
	var len = Math.floor(str.length/4);
	var i, j;
	for(i=0, j=0; i<len; i++, j+=4) {
		be[i] = ((str.charCodeAt(j)&0xff) << 24)|((str.charCodeAt(j+1)&0xff) << 16)|((str.charCodeAt(j+2)&0xff) << 8)|(str.charCodeAt(j+3)&0xff);
	}
	// Pack any trailing 1-3 characters into the high-order bytes of the last word
	while (j<str.length) {
		be[j>>2] |= (str.charCodeAt(j)&0xff)<<(24-(j*8)%32);
		j++;
	}
	return be;
};

// Convert an array of big-endian 32-bit words to a string
Crypto.be32sToStr = function(be)
{
	var str = "";
	for(var i=0;i<be.length*32;i+=8)
		str += String.fromCharCode((be[i>>5]>>>(24-i%32)) & 0xff);
	return str;
};

// Convert an array of big-endian 32-bit words to a hex string
Crypto.be32sToHex = function(be)
{
	var hex = "0123456789ABCDEF";
	var str = "";
	for(var i=0;i<be.length*4;i++)
		str += hex.charAt((be[i>>2]>>((3-i%4)*8+4))&0xF) + hex.charAt((be[i>>2]>>((3-i%4)*8))&0xF);
	return str;
};

// Return, in hex, the SHA-1 hash of a string
Crypto.hexSha1Str = function(str)
{
	return Crypto.be32sToHex(Crypto.sha1Str(str));
};

// Return the SHA-1 hash of a string
Crypto.sha1Str = function(str)
{
	return Crypto.sha1(Crypto.strToBe32s(str),str.length);
};

// Calculate the SHA-1 hash of an array of blen bytes of big-endian 32-bit words
Crypto.sha1 = function(x,blen)
{
	// Add 32-bit integers, wrapping at 32 bits
	// (declared with var so the helpers stay local instead of leaking into the global scope)
	var add32 = function(a,b)
	{
		var lsw = (a&0xFFFF)+(b&0xFFFF);
		var msw = (a>>16)+(b>>16)+(lsw>>16);
		return (msw<<16)|(lsw&0xFFFF);
	};
	// Add five 32-bit integers, wrapping at 32 bits
	var add32x5 = function(a,b,c,d,e)
	{
		var lsw = (a&0xFFFF)+(b&0xFFFF)+(c&0xFFFF)+(d&0xFFFF)+(e&0xFFFF);
		var msw = (a>>16)+(b>>16)+(c>>16)+(d>>16)+(e>>16)+(lsw>>16);
		return (msw<<16)|(lsw&0xFFFF);
	};
	// Bitwise rotate left a 32-bit integer by 1 bit
	var rol32 = function(n)
	{
		return (n>>>31)|(n<<1);
	};

	var len = blen*8;
	// Append padding so length in bits is 448 mod 512
	x[len>>5] |= 0x80 << (24-len%32);
	// Append length
	x[((len+64>>9)<<4)+15] = len;
	var w = Array(80);

	var k1 = 0x5A827999;
	var k2 = 0x6ED9EBA1;
	var k3 = 0x8F1BBCDC;
	var k4 = 0xCA62C1D6;

	var h0 = 0x67452301;
	var h1 = 0xEFCDAB89;
	var h2 = 0x98BADCFE;
	var h3 = 0x10325476;
	var h4 = 0xC3D2E1F0;

	for(var i=0;i<x.length;i+=16) {
		var j,t;
		var a = h0;
		var b = h1;
		var c = h2;
		var d = h3;
		var e = h4;
		for(j = 0;j<16;j++) {
			w[j] = x[i+j];
			t = add32x5(e,(a>>>27)|(a<<5),d^(b&(c^d)),w[j],k1);
			e=d; d=c; c=(b>>>2)|(b<<30); b=a; a = t;
		}
		for(j=16;j<20;j++) {
			w[j] = rol32(w[j-3]^w[j-8]^w[j-14]^w[j-16]);
			t = add32x5(e,(a>>>27)|(a<<5),d^(b&(c^d)),w[j],k1);
			e=d; d=c; c=(b>>>2)|(b<<30); b=a; a = t;
		}
		for(j=20;j<40;j++) {
			w[j] = rol32(w[j-3]^w[j-8]^w[j-14]^w[j-16]);
			t = add32x5(e,(a>>>27)|(a<<5),b^c^d,w[j],k2);
			e=d; d=c; c=(b>>>2)|(b<<30); b=a; a = t;
		}
		for(j=40;j<60;j++) {
			w[j] = rol32(w[j-3]^w[j-8]^w[j-14]^w[j-16]);
			t = add32x5(e,(a>>>27)|(a<<5),(b&c)|(d&(b|c)),w[j],k3);
			e=d; d=c; c=(b>>>2)|(b<<30); b=a; a = t;
		}
		for(j=60;j<80;j++) {
			w[j] = rol32(w[j-3]^w[j-8]^w[j-14]^w[j-16]);
			t = add32x5(e,(a>>>27)|(a<<5),b^c^d,w[j],k4);
			e=d; d=c; c=(b>>>2)|(b<<30); b=a; a = t;
		}

		h0 = add32(h0,a);
		h1 = add32(h1,b);
		h2 = add32(h2,c);
		h3 = add32(h3,d);
		h4 = add32(h4,e);
	}
	return Array(h0,h1,h2,h3,h4);
};


}
//}}}
/***
|''Name:''|DeprecatedFunctionsPlugin|
|''Description:''|Support for deprecated functions removed from core|
***/
//{{{
if(!version.extensions.DeprecatedFunctionsPlugin) {
version.extensions.DeprecatedFunctionsPlugin = {installed:true};

//--
//-- Deprecated code
//--

// @Deprecated: Use createElementAndWikify and this.termRegExp instead
config.formatterHelpers.charFormatHelper = function(w)
{
	w.subWikify(createTiddlyElement(w.output,this.element),this.terminator);
};

// @Deprecated: Use enclosedTextHelper and this.lookaheadRegExp instead
config.formatterHelpers.monospacedByLineHelper = function(w)
{
	var lookaheadRegExp = new RegExp(this.lookahead,"mg");
	lookaheadRegExp.lastIndex = w.matchStart;
	var lookaheadMatch = lookaheadRegExp.exec(w.source);
	if(lookaheadMatch && lookaheadMatch.index == w.matchStart) {
		var text = lookaheadMatch[1];
		if(config.browser.isIE)
			text = text.replace(/\n/g,"\r");
		createTiddlyElement(w.output,"pre",null,null,text);
		w.nextMatch = lookaheadRegExp.lastIndex;
	}
};

// @Deprecated: Use <br> or <br /> instead of <<br>>
config.macros.br = {};
config.macros.br.handler = function(place)
{
	createTiddlyElement(place,"br");
};

// Find an entry in an array. Returns the array index or null
// @Deprecated: Use indexOf instead
Array.prototype.find = function(item)
{
	var i = this.indexOf(item);
	return i == -1 ? null : i;
};

// Load a tiddler from an HTML DIV. The caller should make sure to later call Tiddler.changed()
// @Deprecated: Use store.getLoader().internalizeTiddler instead
Tiddler.prototype.loadFromDiv = function(divRef,title)
{
	return store.getLoader().internalizeTiddler(store,this,title,divRef);
};

// Format the text for storage in an HTML DIV
// @Deprecated Use store.getSaver().externalizeTiddler instead.
Tiddler.prototype.saveToDiv = function()
{
	return store.getSaver().externalizeTiddler(store,this);
};

// @Deprecated: Use store.allTiddlersAsHtml() instead
function allTiddlersAsHtml()
{
	return store.allTiddlersAsHtml();
}

// @Deprecated: Use refreshPageTemplate instead
function applyPageTemplate(title)
{
	refreshPageTemplate(title);
}

// @Deprecated: Use story.displayTiddlers instead
function displayTiddlers(srcElement,titles,template,unused1,unused2,animate,unused3)
{
	story.displayTiddlers(srcElement,titles,template,animate);
}

// @Deprecated: Use story.displayTiddler instead
function displayTiddler(srcElement,title,template,unused1,unused2,animate,unused3)
{
	story.displayTiddler(srcElement,title,template,animate);
}

// @Deprecated: Use functions on right hand side directly instead
var createTiddlerPopup = Popup.create;
var scrollToTiddlerPopup = Popup.show;
var hideTiddlerPopup = Popup.remove;

// @Deprecated: Use right hand side directly instead
var regexpBackSlashEn = new RegExp("\\\\n","mg");
var regexpBackSlash = new RegExp("\\\\","mg");
var regexpBackSlashEss = new RegExp("\\\\s","mg");
var regexpNewLine = new RegExp("\n","mg");
var regexpCarriageReturn = new RegExp("\r","mg");

}
//}}}
/***
|''Name:''|LegacyStrikeThroughPlugin|
|''Description:''|Support for legacy (pre 2.1) strike through formatting|
|''Version:''|1.0.2|
|''Date:''|Jul 21, 2006|
|''Source:''|http://www.tiddlywiki.com/#LegacyStrikeThroughPlugin|
|''Author:''|MartinBudden (mjbudden (at) gmail (dot) com)|
|''License:''|[[BSD open source license]]|
|''CoreVersion:''|2.1.0|
***/

//{{{
// Ensure that the LegacyStrikeThrough Plugin is only installed once.
if(!version.extensions.LegacyStrikeThroughPlugin) {
version.extensions.LegacyStrikeThroughPlugin = {installed:true};

config.formatters.push(
{
	name: "legacyStrikeByChar",
	match: "==",
	termRegExp: /(==)/mg,
	element: "strike",
	handler: config.formatterHelpers.createElementAndWikify
});

} //# end of "install only once"
//}}}
http://www.underbit.com/products/mad/ MAD: MPEG Audio Decoder
http://libmpeg2.sourceforge.net/ libmpeg2 is a free library for decoding mpeg-2 and mpeg-1 video streams.
http://bmrc.berkeley.edu/frame/research/mpeg/ Berkeley MPEG Tools
http://labrosa.ee.columbia.edu/matlab/mp3read.html

MPEG movies file
http://terpsichore.stsci.edu/~summers/viz/hgast/movies.html

Matlab:
mmreader ?
http://www.mathworks.com/access/helpdesk/help/techdoc/index.html?/access/helpdesk/help/techdoc/matlab_prog/f5-86556.html&http://www.google.com.sg/search?q=matlab+video+read&ie=utf-8&oe=utf-8&aq=t&rls=org.mozilla:en-US:official&client=firefox-a

mmfileinfo: read file structure
http://www.ee.columbia.edu/~dpwe/e6820/matlab/

*  timit61.phset: set of 61 phoneme definitions, categorized into 1=stops 2=fricatives 3=nasals 4=liquids 5=vowels 6=silence/gaps

* learn from DanEllis matlab diary & script
* watch the histogram to build intuition for speech segmentation
* vowel simulation - 2008-01-31-acous.diary

Download: http://sourceforge.net/project/showfiles.php?group_id=22870&package_id=16937, http://sourceforge.net/project/showfiles.php?group_id=22870&package_id=16948

Using Visual C++ and OpenCV: http://opencvlibrary.sourceforge.net/VisualC%2B%2B
Installing OpenCV on Linux: http://opencvlibrary.sourceforge.net/InstallGuide_Linux
http://ucrel.lancs.ac.uk/corpora.html
http://www.speech.cs.cmu.edu/comp.speech/Section1/speechlinks.html

Moby phonetic: http://www.speech.cs.cmu.edu/comp.speech/Section1/Lexical/moby.html

http://liceu.uab.es/~joaquim/language_resources/spoken_res/recursos_corpus.html#phonetic_representation

SAMPA: http://www.phon.ucl.ac.uk/home/sampa/index.html

Phonetic Symbols Advice: http://www.phon.ucl.ac.uk/resource/phonetics/

SCRIBE - Spoken Corpus of British English: http://www.phon.ucl.ac.uk/resource/scribe/

eustace: http://www.cstr.ed.ac.uk/projects/eustace/download.html
http://wwwqbic.almaden.ibm.com/ : On-line collections of images are growing larger and more common, and tools are needed to efficiently manage, organize, and navigate through them. We have developed the QBIC system which lets you make queries of large image databases based on visual image content -- properties such as color percentages, color layout, and textures occurring in the images. Such queries use the visual properties of images, so you can match colors, textures and their positions without describing them in words. Content based queries are often combined with text and keyword predicates to get powerful retrieval methods for image and multimedia databases.

http://www.hermitagemuseum.org/html_En/07/hm7_41_1.html: QBIC color search demo

* Queries on large image and video databases based on:
** example images
** user-constructed sketches and drawings
** selected color and texture pattern
** camera and object motion
** other graphical information

* Data model
** scenes (consisting of objects): object identification method
** video (shots): shot cut detection method. A shot is represented by an r-frame (treated as a still image), or by a synthesized r-frame for a long panning shot

* Queries: feature
** Average color: 3D vector of Munsell color coordinates (e.g. find images similar to a selected color)
** Histogram color: 256-element histogram
** Texture: mathematical texture feature (coarseness, contrast, directionality) (e.g. select texture from a predefined set ..)
** Object shape: area circularity, eccentricity, ... 
** Query by sketch: automatically extracted reduced-resolution "edge map"
** Multiobject query: (e.g. a red round object and a green textured object)
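
A minimal sketch of the "Histogram color" feature above, assuming every pixel has already been quantized to one of 256 color indices. The comparison uses histogram intersection, which is an assumption made for illustration; these notes do not say which distance QBIC itself uses. The function names are hypothetical.

```javascript
// Build a normalized 256-bin histogram from color indices in the range 0..255.
// Assumption: pixels are already quantized to 256 color indices.
function colorHistogram(pixels) {
  const hist = new Array(256).fill(0);
  if (pixels.length === 0) {
    return hist; // avoid dividing by zero for an empty image
  }
  for (const p of pixels) {
    hist[p] += 1;
  }
  return hist.map(count => count / pixels.length);
}

// Histogram intersection: 1 for identical distributions, 0 for disjoint ones.
function histogramIntersection(h1, h2) {
  let sum = 0;
  for (let i = 0; i < h1.length; i++) {
    sum += Math.min(h1[i], h2[i]);
  }
  return sum;
}

const query = colorHistogram([0, 0, 255, 255]);
const sameColors = colorHistogram([255, 0, 255, 0]);
const otherColors = colorHistogram([7, 7, 7, 7]);

console.log(histogramIntersection(query, sameColors));  // 1
console.log(histogramIntersection(query, otherColors)); // 0
```

Because the histogram discards pixel positions, two images with the same colors in a different layout score as identical here; layout-sensitive queries need a separate feature.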


* Guiding principle: let computers do what they do best - quantifiable measurement - and let humans do what they do best - attaching semantic meaning.
Pt. I	Multimedia Processing and Retrieval	 
 	Introduction / HongJiang Zhang	1
 	Ch. 1	Digital Audio / HongJiang Zhang, Hao Jiang	5
 	 	 	Audio Engineering and Psychoacoustics: Matching Signals to the Final Receiver, the Human Auditory System / E. Zwicker, U. T. Zwicker	11
 	 	 	Advances in Speech and Audio Compression / A. Gersho	23
 	 	 	A Tutorial on MPEG/Audio Compression / D. Pan	42
 	 	 	Structured Audio and Effects Processing in the MPEG-4 Multimedia Standard / E. D. Scheirer	57
 	 	 	Fundamental and Technological Limitations of Immersive Audio Systems / C. Kyriakakis	69
 	Ch. 2	Digital Image and Video Compression and Processing / Bing Zeng	81
 	 	 	Image and Video Coding - Emerging Standards and Beyond [et al.]	89
 	 	 	Comparison of International Standards for Lossless Still Image Compression / R. Arps, T. Truong	113
 	 	 	Embedded Image Coding Using Zerotrees of Wavelet Coefficients / J. M. Shapiro	124
 	 	 	The MPEG-4 Video Standard Verification Model / T. Sikora	142
 	 	 	Multimedia Data-Embedding and Watermarking Technologies / M. D. Swanson, M. Kobayashi, A. H. Tewfik	155
 	 	 	Algorithms for Manipulating Compressed Images / B. C. Smith, L. A. Rowe	179
 	 	 	Manipulation and Compositing of MC-DCT Compressed Video / S.-F. Chang, D. G. Messerschmitt	188
 	Ch. 3	Audio Retrieval and Navigation Interfaces / Philippe Aigrain	199
 	 	 	Representation-Based User Interfaces for the Audiovisual Library of Year 2000 [et al.]	205
 	 	 	Query by Humming: Music Information Retrieval in an Audio Database [et al.]	216
 	 	 	Content-Based Classification, Search, and Retrieval of Audio [et al.]	222
 	 	 	Toward Content-Based Audio Indexing and Retrieval and a New Speaker Discrimination Technique / L. Wyse, S. W. Smoliar	232
 	 	 	Open-Vocabulary Speech Indexing for Voice and Video Mail Retrieval [et al.]	237
 	Ch. 4	Content-Based Image Indexing and Retrieval / HongJiang Zhang	247
 	 	 	Query by Image and Video Content: The QBIC System [et al.]	255
 	 	 	Color Indexing / M. J. Swain, D. H. Ballard	265
 	 	 	A Scheme for Visual Feature Based Image Indexing / H. J. Zhang, D. Zhong	278
 	 	 	Interactive Learning with a "Society of Models" / T. P. Minka, R. W. Picard	289
 	 	 	The Bayesian Image Retrieval System, PicHunter: Theory, Implementation, and Psychophysical Experiments [et al.]	295
 	Ch. 5	Content-Based Video Browsing and Retrieval / HongJiang Zhang	313
 	 	 	Automatic Partitioning of Full-Motion Video / H. J. Zhang, A. Kankanhalli, S. W. Smoliar	321
 	 	 	Structured Video Computing [et al.]	340
 	 	 	Video Parsing, Retrieval and Browsing: An Integrated and Content-Based Solution [et al.]	350
 	 	 	Extracting Story Units from Long Programs for Video Browsing and Navigation / M. Yeung, B.-L. Yeo, B. Liu	360
 	 	 	Video Skimming and Characterization through the Combination of Image and Language Understanding Techniques / M. A. Smith, T. Kanade	370
Pt. II	Systems, Networking and Tools	 
 	Introduction / Kevin Jeffay	383
 	Ch. 6	Multimedia Database Systems / Aidong Zhang	387
 	 	 	A Unified Data Model for Representing Multimedia, Timeline, and Simulation Data / J. D. N. Dionisio, A. F. Cardenas	391
 	 	 	Querying Multimedia Presentations Based on Content [et al.]	413
 	 	 	NetView: Integrating Large-Scale Distributed Visual Databases [et al.]	438
 	 	 	The X-tree: An Index Structure for High-Dimensional Data / S. Berchtold, D. A. Keim, H.-P. Kriegel	451
 	Ch. 7	Multimedia Operating Systems / Klara Nahrstedt	463
 	 	 	An Overview of the Rialto Real-Time Architecture [et al.]	467
 	 	 	Resource Kernels: A Resource-Centric Approach to Real-Time and Multimedia Systems [et al.]	476
 	 	 	A Hierarchical CPU Scheduler for Multimedia Operating Systems / P. Goyal, X. Guo, H. M. Vin	491
 	 	 	The Design, Implementation and Evaluation of SMART: A Scheduler for Multimedia Applications / J. Nieh, M. S. Lam	506
 	Ch. 8	Videoconferencing / Kevin Jeffay	521
 	 	 	An Empirical Study of Delay Jitter Management Policies / D. L. Stone, K. Jeffay	525
 	 	 	Media Scaling for Audiovisual Communication with the Heidelberg Transport System [et al.]	538
 	 	 	Retransmission-Based Error Control for Interactive Video Applications over the Internet / I. Rhee	544
 	 	 	What Video Can and Can't Do for Collaboration: A Case Study / E. A. Isaacs, J. C. Tang	554
 	 	 	vic: A Flexible Framework for Packet Video / S. McCanne, V. Jacobson	565
 	Ch. 9	Networking and Media Streaming / Ketan Mayer-Patel	577
 	 	 	The Performance of Two-Dimensional Media Scaling for Internet Videoconferencing / P. Nee, K. Jeffay, G. Danneels	581
 	 	 	Receiver-driven Layered Multicast / S. McCanne, V. Jacobson, M. Vetterli	593
 	 	 	A Survey of Packet Loss Recovery Techniques for Streaming Audio / C. Perkins, O. Hodson, V. Hardman	607
 	 	 	Adaptive FEC-Based Error Control for Internet Telephony / J.-C. Bolot, S. Fosse-Parisis, D. Towsley	616
 	 	 	RSVP: A New Resource ReSerVation Protocol [et al.]	624
 	 	 	Internet Telephony: Architecture and Protocols - An IETF Perspective / H. Schulzrinne, J. Rosenberg	635
 	Ch. 10	Multimedia Storage Servers / Prashant J. Shenoy, Harrick M. Vin	655
 	 	 	Multimedia Storage Servers: A Tutorial [et al.]	661
 	 	 	Random RAIDs with Selective Exploitation of Redundancy for High Performance Video Servers / Y. Birk	671
 	 	 	Disk Scheduling in a Multimedia I/O System / A. L. Narasimha Reddy, J. Wyllie	682
 	 	 	A Statistical Admission Control Algorithm for Multimedia Servers [et al.]	691
 	 	 	A Generalized Interval Caching Policy for Mixed Interactive and Long Video Workloads / A. Dan, D. Sitaram	699
 	 	 	On Optimal Piggyback Merging Policies for Video-On-Demand Systems / C. Aggarwal, J. Wolf, P. S. Yu	707
 	Ch. 11	Multimedia Synchronization / Lawrence A. Rowe	717
 	 	 	A Temporal Reference Framework for Multimedia Synchronization / M. J. J. Perez-Luque, T. D. C. Little	721
 	 	 	Human Perception of Media Synchronization / R. Steinmetz, C. Engler	737
 	 	 	Improved Algorithms for Synchronizing Computer Network Clocks / D. L. Mills	751
 	 	 	Nsync - A Toolkit for Building Interactive Multimedia Presentations [et al.]	761
 	 	 	A Method and Apparatus for Measuring Media Synchronization / B. K. Schmidt, J. D. Northcutt, M. S. Lam	771
 	Ch. 12	Authoring Systems / Dick C. A. Bulterman	777
 	 	 	The Amsterdam Hypermedia Model: Adding Time and Context to the Dexter Model / L. Hardman, D. C. A. Bulterman, G. van Rossum	781
 	 	 	HDM - A Model-Based Approach to Hypertext Application Design / F. Garzotto, P. Paolini, D. Schwabe	794
 	 	 	Automatic Temporal Layout Mechanisms / M. C. Buchanan, P. T. Zellweger	807
 	 	 	GRiNS: GRaphical INterface for Creating and Playing SMIL Documents [et al.]	817
 	 	 	Multiviews Interfaces for Multimedia Authoring Environments / M. Jourdan, C. Roisin, L. Tardif	828
 	 	 	A Multimedia System for Authoring Motion Pictures [et al.]	836


Special Interest Group in Information Retrieval
SIGIR
HIERARCHICAL MULTI-CLASS SELF SIMILARITIES paper
* spectral band

* simple method for segmentation: understand the spectrogram, detect peaks and valleys (''done'')


* 39-element MFCC vectors
* 10ms average speech frame
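
A rough size estimate for the two figures above. Both readings are assumptions, not stated in these notes: the 39 elements are taken as 13 static coefficients plus their first and second time derivatives, and "10ms" is taken as the step between consecutive frames. Window-length edge effects are ignored.

```javascript
// Assumptions (not stated in the notes): 13 static coefficients plus delta and
// delta-delta terms, and one feature vector produced every 10 ms.
const STATIC_COEFFS = 13;
const VECTOR_SIZE = STATIC_COEFFS * 3; // 39 elements per frame
const FRAME_STEP_MS = 10;

// Number of frames and total feature values for an utterance of the given length.
function featureMatrixShape(durationMs) {
  const frames = Math.floor(durationMs / FRAME_STEP_MS);
  return { frames: frames, values: frames * VECTOR_SIZE };
}

console.log(featureMatrixShape(2000)); // { frames: 200, values: 7800 }
```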
/***
|''Name:''|SparklinePlugin|
|''Description:''|Sparklines macro|
***/
//{{{
if(!version.extensions.SparklinePlugin) {
version.extensions.SparklinePlugin = {installed:true};

//--
//-- Sparklines
//--

config.macros.sparkline = {};
config.macros.sparkline.handler = function(place,macroName,params)
{
	var data = [];
	var min = 0;
	var max = 0;
	var v;
	for(var t=0; t<params.length; t++) {
		v = parseInt(params[t],10); // explicit radix so values with a leading zero are not parsed as octal
		if(v < min)
			min = v;
		if(v > max)
			max = v;
		data.push(v);
	}
	if(data.length < 1)
		return;
	var box = createTiddlyElement(place,"span",null,"sparkline",String.fromCharCode(160));
	box.title = data.join(",");
	var w = box.offsetWidth;
	var h = box.offsetHeight;
	box.style.paddingRight = (data.length * 2 - w) + "px";
	box.style.position = "relative";
	for(var d=0; d<data.length; d++) {
		var tick = document.createElement("img");
		tick.border = 0;
		tick.className = "sparktick";
		tick.style.position = "absolute";
		tick.src = "data:image/gif,GIF89a%01%00%01%00%91%FF%00%FF%FF%FF%00%00%00%C0%C0%C0%00%00%00!%F9%04%01%00%00%02%00%2C%00%00%00%00%01%00%01%00%40%02%02T%01%00%3B";
		tick.style.left = d*2 + "px";
		tick.style.width = "2px";
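		// Guard against max == min (all values zero), which would otherwise divide by zero
		v = max > min ? Math.floor(((data[d] - min)/(max-min)) * h) : 0;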
		tick.style.top = (h-v) + "px";
		tick.style.height = v + "px";
		box.appendChild(tick);
	}
};


}
//}}}
/* horizontal main menu */

#displayArea { margin: 1em 15.5em 0em 1em; } /* use the full horizontal width */

#topMenu { background: [[ColorPalette::PrimaryMid]]; color: [[ColorPalette::PrimaryPale]]; padding: 0.2em 0.2em 0.2em 0.5em; border-bottom: 2px solid #000000; }

#topMenu br { display: none; }

#topMenu .button, #topMenu .tiddlyLink, #topMenu a { margin-left: 0.25em; margin-right: 0.25em; padding-left: 0.5em; padding-right: 0.5em; color: [[ColorPalette::PrimaryPale]]; font-size: 1.15em; }

#topMenu .button:hover, #topMenu .tiddlyLink:hover { background: [[ColorPalette::PrimaryDark]]; }

 .firstletter{ float:left; width:0.75em; font-size:400%; font-family:times,arial; line-height:60%; }

.viewer .FOO table tr.oddRow { background-color: #bbbbbb; }
.viewer .FOO table tr.evenRow { background-color: #fff; } 


/*Invisible table*/

.viewer .invisibletable table { 
border-color: white;
 }

.viewer .invisibletable table td { 
font-size: 1em;
font-family: Verdana;
border-color: white;
padding: 10px 20px 10px 0px;
text-align: left;
vertical-align: top;
} 

.viewer .invisibletable table th { 
color: #005566;
background-color: white;
border-color: white;
font-family: Verdana;
font-size: 1.2em;
font-weight: bold;
padding: 10px 20px 10px 0px;
text-align: left;
vertical-align: top;
} 

/* GIFFMEX TWEAKS TO STYLESHEETPRINT (so that nothing but tiddler title and text are printed) */


@media print {#mainMenu {display: none ! important;}}
@media print {#topMenu {display: none ! important;}}
@media print {#sidebar {display: none ! important;}}
@media print {#messageArea {display: none ! important;}} 
@media print {#toolbar {display: none ! important;}}
@media print {.header {display: none ! important;}}
@media print {.tiddler .subtitle {display: none ! important;}}
@media print {.tiddler .toolbar {display: none ! important; }}
@media print {.tiddler .tagging {display: none ! important; }}
@media print {.tiddler .tagged {display: none ! important; }}
@media print {#displayArea {margin: 1em 1em 0em 1em;}}
@media print {.pageBreak {page-break-before: always;}}

a.button{
 border: 0;

} 

/*Color changes*/


#sidebarOptions input {
	border: 1px solid [[ColorPalette::TertiaryPale]];
}

#sidebarOptions .sliderPanel {
	background: [[ColorPalette::TertiaryPale]];
}

#sidebarOptions .sliderPanel a {
	border: none;
	color: [[ColorPalette::PrimaryMid]];
}

#sidebarOptions .sliderPanel a:hover {
	color: [[ColorPalette::Background]];
	background: [[ColorPalette::TertiaryPale]];
}

#sidebarOptions .sliderPanel a:active {
	color: [[ColorPalette::PrimaryMid]];
	background: [[ColorPalette::TertiaryPale]];
}

/*Makes sliders bold*/

.tuduSlider .button{font-weight: bold;
}

/* (2) Adjusts the color for all headlines so they are both readable and match my color schemes. */

h1,h2,h3,h4,h5 {
 color: #000;
 background: [[ColorPalette::TertiaryPale]];
}

.title {
color: [[ColorPalette::PrimaryMid]];
}

/* (2) Makes text verdana. */

body {
/* font-family: verdana;*/
font-size: 9pt;
}

/* (4) Allows for Greek - one way */

   .greek {
      font-family: Palatino Linotype;
      font-style: normal;
      font-size: 150%;
   }

/* (5) Shortens the height of the Header */

.headerShadow {
 padding: 1.5em 0em 1em 1em;
}

.headerForeground {
 padding: 2em 0em 1em 1em;
}

/* (8) Makes ordered and unordered lists double-spaced between items but single-spaced within items. */

/*.viewer li {
   padding-top: 0.5em;
   padding-bottom: 0.5em;

} */

/*Makes block quotes line-less*/

.viewer blockquote {
border-left: 0px;
margin-top:0em;
margin-bottom:0em; 
}

/* Cosmetic fixes that probably should be included in a future TW... */

.viewer .listTitle { list-style-type:none; margin-left:-2em; }
.editorFooter .button { padding-top: 0px; padding-bottom:0px; }

Important stuff. See TagglyTaggingStyles and HorizontalMainMenuStyles

[[Styles TagglyTagging]]
[[Styles HorizontalMainMenu]]

Just colours, fonts, tweaks etc. See MessageTopRight and SideBarWhiteAndGrey

body { 
  background: #eee; }
.headerForeground a { 
  color: #6fc;}
.headerShadow { 
  left: 2px; 
  top: 2px; }
.siteSubtitle { 
  padding-left: 1.5em; }

.shadow .title {
  color: #999; }

.viewer pre { 
  background-color: #f8f8ff; 
  border-color: #ddf }

.tiddler {
  border-top:    1px solid #ccc; 
  border-left:   1px solid #ccc; 
  border-bottom: 3px solid #ccc; 
  border-right:  3px solid #ccc; 
  margin: 0.5em; 
  background:#fff; 
  padding: 0.5em; 
  -moz-border-radius: 1em; }

#messageArea { 
  background-color: #eee; 
  border-color: #8ab; 
  border-width: 4px; 
  border-style: dotted; 
  font-size: 90%; 
  padding: 0.5em; 
  -moz-border-radius: 1em; }

#messageArea .button { text-decoration:none; font-weight:bold; background:transparent; border:0px; }

#messageArea .button:hover {background: #acd; }

.editorFooter .button { 
  padding-top: 0px; 
  padding-bottom:0px; 
  background: #fff;
  color: #000; 
  border-top:    1px solid #ccc; 
  border-left:   1px solid #ccc; 
  border-bottom: 2px solid #ccc; 
  border-right:  2px solid #ccc; 
  margin-left: 3px;
  padding-top: 1px;
  padding-bottom: 1px;
  padding-left: 5px;
  padding-right: 5px; }
  
.editorFooter .button:hover { 
  border-top:    2px solid #ccc; 
  border-left:   2px solid #ccc; 
  border-bottom: 1px solid #ccc; 
  border-right:  1px solid #ccc; 
  margin-left: 3px;
  padding-top: 1px;
  padding-bottom: 1px;
  padding-left: 5px;
  padding-right: 5px; }

.tagged {
  padding: 0.5em;
  background-color: #eee;
  border-top:    1px solid #ccc; 
  border-left:   1px solid #ccc; 
  border-bottom: 3px solid #ccc; 
  border-right:  3px solid #ccc; 
  -moz-border-radius: 1em; }

.selected .tagged {
  padding: 0.5em;
  background-color: #eee;
  border-top:    1px solid #ccc; 
  border-left:   1px solid #ccc; 
  border-bottom: 3px solid #ccc; 
  border-right:  3px solid #ccc; 
  -moz-border-radius: 1em; }

Clint's fix for weird IE behaviour
body {position:static;}
.tagClear{margin-top:1em;clear:both;}
* Audio system
** at1 (search by IPA): currently converts IPA to text; the IPA synthesis and text modules work with ''only English'' -> consider multilingual handling?
** consider doing at3: find repeated segments in a sequence
** improve audio feature extraction for at2
* Video system
** motion vector extraction for vt2: run on Linux + consider using available OpenCV functions
** setup evaluation data
** consider extending current video classification into vt3
** improve video classification model
** improve feature extraction (face, layout, ...)
The objective of the search challenge is to encourage participation from international teams to develop new, interesting and practical search techniques.

!The voice search challenge consists of three subtasks, as outlined below:
   1. Search by IPA
      The query is given in the International Phonetic Alphabet (IPA). The task is to retrieve all segments that contain the query IPA sequence, regardless of the spoken language.
   2. Search by example
      The query is an utterance spoken by different speakers. The task is to retrieve all segments that contain the query word/phrase/sentence, regardless of the spoken language.
   3. Search for recurrent voice segments
      In the voice archive, certain words/phrases/sentences are repeated more than once in the content. The task is to extract all recurrent segments that are at least 15 seconds in length. No query is given in this case. The number of unique recurrent segments for each document is given.
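
A toy sketch of the third subtask, assuming the audio has already been reduced to a sequence of discrete labels (for example one phone label per frame) and that repeats match exactly. Real recordings will not match exactly, so this only illustrates the bookkeeping. The function name is hypothetical, and labels are assumed to contain no spaces.

```javascript
// Return every window of `minLen` consecutive labels that occurs more than once,
// mapped to the start indices of its occurrences.
// Assumption: repeats match exactly and labels contain no spaces.
function findRepeatedWindows(labels, minLen) {
  const positions = new Map();
  for (let start = 0; start + minLen <= labels.length; start++) {
    const key = labels.slice(start, start + minLen).join(" ");
    if (!positions.has(key)) {
      positions.set(key, []);
    }
    positions.get(key).push(start);
  }
  const repeats = new Map();
  for (const [key, starts] of positions) {
    if (starts.length > 1) {
      repeats.set(key, starts);
    }
  }
  return repeats;
}

const labels = ["a", "b", "c", "x", "a", "b", "c", "y"];
const repeats = findRepeatedWindows(labels, 3);
console.log(repeats.get("a b c")); // [ 0, 4 ]
```

With one label every 10 ms, the 15-second minimum would correspond to `minLen = 1500`; that frame step is an assumption. Occurrences may overlap each other, so a real system would still need to merge or filter them.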

!The video search challenge consists of three subtasks, as outlined below:

   1. Search by (Single) Query Image
      The query is provided in the form of a single image. The task is to retrieve all visually similar segments. Note that the similarity is at the perceptual level. That is, the expected results should contain video segments that contain images that look similar to the query image, as opposed to the video content being semantically similar. There are 30 query image types.
   2. Search by Video Shot
      The query is a short video shot (<10sec). The task is to retrieve video shots that are perceptually similar to the query video clip. Note that compared to VT1, there is now additional motion information in the query video shot and the matching criteria should also take into consideration similarity in the motion trajectory. There are 30 query shot types.
   3. Object/Scene Categorization
      A list of object/scene classes will be defined. For each class, a set of images/videos depicting the objects/scenes will be provided. The participants are expected to develop a model of the class by visually learning on the sample images/video. Then, given a new, unseen test set of images/video the task is to categorize the test set into the classes. Note that the set of test queries will necessarily be a very large set in the order of 10K queries. Also, about 10% of these queries will not belong to any of the object/ scene classes, and the desired output result is a "Reject" class. There are 30 object/scene classes.
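
A sketch of the "Reject" output described in subtask 3, assuming the classifier produces one confidence score in [0, 1] per class. The threshold and the function name are hypothetical; the challenge description does not prescribe how rejection is decided.

```javascript
// Pick the best-scoring class, or "Reject" when no score reaches the threshold.
// `scores` maps class labels to confidences in [0, 1]; `threshold` is a
// hypothetical tuning parameter.
function classifyWithReject(scores, threshold) {
  let bestLabel = "Reject";
  let bestScore = -Infinity;
  for (const [label, score] of Object.entries(scores)) {
    if (score > bestScore) {
      bestScore = score;
      bestLabel = label;
    }
  }
  return bestScore >= threshold ? bestLabel : "Reject";
}

console.log(classifyWithReject({ "104": 0.82, "120": 0.10 }, 0.5)); // 104
console.log(classifyWithReject({ "104": 0.30, "120": 0.25 }, 0.5)); // Reject
```

Since about 10% of the test queries belong to no class, the threshold could be tuned on held-out data so that roughly that share is rejected.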



* ftp download data from WING server (check UNIX command)
* distribute to other machine
* Face detection
101. Crowd (>10 people)
107. Person using Computer, both visible
113. Business meeting (> 2 people), mostly seated down, table visible
116. Face closeup, occupying about 3/4 of screen, frontal or side	 (openCV face detector + size measurement)
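
A sketch of the size measurement for class 116, assuming a face detector has already returned a bounding box in pixels. Reading "3/4 of screen" as an area fraction is an assumption, and the names below are hypothetical rather than OpenCV calls.

```javascript
// Fraction of the frame area covered by a detected face bounding box.
function faceAreaFraction(face, frame) {
  return (face.width * face.height) / (frame.width * frame.height);
}

// Assumption: "about 3/4 of screen" is read as an area fraction; minFraction
// is a hypothetical threshold to tune on labelled examples.
function isFaceCloseup(face, frame, minFraction) {
  return faceAreaFraction(face, frame) >= minFraction;
}

const frame = { width: 320, height: 240 };
console.log(isFaceCloseup({ width: 280, height: 210 }, frame, 0.75)); // true
console.log(isFaceCloseup({ width: 80, height: 60 }, frame, 0.75));   // false
```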

* Detect text, rectangular (Zhao Jin)
106. TV chart Overlay, including graphs, text, powerpoint style
105. Electronic chart, e.g. stock charts, airport departure chart
119. PC Webpages, screen of PC visible
	
* Edge detection, color dominant (Thang, Tuan)
110. Badminton court, sports
111. Swimming pool, sports
108. Track and field, sports
102. Building with sky as backdrop, clearly visible
114. Natural scene, e.g. mountain, trees, sea, no people
118. Boat/Ship, over sea, lake


112. Closeup of hand, e.g. using mouse, writing, etc      
117. Traffic Scene, many cars, trucks, road visible
120. Airplane  
104. Flag
109. Company Trademark, including billboard, logo
115. Food on dishes, plates
103. Mobile devices including handphone/PDA
  
100. Not-Applicable, None of the labels
* Use OpenCV, read video
* Figure out how to access motion vector
      202. Talking face with introductory caption (face + text)
      207. Large camera movement, panning left/right, top/down of a scene (motion vector)
      208. Movie ending credit (motion vector, text)
      205. Large camera movement, tracking an object, person, car, etc (motion vector, consistent, cosine ~ 1, large magnitude) 

      201. People entering/exiting door/car
      203. Fingers typing on a keyboard
      204. Inside a moving vehicle, looking outside
      206. Static or minute camera movement, people(s) walking, legs visible
      209. Woman monologue
      210. Sports celebratory hug
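
A sketch of the "consistent, cosine ~ 1, large magnitude" test noted for class 205, assuming the motion vectors have already been extracted as [dx, dy] pairs. The thresholds and function names are hypothetical.

```javascript
// Summarize a set of motion vectors [dx, dy]: mean magnitude, and mean cosine
// between each non-zero vector and the overall (summed) direction.
function motionSummary(vectors) {
  if (vectors.length === 0) {
    return { meanMagnitude: 0, meanCosine: 0 };
  }
  let sumX = 0;
  let sumY = 0;
  let sumMag = 0;
  for (const [dx, dy] of vectors) {
    sumX += dx;
    sumY += dy;
    sumMag += Math.hypot(dx, dy);
  }
  const overallLen = Math.hypot(sumX, sumY);
  let cosSum = 0;
  let counted = 0;
  for (const [dx, dy] of vectors) {
    const mag = Math.hypot(dx, dy);
    if (mag === 0 || overallLen === 0) {
      continue; // direction is undefined for zero-length vectors
    }
    cosSum += (dx * sumX + dy * sumY) / (mag * overallLen);
    counted += 1;
  }
  return {
    meanMagnitude: sumMag / vectors.length,
    meanCosine: counted > 0 ? cosSum / counted : 0
  };
}

// Hypothetical thresholds: both values need tuning on real data.
function isLargeConsistentMotion(vectors, minMagnitude, minCosine) {
  const s = motionSummary(vectors);
  return s.meanMagnitude >= minMagnitude && s.meanCosine >= minCosine;
}

const panning = [[5, 0], [6, 0], [4, 0]];
const scattered = [[5, 0], [-5, 0], [0, 5], [0, -5]];
console.log(isLargeConsistentMotion(panning, 3, 0.9));   // true
console.log(isLargeConsistentMotion(scattered, 3, 0.9)); // false
```

This check alone does not separate class 205 (tracking an object) from class 207 (panning a scene), since both give large, consistent vectors; that distinction needs another cue.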
http://iris.ee.iisc.ernet.in/Resources/docs/videos_in_matlab.pdf
* Query by image regions and spatial layout.
* Feature-based image indexing & spatial query methods
* Strategies:
** Computing queries
** Extraction process
** Special case spatial queries
"Extraction Methods of Voicing Feature for Robust Speech Recognition"
The common motivation of the extraction methods is to detect the quasi-periodic oscillation of the vocal cords.
What to crawl next?
Adverse IR: cloaking, doorway pages, link spamming (see lecture 1)
Distributed crawling strategies (more on this in lecture 5)
* pick up keyframes for query (manually ?)
* matching (match across multiple keyframes)