Main Page
Welcome to the (now defunct) Graph Reading Group Wiki at the School of Computing at the National University of Singapore. This group has now mutated into the Machine Learning Reading Group at NUS, being led by Prof. Lee Wee Sun and Hai Leong Chieu. Please follow the links above to get to the reading group list, managed by Google Groups.
You may also be interested in our sister reading group, the Natural Language Processing Reading Group at NUS.
We're following some readings from similar reading groups at other CS departments; PDF and PS links can be found on those groups' pages. In Fall 2006, Yee Whye Teh had several scientists visit NUS. We read works related to the visitors' talk topics to prepare for their talks (hopefully to be advertised on CHIMETEXT). If there are other topics that you think we should read about as well, please suggest them at the group meeting. Our discussions are held in Meeting Room 1 (MR1) in S16, level 5, on alternate Wednesday afternoons (4:30-6pm; 1 1/2 hrs). Discussion leaders rotate each session.
Join our newer ML @ NUS reading group mailing list.
The rest of this page is kept up for archival reasons; feel free to refer to this list.
Schedule for Semester I, 2006-2007
Readings and visitors have not been fully determined yet; please stay tuned. This semester we are reading and studying works related to our visitor schedule. Note that the visitor schedule below is subject to change (and so are the topics, caveat emptor!) -M
22 Nov 2006 - Dirichlet Diffusion Trees
Note: this session will be held in MR3 (SoC1 05-28)
Discussion Leaders: Zhao Ming, Steve
Visitors: Nati Srebro and Ted Meeds, both from Toronto.
http://www.cs.toronto.edu/~radford/ftp/dft-val7.pdf
The above is a shorter paper; there is also a 25-page tech report:
http://www.cs.toronto.edu/~radford/ftp/dft-paper1.pdf
if you want to know more about the details.
15 Nov 2006 - Gaussian Processes
Note: this session will be held in SR6 (S16 04-33)
Discussion Leaders: Yee Fan, Long, Dave
Visitor: Eric Xing (Matthias Seeger?)
This time we will discuss a paper that is much easier to digest:
http://www.dai.ed.ac.uk/homes/ckiw/postscript/NCRG_97_012.ps.gz
particularly Sections 1-4. I would also encourage reading the other sections and visiting the Gaussian process website:
http://www.gaussianprocess.org/
8 Nov 2006 - Bioinformatics
Discussion Leaders: Edward, Hendra, Novi, Simon
Visitor: Eric Xing
Title of talk: Reasoning in open possible worlds: on A New Class of Nonparametric Bayesian Models for Haplotype Phasing, LD Modeling and Demographic Inference in Open Ancestral Space
We will discuss the paper:
A New Nonparametric Bayesian Model for Genetic Inference in Open Ancestral Space. Sohn and Xing 2006. CMU tech report.
which is built upon two prior works:
Bayesian Multi-Population Haplotype Inference via a Hierarchical Dirichlet Process Mixture. Sohn, Xing, Jordan and Teh. ICML 2006.
Bayesian Haplotype Inference via the Dirichlet Process. Xing, Sharan and Jordan. ICML 2004.
which are in turn based on Dirichlet processes (DPs). Yee Whye will first introduce DPs, then the discussion leaders will discuss the tech report above.
25 Oct 2006 - Clustering in Microarray data
Discussion Leaders: Sylvie, Wu Dan, Hiew
Visitor: Quaid Morris (6 - 15 Nov)
- Wen Zhang, Quaid D Morris, et al. (2004). The functional landscape of mouse gene expression. Journal of Biology 2004, 3:21.
11 Oct 2006 - Matrix Factorization
Discussion Leaders: Hai Leong, Belinda
Visitor: Nati Srebro (29 Oct - 10 Nov)
Either:
- Nathan Srebro and Sam Roweis. Time-Varying Topic Models using Dependent Dirichlet Processes. UTML-TR-2005-003, March 2005.
or:
- Jason Rennie and Nathan Srebro. Fast Maximum Margin Matrix Factorization for Collaborative Prediction. 22nd International Conference on Machine Learning (ICML), August 2005.
Other suggested matrix factorization papers (thanks to Hiew Litt Teen, Lee Yoong Keok):
- Alessio Del Bue and Lourdes Agapito. Non-Rigid Stereo Factorization. IJCV 66(2):193-207, 2006.
- Shape and Motion from Image Streams under Orthography: a Factorization Method. C. Tomasi and T. Kanade. International Journal of Computer Vision, Vol. 9, No. 2, November, 1992, pp. 137-154.
- Computational vision and regularization theory. Tomaso Poggio, Vincent Torre & Christof Koch. Nature 317, 314 - 319 (1985)
- General conditions for predictivity in learning theory. Tomaso Poggio, Ryan Rifkin, Sayan Mukherjee, Partha Niyogi. Nature 428, 419-422 (2004)
27 Sep 2006 - Indian Buffet Processes
Discussion Leaders: Dilan Gorur and Yee Whye.
F. Wood and T. L. Griffiths. Particle filtering for non-parametric Bayesian matrix factorization. In Advances in Neural Information Processing Systems, to appear, 2006.
13 Sep 2006 - Indian Buffet Processes
Discussion Leaders: Min and Yee Whye. Also, graph reading group administration.
Visitor: Dilan Gorur (15 Sep - 2 Oct)
- Griffiths and Ghahramani 2005 Infinite Latent Feature Models and the Indian Buffet Process
- Griffiths, T. L., & Ghahramani, Z. (2005). Infinite latent feature models and the Indian buffet process. Gatsby Computational Neuroscience Unit Technical Report GCNU TR 2005-001 : A slightly longer version for those of you who can't read dense ML conference papers easily (that's me!).
Background material (not for discussion, only for reference):
Schedule for Semester II, 2005-2006
19 Apr 2006 - Bioinformatics applications
Discussion Leaders: Guoliang and Rohit
Our discussion will focus mainly on the following papers:
- Nir Friedman, 2004. Inferring Cellular Networks Using Probabilistic Graphical Models
This paper reviews the progress and application of some probabilistic graphical models in Bioinformatics. Many related papers can be found in the references.
- Friedman, Linial, Nachman and Pe'er, 2000. Using Bayesian Networks to Analyze Expression Data
This paper summarizes some problems and solutions in the application of Bayesian networks to gene expression data.
- Segal, Pe'er, Regev, Koller & Friedman, 2003. Learning Module Networks
This paper proposes an abstract model for domains in which some variables have similar dynamics; such variables can be grouped into one module.
____________ Supplementary Readings ____________
Other related readings for Bioinformatics applications of probabilistic graphical models are:
- Murphy and Mian, 1999. Modelling Gene Expression Data Using Dynamic Bayesian Networks.
Many papers cite this one as possibly the first application of DBNs to gene expression data.
- Richard Durbin, Sean R. Eddy, Anders Krogh, Graeme Mitchison, 1998. Biological Sequence Analysis: Probabilistic Models of Proteins and Nucleic Acids.
This book is a classic reference for Markov process and HMM applications in protein and DNA sequence analysis.
5 Apr 2006 - Vision applications
Discussion Leaders: Ajay and Wudan
Here are the papers for the discussion:
1. M. F. Tappen, B. C. Russell, and W. T. Freeman. Efficient graphical models for processing images. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Washington, DC, 2004.
2. W. T. Freeman, E. C. Pasztor, O. T. Carmichael. Learning Low-Level Vision. International Journal of Computer Vision, 40(1), pp. 25-47, 2000.
3. J. Sivic, B. Russell, A. A. Efros, A. Zisserman, W. T. Freeman. Discovering Objects and their Location in Images. International Conference on Computer Vision (ICCV), Beijing, China, Oct. 2005.
22 Mar 2006 - Loopy Belief Propagation
Discussion Leaders: Wee Sun, Yee Whye
Journal version of Yedidia et al.'s work.
- Yedidia, Freeman and Weiss 2005 Constructing Free-Energy Approximations and Generalized Belief Propagation Algorithms
This paper describes Pearl's belief propagation algorithm and has empirical results.
- Murphy, Weiss and Jordan 1999 Loopy Belief Propagation for Approximate Inference: An Empirical Study
This paper describes factor graphs.
- F. R. Kschischang, B. J. Frey, H. A. Loeliger 2001 Factor graphs and the sum-product algorithm, IEEE Transactions on Information Theory 47:2, 498-519, February.
8 Mar 2006 - Sequential Data Applications
Discussion Leaders: Belinda, Sylvie
Some background material on DBNs to get us started:
Dynamic Bayesian Networks: Representation, Inference and Learning, K. Murphy. PhD thesis, UC Berkeley, 2002. In particular Chapter 2.
Learning Dynamic Bayesian Networks. In C.L. Giles and M. Gori (eds.), Adaptive Processing of Sequences and Data Structures. Lecture Notes in Artificial Intelligence, 168-197. Berlin: Springer-Verlag. Describes a few specific models and the use of variational inference.
Application papers:
Bayesian Information Extraction Network (BIEN), Leonid Peshkin and Avi Pfeffer. In Proc. 18th Int. Joint Conf. Artificial Intelligence, 2003. NLP application.
Below are applications in recognition of multiband/stream/channel speech, audio-visual speech, sign language, human actions, facial expression, and speaker detection. The common themes are fusion of multiple streams/channels, different degrees of asynchronicity between channels, and hierarchical structure in terms of multiple levels of abstraction and time-scale. Will try to include more comments later on. Some of the papers seem to be accessible only via login (IEEE Xplore, www.sciencedirect.com, etc.); please email Sylvie (engp0560@nus.edu.sg) if you need a PDF softcopy, thanks.
K. Murphy. PhD thesis, UC Berkeley, 2002 (as above), Sections 2.3.9 and 2.3.10. Hierarchical HMMs as used in (single stream/channel) speech recognition.
DBN Based Multi-Stream Models for Speech, Yimin Zhang, Qian Diao, Shan Huang, Wei Hu, Chris Bartels, and Jeff Bilmes. IEEE Int. Conf. on Acoustics, Speech, and Signal Processing, April 2003. Hong Kong, China. Hierarchical HMMs as used in multiple stream/channel speech recognition
Dynamic Bayesian Networks for Audio-Visual Speech Recognition, A. Nefian, L. Liang, X. Pi, X. Liu and K. Murphy. EURASIP, Journal of Applied Signal Processing, 11:1-15, 2002.
"Dynamic Bayesian Networks for Multi-Band Automatic Speech Recognition", Khalid DAOUDI, Dominique FOHR and Christophe ANTOINE. Computer Speech and Language. Vol 17, 2003. pp.263-285.
"A Framework for Recognizing the Simultaneous Aspects of American Sign Language", Christian Vogler and Dimitris Metaxas. Computer Vision and Image Understanding 81, pp. 358-384, 2001.
A hierarchical Bayesian network for event recognition of human actions and interactions, Sangho Park, J.K. Aggarwal. Multimedia Systems: Special issue on Video Surveillance, 10(2), pp. 164-179, 2004.
Active and Dynamic Information Fusion for Facial Expression Understanding from Image Sequences, Yongmian Zhang, Qiang Ji. IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 27, No. 5, May 2005. Some of the parameter derivations are a bit suspect, equation (2) for example. A shorter version that describes the main ideas is: Facial Expression Understanding in Image Sequences Using Dynamic and Active Visual Information Fusion, Yongmian Zhang and Qiang Ji, Proc. Ninth IEEE Intl Conf Computer Vision (ICCV), 2003.
"Boosted Learning in Dynamic Bayesian Networks for Multimodal Speaker Detection", V. Pavlovic, A. Garg, and J. M. Rehg. Proceedings of the IEEE, 91(9):1355-1369, September 2003.
We'll finalise the actual papers to be discussed later on.
22 Feb 2006 - Topic Model Applications
Discussion Leaders: Wei and Dave
Lu Wei - I am planning to discuss the following papers:
1. The model is presented in the following paper:
Learning and Representing Topic: A Hierarchical Mixture Model for Word Occurrences in Document Databases. Thomas Hofmann. In: Proceedings of the Conference for Automated Learning and Discovery (CONALD), Pittsburgh, 1998.
2. Applications of this model to connecting images and words:
Learning the Semantics of Words and Pictures. Kobus Barnard and David Forsyth. In: International Conference on Computer Vision, vol 2, pp. 408-415, 2001.
Matching Words and Pictures. Kobus Barnard, Pinar Duygulu, Nando de Freitas, David Forsyth, David Blei, and Michael I. Jordan. In: Journal of Machine Learning Research, Vol 3, pp 1107-1135, 2003.
Dave - Currently I plan to discuss the following paper:
Integrating Topics and Syntax. Thomas Griffiths, Mark Steyvers, David Blei and Joshua Tenenbaum. In: Advances in Neural Information Processing Systems 17, pp. 537–544, 2005.
8 Feb 2006 - Topic Models
Discussion Leaders: Yee Whye
We are revisiting this topic because we postponed it from last time.
- Probabilistic Latent Semantic Analysis. Thomas Hofmann. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI'99)
- Latent Dirichlet allocation. D. Blei, A. Ng, and M. Jordan. Journal of Machine Learning Research, 3:993-1022, January 2003.
- The Author-Topic Model for Authors and Documents. Michal Rosen-Zvi, Tom Griffiths, Mark Steyvers and Padhraic Smyth. UAI 2004
If time allows, we may also cover one of these applications:
- Topic and Role Discovery in Social Networks. Andrew McCallum, Andres Corrada-Emmanuel and Xuerui Wang. IJCAI, 2005.
- Group and Topic Discovery from Relations and Text. Xuerui Wang, Natasha Mohanty and Andrew McCallum. KDD Workshop on Link Discovery: Issues, Approaches and Applications (LinkKDD) 2005
And we will also briefly cover the following:
- Hierarchical topic models and the nested Chinese restaurant process. D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. In S. Thrun, L. Saul, and B. Scholkopf, editors, Advances in Neural Information Processing Systems (NIPS) 16.
- Hierarchical Dirichlet Processes. Y.W. Teh, M.I. Jordan, M.J. Beal and D.M. Blei. Technical Report 653, UC Berkeley Statistics, 2004.
Schedule for Semester I, 2005-2006
16 Nov 2005 - Factor Graphs
Discussion Leaders: Belinda, Guoliang, Hongli
- F. R. Kschischang, B. J. Frey, H. A. Loeliger 2001 Factor graphs and the sum-product algorithm, IEEE Transactions on Information Theory 47:2, 498-519, February.
2 Nov 2005 - Topic Models
Discussion Leaders: Haizhou, Xiaoyang
- Probabilistic Latent Semantic Analysis. Thomas Hofmann. Proceedings of the Fifteenth Conference on Uncertainty in Artificial Intelligence (UAI'99)
- Latent Dirichlet allocation. D. Blei, A. Ng, and M. Jordan. Journal of Machine Learning Research, 3:993-1022, January 2003.
- The Author-Topic Model for Authors and Documents. Michal Rosen-Zvi, Tom Griffiths, Mark Steyvers and Padhraic Smyth. UAI 2004
- Hierarchical topic models and the nested Chinese restaurant process. D. Blei, T. Griffiths, M. Jordan, and J. Tenenbaum. In S. Thrun, L. Saul, and B. Scholkopf, editors, Advances in Neural Information Processing Systems (NIPS) 16.
- Hierarchical Dirichlet Processes. Y.W. Teh, M.I. Jordan, M.J. Beal and D.M. Blei. Technical Report 653, UC Berkeley Statistics, 2004.
19 Oct 2005 - Monte Carlo and Variational Methods
Discussion Leaders: Yee Seng, Wee Hyong
- Introduction to Monte Carlo Methods, David MacKay. A good introductory paper.
- Neal, R. M. (1993) Probabilistic Inference Using Markov Chain Monte Carlo Methods, Technical Report CRG-TR-93-1, Dept. of Computer Science, University of Toronto. An extensive reference to many MCMC methods.
- An introduction to variational methods for graphical models. M. I. Jordan, Z. Ghahramani, T. S. Jaakkola, and L. K. Saul. In M. I. Jordan (Ed.), Learning in Graphical Models, Cambridge: MIT Press, 1999. A good but maybe outdated introductory paper to variational methods.
- A variational principle for graphical models. M. J. Wainwright and M. I. Jordan. New Directions in Statistical Signal Processing: From Systems to Brain. Cambridge, MA: MIT Press, 2005. This paper and the next give an updated view of variational methods.
- Graphical models, exponential families, and variational inference. M. J. Wainwright and M. I. Jordan. Technical Report 649, Department of Statistics, University of California, Berkeley, 2003.
5 Oct 2005 - Max margin Methods
Discussion Leaders: Xinhua, Zhaomin
____________ Essential Readings ____________
Theory:
[1] Max-Margin Markov Networks, B. Taskar, C. Guestrin, and D. Koller. Neural Information Processing Systems Conference (NIPS03), Vancouver, Canada, December 2003. (Best Student Paper Award)
Application 1: Parsing (context free grammar parsing)
[2] Max-Margin Parsing, B. Taskar, D. Klein, M. Collins, D. Koller and C. Manning. Empirical Methods in Natural Language Processing (EMNLP04), Barcelona, Spain, July 2004. (Best Paper Award)
Application 2: Matching (protein disulfide connectivity prediction)
[3] Learning Structured Prediction Models: A Large Margin Approach, B. Taskar, V. Chatalbashev, D. Koller, and C. Guestrin. Twenty Second International Conference on Machine Learning (ICML05), Bonn, Germany, August 2005. (latest)
Thesis: (fully subsumes [1, 2, 3], corresponding to Chapters 5 and 6; 9; and 10, respectively)
[4] Learning Structured Prediction Models: A Large Margin Approach, B. Taskar, PhD thesis, December 2004. (seems easier to read)
Tutorial for Max Margin technique (a pretty comprehensive picture that we will base our discussion on)
[5] Tutorial: Max-Margin Methods for NLP: Estimation, Structure, and Applications, Dan Klein and Ben Taskar, The Association for Computational Linguistics (ACL05), Ann Arbor, MI, June 2005. (Has a few slides on duality)
____________ Supplementary Readings ____________
Tutorial for SVM:
[6] A Tutorial on Support Vector Machines for Pattern Recognition, Christopher J. C. Burges, 1998.
Tutorial for Lagrange Multipliers: (may be needed to understand [6])
[7] Lagrange Multipliers without Permanent Scarring, Dan Klein.
Textbook for ultimate reading on optimization (for our discussion purposes):
[8] Convex Optimization, Stephen Boyd and Lieven Vandenberghe, 2004.
Two other papers using max margin ideas:
[9] Hidden Markov Support Vector Machines, Yasemin Altun, Ioannis Tsochantaridis, and Thomas Hofmann. International Conference on Machine Learning (ICML03), Washington, USA, August 2003.
[10] Support Vector Machine Learning for Interdependent and Structured Output Spaces, Ioannis Tsochantaridis, Thomas Hofmann, Thorsten Joachims, and Yasemin Altun. International Conference on Machine Learning (ICML04), Banff, Canada, August 2004.
21 Sep 2005 - Conditional Random Fields
Discussion Leaders: Wenyuan, Hendra
- Conditional Random Fields: Probabilistic Models for Segmenting and Labeling Sequence Data. John Lafferty, Andrew McCallum and Fernando Pereira. ICML-2001
- Maximum Entropy Markov Models for Information Extraction and Segmentation. Andrew McCallum, Dayne Freitag and Fernando Pereira. ICML-2000.
- A Conditional Random Field for Discriminatively-trained Finite-state String Edit Distance. Andrew McCallum, Kedar Bellare and Fernando Pereira. Conference on Uncertainty in AI (UAI), 2005.
- An Alternative Objective Function for Markovian Fields. Sham Kakade, Sam Roweis and Yee Whye Teh. ICML-2002
7 Sep 2005 - Junction Tree and Variable Elimination
Discussion Leader: Wee Sun
- Jordan (Unpublished), chapter 3, section 4.1 and chapter 17.
31 Aug 2005 - Markov Properties and Independence
Discussion Leader: Yee Whye
- Jordan (Unpublished), Chapters 2 and 16.
Schedule for Summer 2005
Our earlier readings in Summer 2005 were on graphical models for web modeling and evolution of these networks.
29 Jun 2005 and 13 Jul 2005: UMich Week 3
Discussion Leaders: Min-Yen Kan and Hang Cui
- S.N. Dorogovtsev and J.F.F. Mendes. Evolution of networks. Submitted to Advances in Physics on 6th March 2001
18 May 2005: UMich Week 2 - Graph Models
Discussion Leaders: Wee Sun Lee
- Albert-László Barabási and Réka Albert. Emergence of scaling in random networks. Science, 286:509-512, 1999.
http://www-personal.umich.edu/~adebaca/WebGraph/papers/BarabasiAlbert99emergence.pdf
- Duncan J. Watts and Steven H. Strogatz. Collective dynamics of small-world networks. Nature, 393:440-442, 1998. http://www-personal.umich.edu/~adebaca/WebGraph/papers/WattsStrogatz98collective.pdf
- P. Erdős and A. Rényi. On random graphs. Publicationes Mathematicae Debrecen, 6:290-291, 1959. http://www-personal.umich.edu/~adebaca/WebGraph/papers/Erdos59Random.pdf
4 May 2005: UMich Week 5 - Text summarization using graphs
Discussion Leaders: Min-Yen Kan and Xinyi Yin
- G. Erkan, D. R. Radev, LexRank: Graph-based Centrality as Salience in Text Summarization, Journal of Artificial Intelligence Research 22, 12/2004.
- H. Zha, Generic Summarization and Keyphrase Extraction Using Mutual Reinforcement Principle and Sentence Clustering, SIGIR '02, August 11-15, 2002.
- Salton, G., Allan, J., Buckley, C., and Singhal, A. Automatic analysis, theme generation, and summarization of machine-readable texts, Readings in Information Retrieval, 1997, pp. 478-483.
- Salton, G., Singhal, A., Mitra, M., Buckley, C. Automatic text structuring and summarization. Information Processing and Management: an International Journal, Vol. 33, Issue 2 (March 1997), pp. 193-207.
20 Apr 2005: Markov Chains - Definitions and properties
Discussion Leaders: Min-Yen Kan and Wee Sun Lee
Useful Links
The readings here are, as noted above, cribbed from other reading groups. We're playing catch-up for now.
