Coursera Crawler

Abstract

Coursera Crawler collects discussion forum data from the Coursera website. It uses PhantomJS to simulate the login process and PycURL to retrieve the target data through hidden APIs.

The crawler currently covers only discussion forum data. It can be extended with PycURL to collect other data, provided that the data shown on the webpage is loaded dynamically through APIs. The hard part is finding those hidden APIs.
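As a rough illustration of the PycURL step described above, the following sketch fetches JSON from an API endpoint using session cookies. It is not the project's actual code: the endpoint `EXAMPLE_API`, the helper names, the assumption that the login session is available as a name-to-value cookie dictionary, and the assumption that the API returns JSON are all hypothetical.

```python
import json
from io import BytesIO
from urllib.parse import urlencode

# Hypothetical endpoint, for illustration only. The real hidden API URLs
# have to be found by inspecting the browser's network traffic.
EXAMPLE_API = "https://example.org/api/forum/threads"


def build_url(base, params):
    """Append query parameters to the API base URL."""
    return base + "?" + urlencode(params)


def cookie_header(cookies):
    """Format a cookie dict (assumed to come from the login session) for libcurl."""
    return "; ".join(f"{name}={value}" for name, value in cookies.items())


def fetch_json(url, cookies):
    """GET `url` with the session cookies and decode the JSON response body."""
    # Imported here so the helpers above can be used without PycURL installed.
    import pycurl

    buffer = BytesIO()
    curl = pycurl.Curl()
    curl.setopt(pycurl.URL, url)
    curl.setopt(pycurl.COOKIE, cookie_header(cookies))
    curl.setopt(pycurl.FOLLOWLOCATION, True)
    curl.setopt(pycurl.WRITEDATA, buffer)
    try:
        curl.perform()
        status = curl.getinfo(pycurl.RESPONSE_CODE)
    finally:
        curl.close()
    if status != 200:
        raise RuntimeError(f"request failed with HTTP status {status}")
    # Assumes the endpoint returns UTF-8 encoded JSON.
    return json.loads(buffer.getvalue().decode("utf-8"))
```

Under these assumptions, a call might look like `fetch_json(build_url(EXAMPLE_API, {"page": 1}), cookies)`, where `cookies` holds the values captured after the simulated login.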

Resources

  • The Coursera Crawler source code is available on GitHub.

Members

  • An Yahui (Intern)
  • Muthu Kumar Chandrasekaran (Project Lead)
  • Min-Yen Kan (Advisor and Professor)

Meeting Minutes

  • 29 Aug 2016
  • 13 Sep 2016
  • 27 Sep 2016