3rd Joint Workshop on Bibliometric-enhanced Information Retrieval and Natural Language Processing for Digital Libraries (BIRNDL 2018)

at the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval, Ann Arbor, MI, USA

Call for Papers

You are invited to participate in the 3rd Joint Workshop on Bibliometric-enhanced IR and NLP for Digital Libraries (BIRNDL), to be held as part of the 41st International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR 2018) in Ann Arbor, MI, USA on 12th July 2018.

This is the third BIRNDL workshop and the second at SIGIR, following a series of successful BIR workshops at ECIR and other premier IR venues. In conjunction with the BIRNDL workshop, we will also hold the 4th CL-SciSumm Shared Task in Scientific Document Summarization.

Reports from the shared task systems will be featured as part of a session at the workshop.

Aim of the Workshop

The goal of the BIRNDL workshop at SIGIR is to engage the IR community on the open problems in academic search. Academic search refers to the large, cross-domain digital repositories which index research papers, such as the ACL Anthology, arXiv, ACM Digital Library, IEEE database, Web of Science and Google Scholar. Currently, digital libraries collect and allow access to papers and their metadata (including citations) but mostly do not analyze the items they index. The scale of scholarly publications poses a challenge for scholars in their search for relevant literature. Finding relevant scholarly literature is the key theme of BIRNDL and sets the agenda for tools and approaches to be discussed and evaluated at the workshop. We would also like to address the need for established, standardized baselines, evaluation metrics and test collections.

We invite papers and presentations that incorporate insights from IR, bibliometrics and NLP to develop new techniques to address the open problems in Big Science, such as evidence-based searching; measurement of research quality, relevance and impact; the emergence and decline of research problems; identification of scholarly relationships and influences; and applied problems such as language translation, question-answering and summarization.

See proceedings of the 2nd BIRNDL workshop Vol. 1 and Vol. 2 and a recent report in SIGIR Forum.

Workshop Topics

We invite stimulating as well as unpublished submissions on topics including - but not limited to - full-text analysis, multimedia and multilingual analysis and alignment, as well as the application of citation-based NLP or information retrieval and information seeking techniques in digital libraries. Specific examples of fields of interest include (but are not limited to):

For the paper sessions, we especially invite descriptions of running projects and ongoing work as well as contributions from industry. Papers that investigate multiple themes directly are especially welcome.



The CL-SciSumm Shared Task


The CLSciSumm18 corpus is expected to be of interest to a broad community including those working in computational linguistics and natural language processing, text summarization, discourse structure in scholarly discourse, paraphrase, textual entailment and text simplification.

The task is automatic scientific paper summarization in the Computational Linguistics (CL) domain. The output summaries will be of two types: faceted summaries of the traditional self-summary (the abstract) and the community summary (the collection of citation sentences, or "citances"). We also propose to group the citances by the facets of the text that they refer to.

At SIGIR 2018, we will hold the 4th Computational Linguistics (CL) Scientific Summarization Shared Task (http://wing.comp.nus.edu.sg/~cl-scisumm2018/), which is sponsored by Microsoft Research Asia. This task follows up on the successful CLSciSumm-2017 @ SIGIR 2017, the CLScisumm-2016 task @ JCDL 2016, Newark, NJ, USA, and a Pilot Task conducted as part of the BiomedSumm Track at the Text Analysis Conference 2014 (TAC 2014). In this task, a training corpus of ten topics from CL research papers was released. Participants were invited to enter their systems in a task-based evaluation.



Important Dates

Submissions: May 10, 2018 (extended from May 4)
Notification: May 25, 2018
Camera Ready Contributions: June 25, 2018
Workshop: July 12, 2018 in Ann Arbor, MI, USA

Check the CL-SciSumm 2018 Shared Task homepage for details on dates with respect to the shared task. The dates are coordinated.

All deadlines for the BIRNDL workshop are calculated as 11:59pm Baker Island Time (BIT: UTC/GMT-12).


Invited Talk

Title of the Talk: Semi-Automating Biomedical Evidence Synthesis via Machine Learning and Natural Language Processing

BIO: Byron Wallace is an assistant professor in the College of Computer and Information Science at Northeastern University. He holds a PhD in Computer Science from Tufts University, where he was advised by Carla Brodley. He has previously held faculty positions at the University of Texas at Austin and at Brown University. His research is in machine learning and natural language processing methods, with an emphasis on their application in health informatics. Wallace's work has been supported by grants from the National Science Foundation (NSF; including a recent CAREER award), the National Institutes of Health (NIH), and the Army Research Office (ARO). He won the Tufts University 2012 Outstanding Graduate Researcher award, and his thesis work was recognized as the runner-up for the 2013 ACM Special Interest Group on Knowledge Discovery and Data Mining (SIG KDD) Dissertation Award. He co-authored the winning submission for the Health Care Data Analytics Challenge at the 2015 IEEE International Conference on Healthcare Informatics, and his recent work with colleagues received the 2017 Distinguished Clinical Research Informatics Paper Award at the American Medical Informatics Association Joint Summits on Translational Sciences.

The proceedings of the BIRNDL 2018 workshop at SIGIR 2018 are now online!


8:30 Meet up and poster setup
9:00 Introduction to the workshop (Philipp Mayr and Kokil Jaidka)
9:10 Keynote: Semi-Automating Biomedical Evidence Synthesis via Machine Learning and Natural Language Processing (Byron Wallace)
9:40 Overview: CLScisumm '18 (Kokil Jaidka)
9:50 Poster Pitches (each poster presenter will give a 1 to 2 min pitch):
    Representing Mathematical Formulae in Content MathML using Wikidata (Philipp Scharpf, Moritz Schubotz and Bela Gipp)
    Addressing Overgeneration Error: An Effective and Efficient Approach to Keyphrase Extraction from Scientific Papers (Haofeng Jia and Erik Saule)
    Invited Poster: UWNLP@SemEval'18 - Scientific Relation Extraction Model with Concept Embedding (Yi Luan, Mari Ostendorf and Hannaneh Hajishirzi)
    CLSciSumm Shared Task: On the Contribution of Similarity measure and Natural Language Processing Features for Citing Problem (Elnaz Davoodi, Kanika Madan, Jia Gu)
    CL-SciSumm Shared Task - Team Magma (Hector Martinez Alonso, Raheleh Makki, and Jia Gu)
    LaSTUS/TALN+INCO @ CL-SciSumm 2018 - Using Regression and Convolutions for Cross-document Semantic Linking and Summarization of Scholarly Literature (Ahmed Abura'ed, Alex Bravo, Luis Chiruzzo, and Horacio Saggion)
    NJUST @ CLSciSumm-18 (Shutian Ma, Heng Zhang, Jin Xu, Chengzhi Zhang)
    Klick Labs at CL-SciSumm 2018 (Gaurav Baruah and Maheedhar Kolla)
10:30 CL-Scisumm Winner Talk: Using Regression and Convolutions for Cross-document Semantic Linking and Summarization of Scholarly Literature (Ahmed Abura'ed, Alex Bravo, Luis Chiruzzo, and Horacio Saggion)
Offline: Time-aware Collaborative Topic Regression: Towards higher relevance in Textual Item recommendation (Anas Alzogbi)
10:40 Query-focused Scientific Paper Summarization with Localized Sentence Representation (Kazutoshi Shinoda and Akiko Aizawa)
11:00 An Evaluation Framework for Scientific Expert Finding: Sampling a Richer Set of Queries from the Experts Documents (Robin Brochier, Julien Velcin, Adrien Guille and Benjamin Rothan)
11:20 Summary, Outlook and Discussion (Philipp Mayr and Kokil Jaidka)
12:00 Lunch and End of Workshop



Submission Information

Research track: All submissions must be written in English, following the Springer LNCS author guidelines (max. 6 pages for short and 12 pages for full papers, excluding references, which may run to any length) and should be submitted as PDF files to EasyChair. All submissions will be reviewed by at least two independent reviewers. Please note that at least one author per paper needs to register for the workshop and attend it to present the work. In case of a no-show, the paper (even if accepted) will be deleted from the proceedings and from the program. Submissions and reviewing will be managed by the EasyChair conference management system.

Poster track: We welcome submissions detailing original, early findings, works in progress and industrial applications of bibliometrics and IR for a special poster session, possibly with a 2-minute presentation in the main session. Some research track papers will also be invited to the poster track instead, although there will be no difference in the final proceedings between poster and research track submissions. These papers should follow the same format as the research track papers.

Shared Task: Teams that wish to participate in the CL Shared Task track at BIRNDL 2018 are invited to register on EasyChair by April 15th with a title and a tentative abstract describing their approach. Participants are advised to register as soon as possible in order to receive timely access to evaluation resources, including development and testing data. Registration for the task does not commit you to participation, but it helps the organizers with planning. All participants who submit system runs are welcome to present their system at the BIRNDL Workshop in the poster session, while the best performing system will be invited to present their paper in the main session. Dissemination of CL-SciSumm work and results other than in the workshop proceedings is welcomed, but the conditions of participation specifically preclude any advertising claims based on these results. Any questions about conference participation may be sent to the organizers mentioned below.

Organising Committee

Muthu Kumar Chandrasekaran - muthu.chandra@comp.nus.edu.sg

He is a final-year Ph.D. student at NUS School of Computing advised by Prof. Min-Yen Kan. He is broadly interested in natural language processing, machine learning and their applications to information retrieval; specifically, in retrieving and organising information from asynchronous conversation media such as scholarly publications and discussion forums. He was on the organizing committee of the CL-SciSumm 2016 Shared Task, the CL-SciSumm 2014 Pilot Task and the BIRNDL workshop. He also reviews for the ACL, EMNLP, NAACL and JCDL conferences. He believes communication of scholarly research needs to be summarized to avoid redundant or outdated research and ensure faster progress on pressing problems. He is currently doing his Ph.D. research on a similarly motivated problem on Massive Open Online Course (MOOC) discussion forums and is currently interning at the Allen Institute for Artificial Intelligence, Seattle.

Kokil Jaidka - jaidka@sas.upenn.edu

Dr Kokil Jaidka is a postdoctoral researcher in Computer Science and Chief Technology Officer for the World Wellbeing Project at the University of Pennsylvania. She has been the lead coordinator of all aspects of the CL-SciSumm Shared Task since 2014, and she also co-organized the 1st BIRNDL workshop. She has expertise working on large datasets using machine learning and unsupervised approaches on textual data, and in the specific areas of multi-document summarization and applied linguistics. She is a reviewer for Scientometrics, Applied Linguistics and Aslib journal of Information Processing & Management. Her PhD dissertation involved the development of a literature review framework for the summarization of research papers. Currently, she is conducting social media analyses and user language modeling for opinion mining, behavioral profiling and health outcomes.

Philipp Mayr - philipp.mayr@gesis.org

Philipp Mayr is a deputy department head and a team leader at GESIS - Leibniz-Institute for the Social Sciences, in the department Knowledge Technologies for the Social Sciences (WTS). He was a visiting professor for knowledge representation at the University of Applied Sciences in Darmstadt, Department of Information Science and Engineering, during 2009-2011. Philipp Mayr received his PhD in applied informetrics and information retrieval from the Berlin School of Library and Information Science at Humboldt University Berlin in 2009. To date, he has been awarded substantial research funding (PI, Co-PI) from national and European funding agencies. Philipp Mayr has published in top conferences and prestigious journals in the areas of informetrics, information retrieval and digital libraries. His research group focuses on methods and techniques for interactive information retrieval. Philipp Mayr was the main organizer of the Combining Bibliometrics and Information Retrieval workshop at ISSI 2013, the BIR workshops at ECIR 2014, 2015 and 2016 and the first BIRNDL workshop at JCDL 2016.

The main organizers will be supported by our previous co-organizers:



Programme Committee

The following committee members support the workshop series and will form our reviewer pool:




