Detecting academic papers on the web

Emi Ishita, Teru Agata, Atsushi Ikeuchi, Miyata Yosuke, Shuichi Ueda

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Our research goal is to develop a search engine for open access to academic papers. English and Japanese test sets were built for detection of academic papers from 20,000 PDF files in each language using five annotators. Six classifiers were trained using similar features for each language. We report F1 of 0.74 for English and 0.54 for Japanese and argue that similar features could easily be generated for other languages as well.

Original language: English
Title of host publication: JCDL'11 - Proceedings of the 2011 ACM/IEEE Joint Conference on Digital Libraries
Pages: 413-414
Number of pages: 2
Publication status: Published - 2011
Event: 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11 - Ottawa, ON, Canada
Duration: 2011 Jun 13 to 2011 Jun 17

Publication series

Name: Proceedings of the ACM/IEEE Joint Conference on Digital Libraries
ISSN (Print): 1552-5996

Conference

Conference: 11th Annual International ACM/IEEE Joint Conference on Digital Libraries, JCDL'11
Country/Territory: Canada
City: Ottawa, ON
Period: 11/6/13 to 11/6/17

Keywords

  • academic papers
  • pdf
  • search engine

ASJC Scopus subject areas

  • Engineering(all)
