Two Catalysts for Qualitative Change
- City and State, 2000 BCE
- Longitude, 1773 CE
- GPS + cell phone, 1999 CE
- Underlying technologies
- Highly accurate atomic clocks
- Geosynchronous satellites
- Advances in micro-circuitry
- Proliferation of cell phones
- Demonstrated need
- Catalyst: companies able to produce in quantity at low price
- Qualitative change
The ACM Computing Portal
A web-based repository of bibliographic information
contains information on all papers and books in the computing literature
contains a pointer to the digitized version, if available
- Qualitatively increase the effectiveness of scientific research into computing
- Continue to position ACM as the premier scientific and educational organization for computing
- Increase service of ACM and the SIGs to the scientific community
- Provide a concrete illustration of the scope of computer science
- Bibliographic Entries
- Abstracts and Keywords
- Full Text
- Citation Linking
- Realizing the Computing Portal
- Revisit the components
- The Next Step
Step 1: Bibliographic Entries
- Collect all bibliographic entries from all computer science journals, conferences, workshops, technical bulletins, and books.
- Over the period from 1940 to 2000
- Approximately 1M entries
- Provide free searching on the web.
- Provide citations in multiple formats: HTML, BibTeX, refer, Word, ...
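As a minimal sketch of serving one entry in several citation formats, the Python below renders a single record as BibTeX and as refer. The `BibEntry` class, its field set, and the sample entry are assumptions made for illustration, not the portal's actual schema or data.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BibEntry:
    """One bibliographic entry (hypothetical field set)."""
    key: str
    authors: List[str]
    title: str
    venue: str
    year: int


def to_bibtex(e: BibEntry) -> str:
    """Render the entry as a BibTeX @inproceedings record."""
    fields = [
        ("author", " and ".join(e.authors)),
        ("title", e.title),
        ("booktitle", e.venue),
        ("year", str(e.year)),
    ]
    body = ",\n".join(f"  {name} = {{{value}}}" for name, value in fields)
    return f"@inproceedings{{{e.key},\n{body}\n}}"


def to_refer(e: BibEntry) -> str:
    """Render the entry in refer format: one %A line per author."""
    lines = [f"%A {a}" for a in e.authors]
    lines += [f"%T {e.title}", f"%B {e.venue}", f"%D {e.year}"]
    return "\n".join(lines)


# Made-up sample entry, used only to exercise the two renderers.
sample = BibEntry(
    key="doe2000example",
    authors=["Jane Doe", "John Roe"],
    title="An Example Paper",
    venue="Proceedings of an Example Conference",
    year=2000,
)
print(to_bibtex(sample))
print(to_refer(sample))
```

Keeping one structured record and deriving each output format from it means a new format only needs one more rendering function.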
Step 2: Abstracts and Keywords
- Collect keywords, and later, abstracts, for all entries.
- Copyright restrictions on some abstracts?
Step 3: Full Text and Images
Collect full text of each available paper and book for
- use in searching
- developing classification maps and lexicons
- other analyses
Step 4: Citation Linking
- Start with full text of paper's bibliography.
- Out linking: identify bibliographic entry of papers referenced by the paper
- In linking: identify bibliographic entries of papers referencing the paper
- Use for citation analysis, knowledge diffusion studies
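The out-linking and in-linking steps above can be sketched as follows. This is an illustration under strong assumptions: references are matched to entries by naive normalized-title containment, and the function names and the `catalog` mapping (entry id to title) are hypothetical. A production system would need far more robust reference matching.

```python
import re
from collections import defaultdict
from typing import Dict, List, Set


def normalize(text: str) -> str:
    """Lowercase and collapse punctuation so titles compare loosely."""
    return re.sub(r"[^a-z0-9]+", " ", text.lower()).strip()


def resolve_out_links(reference_strings: List[str],
                      catalog: Dict[str, str]) -> List[str]:
    """Out linking: map each raw reference string from a paper's
    bibliography to the id of the first catalog entry whose normalized
    title it contains. Unmatched references are skipped."""
    titles = {eid: normalize(title) for eid, title in catalog.items()}
    found = []
    for ref in reference_strings:
        norm_ref = normalize(ref)
        for eid, title in titles.items():
            if title and title in norm_ref:
                found.append(eid)
                break
    return found


def build_in_links(out_links: Dict[str, List[str]]) -> Dict[str, Set[str]]:
    """In linking: invert the out-link map so each entry lists the
    entries that reference it."""
    in_links: Dict[str, Set[str]] = defaultdict(set)
    for citing, cited_list in out_links.items():
        for cited in cited_list:
            in_links[cited].add(citing)
    return dict(in_links)


def citation_counts(in_links: Dict[str, Set[str]]) -> Dict[str, int]:
    """A simple citation-analysis measure: number of citing entries."""
    return {eid: len(citers) for eid, citers in in_links.items()}
```

In-links come for free once out-links are resolved, since they are just the inverse mapping. The costly part is resolving raw reference strings to entries.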
Papers with wavelet:
Stage 1: Bibliographic Entries
Propose that each SIG be responsible for collecting the entries in its area, in order to
- ensure completeness, based on SIG interests
- reduce overlap between SIGs
- ensure correctness
Software for data entry, validation, and conversion provided to SIGs
1M entries / 36 SIGs = approximately 30K entries per SIG
- e.g., SIGMOD: approximately 50K entries
- DBLP: 130K entries
- Propose that ACM donate the ACM Guide to Computing Literature: 200K entries
- Collection of Computer Science Bibliographies: 930K entries
Stage 2: Keywords and Abstracts
- Propose that SIGs collect these.
- May need copyright permission, negotiated by ACM HQ
- Collection of CS bibliographies has 100K abstracts
Stage 3: Full Text
Propose that SIGs fund populating the full ACM Digital Library.
- PDF files containing encapsulated TIFF and OCRed full text
- 99% accuracy
- $1.25 per page.
Could go to SGML or XML, 99.9% accuracy: $8-$10 per page.
Populating the ACM DL
- Journals: 130K pages: $200K
- Conference and workshop proceedings: 500K pages: $600K
- Newsletters: 200K pages: $250K
- Total: 850K pages at $1050K
- $30K per SIG
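The per-SIG share can be checked with a few lines of Python. This is only a sketch of the arithmetic: the dollar figures are the rounded estimates listed above, the count of 36 SIGs is taken from the Stage 1 estimate, and an even split across SIGs is an assumption.

```python
from typing import Dict

# Rounded cost estimates in thousands of dollars, as listed above.
COSTS_K: Dict[str, int] = {
    "journals": 200,
    "proceedings": 600,
    "newsletters": 250,
}
NUM_SIGS = 36  # SIG count used in the Stage 1 estimate


def per_sig_share(costs_k: Dict[str, int], num_sigs: int) -> float:
    """Split the total cost evenly across SIGs (assumed equal shares)."""
    return sum(costs_k.values()) / num_sigs


total_k = sum(COSTS_K.values())
share_k = per_sig_share(COSTS_K, NUM_SIGS)
print(f"total: ${total_k}K, per SIG: ${share_k:.1f}K")
# prints "total: $1050K, per SIG: $29.2K", which rounds to the $30K above
```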
Stage 3: Full Text, cont.
ACM papers: 850K pages, or about 50K papers
- This represents 5% of the total of 1M papers.
ACM books: obtain full text from publishers.
For remaining conference proceedings,
- Offer a full CD-ROM package at cost in exchange for inclusion in the CD-ROM and use of the full text for searching.
- Pay for digitization out of conference profits
- e.g., IEEE ICDE: 600 pages x 17 years x $1.25 = $13K.
- SIGs pay for integration: $0.25 - $0.50 per page.
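The ICDE estimate and the integration range can be reproduced with a short calculation. This is a sketch using the per-page rates quoted above; the function name is hypothetical, and a constant 600 pages per year is the slide's simplifying assumption.

```python
def digitization_cost(pages_per_year: int, years: int,
                      rate_per_page: float) -> float:
    """Cost of digitizing a proceedings series at a flat per-page rate."""
    return pages_per_year * years * rate_per_page


PAGES_PER_YEAR = 600
YEARS = 17
total_pages = PAGES_PER_YEAR * YEARS  # 10,200 pages

scan = digitization_cost(PAGES_PER_YEAR, YEARS, 1.25)    # paid by the conference
integ_low = digitization_cost(PAGES_PER_YEAR, YEARS, 0.25)   # paid by the SIG
integ_high = digitization_cost(PAGES_PER_YEAR, YEARS, 0.50)  # paid by the SIG

print(f"{total_pages} pages: scanning ${scan:,.0f}, "
      f"integration ${integ_low:,.0f} to ${integ_high:,.0f}")
# prints "10200 pages: scanning $12,750, integration $2,550 to $5,100"
```

The scanning figure of $12,750 is what the slide rounds to $13K.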
Stage 3: Journal Papers
For other journals,
- Same offer as with conferences
- Or, offer a URL into their DL in exchange for full text, used only for searching
- ACM Computing Portal provides valuable entry into their DL, enhancing their revenue stream.
For other books, make same offer.
- Free searching via web interface, including full text search, at ACM site and SIG portals
- Bibliographic data available for other search engines
- As much PDF available for free as possible
- Encourage digitization of corpus
The ACM Computing Portal
- Free searchable access to the entire computer science corpus
- SIG-specific portals
- Fully populated ACM DL
- Inclusion of or portal to other DL resources
- Capability to purchase papers and to register queries
- Possibly ancillary SIG-provided benefits, such as CD-ROMs
SGB Portal Committee
- Rick Snodgrass (University of Arizona, CS), chair
- Steve Cunningham (Cal State University-Stanislaus, CS)
- Mary Fernandez (AT&T Labs)
- Carol Hutchins (Courant Institute of Math. Sci. Library)
- Bob Krovetz (NEC Research Institute)
- Michael Ley (University of Trier, CS)
- Andreas Paepcke (Stanford University)
- Kathy Preas (KP Pubs on CDROM)
- Charles Viles (Univ. of North Carolina, Info and Lib Sci)
Individual SIG Commitments
- Collect and capture SIG-relevant bibliographic entries, abstracts, and keywords, in appropriate format.
Allocate funds to populate the ACM DL: journals, conference and workshop proceedings, SIG newsletter.
- Roughly $30K for each SIG
- SIGDA matching funds: $50K
Negotiate with steering committees of associated conferences and workshops.
ACM HQ Commitments
- Donate entries from ACM Guide to Computing Literature.
- Negotiate cross-use agreements with associated societies.
- Acquire full text of books copyrighted by ACM.
- Provide hardware and software to host CSP.
- Provide staff to manage CSP, with content provided by SIGs.
ACM HQ Opportunities
- Integrate CSP with CoRR
- Provide print and CD-ROM versions of the expanded ACM Guide to Computing Literature
- Fully populated DL
- Increased visibility of ACM
- Inexpensive scanning, OCR, and disk space; inexpensive, high-capacity CD-ROM
Catalysts: ACM Council and SIG Governing Board