SURNAME = '[A-Z][a-z]{1,}'
INITIALS = '((\s[A-Z](\.|,|\.,))(\s?[A-Z](\.|,|\.,))*)'
TITLE = '(([A-Za-z:,\r\n]{2,}\s?){3,})'
REGEX = /([^e][^d][^s][^\.]\s|\d+\.?\s|^)(#{SURNAME},?#{INITIALS})(\s?(,|and|&|,\s?and)?\s?(#{SURNAME},?#{INITIALS}))*\s*(\(?\d\d\d\d\)?\.?)?\s*("|“)?(#{TITLE})\.?("|”)?/
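To make the capture groups concrete, here is a minimal Ruby sketch of how the pattern might be applied to a single reference. The helper names `extract_title` and `annotate` and the constant `TITLE_GROUP` are hypothetical, not part of the original code, and the group number 18 is an assumption obtained by counting opening parentheses from the left. The citation count is simply passed in by the caller; how it gets looked up is outside the scope of this sketch.

```ruby
SURNAME  = '[A-Z][a-z]{1,}'
INITIALS = '((\s[A-Z](\.|,|\.,))(\s?[A-Z](\.|,|\.,))*)'
TITLE    = '(([A-Za-z:,\r\n]{2,}\s?){3,})'
REGEX = /([^e][^d][^s][^\.]\s|\d+\.?\s|^)(#{SURNAME},?#{INITIALS})(\s?(,|and|&|,\s?and)?\s?(#{SURNAME},?#{INITIALS}))*\s*(\(?\d\d\d\d\)?\.?)?\s*("|“)?(#{TITLE})\.?("|”)?/

# Assumption: counting opening parentheses left to right,
# the title lands in capture group 18.
TITLE_GROUP = 18

# Returns the matched title, or nil when the pattern does not match.
def extract_title(reference)
  match = REGEX.match(reference)
  match && match[TITLE_GROUP].rstrip
end

# Returns the reference with " (Cited by N)" inserted right after the title.
# References the pattern cannot parse are returned unchanged.
def annotate(reference, citations)
  match = REGEX.match(reference)
  return reference unless match

  title_end = match.begin(TITLE_GROUP) + match[TITLE_GROUP].rstrip.length
  reference[0...title_end] + " (Cited by #{citations})" + reference[title_end..-1]
end

reference = '4. Goffman, E. Behavior in Public Places: Notes on the Social ' \
            'Organization of Gatherings. New York: The Free Press, 1963.'
puts extract_title(reference)
puts annotate(reference, 822)
```

Returning the reference unchanged on a miss means a list can still be printed in full when only some entries are recognised.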
Now, I am sure that this can be improved upon, but with a little web interface I have cooked up, I can take the following:
1. Erickson, T. & Kellogg, W. A. “Social Translucence: An Approach to Designing Systems that Mesh with Social Processes.” In Transactions on Computer-Human Interaction. Vol. 7, No. 1, pp 59-83. New York: ACM Press, 2000.
2. Erickson, T. & Kellogg, W. A. “Knowledge Communities: Online Environments for Supporting Knowledge Management and its Social Context” Beyond Knowledge Management: Sharing Expertise. (eds. M. Ackerman, V. Pipek, and V. Wulf). Cambridge, MA, MIT Press, in press, 2001.
3. Erickson, T., Smith, D.N. Erickson, T., Smith, D.N., Kellogg, W. A., Laff, M. R., Richards, J. T., and Bradner, E. (1999). “Socially translucent systems: Social proxies, persistent conversation, and the design of Babble.” Human Factors in Computing Systems: The Proceedings of CHI ‘99, ACM Press.
4. Goffman, E. Behavior in Public Places: Notes on the Social Organization of Gatherings. New York: The Free Press, 1963.
5. Heath, C. and Luff, P. Technology in Action. Cambridge: Cambridge University Press, 2000.
6. Smith, C. W. Auctions: The Social Construction of Value. New York: Free Press, 1989
7. Whyte, W. H., City: Return to the Center. New York: Doubleday, 1988.
and get back the same list with a citation count added to each title:
1. Erickson, T. & Kellogg, W. A. “Social Translucence: An Approach to Designing Systems that Mesh with Social Processes (Cited by 78).” In Transactions on Computer-Human Interaction. Vol. 7, No. 1, pp 59-83. New York: ACM Press, 2000.
2. Erickson, T. & Kellogg, W. A. “Knowledge Communities: Online Environments for Supporting Knowledge Management and its Social Context (Cited by 52)” Beyond Knowledge Management: Sharing Expertise. (eds. M. Ackerman, V. Pipek, and V. Wulf). Cambridge, MA, MIT Press, in press, 2001.
3. Erickson, T., Smith, D.N. Erickson, T., Smith, D.N., Kellogg, W. A., Laff, M. R., Richards, J. T., and Bradner, E. (1999). “Socially translucent systems: Social proxies, persistent conversation, and the design of Babble (Cited by 284).” Human Factors in Computing Systems: The Proceedings of CHI ‘99, ACM Press.
4. Goffman, E. Behavior in Public Places: Notes on the Social Organization of Gatherings (Cited by 822). New York: The Free Press, 1963.
5. Heath, C. and Luff, P. Technology in Action (Cited by 408). Cambridge: Cambridge University Press, 2000.
6. Smith, C. W. Auctions: The Social Construction of Value (Cited by 210). New York: Free Press, 1989
7. Whyte, W. H., City: Return to the Center (Cited by 14). New York: Doubleday, 1988.
Which I think is pretty damn useful. I'm getting about a 70% hit rate on other lists of references, and I'm sure that can be improved.
There are also changes that I might make to the color gradation. At the moment I'm just setting the red value between 0 and 255 based on the number of citations, so everything with more than 255 citations doesn't get any redder. I'd like to normalise the color so that the highest citation count in the references corresponds to full red, with all the other gradations in between. Ideally I'd also like to slide between red and white instead of red and black, and have the background color change rather than the text, but that's all icing on the cake really.
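As a rough sketch of the two color schemes just described, the following Ruby compares the current capped black-to-red text color with a normalised white-to-red background. Both function names are hypothetical, and the hex-string output format is an assumption about how the web interface would consume the color.

```ruby
# Current scheme as described: the red channel is the citation count,
# capped at 255, so the text color runs from black to red.
def current_text_color(count)
  red = [[count, 0].max, 255].min
  format('#%02x0000', red)
end

# Proposed scheme: normalise against the largest count in the list and
# slide the background from white (no citations) to red (the maximum).
def normalised_background(count, max_count)
  return '#ffffff' if max_count <= 0

  ratio = [[count.to_f / max_count, 0.0].max, 1.0].min
  fade = (255 * (1 - ratio)).round # green and blue fall as citations rise
  format('#ff%02x%02x', fade, fade)
end

counts = [78, 52, 284, 822, 408, 210, 14]
highest = counts.max
counts.each do |count|
  puts "#{count}: text #{current_text_color(count)}, " \
       "background #{normalised_background(count, highest)}"
end
```

With the capped scheme, 284, 408 and 822 citations all come out as the same full red, whereas the normalised version keeps them distinguishable.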
What I'd most like to see is this as a web service that everyone could use, along with an ongoing group effort to improve the regex further and get as many title matches as possible. If you're interested, please add your vote to the Google Scholar feature request.