org.archive.crawler.datamodel
Class Robotstxt

java.lang.Object
  extended by org.archive.crawler.datamodel.Robotstxt

public class Robotstxt
extends java.lang.Object

Utility class for parsing 'robots.txt' format directives into a list of named user-agents and a map from user-agents to disallowed paths.


Constructor Summary
Robotstxt()
           
 
Method Summary
static void main(java.lang.String[] args)
           
static boolean parse(java.io.BufferedReader reader, java.util.LinkedList<java.lang.String> userAgents, java.util.Map<java.lang.String,java.util.List<java.lang.String>> disallows)
           
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

Robotstxt

public Robotstxt()

Method Detail

parse

public static boolean parse(java.io.BufferedReader reader,
                            java.util.LinkedList<java.lang.String> userAgents,
                            java.util.Map<java.lang.String,java.util.List<java.lang.String>> disallows)
                     throws java.io.IOException
Throws:
java.io.IOException
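
The signature above suggests that the caller supplies the reader and two empty collections, which parse then fills in. The following is a minimal, self-contained sketch of that calling pattern, not the Heritrix implementation. The class RobotsSketch, the helper parseText, the lower-casing of agent names, and the meaning given to the boolean return value (true if a malformed line was seen) are all assumptions made for illustration; this page does not document any of them.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.LinkedList;
import java.util.List;
import java.util.Locale;
import java.util.Map;

// Hypothetical stand-in that mirrors the documented signature.
// It is NOT the Heritrix code; its behavior is assumed for illustration.
class RobotsSketch {

    // Fills userAgents (in order of appearance) and disallows (agent -> paths).
    // Assumption: returns true if any malformed line was encountered.
    static boolean parse(BufferedReader reader,
                         LinkedList<String> userAgents,
                         Map<String, List<String>> disallows) throws IOException {
        boolean malformed = false;
        List<String> current = null;  // paths shared by the current agent group
        boolean inRules = false;      // true once the current group has a rule
        String line;
        while ((line = reader.readLine()) != null) {
            int hash = line.indexOf('#');          // strip trailing comments
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (line.isEmpty()) {
                continue;
            }
            int colon = line.indexOf(':');
            if (colon < 0) {                       // not a "field: value" line
                malformed = true;
                continue;
            }
            String field = line.substring(0, colon).trim().toLowerCase(Locale.ROOT);
            String value = line.substring(colon + 1).trim();
            if (field.equals("user-agent")) {
                // A User-agent line after rules starts a new group.
                if (current == null || inRules) {
                    current = new ArrayList<>();
                    inRules = false;
                }
                String agent = value.toLowerCase(Locale.ROOT);
                userAgents.add(agent);
                disallows.put(agent, current);
            } else if (current == null) {
                malformed = true;                  // rule before any User-agent
            } else {
                inRules = true;
                // An empty Disallow value means nothing is disallowed.
                if (field.equals("disallow") && !value.isEmpty()) {
                    current.add(value);
                }
            }
        }
        return malformed;
    }

    // Convenience wrapper (hypothetical) for parsing an in-memory string.
    static boolean parseText(String text,
                             LinkedList<String> userAgents,
                             Map<String, List<String>> disallows) {
        try (BufferedReader reader = new BufferedReader(new StringReader(text))) {
            return parse(reader, userAgents, disallows);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A caller would create an empty LinkedList and an empty Map, pass them in together with the reader, and inspect both collections afterwards. With the real class the call would presumably be Robotstxt.parse(reader, userAgents, disallows), with the IOException handled by the caller.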

main

public static void main(java.lang.String[] args)
Parameters:
args - Command-line arguments.


Copyright © 2003-2008 Internet Archive. All Rights Reserved.