April 17, 2019. First, here is the original code, copied straight from the book:

    from urllib.robotparser import RobotFileParser
    from urllib.request import urlopen

    rp = RobotFileParser()
    # parse() expects an iterable of lines, so the downloaded text is split on newlines
    rp.parse(urlopen('http://www.jianshu.com/robots.txt').read().decode('utf-8').split('\n'))
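A minimal follow-up sketch of how the parsed rules can then be queried; the article path below is a hypothetical example, not one taken from the original post:

    # check whether a generic crawler ('*') may fetch a given page
    print(rp.can_fetch('*', 'http://www.jianshu.com/p/some-article'))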
December 14, 2024. You can view any website’s robots.txt file by typing the site’s homepage URL into your browser and adding “/robots.txt” at the end. For example: “https://semrush.com/robots.txt”. Note: a robots.txt file should always live at the root domain level. For “www.example.com,” the file lives at “www.example.com/robots.txt”.
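The same check can be done programmatically. A quick sketch using Python’s standard-library urllib.robotparser; the Semrush URL comes from the snippet above, while the homepage check at the end is just an illustrative assumption:

    from urllib.robotparser import RobotFileParser

    rp = RobotFileParser()
    # robots.txt sits at the root of the domain
    rp.set_url('https://semrush.com/robots.txt')
    rp.read()  # download and parse the file in one step

    # ask whether a generic crawler may fetch the homepage
    print(rp.can_fetch('*', 'https://semrush.com/'))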