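【Prerequisites】
Python 2.7.13 | CentOS release 6.5
Approach 1: Use the urllib2 library
First log in through a browser with your own account and password, then capture the traffic to obtain the cookie, and finally send a request that carries that cookie. The code is as follows: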
# -*- coding: utf-8 -*-
import urllib2
# Build the headers of a user who is already logged in
headers = {
"Host":"www.renren.com",
"Connection":"keep-alive",
"Upgrade-Insecure-Requests":"1",
"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36",
"Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8",
"Accept-Language":"zh-CN,zh;q=0.8,en;q=0.6",
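# Add the cookie obtained from the packet capture. It belongs to a user who saved the password so that no repeated login is needed, and it records login information such as the username and password (only part of it is shown here)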
"Cookie": "anonymid=jpj3x8dl3bucfp; depovince=BJ; _r01_=1; jebe_key=03bdf34f-49ca-4aba-abff-d6552c90711e%7Ccfcd208495d565ef66e7dff9f98764da%7C1544494102650%7C0; jebecookies=1bca8ed4-b76b-4c5e-a599-ce7c2112f0cd|||||; JSESSIONID=abcZvYrHBc6em41wacBEw; ick_login=6c5bdd5d-b553-44be-a851-33b5130b4c69; _de=B5B94B4549137285E481BC4A8D8B28816DEBB8C2103DE356; p=783f97b6a51eea1dc5ee76b1e1a2a1702; first_login_flag=1; ln_uact=****; ln_hurl=http://hdn.xnimg.cn/photos/hdn421/20130628/1430/h_main_c7bN_86290000032f113e.jpg; t=395e99e55de1ae9abca18e8e05b612702; societyguester=395e99e55de1ae9abca18e8e05b612702; id=245451152; xnsid=64f44934; loginfrom=syshome"
}
# Build the Request object from the header information in headers (mainly the Cookie)
request = urllib2.Request("http://www.renren.com/", headers = headers)
# Visit the renren home page directly; based on the headers (mainly the Cookie), the server treats this as a logged-in user and returns the corresponding page
response = urllib2.urlopen(request)
# Print the response body
print response.read()
# Reuse the same headers to request a profile page
request = urllib2.Request("http://www.renren.com/226003000/profile?v=info_timeline", headers = headers)
response = urllib2.urlopen(request)
print response.read()
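The cookie-replay idea above can be sketched without touching the network. The helper names `parse_cookie_header` and `build_request`, and the shortened `sample_cookie`, are hypothetical and only for illustration; the `try/except` import is an assumption added so the sketch also runs on Python 3, where `urllib2` became `urllib.request`.

```python
# -*- coding: utf-8 -*-
try:
    import urllib2 as urlrequest          # Python 2, as used in this post
except ImportError:
    import urllib.request as urlrequest   # Python 3 fallback (assumption)


def parse_cookie_header(cookie_header):
    """Split a raw Cookie header value into a {name: value} dict."""
    cookies = {}
    for pair in cookie_header.split(";"):
        pair = pair.strip()
        if not pair or "=" not in pair:
            continue
        # Split on the first '=' only, since values may contain '=' themselves
        name, value = pair.split("=", 1)
        cookies[name.strip()] = value.strip()
    return cookies


def build_request(url, cookie_header, user_agent="Mozilla/5.0"):
    """Build (but do not send) a request that carries the captured cookie."""
    headers = {"User-Agent": user_agent, "Cookie": cookie_header}
    return urlrequest.Request(url, headers=headers)


# Shortened, illustrative cookie; a real one comes from your own capture
sample_cookie = "id=245451152; xnsid=64f44934; loginfrom=syshome"
cookies = parse_cookie_header(sample_cookie)
request = build_request("http://www.renren.com/", sample_cookie)
```

Parsing the captured string first makes it easy to see which fields were actually captured before the cookie is pasted into the headers; passing `request` to `urlopen` then sends it as in the script above.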
Approach 2: Simulate a browser login with selenium + phantomjs
This did not work on Linux; it worked in my test on Windows.
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import time
driver = webdriver.PhantomJS(executable_path=r'F:\tools\phantomjs-2.1.1-windows\bin\phantomjs.exe')
driver.get("http://www.renren.com/")
# Enter the account name and password
driver.find_element_by_name("email").send_keys("username")
driver.find_element_by_name("password").send_keys("password")
# Simulate clicking the login button
driver.find_element_by_xpath("//input[@class='input-submit login-btn']").click()
# Wait 3 seconds
time.sleep(3)
# Save a screenshot taken after login
driver.save_screenshot("renren.png")
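The fixed three-second sleep in the script above either wastes time or is too short on a slow connection. A minimal sketch of a polling alternative follows; `wait_until` is a hypothetical helper, not part of selenium (selenium ships its own `WebDriverWait` in `selenium.webdriver.support.ui` for the same purpose).

```python
import time


def wait_until(condition, timeout=10.0, interval=0.5):
    """Call condition() repeatedly until it returns a truthy value.

    Returns that value, or False if `timeout` seconds pass first.
    """
    deadline = time.time() + timeout
    while True:
        result = condition()
        if result:
            return result
        if time.time() >= deadline:
            return False
        time.sleep(interval)
```

With the driver from the script above, one possible condition is that the browser has left the login page, for example `wait_until(lambda: driver.current_url != "http://www.renren.com/")`. Whether the URL actually changes after login is an assumption about the site.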
【Error 1】
Traceback (most recent call last):
  File "loginsp.py", line 2, in <module>
    from selenium import webdriver
ImportError: No module named selenium
Solution:
F:\workspace\python>pip install selenium
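If the ImportError persists after installing, the package may have gone to a different Python installation than the one running the script. A small sketch for checking from inside the interpreter in question; `is_importable` is a hypothetical helper name.

```python
import importlib


def is_importable(module_name):
    """Return True if this interpreter can import module_name."""
    try:
        importlib.import_module(module_name)
    except ImportError:
        return False
    return True
```

Running `is_importable("selenium")` with the same interpreter that runs loginsp.py shows whether the install reached it.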
【Error 2】
    os.path.basename(self.path), self.start_error_message)
selenium.common.exceptions.WebDriverException: Message: 'phantomjs' executable needs to be in PATH.
Solution:
Download phantomjs.exe, unzip it, and set its full path in the code (the executable_path argument in the script above).
【Official site】http://phantomjs.org/download.html
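The error message says the executable must be on PATH, so it can help to check where (or whether) it is found before starting the driver. A minimal sketch, assuming a hypothetical helper named `find_on_path`; on Python 3 the standard library's `shutil.which` does the same job.

```python
import os


def find_on_path(filename, search_path=None):
    """Return the full path of filename in the first PATH directory that has it.

    search_path defaults to the PATH environment variable; returns None
    when no directory contains the file.
    """
    if search_path is None:
        search_path = os.environ.get("PATH", "")
    for directory in search_path.split(os.pathsep):
        if not directory:
            continue
        candidate = os.path.join(directory, filename)
        if os.path.isfile(candidate):
            return candidate
    return None
```

If `find_on_path("phantomjs.exe")` returns None, either add the phantomjs bin directory to PATH or keep passing the full path through executable_path as in the script above.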
Approach 3: Simulate login with selenium + chromedriver
Target site: job.cdeledu.com
# -*- coding: utf-8 -*-
from selenium import webdriver
from selenium.webdriver.common.keys import Keys
import os
import time
chromedriver = "C:/Program Files (x86)/Google/Chrome/Application/chromedriver"
os.environ["webdriver.chrome.driver"] = chromedriver
driver = webdriver.Chrome(chromedriver)  # Launch the browser
url = "http://job.cdeledu.com"
driver.get(url)  # Open the URL
driver.maximize_window()
# Enter the account name and password
#driver.find_element_by_class_name(name)
driver.find_element_by_id("username").send_keys("username")
# Note: the next step is critical and cost me a lot of brain cells; this line is what resolves the exception listed below
driver.find_element_by_id("plainCode").click()
driver.find_element_by_id("password").send_keys('password')
time.sleep(3)
driver.find_element_by_xpath("//a[@class='loginSubmit']").click()
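【Official site】http://chromedriver.storage.googleapis.com/index.html
Error 1: selenium.common.exceptions.ElementNotVisibleException: Message: element not interactable
Solution: see the source code above (click the plainCode element first, then type into the password field).
Browser version: Chrome 68.0.3440.106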
https://dl.lancdn.com/landian/software/chrome/m/?utm_sources=DownPageBot
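Error 2: when selenium-server opens Chrome, Windows reports that chromedriver.exe has stopped working
This is a version problem, most likely a chromedriver build that does not match the installed Chrome.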